## Importing Libraries

Installing Caffe

In [None]:
!git clone https://github.com/BVLC/caffe

In [None]:
import os

os.environ['CAFFE_ROOT'] = "/kaggle/working/caffe"
os.environ['PYTHONPATH'] = "/kaggle/working/caffe/python:/kaggle/lib/kagglegym:/kaggle/lib"
!echo $PYTHONPATH

In [None]:
! apt-get update
# Caffe is pre-installed on Kaggle. If it weren't, you could install it using this:
# ! apt install caffe-cuda

Importing other necessary libraries

In [None]:
#Files
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import cv2
import glob

#DATA
from keras.preprocessing.sequence import pad_sequences
from keras.preprocessing.text import one_hot
from keras.utils.np_utils import to_categorical
from sklearn.model_selection import train_test_split
import tensorflow as tf

## Preprocessing data

In machine learning we normally split the raw dataset into two subsets, one for training and one for validation. This dataset already ships with a test set ready for evaluation, but it is used for the final score calculated by Kaggle, so we still need to write a function that separates the train set into a training part and a validation part for our own testing.

As you can see, the data is already separated by class: one directory for images with label c0, another directory for images with label c1, etc.
To load the dataset we therefore don't need the provided `.csv` file. We can go through the directories one by one, load the images, and record the parent directory name (c0, c1, ...) as the label for each image.
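As a minimal sketch of the labelling rule described above, the helper below derives the integer class from the parent directory name. The function name `label_from_path` and the example file name are hypothetical and only serve as illustration.

```python
import os


def label_from_path(img_path):
    """Return the integer class encoded in the parent directory name (c0 -> 0)."""
    class_dir = os.path.basename(os.path.dirname(img_path))
    return int(class_dir.lstrip('c'))


# Hypothetical file name, used only to show the directory layout
example = "/kaggle/input/state-farm-distracted-driver-detection/imgs/train/c3/img_0001.jpg"
print(label_from_path(example))
```

Using `os.path` instead of splitting on `"/"` keeps the helper independent of the path separator.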

In [None]:
def _prepareData(path): 
    '''
    Load the training images and split them into training and validation sets.

    parameters: path (str) of the directory that contains one sub-directory per class
    return: X_Train, X_Test (numpy arrays of images) and Y_Train, Y_Test (one-hot labels)

    Steps:
    - Read all the images of every class directory
    - Resize them to (128,128,3)
    - Use the directory name as the class label
    '''
    imgs_list = []
    labels = []
    # For each class directory in imgs/train/*
    for directory in sorted(glob.glob(os.path.join(path, '*')), key = lambda k: k.split("/")[-1]):
            # Read all the images in this class
            for img in glob.glob(os.path.join(directory,'*.jpg')):
                img_cv = cv2.imread(img)
                img_cv_r = cv2.resize(img_cv,(128,128))   # Smaller images use less RAM and train faster
                imgs_list.append(img_cv_r)
                labels.append(int(directory.split("/")[-1].replace('c','')))  # Reading parent dir name for label
    
    # Leaving 80% of train images for training and saving 20% of them for validation
    X_Train, X_Test, Y_Train, Y_Test = train_test_split(imgs_list, labels, test_size = 0.2)
    
    # Keras has an API for one-hot encoding categorical data.
    # If you don't know what categorical data is, look here:
    # https://machinelearningmastery.com/how-to-prepare-categorical-data-for-deep-learning-in-python/
    Y_Train = tf.keras.utils.to_categorical(Y_Train, num_classes=10)
    Y_Test = tf.keras.utils.to_categorical(Y_Test, num_classes=10)

    return np.array(X_Train), np.array(X_Test), Y_Train, Y_Test
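To make the one-hot step concrete, here is a small NumPy-only sketch of what the encoding looks like and how to get the integer labels back. It is an illustration with toy labels, not the Keras implementation; the variable names are made up for the example.

```python
import numpy as np

# Three toy labels out of the ten classes c0..c9
labels = np.array([0, 3, 9])

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(10, dtype=np.float32)[labels]

# argmax along each row recovers the original integer class
recovered = one_hot.argmax(axis=1)

print(one_hot.shape)
print(recovered)
```

Recovering the integer class with `argmax` is useful whenever a tool expects a single integer label per image instead of a one-hot vector.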

## Get Data

Here we use the function defined above to load the data. Reading can take a few minutes because we are loading 102k jpg files into RAM; that is the downside of loading all the images into Python lists at once. If you don't like the slow loading time, you can go for the classic way of working with the `.csv` file.

In [None]:
#Paths
path_train_images = "/kaggle/input/state-farm-distracted-driver-detection/imgs/train/"
path_submission_images =  "/kaggle/input/state-farm-distracted-driver-detection/imgs/test/"

# List of Images for Train and Test
X_Train, X_Test, Y_Train, Y_Test = _prepareData(path_train_images)

print("Size X_Train: {}, Size Y_Train: {}".format(len(X_Train),len(Y_Train)))
print("Size X_Test: {}, Size Y_Test: {}".format(len(X_Test),len(Y_Test)))

## Check data integrity

### Classes:
* c0: safe driving
* c1: texting - right
* c2: talking on the phone - right
* c3: texting - left
* c4: talking on the phone - left
* c5: operating the radio
* c6: drinking
* c7: reaching behind
* c8: hair and makeup
* c9: talking to passenger

In [None]:
print(len(X_Train))
print(X_Train[202].shape)

im = X_Train[202]
RGB_im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
plt.imshow(RGB_im)
plt.show()
print("Class: {}".format(Y_Train[202]))

## Check data distribution

We extract some data from the `.csv` file to see how many images we have for each class.

In [None]:
data_file = pd.read_csv("/kaggle/input/state-farm-distracted-driver-detection/driver_imgs_list.csv")
# print(data_file.head())

# Grouping all images of each class together and counting them
data_classes = data_file.loc[:,['classname','img']].groupby(by='classname').count().reset_index()
# print(data_classes)
# Taking names and counts from the same table keeps the bars and their labels aligned
data_x = list(data_classes['classname'])
data_y = list(data_classes['img'])

# Plotting them using matplot
plt.rcParams.update({'font.size': 22})
plt.figure(figsize=(30,10))
plt.bar(data_x, data_y, color=['cornflowerblue', 'lightblue', 'steelblue'])  
plt.ylabel('Count classes')
plt.title('Classes')
plt.xticks(rotation=45)
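The plot above describes the full `.csv` file. As a rough complementary check, the sketch below counts the classes directly from one-hot labels, which is a way to confirm that a random split kept the classes reasonably balanced. The helper name `class_counts` and the toy labels are assumptions made for illustration.

```python
import numpy as np


def class_counts(one_hot_labels, num_classes=10):
    """Count how many samples fall in each class, given one-hot labels."""
    class_ids = np.asarray(one_hot_labels).argmax(axis=1)
    return np.bincount(class_ids, minlength=num_classes)


# Toy one-hot labels: two samples of c0, one of c1, one of c9
toy = np.eye(10)[[0, 0, 1, 9]]
print(class_counts(toy))
```

In the notebook this could be called as `class_counts(Y_Train)` and `class_counts(Y_Test)` to compare the two splits.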

# Caffe Overview

Caffe is a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC). It is written in C++ and has Python and Matlab interfaces.

There are 4 steps in training a CNN using Caffe:

*   Step 1 - Data preparation: In this step, we get the images and store them in a format that can be used by Caffe. Here we will write a Python script that will handle image storage.
  
*   Step 2 - Model definition: In this step, we choose a CNN architecture and we define its parameters in a configuration file with extension `.prototxt`.
  
*   Step 3 - Solver definition: The solver is responsible for model optimization. We define the solver parameters in a configuration file with extension `.prototxt`.
  
*   Step 4 - Model training: We train the model by executing `caffe` command from the terminal. After training the model, we will get the trained model in a file with extension `.caffemodel`.
  
After the training phase, we will use the `.caffemodel` trained model to make predictions of new unseen data.

# Data Preparation

Caffe has multiple ways of reading data. Here we read the images the standard Caffe way, which is through efficient LMDB databases. You can find more ways [here](http://caffe.berkeleyvision.org/tutorial/layers.html).

In [None]:
import caffe
from caffe.proto import caffe_pb2


def make_datum(img, label):
    """
    To build an LMDB database we first need to turn each image into a Datum object.
    This function does that.
    parameters: 
        img: numpy.ndarray of shape (128, 128, 3), in BGR order as loaded by OpenCV
        label: int
    """
    return caffe_pb2.Datum(
        channels=3,
        width=128,
        height=128,
        label=int(label),
        # Caffe expects the bytes in channel-first (C, H, W) order
        data=np.rollaxis(img, 2).tobytes())
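The `np.rollaxis(img, 2)` call is the part that is easy to get wrong, so here is a small NumPy-only sketch of what it does to the layout. The dummy image and its channel values are made up for the example.

```python
import numpy as np

# Dummy 128x128 BGR image in OpenCV layout (height, width, channels)
img = np.zeros((128, 128, 3), dtype=np.uint8)
img[..., 0] = 10   # blue channel
img[..., 1] = 20   # green channel
img[..., 2] = 30   # red channel

# Move the channel axis to the front: (H, W, C) -> (C, H, W)
chw = np.rollaxis(img, 2)
raw = chw.tobytes()

# Reading the bytes back confirms that each channel is stored as one contiguous plane
restored = np.frombuffer(raw, dtype=np.uint8).reshape(3, 128, 128)
print(chw.shape, len(raw))
```

The byte string holds one full plane per channel, which is the order a `Datum` with `channels=3`, `height=128`, `width=128` is read back in.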

In [None]:
import lmdb


def make_lmdb(lmdb_path, x_data, y_data):
    """
    Create the database at lmdb_path,
    then write the dataset images into it one by one.
    y_data holds the one-hot labels returned by _prepareData.
    """
    in_db = lmdb.open(lmdb_path, map_size=int(1e12))
    with in_db.begin(write=True) as in_txn:
        for idx, img in enumerate(x_data):
            # A Datum stores a single integer label, so convert the one-hot vector back
            label = int(np.argmax(y_data[idx]))
            datum = make_datum(img, label)  # Making datum object
            # LMDB keys must be bytes in Python 3
            in_txn.put('{:0>5d}'.format(idx).encode('ascii'), datum.SerializeToString())
    in_db.close()
    return None
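The keys are zero-padded with `'{:0>5d}'` for a reason: LMDB keeps its entries sorted by key bytes, so padding is what makes the stored order match the numeric order. A short sketch of the difference, with a hypothetical helper name `lmdb_key`:

```python
def lmdb_key(idx, width=5):
    """Zero-padded ASCII key, e.g. 42 -> b'00042'."""
    return '{:0>{w}d}'.format(idx, w=width).encode('ascii')


indices = (2, 10, 100)
padded = [lmdb_key(i) for i in indices]
unpadded = [str(i).encode('ascii') for i in indices]

# Padded keys sort in numeric order, unpadded keys do not
print(sorted(padded) == padded)
print(sorted(unpadded))
```

A width of 5 covers indices up to 99999; a larger dataset would need a wider key.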

In [None]:
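train_lmdb = 'input/train_lmdb'
val_lmdb = 'input/validation_lmdb'

# Make sure the parent directory exists before LMDB creates the databases
os.makedirs('input', exist_ok=True)

make_lmdb(train_lmdb, X_Train, Y_Train)
make_lmdb(val_lmdb, X_Test, Y_Test)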

# Create architecture

Caffe's philosophy is expressivity and speed. For that reason we use text files to define networks instead of a code API like Keras. Defining networks in code is possible in Caffe too, but highly discouraged.

After deciding on the CNN architecture, we need to define its parameters in a `.prototxt` file. Below are the details of the network structure defined in my git repo.

## 1. Data Layer

Data enters Caffe through data layers: they lie at the bottom of nets. Data can come from efficient databases (LevelDB or LMDB), directly from memory, or, when efficiency is not critical, from files on disk in HDF5 or common image formats.
The parameters of the data layer are:
* source: the path to the data it needs to read
* backend: specifies the type of database we read from
* batch_size: specifies the number of images to read at each step


    layer {
      name: "data"
      type: "Data"
      include {
        phase: TRAIN   # Or TEST
      }
      data_param {
        source: "/kaggle/working/input/train_lmdb"
        backend: LMDB
        batch_size: 265
      }
      top: "data"
      top: "label"
    }

## 2. Convolution layer

This layer receives the `data` blob from the previous layer and produces the `conv1` blob. Convolution layers in neural networks generally convolve the input image with a set of learnable filters, each producing one feature map in the output image.

This layer has 20 filters with a kernel size of 5 and a stride of 1. Fillers let us initialize the weight and bias values. Here we use the Xavier algorithm to initialize the weights randomly, with a scale based on the number of input and output neurons. For the bias we use a simple constant, which is zero by default. `lr_mult` sets the learning rate multipliers: here the weights use the same learning rate the solver gives at runtime, and the biases use twice that.


    layer {
      name: "conv1"
      type: "Convolution"
      param { lr_mult: 1 }
      param { lr_mult: 2 }
      convolution_param {
        num_output: 20
        kernel_size: 5
        stride: 1
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
      bottom: "data"
      top: "conv1"
    }
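To give a feel for what the `xavier` filler does for `conv1`, here is a NumPy sketch. It assumes Caffe's default fan-in normalization (weights drawn uniformly from ±sqrt(3 / fan_in)) and a 3-channel input; the function name `xavier_uniform` is made up for the example and is not part of Caffe.

```python
import numpy as np


def xavier_uniform(shape, rng):
    """shape = (num_output, channels, kernel_h, kernel_w)."""
    fan_in = int(np.prod(shape[1:]))      # inputs feeding one output neuron
    scale = np.sqrt(3.0 / fan_in)         # assumed default: fan-in normalization
    weights = rng.uniform(-scale, scale, size=shape).astype(np.float32)
    return weights, scale


rng = np.random.default_rng(0)
# conv1: 20 filters, 3 input channels (assumed), 5x5 kernel
weights, scale = xavier_uniform((20, 3, 5, 5), rng)
# A "constant" bias filler with its default value gives zeros
bias = np.zeros(20, dtype=np.float32)

print(weights.shape, scale)
```

With 3 × 5 × 5 = 75 inputs per filter the scale works out to sqrt(3 / 75) = 0.2, so every initial weight lies between -0.2 and 0.2.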

## 3. Pooling layer

We set the `pool` to max so it does max pooling operation on convolution outputs.

    layer {
      name: "pool1"
      type: "Pooling"
      pooling_param {
        kernel_size: 2
        stride: 2
        pool: MAX
      }
      bottom: "conv1"
      top: "pool1"
    }
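It helps to track the blob sizes through these two layers. The sketch below applies the usual output-size formulas, assuming a 128×128 input and no padding (neither layer sets `pad`). Caffe rounds down for convolution and up for pooling; the helper names are hypothetical.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution layer (rounded down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a pooling layer (rounded up)."""
    return int(math.ceil((size + 2 * pad - kernel) / stride)) + 1


conv1 = conv_out(128, kernel=5, stride=1)     # conv1 feature map side
pool1 = pool_out(conv1, kernel=2, stride=2)   # pool1 feature map side
print(conv1, pool1)
```

Under these assumptions `conv1` has shape 20×124×124 per image and `pool1` has shape 20×62×62.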

## 4. Dense layer

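This layer is similar to the previous ones. Dense layers are known as InnerProduct layers in Caffe. Here we have a dense layer with 500 outputs; its parameters are the same as those explained for the previous layers. Its `bottom` is the `pool2` blob from the full network definition, which is not reproduced in this walkthrough.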

    layer {
      name: "ip1"
      type: "InnerProduct"
      param { lr_mult: 1 }
      param { lr_mult: 2 }
      inner_product_param {
        num_output: 500
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
      bottom: "pool2"
      top: "ip1"
    }

## 5. ReLU layer

Since ReLU is an element-wise operation, it can be done in place to save memory. This is achieved by giving the top and bottom blobs the same name. Note that we cannot reuse blob names like this for other layer types; this is peculiar to this layer.

    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "ip1"
      top: "ip1"
    }
   
After the ReLU we define another dense layer with `bottom: "ip1"` and `top: "ip2"`.
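The in-place idea can be illustrated with NumPy: the result is written back into the same buffer instead of allocating a new one. This is only an analogy for what sharing the `ip1` name does in Caffe, with made-up values.

```python
import numpy as np

ip1 = np.array([-1.5, 0.0, 2.0], dtype=np.float32)
buffer_id = id(ip1)

# ReLU written back into the input buffer, no second array is allocated
np.maximum(ip1, 0, out=ip1)

print(ip1, id(ip1) == buffer_id)
```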

## 6. Loss

We define the loss as follows:

    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "ip2"
      bottom: "label"
    }
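`SoftmaxWithLoss` combines a softmax over the `ip2` scores with the multinomial logistic loss against `label`. A NumPy sketch of that computation, averaged over the batch, with a hypothetical function name and toy inputs:

```python
import numpy as np


def softmax_with_loss(scores, labels):
    """scores: (batch, classes) raw outputs, labels: (batch,) integer classes."""
    # Subtract the row maximum for numerical stability
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    # Negative log-probability of the correct class, averaged over the batch
    loss = -np.log(probs[np.arange(len(labels)), labels]).mean()
    return probs, loss


# An untrained network that scores all 10 classes equally
scores = np.zeros((2, 10))
labels = np.array([3, 7])
probs, loss = softmax_with_loss(scores, labels)
print(loss)
```

With ten equally likely classes the loss is ln(10) ≈ 2.30, which is a useful reference point: a training loss that stays near that value means the network has not learned anything yet.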

# Caffe Solver

The solver is responsible for model optimization. We define the solver's parameters in a `.prototxt` file. You can find my solver here: `CaffeCNN/caffe_models/caffe_model_1/solver_1.prototxt`. Below is a copy of the same.

This solver computes the accuracy of the model using the validation set every 1000 iterations. The optimization process will run for a maximum of 40000 iterations and will take a snapshot of the trained model every 5000 iterations.

`base_lr`, `lr_policy`, `gamma`, `momentum` and `weight_decay` are hyperparameters that we need to tune to get a good convergence of the model.

I chose `lr_policy: "step"` with `stepsize: 2500`, `base_lr: 0.001` and `gamma: 0.1`. In this configuration, we will start with a learning rate of 0.001, and we will drop the learning rate by a factor of ten every 2500 iterations.
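The `step` policy follows `base_lr * gamma ^ floor(iter / stepsize)`. A short sketch that evaluates this schedule for the values above (the function name `step_lr` is made up for the example):

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=2500):
    """Learning rate under the "step" policy at a given iteration."""
    return base_lr * gamma ** (iteration // stepsize)


for it in (0, 2499, 2500, 5000, 39999):
    print(it, step_lr(it))
```

Note how quickly the rate shrinks: after 15 drops it is around 1e-18 by the end of a 40000-iteration run, so almost all of the learning happens in the first few steps. A larger `stepsize` or `gamma` would keep the later iterations useful.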

There are different strategies for the optimization process. For a detailed explanation, you can read Caffe's [solver documentation](http://caffe.berkeleyvision.org/tutorial/solver.html).


    net: "/kaggle/working/CaffeCNN/caffe_models/caffe_model_1/caffe_model.prototxt"
    test_iter: 1000
    test_interval: 1000
    base_lr: 0.001
    lr_policy: "step"
    gamma: 0.1
    stepsize: 2500
    display: 50
    max_iter: 40000
    momentum: 0.9
    weight_decay: 0.0005
    snapshot: 5000
    snapshot_prefix: "/kaggle/working/CaffeCNN/caffe_models/caffe_model_1/caffe_model_1"
    solver_mode: GPU

# Getting model ready

I defined the network and the solver in a git repo, so we clone it to get them. We have two `.prototxt` files, one for the model and the other for the solver.

In [None]:
! git clone https://github.com/Sadiqush/CaffeCNN

In [None]:
model_path = "CaffeCNN/caffe_models/caffe_model_1/caffe_model.prototxt"
solver_path = "CaffeCNN/caffe_models/caffe_model_1/solver_1.prototxt"

# Train

After defining the model and the solver, we can start training the model by executing the command below:

In [None]:
! caffe train --solver "/kaggle/working/CaffeCNN/caffe_models/caffe_model_1/solver_1.prototxt" 2>&1 | tee /kaggle/working/CaffeCNN/caffe_models/caffe_model_1/model_1_train.log

The training logs will be stored in `CaffeCNN/caffe_models/caffe_model_1/model_1_train.log`.

During the training process, we need to monitor the loss and the model accuracy. We can stop the process at any time by pressing Ctrl+C. Caffe will take a snapshot of the trained model every 5000 iterations and store it under the `caffe_model_1` folder.

The snapshots have `.caffemodel` extension. For example, 10000 iterations snapshot will be called: `caffe_model_1_iter_10000.caffemodel`.
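As a small sketch of the naming rule, the helper below lists the regular snapshot files to expect for `snapshot: 5000` and `max_iter: 40000`. The function name `snapshot_names` is hypothetical, and only the `.caffemodel` files are listed.

```python
def snapshot_names(prefix, snapshot=5000, max_iter=40000):
    """File names of the regular snapshots written during a full training run."""
    return ['{}_iter_{}.caffemodel'.format(prefix, i)
            for i in range(snapshot, max_iter + 1, snapshot)]


names = snapshot_names('caffe_model_1')
for name in names:
    print(name)
```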

# Prediction

Now that we have a trained model, we can use it to make predictions on new unseen data.

In [None]:
import caffe

net = caffe.Net('/kaggle/working/CaffeCNN/caffe_models/caffe_model_1/caffenet_deploy_1.prototxt',
                '/kaggle/working/CaffeCNN/caffe_models/caffe_model_1/caffe_model_1_iter_500.caffemodel',
                caffe.TEST)

In [None]:
out = net.forward()
pred_probas = out['prob']
print(pred_probas.argmax())
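The cell above calls `net.forward()` without showing how an image reaches the network. A minimal sketch of that missing step follows, assuming the deploy net has an input blob named `data` of shape (1, 3, 128, 128) and needs no mean subtraction or scaling; both are assumptions about `caffenet_deploy_1.prototxt`, and `to_caffe_blob` is a hypothetical helper.

```python
import numpy as np


def to_caffe_blob(img_bgr):
    """Turn one (H, W, C) BGR image into a (1, C, H, W) float32 batch."""
    chw = np.transpose(img_bgr, (2, 0, 1))
    return chw[np.newaxis, ...].astype(np.float32)


# Dummy image standing in for a resized test image
dummy = np.zeros((128, 128, 3), dtype=np.uint8)
blob = to_caffe_blob(dummy)
print(blob.shape)

# With the net loaded as above, the blob could then be fed like this
# (blob name 'data' is an assumption):
# net.blobs['data'].reshape(*blob.shape)
# net.blobs['data'].data[...] = blob
# out = net.forward()
```

The image must go through the same resizing and channel order that were used when building the LMDB, otherwise the predictions will not be meaningful.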

# Building Using Transfer Learning

Caffe comes with a repository that is used by researchers and machine learning practitioners to share their trained models. This library is called [Model Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo).

Using this command we download the weights of the CaffeNet network, trained on the ImageNet dataset.

In [None]:
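# Saving the weights where the training command below expects to find them
!wget -P /kaggle/working/caffe/models/bvlc_reference_caffenet/ http://dl.caffe.berkeleyvision.org/bvlc_reference_caffenet.caffemodel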

# Model Training

After defining the model and the solver, we can start training the model by executing the command below. The model and solver configuration files are stored under `CaffeCNN/caffe_models/caffe_model_2`.

Note that we pass the trained model's weights by using the argument `--weights`.

In [None]:
!caffe train --solver="/kaggle/working/CaffeCNN/caffe_models/caffe_model_2/solver_2.prototxt" --weights "/kaggle/working/caffe/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel" 2>&1 | tee "/kaggle/working/CaffeCNN/caffe_models/caffe_model_2/model_2_train.log"

Prediction works the same way as in the previous section with the manually defined network.

## Save submission file

In [None]:
df.to_csv('submission_file.csv',index = False)
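The cell above assumes a DataFrame `df` that is not built anywhere in this notebook. One way it might be assembled is sketched below, assuming the submission has an `img` column followed by one probability column per class (`c0` to `c9`); check the competition's sample submission before relying on this layout. The helper name and the toy inputs are made up for illustration.

```python
import numpy as np
import pandas as pd


def build_submission(file_names, probabilities):
    """file_names: list of image names, probabilities: (n_images, 10) array."""
    columns = ['c{}'.format(i) for i in range(10)]
    df = pd.DataFrame(probabilities, columns=columns)
    df.insert(0, 'img', file_names)   # image name goes in the first column
    return df


# Toy predictions: two images, uniform probability over the ten classes
demo = build_submission(['img_1.jpg', 'img_2.jpg'], np.full((2, 10), 0.1))
print(demo.head())
```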