# Convolutional Neural Network for Classifying Cars

### Background
Stanford AI has developed a dataset of cars labeled with make, model and year. The aim of this project is to classify cars as accurately as possible using a convolutional neural network. We will use the Keras package with a TensorFlow backend to train the model, and we will then validate and evaluate its accuracy under the chosen hyperparameters.

### Contents
1. Build the classifier layers
2. Load training and test data
3. Run training
4. Evaluate model accuracy

Improving on: https://github.com/CihanBosnali/Real-Time-Cars-Classification-Using-Keras/blob/master/model_trainer.py

#### Load packages

In [1]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os

from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPool2D
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers import Dense
from keras.optimizers import SGD

from keras.preprocessing.image import ImageDataGenerator

Using TensorFlow backend.


## Model A: Classifying Make and Model
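<b>Aim</b>: Classify a car into one of the classes in our dataset, where each class identifies both the make and the model year. The original dataset has 196 classes; the scraped image folders loaded below contain 616.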

## 1. Build the Classifier

We will use five convolution layers, a number kept small for the sake of computing power, with every layer using the ReLU activation function. We use ReLU because it is non-linear and its gradient does not saturate for positive inputs (compared to sigmoid, for example, whose small gradients can 'vanish' as they are propagated back through the layers).
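
To make the comparison concrete, the short sketch below (plain Python, not part of the original notebook) evaluates the derivative of each activation. The helper names `sigmoid_grad` and `relu_grad` are illustrative assumptions. The sigmoid derivative never exceeds 0.25, so the activation alone scales a back-propagated gradient by at most that factor per layer (ignoring the weights), while ReLU passes the gradient through unchanged for positive inputs.

```python
import math

def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

n_layers = 5  # illustrative depth, matching the five convolution blocks

# Best-case factor contributed by the activations alone across n_layers
sigmoid_bound = sigmoid_grad(0.0) ** n_layers
relu_factor = relu_grad(2.0) ** n_layers

print(sigmoid_bound, relu_factor)
```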

In [3]:
config = {'img_pixels': 256,
          'n_filters': 64,
          'layer_nodes': 512,
          'batchsize': 32,
          'epochs': 5,
          'kernel_size': (4,4),
          'pool_size': (2,2),
          'dropout':0.2,
          'steps_per_epoch': 100,
          'validation_steps': 5
          }

# config
img_pixels = config['img_pixels']
n_filters = config['n_filters']
layer_nodes = config['layer_nodes']
batchsize = config['batchsize']
epochs = config['epochs']
kernel_size = config['kernel_size']
pool_size = config['pool_size']
dropout = config['dropout']
steps_per_epoch = config['steps_per_epoch']
validation_steps = config['validation_steps']

In [5]:
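# Count the directories and image files under the training directory
folders = 0
fs = 0
# The leading underscore means "we won't be using this value" (the current directory path)
for _, dirnames, filenames in os.walk("../scraped_images_2020/train"):
    folders += len(dirnames)
    fs += len(filenames)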
    

In [7]:
folders

616

In [6]:
fs

39783
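
Note that `os.walk` visits every level of the tree, so `folders` counts all nested directories, not only the top-level class folders. As a hedged sketch, assuming one sub-folder per class directly under the training directory (the layout `flow_from_directory` expects), the helper below counts images per class. The function name `count_images_per_class` and the demo folder names are hypothetical.

```python
import os
import tempfile

def count_images_per_class(root):
    """Return {class_folder_name: number_of_files} for the direct sub-folders of root."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            counts[entry] = sum(
                1 for f in os.listdir(path)
                if os.path.isfile(os.path.join(path, f))
            )
    return counts

# Demo on a throwaway directory tree with made-up class names
with tempfile.TemporaryDirectory() as tmp:
    for cls, n_files in [("class_a", 3), ("class_b", 1)]:
        os.makedirs(os.path.join(tmp, cls))
        for i in range(n_files):
            open(os.path.join(tmp, cls, f"img_{i}.jpg"), "w").close()
    demo_counts = count_images_per_class(tmp)

print(demo_counts)
```

A per-class count like this also makes it easy to spot classes with very few images before training.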

In [61]:
car_classifier = Sequential()
#Adding 1st Convolution and Pooling Layer
car_classifier.add(Conv2D(n_filters,kernel_size=kernel_size,input_shape=(img_pixels,img_pixels,3),activation='relu'))
car_classifier.add(MaxPool2D(pool_size=pool_size))
car_classifier.add(Dropout(dropout))
#Adding 2nd Convolution and Pooling Layer
car_classifier.add(Conv2D(n_filters,kernel_size=kernel_size,activation='relu'))
car_classifier.add(MaxPool2D(pool_size=pool_size))
car_classifier.add(Dropout(dropout))
#Adding 3rd Convolution and Pooling Layer
car_classifier.add(Conv2D(n_filters,kernel_size=kernel_size,activation='relu'))
car_classifier.add(MaxPool2D(pool_size=pool_size))
car_classifier.add(Dropout(dropout))
#Adding 4th Convolution and Pooling Layer
car_classifier.add(Conv2D(n_filters,kernel_size=kernel_size,activation='relu'))
car_classifier.add(MaxPool2D(pool_size=pool_size))
car_classifier.add(Dropout(dropout))
#Adding 5th Convolution and Pooling Layer
car_classifier.add(Conv2D(n_filters,kernel_size=kernel_size,activation='relu'))
car_classifier.add(MaxPool2D(pool_size=pool_size))
car_classifier.add(Dropout(dropout))

#Flatten
car_classifier.add(Flatten())

#Adding fully connected layers and the softmax output layer
car_classifier.add(Dense(units=layer_nodes,activation='relu'))
car_classifier.add(Dense(units=layer_nodes,activation='relu'))
car_classifier.add(Dense(units=layer_nodes,activation='relu'))
car_classifier.add(Dense(units=folders,activation='softmax'))

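# Pass the configured optimizer object; the string 'sgd' would create a default SGD and ignore clipvalue
sgd = SGD(lr=0.01, clipvalue=0.5)
car_classifier.compile(optimizer = sgd, loss = 'categorical_crossentropy', metrics = ['accuracy'])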
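
As a sanity check on the architecture, the sketch below works out the feature-map size that reaches `Flatten`. It assumes the Keras defaults that the cell above leaves unspecified: `Conv2D` with `padding='valid'` and stride 1, and `MaxPool2D` with a stride equal to `pool_size`. The helper names `conv_out` and `pool_out` are illustrative.

```python
def conv_out(size, kernel):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output width of 'valid' max pooling with stride equal to the pool size."""
    return size // pool

img_pixels, n_filters = 256, 64
kernel, pool = 4, 2  # kernel_size=(4,4), pool_size=(2,2)

size = img_pixels
sizes = []
for _ in range(5):  # five Conv2D + MaxPool2D blocks
    size = pool_out(conv_out(size, kernel), pool)
    sizes.append(size)

flat_units = size * size * n_filters
print(sizes, flat_units)
```

Under these assumptions the feature maps shrink to 5 x 5 after the fifth block, so the first Dense layer receives 5 x 5 x 64 = 1600 inputs.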

In [62]:
# Data augmentation
train_datagen = ImageDataGenerator(rescale=1./255,shear_range=0.2,zoom_range=0.2,horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

train_data = train_datagen.flow_from_directory('../scraped_images_2020/train',
                                               target_size=(img_pixels,img_pixels),
                                               batch_size=batchsize,
                                               class_mode='categorical',
                                               shuffle=True,
                                               seed=42)

test_data = test_datagen.flow_from_directory('../scraped_images_2020/test',
                                             target_size=(img_pixels,img_pixels),
                                             batch_size=1,
                                             class_mode='categorical',
                                             shuffle=True,
                                             seed=42)

Found 39783 images belonging to 616 classes.
Found 12577 images belonging to 616 classes.


In [None]:
history = car_classifier.fit_generator(train_data,
                                       steps_per_epoch=steps_per_epoch,
                                       epochs=epochs,
                                       validation_data=test_data,
                                       validation_steps=validation_steps
                                      )

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5

E0310 09:19:35.043719 56064 ultratb.py:149] Internal Python error in the inspect module.
Below is the traceback from this internal error.
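
The generator settings determine how much data each epoch actually sees. The arithmetic below is a sketch using the image counts reported by `flow_from_directory` above and the values in `config`; the variable names are illustrative. With `steps_per_epoch=100` and a batch size of 32, one epoch draws 3,200 training images, and `validation_steps=5` with the test generator's `batch_size=1` evaluates only 5 images.

```python
import math

n_train, n_test = 39783, 12577       # counts reported by flow_from_directory
train_batch, test_batch = 32, 1      # batch sizes of the two generators
steps_per_epoch, validation_steps = 100, 5

# Images actually consumed per epoch with the current settings
images_per_epoch = steps_per_epoch * train_batch
images_validated = validation_steps * test_batch

# Steps needed for one full pass over each set
full_pass_steps = math.ceil(n_train / train_batch)
full_val_steps = math.ceil(n_test / test_batch)

print(images_per_epoch, images_validated, full_pass_steps, full_val_steps)
```

Setting `steps_per_epoch` to `full_pass_steps` would make each epoch a full pass over the training set, at a proportionally higher cost per epoch.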



In [None]:
metrics = pd.DataFrame.from_dict(history.history)
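# Number the epochs from the history length; a hard-coded range(0,30) would pad the frame with NaN rows
metrics = pd.concat([pd.Series(range(len(metrics)),name='epochs'),metrics],axis=1)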
metrics.tail()

In [None]:
metrics = metrics.reset_index().drop('epochs', axis=1).rename(columns={'index': 'epochs'})
metrics['config'] = str(config)

In [None]:
metrics

#### Saving model to disk

In [None]:
import json

# serialize model to JSON
model_json = car_classifier.to_json()
with open("../models/cars_classifier_tuned_100eP_50ba_1ba(val).json", "w") as json_file:
    json_file.write(model_json)
    
# serialize weights to HDF5
car_classifier.save_weights("../models/cars_classifier_tuned_100eP_50ba_1ba(val).h5")
print("Saved model to disk")

In [None]:
# Save metrics
metrics.to_csv("../models/cars_classifier_metrics2.csv", index=False)

## 2. Test model on a single image

#### Loading Model from disk

In [None]:
from keras.models import model_from_json

# load json and create model
with open('models/cars_classifier_untuned.json', 'r') as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights("models/cars_classifier_untuned.h5")
print("Loaded model from disk")

In [None]:
import numpy as np
from keras.preprocessing import image

#### Load single prediction image

In [None]:
# load image
test_image = image.load_img('../car_data/single_prediction/bmw_3series.jpg', target_size=(img_pixels, img_pixels))
test_image

In [None]:
# Convert to array, rescale to [0, 1] to match the training generator (rescale=1./255),
# and expand_dims because we are only doing 1 image prediction
test_image_array = image.img_to_array(test_image) / 255.
test_image_expand = np.expand_dims(test_image_array, axis=0)
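
The shape change can be checked without loading a real image. This sketch uses a constant-valued NumPy array as a stand-in for the output of `img_to_array` (an assumption for illustration). The training generator rescales pixels by 1/255, so the same scaling is applied here before `np.expand_dims` adds the leading batch dimension that Keras models expect.

```python
import numpy as np

img_pixels = 256

# Stand-in for img_to_array(test_image): height x width x RGB, values in 0..255
fake_image = np.full((img_pixels, img_pixels, 3), 255.0, dtype="float32")

# Same scaling as ImageDataGenerator(rescale=1./255)
scaled = fake_image / 255.0

# Add a batch dimension of size 1 at the front
batch = np.expand_dims(scaled, axis=0)

print(fake_image.shape, batch.shape, float(batch.max()))
```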

#### Make Prediction

In [None]:
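# Note: this uses the in-memory car_classifier, not loaded_model from disk
classes = car_classifier.predict(test_image_expand, batch_size=1)

# Map each class name to its predicted probability using the generator's own index
results = {}
for key, idx in train_data.class_indices.items():
    results[key] = classes[0][idx]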

In [None]:
pd.DataFrame.from_dict(results, orient='index').sort_values(0, ascending=False).head(10)
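
The same ranking can be sketched without pandas. The example assumes a `class_indices`-style dict mapping class name to output index, as provided by the generator, and uses made-up class names and probabilities; `top_k` is a hypothetical helper.

```python
def top_k(class_indices, probabilities, k=3):
    """Return the k (class_name, probability) pairs with the highest probability."""
    index_to_name = {idx: name for name, idx in class_indices.items()}
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: probabilities[i],
        reverse=True,
    )
    return [(index_to_name[i], probabilities[i]) for i in ranked[:k]]

# Made-up example: three classes and one softmax output row
demo_indices = {"make_a_model_x": 0, "make_b_model_y": 1, "make_c_model_z": 2}
demo_probs = [0.1, 0.7, 0.2]

top2 = top_k(demo_indices, demo_probs, k=2)
print(top2)
```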

## 3. Visualize Accuracy of Results

In [5]:
import plotly.plotly as py
import plotly.graph_objs as go
from plotly.tools import FigureFactory as FF
from plotly.tools import get_embed
from plotly import offline

#### Read the results

In [6]:
results = pd.read_csv("../models/cars_classifier_metrics.csv")

results.drop('epochs', axis=1, inplace=True)
results.rename(columns={'Unnamed: 0': 'epochs'}, inplace=True)

In [7]:
results.head()

| | epochs | val_loss | val_acc | loss | acc |
|---|---|---|---|---|---|
| 0 | 0 | 5.262903 | 0.0104 | 5.279155 | 0.005632 |
| 1 | 1 | 5.170683 | 0.01336 | 5.202109 | 0.008648 |
| 2 | 2 | 5.142595 | 0.0118 | 5.143253 | 0.010056 |
| 3 | 3 | 5.115322 | 0.017409 | 5.115203 | 0.012671 |
| 4 | 4 | 5.052454 | 0.016802 | 5.062903 | 0.016383 |
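
These numbers are close to what an untrained classifier would produce. As a rough check, assuming the metrics come from a run on the 196 classes mentioned earlier, a model that assigns equal probability to every class has a cross-entropy loss of ln(196) and an expected accuracy of 1/196:

```python
import math

n_classes = 196  # assumption: the run that produced this CSV used 196 classes

# Cross-entropy of a uniform prediction: -ln(1 / n_classes) = ln(n_classes)
chance_loss = math.log(n_classes)

# Expected accuracy when guessing uniformly at random
chance_acc = 1.0 / n_classes

print(round(chance_loss, 3), round(chance_acc, 4))
```

The first-epoch training loss of 5.279 and accuracy of 0.0056 sit almost exactly at this baseline, and the loss after five epochs (5.06) has moved only slightly below it.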


#### Training vs Validation accuracy

In [12]:
random_x = results['epochs']
random_y0 = results['acc']
random_y1 = results['val_acc']

# Create traces
trace0 = go.Scatter(
    x = random_x,
    y = random_y0,
    mode = 'lines',
    name = 'Training Accuracy'
)
trace1 = go.Scatter(
    x = random_x,
    y = random_y1,
    mode = 'lines',
    name = 'Validation Accuracy'
)

layout = go.Layout(
    title=go.layout.Title(
        text='Training vs Validation Accuracy',
        xref='paper',
        x=0
    ),
    xaxis=go.layout.XAxis(
        title=go.layout.xaxis.Title(
            text='Epochs',
            font=dict(
                family='Courier New, monospace',
                size=18,
                color='#7f7f7f'
            )
        )
    ),
    yaxis=go.layout.YAxis(
        title=go.layout.yaxis.Title(
            text='Accuracy',
            font=dict(
                family='Courier New, monospace',
                size=18,
                color='#7f7f7f'
            )
        )
    )
)

data = [trace0, trace1]
acc_fig = go.Figure(data=data, layout=layout)

py.iplot(acc_fig, filename='line-mode')

In [18]:
offline.plot(acc_fig, filename='../plots/train_val_acc_algo1.html')
offline.plot(acc_fig, filename='../../../hm9464.github.io/site/plots/train_val_acc_algo1.html')

'../../../hm9464.github.io/site/plots/train_val_acc_algo1.html'

In [13]:
random_x = results['epochs']
random_y0 = results['loss']
random_y1 = results['val_loss']

# Create traces
trace0 = go.Scatter(
    x = random_x,
    y = random_y0,
    mode = 'lines',
    name = 'Training Loss'
)
trace1 = go.Scatter(
    x = random_x,
    y = random_y1,
    mode = 'lines',
    name = 'Validation Loss'
)

layout = go.Layout(
    title=go.layout.Title(
        text='Training vs Validation Loss',
        xref='paper',
        x=0
    ),
    xaxis=go.layout.XAxis(
        title=go.layout.xaxis.Title(
            text='Epochs',
            font=dict(
                family='Courier New, monospace',
                size=18,
                color='#7f7f7f'
            )
        )
    ),
    yaxis=go.layout.YAxis(
        title=go.layout.yaxis.Title(
            text='Loss',
            font=dict(
                family='Courier New, monospace',
                size=18,
                color='#7f7f7f'
            )
        )
    )
)

data = [trace0, trace1]
loss_fig = go.Figure(data=data, layout=layout)

py.iplot(loss_fig, filename='line-mode')

In [17]:
offline.plot(loss_fig, filename='../plots/train_val_loss_algo1.html')
offline.plot(loss_fig, filename='../../../hm9464.github.io/site/plots/train_val_loss_algo1.html')

'../../../hm9464.github.io/site/plots/train_val_loss_algo1.html'

---

## Model B: Classifying Make only

<b>Aim</b>: Predict the make of a car using the input image (e.g. BMW, Honda, etc).

For this model, we will need to aggregate the class labels of our existing training and test datasets so that each class represents a make only.
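
One way to do this aggregation is sketched below. It assumes, hypothetically, that each class folder name starts with the make followed by an underscore (for example `bmw_3series_2012`); the real folder naming in this dataset is not shown here, so the parsing rule would need adjusting, in particular for makes made up of more than one word. `make_of` and `group_by_make` are illustrative helper names.

```python
from collections import defaultdict

def make_of(class_name, sep="_"):
    """Extract the make, assuming it is the first token of the class name."""
    return class_name.split(sep)[0].lower()

def group_by_make(class_names):
    """Group fine-grained class names under their make."""
    groups = defaultdict(list)
    for name in class_names:
        groups[make_of(name)].append(name)
    return dict(groups)

# Made-up class folder names for illustration
demo_classes = ["bmw_3series_2012", "bmw_x5_2010", "honda_civic_2015"]
make_groups = group_by_make(demo_classes)
print(make_groups)
```

Each key of `make_groups` would become one class folder in the make-only dataset, holding the images of all the fine-grained classes listed under it.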