# Building the Vehicle Recognition Project

1. The Inception Module in TensorFlow
2. Building/Sourcing the dataset
3. Constructing the model
4. Training and improving the accuracy
5. Final Testing
6. Exporting the model
7. Running via Android Studio (OpenCV + TensorFlow libraries)

## 1. The Inception Module

* Below is an introduction to GoogLeNet and the inception module

### GoogLeNet
* Developed by Google; the architecture won first place in the ILSVRC 2014 image classification challenge
* Roughly 21 times lighter (in parameter count) than VGG-16
#### The Inception Module
* The key breakthrough of the GoogLeNet architecture was the development of the **inception module**
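    * A sub-network that processes its input along several parallel branches
    * The input is passed through each branch (for example 1x1, 3x3, and 5x5 convolutions plus a max-pooling branch), and the branch outputs are joined along the channel axis in a process called **depth concatenation**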
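
As a rough illustration of depth concatenation, the sketch below works only with shapes, not real tensors. The helper `depth_concat_shape` is hypothetical, and the branch sizes are borrowed from the inception (3a) block of the GoogLeNet paper rather than from this project.

```python
def depth_concat_shape(branch_shapes):
    """Return the (height, width, channels) shape after depth concatenation.

    Every branch must share the same height and width; only the channel
    counts are added together.
    """
    height, width, _ = branch_shapes[0]
    for h, w, _ in branch_shapes:
        if (h, w) != (height, width):
            raise ValueError("all branches need the same spatial size")
    total_channels = sum(c for _, _, c in branch_shapes)
    return (height, width, total_channels)


# Branch outputs of GoogLeNet's inception (3a) block: 1x1, 3x3, 5x5, pooling
branches = [(28, 28, 64), (28, 28, 128), (28, 28, 32), (28, 28, 32)]
print(depth_concat_shape(branches))  # (28, 28, 256)
```

The spatial size is unchanged; only the depth grows, which is why every branch needs matching height and width.
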
##### Versions of the Inception Module
![Inception Modules](https://miro.medium.com/max/2698/1*aq4tcBl9t5Z36kTDeZSOHA.png)
* 1x1 convolutions used this way are called **bottlenecks**: they compress the channel dimension before the more expensive 3x3 and 5x5 convolutions, which reduces the parameter count
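
To make the saving concrete, here is a small parameter count for one 5x5 branch with and without a bottleneck. The channel counts (192 in, 16 after the bottleneck, 32 out) are assumptions chosen for illustration, and `conv_params` is a hypothetical helper.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a square 2D convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


in_channels = 192  # assumed depth of the incoming feature map

# 5x5 convolution applied directly to the input
direct = conv_params(5, in_channels, 32)

# 1x1 bottleneck down to 16 channels, followed by the same 5x5 convolution
bottlenecked = conv_params(1, in_channels, 16) + conv_params(5, 16, 32)

print(direct)        # 153632
print(bottlenecked)  # 15920
```

Under these assumed sizes the bottlenecked branch needs roughly a tenth of the parameters.
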
##### Average Pooling
* Introducing average pooling after the convolutional block further reduces the parameter count and cuts the computational cost substantially. Some spatial information is lost, but not enough to outweigh the savings
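
The effect on the classifier can be estimated with simple arithmetic. This sketch assumes the pooling in question is global average pooling placed just before the final dense layer, a 7x7x1024 final feature map, and 10 output classes; all of these values and the helper `dense_params` are illustrative assumptions.

```python
def dense_params(inputs, outputs):
    """Weights plus biases of a fully connected layer."""
    return inputs * outputs + outputs


height, width, channels = 7, 7, 1024  # assumed final feature map
num_classes = 10                      # assumed number of output classes

# Flatten keeps every spatial position as a separate input to the classifier
flatten_head = dense_params(height * width * channels, num_classes)

# Global average pooling collapses each channel to one value first
pooled_head = dense_params(channels, num_classes)

print(flatten_head)  # 501770
print(pooled_head)   # 10250
```
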

### Constructing the Network with TensorFlow and Keras
* The Inception Module is implemented as a reusable building block using the Functional API of Keras

## 2. Building/Sourcing the Dataset
* The dataset is built with the usual `tf.data.Dataset` workflow, using images sourced from the SmartCNNs project on GitHub


In [1]:
import os
import tensorflow as tf

directory = 'animals' # Directory of the animal images
animals = ['cat', 'butterfly', 'dog', 'sheep', 'spider', 'chicken', 'horse', 'squirrel', 'cow', 'elephant']
num_classes = len(animals)

# TensorFlow version I plan on using for this project
print(tf.__version__)

2.1.0


In [2]:
list_ds = tf.data.Dataset.list_files(str(directory + '/*/*')) # /*/* go down to the files
for f in list_ds.take(5):
  print(f.numpy())

IMG_HEIGHT = 224
IMG_WIDTH = 224

b'animals/chicken/OIP-mFN57MsrJHRtpCcbq51anQHaCh.jpeg'
b'animals/sheep/OIP-i-ni93yhowK-kkjLzFNJCQHaEK.jpeg'
b'animals/squirrel/OIP-1L7YAkx0PfjhEef7yXhtPgHaFh.jpeg'
b'animals/spider/OIP-Y_-Ao5Q1BENkq32g--lUcQHaE9.jpeg'
b'animals/squirrel/OIP-1SVdvhAO-68BYuw0hj-3kAAAAA.jpeg'


In [3]:
def decode_img(img):
  # Decode the raw JPEG bytes into a 3-D uint8 tensor (height, width, channels)
  img = tf.image.decode_jpeg(img, channels=3)
  # Use `convert_image_dtype` to convert to floats in the [0,1] range.
  img = tf.image.convert_image_dtype(img, tf.float32)
  # Resize to the desired size; `tf.image.resize` expects [height, width].
  return tf.image.resize(img, [IMG_HEIGHT, IMG_WIDTH])

def get_label_image_pair(file_path):
    
    # Find the class name -----------------------------
    segments = tf.strings.split(file_path, os.path.sep)
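    # The second-to-last segment is the directory name, i.e. the class name.
    # Compare it against every class name and keep the matching integer index.
    class_indices = tf.range(num_classes)
    mask = segments[-2] == animals
    label = tf.boolean_mask(class_indices, mask) # Integer class index of shape (1,), not one-hot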
    
    # Get the image in raw format ---------------------
    img = tf.io.read_file(file_path)
    img = decode_img(img)
    return img, label

labeled_ds = list_ds.map(get_label_image_pair) #num_parallel_calls=tf.data.experimental.AUTOTUNE)

labeled_ds = labeled_ds.shuffle(buffer_size=1000).batch(32)
    
for image, label in labeled_ds.take(1):
    print("Image shape: ", image.numpy().shape)
    print("Label: ", len(label.numpy()), label.dtype)

Image shape:  (32, 224, 224, 3)
Label:  32 <dtype: 'int32'>
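
The label lookup in `get_label_image_pair` can be mirrored in plain Python to see what it produces. This is only a sketch: `label_from_path` and `one_hot` are hypothetical helpers, and the one-hot step is shown purely for comparison, since the pipeline above emits integer class indices.

```python
from pathlib import PurePosixPath

animals = ['cat', 'butterfly', 'dog', 'sheep', 'spider',
           'chicken', 'horse', 'squirrel', 'cow', 'elephant']


def label_from_path(file_path):
    """Return the integer class index encoded in the parent directory name."""
    class_name = PurePosixPath(file_path).parts[-2]
    return animals.index(class_name)


def one_hot(index, num_classes):
    """Expand an integer class index into a one-hot list."""
    return [1 if i == index else 0 for i in range(num_classes)]


label = label_from_path('animals/chicken/OIP-mFN57MsrJHRtpCcbq51anQHaCh.jpeg')
print(label)                         # 5
print(one_hot(label, len(animals)))  # [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
```
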


In [4]:
import matplotlib.pyplot as plt

def central_crop_transform(image, label):
  # Keep the central 60% of each image, then scale back to the model input size
  image = tf.image.central_crop(image, central_fraction=0.6)
  image = tf.image.resize(image, [IMG_HEIGHT, IMG_WIDTH])
  return image, label

def rotations_transform(image, label):
  # Despite the name, this is a vertical (upside-down) flip, not an arbitrary rotation
  image = tf.image.flip_up_down(image)
  image = tf.image.resize(image, [IMG_HEIGHT, IMG_WIDTH])
  return image, label

def brightness_transform(image, label):
  # Add 0.25 to every pixel value (images are floats in the [0,1] range)
  image = tf.image.adjust_brightness(image, 0.25)
  image = tf.image.resize(image, [IMG_HEIGHT, IMG_WIDTH])
  return image, label
    
all_datasets = [labeled_ds, labeled_ds.map(central_crop_transform), 
                labeled_ds.map(rotations_transform), labeled_ds.map(brightness_transform)]
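
The list `all_datasets` holds the original and augmented datasets separately. One way to train on all of them at once is to chain them with `Dataset.concatenate`; the sketch below does this on small stand-in datasets. The helper name `combine_datasets` is an assumption, not part of the original notebook.

```python
from functools import reduce

import tensorflow as tf


def combine_datasets(datasets):
    """Chain several tf.data datasets end to end into a single dataset."""
    return reduce(lambda combined, ds: combined.concatenate(ds), datasets)


# Stand-in data: an "original" dataset and one transformed copy of it
original = tf.data.Dataset.range(3)
shifted = original.map(lambda x: x + 10)

combined = combine_datasets([original, shifted])
print([int(x) for x in combined])  # [0, 1, 2, 10, 11, 12]
```

Because concatenation keeps the datasets in order, a further `shuffle` on the combined dataset may be worth adding so that original and augmented batches are interleaved.
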


## 2.5 Implementing the Inception Module using Keras Functional API
* This is the naive Inception Module; the other versions apply the same idea with slight alterations

In [10]:
from tensorflow.keras.layers import Conv2D, MaxPooling2D, concatenate

def naive_inception_module(prev_layer, filters=(64, 128, 32)):
    # Three parallel convolutions with different receptive fields, all applied to the same input
    conv_1by1 = Conv2D(filters[0], kernel_size=(1,1), padding='same', activation='relu')(prev_layer)
    conv_3by3 = Conv2D(filters[1], kernel_size=(3,3), padding='same', activation='relu')(prev_layer)
    conv_5by5 = Conv2D(filters[2], kernel_size=(5,5), padding='same', activation='relu')(prev_layer)
    # Stride-1 pooling branch; 'same' padding keeps every branch at the same spatial size
    conv_pooled = MaxPooling2D((3,3), strides=(1,1), padding='same')(prev_layer)
    return concatenate([conv_1by1, conv_3by3, conv_5by5, conv_pooled], axis=-1) # -1 for depth concat
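
For comparison, here is a minimal sketch of the dimension-reduced variant, which places 1x1 bottlenecks in front of the 3x3 and 5x5 convolutions and after the pooling branch. The function name `reduced_inception_module`, its parameters, and the filter counts (borrowed from the inception (3a) block of the GoogLeNet paper) are assumptions for illustration, not code from this project.

```python
from tensorflow.keras.layers import Conv2D, Input, MaxPooling2D, concatenate


def reduced_inception_module(prev_layer, f1=64, f3=(96, 128), f5=(16, 32), fpool=32):
    """Inception module with 1x1 bottlenecks (hypothetical helper)."""
    def conv(filters, size, x):
        return Conv2D(filters, kernel_size=size, padding='same', activation='relu')(x)

    branch_1by1 = conv(f1, (1, 1), prev_layer)
    # 1x1 bottleneck shrinks the depth before the costlier 3x3 convolution
    branch_3by3 = conv(f3[1], (3, 3), conv(f3[0], (1, 1), prev_layer))
    # Same idea for the 5x5 branch
    branch_5by5 = conv(f5[1], (5, 5), conv(f5[0], (1, 1), prev_layer))
    # Pooling keeps the input depth, so a 1x1 convolution follows it
    pooled = MaxPooling2D((3, 3), strides=(1, 1), padding='same')(prev_layer)
    branch_pool = conv(fpool, (1, 1), pooled)
    return concatenate([branch_1by1, branch_3by3, branch_5by5, branch_pool], axis=-1)


inputs = Input(shape=(28, 28, 192))
outputs = reduced_inception_module(inputs)
print(outputs.shape)  # (None, 28, 28, 256)
```

The output depth is the sum of the last filter count in each branch (64 + 128 + 32 + 32), independent of the input depth, which is not true of the naive module because its pooling branch passes the input channels straight through.
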


## 3. Constructing the model

I am going to use the inception module to build a CNN for this project, using both bottlenecks and dropout to improve model accuracy and reduce computational complexity.

In [11]:
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, Input

# Assumed input shape: RGB images at the size produced by the dataset pipeline
input_shape = (IMG_HEIGHT, IMG_WIDTH, 3)

# Built using Functional so I can use the naive inception module defined above
model_inputs = Input(shape=input_shape)
# In the Functional API only the Input layer needs the shape; later layers infer it
conv1 = Conv2D(32, kernel_size=(5,5))(model_inputs)
conv2 = Conv2D(32, kernel_size=(3,3))(conv1)
conv3 = Conv2D(32, kernel_size=(1,1))(conv2)
pooling = MaxPooling2D(pool_size=(2,2))(conv3)
naive = naive_inception_module(pooling)
conv4 = Conv2D(32, kernel_size=(3,3))(naive)
conv5 = Conv2D(32, kernel_size=(1,1))(conv4)
pooling2 = MaxPooling2D(pool_size=(2,2))(conv5)
predictions = Dense(num_classes, activation='softmax')(Flatten()(pooling2))
model = Model(inputs=model_inputs, outputs=predictions)

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(), 
              metrics=['accuracy'])
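
To sanity-check the layer stack above, the feature-map sizes can be traced by hand. The sketch below does this arithmetic in plain Python, assuming a 224x224x3 input, the Keras defaults used above ('valid' padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`), the default filters of `naive_inception_module`, and 10 classes. The helper names are hypothetical.

```python
def conv_valid(size, kernel):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_valid(size, pool):
    """Spatial size after non-overlapping max pooling with 'valid' padding."""
    return size // pool


size, channels = 224, 3                      # assumed input: 224x224 RGB
size, channels = conv_valid(size, 5), 32     # conv1 -> 220
size, channels = conv_valid(size, 3), 32     # conv2 -> 218
size, channels = conv_valid(size, 1), 32     # conv3 -> 218
size = pool_valid(size, 2)                   # pooling -> 109
channels = 64 + 128 + 32 + channels          # naive inception -> 256 channels
size, channels = conv_valid(size, 3), 32     # conv4 -> 107
size, channels = conv_valid(size, 1), 32     # conv5 -> 107
size = pool_valid(size, 2)                   # pooling2 -> 53

flattened = size * size * channels
dense_weights = flattened * 10 + 10          # 10 classes plus biases
print(size, channels, flattened)  # 53 32 89888
print(dense_weights)              # 898890
```

Under these assumptions, most of the classifier's parameters sit in the final `Dense` layer, because `Flatten` hands it every spatial position of the last feature map.
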

In [13]:
model.fit(all_datasets[0], epochs=5, verbose=1)

Train for 819 steps
Epoch 1/5
 24/819 [..............................] - ETA: 1:17:38 - loss: 4.7542 - accuracy: 0.1780

KeyboardInterrupt: 