# MobileNets

MobileNets are a class of small, low-power, low-latency models that can be used for tasks such as:
- classification
- detection

Because of their small size, these models are well suited to mobile devices.

Model | Size (MB) | Parameters (millions) | Accuracy
----|----|----|----
VGG16 | 533 | 138 | High
MobileNet | 17 | 4.2 | Almost as high

The [MobileNets paper](https://arxiv.org/pdf/1704.04861.pdf) compares MobileNet's accuracy with that of larger models.
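
The sizes in the table follow roughly from the parameter counts. A minimal sketch, assuming each parameter is stored as a 32-bit float (4 bytes) and ignoring file-format overhead; `approx_size_mb` is a hypothetical helper name, not part of Keras:

```python
def approx_size_mb(n_params, bytes_per_param=4):
    """Approximate size in megabytes of a model with n_params weights."""
    return n_params * bytes_per_param / 1e6

# parameter counts (in millions) taken from the table above
for name, millions in [("VGG16", 138), ("MobileNet", 4.2)]:
    print(f"{name}: ~{approx_size_mb(millions * 1e6):.0f} MB")
```

Under this assumption MobileNet comes out at about 17 MB, in line with the table, and VGG16 lands in the same range as the listed 533 MB.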

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import categorical_crossentropy
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model
from tensorflow.keras.applications import imagenet_utils
from sklearn.metrics import confusion_matrix
import itertools
import os
import shutil
import random
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# check the GPU
physical_devices = tf.config.experimental.list_physical_devices('GPU')
print("Num GPUs Available: ", len(physical_devices))
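# enable memory growth only when a GPU is present (indexing an empty list raises IndexError)
if physical_devices:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)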

In [None]:
# Load model
mobile = tf.keras.applications.mobilenet.MobileNet()

# Data Preparation

In [None]:
# save your images in the path specified inside this function
def prepare_image(file):

    # path where you saved your images
    img_path = 'data/MobileNet-samples/'
    # load the image and resize it to (224, 224)
    img = image.load_img(img_path + file, target_size=(224, 224))
    # transform the image into an array
    img_array = image.img_to_array(img)
    # add the batch dimension required by the MobileNet model
    img_array_expanded_dims = np.expand_dims(img_array, axis=0)

    # return the image preprocessed for the MobileNet model
    return tf.keras.applications.mobilenet.preprocess_input(img_array_expanded_dims)
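
For MobileNet, `preprocess_input` rescales pixel values from the [0, 255] range to [-1, 1]. The NumPy sketch below mimics that scaling and the batch-dimension step on a made-up image; `scale_like_mobilenet` is a hypothetical stand-in for illustration, not the Keras function itself:

```python
import numpy as np

def scale_like_mobilenet(img_array):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return img_array.astype("float32") / 127.5 - 1.0

# made-up 224x224 RGB image: all black except one white pixel
img_array = np.zeros((224, 224, 3), dtype="uint8")
img_array[0, 0] = 255

# add the batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
batch = np.expand_dims(img_array, axis=0)
scaled = scale_like_mobilenet(batch)

print(batch.shape)
print(scaled.min(), scaled.max())
```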

# Display Images and analyse model Prediction Results

**Image 1**

In [None]:
# Display an image from our image folder
from IPython.display import Image
Image(filename='data/MobileNet-samples/1.PNG', width=300)

In [None]:
# Preprocess the image for MobileNet
preprocessed_image = prepare_image('1.PNG')

# Make a prediction using the model
predictions = mobile.predict(preprocessed_image)

# decode the top predictions into readable class names
results = imagenet_utils.decode_predictions(predictions)

# display results
print(results)

# check that the top prediction is correct
assert results[0][0][1] == 'American_chameleon'
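
`decode_predictions` returns one list per input image, and each list holds `(class_id, class_name, score)` tuples sorted from most to least likely. The index `results[0][0][1]` therefore reads: first image, top prediction, class name. A sketch with placeholder IDs and scores (invented for illustration, not real model output):

```python
# placeholder structure mimicking decode_predictions output for one image;
# the IDs and scores are invented, only the nesting matters here
example_results = [[
    ("id_a", "American_chameleon", 0.60),
    ("id_b", "green_lizard", 0.25),
    ("id_c", "agama", 0.10),
]]

first_image = example_results[0]             # predictions for the first image
class_id, class_name, score = first_image[0]  # its top prediction
print(class_name)
```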

**Image 2**

In [None]:
# display() is needed here because only a cell's last expression is rendered automatically
display(Image(filename='data/MobileNet-samples/2.PNG', width=300))

# Preprocess the image for MobileNet
preprocessed_image = prepare_image('2.PNG')

# Make a prediction using the model
predictions = mobile.predict(preprocessed_image)

# decode the top predictions into readable class names
results = imagenet_utils.decode_predictions(predictions)

# display results
print(results)

# check that the top prediction is correct
assert results[0][0][1] == 'espresso'

**Image 3**

In [None]:
# display() is needed here because only a cell's last expression is rendered automatically
display(Image(filename='data/MobileNet-samples/3.PNG', width=300))

# Preprocess the image for MobileNet
preprocessed_image = prepare_image('3.PNG')

# Make a prediction using the model
predictions = mobile.predict(preprocessed_image)

# decode the top predictions into readable class names
results = imagenet_utils.decode_predictions(predictions)

# display results
print(results)

# check that the top prediction is correct
assert results[0][0][1] == 'strawberry'

# MobileNet Fine-Tune

The dataset we work with here is very different from ImageNet, the dataset MobileNet was originally trained on.

We are working with a dataset of sign language digit images. There are ten classes, 0 to 9, and each class contains images of the hand sign for that digit. The Kaggle version of the images is grayscale; the GitHub version is RGB.

Download the dataset here:
- [kaggle](https://www.kaggle.com/ardamavi/sign-language-digits-dataset): grayscale images
- [github](https://github.com/ardamavi/Sign-Language-Digits-Dataset): RGB images (we'll use this)


# Image preparation

- 10 classes (digits 0-9)
- Class data:
    - class 0: 205 images
    - class 1: 206 images
    - class 2: 206 images
    - class 3: 206 images
    - class 4: 207 images
    - class 5: 207 images
    - class 6: 207 images
    - class 7: 206 images
    - class 8: 208 images
    - class 9: 204 images
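
The per-class counts above determine how many images end up in each split. A quick check of the arithmetic, assuming 30 validation and 5 test images are held out per class (the split used in this notebook):

```python
# images per class, as listed above
class_counts = {0: 205, 1: 206, 2: 206, 3: 206, 4: 207,
                5: 207, 6: 207, 7: 206, 8: 208, 9: 204}

n_valid_per_class = 30
n_test_per_class = 5

total = sum(class_counts.values())
n_valid = n_valid_per_class * len(class_counts)
n_test = n_test_per_class * len(class_counts)
n_train = total - n_valid - n_test

print(total, n_train, n_valid, n_test)  # 2062 1712 300 50
```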

## Organize data into train, valid, test dirs

In [None]:
# Work inside the dataset folder
os.chdir('data/Sign-Language-Digits-Dataset')

# Create the train/valid/test folders only if they do not exist yet
if not os.path.isdir('train/0'):
    os.mkdir('train')
    os.mkdir('valid')
    os.mkdir('test')

    # loop over each class
    for i in range(0, 10):
        # move the whole class folder into train
        shutil.move(f'{i}', 'train')
        # create a folder for each class
        os.mkdir(f'valid/{i}')
        os.mkdir(f'test/{i}')

        # randomly move 30 images from train to the validation folder
        valid_samples = random.sample(os.listdir(f'train/{i}'), 30)
        for j in valid_samples:
            shutil.move(f'train/{i}/{j}', f'valid/{i}')

        # randomly move 5 images from train to the test folder
        test_samples = random.sample(os.listdir(f'train/{i}'), 5)
        for j in test_samples:
            shutil.move(f'train/{i}/{j}', f'test/{i}')

# go back to the original working directory
os.chdir('../..')
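
The same move-a-random-sample idea can be tried safely on a throwaway directory before touching the real dataset. A minimal sketch, assuming a seeded random generator is wanted for a reproducible split; `move_random_files` and the tiny fake dataset are hypothetical and only for illustration:

```python
import os
import random
import shutil
import tempfile

def move_random_files(src_dir, dst_dir, n, rng):
    """Move n randomly chosen files from src_dir to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    for name in rng.sample(sorted(os.listdir(src_dir)), n):
        shutil.move(os.path.join(src_dir, name), os.path.join(dst_dir, name))

# build a throwaway dataset: 2 classes with 10 empty files each
root = tempfile.mkdtemp()
for label in ("0", "1"):
    os.makedirs(os.path.join(root, "train", label))
    for k in range(10):
        open(os.path.join(root, "train", label, f"img_{k}.jpg"), "w").close()

rng = random.Random(42)  # fixed seed so the split is reproducible
for label in ("0", "1"):
    train_dir = os.path.join(root, "train", label)
    move_random_files(train_dir, os.path.join(root, "valid", label), 3, rng)
    move_random_files(train_dir, os.path.join(root, "test", label), 2, rng)

# count what is left in each split for class "0"
counts = {split: len(os.listdir(os.path.join(root, split, "0")))
          for split in ("train", "valid", "test")}
print(counts)

shutil.rmtree(root)  # clean up the throwaway directory
```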

In [None]:
# check the number of files in each folder
for i in range(0,10):
    assert len(os.listdir(f'data/Sign-Language-Digits-Dataset/valid/{i}')) == 30
    assert len(os.listdir(f'data/Sign-Language-Digits-Dataset/test/{i}')) == 5

In [None]:
train_path = 'data/Sign-Language-Digits-Dataset/train'
valid_path = 'data/Sign-Language-Digits-Dataset/valid'
test_path = 'data/Sign-Language-Digits-Dataset/test'

## Preprocessing data

In [None]:
train_batches = ImageDataGenerator(preprocessing_function=tf.keras.applications.mobilenet.preprocess_input).flow_from_directory(
    directory=train_path, target_size=(224,224), batch_size=10)
valid_batches = ImageDataGenerator(preprocessing_function=tf.keras.applications.mobilenet.preprocess_input).flow_from_directory(
    directory=valid_path, target_size=(224,224), batch_size=10)
test_batches = ImageDataGenerator(preprocessing_function=tf.keras.applications.mobilenet.preprocess_input).flow_from_directory(
    directory=test_path, target_size=(224,224), batch_size=10, shuffle=False)
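
Each generator yields batches of 10 images, so one pass over a split takes `ceil(n / batch_size)` steps. A small sketch of that arithmetic, using the image counts this notebook expects for each split:

```python
import math

batch_size = 10
# image counts per split expected in this notebook
split_sizes = {"train": 1712, "valid": 300, "test": 50}

steps = {name: math.ceil(n / batch_size) for name, n in split_sizes.items()}
print(steps)  # {'train': 172, 'valid': 30, 'test': 5}
```

The last training batch holds only the 2 leftover images, since 1712 is not a multiple of 10.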

In [None]:
# check that each split has the expected number of images and classes
assert train_batches.n == 1712
assert valid_batches.n == 300
assert test_batches.n == 50
assert train_batches.num_classes == valid_batches.num_classes == test_batches.num_classes == 10

# Modify Model

In [None]:
# download the model to your disk: requires an internet connection
mobile = tf.keras.applications.mobilenet.MobileNet()

In [None]:
# check the model architecture 
mobile.summary()
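
The notebook uses a `count_params` helper that is not defined in it. A minimal sketch of one possible version, assuming it should return a dict with `trainable_params` and `non_trainable_params` keys (the keys the notebook's assertions read) and that each weight exposes a `.shape`, as Keras weights do:

```python
import math

def count_params(model):
    """Return the trainable and non-trainable parameter counts of a model."""
    def total(weights):
        # the size of a weight tensor is the product of its dimensions
        return sum(math.prod(int(d) for d in w.shape) for w in weights)

    return {
        'trainable_params': total(model.trainable_weights),
        'non_trainable_params': total(model.non_trainable_weights),
    }
```

It only relies on the `trainable_weights` and `non_trainable_weights` attributes, so it works on any object that provides them.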

In [None]:
# check that the model was downloaded correctly by verifying its parameter counts
# (count_params is expected to return a dict of trainable / non-trainable parameter counts)
params = count_params(mobile)
assert params['non_trainable_params'] == 21888
assert params['trainable_params'] == 4231976

## Fine-Tuning process

This process starts by taking all of the layers up to and including the sixth-to-last layer, so the last five layers are left out.

These are the layers we keep and transfer into the new, fine-tuned model.

How many layers to keep versus drop when fine-tuning a model comes down to experimentation and personal choice; this split was chosen after a little experimenting and testing.

In [None]:
# take the output of the sixth-to-last layer (everything up to it is kept)
x = mobile.layers[-6].output
# attach a new 10-class output layer on top of it.
# This uses the Keras functional API, which is why the call syntax can look a little strange
output = Dense(units=10, activation='softmax')(x)
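
A `Dense` layer with `activation='softmax'` computes `softmax(x @ W + b)`, turning each feature vector into 10 class probabilities that sum to 1. A NumPy sketch of that computation, assuming made-up inputs and random weights (the feature size of 1024 is chosen only for illustration, and `dense_softmax` is a hypothetical name):

```python
import numpy as np

def dense_softmax(x, W, b):
    """Compute softmax(x @ W + b) for each row of x."""
    logits = x @ W + b
    # subtract the row maximum so the exponentials cannot overflow
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 1024))          # 2 made-up feature vectors
W = rng.normal(size=(1024, 10)) * 0.01  # random stand-in weights
b = np.zeros(10)                        # zero biases

probs = dense_softmax(x, W, b)
print(probs.shape)        # one row of 10 probabilities per input
print(probs.sum(axis=1))  # each row sums to 1
```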

In [None]:
# build the new model from the original MobileNet input to the new output layer
model = Model(inputs=mobile.input, outputs=output)

In [None]:
# freeze the weights and biases of every layer except the last 23, so only those 23 are trained
# the choice to train the last 23 layers was reached by experimentation
for layer in model.layers[:-23]:
    layer.trainable = False
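
The slice `[:-23]` selects every layer except the last 23. A sketch of that slicing on stand-in objects (plain namespaces with a `trainable` flag, not real Keras layers; the total of 30 layers is arbitrary):

```python
from types import SimpleNamespace

# stand-in "layers": simple objects with a trainable flag
layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(30)]

n_trainable = 23
for layer in layers[:-n_trainable]:  # everything except the last 23
    layer.trainable = False

frozen = [layer.name for layer in layers if not layer.trainable]
print(len(frozen), len(layers) - len(frozen))  # 7 23
```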

In [None]:
model.summary()

In [None]:
params = count_params(model)
assert params['non_trainable_params'] == 1365184
assert params['trainable_params'] == 1873930

# Train Model

In [None]:
model.compile(optimizer=Adam(learning_rate=0.0001), 
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# fit the model to our data
# run for more epochs (~30) to see better results
model.fit(x=train_batches, validation_data=valid_batches, epochs=30, verbose=2)
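
For reference, the `categorical_crossentropy` loss for a single image can be written as follows, assuming one-hot labels $y$ (the default output of `flow_from_directory`) and softmax outputs $\hat{y}$ over the 10 digit classes:

$$
L = -\sum_{c=0}^{9} y_c \log \hat{y}_c
$$

Because only the true class has $y_c = 1$, this reduces to $-\log \hat{y}_{\text{true}}$: the loss is small when the model assigns a high probability to the correct digit.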

Improvements we can try:
- change the number of frozen layers
- change the number of epochs

# Predict sign language digits

In [None]:
test_labels = test_batches.classes

In [None]:
predictions = model.predict(x=test_batches, verbose=0)

In [None]:
cm = confusion_matrix(y_true=test_labels, y_pred=predictions.argmax(axis=1))
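
`predictions.argmax(axis=1)` picks the most probable class for each image, and the confusion matrix counts how often each true class (row) was predicted as each class (column). A sketch of both steps by hand, on made-up probabilities for 4 images and 3 classes:

```python
import numpy as np

# made-up softmax outputs for 4 images over 3 classes
predictions = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.3, 0.4, 0.3],
                        [0.2, 0.2, 0.6]])
true_labels = np.array([0, 1, 2, 2])

# most probable class per row
predicted_labels = predictions.argmax(axis=1)

n_classes = predictions.shape[1]
cm = np.zeros((n_classes, n_classes), dtype=int)
for t, p in zip(true_labels, predicted_labels):
    cm[t, p] += 1  # rows: true class, columns: predicted class

print(predicted_labels)
print(cm)
```

Here the third image (true class 2) is predicted as class 1, so it lands off the diagonal in row 2, column 1.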

In [None]:
test_batches.class_indices
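
The notebook calls `plot_confusion_matrix` without defining it. A minimal sketch of one possible implementation with Matplotlib, assuming the signature implied by the call in this notebook (`cm`, `classes`, `title`); the `normalize` and `cmap` parameters are extra assumptions:

```python
import itertools

import matplotlib.pyplot as plt
import numpy as np

def plot_confusion_matrix(cm, classes, normalize=False,
                          title='Confusion Matrix', cmap=plt.cm.Blues):
    """Draw a confusion matrix as a heatmap with the value written in each cell."""
    cm = np.asarray(cm)
    if normalize:
        # convert counts to per-row fractions (rows are the true labels)
        cm = cm.astype('float') / cm.sum(axis=1, keepdims=True)

    fig, ax = plt.subplots()
    im = ax.imshow(cm, interpolation='nearest', cmap=cmap)
    fig.colorbar(im, ax=ax)
    ax.set_title(title)

    ticks = np.arange(len(classes))
    ax.set_xticks(ticks)
    ax.set_xticklabels(classes)
    ax.set_yticks(ticks)
    ax.set_yticklabels(classes)

    # write each value, in white on dark cells so it stays readable
    thresh = cm.max() / 2.0
    fmt = '.2f' if normalize else 'd'
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        ax.text(j, i, format(cm[i, j], fmt), ha='center',
                color='white' if cm[i, j] > thresh else 'black')

    ax.set_ylabel('True label')
    ax.set_xlabel('Predicted label')
    fig.tight_layout()
    return ax
```

The function returns the Matplotlib axes, so the figure can be adjusted further or saved after the call.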

In [None]:
# Train the model for more epochs to see better results
cm_plot_labels = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
plot_confusion_matrix(cm=cm, classes=cm_plot_labels, title='Confusion Matrix')