<a href="https://colab.research.google.com/github/daalopezm/Sign-Lenguaje-DL/blob/main/Sign_Lenguaje_Model_Trainner.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
from google.colab import drive
#drive.mount('/content/gdrive/MyDrive/Portafolio/SignLenguaje/')
drive.mount('/content/gdrive/')

Mounted at /content/gdrive/


<h1>Loading the database</h1>
<p>Here we load the database, which is stored as CSV files. Each row holds one image: a label followed by 784 pixel values, so every row must be reshaped into a $28\times 28$ array.

We use pandas to read the CSV files.</p>
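
As a minimal sketch of that reshape, the snippet below builds a synthetic row in the layout the CSV is assumed to have (a label followed by 784 pixel values) and turns it into a $28\times 28$ array. The pixel values are invented for illustration; only the layout mirrors the dataset.

```python
import numpy as np

# Synthetic stand-in for one CSV row: a label followed by 784 pixel values.
# (The values are invented; the real rows come from sign_mnist_train.csv.)
label = 3
pixels = np.arange(784) % 256              # 784 values in the range 0..255
row = np.concatenate(([label], pixels))

# Drop the label, cast to 8-bit grayscale and restore the 28x28 layout.
image = row[1:785].astype(np.uint8).reshape(28, 28)

print(image.shape)    # (28, 28)
print(image[1, 0])    # 29th pixel of the row = first pixel of the second image line
```

Row-major reshaping means that pixel number `k` (counting from 1) lands at `image[(k - 1) // 28, (k - 1) % 28]`.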

In [2]:
import pandas as pd
import os
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

WORK_DIRECTORY='/content/gdrive/MyDrive/Portafolio/SignLenguaje/'

file_train = os.path.join(WORK_DIRECTORY,'sign_mnist_train/sign_mnist_train.csv')
file_test = os.path.join(WORK_DIRECTORY,'sign_mnist_test/sign_mnist_test.csv')

data_set = {'TRAIN': pd.read_csv(file_train), 'TEST': pd.read_csv(file_test)}

<h1>Dataset Generators</h1>
<p>Memory is limited, so it is better to feed the images through a Python generator than to hold them all in memory at once. For the Keras directory generator to infer the classes, the images must first be separated into one folder per category.</p>
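
To illustrate why a generator helps with memory, here is a minimal sketch in plain Python. The function `batch_generator` and the file names are hypothetical and are not part of Keras; the point is only that each batch is produced when it is requested.

```python
def batch_generator(items, batch_size):
    """Yield successive batches of `items` without building them all at once."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Hypothetical file names standing in for images on disk.
file_names = [f"image_{i}.jpg" for i in range(10)]

batches = batch_generator(file_names, batch_size=4)
first_batch = next(batches)              # only this batch is materialised
remaining = [len(b) for b in batches]    # the rest is consumed lazily

print(first_batch)   # ['image_0.jpg', 'image_1.jpg', 'image_2.jpg', 'image_3.jpg']
print(remaining)     # [4, 2]
```

The Keras generators used in this notebook follow the same idea, but they also read and decode the image files for each batch.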

In [3]:
from PIL import Image
from uuid import uuid4
import string

In [4]:
paths = {
    'TRAIN_LOCATION_IMAGES': 'data_train',
    'TEST_LOCATION_IMAGES': 'data_test'
}

# Creating test-train directories
for path in paths.values():
  if not os.path.exists(path):
    os.makedirs(path)

# Defining letters.
alphabetic_letters = np.array([char for char in string.ascii_lowercase if char not in ('j', 'z')])
numeric_letters = np.unique(data_set['TRAIN']['label'])

In [5]:
# Write every image into a sub-folder named after its letter
dataset_types = ['TRAIN', 'TEST']
for dataset_type in dataset_types:

  for letter in numeric_letters:

    # Map the numeric label to its letter
    letra = alphabetic_letters[numeric_letters==letter][0]

    # One folder per class, as flow_from_directory expects
    letter_dir = os.path.join(paths[f'{dataset_type}_LOCATION_IMAGES'], f'{letra}')
    os.makedirs(letter_dir, exist_ok=True)

    df_letter = data_set[dataset_type][data_set[dataset_type]['label']==letter]
    for i in range(len(df_letter)):
      # Columns 1..784 hold the pixels; reshape the row into a 28x28 grayscale image
      image = np.array(df_letter.iloc[i,1:785], dtype=np.uint8).reshape(28,28)
      img = Image.fromarray(image)
      img.save(os.path.join(letter_dir, f'{uuid4()}_{letra}.jpg'))

In [6]:
import matplotlib.image as mpimg
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [7]:
train_dir = paths['TRAIN_LOCATION_IMAGES']
test_dir = paths['TEST_LOCATION_IMAGES']

train_datagen = ImageDataGenerator(rescale = 1/255)
test_datagen = ImageDataGenerator(rescale = 1/255, validation_split=0.2)

In [8]:
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size = (28,28),
    batch_size = 128,
    class_mode = 'categorical',
    color_mode = 'grayscale',
    subset = 'training'
)

validation_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size = (28,28),
    batch_size = 128,
    class_mode = 'categorical',
    color_mode = 'grayscale',
    subset = 'validation'
)

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size = (28,28),
    batch_size = 128,
    class_mode = 'categorical',
    color_mode = 'grayscale'
)

Found 27455 images belonging to 24 classes.
Found 1425 images belonging to 24 classes.
Found 7172 images belonging to 24 classes.


<h1>Neural Network Model</h1>
<p>Here we create our model. Because we will work with several models, let's call the first one <i>model_base</i>.

The layers are stacked in order. The first layer is the entry point of the network.

For a fully connected network this entry is:
<i>tf.keras.layers.Flatten</i>
<b>Flatten</b> turns each image, which in this case is $28\times 28$, into a single vector of 784 values. In the model built below, a convolution and a pooling layer come first and <b>Flatten</b> is applied just before the dense layers.

The size of the output layer must equal the number of classes, which in our case is:
<i>len(alphabetic_letters)</i>
</p>
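
The layer sizes can be checked with a little arithmetic. The sketch below assumes the Keras defaults for the layers used later in this notebook (`Conv2D(16, (3,3))` with no padding and stride 1, `MaxPool2D((2,2))` with stride equal to the pool size). The helper functions are hypothetical and only reproduce the shape formulas.

```python
def conv2d_valid(size, kernel):
    """Output side length of an unpadded ('valid'), stride-1 convolution."""
    return size - kernel + 1

def max_pool(size, pool):
    """Output side length of non-overlapping pooling (stride == pool size)."""
    return size // pool

side = 28                        # input images are 28x28x1
side = conv2d_valid(side, 3)     # 3x3 convolution: 26
side = max_pool(side, 2)         # 2x2 max pooling: 13
filters = 16
flat_features = side * side * filters   # length of the vector produced by Flatten

n_classes = 26 - 2               # the alphabet without 'j' and 'z'

print(side, flat_features, n_classes)   # 13 2704 24
```

So the first dense layer receives 2704 features per image, and the softmax layer outputs 24 probabilities.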

<h3>Optimizer</h3>
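<p>Here we choose our optimizer, namely how gradient descent will be applied. The Adam optimizer adapts the step size of each parameter using running averages of the gradient and of its square. It stores extra state per parameter, which makes it somewhat more demanding than plain gradient descent, but it often converges faster.
</p>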
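
As a rough illustration of the update rule, the sketch below applies Adam to a single scalar parameter. It is a simplified, hypothetical re-implementation for teaching purposes, not the code Keras runs; `adam_step` and the toy objective are assumptions.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of the gradient
    v = beta2 * v + (1 - beta2) * grad ** 2    # running mean of its square
    m_hat = m / (1 - beta1 ** t)               # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy objective f(theta) = theta**2, whose gradient is 2 * theta.
theta, m, v = 1.0, 0.0, 0.0
history = []
for t in range(1, 4):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
    history.append(theta)

print(history)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly `lr` regardless of the gradient's scale.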

<h3>Loss function</h3>
<p>categorical_crossentropy is the standard loss for multi-class classification with one-hot labels and a softmax output, which is the setup used here, so we select it for our problem.</p>
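
For a single sample, this loss is $-\sum_i y_i \log p_i$, where $y$ is the one-hot label and $p$ the predicted probabilities. The sketch below computes it in plain Python; the function and the example probabilities are illustrative assumptions, not the Keras implementation.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # Clip probabilities away from zero so that log() stays finite.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0.0, 1.0, 0.0]        # the sample belongs to class 1
confident = [0.1, 0.8, 0.1]     # most of the mass on the right class
unsure = [0.4, 0.3, 0.3]        # mass spread over all classes

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)

print(round(loss_confident, 4), round(loss_unsure, 4))
```

Only the probability assigned to the true class contributes, so the loss shrinks as that probability approaches 1.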

<h3>Keras Tuner</h3>
<p>To explore the hyperparameters and find the configuration with the best validation accuracy, we use the Keras Tuner tool, which must be installed first.</p>
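
To get a feeling for the size of the search, the sketch below enumerates the space defined later in this notebook by `hp.Int('units', 32, 512, step=32)` and `hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])`. It uses plain Python rather than Keras Tuner, and the variable names are assumptions.

```python
from itertools import product

units_choices = list(range(32, 512 + 1, 32))   # 32, 64, ..., 512
learning_rates = [1e-2, 1e-3, 1e-4]

# Every (units, learning_rate) pair the tuner could try.
search_space = list(product(units_choices, learning_rates))

print(len(units_choices), len(search_space))   # 16 48
```

Hyperband does not necessarily train all 48 combinations to completion; it samples configurations and gives more epochs to the promising ones.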

In [9]:
!pip install keras-tuner --upgrade

Collecting keras-tuner
  Downloading keras_tuner-1.1.2-py3-none-any.whl (133 kB)

In [18]:
def model_constructor(hp):
  model = tf.keras.models.Sequential()
  model.add(tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape = (28,28,1)))
  model.add(tf.keras.layers.MaxPool2D((2,2)))
  model.add(tf.keras.layers.Flatten())

  hp_units = hp.Int('units', min_value = 32, max_value = 512, step = 32)
  model.add(tf.keras.layers.Dense(units=hp_units, activation='relu',kernel_regularizer=tf.keras.regularizers.L2(l2=1e-5)))
  model.add(tf.keras.layers.Dropout(0.22))
  model.add(tf.keras.layers.Dense(units=128, activation='relu',kernel_regularizer=tf.keras.regularizers.L2(l2=1e-5)))
  model.add(tf.keras.layers.Dropout(0.22))
  model.add(tf.keras.layers.Dense(units=len(numeric_letters), activation='softmax'))

  hp_learning_rate = hp.Choice('learning_rate', values = [1e-2, 1e-3, 1e-4])

  model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),loss = 'categorical_crossentropy', metrics = ['accuracy'] )
  
  return model  

In [11]:
import keras_tuner as kt

In [20]:
auto_tuner = kt.Hyperband(
    model_constructor,
    objective='val_accuracy',
    max_epochs = 10,
    factor=3,
    directory = 'models/',
    project_name = 'Sign_Lenguaje_Model_Trainner'
    )

In [21]:
auto_tuner.search(train_generator, epochs=10, validation_data=validation_generator)

best_hps = auto_tuner.get_best_hyperparameters(num_trials=2)

Trial 30 Complete [00h 03m 41s]
val_accuracy: 0.8582456111907959

Best val_accuracy So Far: 0.8771929740905762
Total elapsed time: 00h 38m 46s
INFO:tensorflow:Oracle triggered exit


In [22]:
best_hyperparameters = auto_tuner.get_best_hyperparameters(num_trials=2)[0]

In [27]:
callback_early_stopping = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3, mode='min')
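
The patience rule behind this callback can be sketched in a few lines. This is a simplified illustration of the idea (stop after `patience` epochs without a new minimum), not the Keras source; the function name and the loss values are invented.

```python
def early_stopping_epoch(losses, patience=3):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best:          # improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:                    # no improvement: one more epoch of waiting
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Invented loss curve: improves twice, then stalls for three epochs.
losses = [1.0, 0.8, 0.81, 0.82, 0.83, 0.5]
stop_at = early_stopping_epoch(losses, patience=3)

print(stop_at)   # 4
```

Training stops at the fifth epoch (index 4), so the later value 0.5 is never reached.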

In [None]:
hypermodel=auto_tuner.hypermodel.build(best_hyperparameters)

<h1>Saving the model</h1>
<p>Here we save the model: first the architecture in JSON format, which is basically a dictionary, and then the weights in a checkpoint file.</p>
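
The JSON step is an ordinary dictionary round trip. The sketch below uses a hypothetical, heavily trimmed configuration dictionary in place of the real <i>get_config()</i> output, which is much larger.

```python
import json

# Hypothetical, trimmed-down stand-in for a model configuration.
config = {
    "name": "sequential",
    "layers": [
        {"class_name": "Flatten", "config": {"name": "flatten"}},
        {"class_name": "Dense", "config": {"units": 24, "activation": "softmax"}},
    ],
}

serialized = json.dumps(config)      # dictionary -> JSON text
restored = json.loads(serialized)    # JSON text -> dictionary

print(restored == config)                        # True
print(restored["layers"][1]["config"]["units"])  # 24
```

The round trip works because the dictionary holds only strings, numbers, lists and nested dictionaries. The JSON file describes the architecture only, so the weights are stored separately.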

<h3>Saving the architecture</h3>

In [29]:
hypermodel_architecture = hypermodel.get_config() 

In [30]:
import json

with open(os.path.join(WORK_DIRECTORY,'hypermodel_architecture.json'), 'w') as fp:
    json.dump(hypermodel_architecture, fp)

<h3> Saving the weights</h3>

In [31]:
from tensorflow.keras.callbacks import ModelCheckpoint

In [34]:
checkpoint_path = os.path.join(WORK_DIRECTORY, 'checkpoints')
checkpoint_weights = ModelCheckpoint(
    filepath = checkpoint_path,
    save_freq = 'epoch',
    save_weights_only = True,
    verbose = 1
)

In [35]:
trainner_hypermodel = hypermodel.fit(
    train_generator,
    epochs = 20,
    callbacks = [callback_early_stopping, checkpoint_weights],
    validation_data = validation_generator
)

Epoch 1/20
Epoch 1: saving model to /content/gdrive/MyDrive/Portafolio/SignLenguaje/checkpoints
Epoch 2/20
Epoch 2: saving model to /content/gdrive/MyDrive/Portafolio/SignLenguaje/checkpoints
Epoch 3/20
Epoch 3: saving model to /content/gdrive/MyDrive/Portafolio/SignLenguaje/checkpoints
Epoch 4/20
Epoch 4: saving model to /content/gdrive/MyDrive/Portafolio/SignLenguaje/checkpoints
Epoch 5/20
Epoch 5: saving model to /content/gdrive/MyDrive/Portafolio/SignLenguaje/checkpoints
Epoch 6/20
Epoch 6: saving model to /content/gdrive/MyDrive/Portafolio/SignLenguaje/checkpoints
Epoch 7/20
Epoch 7: saving model to /content/gdrive/MyDrive/Portafolio/SignLenguaje/checkpoints
Epoch 8/20
Epoch 8: saving model to /content/gdrive/MyDrive/Portafolio/SignLenguaje/checkpoints
