<a href="https://colab.research.google.com/github/AntoniaCarrizo/Machine-learning-projects-artificial-intelligence/blob/main/Classifying_Images_Convolutional_network_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Imports

We import the necessary libraries

In [None]:
!pip install -U tensorflow_datasets

Collecting tensorflow_datasets
  Downloading tensorflow_datasets-4.4.0-py3-none-any.whl (4.0 MB)
Installing collected packages: tensorflow-datasets
  Attempting uninstall: tensorflow-datasets
    Found existing installation: tensorflow-datasets 4.0.1
    Uninstalling tensorflow-datasets-4.0.1:
      Successfully uninstalled tensorflow-datasets-4.0.1
Successfully installed tensorflow-datasets-4.4.0


In [None]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import tensorflow_datasets as tfds
import math
tfds.disable_progress_bar()
import os
import matplotlib.pyplot as plt
import numpy as np
import logging
logger = tf.get_logger()
logger.setLevel(logging.ERROR)

We load the CIFAR-10 dataset and immediately separate the training images from the test images.

In [None]:
dataset, metadata = tfds.load(name='cifar10', as_supervised=True, with_info=True)

Downloading and preparing dataset 162.17 MiB (download: 162.17 MiB, generated: 132.40 MiB, total: 294.58 MiB) to /root/tensorflow_datasets/cifar10/3.0.2...
Dataset cifar10 downloaded and prepared to /root/tensorflow_datasets/cifar10/3.0.2. Subsequent calls will reuse this data.


In [None]:
train_dataset, test_dataset = dataset['train'], dataset['test']

# Descriptive data analysis

We check how many examples are available for training and testing. In this case there are 50,000 training examples and 10,000 test examples.

In [None]:
num_train_examples = metadata.splits['train'].num_examples
num_test_examples = metadata.splits['test'].num_examples
print("Number of training examples: {}".format(num_train_examples))
print("Number of test examples:     {}".format(num_test_examples))

Number of training examples: 50000
Number of test examples:     10000


The images are 32 $\times$ 32 arrays with 3 color channels, with pixel values in the range `[0, 255]`. The *labels* are an array of integers in the range `[0, 9]`. Each label corresponds to a *class*:

<table>
  <tr>
    <th>Label</th>
    <th>Class</th>
  </tr>
  <tr>
    <td>0</td>
    <td>Airplane</td>
  </tr>
  <tr>
    <td>1</td>
    <td>Car</td>
  </tr>
    <tr>
    <td>2</td>
    <td>Bird</td>
  </tr>
    <tr>
    <td>3</td>
    <td>Cat</td>
  </tr>
    <tr>
    <td>4</td>
    <td>Deer</td>
  </tr>
    <tr>
    <td>5</td>
    <td>Dog</td>
  </tr>
    <tr>
    <td>6</td>
    <td>Frog</td>
  </tr>
    <tr>
    <td>7</td>
    <td>Horse</td>
  </tr>
    <tr>
    <td>8</td>
    <td>Ship</td>
  </tr>
    <tr>
    <td>9</td>
    <td>Truck</td>
  </tr>
</table>

We map each label value to a class name to make it easier to read.

In [None]:
class_names = ['Airplane', 'Car', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

In [None]:
EPOCHS = 20
BATCH_SIZE = 32
train_dataset = train_dataset.cache().repeat().shuffle(num_train_examples).batch(BATCH_SIZE)
test_dataset = test_dataset.cache().batch(BATCH_SIZE)

We create a preview of a small set of images

In [None]:
# Take one batch from the training set for the preview
sample_training_images, labels = next(iter(train_dataset))

def plotImages(images_arr, labels):
    # Show up to 10 images in a single row, each with its class name below
    fig, axes = plt.subplots(1, 10, figsize=(20, 20))
    axes = axes.flatten()
    for img, ax, lb in zip(images_arr, axes, labels):
        ax.imshow(img)
        ax.set_xlabel(class_names[int(lb)])
    plt.tight_layout()
    plt.show()

plotImages(sample_training_images[:10], labels[:10])

# Preprocessing

We see that the pixel values of our images go well above 1, so the data is not normalized yet.

In [None]:
sample_training_images, labels = next(iter(train_dataset))
print(sample_training_images[:2])

For preprocessing we only normalize the images. We do not use generators, because the dataset comes ready to use from TensorFlow and the images already have a fixed size of 32x32.

The value of each pixel in the image data is an integer in the range `[0,255]`. For the model to work properly, these values need to be normalized to the range `[0,1]`. So here we create a normalization function and then apply it to each image in the test and train datasets. This ensures that every input pixel lies on the same scale, which makes convergence faster while training the network. We divide each element of the training and test sets by the maximum pixel value, that is, 255.

This is what the following code does:

In [None]:
def normalize(images, labels):
  images = tf.cast(images, tf.float32)
  images /= 255
  return images, labels

# The map function applies the normalize function to each element in the train
# and test datasets
train_dataset =  train_dataset.map(normalize)
test_dataset  =  test_dataset.map(normalize)

# Both datasets were already cached before batching. The training set is
# repeated indefinitely, so it is not cached again here (caching an endless
# dataset would keep growing in memory); only the finite test set is.
test_dataset  =  test_dataset.cache()

We check that the images are in a range from 0 to 1:

In [None]:
sample_training_images, labels = next(iter(train_dataset))
print(sample_training_images[:2])
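
Printing whole tensors makes the range hard to read at a glance. A minimal sketch of a more direct check, using a small made-up NumPy array in place of a real batch (the array `fake_batch` and the helper `normalize_array` are illustrative and not part of the notebook):

```python
import numpy as np

def normalize_array(images):
    """Scale uint8 pixel values from [0, 255] to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0

# Hypothetical stand-in for a batch: 2 images of 32x32 pixels, 3 channels.
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(2, 32, 32, 3), dtype=np.uint8)
# Force both extremes to be present so the range check is meaningful.
fake_batch[0, 0, 0, 0] = 0
fake_batch[0, 0, 0, 1] = 255

scaled = normalize_array(fake_batch)
print("before:", fake_batch.min(), fake_batch.max())
print("after: ", scaled.min(), scaled.max())
```

The same idea should carry over to a real batch by calling `.numpy()` on the tensor and printing its minimum and maximum.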

# Define the convolutional neural network model
The model consists of two convolutional layers, each followed by max pooling, then a dropout layer, a fully connected layer with a ReLU activation function, and an output layer with one unit per class.

The activation function used is ReLU, because it is the most common choice for convolutional layers. It sets all values less than zero to zero and leaves positive values unchanged, which is cheap to compute and usually trains faster than saturating activations such as sigmoid.
For the output we rely on softmax, because we are working with categories: softmax takes the input values and transforms them into values between 0 and 1 that sum to 1, so that they can be interpreted as probabilities. The last layer itself outputs raw scores (logits); the softmax is applied inside the loss function through `from_logits=True`.
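
Both functions are simple enough to write by hand. A minimal sketch in plain Python, with made-up input values, of what ReLU and softmax compute (Keras provides its own implementations; the helpers below are only for illustration):

```python
import math

def relu(x):
    """Return x for positive inputs and 0 otherwise."""
    return max(0.0, x)

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing.
    shift = max(scores)
    exps = [math.exp(v - shift) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up values, for illustration only.
activated = [relu(v) for v in [-2.0, 0.0, 3.5]]
probs = softmax([2.0, 1.0, 0.1])
print(activated)
print(probs, sum(probs))
```

The largest score always receives the largest probability, so taking the argmax of the raw scores or of the softmax output selects the same class.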

What happens if we change the kernel size?
- 3x3:
- 5x5:
- 7x7:
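
One effect of the kernel size can be computed without training. With no padding and stride 1 (the Keras `Conv2D` defaults), a k x k kernel shrinks each spatial dimension by k - 1, and each filter over an RGB input holds k * k * 3 weights plus a bias. A small sketch, with illustrative helper names:

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Spatial output size of a convolution without padding ('valid')."""
    return (input_size - kernel_size) // stride + 1

def weights_per_filter(kernel_size, in_channels):
    """Trainable values in one filter: kernel weights plus one bias."""
    return kernel_size * kernel_size * in_channels + 1

for k in (3, 5, 7):
    out = conv_output_size(32, k)
    print(f"{k}x{k}: output {out}x{out}, {weights_per_filter(k, 3)} weights per filter")
```

This only covers shapes and parameter counts; the effect on accuracy has to be measured by training.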

What happens if we change the pooling?
- Max pooling:
- Average pooling:
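
The mechanical difference between the two pooling types can be seen on a tiny example. A sketch in plain Python, assuming a 2x2 window with stride 2 and a made-up 4x4 feature map (the helper `pool_2x2` is illustrative, not a Keras function):

```python
def pool_2x2(image, reduce_fn):
    """Apply a 2x2 pooling window with stride 2 to a 2-D list of numbers."""
    pooled = []
    for r in range(0, len(image) - 1, 2):
        row = []
        for c in range(0, len(image[0]) - 1, 2):
            window = [image[r][c], image[r][c + 1],
                      image[r + 1][c], image[r + 1][c + 1]]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Made-up 4x4 feature map, for illustration only.
feature_map = [
    [1, 3, 2, 0],
    [5, 7, 4, 2],
    [0, 0, 9, 1],
    [0, 4, 1, 1],
]
max_pooled = pool_2x2(feature_map, max)    # keeps the strongest value per window
avg_pooled = pool_2x2(feature_map, mean)   # averages each window
print(max_pooled)
print(avg_pooled)
```

Max pooling keeps isolated strong activations (the 9 in the lower right window survives), while average pooling dilutes them.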

What happens when we add more layers and more neurons?

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (5,5), activation='relu', input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(64, (5,5), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    
    tf.keras.layers.Dropout(0.5),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(10)
])

## Compile the model

Parameters used:
- Loss function: sparse categorical crossentropy computes the crossentropy loss between the labels and the predictions. It is used when there are two or more classes and the labels are given as integers; here we have 10 classes with labels from 0 to 9, so it fits our case.
- Batch size: the best value we obtained was 32. With a higher value, for example 64, the accuracy decreased and the other metrics were worse than with 32.
- Epochs: the highest number that did not cause overfitting was
- Optimizer: Adam is a common default because it adapts the learning rate of each parameter and usually achieves good results quickly on large amounts of data without much tuning.
- Metrics:
  * Categorical accuracy: calculates how often the predictions match the labels. Because our labels are integers rather than one-hot vectors, the sparse variant of this metric is the one that gives meaningful values.
  * Accuracy: the proportion of correct predictions among the total number of cases examined. It suits classification problems: the higher this number, the more predictions are correct.
  * Mean absolute error: calculates the mean absolute error between labels and predictions. For class indices this number has no direct interpretation, so we rely on accuracy to judge the model.
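
To make the loss and the accuracy metric concrete, here is a minimal sketch in plain Python of what they compute for integer labels and raw scores (logits). The logits and labels are made up, and the helper functions are illustrative rather than the Keras implementations:

```python
import math

def sparse_categorical_crossentropy(logits, label):
    """Cross-entropy for one example: -log(softmax(logits)[label])."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(v - shift) for v in logits))
    return log_sum_exp - logits[label]

def accuracy(batch_logits, labels):
    """Fraction of examples whose highest-scoring class equals the label."""
    hits = sum(
        1 for logits, label in zip(batch_logits, labels)
        if max(range(len(logits)), key=logits.__getitem__) == label
    )
    return hits / len(labels)

# Made-up logits for 3 classes, for illustration only.
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.5, 0.0]]
labels = [0, 2, 0]
losses = [sparse_categorical_crossentropy(l, y) for l, y in zip(batch_logits, labels)]
print(losses)
print(accuracy(batch_logits, labels))
```

The third example is misclassified, so it contributes the largest loss and lowers the accuracy. A model that outputs identical scores for all 10 classes would have a loss of log(10), about 2.3, which is a useful reference point for the first epoch.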

In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              # Labels are integers, so the sparse accuracy metric is used
              metrics=['accuracy', 'sparse_categorical_accuracy', 'MeanAbsoluteError'])

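We look at the summary table of our model. The number of parameters grows from layer to layer up to the first dense layer, because each convolutional layer increases the number of channels and the flattened output feeds every neuron of the dense layer: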

In [None]:
model.summary()
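
The numbers in the summary can be reproduced by hand. A sketch in plain Python that traces the output shape and the parameter count through the layers defined above, assuming the Keras defaults of no padding and stride 1 for `Conv2D` (the helper names are illustrative):

```python
def conv_params(kernel, in_ch, filters):
    """Weights of all filters plus one bias per filter."""
    return kernel * kernel * in_ch * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

size, channels = 32, 3
total = 0

# Two blocks of Conv2D (5x5, no padding) followed by 2x2 max pooling.
for filters in (32, 64):
    params = conv_params(5, channels, filters)
    size = size - 5 + 1      # 'valid' convolution
    size = size // 2         # 2x2 pooling with stride 2
    channels = filters
    total += params
    print(f"conv {filters}: {params} params, output {size}x{size}x{channels}")

flat = size * size * channels  # Flatten; Dropout adds no parameters
for units in (100, 10):
    params = dense_params(flat, units)
    total += params
    print(f"dense {units}: {params} params")
    flat = units

print("total:", total)
```

Most of the parameters sit in the first dense layer, which connects the 1,600 flattened values to 100 neurons.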

## Train the model

In [None]:
history=model.fit(train_dataset, epochs=EPOCHS, steps_per_epoch=math.ceil(num_train_examples/BATCH_SIZE), validation_data=test_dataset, validation_steps=math.ceil(num_test_examples/BATCH_SIZE))
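
Because the training dataset is repeated indefinitely, Keras cannot infer where an epoch ends, so the number of steps is passed explicitly. A small sketch of the arithmetic, using the example counts reported earlier:

```python
import math

num_train_examples = 50000
num_test_examples = 10000
BATCH_SIZE = 32

# Rounding up makes sure the final, partial batch is included.
steps_per_epoch = math.ceil(num_train_examples / BATCH_SIZE)
validation_steps = math.ceil(num_test_examples / BATCH_SIZE)

# Size of the last, partial validation batch.
last_batch = num_test_examples - (validation_steps - 1) * BATCH_SIZE
print(steps_per_epoch, validation_steps, last_batch)
```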

## Results and statistics

We evaluate the accuracy.

We check how the model performs on the test dataset, using all the examples in it to assess the accuracy. Ideally the value obtained is close to the training accuracy obtained previously, which indicates that there was no overfitting.

In [None]:
test_loss, test_accuracy, test_categorical_accuracy, test_mean_absolute_error = model.evaluate(test_dataset, steps=math.ceil(num_test_examples/BATCH_SIZE))
print('Accuracy on test dataset:', test_accuracy)

We plot the training and validation accuracy and the training and validation loss. This helps us read the results better; ideally the training and validation curves stay close to each other.
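
The same comparison can also be put into numbers. A minimal sketch, assuming a dictionary with the same keys as `history.history`; the values below are made up and are NOT results from this notebook, and `summarize_history` is an illustrative helper:

```python
def summarize_history(history_dict):
    """Return the epoch with the lowest validation loss and the final accuracy gap."""
    val_loss = history_dict["val_loss"]
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    gap = history_dict["accuracy"][-1] - history_dict["val_accuracy"][-1]
    return best_epoch, gap

# Made-up numbers, for illustration only.
example = {
    "accuracy":     [0.45, 0.60, 0.70, 0.78],
    "val_accuracy": [0.50, 0.62, 0.66, 0.67],
    "val_loss":     [1.40, 1.10, 1.00, 1.05],
}
best_epoch, gap = summarize_history(example)
print(best_epoch, round(gap, 2))
```

A growing gap together with a validation loss that starts rising again is the usual sign of overfitting. With a real run, `history.history` can be passed in directly.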

In [None]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(EPOCHS)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.savefig('./foo.png')
plt.show()

We show some results:

In [None]:
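for test_images, test_labels in test_dataset.take(1):
  test_images = test_images.numpy()
  test_labels = test_labels.numpy()
  # The model outputs logits; softmax turns them into probabilities for plotting
  predictions = tf.nn.softmax(model.predict(test_images)).numpy()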

In [None]:
predictions.shape

In [None]:
def plot_image(i, predictions_array, true_labels, images):
  predictions_array, true_label, img = predictions_array[i], true_labels[i], images[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])
  
  plt.imshow(img[...,0], cmap=plt.cm.binary)

  predicted_label = np.argmax(predictions_array)
  if predicted_label == true_label:
    color = 'blue'
  else:
    color = 'red'
  
  plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                100*np.max(predictions_array),
                                class_names[true_label]),
                                color=color)

def plot_value_array(i, predictions_array, true_label):
  predictions_array, true_label = predictions_array[i], true_label[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])
  thisplot = plt.bar(range(10), predictions_array, color="#777777")
  plt.ylim([0, 1]) 
  predicted_label = np.argmax(predictions_array)
  
  thisplot[predicted_label].set_color('red')
  thisplot[true_label].set_color('blue')

We show some images with their corresponding predictions. The images are displayed in grayscale (only the first color channel is plotted) to keep the plotting code simple; this only affects the display, not the predictions.

In [None]:
# Plot the first X test images, their predicted label, and the true label
# Color correct predictions in blue, incorrect predictions in red
num_rows = 6
num_cols = 4
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
  plt.subplot(num_rows, 2*num_cols, 2*i+1)
  plot_image(i, predictions, test_labels, test_images)
  plt.subplot(num_rows, 2*num_cols, 2*i+2)
  plot_value_array(i, predictions, test_labels)

## Comparing the network before and after dropout

## Test the model with the requested image

In [None]:
import matplotlib.image as mpimg
import cv2

We mount Google Drive to load the image

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
img_path = '/content/drive/MyDrive/Colab Notebooks/sample_image-1.png'

In [None]:
original_img = mpimg.imread(img_path)[:,:,:3]

We check whether the image is normalized

In [None]:
print(original_img)

In [None]:
plt.imshow(original_img, interpolation='none')
plt.show()

We resize the image to 32x32

In [None]:
res = cv2.resize(original_img, dsize=(32, 32), interpolation=cv2.INTER_CUBIC)

We make the prediction

In [None]:
# Wrap the resized image in a batch of one before predicting
prediction = model.predict(np.array([res]))[0]
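
The model expects a batch of RGB images, which is why a single image is wrapped in an extra leading dimension and any alpha channel is dropped first. A small NumPy sketch of the shapes involved, using a made-up array in place of the real, already resized image:

```python
import numpy as np

# Hypothetical stand-in for a resized RGBA PNG: 32 x 32 pixels, 4 channels.
rgba_image = np.zeros((32, 32, 4), dtype=np.float32)

rgb_image = rgba_image[:, :, :3]            # drop the alpha channel
batch = np.expand_dims(rgb_image, axis=0)   # add the leading batch dimension
print(rgb_image.shape, batch.shape)
```

Taking element `[0]` of the prediction afterwards removes the batch dimension again, leaving one score per class.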

In [None]:
prediction.shape

In [None]:
np.argmax(prediction)

In [None]:
class_names[np.argmax(prediction)]