# L09 - DNN

## Author - Rodolfo Lerma

# Problem:
Using the CIFAR-10 dataset, create a new notebook to build a TensorFlow model.


# Abstract:
You start working for a new startup building the next generation search engine. The search engine lets users search images by their content. You are tasked with building a machine learning model that can identify the objects in images. The model will support searching for 10 kinds of objects. Download the L09_ImageClasses.pdf to see a list of the classes in the dataset and 10 random images from each class.

For this project you will use the CIFAR-10 dataset, which consists of 60000 32x32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

The analysis is divided as follows:

### Data Exploration
- **Visual Exploration of the variables**
    - Categorical Variable

### Analysis
- **Neural network review**
    - Baseline model and adjustments
        
### Summary of Findings

- Read CIFAR-10 dataset from Keras.
- Explore data
- Preprocess and prepare data for classification
- Build a TensorFlow model using a single dense hidden layer
- Apply model to test set and evaluate accuracy
- Make 3 adjustments to the number of layers and activation functions to improve accuracy
- Summarize your findings regarding the different iterations and any insights gained

# Data Exploration 

In [None]:
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

#import numpy and matplotlib
import numpy as np
import matplotlib.pyplot as plt

In [None]:
data = tf.keras.datasets.cifar10

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.

The classes in the data set represent the following:
- airplane
- automobile
- bird
- cat
- deer
- dog
- frog
- horse
- ship
- truck

The classes are completely mutually exclusive. There is no overlap between automobiles and trucks. "Automobile" includes sedans, SUVs, things of that sort. "Truck" includes only big trucks. Neither includes pickup trucks. 

In [None]:
(x_train, y_train), (x_test, y_test) = data.load_data()
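
The class balance described above can be checked once the labels are loaded. The following is a minimal sketch that uses synthetic stand-in labels (`y_demo` is a hypothetical name, not part of the notebook) so that it runs without downloading the dataset. It assumes the labels have shape `(n, 1)`, which is how Keras returns them for CIFAR-10.

```python
import numpy as np

# Stand-in for y_train: Keras returns labels with shape (n, 1), dtype uint8.
# Synthetic labels keep the sketch runnable without a download.
rng = np.random.default_rng(0)
y_demo = rng.permutation(np.repeat(np.arange(10, dtype=np.uint8), 50)).reshape(-1, 1)

# Flatten to 1-D before counting; bincount requires a 1-D integer array.
counts = np.bincount(y_demo.ravel(), minlength=10)

print(y_demo.shape)   # (500, 1)
print(counts)         # one count per class
```

Replacing `y_demo` with `y_train` should, if the dataset matches its description, give 5000 for every class.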

## Loading data

In [None]:
plt.figure()
plt.imshow(x_train[0])
plt.colorbar()
plt.grid(False)
plt.show()

The image is 32 x 32 pixels with 3 colour channels, and each pixel value lies between 0 and 255. We therefore divide by 255.0 to scale the values to the range 0 to 1.

In [None]:
x_train, x_test = x_train / 255.0, x_test / 255.0
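
To make the effect of this division concrete, here is a minimal sketch on a random stand-in array (`img` is a hypothetical name; it is not a real CIFAR-10 image). It shows that the data type changes from `uint8` to `float64` and that the values end up between 0 and 1.

```python
import numpy as np

# Stand-in for one image: 32x32 pixels, 3 colour channels, uint8 values.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

scaled = img / 255.0  # true division promotes uint8 to float64

print(img.dtype, img.min(), img.max())
print(scaled.dtype, scaled.min(), scaled.max())
```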

In [None]:
plt.figure()
plt.imshow(x_train[0])
plt.colorbar()
plt.grid(False)
plt.show()

## Formatting names of target variable

More descriptive class names were added, based on the Keras documentation for this dataset.

In [None]:
# Class names indexed by CIFAR-10 label (0-9)
CLASS_NAMES = ['airplane', 'car', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

def names_function(labels):
    # Labels arrive with shape (n, 1); flatten them before the lookup
    return [CLASS_NAMES[int(i)] for i in np.ravel(labels)]

In [None]:
names_train = names_function(y_train)

In [None]:
names_test = names_function(y_test)

## Example of the data for the model

In [None]:
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_train[i], cmap=plt.cm.binary)
    plt.xlabel(names_train[i])
plt.show()

# Analysis

A **grid search** over the hyperparameters would be the better approach. To keep this assignment simple while still showing how much the results depend on these choices, four models are evaluated instead: one baseline (a neural network with a single hidden layer) and three slightly different variants.
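
For reference, the core of a grid search is enumerating every combination of candidate settings. The sketch below only builds that list; the search space is hypothetical and the values are illustrative, not tuned. Training and scoring one model per configuration is left out.

```python
from itertools import product

# Hypothetical search space; the values are illustrative, not tuned.
grid = {
    "hidden_layers": [1, 2, 3],
    "activation": ["tanh", "relu"],
    "optimizer": ["sgd", "adam", "nadam"],
}

# One dict per combination of settings.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(configs))   # 3 * 2 * 3 = 18
print(configs[0])
```

Even this small space yields 18 models to train, which is why only four hand-picked models are compared here.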

## Models

From the **TensorFlow** library these are some of the options available:

### Optimizer options:

- `sgd`: Stochastic gradient descent (optionally with momentum)
- `rmsprop`: Optimizer that implements the RMSprop algorithm
- `optimizer`: Base class for Keras optimizers
- `nadam`: NAdam algorithm
- `ftrl`: FTRL algorithm
- `adam`: Adam algorithm
- `adagrad`: Adagrad algorithm
- `adadelta`: Adadelta algorithm
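
As an illustration of what the simplest of these optimizers does, the sketch below applies gradient descent with momentum to a one-dimensional quadratic. It follows the update `velocity = momentum * velocity - learning_rate * gradient`, then `w = w + velocity`. The function, learning rate, and momentum value are assumptions chosen for the example, and the code is a plain-Python sketch, not Keras code.

```python
# Minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
learning_rate = 0.1   # illustrative value
momentum = 0.9        # illustrative value

w = 0.0
velocity = 0.0
for step in range(300):
    grad = 2.0 * (w - 3.0)
    velocity = momentum * velocity - learning_rate * grad
    w = w + velocity

print(round(w, 4))
```

The printed value should be close to the minimiser `w = 3`.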


### Loss options (for classification):

- `BinaryCrossentropy` class
- `CategoricalCrossentropy` class
- `SparseCategoricalCrossentropy` class
- `Poisson` class
- `binary_crossentropy` function
- `categorical_crossentropy` function
- `sparse_categorical_crossentropy` function
- `poisson` function
- `KLDivergence` class
- `kl_divergence` function
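
The sparse and non-sparse cross-entropy variants compute the same quantity and differ only in the label format they expect. The following NumPy sketch, with made-up probabilities, shows this for integer labels (sparse) versus one-hot labels (categorical). It is a hand-written illustration, not the Keras implementation.

```python
import numpy as np

# Softmax-style outputs for 2 samples over 3 classes (made-up numbers).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
labels = np.array([0, 2])  # integer labels, the format CIFAR-10 provides

# Sparse form: pick the probability of the true class and take -log.
sparse_loss = -np.log(probs[np.arange(len(labels)), labels])

# Dense form: the same labels, one-hot encoded.
one_hot = np.eye(probs.shape[1])[labels]
dense_loss = -np.sum(one_hot * np.log(probs), axis=1)

print(sparse_loss)
print(np.allclose(sparse_loss, dense_loss))  # True
```

Because the CIFAR-10 labels are integers, `sparse_categorical_crossentropy` is the form that needs no extra encoding step.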

### Metrics (accuracy):

- `Accuracy` class
- `BinaryAccuracy` class
- `CategoricalAccuracy` class
- `TopKCategoricalAccuracy` class
- `SparseTopKCategoricalAccuracy` class
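
Top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. The sketch below is a NumPy illustration with made-up scores; `top_k_accuracy` is a hypothetical helper, not a Keras function.

```python
import numpy as np

def top_k_accuracy(probs, labels, k):
    # Indices of the k largest scores in each row (their order does not matter).
    top_k = np.argsort(probs, axis=1)[:, -k:]
    hits = (top_k == np.asarray(labels).reshape(-1, 1)).any(axis=1)
    return hits.mean()

probs = np.array([[0.5, 0.3, 0.2],
                  [0.2, 0.5, 0.3],
                  [0.6, 0.3, 0.1],
                  [0.1, 0.2, 0.7]])
labels = np.array([0, 2, 1, 0])

print(top_k_accuracy(probs, labels, k=1))  # 0.25
print(top_k_accuracy(probs, labels, k=2))  # 0.75
```

With `k=1` this reduces to ordinary accuracy.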

**Others:**

- `AUC` class
- `Precision` class
- `Recall` class
- `TruePositives` class
- `TrueNegatives` class
- `FalsePositives` class
- `FalseNegatives` class
- `PrecisionAtRecall` class
- `SensitivityAtSpecificity` class
- `SpecificityAtSensitivity` class

### Baseline Model

In [None]:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32, 3)),
    keras.layers.Dense(100, activation='tanh'),
    keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer= 'sgd',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
x = model.fit(x_train, y_train, epochs=10)
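
The size of this model can be worked out by hand. A dense layer has one weight per input-output pair plus one bias per output, so the arithmetic for the layer sizes used above is:

```python
# Layer sizes of the baseline: flattened input, hidden layer, output layer.
n_inputs = 32 * 32 * 3   # 3072 values per image after Flatten
n_hidden = 100
n_classes = 10

# A Dense layer holds one weight per input-output pair plus one bias per output.
hidden_params = n_inputs * n_hidden + n_hidden
output_params = n_hidden * n_classes + n_classes

print(hidden_params)                  # 307300
print(output_params)                  # 1010
print(hidden_params + output_params)  # 308310
```

Almost all of the parameters sit in the first dense layer, because every one of the 3072 input values connects to every hidden unit.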

### Adjusted Model

Relative to the baseline, the adjusted models below change four settings:
- The number of layers
- The number of neurons
- The activation function from `tanh` to `relu`
- The optimizer from `sgd` to `nadam` or `adam`

**Note:** 
As mentioned above, a more robust approach such as grid search or random search would ideally be used to find the best hyperparameters, along with the number of layers, the neurons per layer, and the layer types. Since the purpose of this assignment is to demonstrate the use of TensorFlow and Keras, a simpler approach is followed.

In [None]:
model1 = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32, 3)),
    keras.layers.Dense(1000, activation='tanh'),
    keras.layers.Dense(500, activation='tanh'),
    keras.layers.Dense(10, activation='softmax')
])

model1.compile(optimizer= 'sgd',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model1.fit(x_train, y_train, epochs=2)

### Adjustment 2

In [None]:
model2 = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32, 3)),
    keras.layers.Dense(1000, activation='relu'),
    keras.layers.Dense(500, activation='relu'),
    keras.layers.Dense(500, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

model2.compile(optimizer='nadam', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model2.fit(x_train, y_train, epochs=2)

### Adjustment 3

In [None]:
from tensorflow.keras import regularizers

model3 = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32, 3)),
    keras.layers.Dense(1000, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    keras.layers.Dense(500, activation='relu'),
    keras.layers.Dense(200, activation='relu'),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])

model3.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model3.fit(x_train, y_train, epochs=2)
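
The `kernel_regularizer=regularizers.l2(0.001)` argument adds a penalty to the loss equal to the factor times the sum of the squared weights of that layer. The NumPy sketch below works through this on a tiny made-up weight matrix; it is an illustration of the formula, not the Keras code path.

```python
import numpy as np

l2_factor = 0.001                    # same factor as in the cell above
weights = np.array([[0.5, -1.0],
                    [2.0,  0.0]])    # tiny made-up kernel

# Penalty added to the loss: factor * sum of squared weights.
penalty = l2_factor * np.sum(weights ** 2)
print(penalty)
```

Larger weights increase the penalty, which pushes training towards smaller weights in the regularized layer.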

In [None]:
test_loss, test_acc = model.evaluate(x_test, y_test)

print('Test accuracy:', test_acc)
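
The cell above evaluates only the baseline `model`. To compare all four models on the test set, a small helper along the following lines could be used. `compare_models` is a hypothetical name, and the sketch assumes each model was compiled with `metrics=['accuracy']`, so that `evaluate()` returns the loss followed by the accuracy.

```python
def compare_models(models, x, y):
    """Return {name: test accuracy} for objects exposing a Keras-style evaluate()."""
    results = {}
    for name, m in models.items():
        # With metrics=['accuracy'], evaluate() returns [loss, accuracy].
        loss, acc = m.evaluate(x, y, verbose=0)
        results[name] = acc
    return results

# Intended use in this notebook (not executed here):
# results = compare_models(
#     {'baseline': model, 'adjustment 1': model1,
#      'adjustment 2': model2, 'adjustment 3': model3},
#     x_test, y_test)
# for name, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
#     print(f'{name}: {acc:.3f}')
```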

In [None]:
predictions = model.predict(x_test)

In [None]:
predictions[0]

In [None]:
np.argmax(predictions[0])

In [None]:
y_test[0]
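
The same comparison can be done for all test images at once. The sketch below uses small made-up arrays (`preds` and `y_true` are stand-ins for `predictions` and `y_test`) and shows one pitfall: the labels have shape `(n, 1)`, so they should be flattened before being compared with the predicted classes.

```python
import numpy as np

# Stand-ins: softmax outputs for 4 images over 3 classes, labels shaped (n, 1).
preds = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
y_true = np.array([[0], [1], [0], [0]])

pred_class = np.argmax(preds, axis=1)  # shape (4,)
confidence = np.max(preds, axis=1)     # highest probability per image

# Flatten the labels first: comparing (4,) with (4, 1) would broadcast to (4, 4).
accuracy = np.mean(pred_class == y_true.ravel())

print(pred_class)  # [0 1 2 0]
print(accuracy)    # 0.75
```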

In [None]:
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_test[i], cmap=plt.cm.binary)
    plt.xlabel("{} {:2.0f}% ({})".format(np.argmax(predictions[i]), 100*np.max(predictions[i]), names_test[i]))
plt.show()

# Summary of Findings

