# Machine Learning 101

This is a jump-start exercise in machine learning using TensorFlow 2.0. In this exercise, we will build, train, and validate a neural network model that predicts whether a given image shows a cat or not.

## Install the necessary packages

Python packages can be installed with pip. To install a package, type `pip install package_name` at the terminal.

You can run terminal commands from a Jupyter notebook by prepending `!` to the command.

Warning: Do not run Jupyter Notebooks from untrusted sources.

In [None]:
!pip install tensorflow numpy h5py matplotlib scikit-learn

Import the necessary libraries

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
import numpy as np
import h5py
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score

In [None]:
# load data
train = h5py.File('data/cat/train_catvnoncat.h5', 'r')
x_train = np.array(train["train_set_x"][:])
y_train = np.array(train["train_set_y"][:])

test = h5py.File('data/cat/test_catvnoncat.h5', 'r')
x_test = np.array(test["test_set_x"][:])
y_test = np.array(test["test_set_y"][:])

### Explore the data

In [None]:
print('Number of training examples: {}'.format(x_train.shape[0]))
print('Number of test examples: {}'.format(x_test.shape[0]))
print('Dimension of an image is: {}'.format(x_train[0].shape))
print('Shape of train set features : {}'.format(x_train.shape))
print('Shape of train set values : {}'.format(y_train.shape))
print('Shape of test set features : {}'.format(x_test.shape))
print('Shape of test set values : {}'.format(y_test.shape))

In [None]:
def plot_images(images_arr, labels_arr):
    """
    This function will plot images in the form of a grid with 1 row and 5 columns 
    where images are placed in each column.
    
    credits to TensorFlow Team
    https://www.tensorflow.org/tutorials/images/classification
    """
    fig, axes = plt.subplots(1, 5, figsize=(20,20))
    axes = axes.flatten()
    for img, label, ax in zip(images_arr, labels_arr, axes):
        ax.imshow(img)
        ax.axis('off')
        ax.set_title('Cat' if label else "Not a Cat")
    plt.show()


In [None]:
plot_images(x_train[:5], y_train[:5])

In [None]:
# reshape data
image_flat_dim = 64 * 64 * 3

x_train = x_train.reshape(-1, image_flat_dim)/255  # Normalize and reshape the input
y_train = y_train.reshape(-1, 1)

x_test = x_test.reshape(-1, image_flat_dim)/255
y_test = y_test.reshape(-1, 1)

In [None]:
print('Number of training examples: {}'.format(x_train.shape[0]))
print('Number of test examples: {}'.format(x_test.shape[0]))
print('Dimension of an image is: {}'.format(x_train[0].shape))
print('Shape of train set features : {}'.format(x_train.shape))
print('Shape of train set values : {}'.format(y_train.shape))
print('Shape of test set features : {}'.format(x_test.shape))
print('Shape of test set values : {}'.format(y_test.shape))
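
To make the reshape and normalization step concrete, here is a minimal sketch on synthetic data. The array `demo_images` is a hypothetical stand-in for the real image arrays (it is not part of the dataset), and it assumes the images are stored as 8-bit RGB values of size 64 x 64, as in the reshape cell above.

```python
import numpy as np

# Hypothetical stand-in for the image data: 3 RGB images of 64x64 pixels,
# stored as unsigned 8-bit integers (values 0..255).
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(3, 64, 64, 3), dtype=np.uint8)

flat_dim = 64 * 64 * 3  # 12288 values per image

# -1 lets NumPy infer the number of rows (one row per image).
# Dividing by 255 scales every pixel into the range [0, 1].
demo_flat = demo_images.reshape(-1, flat_dim) / 255

print(demo_flat.shape)  # (3, 12288)
print(demo_flat.dtype)  # float64
print(demo_flat.min() >= 0.0, demo_flat.max() <= 1.0)

# Reshaping a row back to (64, 64, 3) recovers the original pixel layout.
restored = (demo_flat[0].reshape(64, 64, 3) * 255).round().astype(np.uint8)
print(np.array_equal(restored, demo_images[0]))
```

Flattening is needed because a `Dense` layer expects one feature vector per example, and scaling the inputs to a small range generally makes gradient-based training better behaved.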

## Create and train model

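The model consists of three hidden layers with 32, 16, and 4 neurons, followed by a single output neuron. The ReLU activation function is used for the hidden layers and sigmoid is used for the output layer, so the output can be read as the probability that the image is a cat.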

In [None]:
model = Sequential([
    Dense(32, input_shape=(image_flat_dim,), activation='relu'),
    Dense(16, activation='relu'),
    Dense(4, activation='relu'),
    Dense(1, activation='sigmoid')
])
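
As a rough check on the size of this network, the sketch below counts the trainable parameters by hand, using the layer widths from the model code above (32, 16, 4, and 1 units). The helper `dense_param_count` is a hypothetical name introduced only for this illustration; the result should agree with what `model.summary()` reports.

```python
# Layer widths taken from the Sequential model above; the input is a
# flattened 64x64x3 image.
input_dim = 64 * 64 * 3
layer_units = [32, 16, 4, 1]


def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


total = 0
prev = input_dim
for units in layer_units:
    count = dense_param_count(prev, units)
    print('Dense({}): {} parameters'.format(units, count))
    total += count
    prev = units  # the next layer takes this layer's outputs as inputs

print('Total trainable parameters: {}'.format(total))  # 393849
```

Almost all of the parameters sit in the first layer, because every one of the 12288 input values connects to each of its 32 neurons.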

## Compile the model
For this model, we will use Adam as the optimizer with its default learning rate of 0.001. Binary cross-entropy is used as the loss function.

In [None]:
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# print summary 
model.summary()
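
To illustrate what the binary cross-entropy loss measures, here is a minimal NumPy sketch. The function `binary_cross_entropy` and the sample values are hypothetical and written only for this illustration; the `eps` clipping is an assumption added to avoid taking `log(0)`, and the code is not meant to reproduce the Keras implementation exactly.

```python
import numpy as np


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))


labels = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.8, 0.2]
uncertain = [0.5, 0.5, 0.5, 0.5]
confident_wrong = [0.1, 0.9, 0.2, 0.8]

loss_right = binary_cross_entropy(labels, confident_right)
loss_uncertain = binary_cross_entropy(labels, uncertain)
loss_wrong = binary_cross_entropy(labels, confident_wrong)

print('confident and right:', loss_right)
print('uncertain:          ', loss_uncertain)  # equals ln(2), about 0.693
print('confident and wrong:', loss_wrong)
```

The loss is small when the predicted probability is close to the true label and grows quickly when the model is confidently wrong, which is why it suits a sigmoid output.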

### Fit the model

In [None]:
epochs = 2500
history = model.fit(x_train, y_train, epochs=epochs, verbose=2)

### Visualizing the training results

In [None]:
acc = history.history['accuracy']

loss = history.history['loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.legend(loc='lower right')
plt.title('Training Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.legend(loc='upper right')
plt.title('Training Loss')
plt.show()


### Predict 

In [None]:
y_pred = model.predict(x_test) > 0.5

In [None]:
print(y_pred[:5])

In [None]:
print("Accuracy of prediction is: {}".format(accuracy_score(y_test, y_pred)))
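
The thresholding and accuracy steps above can be sketched on a handful of made-up values. The arrays `demo_probs` and `demo_labels` are hypothetical and are not outputs of the trained model; the accuracy here is computed directly with NumPy as the fraction of matching labels.

```python
import numpy as np

# Hypothetical sigmoid outputs, shaped (n_examples, 1) like model.predict().
demo_probs = np.array([[0.92], [0.35], [0.50], [0.71], [0.08]])
demo_labels = np.array([[1], [0], [1], [0], [0]])

# Strict comparison: a probability of exactly 0.5 is classified as "not a cat".
demo_pred = demo_probs > 0.5

# Accuracy is the fraction of predictions that match the true labels.
demo_accuracy = np.mean(demo_pred == demo_labels)

print(demo_pred.ravel())  # [ True False False  True False]
print(demo_accuracy)      # 0.6
```

The 0.5 cut-off is a convention rather than a requirement; a different threshold trades false positives against false negatives.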

In [None]:
def plot_images(images_arr, labels_arr_true, labels_arr_pred):
    """
    This function will plot images in the form of a grid with 1 row and 5 columns 
    where images are placed in each column. Each title reads "predicted, true" and
    is green when the two labels match and red otherwise.
    
    credits to TensorFlow Team
    https://www.tensorflow.org/tutorials/images/classification
    """
    fig, axes = plt.subplots(1, 5, figsize=(20,20))
    axes = axes.flatten()
    for img, lbl_true, lbl_pred, ax in zip(images_arr, labels_arr_true, labels_arr_pred, axes):
        img = np.reshape(img, (64, 64, 3))
        ax.imshow(img)
        ax.axis('off')
        ax.set_title('{0}, {1}'.format('Cat' if lbl_pred else 'Not a Cat', 'Cat' if lbl_true else 'Not a Cat'),
                    color='g' if lbl_pred==lbl_true else 'r')
    plt.show()

In [None]:
# pick 5 distinct random indices from the test set
inds = np.random.choice(x_test.shape[0], 5, replace=False)
print(inds)
plot_images(x_test[inds], labels_arr_true=y_test[inds], labels_arr_pred=y_pred[inds])

Since there are only 50 images in the test set, let's plot them all.

In [None]:
for i in range(10):
    inds = np.arange(i*5, (i*5)+5)
    plot_images(x_test[inds], labels_arr_true=y_test[inds], labels_arr_pred=y_pred[inds])