# MMAI 894 - Exercise 2
## Convolutional artificial neural network: Image classification
The goal of this exercise is to build a convolutional neural network using the TensorFlow/Keras library. We will be using the MNIST dataset.
Submission instructions:

- You cannot edit this notebook directly. Save a copy to your drive, and identify yourself in the title with your name and student number.
- Do not insert new cells before the final one (titled "Further exploration") 
- Verify that your notebook can _restart and run all_. 
- Select File -> Download as .py (important! not as ipynb)
- Rename the file: `studentID_lastname_firstname_ex2.py`
- The mark will be assessed on the implementation of the functions with #TODO
- **Do not change anything outside the functions**  unless in the further exploration section
- You are encouraged to explore the network configuration: 20% of the mark is awarded for a final test-set accuracy above 98.5%.
- Note: You do not have to answer the questions in this notebook as part of your submission. They are meant to guide you.

- You should not need to use any additional libraries other than the ones listed below. You may want to import additional modules from those libraries, however.

In [1]:
# Import modules
# Add modules as needed
from sklearn.datasets import fetch_openml
import numpy as np
from sklearn.model_selection import train_test_split


# For windows laptops add following 2 lines:
import matplotlib
matplotlib.use('agg')

import matplotlib.pyplot as plt

import tensorflow as tf
import tensorflow.keras as keras


### Data preparation

#### Import data

In [2]:
def load_data():
    # Import MNIST dataset from openml
    dataset = fetch_openml('mnist_784', version=1, data_home=None)

    # Data preparation
    raw_X = dataset['data']
    raw_Y = dataset['target']
    return raw_X, raw_Y

raw_X, raw_Y = load_data()

## Consider the following
- Same as exercise 1
- What shape should X be for a convolutional network?
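
As a rough illustration of the shape question above: 2D convolution layers in Keras expect, by default (channels-last), a 4-D input of `(samples, height, width, channels)`, while `mnist_784` stores each image as a flat row of 784 pixel values. The sketch below uses a small synthetic array rather than the real dataset; the name `flat_images` is made up for the example.

```python
import numpy as np

# Synthetic stand-in for a batch of five flattened 28x28 grayscale images
# (hypothetical data, not the real MNIST pixels).
flat_images = np.arange(5 * 784, dtype=np.float32).reshape(5, 784)

# -1 lets NumPy infer the number of samples; the trailing 1 is the single
# grayscale channel.
images = flat_images.reshape(-1, 28, 28, 1)

print(flat_images.shape)  # (5, 784)
print(images.shape)       # (5, 28, 28, 1)

# A row-major reshape keeps pixel order: pixel k of a flat row lands at
# row k // 28, column k % 28 of the image.
assert images[0, 1, 2, 0] == flat_images[0, 1 * 28 + 2]
```

The reshape only changes how the same values are indexed, so it can be applied before or after scaling the pixel values.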

In [3]:
def clean_data(raw_X, raw_Y):
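    # Convert Y to a NumPy integer array
    cleaned_Y = np.array(raw_Y).astype(int)

    # Convert X to a NumPy array and scale pixel values from [0, 255] to [0, 1]
    cleaned_X = np.array(raw_X) / 255

    # Reshape X to (samples, height, width, channels) for Conv2D layers
    cleaned_X = cleaned_X.reshape(-1, 28, 28, 1)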

    # Shuffle X and Y in unison
    indices = np.random.permutation(len(cleaned_X))
    cleaned_X = cleaned_X[indices]
    cleaned_Y = cleaned_Y[indices]

    # QA check
    assert cleaned_X.shape[0] == cleaned_Y.shape[0]

    return cleaned_X, cleaned_Y



cleaned_X, cleaned_Y = clean_data(raw_X, raw_Y)

#### Data split

- Split your data into a train set (50%), validation set (20%) and a test set (30%). You can use scikit-learn's train_test_split function.
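
Because `train_test_split` produces only two parts per call, a three-way split takes two calls, and the second call's `test_size` is a fraction of what remains after the first. A minimal sketch of that arithmetic, assuming the test set is carved off first and using a hypothetical helper `second_split_fraction` (not part of scikit-learn):

```python
def second_split_fraction(val_frac, test_frac):
    """Fraction to pass as test_size in the second split.

    Assumes the test set is removed first, so the validation set must be
    expressed relative to the remaining (1 - test_frac) share of the data.
    """
    remaining = 1.0 - test_frac
    if not 0.0 < val_frac < remaining:
        raise ValueError("val_frac must be positive and smaller than 1 - test_frac")
    return val_frac / remaining


frac = second_split_fraction(val_frac=0.2, test_frac=0.3)
print(round(frac, 4))  # 0.2857

# Resulting shares of the full dataset after both splits
train_share = (1.0 - 0.3) * (1.0 - frac)
val_share = (1.0 - 0.3) * frac
print(round(train_share, 2), round(val_share, 2))  # 0.5 0.2
```

Passing the raw 0.2 to the second call instead would yield a validation set of only 14% of the full data, since it would be 20% of the remaining 70%.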

In [4]:
def split_data(cleaned_X, cleaned_Y):
    # Split data into train and test sets
    X_train, X_test, Y_train, Y_test = train_test_split(cleaned_X, cleaned_Y, test_size=0.3, random_state=42)

    # Split the remaining 70% into train and validation sets;
    # 0.2857 is approximately 0.2 / 0.7, so validation is 20% of the full dataset
    X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size=0.2857, random_state=42)
    
    return X_val, X_test, X_train, Y_val, Y_test, Y_train


X_val, X_test, X_train, Y_val, Y_test, Y_train = split_data(cleaned_X, cleaned_Y)

### Model

#### Neural network structure

This time, the exact model architecture is left to you to explore.  
Keep the number of parameters below 2,000,000.
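
To get a feel for the parameter budget, the count can be estimated by hand before building anything. The sketch below assumes one illustrative architecture (a single 3x3 convolution with 32 filters and no padding, 2x2 max pooling, a 128-unit dense layer, and a 10-unit output layer); the helper names are hypothetical. On a built Keras model, `model.count_params()` or `model.summary()` should report the same total.

```python
def conv2d_params(kernel_size, in_channels, filters):
    # Each filter has kernel_size * kernel_size * in_channels weights plus one bias.
    return kernel_size * kernel_size * in_channels * filters + filters


def dense_params(in_units, out_units):
    # Fully connected weights plus one bias per output unit.
    return in_units * out_units + out_units


# Assumed architecture: Conv2D(32, 3) -> MaxPooling2D(2) -> Flatten -> Dense(128) -> Dense(10)
conv = conv2d_params(kernel_size=3, in_channels=1, filters=32)

conv_side = 28 - 3 + 1             # no padding: 28x28 input becomes 26x26
pooled_side = conv_side // 2       # 2x2 pooling halves each side: 13x13
flat_units = pooled_side * pooled_side * 32

hidden = dense_params(flat_units, 128)
output = dense_params(128, 10)
total = conv + hidden + output

print(conv, hidden, output)  # 320 692352 1290
print(total)                 # 693962
print(total < 2_000_000)     # True
```

Most of the parameters sit in the first dense layer, so adding convolution and pooling stages before flattening tends to shrink the total rather than grow it.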

In [5]:
def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(pool_size=2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(units=128, activation='relu'),
        tf.keras.layers.Dense(units=10, activation='softmax')
    ])
    
    return model

def compile_model(model):
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

def train_model(model, X_train, Y_train, X_val, Y_val):
    history = model.fit(X_train, Y_train, epochs=18,
                        validation_data=(X_val, Y_val), verbose=1)
    return model, history

def eval_model(model, X_test, Y_test):
    test_loss, test_accuracy = model.evaluate(X_test, Y_test, verbose=0)
    print('Test loss:', test_loss)
    print('Test accuracy:', test_accuracy)
    return test_loss, test_accuracy




In [6]:
## You may use this space (and add additional cells) for exploration

model = build_model()
model = compile_model(model)
model, history = train_model(model, X_train, Y_train, X_val, Y_val)
test_loss, test_accuracy = eval_model(model, X_test, Y_test)

Epoch 1/18
Epoch 2/18
Epoch 3/18
Epoch 4/18
Epoch 5/18
Epoch 6/18
Epoch 7/18
Epoch 8/18
Epoch 9/18
Epoch 10/18
Epoch 11/18
Epoch 12/18
Epoch 13/18
Epoch 14/18
Epoch 15/18
Epoch 16/18
Epoch 17/18
Epoch 18/18
Test loss: 0.09119474142789841
Test accuracy: 0.9857142567634583
