# Neural Network Lab
### YOUR NAME HERE
In this lab you will experiment with artificial neural networks (ANNs). Let's start by importing a few things.

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.metrics import confusion_matrix
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

import tensorflow.keras as keras
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras import backend as K
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator


We first generate some dummy data by drawing random samples from 4 clusters in a 2D space.


In [None]:
n_samples = 64
variance = 0.01
  
# 4 clusters in a 2D space
centers = np.array([[0, 0],
                    [0, 1],
                    [1, 0],
                    [1, 1]])
   
X, y = make_blobs(n_samples,
                  centers=centers,
                  cluster_std = np.sqrt(variance),
                  shuffle=True)

Let's use Matplotlib to plot the clusters of the ``X`` values, coloring the points according to their labels (``y``).

In [None]:
plt.scatter(X[:,0],X[:,1], c=y, alpha=0.5)
plt.show()

## Exercise 1
We will be classifying our data, so fill in the two functions in the next cell so that they accurately classify the values. You should not use any loops!

You should be able to achieve 100% accuracy with the provided blobs, so you should test your function by creating a second set of values with a higher variance so that your classifier is not able to achieve 100% accuracy.  Document this appropriately in the following cell(s).
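
One possible way to build the higher-variance set is to call `make_blobs` again with a larger `cluster_std`. The sketch below is only an illustration: the variance value `0.1`, the names `X_noisy` and `y_noisy`, and the fixed `random_state` are assumptions, not values prescribed by the lab.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Same four cluster centers as the original low-variance data
centers = np.array([[0, 0],
                    [0, 1],
                    [1, 0],
                    [1, 1]])

# Assumed value: ten times the original variance of 0.01
high_variance = 0.1

X_noisy, y_noisy = make_blobs(n_samples=64,
                              centers=centers,
                              cluster_std=np.sqrt(high_variance),
                              shuffle=True,
                              random_state=0)  # fixed seed so the data can be regenerated

# Shape of the feature array and number of samples per cluster
print(X_noisy.shape, np.bincount(y_noisy))
```

Fixing the seed is a design choice that makes the accuracy you report repeatable; drop it if you want a fresh sample on every run.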

In [None]:
def my_classifier(X):
    ''' This function takes a NumPy vector with 2 variables and returns a classification value (0-3)
    '''
    return -1  # TODO: replace with your classification rule

def run_my_classifier(X, y):
    ''' This function takes a vector of pairs of points, classifies each pair using my_classifier, and 
    then compares the predicted y value with the actual y values in the y variable.  It should return the 
    accuracy value for the two vectors.  
    '''

    accuracy = None  # TODO: compute the fraction of correct predictions
    return accuracy

In [None]:
run_my_classifier(X,y)

## Multi-Layer Perceptron

We now define a function to create and train a multi-layer perceptron (MLP) classifier. Calling this function trains the model, then prints the accuracy on the training and test sets along with the confusion matrix for the test set.
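
The function in the next cell turns the softmax output into class labels with `np.argmax` before scoring them. A minimal standalone sketch of that step follows. The probabilities are made up for illustration and the `_demo` names are hypothetical; none of this is lab data.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up softmax outputs: one row per sample, one column per class
probs_demo = np.array([[0.7, 0.1, 0.1, 0.1],
                       [0.1, 0.6, 0.2, 0.1],
                       [0.2, 0.2, 0.5, 0.1],
                       [0.1, 0.5, 0.2, 0.2]])
y_true_demo = np.array([0, 1, 2, 3])

# The predicted class is the column with the largest probability
y_pred_demo = np.argmax(probs_demo, axis=1)

acc_demo = accuracy_score(y_true_demo, y_pred_demo)

# Rows are true classes, columns are predicted classes
cm_demo = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1, 2, 3])

print(y_pred_demo, acc_demo)
print(cm_demo)
```

Off-diagonal entries of the confusion matrix show which classes are mistaken for which; here the last sample (true class 3) lands in the column for class 1.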

In [None]:
def run_mlp(X_train, y_train, X_test, y_test, out_class=4, hidden=16, epochs=100):
    model = Sequential()
    model.add(Flatten())
    model.add(Dense(hidden, activation='relu'))
    model.add(Dense(out_class, activation='softmax'))
    model.compile(loss="sparse_categorical_crossentropy",
              optimizer=keras.optimizers.SGD(),
              metrics=['accuracy'])
    history = model.fit(x=X_train,y=y_train,epochs=epochs,verbose=1)
    model.summary()
    predicted_probabilities = model.predict(X_train)
    predicted_classes = np.argmax(predicted_probabilities, axis=1)
    acc = 100. * accuracy_score(y_train, predicted_classes)
    print("Accuracy on train set: {:.2f}%".format(acc))
    predicted_probabilities = model.predict(X_test)
    predicted_classes = np.argmax(predicted_probabilities, axis=1)
    acc = 100. * accuracy_score(y_test, predicted_classes)
    print("Accuracy on test set: {:.2f}%".format(acc))
    print(confusion_matrix(y_test, predicted_classes))

Next, let's see how well the MLP classifies the original (low-variance) data. First, we split the data into training and testing sets so that we can identify overfitting.
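
As a rough illustration of what the split produces, the sketch below runs `train_test_split` on placeholder arrays of the same size as the blob data. By default a quarter of the samples go to the test set. The `stratify` argument is an optional addition assumed here (the lab cell does not use it); it keeps the class proportions equal in both subsets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 64 samples with 2 features, 16 samples per class
X_demo = np.arange(128).reshape(64, 2)
y_demo = np.repeat(np.arange(4), 16)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo,
                                          stratify=y_demo,   # assumed option, not in the lab cell
                                          random_state=10)

# Sizes of the two subsets and samples per class in each
print(X_tr.shape, X_te.shape)
print(np.bincount(y_tr), np.bincount(y_te))
```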

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=10)
run_mlp(X_train, y_train, X_test, y_test, 4, 16, 10) 

## Exercise 2
Re-run the experiment at least 5 more times with different numbers of epochs in the next cell and plot the results to show the overall accuracy vs. the number of epochs.  Write a few sentences about what you observed about the relationship in the Reflection cell.  

### Reflection
TODO

## Exercise 3
Experiment in the next cell by trying different numbers of neurons in the hidden layer. Identify the smallest number of hidden neurons you can use and still achieve high accuracy. Create a markdown table in the reflection section to show your experimental results. Make sure your table adequately documents your experimental variables (hyperparameters, dataset) to enable reproducibility. Write a few statements in your reflection about the results.

### Reflection
TODO

## Exercise 4
The low-variance blob data was easy to separate with simple classification rules. Run a few experiments with the higher-variance dataset you created and determine whether the MLP or your rule-based classifier from Exercise 1 achieves better accuracy. Describe your experiments in a table in the reflection section and write a few statements about your observations.


### Reflection
TODO

## Pistachio Dataset
The blob data experiments were interesting, but are not representative of a real-world problem. Next, we will use data from an industrial pistachio classifier designed to identify different varieties of pistachio nuts.

https://www.kaggle.com/datasets/muratkokludataset/pistachio-image-dataset


Let's start by loading some data. Because this is image data, we are going to use a generator to bring in the data. This also allows us to add augmentation to the images, which should improve the robustness of our model.

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
            validation_split=0.2,
            rescale=1./255, # to bring the image range from 0..255 to 0..1
            rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
            zoom_range = 0, # randomly zoom image 
            width_shift_range=0,  # randomly shift images horizontally (fraction of total width)
            height_shift_range=0,  # randomly shift images vertically (fraction of total height)
            horizontal_flip=False,  # randomly flip images
            vertical_flip=False) # randomly flip images
train_it = datagen.flow_from_directory( '/data/cs2300/pistachio/', 
                                           target_size=(224,224), 
                                           color_mode='grayscale', 
                                           batch_size=1,
                                           class_mode="categorical",
                                           shuffle=True,
                                           subset='training')
valid_it = datagen.flow_from_directory( '/data/cs2300/pistachio/', 
                                           target_size=(224,224), 
                                           color_mode='grayscale', 
                                           shuffle=True,
                                           batch_size=1,
                                           class_mode="categorical",
                                           subset='validation')

Our MLP code expects a NumPy array, so we need to build one from the generator output. This isn't the most elegant approach, but it gets the job done and allows our previous MLP code to work.
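
The general idea is to collect the images in a list, stack them into one array, and flatten each image into a single row. The sketch below shows this with placeholder images; the zero-filled arrays and the names `images_demo` and `flat_demo` are assumptions for illustration only.

```python
import numpy as np

# Placeholder list of three grayscale images, 224 x 224 pixels each
image_list = [np.zeros((224, 224), dtype=np.float32) for _ in range(3)]

# Stacking a list of equally shaped arrays gives shape (n_images, 224, 224)
images_demo = np.asarray(image_list)

# -1 lets NumPy infer the flattened length (224 * 224 = 50176)
flat_demo = images_demo.reshape(len(images_demo), -1)

print(images_demo.shape, flat_demo.shape)
```

Letting NumPy infer the second dimension avoids hard-coding the number of images, which changes whenever the dataset or the validation split changes.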

In [None]:
X = []
y = []
batch_index = 0

while batch_index <= train_it.batch_index:
    #iterate through the training data and build a single array
    x_temp, y_temp = next(train_it)
    X.append(np.squeeze(x_temp[0]))
    y.append(np.squeeze(y_temp[0]))
    batch_index = batch_index + 1

X_train = np.asarray(X)
y_train = np.asarray(y)

X = []
y = []
batch_index = 0

while batch_index <= valid_it.batch_index:
    #iterate through the test data to build a single array
    x_temp, y_temp = next(valid_it)
    X.append(np.squeeze(x_temp[0]))
    y.append(np.squeeze(y_temp[0]))
    batch_index = batch_index + 1

X_test = np.asarray(X)
y_test = np.asarray(y)
# Flatten each 224x224 image into a 50176-element vector
X_train_reshaped = X_train.reshape(len(X_train), -1)
X_test_reshaped = X_test.reshape(len(X_test), -1)

The following cell defines a function that creates a neural network with a single hidden layer and a single sigmoid output for binary classification. It also trains and evaluates the model.

In [None]:
def run_binary_mlp(X_train, y_train, X_test, y_test, hidden=16, epochs=100):
    model = Sequential()
    model.add(Flatten())
    model.add(Dense(hidden, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss="binary_crossentropy",
              optimizer='adam',
              metrics=['accuracy'])
    history = model.fit(x=X_train,y=y_train,
                        validation_data = (X_test, y_test),
                        batch_size=1,epochs=epochs,
                        verbose=1)
    model.summary()
    predicted_probabilities = model.predict(X_train)
    predicted_probabilities = np.rint(predicted_probabilities)
    acc = 100. * accuracy_score(y_train, predicted_probabilities)
    print("Accuracy on train set: {:.2f}%".format(acc))
    predicted_probabilities = model.predict(X_test)
    predicted_probabilities = np.rint(predicted_probabilities)
    acc = 100. * accuracy_score(y_test, predicted_probabilities)
    print("Accuracy on test set: {:.2f}%".format(acc))
    print(confusion_matrix(y_test, predicted_probabilities))
    return history


The next cell calls the previous function and plots the results.  
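
The call in the next cell passes `y_train[:,0]`, the first column of the one-hot labels produced by `class_mode="categorical"`. The toy sketch below shows what that column contains; the label values are made up and the `_demo` names are hypothetical.

```python
import numpy as np

# Toy one-hot labels for a two-class problem (column 0 = class index 0)
y_onehot_demo = np.array([[1., 0.],
                          [0., 1.],
                          [1., 0.]])

# First column: 1.0 where the sample belongs to class index 0
y_binary_demo = y_onehot_demo[:, 0]

# Class index of each sample, for comparison
class_index_demo = np.argmax(y_onehot_demo, axis=1)

print(y_binary_demo, class_index_demo)
```

With this encoding a target of 1 means "class index 0", which is the opposite of the class index itself. Keep that in mind when reading the confusion matrix or comparing against a generator that uses `class_mode="binary"`.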

In [None]:
history = run_binary_mlp(X_train_reshaped, y_train[:,0], X_test_reshaped, y_test[:,0],30,10)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

## Exercise 5
In the following cell, run at least 5 more experiments that vary the number of hidden neurons and the number of epochs. Create a table in the results and reflection section below that shows the configuration hyperparameters, total model parameters, test accuracy, and training accuracy. Describe why you think the highest-accuracy configuration outperformed your other experiments.
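
The total parameter count is reported by `model.summary()`, but it can also be worked out by hand: a dense layer has one weight per input-output pair plus one bias per output. The helper names below are hypothetical; the sketch assumes the single-hidden-layer architecture used above with 224 x 224 flattened inputs.

```python
def dense_params(n_in, n_out):
    # One weight per (input, output) pair plus one bias per output
    return n_in * n_out + n_out

def mlp_params(n_inputs, hidden, n_outputs=1):
    # Hidden dense layer followed by the output dense layer
    return dense_params(n_inputs, hidden) + dense_params(hidden, n_outputs)

# Pistachio MLP with 30 hidden neurons on flattened 224 x 224 images
print(mlp_params(224 * 224, 30))   # 1505341

# Blob MLP: 2 inputs, 16 hidden neurons, 4 output classes
print(mlp_params(2, 16, 4))        # 116
```

Almost all of the parameters sit in the first layer, so the hidden size scales the model nearly linearly.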

### Results and Reflection
TODO

## Exercise 6
In the following cell, modify the model itself by adding one additional dense layer and one Dropout layer.  Re-use your hyperparameters from your highest accuracy run in the previous exercise and capture the results in the reflection below.  Answer the questions: 

1) Did more layers help? 

2) Did dropout affect overfitting?

3) Did the total number of parameters correlate with changes in accuracy across your experiments?

### Reflection
TODO

## Exercise 7
In the following cell, experiment with using at least 3 different activation functions.  Keep other hyperparameters constant for these experiments and just change the activation functions.  In the reflection section below, record your experiments and the resulting accuracy.  

### Reflection
TODO

## CNNs
One of the limitations of dense networks is that they don't inherently take advantage of spatially associated information. Convolutional layers let the model learn more complicated features more efficiently because each filter's weights are shared across image locations instead of being connected to every pixel.
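
To make the efficiency argument concrete, the sketch below compares the parameter count of a 3 x 3 convolutional layer with 32 filters against a dense layer with 32 units on the same 224 x 224 grayscale input. The helper names are hypothetical and the comparison is only a back-of-the-envelope illustration.

```python
def conv2d_params(kernel_h, kernel_w, channels_in, filters):
    # Each filter has kernel_h * kernel_w * channels_in weights plus one bias
    return kernel_h * kernel_w * channels_in * filters + filters

def dense_params(n_in, n_out):
    # One weight per (input, output) pair plus one bias per output
    return n_in * n_out + n_out

conv_count = conv2d_params(3, 3, 1, 32)      # 3x3 filters on a 1-channel image
dense_count = dense_params(224 * 224, 32)    # every pixel connected to every unit

print(conv_count, dense_count)   # 320 1605664
```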

We start by creating a new generator for our CNN data to isolate it from previous experiments.  

In [None]:
datagen_cnn = ImageDataGenerator(
            validation_split=0.2,
            rescale=1./255, # to bring the image range from 0..255 to 0..1
            rotation_range=0.01,  # randomly rotate images in the range (degrees, 0 to 180)
            zoom_range = 0.01, # randomly zoom image 
            width_shift_range=0.01,  # randomly shift images horizontally (fraction of total width)
            height_shift_range=0.01,  # randomly shift images vertically (fraction of total height)
            horizontal_flip=False,  # randomly flip images
            vertical_flip=False) # randomly flip images

# Do not modify the generator parameters unless you are making significant model changes that necessitate it
train_it_cnn = datagen_cnn.flow_from_directory( '/data/cs2300/pistachio/', 
                                           target_size=(224,224), 
                                           color_mode='grayscale', 
                                           batch_size=32,
                                           class_mode="binary",
                                           shuffle=True,
                                           subset='training')
valid_it_cnn = datagen.flow_from_directory( '/data/cs2300/pistachio/', 
                                           target_size=(224,224), 
                                           color_mode='grayscale', 
                                           shuffle=True,
                                           batch_size=1,
                                           class_mode="binary",
                                           subset='validation')

Next we create a function to define and run our CNN model.  
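
It can help to trace how the image shrinks as it passes through the convolution and pooling layers of the function in the next cell. The sketch below assumes the Keras defaults used there (3 x 3 kernels with no padding and stride 1, 2 x 2 pooling that rounds down); the function names are hypothetical.

```python
def conv_out(size, kernel=3):
    # 'valid' padding with stride 1 trims kernel - 1 pixels
    return size - kernel + 1

def pool_out(size, pool=2):
    # 2x2 max pooling halves the size, rounding down
    return size // pool

def flatten_size(deepness, size=224):
    # Filter counts of the conv blocks that are active for this deepness
    filters = [32, 64]
    if deepness > 1:
        filters.append(128)
    if deepness > 2:
        filters.append(256)
    for _ in filters:
        size = pool_out(conv_out(size))
    # Spatial size after the last pooling layer and length of the flattened vector
    return size, size * size * filters[-1]

for d in (1, 2, 3):
    print(d, flatten_size(d))
```

A deeper stack produces a smaller flattened vector, which is why adding convolutional blocks can reduce the size of the first dense layer.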

In [None]:
def run_binary_cnn(train_it, valid_it, cnn_epochs=10, deepness=1):
    '''Build and train a binary CNN classifier from a training and a validation generator.
    cnn_epochs sets the number of training epochs.
    deepness changes the structure of the model: larger values add more layers.
    '''
    cnn_model = Sequential()
    cnn_model.add(Conv2D(32, kernel_size=(3, 3),
                     activation='relu',
                     input_shape=(224,224,1)))
    cnn_model.add(MaxPooling2D(pool_size=(2, 2)))
    cnn_model.add(Conv2D(64, (3, 3), activation='relu'))
    cnn_model.add(MaxPooling2D(pool_size=(2, 2)))
    if(deepness > 1):
        cnn_model.add(Conv2D(128, (3, 3), activation='relu'))
        cnn_model.add(MaxPooling2D(pool_size=(2, 2)))
    if(deepness > 2):
        cnn_model.add(Conv2D(256, (3, 3), activation='relu'))
        cnn_model.add(MaxPooling2D(pool_size=(2, 2)))
    cnn_model.add(Flatten())
    if(deepness > 2):
        cnn_model.add(Dense(512, activation='relu'))
        cnn_model.add(Dropout(0.2))
    if(deepness > 1):
        cnn_model.add(Dense(256, activation='relu'))
    cnn_model.add(Dense(128, activation='relu'))
    cnn_model.add(Dense(1, activation='sigmoid'))

    cnn_model.compile(loss="binary_crossentropy",
                  optimizer='adam',
                  metrics=['accuracy'])
    # Use the generators passed in as arguments; len() gives the number of batches per epoch
    history_cnn = cnn_model.fit(train_it,
                  validation_data=valid_it,
                  steps_per_epoch=len(train_it),
                  validation_steps=len(valid_it),
                  epochs=cnn_epochs)
    return history_cnn


Let's run the model and see how it does...

In [None]:
history = run_binary_cnn(train_it_cnn, valid_it_cnn, 10, 2)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

## Exercise 8
Run several experiments to improve the model accuracy by tuning hyperparameters, changing the model structure (using the deepness parameter or tuning further if you like), and using data augmentation. Your goal should be to beat your best MLP model by as much as possible. Read the training results to identify overfitting, and tune your model and training accordingly. Ask the instructor for help if you need guidance.

Capture at least 5 of your best experiments in the table in the Results section below.  You should capture enough information about each experiment to make it possible to re-create your results.  Write a few statements about what you learned through this exercise.  
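
One rough way to read a training history for signs of overfitting is to find the epoch with the best validation accuracy and look at the gap between training and validation accuracy at that point. The sketch below assumes a dictionary shaped like `history.history`; the helper name `summarize_history` is hypothetical, and the numbers are placeholders for illustration, not results from this lab.

```python
import numpy as np

def summarize_history(hist):
    # hist is a dict with 'accuracy' and 'val_accuracy' lists, one value per epoch
    train = np.asarray(hist['accuracy'])
    val = np.asarray(hist['val_accuracy'])
    best = int(np.argmax(val))
    return {'best_epoch': best + 1,                          # epochs counted from 1
            'best_val_accuracy': float(val[best]),
            'gap_at_best': float(train[best] - val[best])}   # large gap hints at overfitting

# Placeholder values only, NOT measured results
demo_history = {'accuracy':     [0.60, 0.80, 0.90, 0.95],
                'val_accuracy': [0.60, 0.75, 0.80, 0.78]}

summary = summarize_history(demo_history)
print(summary)
```

Recording these three numbers for each experiment gives a compact row for the results table.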

### Results and Reflection
TODO