Welcome to the NRT-DSEER Lecture: Demystifying Machine Learning: Neural Networks course! 

This lecture provides a practical introduction to neural networks, including fundamental design, the learning process, and common applications.  The goal is to gain enough understanding to follow discussions of neural networks in seminars, meetings, etc.  This lecture does not provide an in-depth look at the mathematics and other more technical aspects of neural networks; if you plan to implement neural networks in your own research, I highly encourage you to consult additional resources and references to gain a more thorough understanding.



The framework used for this lecture is TensorFlow with the Keras API. TensorFlow is one of the two primary deep learning frameworks used in AI research (the other is PyTorch), although others exist and popularity may vary depending on your field. More information about TensorFlow and Keras can be found here: https://www.tensorflow.org/ and https://keras.io/

We first need to install the tensorflow package.

In [None]:
import sys
!{sys.executable} -m pip install tensorflow

In [None]:
import numpy as np
import pandas as pd
import random
import tensorflow as tf
import matplotlib.pyplot as plt
import sklearn.metrics
import cv2

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras import Model, activations, optimizers, metrics
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.applications.vgg19 import VGG19
from tensorflow.keras.utils import load_img, img_to_array
#from tensorflow.keras.preprocessing.image import load_img



We'll begin by experimenting with some simple neural networks to understand basic design parameters.

Let's work on a binary classification problem: we have two Gaussian distributions and would like to predict which distribution an unseen data point most likely belongs to.  To start, we need to generate some synthetic data for our task.  We will need training, validation, and test sets.

In [None]:
np.random.seed(1890) #establish a random seed to make code/results reproducible

mu1 = [1, 1]
sig1 = [[1.5, 0.2], [0.2, 1.5]]

mu2 = [5, 2]
sig2 = [[1.5, -1], [-1, 4]]

data1 = np.random.multivariate_normal(mu1, sig1, size=500)
data2 = np.random.multivariate_normal(mu2, sig2, size=500)
train_data = np.concatenate((data1,data2))

data1 = np.random.multivariate_normal(mu1, sig1, size=50)
data2 = np.random.multivariate_normal(mu2, sig2, size=50)
validation_data = np.concatenate((data1,data2))

data1 = np.random.multivariate_normal(mu1, sig1, size=100)
data2 = np.random.multivariate_normal(mu2, sig2, size=100)
test_data = np.concatenate((data1,data2))
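
Before sampling, it can be worth confirming that the covariance matrices are valid, since `np.random.multivariate_normal` expects a symmetric, positive semi-definite matrix. The sketch below is a minimal check under that assumption; the helper name `is_valid_covariance` is hypothetical and not part of NumPy.

```python
import numpy as np

def is_valid_covariance(sigma):
    """Return True if sigma is symmetric with strictly positive eigenvalues."""
    sigma = np.asarray(sigma, dtype=float)
    symmetric = bool(np.allclose(sigma, sigma.T))
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues
    positive = bool(np.all(np.linalg.eigvalsh(sigma) > 0))
    return symmetric and positive

sig1 = [[1.5, 0.2], [0.2, 1.5]]
sig2 = [[1.5, -1], [-1, 4]]
print(is_valid_covariance(sig1), is_valid_covariance(sig2))
```

Both matrices used in this lecture pass the check; a matrix such as `[[1, 2], [2, 1]]` would fail because one of its eigenvalues is negative.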


We can visualize the generated data to ensure that it is appropriate for our binary classification task.

In [None]:
train_labels = np.concatenate((np.zeros(500,), np.ones(500,)))
validation_labels = np.concatenate((np.zeros(50,), np.ones(50,)))
test_labels = np.concatenate((np.zeros(100,), np.ones(100,)))


fig, axs = plt.subplots(1,3, figsize=(15,5))

axs[0].scatter(train_data[:,0],train_data[:,1], c=train_labels)
axs[0].set_title('Train Data')
axs[1].scatter(validation_data[:,0],validation_data[:,1], c=validation_labels)
axs[1].set_title('Validation Data')
axs[2].scatter(test_data[:,0],test_data[:,1], c=test_labels)
axs[2].set_title('Test Data')

for ax in axs:
    ax.set_xlim(-3,9)
    ax.set_ylim(-5,8)
    ax.set_xticks([])
    ax.set_yticks([])

Now we can get to deep learning.  The typical workflow is:
1) Define the neural network architecture.
2) Compile the model and specify the training strategy.
3) Train the model.
4) Evaluate performance.

Keras provides an intuitive approach to building models.  Let's start by defining the model architecture/parameters (you can use this link to view the activation function options: https://keras.io/api/layers/activations/#about-advanced-activation-layers)

Syntax of Dense layers: LayerType(number_of_nodes)(previous_layer)
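
To make the Dense syntax concrete, here is a minimal NumPy sketch of what a single fully connected layer computes: a matrix multiplication, a bias addition, and an activation. The names `dense_forward`, `W`, and `b` and the example weights are illustrative assumptions, not Keras API; Keras creates and trains the weights for you.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: negative values become zero."""
    return np.maximum(z, 0.0)

def dense_forward(x, W, b, activation=relu):
    """Forward pass of one dense layer.

    x: (batch, n_in) inputs, W: (n_in, n_out) weights, b: (n_out,) biases.
    """
    return activation(x @ W + b)

x = np.array([[1.0, 2.0]])            # one data point with two features
W = np.array([[1.0, -1.0, 0.5],
              [0.0,  1.0, -2.0]])     # 2 inputs -> 3 nodes (made-up weights)
b = np.array([0.5, 0.0, 0.0])

h = dense_forward(x, W, b)
print(h)  # the three node outputs are 1.5, 1.0 and 0.0
```

The third node's pre-activation value is negative (-3.5), so ReLU sets it to zero.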

In [None]:
save_path = 'ModelWeights.weights.h5' #location/file name for model weights (Keras 3 requires weight-only files to end in .weights.h5)
input_size = [2]
num_layers =  ?  #the number of HIDDEN layers in the model
num_nodes = ?   #the number of nodes/neurons in each hidden layer


def get_classification_model(input_size, num_layers, num_nodes):
    
    input_x = Input((input_size), name='input')
    hlayer = Dense(num_nodes)(input_x)
    activation = ? #insert activation layer here
    
    for i_layers in range(num_layers-1):
        hlayer = Dense(num_nodes)(activation)
        activation = ? #insert activation layer here
    
    output_y = Dense(1, activation='sigmoid')(activation) #connect the output to the last hidden activation, not the pre-activation layer
    
    model = Model(inputs=input_x, outputs=output_y)
    return model
    

In [None]:
model = get_classification_model(input_size, num_layers, num_nodes)
model.summary()
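
`model.summary()` reports the number of trainable parameters per layer. As a sanity check, the count can be reproduced by hand: each dense layer has (inputs x nodes) weights plus one bias per node. The sketch below assumes at least one hidden layer, equal-width hidden layers, and a single output node, as in the model above; the function name and the example values (2 hidden layers of 8 nodes) are hypothetical choices, not the answer to the placeholders.

```python
def count_dense_params(n_inputs, num_layers, num_nodes, n_outputs=1):
    """Trainable parameters in a fully connected network with equal-width hidden layers."""
    total = 0
    fan_in = n_inputs
    for _ in range(num_layers):
        total += fan_in * num_nodes + num_nodes  # weights + biases of one hidden layer
        fan_in = num_nodes
    total += fan_in * n_outputs + n_outputs      # output layer
    return total

# 2 inputs, 2 hidden layers of 8 nodes, 1 output: 24 + 72 + 9 parameters
print(count_dense_params(2, 2, 8))
```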

Step 2 is to compile the model and specify the training protocol (we'll learn more about this in section 2, so don't worry about changing anything here).

In [None]:
model.compile(optimizer=optimizers.SGD(), loss='binary_crossentropy', metrics = ["accuracy"])

callbacks = [EarlyStopping(patience=20, verbose=1),
            ReduceLROnPlateau(factor=0.1, patience=12, min_lr=1e-7, verbose=1),
            ModelCheckpoint(save_path, verbose=1, save_best_only=True, save_weights_only=True)]
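
For reference, the `binary_crossentropy` loss named in `compile` penalizes confident wrong predictions much more than confident correct ones. The NumPy function below is a simplified stand-in written for illustration (the clipping constant `eps` is an assumption); it is not the Keras implementation.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # clip probabilities so that log(0) never occurs
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    losses = y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)
    return float(-np.mean(losses))

confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

The first value equals -log(0.9) and the second equals -log(0.1), so the wrong predictions cost roughly twenty times more.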

Finally, we can train the model.

In [None]:
results=model.fit(train_data, train_labels, batch_size=1, epochs=30, validation_data=(validation_data,validation_labels), callbacks=callbacks)
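
The `batch_size` argument controls how many weight updates happen per epoch: the training set is split into batches and the optimizer takes one step per batch. A small sketch of that arithmetic (the helper name `steps_per_epoch` is hypothetical, and the batch size of 32 is only a comparison value):

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of gradient updates in one pass over the data (last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)

print(steps_per_epoch(1000, 1))   # the setting used above: one update per sample
print(steps_per_epoch(1000, 32))  # a larger batch means far fewer updates per epoch
```

With the 1000 training points generated earlier, `batch_size=1` gives 1000 updates per epoch, which is slow but makes each update easy to reason about.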

We then use the test data to evaluate the model performance.

In [None]:
test_predictions = model.predict(test_data)
test_predictions = (test_predictions>0.5).astype(np.uint8)

accuracy = sklearn.metrics.accuracy_score((test_labels).astype(np.uint8), test_predictions)
print('Model accuracy is ', accuracy)
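
The thresholding and accuracy steps can be reproduced by hand on a few made-up probabilities. This sketch uses hypothetical values in place of real model output; note that `model.predict` returns a column of shape (N, 1), so it is flattened before comparing with the 1-D labels (comparing a (4, 1) array with a (4,) array would broadcast to a 4x4 matrix).

```python
import numpy as np

probs = np.array([[0.92], [0.40], [0.51], [0.08]])  # shape (4, 1), like model.predict output
labels = np.array([1.0, 1.0, 0.0, 0.0])

pred = (probs > 0.5).astype(np.uint8).ravel()        # flatten to shape (4,)
truth = labels.astype(np.uint8)

accuracy = float(np.mean(pred == truth))             # fraction of correct predictions
true_pos = int(np.sum((pred == 1) & (truth == 1)))
false_pos = int(np.sum((pred == 1) & (truth == 0)))
print(accuracy, true_pos, false_pos)
```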

We can also create dummy data to help us visualize the decision boundary with our data.

In [None]:
x_dummy = np.arange(-3,9,0.05)
y_dummy = np.arange(-5,8,0.05)
dummy_data = np.array(np.meshgrid(x_dummy,y_dummy)).T.reshape(-1,2)

dummy_predictions = model.predict(dummy_data)
dummy_predictions = (dummy_predictions>0.5).astype(np.uint8)
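
The `meshgrid(...).T.reshape(-1,2)` idiom above turns two coordinate axes into a list of (x, y) pairs covering the whole grid. A tiny example with made-up axes shows the result:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([10, 20])

# same construction as dummy_data, on a 3 x 2 grid
grid = np.array(np.meshgrid(x, y)).T.reshape(-1, 2)
print(grid.shape)     # one row per grid point, two columns (x, y)
print(grid.tolist())
```

Each row is one point to classify, so the model can predict on the full grid in a single `predict` call.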

In [None]:
plt.scatter(dummy_data[:,0],dummy_data[:,1],c=dummy_predictions.ravel(), cmap='RdYlGn') #background: predicted class at every grid point
plt.scatter(test_data[:,0],test_data[:,1], c=test_labels) #foreground: test points colored by true label


STOP HERE UNTIL INSTRUCTED TO MOVE ON

























Exercise: Using the code above as a template, create a new model to either 1) perform classification on the Iris Flower dataset or 2) perform regression on the Wine Quality dataset. Do not worry about data visualization, scatter plots, etc.  Think about which model parameters and concepts need to be adjusted from above to prep your data, design your model, train, and evaluate.

In [None]:
#pd.read_csv('../data/iris.data', names=['Sepal Length', 'Sepal Width', 'Petal Length', 'Petal Width', 'Class'])
#or
#pd.read_csv('../data/winequality-white.csv', sep = ';')

In [None]:
#print('The RMSE of the wine quality is ', )
#or
#print('The accuracy(?) of the flower classifier is ', )

STOP HERE UNTIL INSTRUCTED TO MOVE ON

We will now start to investigate different loss functions.
A list of the loss functions available through Keras can be found here: https://keras.io/api/losses/
A list of the available optimizers can be found here: https://keras.io/api/optimizers/

Exercise 1: Write the loss function to calculate the mean absolute error (MAE) loss.

$ MAE = \sum \limits _{i=1} ^{N}\frac{|y_{i}-\hat y_{i}|}{N} $
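
As a quick worked example of the formula, assume three hypothetical ground truth values $ y = (3, 5, 2) $ and predictions $ \hat y = (2.5, 5, 4) $ (illustrative numbers, not taken from either dataset), so that $ N = 3 $:

$ MAE = \frac{|3-2.5| + |5-5| + |2-4|}{3} = \frac{0.5 + 0 + 2}{3} \approx 0.833 $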

In [None]:
def MAELoss(y_true, y_pred):
    #y_true is a vector of ground truth values
    #y_pred is a vector of data predictions
    
    
    return loss

Exercise 2: Consider a more complex loss function in which we want to evaluate both the MAE and the MSE (mean squared error), but weight the MSE more heavily. Write a loss function that takes weighting parameters for the MAE and MSE as inputs. Try using this loss to train the regression model you wrote above.

$ MSE = \sum \limits _{i=1} ^{N}\frac{(y_{i}-\hat y_{i})^{2}}{N} $

In [None]:
def weighted_MSE_MAE_loss(y_true, y_pred, weightMSE, weightMAE):
    
    
    return loss
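
One practical hint: Keras calls a custom loss with only two arguments, `(y_true, y_pred)`, so any extra parameters such as the weights need to be fixed beforehand. A common way to do this is a closure, sketched below with NumPy as a stand-in. The factory name `make_weighted_loss` and the weights 0.7 and 0.3 are hypothetical; a loss used for training would need TensorFlow operations instead of NumPy so that gradients can be computed.

```python
import numpy as np

def make_weighted_loss(weight_mse, weight_mae):
    """Return a two-argument loss function with the weights fixed via a closure."""
    def loss_fn(y_true, y_pred):
        err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
        mse = np.mean(err ** 2)        # mean squared error
        mae = np.mean(np.abs(err))     # mean absolute error
        return float(weight_mse * mse + weight_mae * mae)
    return loss_fn

loss_fn = make_weighted_loss(weight_mse=0.7, weight_mae=0.3)
value = loss_fn([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(value)
```

The returned `loss_fn` has the two-argument signature that `model.compile(loss=...)` expects, while the weights stay adjustable through the factory.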

STOP HERE UNTIL INSTRUCTED TO MOVE ON

Finally, we will have a brief demonstration of how a convolutional neural network can be used for a more complex task: image classification.  We begin by loading an image and resizing it to the input size specified by the model architecture (224 pixels by 224 pixels).

In [None]:
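from tensorflow.keras.applications.vgg19 import preprocess_input

image = load_img('../data/cat_image.jpg', target_size=(224,224)) #resize to the 224x224 input expected by VGG19
image = img_to_array(image)
image = np.expand_dims(image, axis=0) #add a batch dimension: shape (1, 224, 224, 3)
image = preprocess_input(image) #apply the same input preprocessing that was used when VGG19 was trained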

Now, let's load a pre-trained image classification model provided by Keras (VGG19 with ImageNet weights) and run it on our image.

In [None]:
model = VGG19(include_top=True, weights='imagenet')
y_image = model.predict(image)
print(np.shape(y_image))
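
The output has one row per image and one column per ImageNet class, so the printed shape is (1, 1000). To read it, we usually want the classes with the highest scores. The sketch below does this with NumPy on a made-up four-class output; `top_k` is a hypothetical helper. For real VGG19 output, Keras provides `decode_predictions` in `tensorflow.keras.applications.vgg19`, which also maps class indices to ImageNet label names.

```python
import numpy as np

def top_k(probabilities, k=3):
    """Return (class_index, probability) pairs for the k largest entries, highest first."""
    probabilities = np.asarray(probabilities).ravel()
    order = np.argsort(probabilities)[::-1][:k]  # indices sorted by descending score
    return [(int(i), float(probabilities[i])) for i in order]

fake_output = np.array([[0.05, 0.10, 0.60, 0.25]])  # stand-in for y_image with 4 classes
print(top_k(fake_output, k=2))
```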