# Lab 10-ML Pipeline and Neural Network

As an important precursor to understanding neural networks or even deep learning, it is essential to understand how Machine Learning (ML) works.

## The Machine Learning Pipeline

Machine learning is literally 'getting the machine to learn something', and that involves putting something "in" to the machine and expecting something "out" of the machine. What happens inside the machine? Well, we expect that the machine would know how to "learn". In actual fact, we have to tell the machine how we want it to learn, and provide the necessary methods and data. Basically, we define the algorithm that would do the job on the data. Machine learning *lives* on data; it is data-driven, and therefore the data (think of food) that we put in also matters a lot. All these steps are part of what we call a **machine learning pipeline**.

In this first session, we will be making a quick journey across the machine learning pipeline, and attempt to understand some important nuts and bolts that are involved in the entire process.

---
We start off by getting our hands on an image dataset consisting of many samples of surface defects, categorized by type. This dataset was originally collected by [Northeastern University, China](http://faculty.neu.edu.cn/yunhyan/NEU_surface_defect_database.html) and was made publicly available for research and non-commercial usage. <br>
The dataset is provided in the `NEU-CLS-64.zip` archive.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import cv2

In [None]:
#extract the dataset to the current working directory
import zipfile
with zipfile.ZipFile("NEU-CLS-64.zip","r") as zip_ref:
    zip_ref.extractall("./")

Explore the files in the directory structure. You will notice there are 9 sub-directories. These correspond to the 9 categories of surface defects compiled by Northeastern University for their research. However, 3 of these categories ('gg', 'rp', and 'sp') have fewer samples than the other 6, so they will not be considered for now. The 6 categories we keep have at least 774 images each, while the 3 omitted categories have at most 439 images each.

### Preparation

Before we start identifying which data (or features) we intend to use, we need to lay down a few 'ground rules' based on what a typical machine learning pipeline would do.

1. For each of these 6 categories, we need to partition the data into two non-overlapping sets: a training set and a test set.
2. The categories are all of different sizes. For simplicity and fair learning (for now), let's use the same number of samples for each of these 6 categories. The largest number of samples per category that we can use is equal to the size of the smallest category (makes sense?). Let's call that $N$.
3. We could try to pick these images at random. But again, for ease, we will just take the first $N$ images in sequence from each category. The images are already numbered in sequence from 1, so this allows easy looping.
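
The three rules above can be sketched on a toy example. The `files_by_category` dictionary and its file names are made up for illustration only; in the lab, the lists would come from the extracted folders.

```python
# Hypothetical file lists per category (names are illustrative only)
files_by_category = {
    'cr': ['cr_3.jpg', 'cr_1.jpg', 'cr_2.jpg', 'cr_4.jpg'],
    'in': ['in_2.jpg', 'in_1.jpg', 'in_3.jpg'],
    'pa': ['pa_1.jpg', 'pa_2.jpg', 'pa_3.jpg', 'pa_4.jpg', 'pa_5.jpg'],
}

# Rule 2: N is the size of the smallest category
N = min(len(files) for files in files_by_category.values())

# Rule 3: sort the names so that "the first N" is well defined, then slice
balanced = {cat: sorted(files)[:N] for cat, files in files_by_category.items()}

print(N)
for cat, files in balanced.items():
    print(cat, files)
```

Note that `sorted()` orders names as text, so `'cr_10.jpg'` would come before `'cr_2.jpg'`. That is harmless here, since any fixed order gives a reproducible selection.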

In [None]:
# make a list of tuples so that we can later refer back to the actual category names
# this is slightly against the idea of a dictionary, since we want fixed index access here
# tuples are also immutable, which prevents any accidental overwrites
cats = [('cr', 'crazing'), 
        ('in', 'inclusion'),        
        ('pa', 'patches'),
        ('ps', 'pitted surface'),
        ('rs', 'rolled-in scale'),
        ('sc', 'scratches')]

In [None]:
# construct a loop to read all images in a folder into some variable. 
# How would you store it?
# ...a list of 2-D arrays?
# ...an array of 2-D arrays, or 3D?
import glob
image_list = []
for filename in sorted(glob.glob('./NEU-CLS-64/cr/*.jpg')):  # assuming jpg; sorted, since glob returns files in arbitrary order
    im = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)   # let's load in grayscale
    image_list.append(im)

In [None]:
def showImage(img):
    plt.figure(figsize=(5,5))
    plt.imshow(img,cmap='gray',vmin=0,vmax=255), plt.title('Sample Image')
    plt.xticks([]), plt.yticks([]), plt.show()
showImage(image_list[0])

Create a database with the images and classes.

In [None]:
image_db = []
for c in cats:  
    image_list = []
    path = './NEU-CLS-64/'+ c[0] +'/*.jpg'
    for filename in sorted(glob.glob(path)):  # sorted for a reproducible order
        im = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
        image_list.append(im)
    image_db.append(image_list)

In [None]:
print(len(image_db))
# print(image_db)

Let's proceed with slicing these inner lists into equal portions. Let's go with 500 each.

In [None]:
image_db_fair = [ic[:500] for ic in image_db]
print(len(image_db_fair[0]))

Now, prepping for supervised machine learning involves creating two portions: the *data* and the *labels*. As a rule of thumb for ML, we want the data to be formatted into an $N\times M$ 2-D array (or list), consisting of $N$ samples and $M$ feature dimensions.

### Data
We can treat each pixel as a feature. Since each image is 64x64 in size (we read them in grayscale), it can be vectorized into a 4096-value vector. So, with 6 categories and 500 images/category, we are expecting a *3000x4096* data array. This is the so-called $X$ array in ML, which represents the data.
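
As a minimal sketch of this vectorization, the example below flattens a handful of tiny synthetic images. The 4x4 random arrays and the names `stack` and `X_demo` are stand-ins chosen for illustration, not part of the lab data.

```python
import numpy as np

# Six tiny synthetic "images" of size 4x4 stand in for the 64x64 defect images
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(4, 4)).astype(np.uint8) for _ in range(6)]

# Stack into (n_samples, height, width), then flatten each image into one row
stack = np.stack(images)
X_demo = stack.reshape(len(images), -1)

print(stack.shape)   # (6, 4, 4)
print(X_demo.shape)  # (6, 16)
```

Each row of `X_demo` is one image read out row by row, so reshaping a row back to 4x4 recovers the original image.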

In [None]:
# check shape of image. this is the first image of the first category. all ok?
# note: NumPy's shape is (rows, columns), i.e. (height, width)
h, w = image_db_fair[0][0].shape
print(h, w)

In [None]:
image_db_fair[0][0].reshape(1, w*h)

In [None]:
image_db = []
h, w = image_db_fair[0][0].shape   # shape is (height, width)
for img_c in image_db_fair:
    for im in img_c:
        image_db.append(im.reshape(1, w*h))
    
image_db = np.squeeze(np.array(image_db))

In [None]:
# sanity check
print(image_db)
print(image_db.shape)

### Labels

For labels, we can generate them from the way our folders are structured: the first 500 can be labeled 'cr', the next 500 'in', and so on. As such, we are expecting a 3000x1 label array. This is the $y$ array (or rather, a vector) in ML, which represents the labels.
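
One compact way to build such a block-structured label vector is `np.repeat`. The sketch below uses a shortened category list and 4 samples per class instead of 500; `short_names`, `per_class` and `y_demo` are illustrative names.

```python
import numpy as np

short_names = ['cr', 'in', 'pa']   # a subset of the category codes
per_class = 4                      # stands in for 500

# np.repeat repeats each element per_class times, keeping the folder order
y_demo = np.repeat(short_names, per_class)

print(y_demo)
print(y_demo.shape)
```

This only works because the data rows are stored category by category, in the same order as the names.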

In [None]:
labels = []
for c in cats:
    labels.extend([c[0] for i in range(500)])

In [None]:
# sanity check
print(labels[0])    # should be cr
print(labels[999])  # should be in
print(labels[1000]) # should be pa
print(labels[2999]) # should be sc

In [None]:
# make it an array
labels = np.array(labels)

-----------------------------------
### Partitioning Data

Next, let's partition the data into a training set and a test set. We could do this laboriously, using some kind of index randomizer and then slicing the randomly arranged indices into two parts. Or, we could just make use of a function from **scikit-learn** plainly called `train_test_split`.
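
For comparison, the "laborious" route might look like the sketch below, run on a small synthetic array. The variable names and the 25% test fraction are chosen for illustration.

```python
import numpy as np

X_demo = np.arange(20).reshape(10, 2)   # 10 samples, 2 features
y_demo = np.array([0, 1] * 5)

rng = np.random.default_rng(0)          # fixed seed for reproducibility
idx = rng.permutation(len(X_demo))      # shuffled sample indices

n_test = int(0.25 * len(X_demo))        # 25% test -> 2 samples here
test_idx, train_idx = idx[:n_test], idx[n_test:]

X_tr, X_te = X_demo[train_idx], X_demo[test_idx]
y_tr, y_te = y_demo[train_idx], y_demo[test_idx]
print(X_tr.shape, X_te.shape)
```

Unlike a stratified split, this simple version does not guarantee that each class keeps the same proportion in both partitions.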


In [None]:
from sklearn.model_selection import train_test_split

In [None]:
# make sure the sizes make sense, and that they are both same type (arrays)
print(image_db.shape)    # X     (data)
print(labels.shape)      # y    (labels)

In [None]:
# this splits both X and Y into their respective train and test partitions
# by default, test_size=0.25 which means 25% of data is in the test set, while
# the training set has the remaining 75%. This ratio can be tweaked
# random_state ensures that the splitting is done using the same randomizer seed (for reproducibility)

X = image_db
y = labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)

In [None]:
print(X_train.shape)
print(y_train.shape)  # this should be same # samples as X_train
print(X_test.shape)
print(y_test.shape)  # this should be same # samples as X_test

---
### Classification

The end goal of supervised learning, particularly a classification task, is to learn a set of weights (or parameters) $W$ from data $X$ with the aid of known labels $y$ through an optimization process, so that we can make categorical predictions $y'$ on new samples $X'$. We can summarize this generally as a hypothesis:
\begin{align}
  H(X) = W^T X = y
\end{align}
where $W$ has been optimized by minimizing the objective function:
\begin{align}
  J = \sum_{i=1}^{N}(H(X_{i})-y_{i})^2
\end{align}

Different classifiers are governed by slightly different hypotheses, which in turn result in slightly different objective functions. For example, a simple logistic regression (LR) classifier passes this same hypothesis through a "logistic function" $\phi$, which converts raw values into probability values lying in the [0, 1] range.
\begin{align}
  H(X) = \phi(W^T X)
\end{align}

<img src="https://qph.fs.quoracdn.net/main-qimg-7c9b7670c90b286160a88cb599d1b733" style="text-align:center" width=400 />

For a 2-class problem, the decision threshold is a probability of 0.5, which occurs when $W^T X=0$. This changes how the objective function behaves. Basically, we can now perform classification simply by checking whether $H(X) \approx 1$ (that is, it belongs to category $y=1$) or $H(X) \approx 0$ (that is, it belongs to category $y=0$).
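
A minimal numerical sketch of the logistic function and the 0.5 threshold follows. The raw scores in `z` are made-up values standing in for $W^T X$.

```python
import numpy as np

def logistic(z):
    """Logistic (sigmoid) function: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])   # example raw scores W^T X
p = logistic(z)
pred = (p >= 0.5).astype(int)               # threshold at probability 0.5

print(np.round(p, 3))
print(pred)
```

A score of exactly 0 maps to a probability of 0.5, so thresholding the probability at 0.5 is the same as checking the sign of the raw score.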

But this seems to be applicable only when there are 2 categories? How then does this extend to a multi-class situation  with more than 2 classes?

#### **One-vs-rest (or one-vs-all)** 

To resolve cases with $N$ classes, we can use the **one-vs-rest** mode, which learns one model for each category against all other categories grouped together. In other words, if we have $N$ categories, we train $N$ classifiers, as illustrated below (courtesy of Andrew Ng's course):

<img src="https://raw.githubusercontent.com/ritchieng/machine-learning-stanford/master/w3_logistic_regression_regularization/multiclass_classification2.png" width=450 />

To make a prediction on a new input $X'$, choose the class whose classifier gives the largest hypothesis value (in the example above, $h$; in ours, $H$).
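
The prediction rule can be sketched with a small table of scores. The numbers in `scores` are hypothetical outputs of three one-vs-rest classifiers, not results from the lab data.

```python
import numpy as np

# Hypothetical outputs of 3 one-vs-rest classifiers for 4 new samples:
# row i holds H_0(x_i), H_1(x_i), H_2(x_i)
scores = np.array([[0.9, 0.2, 0.1],
                   [0.3, 0.4, 0.8],
                   [0.1, 0.7, 0.6],
                   [0.5, 0.3, 0.2]])
class_names = np.array(['cr', 'in', 'pa'])

# Pick, for each sample, the class whose classifier is most confident
winner = np.argmax(scores, axis=1)
print(class_names[winner])
```

The rows do not need to sum to 1, because each classifier is trained independently of the others.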

---

#### **Fit, predict, score**

Let's use scikit-learn's built-in [Logistic Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html). Most machine learning models in scikit-learn follow the same set of standard API methods (except for a few special cases): they all have `fit()`, `predict()` and `score()`.
* `fit()`: does the learning or 'training' if you wish to call it
* `predict()`: performs prediction on samples (can be from train set, test set or even new data)
* `score()`: performs prediction on a set of samples, and computes the performance metric. Most classifiers come with a pre-fixed metric that is typically used. But you can choose your own metric from their [`metrics`](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics) library.

In [None]:
from sklearn.linear_model import LogisticRegression
# Note: this cell may take a while to run
# create the classifier object
clf = LogisticRegression(solver='sag', n_jobs=-1, max_iter=100)    
# 'sag' is the Stochastic Average Gradient solver (it supports the L2 penalty),
# n_jobs=-1 uses all processors, and max_iter=100 is the default iteration limit
clf.fit(X_train, y_train)     # fit the model with training data/labels

In [None]:
print("Training set accuracy:", clf.score(X_train, y_train)) 
print("Test set accuracy:", clf.score(X_test, y_test)) 

# Using .predict() and classification metrics
from sklearn.metrics import accuracy_score
y_pred = clf.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(acc)

Here's a typical phenomenon in machine learning. We train a classifier/model on some training data. That same classifier is then used to make predictions on unseen test data (which is why the two sets must not overlap when we partition). Here are several insights:

* The test set performance is usually lower than the training set performance.
* The training set performance also tells us whether we are fitting the data well or not.

What do you think of the results?

What do you think may be "bias" in this machine learning process?

### Cross-validation

Cross-validation is a way of making sure that we have got "all grounds covered". It's perhaps not convincing to rely on a single way of splitting the data. The way we split could be unintentionally (or even intentionally) biased; for example, we could get really lucky with a nice training set, or we could be unlucky as well.

One common technique is to ensure that every sample in the dataset gets to be tested exactly once. **k-fold cross validation** partitions the data into *k* partitions. Each partition takes a turn as the test set, while the remaining *k-1* partitions become the training set. In the end, the results from all *k* folds are averaged. For example, 5-fold cross validation trains and tests 5 times, each with an 80:20 training-test ratio, and the results from all 5 folds are averaged.
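
The fold rotation can be sketched with plain index arrays. This is a simplified illustration on 10 sample indices, without shuffling or stratification; `folds`, `train_idx` and `test_idx` are illustrative names.

```python
import numpy as np

n_samples, k = 10, 5
indices = np.arange(n_samples)
folds = np.array_split(indices, k)      # k roughly equal partitions

for i, test_idx in enumerate(folds):
    # every fold except the i-th one forms the training set
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    print(f"fold {i}: train={train_idx}, test={test_idx}")
```

Each index appears in exactly one test fold. In practice the samples are usually shuffled or stratified first, because data stored class by class (as ours is) would otherwise put whole classes into a single fold.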

In [None]:
# k-fold cross validation
# Note: will take time to run as multiple models are trained and tested
from sklearn.model_selection import cross_validate

# since clf has been created, we can re-use it
# by default, k=5 if not set
# we set to k=4 to mimic the earlier 75:25 setup
res = cross_validate(clf, X, y, cv=4)
print(res['test_score'])
print("Average accuracy:", res['test_score'].mean())

You could play with several parameters: the *C* value (inverse of the regularization strength), the *solver* (optimization method; some converge slower, some faster), the *class_weight* parameter (gives stronger influence to smaller classes, for imbalanced data), and the *penalty* (the norm used in the regularization; the default is the L2 norm, and L1 can be used for sparser models, provided the chosen solver supports it).

There's a more mechanical way of getting this tedious job done. That is called a [**grid search**](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html). It basically searches through all possible combinations of the given parameter values, and returns all computed results along with the best parameters.
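
As a hedged sketch of how a grid search might be set up, the example below tunes only *C* on a small synthetic dataset generated with `make_classification`, so that it runs in seconds. The dataset, the candidate values and the name `search` are assumptions for illustration, not the lab's settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Small synthetic 3-class problem so the search finishes quickly
X_demo, y_demo = make_classification(n_samples=300, n_features=20,
                                     n_informative=5, n_classes=3,
                                     random_state=0)

# Candidate values; every combination is trained and cross-validated
param_grid = {'C': [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=4)
search.fit(X_demo, y_demo)

print(search.best_params_)
print(search.best_score_)
```

With 4 candidate values and 4 folds, 16 models are trained during the search, which is why grid searches on the full image data can take a long time.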

### Confusion matrix

So, besides having accuracy scores, we also want to know which category did well and which did not. A **confusion matrix** is most useful here. It shows a square matrix containing the number of ground-truth (desired) labels on one side and the number of predicted labels on the other side. So what it tells us is how many were correctly predicted (on the diagonals) and how many were incorrectly predicted (elsewhere), and when they were incorrect, which other category they went to. 

`scikit-learn` also has a **classification report** which gives us a bunch of class-based metrics such as the precision, recall and f1-score for each class. It's a good way to have a broad overview of how each class is performing.
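
To make the layout of a confusion matrix concrete, here is a small hand-rolled version on made-up labels. The arrays `y_true` and `y_hat` and the name `cm_demo` are illustrative only.

```python
import numpy as np

demo_classes = ['cr', 'in', 'pa']
y_true = np.array(['cr', 'cr', 'in', 'in', 'in', 'pa', 'pa', 'pa'])
y_hat  = np.array(['cr', 'in', 'in', 'in', 'pa', 'pa', 'pa', 'cr'])

index = {c: i for i, c in enumerate(demo_classes)}
cm_demo = np.zeros((len(demo_classes), len(demo_classes)), dtype=int)
for t, p in zip(y_true, y_hat):
    cm_demo[index[t], index[p]] += 1   # rows: true label, columns: prediction

print(cm_demo)
print("accuracy:", np.trace(cm_demo) / cm_demo.sum())
```

The diagonal holds the correct predictions, so the accuracy is the trace divided by the total count. The off-diagonal entry in row 'cr', column 'in' counts the 'cr' samples that were mistaken for 'in'.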


In [None]:
from sklearn.metrics import confusion_matrix, classification_report

short_labels = [i[0] for i in cats]

print("Test Accuracy: ", clf.score(X_test, y_test))
y_pred = clf.predict(X_test)
print("Confusion Matrix")
cm = confusion_matrix(y_test, y_pred, labels=short_labels)
# normalize each row by the number of true samples in that class
dcm = cm / cm.sum(axis=1, keepdims=True)
print(cm)
print("Normalized Confusion Matrix")
print(np.squeeze(dcm))
print(classification_report(y_test, y_pred, labels=short_labels))

The `seaborn` package can visualize the results for better reporting.

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt     

ax= plt.subplot()
sns.heatmap(dcm, annot=True, ax = ax); #annot=True to annotate cells

# labels, title and ticks
ax.set_xlabel('Predicted labels');ax.set_ylabel('True labels'); 
ax.set_title('Confusion Matrix'); 
ax.xaxis.set_ticklabels(short_labels); ax.yaxis.set_ticklabels(short_labels);

In [None]:
# let's pick some random images from the dataset and perform prediction to see
# if they match the actual labels

def checkPredictions(img_set, img_set_labels, img_shape, idx):
    showImage(img_set[idx].reshape(img_shape[0], img_shape[1]))
    print("Prediction:",clf.predict(img_set[idx].reshape(1, -1)))  # use the set passed in, not the global X_test
    print("G/T:",img_set_labels[idx])  
  
  
# remember that we shaped the images into a 1-D vector, we have to reshape it back
actualsize = (64, 64)
checkPredictions(X_test, y_test, actualsize, 12)
checkPredictions(X_test, y_test, actualsize, 48)
checkPredictions(X_test, y_test, actualsize, 101)
checkPredictions(X_test, y_test, actualsize, 234)

### Other Features?

Typically, we don't want to rely on pixels as features. They have a rather high number of degrees of freedom and are likely not scale-, rotation- or even translation-invariant. This can be a big problem, since any small change to the image might result in a different ordering of pixels.
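
The sensitivity of raw pixels to translation can be illustrated with a synthetic example. The random-noise image below is an extreme case chosen for illustration (real defect images are smoother), and the intensity histogram is used only as a simple position-independent descriptor for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
img_demo = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)

# Circularly shift the image one pixel to the right
shifted = np.roll(img_demo, shift=1, axis=1)

# Fraction of positions whose pixel value changed after the shift
changed = np.mean(img_demo.ravel() != shifted.ravel())

# Intensity histograms ignore pixel positions, so they are identical
h1 = np.bincount(img_demo.ravel(), minlength=256)
h2 = np.bincount(shifted.ravel(), minlength=256)

print(f"pixels changed: {changed:.2%}")
print("histograms equal:", np.array_equal(h1, h2))
```

A one-pixel shift changes almost every entry of the pixel feature vector, while a histogram-based descriptor is unaffected. This is the motivation for features that summarize local structure instead of exact positions.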

Some other "hand-crafted" image features to consider besides pixels: <A href=https://scikit-image.org/docs/dev/auto_examples/features_detection/plot_hog.html>**HOG**</A>, **LBP**, or **SIFT+BOW**. Or perhaps a simple **edge image** may be useful for understanding the kinds of edges that exist in each of these defect types.

In [None]:
# LBP features
from skimage.feature import local_binary_pattern
from skimage import data

# settings for LBP
radius = 1
n_points = 8 * radius

img = image_db[2553].reshape(64, 64)     # change index to view other images
lbp = local_binary_pattern(img, n_points, radius)

showImage(np.hstack((img, lbp)))

lbphist = np.histogram(np.array(lbp, dtype='uint8'), bins=int(lbp.max()+1), density=True)
plt.bar(range(int(lbphist[1].max())+1), lbphist[0])
plt.show()

In [None]:
# HOG features
from skimage.feature import hog

img = image_db[2553].reshape(64, 64)  
fd, hog_img = hog(img, orientations=8, pixels_per_cell=(8, 8),
                  cells_per_block=(1, 1), visualize=True)

print(fd.shape)
showImage(np.hstack((img, hog_img*255.0)))
plt.bar(range(len(fd)), fd)
plt.show()

These features can be used to train the classifiers.

**Q**: Try out some of these suggested features (HOG, LBP) to be used in the ML pipeline. Extract the features for all images and then put them through the same ML pipeline.

What you can try is to test out different settings and features in machine learning and keep a log (or table) of the performance results. It helps a lot in making (or eliminating) decisions when it comes to finding the best method to adopt.

| **Features** | **Classifier** | **Accuracy (%)** |
|---|:-:|:-:|
| Raw pixels | LR | ? |
| HOG | LR | ? |
In [None]:
#Enter your code here


---
## Neural Network

A neural network is a machine learning technique that attempts to mimic the human brain by representing it with a series of connected nodes in different layers. It's not necessary to model the biological complexity of the human brain at a molecular level, just its higher level rules and "thinking" capabilities. 

### Implementing MLP with Keras
To start doing that, we will use **MNIST**, a commonly used handwritten digit dataset consisting of 60,000 images in the training set and 10,000 images in the test set. Each digit has around 6,000 images in the training set. The digit images have been size-normalized and centered in a fixed-size image of 28×28 pixels. We are going to try to implement a Multilayer Perceptron (MLP) neural network using the Keras library.

The raw pixel values will be the input to the network. In preparation, we will reshape each image matrix to an array of dimension 784 (28*28) and feed it to the network as input. We will use an MLP with 2 hidden layers having 512 neurons each. The output layer will have 10 units for the 10 digits. So, we are expecting the class labels to be one-hot encoded.

<img src="https://www.learnopencv.com/wp-content/uploads/2017/10/mlp-mnist-schematic.jpg" width=500>

Keras comes with an MNIST data loader, which downloads the data from its servers if you do not have it on your computer. We immediately divide the loaded data into the respective training and test sets.

In [None]:
from tensorflow.keras.datasets import mnist #keras attached with tensorflow
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [None]:
# check out some information about the data shape
print('Training data shape : ', train_images.shape, train_labels.shape)
print('Testing data shape : ', test_images.shape, test_labels.shape)
 
# Find the unique numbers from the train labels
classes = np.unique(train_labels)
nClasses = len(classes)
print('Total number of outputs : ', nClasses)
print('Output classes : ', classes)
 
plt.figure(figsize=[10,5])
 
# Display a sample image (index 10) from the training data
plt.subplot(121)
plt.imshow(train_images[10,:,:], cmap='gray')
plt.title("Ground Truth : {}".format(train_labels[10]))
 
# Display a sample image (index 10) from the testing data
plt.subplot(122)
plt.imshow(test_images[10,:,:], cmap='gray')
plt.title("Ground Truth : {}".format(test_labels[10]))

Next, we will apply some pre-processing steps to make sure the feature matrix is formatted, data types are compatible and class labels are compatible with the 10-unit output layer of the MLP.
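
The label conversion mentioned above (one-hot encoding) can be sketched in plain NumPy. The four labels in `labels_demo` are made-up examples, and this is only an illustration of the idea, not the Keras implementation.

```python
import numpy as np

labels_demo = np.array([5, 0, 4, 1])
n_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(n_classes, dtype='float32')[labels_demo]

print(one_hot[0])               # 1 at index 5, zeros elsewhere
print(one_hot.argmax(axis=1))   # recovers the original integer labels
```

Each row has exactly one 1, in the position of its class, which lines up with the 10 output units of the network.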

In [None]:
# Change from matrix of dim 28x28 to array of dim 784
dimData = np.prod(train_images.shape[1:])
train_data = train_images.reshape(train_images.shape[0], dimData)
test_data = test_images.reshape(test_images.shape[0], dimData)

print(dimData, train_data.shape, test_data.shape)

In [None]:
# Convert data to float and scale values between 0 and 1
train_data = train_data.astype('float32')
test_data = test_data.astype('float32')
 
# Scale the data to range[0, 1]
train_data /= 255
test_data /= 255

In [None]:
# Convert the labels from integer to categorical (one-hot) encoding, it is the format required by Keras 
# to perform multiclass classification. 
# One-hot encoding is a type of boolean representation of integer data. 
# It converts the integer to an array of all zeros except a 1 at the index of the integer.

from tensorflow.keras.utils import to_categorical
train_labels_one_hot = to_categorical(train_labels)
test_labels_one_hot = to_categorical(test_labels)
 
# Show the category label after one-hot encoding
print('Original label 0 : ', train_labels[0])
print('After conversion to categorical ( one-hot ) : ', train_labels_one_hot[0])

The following block diagram shows the Keras Workflow. Basically, once you have prepared the training and test data, you can follow these steps closely to train a neural network in Keras.

<img src="https://www.learnopencv.com/wp-content/uploads/2017/09/keras-workflow.jpg" width=600>

### Constructing the network structure

As mentioned earlier, we will be creating a neural network with 2 hidden layers and an output layer with 10 units. The number of units in the hidden layers is kept at 512. The input to the network is the 784-dimensional array converted from the 28×28 image.

We will use the Sequential model for building the network. In the Sequential model, we can just stack up layers by adding the desired layer one by one. We use the **Dense** layer, also formally known as a fully connected (FC) layer where all the neurons from one layer are connected to the neurons in the previous layer. Apart from the Dense layer, we add the **ReLU** activation function which is required to introduce non-linearity to the model. This will help the network learn non-linear decision boundaries. The last layer is a **softmax** layer as it is a multiclass classification problem. For binary classification, we can use the sigmoid function.
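
To see what the stacked Dense, ReLU and softmax layers compute, here is a minimal NumPy sketch of a single forward pass. The weights are random (untrained) and the helper functions `relu` and `softmax` are written out by hand for illustration; only the layer sizes follow the text.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    # subtract the row maximum for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.random((2, 784))                   # two fake flattened 28x28 images

# Randomly initialised weights; layer sizes follow the text (784-512-512-10)
W1, b1 = rng.normal(0, 0.05, (784, 512)), np.zeros(512)
W2, b2 = rng.normal(0, 0.05, (512, 512)), np.zeros(512)
W3, b3 = rng.normal(0, 0.05, (512, 10)), np.zeros(10)

h1 = relu(x @ W1 + b1)                     # Dense + ReLU
h2 = relu(h1 @ W2 + b2)                    # Dense + ReLU
probs = softmax(h2 @ W3 + b3)              # Dense + softmax

print(probs.shape)
print(probs.sum(axis=1))                   # each row sums to 1
```

Training adjusts the weight matrices and biases; the structure of the computation stays the same.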

**Side notes**: 
1. Here's a classic response to the question of ["How many hidden units should I use?"](http://www.faqs.org/faqs/ai-faq/neural-nets/part3/section-10.html)
2. Here is why ReLU is the [top choice](https://datascience.stackexchange.com/questions/23493/why-relu-is-better-than-the-other-activation-functions) among non-linear activation functions 
3. [Softmax vs. Sigmoid](https://stats.stackexchange.com/questions/233658/softmax-vs-sigmoid-function-in-logistic-classifier)

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input
 
model = Sequential()
model.add(Input(shape=(dimData,)))
model.add(Dense(512, activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dense(nClasses, activation='softmax'))

Next, we configure the network by choosing the optimizer. Here, we choose `'rmsprop'`. We also specify the loss type, categorical cross entropy, which is used for multiclass classification, and the metrics (accuracy in this case) that we want to track during the training process. You can also try other optimizers such as `'adam'` or `'SGD'`, or add more types of metrics that you would like to keep track of.
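
As a rough illustration of what the categorical cross entropy measures, the sketch below computes it by hand for two sets of made-up predictions. The function is a simplified stand-in written for this example, not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -sum(y_true * log(y_prob)) over the samples."""
    y_prob = np.clip(y_prob, eps, 1.0)     # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

y_true = np.array([[0, 0, 1], [1, 0, 0]], dtype=float)   # one-hot labels
confident = np.array([[0.05, 0.05, 0.90], [0.80, 0.10, 0.10]])
unsure    = np.array([[0.30, 0.40, 0.30], [0.34, 0.33, 0.33]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident)
print(loss_unsure)
```

Only the probability assigned to the true class contributes to the loss, so confident correct predictions give a small loss and hesitant ones a larger loss.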

In [None]:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

### TensorFlow Playground
 
To understand the significance of hidden layers, perhaps you may want to visualize what actually happens in these layers. For this, you can check out an interactive platform from Google called [TensorFlow Playground](http://playground.tensorflow.org/#activation=relu&batchSize=10&dataset=gauss&regDataset=reg-plane&learningRate=0.1&regularizationRate=0&noise=0&networkShape=&seed=0.75972&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false&playButton_hide=false), which is a web app that allows you to create and experiment with simple feedforward neural networks and see the effects of training in real time. You can play around by changing the number of hidden layers, the number of units in a hidden layer, the type of activation function, the type of data, the learning rate, the regularization parameters, etc.

<img src="https://www.learnopencv.com/wp-content/uploads/2017/10/sample-playground-tensorflow.png" width=500>
 
 
### TensorBoard 

The computations you will use TensorFlow for -- like training a massive deep neural network, can be complex and confusing. To make it easier to understand, debug, and optimize TensorFlow programs, a suite of visualization tools called TensorBoard is included when you install the TensorFlow package. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data like images that pass through it. When TensorBoard is fully configured, it looks like this:

<img src="https://www.tensorflow.org/images/mnist_tensorboard.png" width=500>

In the next example, we will explore how TensorBoard can be used to analyze and monitor the training of the network.

For a more detailed explanation on how to use TensorBoard, please check out this tutorial on <a href="https://colab.research.google.com/github/tensorflow/tensorboard/blob/master/docs/get_started.ipynb#scrollTo=Gi4PaRm39of2">Getting Started with TensorBoard</a>.


### Using TensorBoard with Keras Model.fit()

When training with Keras's [Model.fit()](https://www.tensorflow.org/api_docs/python/tf/keras/models/Model#fit), adding the `tf.keras.callbacks.TensorBoard` callback ensures that logs are created and stored. Additionally, enable histogram computation every epoch with `histogram_freq=1` (this is off by default).

Place the logs in a timestamped subdirectory to allow easy selection of different training runs.

In [None]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

from tensorflow.keras.callbacks import TensorBoard
import datetime, os, time

# Clear any logs from previous runs
# !rm -rf ./logs/

log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

The network is now ready to be trained. This is done using the `fit()` function in Keras (just like how you would do in scikit-learn). We specify the number of epochs as 30 for a start. You can increase or decrease this later as you deem fit. In one epoch, the whole of the training data is fed to the network once. We will be using the test data for validation.
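
To get a feel for how much work the training run involves, the number of weight updates can be worked out from the dataset size and the arguments passed to `fit()` in the cell that follows. This is simple arithmetic, assuming one update per mini-batch.

```python
import math

n_train = 60000      # MNIST training images
batch_size = 256     # value passed to fit() in the next cell
epochs = 30          # value passed to fit() in the next cell

steps_per_epoch = math.ceil(n_train / batch_size)   # last batch is smaller
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)   # 235
print(total_updates)     # 7050
```

The step count per epoch should match the progress counter that Keras prints while training.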

In [None]:
start = time.time()
history = model.fit(train_data, train_labels_one_hot, batch_size=256, epochs=30, verbose=1, validation_data=(test_data, test_labels_one_hot), callbacks=[tensorboard_callback])
end = time.time()
print(end - start)

Next, we check the performance on the unseen test data using the `evaluate()` method.

In [None]:
[test_loss, test_acc] = model.evaluate(test_data, test_labels_one_hot)
print("Evaluation result on Test Data : Loss = {}, accuracy = {}".format(test_loss, test_acc))

Start TensorBoard within the notebook:


In [None]:
%tensorboard --logdir logs/fit

## Inference on an image

The first image in the test set is the digit 7. Let us see what the model predicts.

The `predict()` method returns the probabilities (confidence level) of the different classes when performing the prediction. Typically, only the class with largest probability (most confident) is taken as the result. However, it is possible sometimes that the system intends to output a few top scoring classes (instead of just one).
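
Reporting a few top-scoring classes can be sketched with `np.argsort`. The probability vector `probs_demo` below is a made-up example of a softmax output for one image, not an actual prediction from the model.

```python
import numpy as np

# Hypothetical softmax output for one image (10 classes, sums to 1)
probs_demo = np.array([0.01, 0.02, 0.05, 0.10, 0.01,
                       0.02, 0.01, 0.70, 0.04, 0.04])

k = 3
top_k = np.argsort(probs_demo)[::-1][:k]   # indices by descending probability

for cls in top_k:
    print(cls, probs_demo[cls])
```

With `k = 1` this reduces to the usual `np.argmax` decision.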

In [None]:
# Predict the probabilities for each class 
res = model.predict(test_data[[0],:])
print(res)
# Get the most likely class
print(np.argmax(res))

Finally, to get a concise summary of what your neural network model contains, use `summary()`.
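
The parameter counts in the summary can be checked by hand. The sketch below assumes the 784-512-512-10 architecture described earlier, with one weight per connection and one bias per unit in each Dense layer.

```python
layer_sizes = [784, 512, 512, 10]   # input, two hidden layers, output

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    n_params = n_in * n_out + n_out   # weight matrix plus one bias per unit
    print(f"Dense {n_in} -> {n_out}: {n_params} parameters")
    total += n_params

print("Total:", total)
```

Most of the parameters sit in the first two layers, which is typical for fully connected networks on image inputs.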

In [None]:
model.summary()   # summary() prints the table itself and returns None

## Extra Exercise: Labeled Faces in the Wild (LFW) Data

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_lfw_people #dataset from sklearn

lfw_people = fetch_lfw_people(min_faces_per_person=30, resize=0.4)

for name in lfw_people.target_names:
    print(name)
    
print(lfw_people.data.dtype)
print(lfw_people.data.shape)
print(lfw_people.images.shape)
print(lfw_people.target.max())

In [None]:
import numpy as np
import matplotlib.pyplot as plt
#im = np.reshape(lfw_people.data[30], (50, 37))

# find person of index i
idx = 15
folks = lfw_people.data[np.where(lfw_people.target==idx)]

from skimage.util import montage
folks_3d = np.reshape(folks, (folks.shape[0], 50, 37))
plt.imshow(montage(folks_3d), cmap='gray')
plt.title(lfw_people.target_names[idx])
plt.show()

In [None]:
from sklearn.model_selection import train_test_split
y = lfw_people.target
X_train, X_test, y_train, y_test = train_test_split(lfw_people.data, y, test_size=0.2, stratify=None)

classes = np.unique(y)
nClasses = len(classes)
dimData = np.prod(X_train.shape[1:])
print(nClasses, dimData)  #correct

train_data = X_train.astype('float32')
test_data = X_test.astype('float32')
train_data /= 255
test_data /= 255

from tensorflow.keras.utils import to_categorical   # same import path as used earlier
train_labels_one_hot = to_categorical(y_train)
test_labels_one_hot = to_categorical(y_test)
 
# Show the category label after one-hot encoding
print('Original label : ', y_train[0])
print('After conversion to categorical ( one-hot ) : ', train_labels_one_hot[0])

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input

model = Sequential()
model.add(Input(shape=(dimData,)))
model.add(Dense(512, activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dense(nClasses, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(train_data, train_labels_one_hot, batch_size=64, epochs=50, verbose=1, validation_data=(test_data, test_labels_one_hot))

In [None]:
[test_loss, test_acc] = model.evaluate(test_data, test_labels_one_hot)
print("Evaluation result on Test Data : Loss = {}, accuracy = {}".format(test_loss, test_acc))