# CMSC 478 Machine Learning


## Getting Started with Tensorflow, Keras, and Tensorboard

### Instructor: Fereydoon Vafaei

*Type Your Name and ID Here*

This notebook helps you get started with the Tensorflow/Keras API. It assumes you have installed Tensorflow 2.<br>
If you have not installed Tensorflow 2 or have installed previous versions of Tensorflow, you need to [install Tensorflow 2](https://www.tensorflow.org/install) before proceeding. 

## Table of Contents:
* [Installation Verification](#Installation-Verification)
* [A Simple Regression NN](#A-One-Layer-One-Neuron-Regression-Neural-Network-using-Tensorflow/Keras)
* [A Multi-layer NN on MNIST Dataset](#A-Multi-Layer-NN-for-Multi-Class-Classification-on-MNIST-Dataset)
* [Eager Execution in Tensorflow-2](#Eager-Execution-in-Tensorflow-2)
* [Creating the model using the Sequential API](#Creating-the-model-using-the-Sequential-API)
* [Fashion MNIST Dataset](#Fashion-MNIST-Dataset)
* [Using Code Examples from keras.io](#Using-Code-Examples-from-keras.io)
* [California House Pricing](#California-House-Pricing)
* [Callbacks](#Callbacks)
* [Tensorboard](#Tensorboard)
* [Exercise-1](#Exercise-1)
* [Exercise-2](#Exercise-2)
* [References](#References)
* [Grading and Submission](#Grading-and-Submission)

Tensorflow is one of the most popular ML/DL frameworks. Watch this video first:

https://www.youtube.com/watch?v=744f60NyAgc

### Installation Verification

**Very Important Note**:

**RUN ALL CELLS REQUIREMENT**: You must run all cells to get the outputs and then attempt exercises. Otherwise, if any cell is not run with the correct output, your notebook gets ZERO even if you've completed the exercises.

In [None]:
import tensorflow as tf

In [None]:
from tensorflow import keras

In [None]:
# your tf version should be 2.0.0 or higher
tf.__version__

In [None]:
keras.__version__

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

### A One-Layer One-Neuron Regression Neural Network using Tensorflow/Keras

Our first example is a regression NN with only one layer and one neuron to recognize the pattern of a sequence of numbers.

In [None]:
# A simple linear regression NN with one layer

# build a one-layer one-neuron NN
layer_1 = keras.layers.Dense(units=1, input_shape=[1])
model = tf.keras.Sequential([layer_1])

# compile model
model.compile(optimizer='sgd', loss='mean_squared_error') # 'mse'

# data: y = 2x - 1
xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float) 

# train NN
model.fit(xs, ys, epochs=1000)

In [None]:
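# using NN, predict y when x=10.0
# (a NumPy array is passed because some newer Keras versions do not accept a plain Python list)
print("y_pred when x=10.0", model.predict(np.array([10.0])))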

print("Parameters: {}".format(layer_1.get_weights()))
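To make the training step less of a black box, here is a minimal NumPy sketch of what a single neuron `y = w*x + b` learns when it is trained with gradient descent on the mean squared error. It is not the Keras implementation: the zero initialization, the full-batch updates, and the learning rate of 0.01 are assumptions chosen for illustration.

```python
import numpy as np

# same data as above: y = 2x - 1
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

# assumed starting point and step size (illustrative, not taken from Keras)
w, b = 0.0, 0.0
learning_rate = 0.01

for epoch in range(1000):
    error = (w * xs + b) - ys           # prediction error for every sample
    grad_w = 2.0 * np.mean(error * xs)  # d(MSE)/dw
    grad_b = 2.0 * np.mean(error)       # d(MSE)/db
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

y_pred = w * 10.0 + b
print("w =", w, "b =", b, "y_pred(10.0) =", y_pred)
```

The learned weight and bias should end up close to 2 and -1, so the prediction for `x=10.0` should be close to 19, which is the same pattern the Keras model above is expected to find.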

### A Multi-Layer NN for Multi-Class Classification on MNIST Dataset

Load and prepare the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). Convert the samples from integers to floating-point numbers:

In [None]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

In [None]:
x_train[500]

In [None]:
# build a multi-layer NN for multi-class classification
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)), # input layer
  tf.keras.layers.Dense(128, activation='relu'), # hidden layer with 128 neurons
  tf.keras.layers.Dropout(0.2), # dropout is a regularization technique
  tf.keras.layers.Dense(10, activation='softmax')]) # output layer has 10 neurons and softmax activation function

# compile model for multi-class classification
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy', # loss='SparseCategoricalCrossentropy'
              metrics=['accuracy'])

> Use the `loss='sparse_categorical_crossentropy'` loss function when there are two or more label classes and the labels are provided as integers. If you want to provide labels using a `one-hot` representation, use the `CategoricalCrossentropy` loss.

> Read more about [`SparseCategoricalCrossentropy`](https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy) and [`CategoricalCrossentropy`](https://www.tensorflow.org/api_docs/python/tf/keras/losses/CategoricalCrossentropy) in the Tensorflow documentation.
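The difference between the two losses is only in how the labels are encoded. The following NumPy sketch (not the Tensorflow implementation; the probability values are made up for illustration) computes the cross-entropy once from integer labels and once from one-hot labels and shows that the two agree.

```python
import numpy as np

# hypothetical softmax outputs for 3 samples and 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])

int_labels = np.array([0, 1, 2])   # "sparse" integer labels
one_hot = np.eye(3)[int_labels]    # the same labels, one-hot encoded

# sparse form: pick the probability of the true class by index
sparse_ce = -np.log(probs[np.arange(len(int_labels)), int_labels]).mean()

# one-hot form: the one-hot vector masks out every class except the true one
dense_ce = -(one_hot * np.log(probs)).sum(axis=1).mean()

print(sparse_ce, dense_ce)
```

Both numbers are the same quantity, so the choice between the two losses depends only on the label format you have.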

In [None]:
model.summary()

In [None]:
# train NN
model.fit(x_train, y_train, epochs=5)

# test NN
model.evaluate(x_test,  y_test, verbose=2)

The image classifier is now trained to ~98% accuracy on this dataset.

### Eager Execution in Tensorflow 2

Tensorflow 2 enables ["Eager Execution"](https://www.tensorflow.org/guide/eager) by default: operations are evaluated immediately and return concrete values, which makes it more convenient to work with tensors and graph computations. See the examples below and compare them with chapter 9 of the 1st edition of the textbook, which uses `Session()` and `run()` to execute these operations.

In [None]:
x = tf.Variable(3, name="x")

In [None]:
y = tf.Variable(4, name="y")

In [None]:
f = x*x*y + y + 2

In [None]:
x

In [None]:
y

In [None]:
f

In [None]:
print(f)

In [None]:
print(x.numpy())
print(y.numpy())
print(f.numpy())

### Creating the model using the Sequential API

Now let’s review the steps in building a neural network! Here is a classification MLP with two hidden
layers:

In [None]:
# build a NN for MNIST
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28])) # the input layer
model.add(keras.layers.Dense(300, activation="relu")) # the first hidden layer with 300 neurons
model.add(keras.layers.Dense(100, activation="relu")) # the 2nd hidden layer with 100 neurons
model.add(keras.layers.Dense(10, activation="softmax")) # the output layer: it's a 10-class classification

Let’s go through this code line by line:
- The first line creates a Sequential model. This is the simplest kind of Keras model for neural networks that are just composed of a single stack of layers connected sequentially. This is called the Sequential API.

- Next, we build the first layer and add it to the model. It is a Flatten layer whose role is to convert each input image into a 1D array: if it receives input data X, it computes `X.reshape(-1, 28*28)`. This layer does not have any parameters; it is just there to do some simple preprocessing. Since it is the first layer in the model, you should specify the `input_shape`, which doesn’t include the batch size, only the shape of the instances. Alternatively, you could add a `keras.layers.InputLayer` as the first layer, setting `input_shape=[28,28]`.

- Next, we add a Dense hidden layer with 300 neurons. It will use the ReLU activation function. Each Dense layer manages its own weight matrix, containing all the connection weights between the neurons and their inputs. It also manages a vector of bias terms (one per neuron). When it receives some input data, it computes the following (Equation 10-2 in the textbook):

$$h_{\mathbf{W}, \mathbf{b}}(\mathbf{X}) = \phi (\mathbf{X} \mathbf{W} + \mathbf{b})$$

- Then we add a second Dense hidden layer with 100 neurons, also using the ReLU activation function.

- Finally, we add a Dense output layer with 10 neurons (one per class), using the
softmax activation function (because the classes are exclusive).
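The layer-by-layer description above can be sketched in plain NumPy. This is not how Keras is implemented; the random inputs, the weight initialization, and the helper names `relu` and `softmax` are assumptions made for illustration. The sketch flattens a small batch of 28x28 inputs, applies $\phi(\mathbf{X}\mathbf{W} + \mathbf{b})$ for each Dense layer, and counts the parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# a hypothetical batch of 5 "images" of shape 28x28
X = rng.random((5, 28, 28))

# Flatten: one row of 784 values per image, no parameters involved
X_flat = X.reshape(len(X), -1)

# Dense layers: 784 -> 300 -> 100 -> 10
sizes = [784, 300, 100, 10]
weights = [rng.normal(0.0, 0.05, size=(n_in, n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

h = X_flat
for W, b in zip(weights[:-1], biases[:-1]):
    h = relu(h @ W + b)                              # hidden layers use ReLU
probs = softmax(h @ weights[-1] + biases[-1])        # output layer uses softmax

n_params = sum(W.size + b.size for W, b in zip(weights, biases))
print(X_flat.shape, probs.shape, n_params)
```

Each row of `probs` sums to 1, which is why softmax suits mutually exclusive classes. The parameter count, 784*300 + 300 + 300*100 + 100 + 100*10 + 10 = 266,610, should match the total reported by `model.summary()` for this architecture.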

> Specifying `activation="relu"` is equivalent to specifying `activation=keras.activations.relu`. Other activation functions are available in the keras.activations package. See https://keras.io/activations/ for the full list.

> Instead of adding the layers one by one as we just did, you can pass a list of layers when creating the Sequential model:

In [None]:
model = keras.models.Sequential([
keras.layers.Flatten(input_shape=[28, 28]),
keras.layers.Dense(300, activation="relu"),
keras.layers.Dense(100, activation="relu"),
keras.layers.Dense(10, activation="softmax")
])

### Fashion MNIST Dataset

This is [another example of image classification](https://github.com/zalandoresearch/fashion-mnist).

In [None]:
fashion_mnist = keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

In [None]:
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

In [None]:
train_images.shape

In [None]:
train_labels

In [None]:
plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.grid(False)

In [None]:
train_images = train_images / 255.0

test_images = test_images / 255.0

In [None]:
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])

In [None]:
# build a multi-class classification NN
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')]) # the output layer has 10 neurons and softmax activation

In [None]:
# compile model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

In [None]:
history = model.fit(train_images, train_labels, validation_split=0.1, epochs=10)

> The `fit()` method returns a History object containing the training parameters `history.params`, the list of epochs it went through `history.epoch`, and most importantly a dictionary `history.history` containing the loss and extra metrics it measured at the end of each epoch on the training set and on the validation set (if any).

In [None]:
history.params

In [None]:
print(history.epoch)

In [None]:
history.history.keys()

> If you use this dictionary to create a pandas `DataFrame` and call its `plot()` method, you get the learning curves:

In [None]:
pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1) # set the vertical range to [0-1]

In [None]:
test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)

print('\nTest accuracy:', test_acc)

In [None]:
# ignore the warning if any
predictions = model.predict(test_images)

In [None]:
predictions[0]

In [None]:
np.argmax(predictions[0])
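Each row of `predictions` holds one probability per class, and `np.argmax` returns the index of the largest one. As a small sketch of how that index maps back to a readable label, the following uses a made-up probability vector (an assumption, not real model output) together with the `class_names` list defined earlier.

```python
import numpy as np

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# hypothetical softmax output for a single test image
probs = np.array([0.01, 0.01, 0.02, 0.01, 0.02, 0.05, 0.03, 0.10, 0.05, 0.70])

predicted_label = int(np.argmax(probs))        # index of the most likely class
predicted_name = class_names[predicted_label]  # human-readable class name
print(predicted_label, predicted_name, probs[predicted_label])
```

With real model output you would replace `probs` with `predictions[i]` and compare `predicted_label` against `test_labels[i]`.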

> You can read more on MNIST fashion example [here](https://www.tensorflow.org/tutorials/keras/classification).

### Using Code Examples from keras.io

Code examples documented on keras.io will work fine with `tf.keras`, but you need to change the imports. For example, consider this keras.io code which can't be run on this notebook:

In [None]:
from keras.layers import Dense
output_layer = Dense(10)

> You must change the imports like this:

In [None]:
from tensorflow.keras.layers import Dense
output_layer = Dense(10)

# Or simply use full paths, if you prefer:
from tensorflow import keras
output_layer = keras.layers.Dense(10)

The full-path approach is more verbose, but it makes it easy to see which packages are used and avoids confusion between standard classes and custom classes.

In production code, the shorter approach is typically preferred. Many people also use `from tensorflow.keras import layers` followed by `layers.Dense(10)`.

### California House Pricing

Next, we're going to build a regression model for California house pricing.

In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

In [None]:
# fetch data
housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(
housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(
X_train_full, y_train_full)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)
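Note that the scaler in the cell above is fitted on the training set only and then reused for the validation and test sets. A minimal NumPy sketch of that idea follows; the random data and the variable names are assumptions for illustration, and the arithmetic is the usual standardization (subtract the training mean, divide by the training standard deviation).

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical raw features: 200 training rows and 50 validation rows, 3 features each
X_train_raw = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
X_valid_raw = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

# "fit": statistics are computed on the training set only
mean = X_train_raw.mean(axis=0)
std = X_train_raw.std(axis=0)

# "transform": the same training statistics are applied to every split
X_train_scaled = (X_train_raw - mean) / std
X_valid_scaled = (X_valid_raw - mean) / std

print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
```

The scaled training features have mean 0 and standard deviation 1 per column. The validation features are only approximately standardized, which is expected because no information from them is used to compute the statistics.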

In [None]:
# build a regression NN
model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=X_train.shape[1:]),
    keras.layers.Dense(1) # output layer with 1 neuron and with None activation function because it's regression
])

# compile NN
model.compile(loss="mean_squared_error", optimizer="sgd")
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_valid, y_valid))

In [None]:
mse_test = model.evaluate(X_test, y_test)

In [None]:
X_new = X_test[:3] # pretend these are new instances
y_pred = model.predict(X_new)

In [None]:
y_pred

### Callbacks

What if training lasts several hours? This is quite common, especially when training on large datasets.

In this case, you should not only save your model at the end of training, but also save **checkpoints** at regular intervals during training, to avoid losing everything if your computer crashes.

How can you tell the `fit()` method to save **checkpoints**? Use **callbacks**.

The `fit()` method accepts a `callbacks` argument that lets you specify a list of objects that Keras will call at the start and end of training, at the start and end of each epoch, and even before and after processing each batch. For example, the `ModelCheckpoint` callback saves checkpoints of your model at regular intervals during training, by default at the end of each epoch.

Moreover, if you use a validation set during training, you can set `save_best_only=True` when creating the `ModelCheckpoint` .

In this case, it will only save your model when its performance on the validation set is the best so far. This way, you do not need to worry about training for too long and overfitting the training set: simply restore the last model saved after training, and this will be the best model on the validation set. 

In [None]:
# build model
model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=[8]),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1)
])

In [None]:
# compile model (the `lr` argument is deprecated; use `learning_rate`)
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))

The following code is a simple way to implement **early stopping**:

In [None]:
# create checkpoint
checkpoint_cb = keras.callbacks.ModelCheckpoint("my_keras_model.h5", save_best_only=True)

In [None]:
# train with callbacks and validation_data
history = model.fit(X_train, y_train, epochs=10,
                    validation_data=(X_valid, y_valid),
                    callbacks=[checkpoint_cb])

In [None]:
model = keras.models.load_model("my_keras_model.h5") # rollback to best model
mse_test = model.evaluate(X_test, y_test)

> Another way to implement **early stopping** is to simply use the `EarlyStopping` callback. It will interrupt training when it measures no progress on the validation set for a number of epochs (defined by the `patience` argument), and it will optionally roll back to the best model.

> You can combine both callbacks to save checkpoints of your model (in case your computer crashes) and interrupt training early when there is no more progress (to avoid wasting time and resources):

In [None]:
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))
early_stopping_cb = keras.callbacks.EarlyStopping(patience=10,
                                                  restore_best_weights=True)
history = model.fit(X_train, y_train, epochs=100,
                    validation_data=(X_valid, y_valid),
                    callbacks=[checkpoint_cb, early_stopping_cb])
mse_test = model.evaluate(X_test, y_test)
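The `patience` logic behind early stopping can be sketched in a few lines of plain Python. This is a simplified illustration, not the Keras `EarlyStopping` implementation: the function name `early_stopping_epoch` and the list of validation losses are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (epoch at which training stops, epoch with the best validation loss)."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # number of consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # stop: no progress for `patience` epochs
    return len(val_losses) - 1, best_epoch  # ran through all epochs

# made-up validation losses, one per epoch
val_losses = [0.90, 0.70, 0.60, 0.62, 0.61, 0.63, 0.59]
stopped_at, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(stopped_at, best_epoch)
```

In this made-up run, training stops at epoch 5 because epochs 3, 4, and 5 do not improve on epoch 2, and the weights from epoch 2 are the ones that would be restored. The later improvement at epoch 6 is never seen, which is the trade-off of choosing a small `patience`.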

> If you need extra control, you can easily write your own custom callbacks. As an example of how to do that, the following custom callback will display the ratio between the validation loss and the training loss during training (e.g., to detect over‐fitting):

In [None]:
class PrintValTrainRatioCallback(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs):
        print("\nval/train: {:.2f}".format(logs["val_loss"] / logs["loss"]))

In [None]:
val_train_ratio_cb = PrintValTrainRatioCallback()
history = model.fit(X_train, y_train, epochs=1,
                    validation_data=(X_valid, y_valid),
                    callbacks=[val_train_ratio_cb])

### Tensorboard

A neat feature of Tensorflow and Keras is visualization through Tensorboard. The following code shows how you can visualize your training using Tensorboard.

In [None]:
import os
root_logdir = os.path.join(os.curdir, "my_logs")
def get_run_logdir():
    import time
    run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
    return os.path.join(root_logdir, run_id)
run_logdir = get_run_logdir() # e.g., './my_logs/run_2019_06_07-15_15_22'

In [None]:
# Tensorboard Visualization
tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid), callbacks=[tensorboard_cb])

> Next, run the following command at the root of the project directory where `my_logs` has been saved (or from anywhere else, as long as you point to the appropriate log directory):

> `$ tensorboard --logdir=./my_logs --port=6006`

> And finally, once the server is up, you can open a web browser and go to:

> http://localhost:6006

## Exercise-1

**Exercise-1 has 10 points**.

In this exercise you'll try to build a neural network that predicts the price of a house according to a simple formula.

- Imagine that house pricing were as simple as this: a house costs 50k plus 50k per bedroom, so a 1-bedroom house costs 100k, a 2-bedroom house costs 150k, and so on.

- How would you create a neural network that learns this relationship, so that it predicts a 7-bedroom house as costing close to 400k?

**Hint**: Your network might work better if you scale the house price down. You don't have to predict 400 directly; it might be better to predict the number 4 and interpret the answer in hundreds of thousands.

In [None]:
# create data with at least 6 data points for x and y
xs = np.array([...], dtype=float)
ys = np.array([...], dtype=float)

In [None]:
# build model with one layer and one neuron
model = ...

In [None]:
# compile model - be careful to use the correct loss for regression
...

In [None]:
# train model with 1000 epochs
...

In [None]:
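# predict the price of a 7-bedroom house
# (a NumPy array is passed because some newer Keras versions do not accept a plain Python list)
print(model.predict(np.array([7.0])))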

## Exercise-2

**Exercise-2 has 20 points**.

In this notebook you learned how to do classification using Fashion MNIST, a data set containing items of clothing, and a similar dataset called MNIST which has items of handwriting -- the digits 0 through 9.

Write an MNIST classifier that trains to 99% accuracy or above, and does it without a fixed number of epochs -- i.e. you should stop training once you reach that level of accuracy using `callbacks`.

- **Requirements**:
1. It should succeed in fewer than 10 epochs.
2. When it reaches 99% or greater it should print out the string `"Reached 99% accuracy so cancelling training!"` as specified in the `myCallback` class.

In [None]:
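class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # logs may be None or lack 'accuracy'; fall back to 0.0 so the comparison is safe
    accuracy = (logs or {}).get('accuracy', 0.0)
    if accuracy > 0.99:
      print("\nReached 99% accuracy so cancelling training!")
      self.model.stop_training = True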

callbacks = myCallback()

mnist = tf.keras.datasets.mnist

# load data
(x_train, y_train),(x_test, y_test) = mnist.load_data()

# normalize data
x_train, x_test = x_train / 255.0, x_test / 255.0

In [None]:
# build model - be careful about the activation functions of the hidden layer and output layer
model = ...

In [None]:
# compile model - be careful to use the correct loss for multi-class classification, metrics should be 'accuracy'
...

In [None]:
# train model with 10 epochs (will stop earlier) and callbacks
# Note: Your output should include the message: "Reached 99% accuracy so cancelling training!"
# The output should also include:
# <tensorflow.python.keras.callbacks.History at MEMORY_ADDRESS> 
...

### References

- [1] - [Tensorflow Website](https://www.tensorflow.org/)
- [2] - [Tensorflow Tutorials](https://www.tensorflow.org/tutorials)
- [3] - [Hands-On ML Textbook 2nd Edition](https://learning.oreilly.com/library/view/hands-on-machine-learning/9781492032632/)
- [4] - [DeepLearning.AI TensorFlow Developer Professional Certificate - Course-1](https://www.coursera.org/professional-certificates/tensorflow-in-practice)

## Grading and Submission

Name your notebook ```Lastname-tf-notebook.ipynb```. Submit the file using the ```tf-notebook``` link on Blackboard.

- tf-notebook has a total of 30 points which will be counted towards the "Assignment" section of your final grade.

- **RUN ALL CELLS REQUIREMENT**: You must run all cells to get the outputs and then attempt exercises. Otherwise, if any cell is not run with the correct output, your notebook gets ZERO even if you've completed the exercises.

Grading will be based on 

  * verification of correct installation of Tensorflow
  * error-free running of all the cells - all outputs and plots must be included - any missing output would cause the notebook to get ZERO!
  * correct answers to the exercises - Exercise-1 [10 points], Exercise-2 [20 points]
  
<font color=red><b>Due Date: Tuesday April 20th, 11:59PM</b></font>