# Lab 2 - Classification and Fitting with Neural Networks in Tensorflow


In this notebook, you will be introduced to TensorFlow. We will:

-  perform function fitting with neural networks (Part A)
-  perform classification on handwritten digits using Tensorflow (Part B)

Complete the code where appropriate according to the instructions (see comments).

You are free to tweak the hyper-parameters (including the number of hidden units, number of hidden layers, learning rate, number of iterations and so on) to improve the performance of the model.

Make sure that your final submission is a notebook that can be run from beginning to end.

For Part A: You should try to get pretty close to the ground truth function where requested. 

For Part B: The prediction accuracy on the test set should be > 60%. In fact, it is easy to reach > 95% accuracy on this dataset with careful tuning of the hyper-parameters. **Your grade will depend on the final prediction accuracy**.

In [1]:
import tensorflow as tf
from collections import deque
import numpy as np
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt


In [None]:
tf.__version__ # Make sure you have version 2.x or later

# Short Experiments with Tensors
No code is necessary in this section.

In [None]:
# Let's create a random tensor 
t = tf.random.uniform([2,3],-1,3)
t # t.shape -> (2,3)

In [None]:
5*(t**2+10) # Tensor arithmetic

In [None]:
t.numpy() # convert to numpy

# Part A: Function fitting using Neural Networks 

## Fit a Simple Linear Function
- No need to change anything in this subsection
- Read the code and understand what is happening

- In this part you will see how to use tensorflow to implement a simple linear regression 

In [None]:
# No code changes are necessary here but feel free to experiment
true_a = 1.13
true_b = 2.3
datapoints =  1000
noise_intensity = 0.1
data_x = (np.arange(datapoints) / (datapoints) - .5).astype(np.float32) # cast to float32, the default dtype of Keras layers
data_y = (data_x * true_a + true_b + np.random.randn(*data_x.shape) * noise_intensity).astype(np.float32)
_ = plt.scatter(data_x, data_y, c='b')

In [None]:
from tensorflow.keras import Input
from tensorflow.keras import layers,Model

model = tf.keras.Sequential()
model.add(layers.Dense(1,input_shape=[1])) # no activation function here

In [None]:
model.compile(optimizer="Adam", loss="mse")
# NOTE during your experimentation you can use verbose=1 or 2, but for final result you can use verbose=0
history = model.fit(data_x,data_y,epochs=250,verbose=0) 

In [None]:
model.layers # Observe the layers

In [None]:
model.layers[0].weights # Observe the weights learned
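
As a sanity check on the learned weights, the same line can also be fitted in closed form with ordinary least squares. The sketch below is not part of the original lab: it regenerates the data with the same constants, uses a seeded NumPy generator (an assumption made here only for reproducibility), and calls `np.polyfit`. The estimates should land close to `true_a` and `true_b`, which gives a reference for what the Dense layer's kernel and bias should approach.

```python
import numpy as np

# Same constants as the data-generation cell above
true_a = 1.13
true_b = 2.3
datapoints = 1000
noise_intensity = 0.1

# Assumption: a seeded generator, used only to make this check reproducible
rng = np.random.default_rng(0)
data_x = (np.arange(datapoints) / datapoints - 0.5).astype(np.float32)
data_y = (data_x * true_a + true_b
          + rng.standard_normal(data_x.shape) * noise_intensity).astype(np.float32)

# A degree-1 polynomial fit returns [slope, intercept]
slope, intercept = np.polyfit(data_x.astype(np.float64),
                              data_y.astype(np.float64), deg=1)
print(f"least-squares slope: {slope:.3f}, intercept: {intercept:.3f}")
```

If the values in `model.layers[0].weights` are far from these estimates, training for more epochs or raising the learning rate is a reasonable first thing to try.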

In [None]:
plt.scatter(data_x, data_y, c='b')
plt.plot(data_x, model.predict(data_x).T[0],c='r')

## Fit a nonlinear function using Neural Networks 

- In this part you will write code that fits the dataset below.
- The ground truth is a sinusoidal function with added noise.
- You should implement a slightly more complicated model than before.

In [None]:
# no need to change anything here
# this generates the data
freq = 10
noise_intensity = 0.4
data_y2 = ( np.sin(freq*data_x) * true_a + true_b + np.random.randn(*data_x.shape) * noise_intensity).astype(np.float32)
_ = plt.scatter(data_x, data_y2, c='b')

- Implement the neural network model in the following cell.
- You can experiment with various approaches: try more than one layer, and vary the number of units per layer.
- You will need an activation function to introduce nonlinearity into your neural network.
- For example, you can pass the parameter activation="relu" to a Dense layer. There are other activation functions you can try as well.
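
To see why the activation matters, note that two Dense layers with no activation between them compose into a single affine map, so the stacked model can still only draw a straight line. The NumPy sketch below illustrates this with randomly chosen weights. The layer sizes and the names `W1`, `b1`, `W2`, `b2` are illustrative assumptions, and the snippet is meant as an illustration of the idea rather than a solution to the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for a 1 -> 8 -> 1 network
W1, b1 = rng.standard_normal((1, 8)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((8, 1)), rng.standard_normal(1)

x = np.linspace(-0.5, 0.5, 5).reshape(-1, 1)

# Two linear layers with no activation in between
linear_stack = (x @ W1 + b1) @ W2 + b2

# The same computation collapsed into a single affine map
W_eff = W1 @ W2
b_eff = b1 @ W2 + b2
collapsed = x @ W_eff + b_eff

# The stacked and collapsed versions agree up to floating-point error
print(np.allclose(linear_stack, collapsed))

# With a ReLU between the layers the output is piecewise linear instead
relu_stack = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2
print(relu_stack.ravel())
```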

In [None]:
# TODO Make a nonlinear model to fit the nonlinear function

#model = tf.keras.Sequential() 
#model.add(...)
# ....
#model.add(...)
#model.compile(optimizer="Adam", loss="mse")

In [None]:
history = model.fit(data_x,data_y2,epochs=250,verbose=0) # fit your model

In [None]:
model.layers # Notice how many dense layers we have now

In [None]:
# overlay answer and data
# NOTE: your result should be close to the ground truth function (sinusoidal function)
plt.scatter(data_x, data_y2, c='b')
plt.plot(data_x, model.predict(data_x).T[0],c='r')  

In [None]:
# How does your model extrapolate? 
# -> It's okay if it doesn't extrapolate
plt.scatter(data_x, data_y2, c='b')
plt.plot(data_x-1, model.predict(data_x-1).T[0],c='r') 
plt.plot(data_x, model.predict(data_x).T[0],c='r') 
plt.plot(data_x+1, model.predict(data_x+1).T[0],c='r') 

In [None]:
# OPTIONAL Question 
# - Create your dataset based on the function of your choice (logarithmic, root, exponential etc)
# - Your dataset for X can be different than a uniform grid, say for example uniform distribution, normal distribution etc
# - does your model interpolate well? does it extrapolate well?

# Part B: Classification using Neural Networks

In this part we build a model for image classification.
We will use the MNIST handwritten digit dataset, which is a toy benchmark for image
classification models. First load the dataset via the TensorFlow API.

Here $X_{train},Y_{train}$ denote the training data and $X_{test},Y_{test}$ denote the testing data. We train the model on the training set and evaluate its performance on the testing set (to detect potential under-fitting or over-fitting). As can be seen below, $X_{train}$ contains many examples, each with $28 \times 28$ pixels. $Y_{train}$ contains the corresponding labels (i.e. one of the $10$ classes).

In [None]:
mnist = tf.keras.datasets.mnist.load_data(path="mnist.npz") # Dataset

In [None]:
(X_train, Y_train), (X_test, Y_test) = tf.keras.datasets.mnist.load_data()

In [None]:
X_train = X_train/255.0 # Normalize your data
X_test  = X_test/255.0 
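
The Flatten layer used in the model template turns each $28 \times 28$ image into a 784-element vector, and dividing by 255 maps the pixel values into $[0, 1]$. Here is a small NumPy sketch of both steps. It uses a randomly generated stand-in batch (`fake_images` is a hypothetical name; the real MNIST arrays are not loaded here) and assumes the pixels are stored as unsigned 8-bit integers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a batch of 4 grayscale images with integer pixels in [0, 255]
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Scale to [0, 1], as in the normalization cell above
scaled = fake_images / 255.0

# Flatten each image into one row, mirroring what a Flatten layer does
flat = scaled.reshape(len(scaled), -1)

print(flat.shape)              # (4, 784)
print(flat.min(), flat.max())  # both lie within [0, 1]
```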

In [None]:
# Print an image and its label
plt.imshow(X_train[0])
print(Y_train[0])

In [None]:
# TODO: Create a simple Model to predict the label of a digit with one dense layer
#model = tf.keras.Sequential()
#model.add(layers.Flatten(input_shape=[28,28]))
#model.add(...)
#model.compile(optimizer="Adam", loss="sparse_categorical_crossentropy",metrics=['accuracy'])
#history = model.fit(X_train,Y_train, epochs= ...,validation_data=....)
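
The `sparse_categorical_crossentropy` loss in the template expects integer class labels (as in `Y_train`) rather than one-hot vectors. Assuming the last layer outputs softmax probabilities, the loss for a batch is the mean negative log-probability assigned to the true class. A minimal NumPy sketch with made-up probabilities (the arrays `probs` and `labels` are hypothetical):

```python
import numpy as np

# Hypothetical softmax outputs for 2 samples and 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 2])  # integer class ids, not one-hot vectors

# Pick the probability assigned to the true class of each sample
p_true = probs[np.arange(len(labels)), labels]

# Mean negative log-likelihood over the batch
loss = -np.log(p_true).mean()
print(loss)
```

A confident, correct prediction drives `p_true` toward 1 and the loss toward 0.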

In [None]:
model.evaluate(X_test,Y_test) # Evaluate your results

In [None]:
# TODO: Create a more complicated model with more than one layer
# and evaluate your model on the test set
# try to improve the performance compared to the previous model

# model.evaluate(X_test,Y_test) # Evaluate your results

In [None]:
# Optional question
# Identify digits that your model mispredicts, and display them. 
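
One possible way to locate mispredicted samples is to compare the arg-max of the predicted probabilities with the true labels. The sketch below uses small made-up arrays (`probs` and `labels` are hypothetical stand-ins, not real model output) to show the indexing only.

```python
import numpy as np

# Hypothetical class probabilities for 5 samples and 3 classes
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
])
labels = np.array([0, 1, 1, 2, 2])

predicted = probs.argmax(axis=1)             # most likely class per sample
wrong = np.flatnonzero(predicted != labels)  # indices of the mistakes

print(predicted)  # [0 1 2 0 2]
print(wrong)      # [2 3]
```

With a trained model, `probs` would come from `model.predict(X_test)` and `labels` would be `Y_test`; each selected image could then be shown with `plt.imshow(X_test[i])`.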