## CSCI 470 Activities and Case Studies

1. For all activities, you are allowed to collaborate with a partner. 
1. For case studies, you should work individually and are **not** allowed to collaborate.

By filling out this notebook and submitting it, you acknowledge that you are aware of the above policies and are agreeing to comply with them.

Some considerations with regard to how these notebooks will be graded:

1. You can add more notebook cells or edit existing notebook cells other than "# YOUR CODE HERE" to test out or debug your code. We actually highly recommend you do so to gain a better understanding of what is happening. However, during grading, **these changes are ignored**. 
2. You must ensure that all your code for the particular task is available in the cells that say "# YOUR CODE HERE"
3. Every cell that says "# YOUR CODE HERE" is followed by a "raise NotImplementedError". You need to remove that line. During grading, if an error occurs then you will not receive points for your work in that section.
4. If your code passes the "assert" statements, then no output will result. If your code fails the "assert" statements, you will get an "AssertionError". Getting an assertion error means you will not receive points for that particular task.
5. If you edit the "assert" statements to make your code pass, they will still fail when they are graded since the "assert" statements will revert to the original. Make sure you don't edit the assert statements.
6. We may sometimes have "hidden" tests for grading. The visible "assert" statements are there as a guide, but you need to make sure you understand what you're required to do and ensure that you are doing it correctly. Passing the visible tests is necessary but not sufficient to get the grade for that cell.
7. When you are asked to define a function, make sure you **don't** use any variables outside of the parameters passed to the function. You can think of the parameters being passed to the function as a hint. Make sure you're using all of those variables.
8. Finally, **make sure you run "Kernel > Restart and Run All"** and pass all the asserts before submitting. If you don't restart the kernel, state left over from code that you ran and later deleted may still be in memory, and that may be the only reason your asserts are passing.
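
Items 7 and 8 describe the same pitfall from two sides: a function that silently reads a notebook-level variable works only while that variable happens to exist in the kernel. A minimal sketch, using hypothetical names (`scale`, `rescale_bad`, `rescale_good`) that are not part of this assignment:

```python
scale = 2.0  # notebook-level variable; gone if its defining cell is deleted


def rescale_bad(values):
    # Relies on the global `scale`: fails or changes behavior if the
    # global is removed or redefined in another cell.
    return [v * scale for v in values]


def rescale_good(values, scale):
    # Uses only its parameters, so it behaves the same in any kernel state.
    return [v * scale for v in values]


print(rescale_good([1.0, 2.0], scale=3.0))  # [3.0, 6.0]
```

After "Restart and Run All", `rescale_bad` raises a `NameError` if the cell defining `scale` no longer exists, while `rescale_good` is unaffected.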

# Deep Learning

In this exercise, you will use a deep neural network to predict house values from the provided input data. You will use Keras to build the model. Below is a description of how Keras models are set up.

**Please read more about the data [here](https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html).**

In [3]:
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout

np.random.seed(0)

In [4]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.boston_housing.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/boston_housing.npz


In [5]:
x_train.shape

(404, 13)

In [6]:
y_train.shape

(404,)

In [7]:
y_train.mean(), y_train.std()

(22.395049504950492, 9.199035423364862)

Working with a Keras model consists of several steps:

1. Construct the model: layers, neurons per layer, and activation functions
1. Choose the loss function, metrics, and optimization method
1. Fit the model to the training data
1. Evaluate the model on test data using the same metrics

Some relevant docs:
 - [initializers](https://keras.io/initializers/)
 - [loss functions](https://www.tensorflow.org/api_docs/python/tf/keras/losses)
 - [regularizations](https://keras.io/regularizers/)
 - [optimizers](https://keras.io/optimizers/)
 - [metrics](https://www.tensorflow.org/api_docs/python/tf/keras/metrics)


First, to construct a model, use the [Sequential](https://keras.io/getting-started/sequential-model-guide/) object. You can pass a list of layers to `Sequential`. For this exercise, we will use only [Dense](https://keras.io/layers/core/#dense) layers.
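
To make the role of a Dense layer concrete, here is a NumPy-only sketch of the computation it performs, `activation(x @ W + b)`, chained over the layer widths used in this exercise. The random weights and the helper names (`dense`, `forward`, `params`) are illustrative assumptions, not Keras internals.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(z, 0.0)


def dense(x, weights, bias, activation):
    """Compute activation(x @ weights + bias), the operation a Dense layer applies."""
    return activation(x @ weights + bias)


# Layer widths in this exercise: 13 inputs -> 10 -> 10 -> 10 -> 1 output.
sizes = [13, 10, 10, 10, 1]
activations = [relu, np.tanh, np.tanh, relu]

# Random stand-in weights (illustrative only; Keras chooses its own initial values).
params = [
    (rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]


def forward(x):
    """Pass a batch through every layer in order, as a sequential model does."""
    for (weights, bias), activation in zip(params, activations):
        x = dense(x, weights, bias, activation)
    return x


batch = rng.normal(size=(5, 13))  # 5 made-up houses with 13 features each
print(forward(batch).shape)       # (5, 1)
```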

In [8]:
# Create 3 hidden layers with 10 neurons each, and an output layer with 1 neuron. Store the layers
# in a list with the variable name "layers". Pass the list to keras.Sequential and save the
# returned model to the variable name "model".
#
# Use any activation you like, such as "relu" or "tanh"; you can alternate for each layer.
# For your first run, try using the linear activation and then see if modifying the activations improves the result.
# Optional - give each layer a name and see how that shows up in the summary.
layers = [
    Dense(10, activation="relu", input_shape=(13,)),
    Dense(10, activation="tanh"),
    Dense(10, activation="tanh"),
    Dense(1, activation="relu"),
]
model = Sequential(layers=layers)

[<tensorflow.python.keras.layers.core.Dense object at 0x7f82b7d43550>, <tensorflow.python.keras.layers.core.Dense object at 0x7f82b7d45b70>, <tensorflow.python.keras.layers.core.Dense object at 0x7f82b7d45dd8>, <tensorflow.python.keras.layers.core.Dense object at 0x7f82b7d460b8>]


In [9]:
assert isinstance(model, Sequential)
assert len(layers) == 4
for i,layer in enumerate(layers):
    assert isinstance(layers[i], Dense) 
    assert layer.weights[1].shape == [10,10,10,1][i]

In TensorFlow, models are "compiled" before training. Compiling specifies the type of optimizer (e.g., gradient descent) and loss function, and creates code that will run efficiently on your hardware during training and model prediction.

Review the model's `.compile()` method to write the code in the cell below:
https://www.tensorflow.org/api_docs/python/tf/keras/Model#compile
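
The loss and metrics requested in the next cell are mean absolute error (MAE) and mean squared error (MSE). As a rough sketch of what those quantities measure, here they are computed by hand with NumPy on made-up target values. The function names are illustrative, not the Keras implementations.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Average of |y_true - y_pred|."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


def mean_squared_error(y_true, y_pred):
    """Average of (y_true - y_pred)**2; penalizes large errors more heavily."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


# Made-up values, for illustration only.
y_true = [20.0, 30.0, 25.0]
y_pred = [22.0, 27.0, 25.0]

print(mean_absolute_error(y_true, y_pred))  # (2 + 3 + 0) / 3
print(mean_squared_error(y_true, y_pred))   # (4 + 9 + 0) / 3
```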


In [17]:
# Apply the model's .compile() method to set it up for training.

# Set up the model to do the following:
# - use stochastic gradient descent to fit the model
# - use mean absolute error as its loss function
# - use mean absolute error as one of its metrics
# - use mean squared error as one of its metrics
model.compile(optimizer="SGD",loss="mean_absolute_error",metrics=["mean_absolute_error","mean_squared_error"])

In [11]:
assert isinstance(model.optimizer, keras.optimizers.SGD)
assert model.loss in ["mae", "mean_absolute_error"]

In [18]:
# Now fit the model
model.fit(x_train, y_train, epochs=50)

Epoch 1/50
...
Epoch 50/50


<tensorflow.python.keras.callbacks.History at 0x7f8216518d30>

In [24]:
model.summary()

Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_35 (Dense)             (None, 10)                140       
_________________________________________________________________
dense_36 (Dense)             (None, 10)                110       
_________________________________________________________________
dense_37 (Dense)             (None, 10)                110       
_________________________________________________________________
dense_38 (Dense)             (None, 1)                 11        
Total params: 371
Trainable params: 371
Non-trainable params: 0
_________________________________________________________________
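
The `Param #` column can be reproduced by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. A small sketch (the helper name `dense_param_count` is made up for this illustration):

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


sizes = [13, 10, 10, 10, 1]  # input features, three hidden layers, one output
counts = [dense_param_count(a, b) for a, b in zip(sizes[:-1], sizes[1:])]

print(counts)       # [140, 110, 110, 11]
print(sum(counts))  # 371
```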


In [25]:
# Here we can evaluate how our model does based on the test data
model.evaluate(x_test, y_test)



[6.675572395324707, 6.675572395324707, 88.9620590209961]
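
The list returned by `evaluate` holds the loss first, followed by the metrics in the order given to `compile`; here that is MAE (as the loss), MAE again, and MSE. The sketch below only does arithmetic on the numbers printed above. Comparing the root of the MSE with `y_train.std()` is a loose sanity check that assumes the test targets are distributed like the training targets.

```python
import math

# Values copied from the evaluate() output above: [loss, MAE, MSE].
loss, mae, mse = 6.675572395324707, 6.675572395324707, 88.9620590209961

rmse = math.sqrt(mse)            # back in the same units as the target
target_std = 9.199035423364862   # y_train.std() from earlier in the notebook

print(loss == mae)        # True: the loss is mean absolute error
print(round(rmse, 2))     # 9.43
print(rmse > target_std)  # True
```

An RMSE at or above the spread of the targets suggests this run does little better, in squared-error terms, than always predicting a constant, which leaves room for improvement.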

Now let's try another optimizer instead of stochastic gradient descent (SGD). [Adam](https://keras.io/optimizers/#adam) is a commonly recommended default for training neural networks since it usually performs quite well. In the next cell, compile the model with Adam instead of SGD and use the same loss and metrics. After compiling, fit the model for as many epochs as you think it takes for the loss to start converging.
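
To show how the two optimizers differ, here is a minimal sketch of their update rules on a one-parameter toy loss, f(x) = (x - 3)^2. The toy loss, the learning rate of 0.1, and the other hyperparameter values are choices made for this illustration, not the Keras defaults, and the "SGD" step uses the exact gradient rather than a mini-batch estimate.

```python
import math


def grad(x):
    """Gradient of the toy loss f(x) = (x - 3)**2."""
    return 2.0 * (x - 3.0)


def sgd(x, lr=0.1, steps=500):
    for _ in range(steps):
        x -= lr * grad(x)  # fixed step size times the raw gradient
    return x


def adam(x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g      # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g  # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)         # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)  # step scaled per parameter
    return x


print(sgd(0.0), adam(0.0))  # both should end close to the minimum at x = 3
```

Adam divides each step by a running estimate of the gradient's magnitude, so parameters with very different gradient scales still move at comparable rates.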

In [13]:
# Compile the model using Adam
model.compile(optimizer="adam",loss="mean_absolute_error",metrics=["mean_absolute_error","mean_squared_error"])
model.fit(x_train, y_train, epochs=300)

Epoch 1/100
...
Epoch 77/100
Epoch 78

<tensorflow.python.keras.callbacks.History at 0x7f82874dd470>

In [14]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 10)                140       
_________________________________________________________________
dense_1 (Dense)              (None, 10)                110       
_________________________________________________________________
dense_2 (Dense)              (None, 10)                110       
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 11        
Total params: 371
Trainable params: 371
Non-trainable params: 0
_________________________________________________________________


In [15]:
assert isinstance(model.optimizer, keras.optimizers.Adam)
assert model.loss in ["mae", "mean_absolute_error"]

In [16]:
model.evaluate(x_test, y_test)



[5.303440570831299, 5.303440570831299, 60.850006103515625]

Now recreate the model with dropout layers. Add two dropout layers, one after the first layer of neurons and one after the second layer of neurons.
You may use your existing `layers` list and insert the new layers, or create a new list of layers from scratch. Then construct the model as before.
Select a low dropout rate (say, below 0.1) that results in a good score.
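
As a rough picture of what a dropout layer does, the NumPy sketch below implements "inverted" dropout: during training, each unit is zeroed with probability `rate` and the survivors are scaled by `1 / (1 - rate)`; at inference the input passes through unchanged. The function name and signature are assumptions for this illustration, not the Keras API.

```python
import numpy as np


def dropout(x, rate, training, rng):
    """Inverted dropout: zero a random subset of units and rescale the rest."""
    if not training or rate == 0.0:
        return x                             # inference: pass through unchanged
    keep_mask = rng.random(x.shape) >= rate  # each unit kept with probability 1 - rate
    return x * keep_mask / (1.0 - rate)      # rescale so the expected value is unchanged


rng = np.random.default_rng(0)
activations = np.ones((1000, 10))

dropped = dropout(activations, rate=0.08, training=True, rng=rng)
print((dropped == 0).mean())  # fraction zeroed, close to 0.08
print(dropped.mean())         # close to 1.0 on average
print(np.array_equal(dropout(activations, 0.08, False, rng), activations))  # True
```

Because dropout is only active during training, it adds no parameters and does not alter predictions made when evaluating on the test data.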

In [19]:
layers = [
    Dense(10, activation="relu", input_shape=(13,)),
    Dropout(rate=0.08),
    Dense(10, activation="tanh"),
    Dropout(rate=0.08),
    Dense(10, activation="tanh"),
    Dense(1, activation="relu"),
]
model = Sequential(layers=layers)

In [20]:
assert len(layers) == 6
assert isinstance(layers[1], Dropout)
assert isinstance(layers[3], Dropout)
for i,layer in enumerate(layers):
    if i not in [1,3]:
        assert isinstance(layers[i], keras.layers.Dense) 
        assert layer.weights[1].shape == [10,0,10,0,10,1][i]

In [21]:
model.compile(optimizer='adam', loss='mae', metrics=['mae', "mse"])
model.fit(x_train, y_train, epochs=500, verbose=0)
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 10)                140       
_________________________________________________________________
dropout (Dropout)            (None, 10)                0         
_________________________________________________________________
dense_5 (Dense)              (None, 10)                110       
_________________________________________________________________
dropout_1 (Dropout)          (None, 10)                0         
_________________________________________________________________
dense_6 (Dense)              (None, 10)                110       
_________________________________________________________________
dense_7 (Dense)              (None, 1)                 11        
Total params: 371
Trainable params: 371
Non-trainable params: 0
_________________________________________________________________

In [22]:
model.evaluate(x_test, y_test)



[5.49431037902832, 5.49431037902832, 60.353885650634766]

Select a dropout rate that keeps the test loss (mean absolute error) below 7.

In [23]:
assert model.evaluate(x_test, y_test)[0] < 7



## Feedback

In [24]:
def feedback():
    """Provide feedback on the contents of this exercise
    
    Returns:
        string
    """
    return "Cool assignment!"

In [25]:
feedback()

'Cool assignment!'