
<h1 style="text-align: center;"><a title="Data Science-AIMS-Cmr-2021-22"> Linear Models as Neural Networks</a></h1>

## <font color="green"> Learning outcomes:</font>

* Creating a linear regression model using TensorFlow and Dense layers - concise approach

* Learning about the Sequential model

* How to view the model summary and the trainable weights

## <font color="green">Data information:</font>

* Features: one real-valued variable

* Output: one real-valued variable

## <font color="green">Tasks for participants (boolean)?</font>

* Yes, at the end (try to avoid copy/pasting code; write it out yourself instead)

## Various Python imports

In [None]:
from tensorflow import keras
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

### Let's create some data

In [None]:
X=np.arange(-10, 15, 0.1).reshape(-1, 1)
Y=X+4

View the first five X and Y values

In [None]:
X[0:5]

In [None]:
Y[0:5]

Let's plot our data

In [None]:
plt.plot(X,Y,color='r',label='Data line')
plt.legend(loc='right')
plt.xlabel("X")
plt.ylabel("Y")
plt.title('Our data')
plt.show()

### Task: What simple linear model do you think could be used to solve this problem? Express the model as y = .... ?

## Step 1: Define a sequential model

API documentation for Sequential https://www.tensorflow.org/api_docs/python/tf/keras/Sequential

Definition: Sequential groups a linear stack of layers.

Below we assign it to the variable 'model' but you can use any variable name, just like with normal Python.

In [None]:
model = tf.keras.Sequential()

In [None]:
model
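To make "a linear stack of layers" a little more concrete, here is a minimal sketch in plain Python and NumPy. The class `TinySequential` and the two toy "layers" are hypothetical stand-ins invented for illustration; this is not how Keras implements Sequential, only a picture of the idea that each layer's output becomes the next layer's input.

```python
import numpy as np

class TinySequential:
    """Hypothetical stand-in for a linear stack of layers (not the Keras class)."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        # Layers are stored in the order in which they are added.
        self.layers.append(layer)

    def __call__(self, x):
        # The output of each layer becomes the input of the next one.
        for layer in self.layers:
            x = layer(x)
        return x

stack = TinySequential()
stack.add(lambda x: 2.0 * x)   # first toy "layer": scale the input
stack.add(lambda x: x + 1.0)   # second toy "layer": shift the result

result = stack(np.array([1.0, 2.0, 3.0]))
print(result)
```

The order of the `add` calls matters: swapping the two toy layers would give a different result.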

## Step 2: Add the first layer

Below we create the first (and, in this example, only) layer, which has 1 unit.

To add a layer to our model we use the .add() function. Our network is stored in the variable 'model', so we call 'model.add()'.

In the example below we add a fully connected layer, which in TensorFlow is called a 'Dense' layer. The Dense layer has a number of arguments, just like a normal Python function might have. Take a moment to look at the API to see which arguments it can take.

API documentation for Dense https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense

The Dense layer takes a number of arguments such as 'kernel_initializer', 'bias_initializer', ... . To start off we won't specify values for all of those ourselves; instead, we will use the default values to keep this example simple. The absolute minimum is to specify the number of units.
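As a rough idea of what those defaults do: the Dense documentation lists 'glorot_uniform' for 'kernel_initializer' and 'zeros' for 'bias_initializer'. The NumPy sketch below mimics that behaviour for one input and one unit. The names `rng`, `fan_in`, `fan_out` and `limit` are illustrative choices, and the sketch only approximates the idea; it does not reproduce the exact values TensorFlow will generate.

```python
import numpy as np

rng = np.random.default_rng(0)

fan_in, fan_out = 1, 1  # one input feature, one unit (assumed for this sketch)

# Glorot (Xavier) uniform: sample from [-limit, limit],
# with limit = sqrt(6 / (fan_in + fan_out)).
limit = np.sqrt(6.0 / (fan_in + fan_out))
kernel = rng.uniform(-limit, limit, size=(fan_in, fan_out))

# A 'zeros' initializer simply starts every bias at 0.
bias = np.zeros(fan_out)

print(kernel.shape, bias.shape)
```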

We also need to specify the input shape, but *only* on the very first layer of a TensorFlow model. This makes sense, as the code needs to know how many inputs there are so that it can create the correct number of weights.

We also specify an activation function, in this case the linear function.

In [None]:
model.add(keras.layers.Dense(units=1, input_shape=[1], activation='linear'))
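To see what a layer like this computes, here is a minimal NumPy sketch of a fully connected layer with a linear (identity) activation. The function `dense_linear` and the hand-picked `kernel` and `bias` values are assumptions made for illustration; they are not the values TensorFlow will assign.

```python
import numpy as np

# Built the same way as X in this notebook, under a separate name.
X_demo = np.arange(-10, 15, 0.1).reshape(-1, 1)

kernel = np.array([[0.5]])   # shape (inputs, units) = (1, 1); arbitrary value
bias = np.array([-2.0])      # shape (units,) = (1,); arbitrary value

def dense_linear(x, kernel, bias):
    # Fully connected layer with a linear activation:
    # a matrix product followed by adding the bias.
    return x @ kernel + bias

out = dense_linear(X_demo, kernel, bias)
print(out.shape)
```

Each row of the input produces one row of output, so the output has the same number of rows as `X_demo` and one column per unit.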

## Task: what do you think the model looks like if you had to draw it on paper?

## Step 3: Compile the model

Before training a model, we need to compile it. This allows us to provide the loss function and the optimiser.

In [None]:
model.compile(loss = "mean_squared_error", optimizer="sgd", metrics=['mse'])
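To get a feel for what the loss and the optimiser do, here is a small NumPy sketch of the mean squared error and a single gradient descent update for a model of the form `w * x + b`. The helper names `mse` and `sgd_step`, the toy data and the learning rate of 0.01 are assumptions made for this sketch; Keras handles all of this internally.

```python
import numpy as np

def mse(y_true, y_pred):
    # Average of the squared differences between targets and predictions.
    return np.mean((y_true - y_pred) ** 2)

def sgd_step(w, b, x, y, learning_rate=0.01):
    # Gradients of the MSE for the model y_pred = w * x + b.
    error = (w * x + b) - y
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    # Move each parameter a small step against its gradient.
    return w - learning_rate * grad_w, b - learning_rate * grad_b

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])   # toy targets, unrelated to the notebook data

w, b = 0.0, 0.0
loss_before = mse(y, w * x + b)
w, b = sgd_step(w, b, x, y)
loss_after = mse(y, w * x + b)
print(loss_before, loss_after)
```

After one update the loss is slightly lower than before, which is the behaviour we hope to see repeated many times during training.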

## Take a look at the model before training

The output contains three columns. Take a moment to look at them.

* Does the output shape make sense for our data?

* Why are there only two parameters (column named: Param #)?

* What does trainable/non-trainable mean?

* Why is there "None" in the output shape?

In [None]:
model.summary()

## View the weights of the model

Now that we have defined and compiled our model, let's take a moment to see what the randomly initialised weights are. The list below contains 2 arrays, each holding a single value. Does this make sense? What could each one represent?

In [None]:
model.get_weights()

## Given the values above, what does the model now represent if you had to write it out?

## y = ... ?

## Let's see what the model will predict on some arbitrary value (e.g. 34). Note that we haven't trained the model yet so it would be reasonable to expect an incorrect output.

## What should the correct output actually be for our example input of 34?

In [None]:
# predict expects a batch of inputs, so wrap the value in an array of shape (1, 1)
model.predict(np.array([[34.0]]))

## Step 4: Train the network

To train the model we use the .fit() function.

The API documentation for .fit() https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit is found under the Model object. This makes sense, as we have essentially created a Model object, and 'fit' is one of its methods. This function takes many parameters; let's ignore most of them for now. The minimum is to provide the features (X), the targets (Y) and the number of epochs.

Let's keep it simple and train for 15 epochs.

Take a look at the output. Information about each epoch is provided.

Note that we store the result of this function call into a variable called 'history'. This will allow us to extract information about the loss and the 'metric' over each epoch - essentially allowing us to make plots.



In [None]:
history = model.fit(X, Y, epochs=15)
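To demystify what happens during those epochs, here is a rough NumPy sketch of mini-batch gradient descent on the same kind of data. It assumes a batch size of 32, a learning rate of 0.01, reshuffling once per epoch and hand-picked starting values of zero; these are assumptions chosen for illustration, and the numbers it prints will not match the Keras output above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Built the same way as X and Y in this notebook, under separate names.
X_demo = np.arange(-10, 15, 0.1).reshape(-1, 1)
Y_demo = X_demo + 4

w, b = 0.0, 0.0          # hand-picked starting values (assumption)
learning_rate = 0.01     # assumption
batch_size = 32          # assumption
epochs = 15

def full_mse(w, b):
    # Mean squared error over the whole data set.
    return float(np.mean((w * X_demo + b - Y_demo) ** 2))

losses = [full_mse(w, b)]  # loss before any training
for epoch in range(epochs):
    order = rng.permutation(len(X_demo))  # reshuffle once per epoch
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        xb, yb = X_demo[idx], Y_demo[idx]
        error = w * xb + b - yb
        # One gradient descent update per mini-batch.
        w -= learning_rate * 2.0 * np.mean(error * xb)
        b -= learning_rate * 2.0 * np.mean(error)
    losses.append(full_mse(w, b))
    print(f"epoch {epoch + 1}: mse = {losses[-1]:.4f}")
```

One epoch is a full pass over the data, so the parameters are updated several times per epoch, once for each mini-batch.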

## Plot the training history

To plot the history of the training, we make use of the 'history' variable defined above. Note that you can name it anything you want, just like any Python variable. First let's see what is inside the history variable. It contains a property called '.history'. Let's view it.

In [None]:
history.history

In [None]:
plt.figure(figsize=(8, 8))
plt.plot(history.history['mse'])
plt.title('Model loss')
plt.ylabel('Mean Squared Error')
plt.xlabel('Epoch')
plt.show()

## Predict on some value

We can use the .predict() function to make predictions. API: https://www.tensorflow.org/api_docs/python/tf/keras/Model#predict

In [None]:
# predict expects a batch of inputs, so wrap the value in an array of shape (1, 1)
model.predict(np.array([[34.0]]))

## View the weights of the model (after training). Compare these to the weights you got before training.

In [None]:
model.get_weights()

## Extract the first weight

In [None]:
model.layers[0].get_weights()[0][0][0]

## Extract the bias

In [None]:
model.layers[0].get_weights()[1][0]

## Insert the values into the equation w_1 X + b, where X = 34. Does this give the same result as the predict function above? Based on the weight and bias you obtained, do these values seem reasonable for our data?

In [None]:
34*model.layers[0].get_weights()[0][0][0] + model.layers[0].get_weights()[1][0]

## Plot the results

In [None]:
plt.plot(X,Y,color='r',label='Correct output')
plt.plot(X,model.predict(X),color='b', linestyle='dashed',label='Network output')
plt.legend(loc='right')
plt.xlabel("X")
plt.ylabel("Y")
plt.title('Simple linear neural network')
plt.show()

## Exercise: (Remember the assignment?)

* Use the advertising dataset available [here](https://raw.githubusercontent.com/rock-feller/Datasets_for_Education/main/data_01/Advertising.csv) 
* Divide it into a train set and a test set (20% for the test set)
* Train a **Shallow Neural Network** to predict **the Sales** based on **Advertising** on **TV, Radio and Newspaper**
* Train a **Deep Neural Network** to predict **the Sales** based on **Advertising** on **TV, Radio and Newspaper**
* Try tuning some hyperparameters to get your best model, and submit your mean squared error here: https://docs.google.com/forms/d/e/1FAIpQLSdKxBAULDyDz94iAWJeJxAIswpiyT4WamPrAkioyl-GLKCrXw/viewform?usp=sf_link

In [None]:
### WRITE YOUR CODE HERE (USE THE ONE ABOVE AND IMPROVE IT)




# References:

* This notebook was adapted from Dr. Emmanuel Dufourq, 2021 Gene Golub SIAM Summer School