<a href="https://colab.research.google.com/github/berthine/SIAM-Summer-School/blob/main/SIAM2021_Linear_Reg_2_(long).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Practical: Introduction (2) to linear regression using TensorFlow (long approach)
(19/July/2021)

### 2021 Gene Golub SIAM Summer School 
https://sites.google.com/aims.ac.za/g2s3/home 

Instructor

<font color="green">***Dr. Emmanuel Dufourq*** 

www.emmanueldufourq.com

edufourq (['@']) gmail.com

***African Institute for Mathematical Sciences***

***Stellenbosch University***

***2021***


Material adapted from:

https://d2l.ai/chapter_linear-networks/linear-regression-scratch.html

https://www.tensorflow.org/guide/basic_training_loops

## <font color="green"> Learning outcomes:

* Implement your own model using two features, and check your understanding of the previous notebook.


## <font color="green">Data information:

* Features: two real valued features

* Output: one real valued label

## <font color="green">Tasks for participants (boolean)?

* Yes, at the end (try to avoid copy/pasting code; write it out yourself instead)


In this notebook we'll create a dataset that has two features and then fit a linear model to it.
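
With two features, a linear model predicts $\hat{y} = w_1 x_1 + w_2 x_2 + b$. The sketch below works through that arithmetic in plain Python (no TensorFlow) for a single example. The helper name `predict` and the input `x` are made up for illustration; the weights and bias reuse the true values from the data-generation cell further down.

```python
# Hypothetical helper: a two-feature linear model written out by hand.
def predict(x, w, b):
    """Return w[0]*x[0] + w[1]*x[1] + b for a single example."""
    return w[0] * x[0] + w[1] * x[1] + b

w = [4.0, -3.4]   # same values as true_w in the data-generation cell
b = 4.2           # same value as true_b
x = [1.0, 2.0]    # made-up example with two features

y_hat = predict(x, w, b)
print(y_hat)
```

The model you write later does the same computation for a whole batch at once using a matrix product.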

In [None]:
import tensorflow as tf
import random
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

## First we generate some data

In [None]:
def synthetic_data(w, b, num_examples):
    """Generate y = Xw + b + noise."""
    X = tf.zeros((num_examples, w.shape[0]))
    X += tf.random.normal(shape=X.shape)
    y = tf.matmul(X, tf.reshape(w, (-1, 1))) + b
    y += tf.random.normal(shape=y.shape, stddev=0.01)
    y = tf.reshape(y, (-1, 1))
    return X, y

# True values
true_w = tf.constant([4, -3.4])
true_b = 4.2

features, labels = synthetic_data(true_w, true_b, 1000)

In [None]:
features.shape

In [None]:
labels.shape

## Plot the data

Plotting the first feature

In [None]:
plt.scatter(features[:,0].numpy(), labels.numpy(), c="b")
plt.show()

Plotting the second feature

In [None]:
plt.scatter(features[:,1].numpy(), labels.numpy(), c="b")
plt.show()

Recall that training models consists of making multiple passes over the dataset, grabbing one minibatch of examples at a time, and using them to update our model. 

Each minibatch consists of a tuple of features and labels.

The function below creates an iterator that generates mini-batches of X-y pairs.
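
To see the shuffling and slicing on their own, here is a small plain-Python sketch that yields only the shuffled indices of each mini-batch. The function name `batch_indices` and the `seed` argument are assumptions made for this illustration and are not used elsewhere in the notebook.

```python
import random

def batch_indices(num_examples, batch_size, seed=None):
    """Yield lists of shuffled example indices, one list per mini-batch."""
    rng = random.Random(seed)
    indices = list(range(num_examples))
    rng.shuffle(indices)
    for start in range(0, num_examples, batch_size):
        # The last slice is shorter when batch_size does not divide num_examples
        yield indices[start:start + batch_size]

batches = list(batch_indices(num_examples=10, batch_size=4, seed=0))
for batch in batches:
    print(batch)
```

Every index appears exactly once per pass, so one full loop over the iterator corresponds to one epoch.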



In [None]:
def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    
    # The examples are read at random, in no particular order
    random.shuffle(indices)

    for i in range(0, num_examples, batch_size):
        j = tf.constant(indices[i:min(i + batch_size, num_examples)])

        # Return a tuple of features and labels
        yield tf.gather(features, j), tf.gather(labels, j)

Let's take a look at a mini-batch.

In [None]:
batch_size = 10

for X, y in data_iter(batch_size, features, labels):
    print('mini batch X')
    print(X)
    print('\nmini batch Y')
    print(y)
    break

## To do: Define the model 

In [None]:
class MySecondModel(tf.Module):
  def __init__(self, **kwargs):
    super().__init__(**kwargs)

    # Let's re-define our variables here
    self.w = # to do
    self.b = # to do

  def __call__(self, x):
    return # to do

In [None]:
linear_reg = # to do

## To do: display the variables

In [None]:
# to do

Plot the predictions before the optimisation process

Feature 0

In [None]:
plt.scatter(X[:,0], y, c="b")
plt.scatter(X[:,0], linear_reg(X).numpy(), c="r")
plt.show()

Feature 1

In [None]:
plt.scatter(X[:,1], y, c="b")
plt.scatter(X[:,1], linear_reg(X).numpy(), c="r")
plt.show()

## Define the loss function

reduce_mean : https://www.tensorflow.org/api_docs/python/tf/math/reduce_mean?hl=en

square : https://www.tensorflow.org/api_docs/python/tf/math/square?hl=en
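
The loss used here is the mean squared error, $\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$. As a minimal sketch, the same quantity can be computed by hand in plain Python; the function name `mse` and the numbers are made up for illustration.

```python
def mse(targets, predictions):
    """Mean of the squared differences between targets and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return sum(squared_errors) / len(squared_errors)

targets = [1.0, 2.0, 3.0]       # made-up labels
predictions = [1.5, 2.0, 2.0]   # made-up model outputs

batch_loss = mse(targets, predictions)
print(batch_loss)
```

The squared errors here are 0.25, 0 and 1, so the result is their average.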

In [None]:
# This computes a single loss value for an entire batch
def loss(target_y, predicted_y):
  return tf.reduce_mean(tf.math.square(target_y - predicted_y))

## To do: Training

In [None]:
# Define a learning rate
lr = # to do

# Define number of epochs
num_epochs = # to do

# We will keep track of the weights so we can plot them over the epochs
Ws_0, bs, Ws_1= [], [], []

# Iterate for a number of epochs (1)
for epoch in range(num_epochs):

    # In each epoch generate batches of training data (2)
    for X, y in data_iter(batch_size, features, labels):

        # Trainable variables are automatically tracked by GradientTape (3)
        with tf.GradientTape() as g:
            l = # to do

        # Compute the gradient of l with respect to [`w`, `b`], which live
        # inside the model (self.w and self.b) (4)
        dw, db = # to do

        # Subtract the gradient scaled by the learning rate (5)
        linear_reg.w.assign_sub(# to do)
        linear_reg.b.assign_sub(# to do)

    # Keep track of the weights so we can make a nice plot
    Ws_0.append(linear_reg.w.numpy()[0])
    Ws_1.append(linear_reg.w.numpy()[1])
    bs.append(linear_reg.b.numpy())

    # Compute the training loss on the last mini-batch of this epoch
    train_l = loss(y, linear_reg(X))

    # Print to the screen
    print(f'epoch {epoch + 1}, loss {float(tf.reduce_mean(train_l)):f}, w: {linear_reg.w.numpy()}, b: {linear_reg.b.numpy()}')

## To do:

Spend some time familiarising yourself with steps (1) to (5) above.
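
If it helps, the same five steps can be traced without TensorFlow. The sketch below is a plain-Python illustration, not the solution to the exercise: the gradients of the mean squared error are derived by hand, $\partial L / \partial w_j = \frac{2}{n}\sum_i (\hat{y}_i - y_i)\,x_{ij}$ and $\partial L / \partial b = \frac{2}{n}\sum_i (\hat{y}_i - y_i)$, instead of coming from `tf.GradientTape`. The function name `sgd_fit`, the hyperparameter values and the noiseless toy data are all assumptions chosen for this example.

```python
import random

def sgd_fit(features, labels, lr=0.05, num_epochs=20, batch_size=10, seed=0):
    """Mini-batch SGD for y = w[0]*x[0] + w[1]*x[1] + b with a mean squared error loss."""
    rng = random.Random(seed)
    w, b = [0.0, 0.0], 0.0
    indices = list(range(len(features)))
    for epoch in range(num_epochs):                       # (1) loop over epochs
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):  # (2) loop over mini-batches
            batch = indices[start:start + batch_size]
            n = len(batch)
            # (3) forward pass: prediction error for each example in the batch
            errors = [w[0] * features[i][0] + w[1] * features[i][1] + b - labels[i]
                      for i in batch]
            # (4) gradients of the mean squared error, derived by hand
            dw = [2.0 / n * sum(e * features[i][j] for e, i in zip(errors, batch))
                  for j in range(2)]
            db = 2.0 / n * sum(errors)
            # (5) step against the gradient, scaled by the learning rate
            w = [w[j] - lr * dw[j] for j in range(2)]
            b = b - lr * db
    return w, b

# Made-up noiseless data generated from the notebook's true parameters
data_rng = random.Random(1)
xs = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
ys = [4.0 * x[0] - 3.4 * x[1] + 4.2 for x in xs]

w_hat, b_hat = sgd_fit(xs, ys)
print(w_hat, b_hat)
```

In the TensorFlow version, step (4) is the part that `GradientTape` handles for you, so you never need to write the derivatives out.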

## Evaluate

Plot the change in weights compared to the true ones

First feature

In [None]:
plt.plot(range(num_epochs), Ws_0, "r")

plt.plot([true_w.numpy()[0]]  * len(range(num_epochs)), "b--")

plt.legend(["W", "True W"])
plt.show()

Second feature and bias

In [None]:
plt.plot(range(num_epochs), Ws_1, "r",
         range(num_epochs), bs, "b")

plt.plot([true_w.numpy()[1]] * len(range(num_epochs)), "r--",
         [true_b] * len(range(num_epochs)), "b--")

plt.legend(["W", "b", "True W", "True b"])
plt.show()