# Linear and Non-Linear Regression

## Prerequisites:

We recommend running this notebook in the cloud on Google Colab, or on any other GPU-accelerated TensorFlow installation, if you are not already doing so. It's the simplest way to get started. You can also [install TensorFlow locally](https://www.tensorflow.org/install/), but, again, simple is best (with caveats):

[tf.keras](https://www.tensorflow.org/guide/keras) is the simplest way to build and train neural network models in TensorFlow, so that's what we'll stick with in this tutorial unless a model necessitates a lower-level API.

Note that there's [tf.keras](https://www.tensorflow.org/guide/keras) (comes with TensorFlow) and there's [Keras](https://keras.io/) (standalone). You should be using [tf.keras](https://www.tensorflow.org/guide/keras) because (1) it comes with TensorFlow so you don't need to install anything extra and (2) it comes with powerful TensorFlow-specific features.

In [1]:
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Input
from tensorflow.keras.models import Model

# Commonly used modules
import numpy as np
import os
import sys

# Images, plots, display, and visualization
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import cv2
import IPython
from six.moves import urllib
%load_ext tensorboard
print(tf.__version__)

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)


2.3.1
1 Physical GPUs, 1 Logical GPUs


## Linear Regression using Salary Data

### The Dataset

In [None]:
import csv

from numpy import genfromtxt
my_data = genfromtxt('datasets/Salary_Data.csv', delimiter=',',skip_header=1)
X = my_data[:,0]
Y = my_data[:,1]
#X = X/(np.max(X)) 
#Y = Y/(np.max(Y))

In [None]:
plt.plot(X,Y,'.')
plt.xlabel("Working Years")
plt.ylabel("Salary")

### Regression using Sklearn (Python package)

In [None]:
from sklearn.linear_model import LinearRegression

model = LinearRegression()
res = model.fit(X.reshape((len(X),1)), Y)
predictions = model.predict(X.reshape((len(X),1)))
plt.plot(X, predictions)
plt.plot(X,Y,'.')
plt.show()
print("slope = ", res.coef_[0],"intercept = ",res.intercept_)
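
`LinearRegression` fits an ordinary least-squares line. For a single feature, that line also has a simple closed form, which the following minimal sketch computes by hand. It assumes synthetic stand-in data (`x_demo` and `y_demo` are made up for illustration and are not the salary dataset).

```python
import numpy as np

# Hypothetical stand-in data (NOT the salary dataset): y = 2x + 1 plus small noise
rng = np.random.default_rng(0)
x_demo = np.linspace(0.0, 10.0, 50)
y_demo = 2.0 * x_demo + 1.0 + rng.normal(0.0, 0.1, size=x_demo.shape)

# Closed-form least squares for one feature:
#   slope     = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
#   intercept = mean(y) - slope * mean(x)
x_centered = x_demo - x_demo.mean()
y_centered = y_demo - y_demo.mean()
slope = np.sum(x_centered * y_centered) / np.sum(x_centered ** 2)
intercept = y_demo.mean() - slope * x_demo.mean()

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

On the same data, the slope and intercept computed this way should agree with `res.coef_[0]` and `res.intercept_` up to floating-point error.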

### Regression using a simple Neural Network

In [None]:
def lr_scheduler(epoch):
    if epoch < 15:
        return 0.001
    else:
        return 0.001 * tf.math.exp(0.1 * (15 - epoch))

lr_scheduler_cbk = tf.keras.callbacks.LearningRateScheduler(lr_scheduler,verbose=0)
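
To get a feeling for what this schedule does, it helps to print the learning rate for a few epochs. The sketch below is a plain-Python mirror of `lr_scheduler` that uses `math.exp` instead of `tf.math.exp`; the function name `lr_schedule_plain` and its keyword arguments are introduced here only for illustration.

```python
import math

def lr_schedule_plain(epoch, base_lr=0.001, hold_epochs=15, decay=0.1):
    """Constant rate for the first `hold_epochs` epochs, exponential decay afterwards."""
    if epoch < hold_epochs:
        return base_lr
    return base_lr * math.exp(decay * (hold_epochs - epoch))

# Print the learning rate at a few selected epochs
for epoch in (0, 14, 15, 25, 50):
    print(epoch, f"{lr_schedule_plain(epoch):.6f}")
```

The rate stays at `base_lr` up to and including epoch 15 (where the exponent is zero) and then shrinks by a factor of `exp(-0.1)` per epoch.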

In [None]:
from tensorflow.keras import optimizers
keras.backend.clear_session()

# Hyperparameters
EPOCHS = 500
LEARNING_RATE = 0.01

# Create the Model once!
model = keras.Sequential()
model.add(Dense(1,input_dim=1,activation='linear'))
model.summary()

opt = optimizers.SGD(learning_rate=LEARNING_RATE)
model.compile(optimizer=opt, 
                  loss='mse')


# Optional: set the initial weights by hand instead of using the random initialization
# a=np.array(5000,dtype="float32",ndmin=2)
# b=np.array(2000,dtype="float32",ndmin=1)
# model.set_weights([a,b])

# Train the model on the full dataset as a single batch per epoch

history = model.fit(X, Y, epochs=EPOCHS,batch_size=X.shape[0],verbose=False,
                   # callbacks=[lr_scheduler_cbk]
                   )

hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
plt.plot(history.epoch,history.history["loss"])
plt.xlabel("Epochs")
plt.ylabel("Loss")

In [None]:
a,b = model.get_weights()
print(f"sk_slope={res.coef_[0]:.3f}, sk_intercept={res.intercept_:.3f}")
print(f"tf_slope={a[0][0]:.3f}, tf_intercept={b[0]:.3f}")

In [None]:
pred = model.predict(X)

In [None]:
plt.plot(X,Y,'.')
plt.plot(X,pred)
plt.show()

#### Optimize the Neural Network
<img src="https://raw.githubusercontent.com/tensorchiefs/dl_book/master/imgs/paper-pen.png" width="40" align="left" />  

**Exercise**: 

    The regression line learned by the neural network does not seem to fit the data points. 
      * What are the reasons?
      * How can we optimize it?  
      * What are good Hyperparameters?
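
One thing worth checking while working on this exercise is feature scaling. The sketch below is plain NumPy, not the Keras model above, and it uses synthetic stand-in data (the slope, intercept, and noise level are made up and only roughly salary-like). It runs full-batch gradient descent with the same learning rate and number of steps as the cell above, once on the raw values and once on standardized values.

```python
import numpy as np

def gradient_descent(x, y, lr=0.01, steps=500):
    """Full-batch gradient descent on the MSE of y ~ w*x + b, starting from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y
        grad_w = 2.0 * np.mean(err * x)
        grad_b = 2.0 * np.mean(err)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

# Synthetic stand-in data (assumption, NOT the salary dataset)
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 30)
y = 9500.0 * x + 26000.0 + rng.normal(0.0, 5000.0, size=x.shape)

# Reference: closed-form least-squares fit
w_ls, b_ls = np.polyfit(x, y, 1)

# 1) Gradient descent on the raw values
w_raw, b_raw = gradient_descent(x, y)

# 2) Gradient descent on standardized values, mapped back to the original units
x_s = (x - x.mean()) / x.std()
y_s = (y - y.mean()) / y.std()
w_s, b_s = gradient_descent(x_s, y_s)
w_std = w_s * y.std() / x.std()
b_std = y.mean() + b_s * y.std() - w_std * x.mean()

print(f"least squares:   slope={w_ls:.1f}, intercept={b_ls:.1f}")
print(f"GD raw:          slope={w_raw:.1f}, intercept={b_raw:.1f}")
print(f"GD standardized: slope={w_std:.1f}, intercept={b_std:.1f}")
```

Under these assumptions, the raw run should still be noticeably off in the intercept after 500 steps, while the standardized run should end up close to the least-squares line. Whether the same holds for the salary data and the Keras model is for you to verify.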


## Sine Wave Prediction with Feed-Forward Neural Networks

Let's start by using a fully connected neural network to predict the shape of a set of data points, a task known as regression. The following image highlights the difference between regression and classification (see part 2). Given an observation as input, **regression** outputs a continuous value (e.g., the exact temperature), whereas **classification** outputs the class/category that the observation belongs to.

<img src="https://i.stack.imgur.com/u3TNL.png" alt="classification_regression" width="400"/>

Now we create the dataset. The function `createDataset_Sinusoid` returns six NumPy arrays:

* The `X_train` and `y_train` arrays are the *training set*, the data the model uses to learn.
* The model is tested against the *validation set*, the `X_val` and `y_val` arrays. It covers a slightly wider x range than the training set.
* Because we created the points from a ground-truth sinusoid, we also get the two arrays `X_gt` and `y_gt`.

In [None]:
# Example sinusoid dataset
def createDataset_Sinusoid(xmin=-10., xmax=10, noise_std=.2):
    num_data = 1000
    # Create the noisy training data
    X_train = np.atleast_2d(np.linspace(xmin, xmax, num_data, dtype=np.float32)).T
    y_train = np.sin(X_train) + np.atleast_2d(np.random.normal(0, noise_std, size=num_data).astype(np.float32)).T

    # Create the ground truth
    X_gt = X_train
    y_gt = np.sin(X_gt)

    # Create the validation data
    X_val = np.atleast_2d(np.linspace(xmin-2, xmax+2, num_data, dtype=np.float32)).T
    y_val = np.sin(X_val) + np.atleast_2d(np.random.normal(0, noise_std, size=num_data).astype(np.float32)).T
    return [X_gt, y_gt,X_train, y_train, X_val, y_val]


X_gt,y_gt,X_train,y_train,X_val,y_val = createDataset_Sinusoid()
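
The function above wraps each array in `np.atleast_2d(...).T`. The short sketch below shows what that does to the shape: a flat array of N values becomes a column of shape `(N, 1)`, i.e. N samples with one feature each, which is the layout a Keras `Dense` layer expects. The names `x_flat` and `x_col` are used here only for illustration.

```python
import numpy as np

# A flat array of 5 values, shape (5,)
x_flat = np.linspace(-10.0, 10.0, 5, dtype=np.float32)

# atleast_2d turns it into a row of shape (1, 5); .T transposes it into a column (5, 1)
x_col = np.atleast_2d(x_flat).T

print(x_flat.shape, x_col.shape)

# reshape(-1, 1) is an equivalent, more common way to write the same thing
print(np.array_equal(x_col, x_flat.reshape(-1, 1)))
```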

In [None]:
#Visualize the Data
fig = plt.figure(figsize=(20, 8))
ax = fig.add_subplot(111)
ax.set_title('Visualization')
ax.plot(X_train,y_train,'.',label='Training Data')
ax.plot(X_gt,y_gt,label="Ground Truth")
ax.legend()

# List for individual predictions
preds_list = []

### Build the model

Building the neural network requires configuring the layers of the model and then compiling the model. First we stack a few layers together using `keras.Sequential`. The number of layers depends strongly on the task you want to perform.

#### The Number of Hidden Layers

| Num Hidden Layers       | Result        |  
| ----------------------- |:------------- | 
| None               | Only capable of representing linearly separable functions or decisions.      |
| 1          | Can approximate any function that contains a continuous mapping from one finite space to another.      |
| 2          | Can represent an arbitrary decision boundary to arbitrary accuracy <br> with rational activation functions and can approximate any smooth mapping to any accuracy.     |
| >2          | Additional layers can learn complex representations (a sort of automatic feature engineering) for later layers.      |
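
A related point behind the table is that extra layers only add expressive power when they have a non-linear activation. The NumPy sketch below (random weights, purely illustrative, not the model built in this notebook) shows that two stacked layers without an activation collapse into one linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 1))                           # 4 samples, 1 feature

# Two "Dense" layers with random weights and no activation function
W1, b1 = rng.normal(size=(1, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 1)), rng.normal(size=1)
two_linear_layers = (x @ W1 + b1) @ W2 + b2

# The same mapping written as a single linear layer
W, b = W1 @ W2, b1 @ W2 + b2
one_linear_layer = x @ W + b

print(np.allclose(two_linear_layers, one_linear_layer))  # True

# With a ReLU between the layers, the mapping is in general no longer linear
with_relu = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2
print(np.allclose(with_relu, one_linear_layer))
```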

#### The Number of Neurons in the Hidden Layers

There are many rule-of-thumb methods for determining an acceptable number of neurons to use in the hidden layers, such as the following:

* The number of hidden neurons should be between the size of the input layer and the size of the output layer.
* The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
* The number of hidden neurons should be less than twice the size of the input layer.

These three rules provide a starting point for choosing the number of neurons. There is also research on this topic: https://www.hindawi.com/journals/mpe/2013/425740/
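
The three rules of thumb listed above are easy to evaluate for a given problem. The helper below is a hypothetical illustration (`hidden_size_rules` is not part of any library), and the second call uses made-up layer sizes.

```python
def hidden_size_rules(n_inputs, n_outputs):
    """Evaluate the three rules of thumb for the number of hidden neurons."""
    low, high = sorted((n_inputs, n_outputs))
    return {
        # Rule 1: between the size of the input layer and the size of the output layer
        "between_input_and_output": (low, high),
        # Rule 2: 2/3 the size of the input layer, plus the size of the output layer
        "two_thirds_input_plus_output": 2 * n_inputs / 3 + n_outputs,
        # Rule 3: less than twice the size of the input layer
        "upper_bound_exclusive": 2 * n_inputs,
    }

print(hidden_size_rules(1, 1))     # this notebook: one input feature, one output
print(hidden_size_rules(784, 10))  # hypothetical larger problem
```

For the one-dimensional input and output used in this notebook, the rules suggest only one or two hidden neurons, which is a reminder that they are starting points rather than hard limits.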


Next we configure the loss function, optimizer, and metrics to monitor. These are added during the model's compile step:

* *Loss function* - measures how far the model's predictions are from the targets during training; the optimizer tries to minimize it.
* *Optimizer* - how the model is updated based on the data it sees and its loss function.
* *Metrics* - used to monitor the training and testing steps.
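
To make the loss concrete, the sketch below computes the mean squared error by hand in NumPy, together with the mean absolute error as an example of a metric that suits regression. The arrays are made-up toy values, and the helper names `mse` and `mae` are chosen here for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared differences."""
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error: the average of the absolute differences."""
    return float(np.mean(np.abs(y_true - y_pred)))

# Toy values: two perfect predictions and one that is off by 2
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])

print(f"MSE = {mse(y_true, y_pred):.4f}")  # (0 + 0 + 4) / 3
print(f"MAE = {mae(y_true, y_pred):.4f}")  # (0 + 0 + 2) / 3
```

Because MSE squares each error, a single large miss weighs much more heavily than several small ones.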

<img src="https://raw.githubusercontent.com/tensorchiefs/dl_book/master/imgs/paper-pen.png" width="60" align="left" />  

**Exercise:**
Let's build a network with multiple hidden layers and use mean squared error (MSE) as the loss function (the most common one for regression problems). Your network's structure *could* look similar to this:

<img src="Pictures/Reg_Model.PNG" align="middle" />

The last layer of the Network should also be a dense layer and the number of neurons depends on the task you want to do. In this example we want to predict y coordinates based on x coordinates, so the output should only have 1 neuron.

**Hint**: If you run into trouble, what is missing? See the error output.


<details>
<summary><b>Click here for one possible solution</b></summary>
    
```python
def build_model():
    keras.backend.clear_session()
    
    ############ YOUR CODE HERE ############
    model = keras.Sequential()
    # The first layer is an Input layer with the shape of one input sample
    model.add(Input(shape=(X_train.shape[-1],), name='input'))
    # Add as many Dense layers as you want
    model.add(Dense(256, activation='relu'))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(128, activation='relu'))
    # Output layer with 1 neuron
    model.add(Dense(y_train.shape[1]))
    return model

# If you run into an error, the compile step is missing
model = build_model()
model.compile(loss="mse", optimizer="adam", metrics=["mae"])
```
</details>

In [2]:
def build_model():
    keras.backend.clear_session()
    
    ############ YOUR CODE HERE ############
    model = keras.Sequential()
    
    # Output Layer with 1 Neuron as Output
    model.add(Dense(y_train.shape[1]))
    return model

In [None]:
# Build the model and show the layers
model = build_model()
# Pick one optimizer (opti must be defined before the model is compiled)
opti = tf.keras.optimizers.Adam()
#opti = tf.keras.optimizers.SGD()
#opti = tf.keras.optimizers.Adamax()

model.compile(optimizer=opti,loss="mse")
model.summary()

### Train the model

Training the neural network model requires the following steps:

1. Feed the training data to the model, in this example the `X_train` and `y_train` arrays.
2. The model learns to associate inputs and targets.
3. We ask the model to make predictions on held-out data, in this example the `X_val` array, and check how well the predictions match the targets in the `y_val` array.

To start training, call the `model.fit` method, which "fits" the model to the training data:

## Training and logging the network

There are a few very important metrics to watch while training the network. As described in the earlier notebooks, the loss is the most important one. We want to monitor it during training to make sure the network is actually learning and the loss is not, for example, stagnating or diverging.

This is handled by the `model.fit` function. We pass it the training and validation data, and when `verbose` is enabled it prints the important values, such as the training loss and the validation loss, while training.

On the other hand, we also want to look at this data after the model has been trained. We can either inspect the raw values returned by `model.fit` ourselves or use TensorBoard. To use TensorBoard, we specify a log directory and pass the TensorBoard callback to `model.fit`.
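
The object returned by `model.fit` has a `history` attribute: a dictionary that maps each recorded quantity to a list with one value per epoch, with validation quantities prefixed by `val_`. The sketch below shows how such a dictionary can be inspected with pandas. The numbers in `fake_history` are hypothetical placeholders, not results from this notebook.

```python
import pandas as pd

# Hypothetical values, shaped like `history.history` after
# model.fit(..., validation_data=...) has run for 4 epochs
fake_history = {
    "loss":     [0.50, 0.30, 0.20, 0.15],
    "val_loss": [0.55, 0.35, 0.30, 0.32],
}

hist_demo = pd.DataFrame(fake_history)
hist_demo["epoch"] = list(range(len(hist_demo)))

# Epoch with the lowest validation loss
best_epoch = int(hist_demo["val_loss"].idxmin())

print(hist_demo)
print("lowest validation loss at epoch", best_epoch)
```

A training loss that keeps falling while the validation loss starts to rise again, as in these placeholder numbers, is the typical sign of overfitting.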

In [None]:
#Callbacks for easier Training
import datetime
logdir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))

EPOCHS = 200

# TensorBoard lets us inspect the training run afterwards
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)

# Train the model with the training and validation data
history = model.fit(X_train, y_train, epochs=EPOCHS, verbose=0,
                    validation_data=(X_val, y_val),
                    callbacks=[tensorboard_callback])


# Collect the training history in a DataFrame so it can be plotted manually after training
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch


## Start TensorBoard for evaluation
TensorBoard is a tool that ships with TensorFlow and gives you insight into your model. In TensorBoard you can investigate:
 * Graphs
 * Layers
 * Training and validation
 * etc

In [None]:
#%tensorboard --logdir logs --port 6006 --bind_all

Now, let's plot the loss measured on the training and validation sets. The validation set is used to detect overfitting ([learn more about it here](https://www.tensorflow.org/tutorials/keras/overfit_and_underfit)). However, because our network is small, training converges without noticeably overfitting the data, as the plot shows.

In [None]:
plt.figure()
plt.xlabel('Epoch')
plt.ylabel('Mean Square Error')
plt.plot(hist['epoch'], hist['loss'], label='Train Error')
############ YOUR CODE HERE ############
# To plot more curves (e.g. 'val_loss'), they must first be recorded by model.fit
plt.legend()
plt.show()

We trained the model to predict the y value from the x value. Now we can test it by predicting on held-out data (here, `X_val`):

In [None]:
preds_list.append([model.predict(X_val),f"{opti.__class__.__name__.lower()}"])

Next, compare how the model performs on the test dataset:

In [None]:
#Visualize the Data
fig = plt.figure(figsize=(20, 8))
ax = fig.add_subplot(111)
ax.set_title(f'Visualization after {EPOCHS} Epochs Training')

# First we plot the Data we predicted on
ax.plot(X_val,y_val,'.',label='Test Data')
# The Ground Truth
ax.plot(X_gt,y_gt,label="Ground Truth")
# We predicted the y Value based on the X_val value

for p,l in preds_list:
    ax.plot(X_val,p,label=f"prediction_{l}")
ax.legend()
plt.show()

### Improve the model
<img src="https://raw.githubusercontent.com/tensorchiefs/dl_book/master/imgs/paper-pen.png" width="60" align="left" />  

**Exercise:** Now you have some time to improve the prediction. If you have any questions, just ask.
Please store your predictions in a list so you can evaluate whether your model improved.

<details>
    <summary><b> Click here for some tips to improve the model:</b></summary>

  1. Try adding more Dense layers.
  2. Try adding Dropout layers, e.g. `tf.keras.layers.Dropout(rate=0.2)` (try rates between 0.0 and 0.5).
  3. Increase the training time (more epochs).
  4. Change the optimizer (SGD, Adam, Adamax).
  5. Check your predictions after each new training run.
</details>
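
For the Dropout tip, it can help to see what such a layer does during training. The NumPy sketch below imitates the behaviour documented for `tf.keras.layers.Dropout`: a fraction `rate` of the inputs is set to zero at random, and the remaining ones are scaled by `1 / (1 - rate)` so that the expected sum stays the same. The function `dropout_train` is a hypothetical helper, not the Keras implementation.

```python
import numpy as np

def dropout_train(x, rate, rng):
    """Zero a random fraction `rate` of the entries and rescale the rest."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones((1000, 8))      # stand-in for the output of a Dense layer
dropped = dropout_train(activations, rate=0.5, rng=rng)

print(np.unique(dropped))             # values are either dropped (0) or rescaled
print(f"mean after dropout: {dropped.mean():.3f}")
```

At prediction time Keras disables dropout, so the layer only affects training.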
