## Motivation 

What is a regression problem?

Key characteristics:

- Estimating the relationship between a dependent variable (label) and independent variables (predictors, features or covariates). 

- A form of supervised learning. 

- Numeric labels: the model predicts a number rather than a category.


## Anatomy of Neural Networks (Simple)

![Neural network diagram](images/NN_diagram.png "NN_Diagram")

![Neural network architecture for regression](images/regression_NN_architecture.png "regression_NN_architecture")

## Creating Sample Data

In [None]:
# Import modules.
import tensorflow as tf 
import numpy as np
import matplotlib.pyplot as plt

# Creating features.
X = np.array([-8.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0])

# Creating labels.
y = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0])

# Visualizing data.
plt.scatter(X, y)

# Convert NumPy arrays to tensors.
X = tf.constant(X)
y = tf.constant(y)

## Modelling with TensorFlow

In general:

1. **Create a model**: define the input and output layers, as well as any hidden layers.

2. **Compile a model**: define the loss function, optimizer, and evaluation metrics.

3. **Fit a model**: let the model find patterns between features and labels.
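
To make the three steps concrete, here is a minimal plain-Python sketch of what a single `Dense(1)` unit trained with mean absolute error and gradient descent is doing, using the sample data from above. It is an illustration under stated assumptions, not Keras's implementation: the weight and bias start at zero (Keras initialises the weight randomly), every update uses the full dataset, and the learning rate of 0.01 and the helper names `predict`, `mae` and `sgd_step` are chosen for this example.

```python
# Plain-Python stand-in for Dense(1) + MAE + gradient descent (illustrative only).
X = [-8.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0]
y = [3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0]


def predict(w, b, xs):
    """Step 1 (create): the model is a single line, y_hat = w * x + b."""
    return [w * x + b for x in xs]


def mae(y_true, y_pred):
    """Step 2 (compile): the loss is the mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def sgd_step(w, b, xs, ys, lr=0.01):
    """Step 3 (fit): one gradient-descent update of w and b on the MAE."""
    n = len(xs)
    # The derivative of |error| is the sign of the error (prediction - label).
    signs = [(p > t) - (p < t) for p, t in zip(predict(w, b, xs), ys)]
    grad_w = sum(s * x for s, x in zip(signs, xs)) / n
    grad_b = sum(signs) / n
    return w - lr * grad_w, b - lr * grad_b


w, b = 0.0, 0.0  # assumed starting point
initial_loss = mae(y, predict(w, b, X))

for epoch in range(5):
    w, b = sgd_step(w, b, X, y)

final_loss = mae(y, predict(w, b, X))
print(f"loss before: {initial_loss:.3f}, loss after 5 updates: {final_loss:.3f}")
```

Each update nudges `w` and `b` by a small amount, so after five updates the loss has moved only a little from its starting value.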

In [None]:
# Set random seed.
tf.random.set_seed(42)

# Initialise a model using the sequential API.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1) # predict y_{i} using X_{i}
])

# Compile the model.
model.compile(
    loss = tf.keras.losses.mae, 
    optimizer = tf.keras.optimizers.SGD(),
    metrics = ["mae"]
)

# Fit the model.
model.fit(tf.expand_dims(X, axis = -1), y, epochs = 5)

# Predicting new label.
model.predict(np.array([[17.0]]))


## Improving a Model

The previous model did not perform well. We can improve our model by altering the steps we took to create it.

1. **Creating a model**: add layers, increase the number of hidden units within each hidden layer, and/or change the activation function of each layer.

2. **Compiling a model**: change the optimizer or its learning rate.

3. **Fitting a model**: increase the number of epochs or provide the model with more data.
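
To get a feel for how much capacity step 1 adds, the sketch below counts trainable parameters for stacks of fully connected layers with biases. `dense_param_count` and `stack_param_count` are hypothetical helpers written for this illustration, not Keras functions; in Keras, `model.summary()` reports the same totals once the model is built.

```python
def dense_param_count(input_dim, units):
    """Weights plus biases of one fully connected layer."""
    return input_dim * units + units


def stack_param_count(input_dim, layer_units):
    """Total parameters of dense layers applied one after another."""
    total = 0
    for units in layer_units:
        total += dense_param_count(input_dim, units)
        input_dim = units  # this layer's outputs feed the next layer
    return total


baseline = stack_param_count(1, [1])     # a single Dense(1) layer
wider = stack_param_count(1, [100, 1])   # Dense(100) followed by Dense(1)
print(baseline, wider)
```

With one input feature, a lone `Dense(1)` layer has 2 parameters, while `Dense(100)` followed by `Dense(1)` has 301, which is a lot of freedom for a model fitted to eight data points.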

In [None]:
# Set random seed.
tf.random.set_seed(42)

# Initialise a model using the sequential API.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(1) # predict y_{i} using X_{i}
])

# Compile the model (same loss and optimizer as before).
model.compile(
    loss = tf.keras.losses.mae, 
    optimizer = tf.keras.optimizers.SGD(),
    metrics = ["mae"]
)

# Fit the model, but this time with more epochs.
model.fit(tf.expand_dims(X, axis = -1), y, epochs = 100)

# Predicting new label.
model.predict(np.array([[17.0]]))

# Despite the lower MAE on the training data, the model's prediction for the new input is worse.
# This is most likely a sign of overfitting.
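
One way to check a claim like this is to hold some data back and compare the error on points the model trained on with the error on points it never saw. The sketch below uses plain Python and two stand-in predictors rather than the Keras models above: a least-squares line, and a lookup table that memorises the training pairs (an extreme case of overfitting). The 6/2 split and the names `fit_line`, `make_memoriser` and `mae` are assumptions made for this illustration.

```python
X = [-8.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0]
y = [3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0]

# Hold out the last two points as unseen data (split chosen for illustration).
X_train, y_train = X[:6], y[:6]
X_test, y_test = X[6:], y[6:]


def mae(y_true, y_pred):
    """Mean absolute error between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_line(xs, ys):
    """Ordinary least-squares line; returns a predictor function."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def make_memoriser(xs, ys):
    """Returns stored labels for seen inputs, the mean training label otherwise."""
    table = dict(zip(xs, ys))
    fallback = sum(ys) / len(ys)
    return lambda x: table.get(x, fallback)


results = {}
for name, model in [("line", fit_line(X_train, y_train)),
                    ("memoriser", make_memoriser(X_train, y_train))]:
    train_mae = mae(y_train, [model(x) for x in X_train])
    test_mae = mae(y_test, [model(x) for x in X_test])
    results[name] = (train_mae, test_mae)
    print(f"{name}: train MAE = {train_mae:.2f}, test MAE = {test_mae:.2f}")
```

The memoriser reaches a training MAE of zero yet does far worse on the held-out points than the line. A lower training error alone therefore says little about performance on new data.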