# Introduction to Regression with Neural Networks in TensorFlow

There are many definitions of a regression model, but here I'm going to simplify it: predicting a numerical variable based on some combination of other variables. Even shorter: predicting a number.

In [None]:
import tensorflow as tf
from tensorflow.keras.utils import plot_model
print(tf.__version__)


# Creating Data to view and fit

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Create features
x = np.array([-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0])

# Create labels
y = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0])

plt.scatter(x, y)


# Input and Output shapes

In [None]:
# Creating a demo tensor for our housing price prediction problem
house_info = tf.constant(["bedroom", "bathroom", "kitchen"])
house_price = tf.constant([939700])
house_info.shape, house_price.shape


In [None]:
x_ten = tf.constant(x)
y_ten = tf.constant(y)

# Reshape to (samples, features): 8 samples with 1 feature each
x_rs = tf.reshape(x_ten, [8, 1])
y_rs = tf.reshape(y_ten, [8, 1])
x_rs.shape, y_rs.shape


# Steps in modelling with TensorFlow

1. **Creating a model** - define the input and output layers, as well as the hidden layers of a deep learning model.
2. **Compiling a model** - define a loss function (the function that tells our model how wrong it is), an optimizer (tells our model how to improve the patterns it is learning) and evaluation metrics (what we can use to interpret the performance of our model).
3. **Fitting the model** - letting the model try to find patterns between X and Y (features and labels).

In [None]:
# Set a random seed
tf.random.set_seed(42)

# Create a model using the Sequential API
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1)
])

# Compile the model
model.compile(loss="mae", optimizer="sgd", metrics=['mae'])

# Fit the model
model.fit(x_rs, y_rs, epochs=10)


In [None]:
# Try and make predictions using our model
# (use a new variable so the training data in x_rs is not overwritten)
x_new = tf.constant([[17.0], [20.0], [23.0], [26.0], [29.0], [32.0], [35.0], [38.0]])
model.predict(x_new)
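
To make the roles of the loss function and the optimizer more concrete, here is a minimal sketch in plain Python (no TensorFlow) of a single neuron, `w * x + b`, trained with full-batch gradient descent on the MAE loss. The names `mae_loss` and `train_one_neuron`, the zero initialisation and the learning rate of 0.01 are assumptions made for illustration. Keras's `"sgd"` optimizer works on mini-batches and starts from random weights, so its numbers will differ.

```python
def mae_loss(y_true, y_pred):
    """Mean absolute error: the average size of the mistakes."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def train_one_neuron(xs, ys, epochs=100, learning_rate=0.01):
    """Fit pred = w * x + b by gradient descent on the MAE loss."""
    w, b = 0.0, 0.0  # assumed starting point (Keras initialises randomly)
    history = []
    for _ in range(epochs):
        preds = [w * x + b for x in xs]
        history.append(mae_loss(ys, preds))  # loss before this update
        # The (sub)gradient of |pred - y| with respect to pred is sign(pred - y)
        directions = [sign(p - t) for p, t in zip(preds, ys)]
        grad_w = sum(d * x for d, x in zip(directions, xs)) / len(xs)
        grad_b = sum(directions) / len(xs)
        # The optimizer step: move the parameters against the gradient
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history


# Same feature and label values as the small dataset above (y = x + 10)
xs_demo = [-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0]
ys_demo = [3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0]

w_fit, b_fit, loss_history = train_one_neuron(xs_demo, ys_demo)
print("first loss:", loss_history[0], "last loss:", loss_history[-1])
```

The loss recorded at the end is lower than at the start but still far from zero, which is one reason to experiment with more epochs, a different optimizer or a larger model.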


# Improving our model

We can improve our model by altering the steps we took to create it:

1. **Creating a model** - here we might add more layers, increase the number of hidden units (also called neurons) within each of the hidden layers, or change the activation function of each layer.
2. **Compiling a model** - here we might change the optimization function or perhaps the learning rate of the optimization function.
3. **Fitting the model** - here we might fit the model for more epochs (leave it training for longer) or on more data (give the model more examples to learn from).

In [None]:
# Let's rebuild our model

# 1. Create the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(150, activation="relu"),
    tf.keras.layers.Dense(150, activation="relu"),
    tf.keras.layers.Dense(150, activation="relu"),
    tf.keras.layers.Dense(1)
])

# 2. Compile the model
model.compile(loss=tf.keras.losses.mae,
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              metrics=["mae"])

# 3. Fit the model (this time we'll train for longer)
model.fit(x_rs, y_rs, epochs=100)


# Evaluating a model

In practice, a typical workflow you'll go through when building neural networks is:

```
Build a model --> fit it --> evaluate it --> tweak the model --> fit it --> evaluate it...
```

In [None]:
# Make a bigger dataset
x = tf.range(-100.0, 100.0, 4)
x
# Make labels for the dataset
y = x + 10
y

# Visualize the data
plt.scatter(x, y)


## The 3 sets...

* **Training set** - The model learns from this data, which is typically 70-80% of the total data you have available.
* **Validation set** - The model gets tuned on this data, which is typically 10-15% of the total data you have available.
* **Test set** - The model gets evaluated on this data to test what it has learned. This set is typically 10-15% of the total data you have available.
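
As a rough illustration of the three sets, the sketch below slices a list into training, validation and test portions in plain Python. The function name `three_way_split` and the 70/15/15 percentages are assumptions chosen to fall inside the ranges listed above. The split is sequential, with no shuffling; real datasets usually need to be shuffled first.

```python
def three_way_split(samples, train_pct=70, val_pct=15):
    """Split samples into train/validation/test parts by percentage.

    Whatever is left after the training and validation parts becomes the test set.
    """
    n = len(samples)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    train_part = samples[:n_train]
    val_part = samples[n_train:n_train + n_val]
    test_part = samples[n_train + n_val:]
    return train_part, val_part, test_part


demo_samples = list(range(-100, 100, 4))  # 50 values, like the dataset above
train_part, val_part, test_part = three_way_split(demo_samples)
print(len(train_part), len(val_part), len(test_part))  # 35 7 8
```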

In [None]:
# Check how many samples we have
len(x)

# Split the data into train and test sets
x_train = x[:40]  # First 40 are training samples (80% of the data)
y_train = y[:40]  # First 40 are training samples (80% of the data)

x_test = x[40:]  # last 10 are test samples (20% of the data)
y_test = y[40:]  # last 10 are test samples (20% of the data)

x_train_tensor = tf.constant(x_train)
y_train_tensor = tf.constant(y_train)

x_test_tensor = tf.constant(x_test)
y_test_tensor = tf.constant(y_test)


## Visualizing the data

Now that we've got training and test sets, let's visualize them.

In [None]:
plt.figure(figsize=(10, 7))

# Plot training data in blue
plt.scatter(x_train_tensor, y_train_tensor, c="b", label="Training data")

# Plot test data in green
plt.scatter(x_test_tensor, y_test_tensor, c="g", label="Test data")

# Show a legend
plt.legend()


In [None]:
# Building a neural network for our data

tf.random.set_seed(42)

# 1. Create a model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=[1], name="input_layer"),
    tf.keras.layers.Dense(1, name="output_layer")
], name="model_1")

# 2. Compile the model
model.compile(loss="mae",
              optimizer=tf.keras.optimizers.SGD(),
              metrics=["mae"])

model.summary()


* **Total params** - the total number of parameters in the model.
* **Trainable params** - these are the parameters (patterns) the model can update as it trains.
* **Non-trainable params** - these parameters aren't updated during training (this is typical when we bring in already learned patterns or parameters from other models during **transfer learning**).
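
The parameter counts can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below (the helper name `dense_param_count` is made up for illustration) applies that to a network with 1 input feature, a 10-unit dense layer and a 1-unit output layer.

```python
def dense_param_count(n_inputs, n_units, use_bias=True):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + (n_units if use_bias else 0)


# Input features, hidden units, output units
layer_sizes = [1, 10, 1]

per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(per_layer)
print(per_layer, total_params)  # [20, 11] 31
```

All of these parameters are trainable here, because no layer has been frozen.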

📖 **Resource:** For a more in-depth overview of the trainable parameters within a layer, check out [MIT's Introduction to Deep learning](https://www.youtube.com/watch?v=7sB052Pz0sQ&list=PLtBw6njQRU-rwp5__7C0oIVt26ZgjG9NI) video

![image](https://3.bp.blogspot.com/-JYbNelUE66k/XhOdLxanW5I/AAAAAAAACVI/fD375RZr0BAtoosh-e5Zbzz-gEll6bmxwCLcBGAsYHQ/s1600/intro.png)

🛠 **Exercise:** Try playing around with the number of hidden units in the dense layer and see how that affects the number of parameters (total and trainable) by calling `model.summary()`.

In [None]:
# Fitting the model
model.fit(x_train, y_train, epochs=100, verbose=0)

plot_model(model=model, show_shapes=True)


## Visualizing our model's prediction

To visualize predictions, it's a good idea to plot them against the ground truth labels.
Often we'll see this in the form of `y_test` or `y_true` vs `y_pred` (ground truth vs model's predictions)

In [None]:
# Make some predictions
y_pred = model.predict(x_test_tensor)
y_pred


In [None]:
y_test


In [None]:
# Creating a plotting function
def plot_pred(train_data,
              train_labels,
              test_data,
              test_label,
              predictions):
    '''
    Plots training data, test data and compares predictions to ground truth labels.
    '''
    plt.figure(figsize=(10, 7))

    # Plot training data in blue
    plt.scatter(train_data, train_labels, c="b", label="Training Data")
    # Plot testing data in green
    plt.scatter(test_data, test_label, c="g", label="Test Data")
    # Plot model's predictions in red
    plt.scatter(test_data, predictions, c="r", label="Predictions")

    plt.legend()


plot_pred(x_train_tensor, y_train_tensor, x_test_tensor, y_test_tensor, y_pred)


In [None]:
# Evaluate the model on the test set
model.evaluate(x_test, y_test)


In [None]:
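# Calculate the mean absolute error
# Note: y_pred has shape (10, 1) while y_test has shape (10,), so the shapes
# broadcast and this returns one value per prediction instead of a single MAE
tf.metrics.mean_absolute_error(y_true=y_test,
                               y_pred=tf.constant(y_pred))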


In [None]:
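# y_pred has an extra trailing dimension, shape (10, 1)
tf.constant(y_pred)
# Reshape it to (10,) so it lines up with y_test
tf.reshape(y_pred, shape=(10,))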


In [None]:
# Calculate the mean absolute error
mae = tf.metrics.mean_absolute_error(y_true=y_test_tensor,
                                     y_pred=tf.squeeze(y_pred))
mae


In [None]:
# Calculate the mean squared error
mse = tf.metrics.mean_squared_error(y_true=y_test_tensor,
                                    y_pred=tf.squeeze(y_pred))
mse


In [None]:
# Making functions to reuse the MAE and MSE
def mae(y_true, y_pred):
    return tf.metrics.mean_absolute_error(y_true=y_true, y_pred=tf.squeeze(y_pred))


def mse(y_true, y_pred):
    return tf.metrics.mean_squared_error(y_true=y_true, y_pred=tf.squeeze(y_pred))
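
For intuition about what these two metrics compute, here is the same arithmetic in plain Python. The names `mae_plain` and `mse_plain` and the demo values are made up for illustration and are not results from the models in this notebook.

```python
def mae_plain(y_true, y_pred):
    """Mean absolute error: average of |truth - prediction|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_plain(y_true, y_pred):
    """Mean squared error: average of (truth - prediction) squared."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values: three small errors (2, 2, 3) and one large error (10)
y_true_demo = [10.0, 20.0, 30.0, 40.0]
y_pred_demo = [12.0, 18.0, 33.0, 30.0]

print(mae_plain(y_true_demo, y_pred_demo))  # 4.25
print(mse_plain(y_true_demo, y_pred_demo))  # 29.25
```

The single error of 10 contributes 100 to the squared sum, which is why MSE is the more sensitive of the two to large mistakes.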


### Running experiments to improve our model

1. Get more data
2. Make the model larger (using a more complex model)
3. Train for longer

Let's do 3 modelling experiments

1. `model_1` - same as the original model, 1 layer, trained for 100 epochs.
2. `model_2` - 2 layers, trained for 100 epochs.
3. `model_3` - 2 layers, trained for 200 epochs.

In [None]:
# Building model 1
tf.random.set_seed(42)

# 1. Create the model
model_1 = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=[1])
])

# Compiling the model
model_1.compile(loss="mae",
                optimizer=tf.keras.optimizers.SGD(),
                metrics=["mae"])

# Fit the model
model_1.fit(x_train_tensor, y_train_tensor, epochs=100)


In [None]:
y_pred_1 = model_1.predict(x_test_tensor)
plot_pred(x_train_tensor, y_train_tensor, x_test_tensor, y_test_tensor, y_pred_1)

In [None]:
mae_1 = mae(y_test_tensor, y_pred_1)
mse_1 = mse(y_test_tensor, y_pred_1)

mae_1, mse_1

In [None]:
# Building model 2
tf.random.set_seed(42)

# 1. Create the model
model_2 = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=[1]),
    tf.keras.layers.Dense(1)
])

# Compiling the model
model_2.compile(loss="mae",
                optimizer=tf.keras.optimizers.SGD(),
                metrics=["mae"])

# Fit the model
model_2.fit(x_train_tensor, y_train_tensor, epochs=100)

In [None]:
y_pred_2 = model_2.predict(x_test_tensor)
plot_pred(x_train_tensor, y_train_tensor, x_test_tensor, y_test_tensor, y_pred_2)

In [None]:
mae_2 = mae(y_test_tensor, y_pred_2)
mse_2 = mse(y_test_tensor, y_pred_2)

mae_2, mse_2

In [None]:
# Building model 3
tf.random.set_seed(42)

# 1. Create the model
model_3 = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=[1]),
    tf.keras.layers.Dense(1)
])

# Compiling the model
model_3.compile(loss="mae",
                optimizer=tf.keras.optimizers.SGD(),
                metrics=["mae"])

# Fit the model
model_3.fit(x_train_tensor, y_train_tensor, epochs=200)

In [None]:
y_pred_3 = model_3.predict(x_test_tensor)
plot_pred(x_train_tensor, y_train_tensor, x_test_tensor, y_test_tensor, y_pred_3)

In [None]:
mae_3 = mae(y_test_tensor, y_pred_3)
mse_3 = mse(y_test_tensor, y_pred_3)

mae_3, mse_3

🔑 **Note:** Start with small experiments (small models), make sure they work, and then increase their scale when necessary.

## Compiling the results of our experiments

We've run a few experiments, let's compare the results.

In [None]:
# Let's compare our model's results using a pandas DataFrame
import pandas as pd

model_results = [["model_1", mae_1.numpy(), mse_1.numpy()],
                 ["model_2", mae_2.numpy(), mse_2.numpy()],
                 ["model_3", mae_3.numpy(), mse_3.numpy()]]

all_results = pd.DataFrame(model_results, columns=["model", "mae", "mse"])
all_results

Looks like `model_2` performed the best...
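
Picking the best run can also be done programmatically rather than by eye. The sketch below ranks a list of result records by a chosen metric. The helper `rank_by` and all the numbers are placeholders invented for illustration; they are not the results of the experiments above.

```python
# Placeholder metrics for illustration only (NOT the notebook's results)
demo_results = [
    {"model": "demo_a", "mae": 18.0, "mse": 350.0},
    {"model": "demo_b", "mae": 3.0, "mse": 12.0},
    {"model": "demo_c", "mae": 60.0, "mse": 4000.0},
]


def rank_by(results, metric):
    """Return the results sorted from lowest (best) to highest metric value."""
    return sorted(results, key=lambda row: row[metric])


best_run = rank_by(demo_results, "mae")[0]
print(best_run["model"])  # demo_b
```

With a pandas DataFrame such as `all_results`, calling `all_results.sort_values("mae")` gives the same kind of ordering.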

In [None]:
model_2.summary()

> 🔑 **Note:** One of our main goals should be to minimize the time between our experiments. The more experiments we do, the more things we'll figure out which don't work, in turn we'll get closer to figuring out what does work. 

## Tracking the experiments

One really good habit in machine learning modelling is to track the results of your experiments.

Doing so can be tedious if you're running lots of experiments.

Luckily, there are tools to help us.

📖 **Resource:** As we build more models we'll want to look into using:

* [TensorBoard](https://www.tensorflow.org/tensorboard/get_started) - A component of the TensorFlow library that helps track modelling experiments (we'll see this later on).
* [Weights & Biases](https://wandb.ai/site) - A tool for tracking all kinds of machine learning experiments (plugs straight into TensorBoard).
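
Before reaching for those tools, the underlying idea can be sketched with the standard library alone: append one row per experiment to a CSV file. This is only a toy illustration, not a substitute for TensorBoard or Weights & Biases; the function names, the file name `experiments.csv` and the logged values are all assumptions.

```python
import csv
import os
import tempfile


def log_experiment(path, name, epochs, mae_value, mse_value):
    """Append one experiment to a CSV log, writing a header for a new file."""
    is_new_file = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new_file:
            writer.writerow(["model", "epochs", "mae", "mse"])
        writer.writerow([name, epochs, mae_value, mse_value])


def read_log(path):
    """Read the log back as a list of dictionaries (all values are strings)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


# Write to a temporary directory so the sketch does not clutter the workspace
log_path = os.path.join(tempfile.mkdtemp(), "experiments.csv")

# Placeholder values for illustration only
log_experiment(log_path, "demo_a", 100, 1.5, 3.0)
log_experiment(log_path, "demo_b", 200, 1.2, 2.5)

logged_rows = read_log(log_path)
print([row["model"] for row in logged_rows])  # ['demo_a', 'demo_b']
```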

## Saving our models

Saving our models allows us to use them outside of Google Colab, such as in a web application or a mobile app.

There are two main formats we can save our models to:
1. The `SavedModel` format
2. The `HDF5` format

In [None]:
# Save model using the SavedModel format
model_2.save("best_model_SaveModel_format")

In [None]:
# Save model using HDF5 Format
model_2.save("best_model_HDF5_format.h5")

## Loading in a saved model

In [None]:
# Load in the SavedModel Format model
loaded_SavedModel_format = tf.keras.models.load_model("best_model_SaveModel_format")
loaded_SavedModel_format.summary()

In [None]:
# Compare model 2 predictions with the SavedModel format model predictions
# (predict on the test features, not the test labels)
model_2_preds = model_2.predict(x_test_tensor)
savedmodel_preds = loaded_SavedModel_format.predict(x_test_tensor)

mae(y_test_tensor, model_2_preds) == mae(y_test_tensor, savedmodel_preds)

## A Larger Example

In [None]:
# Read in the insurance dataset
insurance = pd.read_csv("insurance.csv")
insurance

In [None]:
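# One-hot encode the categorical columns
# (dtype=float makes the encoded columns floats; newer pandas versions default to booleans)
insurance_onehot = pd.get_dummies(insurance, dtype=float)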
insurance_onehot
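
To show what one-hot encoding does to a categorical column, here is a small plain-Python sketch. The helper `one_hot_encode` and the two demo rows are made up for illustration; `pd.get_dummies` does this for every non-numeric column at once and may differ in details such as column order and dtype.

```python
def one_hot_encode(rows, column):
    """Replace one categorical column with a 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    encoded_rows = []
    for row in rows:
        # Keep every other column unchanged
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded_rows.append(new_row)
    return encoded_rows


# Made-up rows for illustration only
demo_people = [
    {"age": 19, "smoker": "yes"},
    {"age": 33, "smoker": "no"},
]

demo_encoded = one_hot_encode(demo_people, "smoker")
print(demo_encoded[0])  # {'age': 19, 'smoker_no': 0, 'smoker_yes': 1}
print(demo_encoded[1])  # {'age': 33, 'smoker_no': 1, 'smoker_yes': 0}
```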

In [None]:
# Create X and y values (features and labels)
x1 = insurance_onehot.drop("charges", axis=1)
y1 = insurance_onehot["charges"]

In [None]:
# Create training and test sets
from sklearn.model_selection import train_test_split
x1_train, x1_test, y1_train, y1_test = train_test_split(x1, y1, test_size=0.2, random_state=42)
len(x1_train), len(x1_test)
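
Conceptually, a random train/test split shuffles the row indices with a fixed seed and then slices them. The sketch below shows that idea in plain Python. It is not scikit-learn's implementation, and the helper name `shuffled_split` is an assumption; it will not reproduce the exact rows that `train_test_split` selects.

```python
import random


def shuffled_split(rows, test_size=0.2, seed=42):
    """Shuffle row indices reproducibly, then slice into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # fixed seed gives a repeatable split
    n_test = round(len(rows) * test_size)
    test_indices = indices[:n_test]
    train_indices = indices[n_test:]
    train_rows = [rows[i] for i in train_indices]
    test_rows = [rows[i] for i in test_indices]
    return train_rows, test_rows


demo_rows = list(range(10))
demo_train, demo_test = shuffled_split(demo_rows)
print(len(demo_train), len(demo_test))  # 8 2
```

Fixing the seed plays the same role as `random_state=42` above: rerunning the cell gives the same split, so experiments stay comparable.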

In [None]:
# Build a neural network (sort of like model_2 above)
tf.random.set_seed(42)

# Create insurance model
model_ins = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(1)
])

# Compile the model
model_ins.compile(loss="mae",
                  optimizer="sgd",
                  metrics=["mae"])

# Fitting the model
model_ins.fit(x1_train, y1_train, epochs=100)

In [None]:
# Check the results of the insurance model on the test data
model_ins.evaluate(x1_test, y1_test)

Right now it looks like our model isn't performing very well... let's try to improve it.