# Introduction to Neural Networks with TensorFlow

Put simply, a regression problem is one where we predict a numerical variable based on some combination of other variables.
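
As a point of reference for what "predicting a number from other numbers" means, the sketch below fits a straight line with ordinary least squares in plain Python. This is a classical technique swapped in for illustration, not the neural-network approach used in the rest of this notebook, and the names `fit_line`, `xs`, and `ys` are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data following y = x + 10 (same values as the sample data in this notebook)
xs = [-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0]
ys = [3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 17.0 + intercept  # predict y for an unseen x
print(slope, intercept, prediction)
```

For data this simple the fitted slope and intercept should land on 1 and 10, so the prediction for 17 is 27. The neural network has to discover the same relationship by trial and error.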

In [None]:
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Plotting the model
from tensorflow.keras.utils import plot_model

In [None]:
# Creating some sample data for regression
X = tf.constant(np.array([-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0]))
y = tf.constant(np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0]))  # y = x + 10
X, y

In [None]:
# Plotting the data
plt.scatter(X,y)

## Input and Output Shapes

In [None]:
# Create a Tensor for house info
house_info = tf.constant(['bedroom', 'bathroom', 'garage'])  # Input
house_price = tf.constant([939700])  # Output

house_info, house_price

In [None]:
# Checking the shapes of X and y
# NOTE: X.shape and y.shape describe the whole dataset, (8,), not a single sample.
# The model maps one number to one number, so its input shape should be (1,) and its output shape should be (1,)
input_shape = X.shape
output_shape = y.shape
input_shape, output_shape
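
The difference between the shape of the whole dataset and the shape of a single sample can be sketched with plain Python lists. The helper `shape_of` is a hypothetical stand-in for a tensor's `.shape` attribute, and it assumes regular, non-empty nested lists.

```python
def shape_of(nested):
    """Return the shape of a regular, non-empty nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

X_flat = [-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0]
X_rows = [[x] for x in X_flat]  # one row per sample, one feature per row

print(shape_of(X_flat))     # (8,)   -> 8 values, no feature dimension
print(shape_of(X_rows))     # (8, 1) -> 8 samples with 1 feature each
print(shape_of(X_rows[0]))  # (1,)   -> the shape of one sample
```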

## Steps in Modeling with TensorFlow

1. Creating a model: define the input and output layers, as well as the hidden layers of a deep learning model.
2. Compiling a model: define the loss function (in other words, the function which tells our model how wrong it is), the optimizer (tells our model how to improve the patterns it's learning), and evaluation metrics (what we can use to interpret the performance of our model).
3. Fitting a model: letting the model try to find patterns between X and y (features and labels).
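
To make the three steps concrete without any framework, here is a minimal plain-Python sketch. It assumes a model with a single weight and bias, MAE as the loss, and full-batch gradient descent as the optimizer. The names `predict`, `mae_loss`, and `train`, and the learning rate of 0.01, are illustrative assumptions; Keras handles all of this internally and its implementation differs.

```python
# Sample data from this notebook: y = x + 10
X_data = [-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0]
y_data = [3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0]

def predict(w, b, x):
    """Step 1, the model: a single weight and bias applied to one input."""
    return w * x + b

def mae_loss(w, b):
    """Step 2, the loss: mean absolute error over the whole dataset."""
    errors = [abs(predict(w, b, x) - y) for x, y in zip(X_data, y_data)]
    return sum(errors) / len(errors)

def sign(value):
    return (value > 0) - (value < 0)

def train(epochs, learning_rate=0.01):
    """Steps 2 and 3, the optimizer and the fit loop: (sub)gradient descent on MAE."""
    w, b = 0.0, 0.0
    n = len(X_data)
    for _ in range(epochs):
        # The derivative of |error| is the sign of the error
        signs = [sign(predict(w, b, x) - y) for x, y in zip(X_data, y_data)]
        grad_w = sum(s * x for s, x in zip(signs, X_data)) / n
        grad_b = sum(signs) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

for epochs in (0, 5, 100):
    w, b = train(epochs)
    print(epochs, round(mae_loss(w, b), 3))
```

With these settings the loss after 100 epochs is lower than after 5, which is lower than the untrained starting point.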

In [None]:
# Set random seed
tf.random.set_seed(42)

### Trial-1
Creating a simple model to iterate and improve upon.

In [None]:
# 1. Create a model using sequential api
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1)  # This defines the output layer of shape 1
])
model.input_shape, model.output_shape, model.layers

In [None]:
# 2. Compiling the model
# losses.mae is Mean Absolute Error
# In short, this takes the average of the absolute error between the prediction and the actual value
loss_function = tf.keras.losses.mae

# optimizers.SGD is Stochastic Gradient Descent
# NOTE: Using optimizers.legacy.SGD due to optimizers.SGD being slower on M1 Mac
optimizer = tf.keras.optimizers.legacy.SGD()

model.compile(loss=loss_function,
              optimizer=optimizer,
              metrics=['mae'])

In [None]:
# 3. Fit the model
model.fit(X, y, epochs=5)  # epochs is how many times the model runs through the training data

In [None]:
# Testing predictions of the model
model.predict([17.0])  # This is pretty far off, so next steps are to improve the model

### Improving the model

We can improve our model by altering the steps we took to create a model.

1. **Creating a Model**: We can add more layers, increase the number of neurons in each hidden layer, or change the activation function of each layer.
2. **Compiling a Model**: We can change the loss function, use a different optimization function, or change the learning rate of the optimization function.
3. **Fitting a Model**: We can fit a model with more epochs (cycle through training data more times) or on more data (give more examples to learn from).

**NOTE**: It is better to start with a smaller model, then improve upon it.

### Trial-2
The only difference between Trial-1 and Trial-2 is an increase in the number of epochs (the number of passes through the training data).

In [None]:
# Rebuilding/Improving the Model

# 1. Create a model using sequential api
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1)  # This defines the output layer of shape 1
])

# 2. Compiling the model
# losses.mae is Mean Absolute Error
# In short, this takes the average of the absolute error between the prediction and the actual value
loss_function = tf.keras.losses.mae

# optimizers.SGD is Stochastic Gradient Descent
# NOTE: Using optimizers.legacy.SGD due to optimizers.SGD being slower on M1 Mac
optimizer = tf.keras.optimizers.legacy.SGD()

model.compile(loss=loss_function,
              optimizer=optimizer,
              metrics=['mae'])

# 3. Fit the model (training for 100 epochs instead of 5)
model.fit(X, y, epochs=100)  # epochs is how many times the model runs through the training data

In [None]:
# Testing predictions of the model
model.predict([17.0])  # This is pretty far off, so next steps are to improve the model

### Trial-3
Keeping epochs the same as Trial-2, but adding a hidden layer.
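
A hidden `Dense` layer with ReLU computes `max(0, w * x + b)` for each of its units, and the output layer then takes a weighted sum of those values. A minimal sketch in plain Python, assuming a hidden layer of only 2 units with hand-picked (not trained) weights; every name and number here is illustrative.

```python
def relu(value):
    return max(0.0, value)

def dense(inputs, weights, biases, activation=None):
    """Forward pass of one dense layer; weights[j][i] connects input i to unit j."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(total) if activation else total)
    return outputs

# Hand-picked weights: relu(x) - relu(-x) equals x, so the network outputs x + 10
hidden_weights = [[1.0], [-1.0]]
hidden_biases = [0.0, 0.0]
output_weights = [[1.0, -1.0]]
output_biases = [10.0]

def forward(x):
    hidden = dense([x], hidden_weights, hidden_biases, activation=relu)
    return dense(hidden, output_weights, output_biases)[0]

print(forward(17.0))  # 27.0
print(forward(-7.0))  # 3.0
```

The point of the sketch is that a network with a ReLU hidden layer can represent `y = x + 10` exactly. Whether training actually finds such weights depends on the optimizer, the learning rate, and the number of epochs.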

In [None]:
# Rebuilding/Improving the Model

# 1. Create a model using sequential api
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(100, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(1)
])

# 2. Compiling the Model
loss_function = tf.keras.losses.mae
optimizer = tf.keras.optimizers.legacy.SGD()

model.compile(loss=loss_function, optimizer=optimizer, metrics=['mae'])

# 3. Fit the Model
model.fit(X, y, epochs=100)

In [None]:
# Testing predictions of the model (Actual value should be 27)
model.predict([17.0])  # Odd: the prediction is not close to the actual value even though the training MAE is lower (a lower training error does not guarantee better predictions on unseen inputs)

### Trial-4
Keeping epochs the same as Trial-2 and hidden layers from Trial-3, but changing the optimizer
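
For intuition about what changes when switching from SGD to Adam: Adam keeps running averages of the gradient and of the squared gradient, and scales each step by their ratio. Below is a minimal sketch of that update rule in plain Python, applied to a toy one-variable objective. The function `adam_minimize`, the toy objective, and the step count are assumptions for illustration; this is not the Keras implementation.

```python
import math

def adam_minimize(gradient, start, learning_rate=0.01, beta_1=0.9,
                  beta_2=0.999, epsilon=1e-7, steps=2000):
    """Minimize a one-variable function with the Adam update rule."""
    w = start
    m = 0.0  # running average of the gradient
    v = 0.0  # running average of the squared gradient
    for t in range(1, steps + 1):
        g = gradient(w)
        m = beta_1 * m + (1 - beta_1) * g
        v = beta_2 * v + (1 - beta_2) * g * g
        m_hat = m / (1 - beta_1 ** t)  # bias correction for the zero start
        v_hat = v / (1 - beta_2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return w

# Toy objective (an assumption for illustration): f(w) = (w - 3)^2, minimized at w = 3
final_w = adam_minimize(lambda w: 2 * (w - 3), start=0.0)
print(final_w)
```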

In [None]:
# Rebuilding/Improving the Model

# 1. Create a model using sequential api
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(100, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(1)
])

# 2. Compiling the Model
loss_function = tf.keras.losses.mae
optimizer = tf.keras.optimizers.legacy.Adam(learning_rate=0.01)  # `lr` is a deprecated alias of `learning_rate`

model.compile(loss=loss_function, optimizer=optimizer, metrics=['mae'])

# 3. Fit the Model
model.fit(X, y, epochs=100)

In [None]:
# Testing predictions of the model (Actual value should be 27)
model.predict([17.0])

## Evaluating a Model

In practice, a typical workflow you'll go through consists of the following:

create a model -> fit it -> evaluate it -> tweak it -> repeat...

**Most Important Step of Evaluating**
The most important step of evaluating a model is to visualize. It's a good idea to visualize:
* The data: what data are we working with? What does it look like?
* The model: What does our model look like?
* The training of the model: How does a model perform while it learns?
* The predictions of the model: How accurate are the predictions?

## Big Data
What is a good way to train a model? Give it a bigger dataset!

In [None]:
# Making a bigger set of data.
X = tf.range(-100, 100, 4)
y = X + 10
X, y

In [None]:
# Visualize the data
plt.scatter(X, y)

### Splitting Data
We need to split the complete dataset into a training set and a test set.

**NOTE: There are typically 3 sets to split the data with**
1. **Training Set**: The data the model learns from. Typically 70-80% of the available data.
2. **Validation Set**: The data the model gets tuned on. Typically 10-15% of the data.
3. **Test Set**: The data the model gets evaluated on to test what it has learned. Typically 10-15% of the data.
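
A three-way split with shuffling can be sketched in plain Python as follows. The helper `split_dataset`, the 80/10/10 fractions, and the seed value are assumptions chosen for illustration, and the variable names are kept different from the notebook's own `X_train` and `X_test`.

```python
import random

def split_dataset(features, labels, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle indices once, then slice them into train / validation / test."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable

    n_train = int(len(indices) * train_frac)
    n_val = int(len(indices) * val_frac)
    parts = (indices[:n_train],
             indices[n_train:n_train + n_val],
             indices[n_train + n_val:])

    # Index features and labels with the same indices so pairs stay aligned
    return [([features[i] for i in part], [labels[i] for i in part])
            for part in parts]

X_all = list(range(-100, 100, 4))
y_all = [x + 10 for x in X_all]

(train_X, train_y), (val_X, val_y), (test_X, test_y) = split_dataset(X_all, y_all)
print(len(train_X), len(val_X), len(test_X))  # 40 5 5
```

Shuffling before slicing means the test set is spread across the whole range of `X` instead of containing only the largest values.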

In [None]:
# Figuring out the Training and Test Set Sizes
total_units = len(X)
total_units

In [None]:
# Get the training set (80%)
# NOTE: Typically, we should shuffle the data before splitting

X_train = X[:int(total_units * 0.8)]
y_train = y[:int(total_units * 0.8)]

X_test = X[int(total_units * 0.8):]
y_test = y[int(total_units * 0.8):]

len(X_train), len(y_train), len(X_test), len(y_test)

In [None]:
# Visualizing the data
plt.figure(figsize=(10,7))
plt.scatter(X_train, y_train, c='b', label='Training Data')
plt.scatter(X_test, y_test, c='g', label='Test Data')
plt.legend()

### Create Model

In [None]:
# 1. Create Model
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,), name='InputLayer'),
    tf.keras.layers.Dense(100, activation=tf.keras.activations.relu, name='HiddenLayer-1'),
    tf.keras.layers.Dense(1, name='OutputLayer'),
])

# 2. Compile Model
model.compile(loss=tf.keras.losses.mae,
              optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=0.005),
              metrics=['mae'])


# 3. Fit Model
model.fit(X_train, y_train, epochs=10)

### Visualizing the model

Visualizing the model can be done with a handful of tools.

* Summary: Looking at the summary of the model and what it looks like.
* Diagram: Looking at the diagram of the model.
* Plot: Plotting the predictions of the model

In [None]:
model.summary()

* Total params: the total parameters in the model.
* Trainable params: the parameters (patterns) the model can update as it trains.
* Non-trainable params: the parameters (patterns) that the model does not update as it trains. This typically occurs when using an already trained model in transfer learning.
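
For `Dense` layers, the parameter count is one weight per input-unit pair plus one bias per unit. A small sketch of that arithmetic, applied to a 1 -> 100 -> 1 stack (the same layer sizes as the model built in this section); the helper name `dense_param_count` is an assumption for illustration.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [1, 100, 1]  # input features, hidden units, output units

per_layer = [dense_param_count(n_in, n_out)
             for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
print(per_layer)       # [200, 101]
print(sum(per_layer))  # 301
```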

In [None]:
plot_model(model, show_shapes=True, show_layer_names=True)

#### Visualizing the Predictions

Plotting predictions against the actual values can help better visualize the model. This is often in the form of y_test or y_true versus y_pred.

In [None]:
# Make Some predictions
y_pred = model.predict(X_test)
y_pred, y_test

In [None]:
# Let's create a plotting function
def plot_predictions(train_data=X_train, 
                     train_labels=y_train,
                     test_data=X_test,
                     test_labels=y_test,
                     predictions=y_pred):
    """
    Plots training data, test data, and compares predictions against actual values.
    """
    plt.figure(figsize=(10, 7))
    plt.scatter(train_data, train_labels, c='b', label='Training Data')
    plt.scatter(test_data, test_labels, c='g', label='Test Data')
    plt.scatter(test_data, predictions, c='r', label='Predictions')
    plt.legend()

plot_predictions()

### Evaluation Metrics

Depending on the problem, there will be different evaluation metrics to evaluate your model's performance.

Common Regression Metrics:
* **Mean Absolute Error**: The average absolute value of the difference between the prediction and the actual value.
* **Mean Squared Error**: The average of the square of the difference between the prediction and the actual value.
  - Note: Because errors are squared, large errors raise the MSE far more than they raise the MAE. TL;DR: larger errors are penalized more heavily than smaller errors.
* **Huber**: Combination of MSE and MAE. It is quadratic (like MSE) for small errors and linear (like MAE) for large errors, so it is less sensitive to outliers than MSE.
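
The three metrics can be written out in a few lines of plain Python, which makes the difference in how they treat a large error easier to see. The function names, the example values, and the Huber threshold `delta=1.0` are assumptions for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for errors up to delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        error = abs(t - p)
        if error <= delta:
            total += 0.5 * error ** 2
        else:
            total += delta * (error - 0.5 * delta)
    return total / len(y_true)

actual = [10.0, 20.0, 30.0]
predicted = [12.0, 20.0, 20.0]  # absolute errors: 2, 0, 10

print(mean_absolute_error(actual, predicted))  # 4.0
print(mean_squared_error(actual, predicted))   # about 34.67
print(huber_loss(actual, predicted))           # about 3.67
```

The single error of 10 dominates the MSE, while the MAE and Huber values stay on the scale of the errors themselves.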

In [None]:
# Evaluate the model on the test set
model.evaluate(X_test, y_test)

In [None]:
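# tf.squeeze removes the extra dimension from y_pred, shape (10, 1), so it matches y_test, shape (10,)
mae = tf.metrics.mean_absolute_error(y_true=tf.squeeze(y_test), y_pred=tf.squeeze(y_pred))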
mse = tf.metrics.mean_squared_error(y_true=tf.squeeze(y_test), y_pred=tf.squeeze(y_pred))
mae, mse

In [None]:
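def mae(y_true, pred):
    """Mean absolute error between labels and predictions (both squeezed to the same shape)."""
    return tf.metrics.mean_absolute_error(y_true=tf.squeeze(y_true), y_pred=tf.squeeze(pred))

def mse(y_true, pred):
    """Mean squared error between labels and predictions (both squeezed to the same shape)."""
    return tf.metrics.mean_squared_error(y_true=tf.squeeze(y_true), y_pred=tf.squeeze(pred))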

### Resetting and Iterating My Model to Optimize It

Going to test out 4 experiments with the same data as before.

1. Trial-1: Simple Model (1 hidden layer, 10 epochs)
2. Trial-2: Increase epochs to 100
3. Trial-3: Add a 2nd Hidden Layer with epochs at 100
4. Trial-4: Keep the 2nd Hidden Layer and increase epochs to 500

In [None]:
# Resetting the random seed and the data
tf.random.set_seed(42)

X = tf.range(-100, 100, 4)
y = X + 10

In [None]:
# Splitting the Data
num_training_points = int(len(X) * 0.8)

X_train = X[:num_training_points]
y_train = y[:num_training_points]

X_test = X[num_training_points:]
y_test = y[num_training_points:]

len(X_train), len(y_train), len(X_test), len(y_test)

#### Trial-1: Simple Model

In [None]:
# 1. Create Model
model_1 = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,), name='InputLayer'),
    tf.keras.layers.Dense(50, activation=tf.keras.activations.relu, name='HiddenLayer-1'),
    tf.keras.layers.Dense(1, name='OutputLayer')
])

# 2. Compile Model
model_1.compile(optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=0.005),
                loss=tf.keras.losses.mae)

# 3. Fit Model
model_1.fit(X_train, y_train, epochs=10)

In [None]:
# Testing the Model
predictions_1 = model_1.predict(X_test)

In [None]:
# Evaluating Model
plot_predictions(train_data=X_train, 
                 train_labels=y_train,
                 test_data=X_test,
                 test_labels=y_test,
                 predictions=predictions_1)

mae_1 = mae(y_test, predictions_1)
mse_1 = mse(y_test, predictions_1)
mae_1, mse_1

#### Trial-2: Epochs = 100

In [None]:
# 1. Create Model
model_2 = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,), name='InputLayer'),
    tf.keras.layers.Dense(50, activation=tf.keras.activations.relu, name='HiddenLayer-1'),
    tf.keras.layers.Dense(1, name='OutputLayer')
])

# 2. Compile Model
model_2.compile(optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=0.005),
                loss=tf.keras.losses.mae)

# 3. Fit Model
model_2.fit(X_train, y_train, epochs=100)

In [None]:
# Testing the Model
predictions_2 = model_2.predict(X_test)

In [None]:
# Evaluating Model
plot_predictions(train_data=X_train, 
                 train_labels=y_train,
                 test_data=X_test,
                 test_labels=y_test,
                 predictions=predictions_2)

mae_2 = mae(y_test, predictions_2)
mse_2 = mse(y_test, predictions_2)
mae_2, mse_2

#### Trial-3: Epochs=100, Hidden Layers=2

In [None]:
# 1. Create Model
model_3 = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,), name='InputLayer'),
    tf.keras.layers.Dense(50, activation=tf.keras.activations.relu, name='HiddenLayer-1'),
    tf.keras.layers.Dense(50, activation=tf.keras.activations.relu, name='HiddenLayer-2'),
    tf.keras.layers.Dense(1, name='OutputLayer')
])

# 2. Compile Model
model_3.compile(optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=0.005),
                loss=tf.keras.losses.mae)

# 3. Fit Model
model_3.fit(X_train, y_train, epochs=100)

In [None]:
# Testing the Model
predictions_3 = model_3.predict(X_test)

In [None]:
# Evaluating Model
plot_predictions(train_data=X_train, 
                 train_labels=y_train,
                 test_data=X_test,
                 test_labels=y_test,
                 predictions=predictions_3)

mae_3 = mae(y_test, predictions_3)
mse_3 = mse(y_test, predictions_3)
mae_3, mse_3

#### Trial-4: Epochs=500, Hidden Layers=2

In [None]:
# 1. Create Model
model_4 = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,), name='InputLayer'),
    tf.keras.layers.Dense(50, activation=tf.keras.activations.relu, name='HiddenLayer-1'),
    tf.keras.layers.Dense(50, activation=tf.keras.activations.relu, name='HiddenLayer-2'),
    tf.keras.layers.Dense(1, name='OutputLayer')
])

# 2. Compile Model
model_4.compile(optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=0.005),
                loss=tf.keras.losses.mae)

# 3. Fit Model
model_4.fit(X_train, y_train, epochs=500)

In [None]:
# Testing the Model
predictions_4 = model_4.predict(X_test)

In [None]:
# Evaluating Model
plot_predictions(train_data=X_train, 
                 train_labels=y_train,
                 test_data=X_test,
                 test_labels=y_test,
                 predictions=predictions_4)

mae_4 = mae(y_test, predictions_4)
mse_4 = mse(y_test, predictions_4)
mae_4, mse_4

#### Comparing Each Simulation

In [None]:
model_results = [['model_1', mae_1.numpy(), mse_1.numpy()],
                 ['model_2', mae_2.numpy(), mse_2.numpy()],
                 ['model_3', mae_3.numpy(), mse_3.numpy()],
                 ['model_4', mae_4.numpy(), mse_4.numpy()]]

all_results = pd.DataFrame(model_results, columns=['model', 'mae', 'mse'])
all_results

In [None]:
plot_model(model_3, show_shapes=True, show_layer_names=True)

## Tracking Experiments

A good habit for machine learning is to track experiments and their corresponding results.

Useful tools to help track experiments:
* TensorBoard: Component of the TensorFlow library to help track modeling experiments.
* Weights & Biases: tool used for tracking experiments (Should def look this one up!)
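
Even without a dedicated tool, the habit itself can be sketched in a few lines: write one record per experiment with its settings and metrics, then read the records back to compare. The sketch below uses only the Python standard library and is a stand-in for the idea, not for TensorBoard or Weights & Biases. The helper names, the run names, and the metric values are all made-up placeholders, not results from this notebook.

```python
import io
import json

def log_experiment(log_file, name, settings, metrics):
    """Append one experiment as a JSON line."""
    record = {"name": name, "settings": settings, "metrics": metrics}
    log_file.write(json.dumps(record) + "\n")

def load_experiments(log_file):
    """Read every logged experiment back as a list of dicts."""
    log_file.seek(0)
    return [json.loads(line) for line in log_file if line.strip()]

# In-memory buffer so the sketch runs without touching disk;
# a real log would more likely use open(path, "a+")
log = io.StringIO()

# Placeholder numbers for illustration only
log_experiment(log, "example_run_a", {"epochs": 10, "hidden_layers": 1}, {"mae": 1.5})
log_experiment(log, "example_run_b", {"epochs": 100, "hidden_layers": 1}, {"mae": 0.5})

records = load_experiments(log)
best = min(records, key=lambda record: record["metrics"]["mae"])
print(best["name"], best["settings"])
```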

## Saving Our Models

Saving our models allows us to use them outside of the place they were trained.

* TensorFlow Docs: https://www.tensorflow.org/tutorials/keras/save_and_load
* Saving model can be done by `model.save("savedirpath")` or `model.save("h5filepath.h5")`
* Loading model can be done by `tf.keras.models.load_model("savedirpath")` or `tf.keras.models.load_model("h5filepath.h5")`