# Chapter 3 Exercise
Using what we have covered so far, try to build your own neural network regression models with TensorFlow. Here are the tasks:

1. Create your own regression dataset (or make the one we created in "Create data to view and fit" bigger) and fit a model to it.
2. Try building a neural network with 4 Dense layers and fitting it to your own regression dataset. How does it perform?
3. Try to improve the results we got on the insurance dataset. Some things you might want to try include:
 * Building a larger model (how does one with 4 Dense layers go?).
 * Increasing the number of units in each layer.
 * Looking up the documentation of Adam to find out what the first parameter is. What happens if you increase it by 10x?
 * What happens if you train for longer (say 300 epochs instead of 200)?

4. Import the Boston housing dataset from TensorFlow's tf.keras.datasets and model it.

Here is some data: https://raw.githubusercontent.com/ywchiu/riii/master/data/house-prices.csv
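
For task 1, a synthetic regression dataset can be put together with NumPy before any model is built. The sketch below is only one possible setup: the linear relationship (y = 2x + 10), the noise level, the 80/20 split, and every variable name are assumptions made for illustration, not part of the original exercise.

```python
import numpy as np

# Hypothetical synthetic dataset: y = 2x + 10 plus Gaussian noise
rng = np.random.default_rng(42)
X_syn = np.arange(-100, 100, 0.5, dtype=np.float32)  # 400 feature values
noise = rng.normal(loc=0.0, scale=5.0, size=X_syn.shape).astype(np.float32)
y_syn = 2.0 * X_syn + 10.0 + noise

# Shuffle before splitting so both sets cover the whole range of X
indices = rng.permutation(len(X_syn))
split = int(0.8 * len(X_syn))
train_idx, test_idx = indices[:split], indices[split:]

# Keras Dense layers expect 2-D input of shape (samples, features)
X_syn_train = X_syn[train_idx].reshape(-1, 1)
X_syn_test = X_syn[test_idx].reshape(-1, 1)
y_syn_train = y_syn[train_idx]
y_syn_test = y_syn[test_idx]
```

The resulting arrays can be passed to the fit and evaluate methods of a Keras model in the same way as the house prices data below.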

In [None]:
# Import required libraries
import tensorflow as tf
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
# Read in the house prices dataset
prices = pd.read_csv("https://raw.githubusercontent.com/ywchiu/riii/master/data/house-prices.csv")
prices

To prepare the data, we can borrow a few classes from the scikit-learn library.

In [None]:
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.model_selection import train_test_split

# Create a column transformer
ct = make_column_transformer(
    (MinMaxScaler(), ["SqFt", "Bedrooms", "Bathrooms"]), # scale the values in these columns to between 0 and 1
    (OneHotEncoder(handle_unknown="ignore"), ["Offers", "Brick"]),
)

# Create the X and y 
X = prices.drop("Price", axis=1)
y = prices["Price"]

# Build our train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the column transformer to our training data
ct.fit(X_train)

# Transform training and test data with normalization (MinMaxScaler) and OneHotEncoder
X_train_normal = ct.transform(X_train)
X_test_normal = ct.transform(X_test)

# What does our data look like?
X_train_normal[0]
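
To make the preprocessing step less of a black box, here is a minimal NumPy sketch of the arithmetic behind min-max scaling and one-hot encoding. It is not scikit-learn's implementation, and the square-footage numbers and category labels are made up for illustration. It also shows why the transformer is fitted on the training data only: a test value above the training maximum scales to more than 1, and a category that was not seen during fitting becomes a row of zeros, which mirrors handle_unknown="ignore".

```python
import numpy as np

# Hypothetical square-footage values (not taken from the dataset)
train_sqft = np.array([1500.0, 2000.0, 2500.0])
test_sqft = np.array([1750.0, 2750.0])

# "Fit": learn the minimum and maximum from the training data only
sqft_min, sqft_max = train_sqft.min(), train_sqft.max()

def min_max_scale(values, lo, hi):
    """Rescale values so that lo maps to 0 and hi maps to 1."""
    return (values - lo) / (hi - lo)

# "Transform": apply the training statistics to both sets
train_scaled = min_max_scale(train_sqft, sqft_min, sqft_max)  # [0.0, 0.5, 1.0]
test_scaled = min_max_scale(test_sqft, sqft_min, sqft_max)    # [0.25, 1.25]

# One-hot encoding: one column per category seen during fitting
brick_categories = np.array(["No", "Yes"])       # hypothetical fitted categories
brick_values = np.array(["Yes", "No", "Maybe"])  # "Maybe" was never seen
brick_one_hot = (brick_values[:, None] == brick_categories[None, :]).astype(float)
# Rows: "Yes" -> [0, 1], "No" -> [1, 0], "Maybe" -> [0, 0]
```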

In [None]:
# Set random seed
tf.random.set_seed(42)

# 1. Create the model
ppredictions_model_1 = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1),
])

# 2. Compile the model
ppredictions_model_1.compile(loss=tf.keras.losses.mae,
                             optimizer=tf.keras.optimizers.Adam(),
                             metrics=["mae"])

# 3. Fit the model
ppredictions_model_1.fit(X_train_normal, y_train, epochs=100)

In [None]:
# Check the results of price prediction model 1 on the test data
history = ppredictions_model_1.evaluate(X_test_normal, y_test)

## Trying to predict with 4 layers

In [None]:
# Set random seed
tf.random.set_seed(42)

# 1. Create the model
ppredictions_model_2 = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(50),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1),
])

# 2. Compile the model
ppredictions_model_2.compile(loss=tf.keras.losses.mae,
                             optimizer=tf.keras.optimizers.Adam(),
                             metrics=["mae"])

# 3. Fit the model
ppredictions_model_2.fit(X_train_normal, y_train, epochs=100)

In [None]:
# Check the results of price prediction model 2 on the test data
history2 = ppredictions_model_2.evaluate(X_test_normal, y_test)

- Model 1, price prediction MAE = 127064
- Model 2, price prediction MAE = 17466

Model 2, with 4 layers, performed far better than model 1 on the test set.
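
The MAE (mean absolute error) reported above is the average absolute difference between the predicted and the actual prices, so it is expressed in the same units as the price. A minimal sketch of the calculation, using made-up prices rather than values from the dataset:

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical actual and predicted prices
actual = [100000, 150000, 200000]
predicted = [110000, 140000, 230000]

# Absolute errors are 10000, 10000 and 30000, so the MAE is 50000 / 3
mae = mean_absolute_error(actual, predicted)
```

Read this way, a test MAE of 17466 means that the predictions of model 2 are off by about 17,466 price units on average.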

## Trying to make other changes to the model

Model 3: 200 epochs, learning rate = 0.001 (this is Adam's default, so only the number of epochs changes compared with model 2).

In [None]:
# Set random seed
tf.random.set_seed(42)

# 1. Create the model
ppredictions_model_3 = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(50),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1),
])

# 2. Compile the model
ppredictions_model_3.compile(loss=tf.keras.losses.mae,
                             optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                             metrics=["mae"])

# 3. Fit the model
ppredictions_model_3.fit(X_train_normal, y_train, epochs=200)

In [None]:
# Check the results of price prediction model 3 on the test data
history3 = ppredictions_model_3.evaluate(X_test_normal, y_test)

- Model 1, price prediction MAE = 127064
- Model 2, price prediction MAE = 17466
- Model 3, price prediction MAE = 10473

Model 3 is the best model so far.

### Last Experiment

Model 4: 300 epochs, learning rate = 0.01 (10x Adam's default).

In [None]:
# Set random seed
tf.random.set_seed(42)

# 1. Create the model
ppredictions_model_4 = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(50),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1),
])

# 2. Compile the model
ppredictions_model_4.compile(loss=tf.keras.losses.mae,
                             optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
                             metrics=["mae"])

# 3. Fit the model
ppredictions_model_4.fit(X_train_normal, y_train, epochs=300)

In [None]:
# Check the results of price prediction model 4 on the test data
history4 = ppredictions_model_4.evaluate(X_test_normal, y_test)

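- Model 1, price prediction MAE = 127064
- Model 2, price prediction MAE = 17466
- Model 3, price prediction MAE = 10473
- Model 4, price prediction MAE = 11836

Model 3 remains the best model. Model 4 scores slightly worse on the test set despite training for longer with a 10x larger learning rate. This may indicate overfitting, although confirming that would require comparing its training MAE with its test MAE.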