## Boston Housing Regression Model By Alan Altonchi
**Day 2 of learning how to use TensorFlow and what Machine Learning is**

Dataset: https://keras.io/api/datasets/boston_housing/

This is a **Regression** problem because the output is a single continuous value: the predicted `Median value of owner-occupied homes in $1000's`

**Introduction:** Hello, my name is Alan. I am a passionate self-taught programmer who loves learning new things. This is my journey as I learn about **Machine Learning** and how to make use of its subfield **Deep Learning**. I will do so by learning how to use **TensorFlow** from scratch.

### Imports

In [1]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from IPython.display import display
from tensorflow.keras import Sequential, losses, optimizers, callbacks, layers
from sklearn.metrics import r2_score

#### Helpful Functions

In [7]:
def displayResults(labels, predictions, evaluations):
    # Unpack the (train, test) pairs
    (y_train, y_test) = labels
    (y_train_predict, y_test_predict) = predictions
    # Each evaluation is [loss, mae], as returned by model.evaluate
    (eval_train, eval_test) = evaluations

    # RMSE and R2 on the training set; squeeze drops the extra prediction axis
    rmse = np.sqrt(tf.metrics.mean_squared_error(tf.constant(y_train), tf.squeeze(y_train_predict)))
    r2 = r2_score(y_train, y_train_predict)

    # Same metrics on the test set
    rmse_test = np.sqrt(tf.metrics.mean_squared_error(tf.constant(y_test), tf.squeeze(y_test_predict)))
    r2_test = r2_score(y_test, y_test_predict)

    # One row per set, shown as a table
    model_results = [["Training", rmse, r2, eval_train[0], eval_train[1]],
                     ["Test", rmse_test, r2_test, eval_test[0], eval_test[1]]]

    all_results = pd.DataFrame(model_results, columns=["Set", "RMSE", "R2", "Loss", "MAE"])
    display(all_results)
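
To make the RMSE and R2 columns less of a black box, here is a minimal plain-Python sketch of what those two numbers measure. The helper names `rmse` and `r2` and the sample prices are my own illustration, not part of the notebook or the dataset.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical prices in $1000s, not taken from the dataset
y_true = [24.0, 21.6, 34.7, 33.4]
y_pred = [25.0, 20.6, 33.7, 34.4]

print("RMSE:", rmse(y_true, y_pred))
print("R2:", r2(y_true, y_pred))
```

RMSE is in the same unit as the label (here $1000s), so lower is better. R2 is unitless, and values closer to 1 mean the predictions explain more of the variation in the labels.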

#### Get the data & normalize it

In [2]:
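# Load the data with an 80/20 train/test split
(X_train, y_train), (X_test, y_test) = keras.datasets.boston_housing.load_data(test_split=0.2, seed=42)
# Fit the scaler on the training data only, then apply the same scaling to both sets
scaler = MinMaxScaler()
scaler.fit(X_train)
X_train = tf.constant(scaler.transform(X_train))
X_test = tf.constant(scaler.transform(X_test))
# Convert the labels to tensors as well
y_train = tf.constant(y_train)
y_test = tf.constant(y_test)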
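
As a rough illustration of what min-max scaling does, the sketch below rescales each column with `(x - min) / (max - min)`, where the min and max come from the training rows only. The functions `fit_min_max` and `transform_min_max` and the sample rows are hypothetical, and the sketch assumes every column has a max larger than its min. It is not how `MinMaxScaler` is implemented internally.

```python
def fit_min_max(rows):
    """Return per-column (min, max) pairs learned from the given rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]

def transform_min_max(rows, stats):
    """Scale each value with (x - min) / (max - min) using the fitted stats."""
    return [
        [(x - lo) / (hi - lo) for x, (lo, hi) in zip(row, stats)]
        for row in rows
    ]

# Hypothetical two-feature rows, not taken from the dataset
train_rows = [[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]]
test_rows = [[2.0, 40.0]]

stats = fit_min_max(train_rows)  # learned from the training rows only
train_scaled = transform_min_max(train_rows, stats)
test_scaled = transform_min_max(test_rows, stats)

print(train_scaled)
print(test_scaled)
```

Because the statistics come from the training rows, a test value larger than the training maximum ends up above 1. That is expected: the test set should not influence how the data is scaled.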

#### Create the model

**Sequential: Layer & Neuron Setup**
* `Input` 13 neurons because our data has 13 attributes
* `Output` 1 neuron because we're only interested in one value
* Estimation of neurons in hidden layers = (Input + Output)/2 = (13 + 1)/2 = 7
* Try expanding the hidden layers until the model no longer improves
* Try different neuron combinations in the hidden layers while staying around 7
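
Reading the estimate as the average of the input and output sizes, the arithmetic can be sketched as below. The variable names and the list of candidate layouts are my own examples of "staying around 7", not layouts that were all tested in this notebook.

```python
n_inputs = 13   # one input per attribute in the dataset
n_outputs = 1   # a single predicted value

# Rule of thumb: start near the average of the input and output sizes
hidden_estimate = (n_inputs + n_outputs) // 2
print("Estimated hidden neurons:", hidden_estimate)

# Hypothetical hidden-layer layouts to try, each close to the estimate
candidates = [(7,), (6, 2), (5, 3), (4, 4)]
for widths in candidates:
    print(widths, "-> total hidden neurons:", sum(widths))
```

This is only a starting point for experiments. The layout that works best still has to be found by training and comparing results.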

**Dense Layers**
* Activation `relu` seems to give good results

In [8]:
model = Sequential([
    layers.Dense(13),
    layers.Dense(6, activation='relu'),
    layers.Dense(2, activation='relu'),
    layers.Dense(1)
])

model.compile(loss=losses.huber,
              optimizer=optimizers.Adam(learning_rate=0.1),
              metrics=['mae'])

# Early stopping callback
early_stopping_callback = callbacks.EarlyStopping(monitor='loss', patience=50)
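
The idea behind `patience` can be sketched in plain Python: training stops once the monitored value has failed to improve for that many epochs in a row. The function `epochs_run_with_early_stopping` and the loss values are hypothetical, and this is a simplified model of the behaviour, not the Keras implementation.

```python
def epochs_run_with_early_stopping(loss_per_epoch, patience):
    """Return how many epochs run before training would stop.

    Training stops once `patience` consecutive epochs fail to beat
    the best loss seen so far.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(loss_per_epoch, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epoch
    # The loss kept improving often enough, so every epoch ran
    return len(loss_per_epoch)

# Hypothetical loss curve: improves for three epochs, then plateaus
losses_seen = [5.0, 4.0, 3.5, 3.6, 3.7, 3.5, 3.8, 3.9]

print(epochs_run_with_early_stopping(losses_seen, patience=3))
print(epochs_run_with_early_stopping(losses_seen, patience=50))
```

Note that `monitor='loss'` watches the training loss. Watching `val_loss` instead would require passing validation data to `model.fit`, which this notebook does not do.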


#### Train & Visualize

In [16]:
# Fit the model
model.fit(X_train, y_train, epochs=1500,callbacks=[early_stopping_callback], verbose=0)
# Grab the labels so we can pass them to displayResults
labels = (y_train, y_test)
# Run predictions on the training & test data
predictions = (model.predict(X_train), model.predict(X_test))
# Run evaluations on the training & test data
evaluations = (model.evaluate(X_train,y_train), model.evaluate(X_test,y_test))

displayResults(labels, predictions, evaluations)



| Set | RMSE | R2 | Loss | MAE |
| --- | --- | --- | --- | --- |
|Training|4.385162|0.780866|2.212242|2.654651|
|Test|5.020238|0.646038|2.591544|3.052538|
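
The Loss column in the output above is the Huber loss set in `model.compile`, while MAE is the extra metric. A minimal sketch of the Huber loss for a single error is shown below, assuming `delta=1.0`, which I believe is the Keras default. The function `huber` and the residuals are my own illustration.

```python
def huber(error, delta=1.0):
    """Huber loss for one residual: quadratic near zero, linear beyond delta."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error ** 2
    return delta * (abs_error - 0.5 * delta)

# Hypothetical residuals in $1000s, not taken from the model
residuals = [0.5, -1.0, 3.0, -4.0]

mean_huber = sum(huber(e) for e in residuals) / len(residuals)
mean_abs = sum(abs(e) for e in residuals) / len(residuals)

print("Mean Huber loss:", mean_huber)
print("Mean absolute error:", mean_abs)
```

For errors larger than `delta`, the Huber loss is the absolute error minus `0.5 * delta`, which would explain why Loss sits a little below MAE in the results.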


#### Only Visualize

In [10]:
displayResults(labels, predictions, evaluations)

| Set | RMSE | R2 | Loss | MAE |
| --- | --- | --- | --- | --- |
|Training|4.38908|0.780475|2.083384|2.525604|
|Test|4.455677|0.721173|2.209687|2.665016|


#### Results
**Best results achieved (lowest RMSE):**

| Set | RMSE | R2 | Loss | MAE |
| --- | --- | --- | --- | --- |
|Training|3.923683|0.824561|2.060028|2.520303|
|Test|4.414260|0.726332|2.379971|2.850348|

In [14]:
displayResults(labels, predictions, evaluations)

| Set | RMSE | R2 | Loss | MAE |
| --- | --- | --- | --- | --- |
|Training|3.923683|0.824561|2.060028|2.520303|
|Test|4.41426|0.726332|2.379971|2.850348|
