# Project 3: Advanced Training (from scratch) - Part 1

In this notebook, you will adapt your code from Jupyter Notebook 3 to train a feedforward neural network (FNN) on two different datasets: 

- Lincoln Home Sales: contains data on single-family homes sold between January 2016 and August 2022, with the following columns (the last, Price, is the prediction target):
    - Overall Rating of Home Condition
    - Rating of Building Material Quality
    - Year Remodeled
    - Remodel Type
    - Total Living Area
    - Year Built
    - Garage Capacity
    - Bedroom Count
    - Total Basement Area
    - Minimally Finished Basement Area
    - Completely Finished Basement Area
    - Fireplace Count
    - Number of Fixtures
    - Pool Area
    - Total Acres
    - Sale Date
    - Price

- MNIST: contains handwritten digits 0-9

In this notebook, Part 1, we'll focus on the Lincoln Home Sales data.


# Jupyter Notebook 3

### **Problem 1A**
In the code cell below, copy and paste the following functions from your Jupyter Notebook 3:

- loss
- loss_grad_final_layer
- sigmoid
- ReLU
- sigmoid_derivative
- ReLU_derivative
- model_compile
- forward_pass
- backpropagation
- model_gradients
- model_update
- model_train

### **Problem 1B**

Make sure your ReLU function and its derivative work for vector inputs.
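
As a point of comparison, here is a minimal sketch of element-wise versions built on NumPy. The names `relu_sketch` and `relu_derivative_sketch` are hypothetical (they are not the functions from Jupyter Notebook 3), and treating the derivative at exactly 0 as 0 is an assumed convention.

```python
import numpy as np

def relu_sketch(z):
    """Element-wise ReLU; accepts scalars, vectors, and matrices."""
    return np.maximum(0, z)

def relu_derivative_sketch(z):
    """Element-wise ReLU derivative: 1 where z > 0, otherwise 0.
    Assumption: the derivative at exactly z = 0 is taken to be 0."""
    return (np.asarray(z) > 0).astype(float)

z = np.array([-2.0, 0.0, 3.5])
print(relu_sketch(z))             # negative entries are clipped to 0
print(relu_derivative_sketch(z))  # 1.0 only where the input is positive
```

A quick way to check your own functions is to pass in a small array like `z` and confirm that the output has the same shape as the input.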

### **Problem 1C**

Adapt your code above so that the user can choose which metrics are printed. Also change it so that metrics are printed every 10 iterations instead of every 500.

In [17]:
def model_train(X, Y, W2, b2, W3, b3, activation, activation_derivative, learning_rate, epochs, metrics=['loss', 'accuracy']):
    """
        Add your code below
    """

    return W2, b2, W3, b3
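
The metric selection in Problem 1C could be handled by a small helper along the following lines. This is only a sketch: `format_metrics` is a hypothetical name, mean squared error stands in for whatever your `loss` function computes, and the accuracy calculation assumes binary labels with predictions thresholded at 0.5.

```python
import numpy as np

def format_metrics(iteration, Y_true, Y_pred, metrics=('loss',), every=10):
    """Return a metrics line every `every` iterations, otherwise None."""
    if iteration % every != 0:
        return None
    parts = [f'iteration {iteration}']
    if 'loss' in metrics:
        # Assumption: mean squared error stands in for the notebook's loss.
        parts.append(f'loss: {np.mean((Y_pred - Y_true) ** 2):.4f}')
    if 'accuracy' in metrics:
        # Assumption: binary labels, predictions thresholded at 0.5.
        correct = (Y_pred >= 0.5) == (Y_true >= 0.5)
        parts.append(f'accuracy: {np.mean(correct):.4f}')
    return ', '.join(parts)

Y_true = np.array([1.0, 0.0, 1.0, 0.0])
Y_pred = np.array([0.9, 0.2, 0.4, 0.1])
for it in range(21):
    line = format_metrics(it, Y_true, Y_pred, metrics=['loss', 'accuracy'])
    if line is not None:
        print(line)  # printed at iterations 0, 10, and 20
```

Checking `'loss' in metrics` rather than hard-coding the printout is what lets a caller drop accuracy for a regression problem.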

# Lincoln Home Sales

Let's load the Lincoln Home Sales data and split it into `X_train` and `Y_train`. Our goal is to build a model that uses the first 16 features of the data to predict the price.

We're going to need to make some adjustments to your training functions, specifically by creating a training loop that performs stochastic gradient descent.

In [18]:
import numpy as np

LincolnHomeSales = np.loadtxt('LincolnHomeSales.csv', delimiter=',',skiprows=1) #skiprows=1 skips the first row which contains column headers

X_train = LincolnHomeSales[:,0:-1] #selects every column except the last (the 16 features)
Y_train = LincolnHomeSales[:,-1] #selects the last column of the data

del LincolnHomeSales #deletes the variable to free up memory

### **Problem 2A**

How many samples are in the Lincoln Home Sales Data?

### **Problem 2B**

Normalize the data by dividing each feature of the Lincoln Home Sales Data by the maximum of the feature so that the values of each feature are between 0 and 1.
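
The idea can be illustrated on a small made-up array. The values below are illustrative only and are not taken from the dataset; the sketch also assumes every feature is non-negative with a nonzero maximum, since otherwise the result would not land in [0, 1].

```python
import numpy as np

# Toy data: 3 samples, 3 features (illustrative values only).
X = np.array([[1500.0, 3.0, 0.25],
              [2400.0, 4.0, 0.50],
              [1200.0, 2.0, 0.10]])

feature_max = X.max(axis=0)  # one maximum per column (feature)
X_scaled = X / feature_max   # broadcasting divides each column by its own maximum

print(feature_max)
print(X_scaled)
```

Taking the maximum with `axis=0` matters here: without it, `X.max()` returns a single number and every feature would be divided by the largest value in the whole array.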

### **Problem 3A**

Mean Squared Error (MSE) and the ReLU activation function are well suited to training a model to predict a home's price. MSE is a standard loss for regression problems. ReLU is a versatile activation, and here it is particularly helpful because its output is non-negative and unbounded above, which matches a target like price that is a possibly large positive number.
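
For reference, MSE and its gradient with respect to the predictions can be sketched as follows. The names `mse` and `mse_grad` are hypothetical, the prices are made up, and your `loss` function from Jupyter Notebook 3 may use a different scaling (for example a factor of 1/2).

```python
import numpy as np

def mse(y_pred, y_true):
    """Mean of the squared prediction errors."""
    return np.mean((y_pred - y_true) ** 2)

def mse_grad(y_pred, y_true):
    """Gradient of mse with respect to each prediction."""
    return 2.0 * (y_pred - y_true) / y_pred.size

# Illustrative prices only, not taken from the dataset.
y_true = np.array([250000.0, 310000.0, 180000.0])
y_pred = np.array([240000.0, 330000.0, 180000.0])

print(mse(y_pred, y_true))
print(mse_grad(y_pred, y_true))
```

Because the errors are squared, unscaled prices produce very large loss values and gradients.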

Run the code in the code cell below to use your training functions from Jupyter Notebook 3 to train a model on the Lincoln Home Sales data.

Note that we must leave off the accuracy metric: accuracy counts exact matches between predictions and labels, which doesn't make sense for a continuous target like price.

In [None]:
# define the neural network
num_input = 16 # number of input neurons
num_hidden = 8 # number of hidden neurons
num_output = 1 # number of output neurons
activation = ReLU # possible values are sigmoid or ReLU
activation_derivative = ReLU_derivative # possible values are sigmoid_derivative or ReLU_derivative
learning_rate = 0.1

W2, b2, W3, b3 = model_compile(num_input, num_hidden, num_output, activation, learning_rate)
W2, b2, W3, b3 = model_train(X_train, Y_train, 
                             W2, b2, W3, b3, 
                             activation, activation_derivative, 
                             learning_rate, 
                             10, 
                             metrics=['loss']) # will take a while to run

print(f'W2: {W2}, b2: {b2}, W3: {W3}, b3: {b3}')

### **Problem 3B**

Why did that take so long to run? Explain in a sentence.

### **Problem 3C**

Adapt your `model_train` function so that it performs stochastic gradient descent.

**Note:** Be sure to partition your data into batches of size `batch_size`. If your batch size does not divide the total number of samples evenly, one of your batches will have a different number of samples than the others.
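
One possible way to do the partitioning is sketched below. The generator name `iterate_minibatches` and its `rng` parameter are hypothetical, and reshuffling each time it is called is an assumed design choice. In this sketch the leftover samples form a smaller final batch; folding them into the previous batch instead would give one larger batch.

```python
import numpy as np

def iterate_minibatches(X, Y, batch_size=32, rng=None):
    """Yield (X_batch, Y_batch) pairs that together cover every sample once."""
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(X.shape[0])  # shuffled sample indices
    for start in range(0, X.shape[0], batch_size):
        idx = order[start:start + batch_size]  # the final slice may be shorter
        yield X[idx], Y[idx]

# Toy data: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2).astype(float)
Y = np.arange(10).astype(float)

batches = list(iterate_minibatches(X, Y, batch_size=4, rng=np.random.default_rng(0)))
sizes = [xb.shape[0] for xb, yb in batches]
print(sizes)  # [4, 4, 2]
```

Inside `model_train`, a loop over epochs would call such a generator once per epoch and perform one parameter update per batch.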

In [22]:
def model_train(X, Y, W2, b2, W3, b3, activation, activation_derivative, learning_rate, batch_size=32, epochs=10, metrics=['loss', 'accuracy']): # set default values for batch_size and epochs
    """
        Add your code below
    """

    return W2, b2, W3, b3

In [None]:
# test your new model_train function here with various batch sizes

# define the neural network
num_input = 16 # number of input neurons
num_hidden = 8 # number of hidden neurons
num_output = 1 # number of output neurons
activation = ReLU # possible values are sigmoid or ReLU
activation_derivative = ReLU_derivative # possible values are sigmoid_derivative or ReLU_derivative
learning_rate = 0.01

W2, b2, W3, b3 = model_compile(num_input, num_hidden, num_output, activation, learning_rate)
W2, b2, W3, b3 = model_train(X_train, Y_train, 
                             W2, b2, W3, b3, 
                             activation, activation_derivative, 
                             learning_rate, 
                             batch_size=32,
                             epochs=100,
                             metrics=['loss']) # will take a while to run

print(f'W2: {W2}, b2: {b2}, W3: {W3}, b3: {b3}')

### **Problem 4**

Without the accuracy metric it's difficult to evaluate our model's performance. Discuss a few possible ways we could evaluate the model's performance in 3-5 sentences below.
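
As one starting point for this discussion, common regression error measures can be computed directly from predictions. The sketch below uses a hypothetical helper `regression_report` and made-up prices; in practice these numbers are most informative when computed on samples held out from training.

```python
import numpy as np

def regression_report(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for a set of predictions."""
    err = y_pred - y_true
    rmse = np.sqrt(np.mean(err ** 2))  # root mean squared error, in price units
    mae = np.mean(np.abs(err))         # mean absolute error, in price units
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    r2 = 1.0 - ss_res / ss_tot         # fraction of variance explained
    return {'rmse': rmse, 'mae': mae, 'r2': r2}

# Illustrative prices only, not taken from the dataset.
y_true = np.array([200000.0, 300000.0, 400000.0])
y_pred = np.array([210000.0, 290000.0, 400000.0])

report = regression_report(y_true, y_pred)
print(report)
```

RMSE and MAE are expressed in the same units as the target, which makes them easier to interpret than raw MSE.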

### **Problem 5**

Implement one of the strategies you discussed in Problem 4 and train your model to a "satisfactory" performance level. 