<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>

# Neural Network Framework (Keras)

## *Data Science Unit 4 Sprint 2 Assignment 3*

## Use the Keras Library to build a Multi-Layer Perceptron Model on the Boston Housing dataset

- The Boston Housing dataset comes with the Keras library, so use Keras to import it into your notebook.
- Normalize the data (all features should have roughly the same scale)
- Import the type of model and layers that you will need from Keras.
- Instantiate a model object and use `model.add()` to add layers to your model
- Since this is a regression model you will have a single output node in the final layer.
- Use activation functions that are appropriate for this task
- Compile your model
- Fit your model and report its accuracy in terms of Mean Squared Error
- Use the history object that is returned from `model.fit` to make graphs of the model's training and validation loss by epoch.
- Run this same data through a linear regression model. Which achieves higher accuracy?
- Do a little bit of feature engineering and see how that affects your neural network model. (You will need to change your model to accept more inputs.)
- After feature engineering, which model sees a greater accuracy boost due to the new features?
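
As a rough illustration of the feature engineering step in the list above, the sketch below appends two derived columns to a 13-column feature matrix. It assumes the column order given in the dataset description (RM at index 5, LSTAT at index 12); the helper name `add_engineered_features`, the choice of derived features, and the random stand-in data are all hypothetical.

```python
import numpy as np

def add_engineered_features(x):
    """Append two illustrative columns to an (n_samples, 13) feature matrix."""
    rm = x[:, 5]      # RM: average number of rooms per dwelling
    lstat = x[:, 12]  # LSTAT: % lower status of the population
    rm_squared = rm ** 2       # lets a linear model capture curvature in RM
    rm_x_lstat = rm * lstat    # simple interaction term
    return np.column_stack([x, rm_squared, rm_x_lstat])

# Stand-in data with the same shape as x_train (404 rows, 13 features).
rng = np.random.default_rng(0)
x_demo = rng.random((404, 13))
x_demo_fe = add_engineered_features(x_demo)
print(x_demo_fe.shape)  # (404, 15)
```

With two extra columns, the first layer of the network would need to accept 15 inputs instead of 13.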

In [1]:
%%capture
import tensorflow as tf 
from tensorflow import keras
import matplotlib.pyplot as plt

Original dataset: http://lib.stat.cmu.edu/datasets/boston

Variables in order:

| Variable | Description |
| --- | --- |
| CRIM | per capita crime rate by town |
| ZN | proportion of residential land zoned for lots over 25,000 sq.ft. |
| INDUS | proportion of non-retail business acres per town |
| CHAS | Charles River dummy variable (= 1 if tract bounds river; 0 otherwise) |
| NOX | nitric oxides concentration (parts per 10 million) |
| RM | average number of rooms per dwelling |
| AGE | proportion of owner-occupied units built prior to 1940 |
| DIS | weighted distances to five Boston employment centres |
| RAD | index of accessibility to radial highways |
| TAX | full-value property-tax rate per \$10,000 |
| PTRATIO | pupil-teacher ratio by town |
| B | 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town |
| LSTAT | % lower status of the population |
| MEDV | Median value of owner-occupied homes in \$1000's |

In [2]:
# getting the dataset. 
#from tensorflow.keras.datasets import Boston
# documentation on boston housing dataset. 
# https://www.tensorflow.org/api_docs/python/tf/keras/datasets/boston_housing/load_data
boston = tf.keras.datasets.boston_housing

In [11]:
# loading the dataset. load_data() returns a tuple of 
# NumPy arrays: (x_train, y_train), (x_test, y_test)

(x_train, y_train), (x_test, y_test) = boston.load_data()

In [14]:
x_train[:2]

array([[1.23247e+00, 0.00000e+00, 8.14000e+00, 0.00000e+00, 5.38000e-01,
        6.14200e+00, 9.17000e+01, 3.97690e+00, 4.00000e+00, 3.07000e+02,
        2.10000e+01, 3.96900e+02, 1.87200e+01],
       [2.17700e-02, 8.25000e+01, 2.03000e+00, 0.00000e+00, 4.15000e-01,
        7.61000e+00, 1.57000e+01, 6.27000e+00, 2.00000e+00, 3.48000e+02,
        1.47000e+01, 3.95380e+02, 3.11000e+00]])

In [13]:
y_train[:2]

array([15.2, 42.3])

In [15]:
x_train.shape

(404, 13)

In [16]:
x_test.shape

(102, 13)

In [18]:
y_train.shape

(404,)

In [21]:
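# normalizing x_train data using tensorflow
# note: by default tf.keras.utils.normalize L2-normalizes each sample (row), not each feature column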

x_train = tf.keras.utils.normalize(x_train)
x_train[:2]

array([[2.41189924e-03, 0.00000000e+00, 1.59296858e-02, 0.00000000e+00,
        1.05284655e-03, 1.20196720e-02, 1.79453585e-01, 7.78264954e-03,
        7.82785541e-03, 6.00787902e-01, 4.10962409e-02, 7.76718953e-01,
        3.66343633e-02],
       [4.07923050e-05, 1.54587284e-01, 3.80378407e-03, 0.00000000e+00,
        7.77620881e-04, 1.42595058e-02, 2.94184285e-02, 1.17486336e-02,
        3.74757051e-03, 6.52077269e-01, 2.75446433e-02, 7.40857215e-01,
        5.82747215e-03]])

In [22]:
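# normalizing x_test data using tensorflow
# note: by default tf.keras.utils.normalize L2-normalizes each sample (row), not each feature column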

x_test = tf.keras.utils.normalize(x_test)
x_test[:2]

array([[2.67567471e-02, 0.00000000e+00, 2.67795319e-02, 0.00000000e+00,
        1.00460233e-03, 9.51930986e-03, 1.47953215e-01, 2.71449764e-03,
        3.55087716e-02, 9.85368413e-01, 2.98865495e-02, 4.03172511e-02,
        4.29804090e-02],
       [2.07806276e-04, 0.00000000e+00, 1.68719346e-02, 0.00000000e+00,
        9.21972852e-04, 9.96640855e-03, 1.56583689e-01, 3.96667442e-03,
        1.01130477e-02, 7.28139437e-01, 3.00020416e-02, 6.65691367e-01,
        2.73220840e-02]])
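
The assignment asks for all features to end up on roughly the same scale. With its default arguments, `tf.keras.utils.normalize` rescales each row to unit L2 norm, so large-valued columns such as TAX and B still dominate each row in the output above. One common alternative is per-feature standardization using statistics from the training set only. The following is a minimal sketch of that idea on made-up data; the helper name `standardize` and the demo arrays are assumptions, not part of the notebook.

```python
import numpy as np

def standardize(train, test):
    """Scale each column to zero mean and unit variance using training statistics."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (train - mean) / std, (test - mean) / std

# Stand-in data: three features on very different scales.
rng = np.random.default_rng(0)
scales = np.array([1.0, 100.0, 0.01])
train_demo = rng.random((404, 3)) * scales
test_demo = rng.random((102, 3)) * scales

train_std, test_std = standardize(train_demo, test_demo)
print(train_std.mean(axis=0).round(6))  # each column mean is close to 0
print(train_std.std(axis=0).round(6))   # each column std is close to 1
```

Reusing the training mean and standard deviation for the test set keeps information about the test data out of the preprocessing step.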

In [None]:
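# designing the model: one hidden layer and a single linear output node for regression.
# the hidden-layer width (64) and the ReLU activation are illustrative choices.

model = tf.keras.models.Sequential([keras.layers.Dense(64, activation='relu', input_shape=(13,)),
                                   keras.layers.Dense(1)])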

In [None]:
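# compiling the model with mean squared error as the loss and the reported metric.
# the Adam optimizer is an illustrative choice.

model.compile(optimizer = 'adam',
             loss = 'mean_squared_error',
             metrics = ['mean_squared_error'])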

In [None]:
# fitting the model. 

model.fit(x_train, y_train, epochs = 10)

In [None]:
# making predictions using the model (this is regression, so the outputs are predicted prices)

nn_predictions = model.predict(x_test)

print(nn_predictions[4])

In [None]:
# comparing the prediction above with the true value
print(y_test[4])

In [None]:
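# fit again, keeping the history object with per-epoch training and validation loss.
# the test set is used as validation data here for simplicity.
history = model.fit(x_train, y_train,
                    epochs=15,
                    verbose=1,
                    validation_data=(x_test, y_test))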

In [None]:
# using history log to graph loss

# PLOT TRAINING AND VALIDATION LOSS
%matplotlib inline

#-----------------------------------------------------------
# Retrieve the per-epoch loss on the training and validation
# data (a regression model has no accuracy to plot)
#-----------------------------------------------------------
loss=history.history['loss']
val_loss=history.history['val_loss']

epochs=range(len(loss)) # Get number of epochs

#------------------------------------------------
# Plot training and validation loss per epoch
#------------------------------------------------
plt.plot(epochs, loss, 'r', label="Training Loss")
plt.plot(epochs, val_loss, 'b', label="Validation Loss")
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and validation loss')
plt.legend()

In [24]:
# using linear regression to fit the same data

import sklearn

In [25]:
from sklearn.linear_model import LinearRegression

lr = LinearRegression()

In [None]:
# fit linear regression model. 

lr.fit(x_train, y_train)
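
For intuition about what the linear baseline is doing, the sketch below solves an ordinary least squares problem directly with NumPy. It is not scikit-learn's internal implementation, only an illustration of the same objective (minimizing squared error with an intercept); the helper name `fit_ols` and the noise-free demo data are made up.

```python
import numpy as np

def fit_ols(x, y):
    """Return (intercept, weights) minimizing the squared error of x @ weights + intercept."""
    x_aug = np.column_stack([np.ones(len(x)), x])  # column of ones models the intercept
    coef, *_ = np.linalg.lstsq(x_aug, y, rcond=None)
    return coef[0], coef[1:]

# Noise-free stand-in data generated from a known linear rule.
rng = np.random.default_rng(0)
x_demo = rng.random((50, 2))
y_demo = 3.0 + 2.0 * x_demo[:, 0] - 1.0 * x_demo[:, 1]

intercept, weights = fit_ols(x_demo, y_demo)
print(round(float(intercept), 6), weights.round(6))
```

Because the demo targets contain no noise, the recovered intercept and weights should match the generating rule up to floating-point error.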

In [None]:
# use lr for prediction. 

prediction = lr.predict(x_test)

print(prediction[4])

In [None]:
y_test[4]

In [None]:
# getting mean squared error score 

from sklearn.metrics import mean_squared_error

# scikit-learn's argument order is mean_squared_error(y_true, y_pred)
mse = mean_squared_error(y_test, prediction)
print(mse)
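
To compare the two models on equal terms, both sets of predictions need to be scored with the same metric. The sketch below computes mean squared error by hand. One practical detail it guards against: Keras `model.predict` returns an array of shape `(n, 1)` while the targets have shape `(n,)`, and subtracting them without reshaping would broadcast to an `(n, n)` matrix. The helper name `mse_score` and all numbers here are invented for illustration.

```python
import numpy as np

def mse_score(y_true, y_pred):
    """Mean squared error, with predictions reshaped to match the targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float).reshape(y_true.shape)
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up targets and predictions (not results from the notebook).
y_true_demo = np.array([15.2, 42.3, 50.0])
nn_pred_demo = np.array([[16.2], [40.3], [47.0]])  # shape (3, 1), like Keras output
lr_pred_demo = np.array([14.2, 43.3, 49.0])        # shape (3,), like scikit-learn output

nn_mse = mse_score(y_true_demo, nn_pred_demo)
lr_mse = mse_score(y_true_demo, lr_pred_demo)
print("lower MSE:", "neural network" if nn_mse < lr_mse else "linear regression")
```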

In [None]:
# which method has the lower mean squared error score?



## Use the Keras Library to build an image recognition network using the Fashion-MNIST dataset (also comes with keras)

- Load and preprocess the image data similar to how we preprocessed the MNIST data in class.
- Make sure to one-hot encode your category labels
- The number of nodes in your output layer should equal the number of classes you want to predict for Fashion-MNIST.
- Try different hyperparameters. What is the highest accuracy that you are able to achieve?
- Use the history object that is returned from `model.fit` to make graphs of the model's loss or train/validation accuracies by epoch.
- Remember that neural networks fall prey to randomness so you may need to run your model multiple times (or use Cross Validation) in order to tell if a change to a hyperparameter is truly producing better results.
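
The preprocessing and one-hot encoding steps in the list above can be sketched with NumPy alone. Fashion-MNIST images are 28x28 grayscale with 10 classes; the helper name `preprocess` and the random stand-in images below are assumptions, and flattening is only appropriate for a dense (non-convolutional) network.

```python
import numpy as np

def preprocess(images, labels, num_classes=10):
    """Flatten and scale images to [0, 1]; one-hot encode integer labels."""
    x = images.reshape(len(images), -1).astype("float32") / 255.0
    y = np.eye(num_classes, dtype="float32")[labels]  # row i of the identity is the one-hot vector for class i
    return x, y

# Stand-in data shaped like a tiny batch of Fashion-MNIST images.
rng = np.random.default_rng(0)
images_demo = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
labels_demo = np.array([0, 9, 3, 3, 1])

x_demo, y_demo = preprocess(images_demo, labels_demo)
print(x_demo.shape, y_demo.shape)  # (5, 784) (5, 10)
```

With one-hot labels, the output layer would have 10 nodes, one per class.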

In [None]:
##### Your Code Here #####

## Stretch Goals:

- Use Hyperparameter Tuning to make the accuracy of your models as high as possible. (error as low as possible)
- Use Cross Validation techniques to get more consistent results with your model.
- Use GridSearchCV to try different combinations of hyperparameters. 
- Start looking into other types of Keras layers for CNNs and RNNs. Maybe try building a CNN model for Fashion-MNIST to see how the results compare.
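
For the cross-validation stretch goal, one possible starting point is to generate the fold indices by hand and train a fresh model on each split. The sketch below only produces the index splits; the function name `k_fold_indices` is hypothetical, and 404 is used because it matches the Boston training set size shown earlier.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k shuffled, non-overlapping folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    folds = np.array_split(indices, k)  # fold sizes differ by at most one
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

splits = list(k_fold_indices(404, k=5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```

Averaging the validation score across folds gives a steadier estimate than a single run, which helps when judging whether a hyperparameter change is a real improvement.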