<a href="https://colab.research.google.com/github/samirgadkari/DS-Unit-4-Sprint-3-Neural-Networks/blob/master/module3-Intro-to-Keras/LS_DS_433_Keras_Assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Use the Keras Library to build a Multi-Layer Perceptron Model on the Boston Housing dataset

- The Boston Housing dataset comes with the Keras library so use Keras to import it into your notebook. 
- Normalize the data (all features should have roughly the same scale)
- Import the type of model and layers that you will need from Keras.
- Instantiate a model object and use `model.add()` to add layers to your model
- Since this is a regression model you will have a single output node in the final layer.
- Use activation functions that are appropriate for this task
- Compile your model
- Fit your model and report its accuracy in terms of Mean Squared Error
- Use the history object that is returned from model.fit to make graphs of the model's loss or train/validation accuracies by epoch. 
- Run this same data through a linear regression model. Which achieves higher accuracy?
- Do a little bit of feature engineering and see how that affects your neural network model. (you will need to change your model to accept more inputs)
- After feature engineering, which model sees a greater accuracy boost due to the new features?

In [0]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.linear_model import LinearRegression

import keras
from keras.datasets import boston_housing
from keras.models import Sequential
from keras.layers import Dense

In [0]:
np.random.seed(101)  # Set a fixed seed, so our values each time we run
                     # through the same steps are similar. There is still
                     # randomness inside the keras functions, so the values
                     # will not always be exactly the same.

In [92]:
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()
print('train_data.shape:', train_data.shape)
print('train_targets.shape:', train_targets.shape)
print('test_data.shape:', test_data.shape)
print('test_targets.shape:', test_targets.shape)

train_data.shape: (404, 13)
train_targets.shape: (404,)
test_data.shape: (102, 13)
test_targets.shape: (102,)


In [93]:
train_X_df = pd.DataFrame(data=train_data)
train_X_df.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12
0,1.23247,0.0,8.14,0.0,0.538,6.142,91.7,3.9769,4.0,307.0,21.0,396.9,18.72
1,0.02177,82.5,2.03,0.0,0.415,7.61,15.7,6.27,2.0,348.0,14.7,395.38,3.11
2,4.89822,0.0,18.1,0.0,0.631,4.97,100.0,1.3325,24.0,666.0,20.2,375.52,3.26
3,0.03961,0.0,5.19,0.0,0.515,6.037,34.5,5.9853,5.0,224.0,20.2,396.9,8.01
4,3.69311,0.0,18.1,0.0,0.713,6.376,88.4,2.5671,24.0,666.0,20.2,391.43,14.65


In [94]:
train_X_df.dtypes

0     float64
1     float64
2     float64
3     float64
4     float64
5     float64
6     float64
7     float64
8     float64
9     float64
10    float64
11    float64
12    float64
dtype: object

In [95]:
train_targets[:20]

array([15.2, 42.3, 50. , 21.1, 17.7, 18.5, 11.3, 15.6, 15.6, 14.4, 12.1,
       17.9, 23.1, 19.9, 15.7,  8.8, 50. , 22.5, 24.1, 27.5])

In [96]:
train_targets[0].dtype

dtype('float64')

In [97]:
test_data[:10]

array([[1.80846e+01, 0.00000e+00, 1.81000e+01, 0.00000e+00, 6.79000e-01,
        6.43400e+00, 1.00000e+02, 1.83470e+00, 2.40000e+01, 6.66000e+02,
        2.02000e+01, 2.72500e+01, 2.90500e+01],
       [1.23290e-01, 0.00000e+00, 1.00100e+01, 0.00000e+00, 5.47000e-01,
        5.91300e+00, 9.29000e+01, 2.35340e+00, 6.00000e+00, 4.32000e+02,
        1.78000e+01, 3.94950e+02, 1.62100e+01],
       [5.49700e-02, 0.00000e+00, 5.19000e+00, 0.00000e+00, 5.15000e-01,
        5.98500e+00, 4.54000e+01, 4.81220e+00, 5.00000e+00, 2.24000e+02,
        2.02000e+01, 3.96900e+02, 9.74000e+00],
       [1.27346e+00, 0.00000e+00, 1.95800e+01, 1.00000e+00, 6.05000e-01,
        6.25000e+00, 9.26000e+01, 1.79840e+00, 5.00000e+00, 4.03000e+02,
        1.47000e+01, 3.38920e+02, 5.50000e+00],
       [7.15100e-02, 0.00000e+00, 4.49000e+00, 0.00000e+00, 4.49000e-01,
        6.12100e+00, 5.68000e+01, 3.74760e+00, 3.00000e+00, 2.47000e+02,
        1.85000e+01, 3.95150e+02, 8.44000e+00],
       [2.79570e-01, 0.00000e+

In [98]:
test_targets[:10]

array([ 7.2, 18.8, 19. , 27. , 22.2, 24.5, 31.2, 22.9, 20.5, 23.2])

### All of our inputs/outputs are floats

## Normalize the data

In [0]:
def wrangle(data):
  return MinMaxScaler().fit_transform(data)

In [0]:
train_normalized_data = wrangle(train_data)
test_normalized_data  = wrangle(test_data)

In [101]:
model = Sequential()
model.add(Dense(3, input_dim=13, activation='relu'))
model.add(Dense(2, activation='relu'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['mae'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_19 (Dense)             (None, 3)                 42        
_________________________________________________________________
dense_20 (Dense)             (None, 2)                 8         
_________________________________________________________________
dense_21 (Dense)             (None, 1)                 3         
Total params: 53
Trainable params: 53
Non-trainable params: 0
_________________________________________________________________


In [102]:
model.fit(train_data, train_targets, epochs=50)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


<keras.callbacks.History at 0x7fcc3442de48>

In [103]:
scores = model.evaluate(train_normalized_data, train_targets)
print(f'{model.metrics_names[0]}: {scores[0]}')
print(f'{model.metrics_names[1]}: {scores[1]}')
scores  = model.evaluate(test_normalized_data, test_targets)
print('scores:', scores)
print(f'{model.metrics_names[0]}: {scores[0]}')
print(f'{model.metrics_names[1]}: {scores[1]}')

loss: 84.62254412811582
mean_absolute_error: 6.650726932110173
scores: [83.68787787942325, 6.534280515184589]
loss: 83.68787787942325
mean_absolute_error: 6.534280515184589


## Linear Regression

In [0]:
model = LinearRegression().fit(train_normalized_data, train_targets)
model.predict(test_normalized_data)

## Use the Keras Library to build an image recognition network using the Fashion-MNIST dataset (also comes with keras)

- Load and preprocess the image data similar to how we preprocessed the MNIST data in class.
- Make sure to one-hot encode your category labels
- Make sure to have your final layer have as many nodes as the number of classes that you want to predict.
- Try different hyperparameters. What is the highest accuracy that you are able to achieve.
- Use the history object that is returned from model.fit to make graphs of the model's loss or train/validation accuracies by epoch. 
- Remember that neural networks fall prey to randomness so you may need to run your model multiple times (or use Cross Validation) in order to tell if a change to a hyperparameter is truly producing better results.

In [0]:
##### Your Code Here #####

## Stretch Goals:

- Use Hyperparameter Tuning to make the accuracy of your models as high as possible. (error as low as possible)
- Use Cross Validation techniques to get more consistent results with your model.
- Use GridSearchCV to try different combinations of hyperparameters. 
- Start looking into other types of Keras layers for CNNs and RNNs maybe try and build a CNN model for fashion-MNIST to see how the results compare.