# Deep Learning with the Keras Library

**Goals:**

- Learn how to **construct** and train a Keras model on both classification and regression data.

- Specifically, this means configuring the model architecture, compiling the model, and fitting/predicting.

- Learn how to implement machine learning techniques we already know (train/test split, cross-validation) in Keras.

- We'll be using Keras on the front-end and TensorFlow on the backend, meaning we'll write code with Keras but the algorithms will be powered by TensorFlow.

## Keras

In [None]:
#Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
#make_moons and make_regression are imported directly from sklearn.datasets
from sklearn.datasets import make_moons, make_regression

from keras.layers import Dense
from keras.models import Sequential
from keras.callbacks import EarlyStopping
from keras.utils import to_categorical


You should see a message that says "Using TensorFlow backend."

If not, follow these instructions, but only after you've installed TensorFlow.

Navigate to this directory in your terminal: `~/.keras/`

Type: `ls`

You should see the following: `datasets	keras.json`

We're going to edit the keras.json file from the command line. Type: `nano keras.json`

In the dictionary, change the value for the key "backend" to "tensorflow".

Once you're finished, save and exit by following nano's on-screen instructions.

If you haven't installed Keras and TensorFlow yet, follow one of these guides:
- Mac: https://www.pyimagesearch.com/2016/11/14/installing-keras-with-tensorflow-backend/

- PC: https://www.lynda.com/Google-TensorFlow-tutorials/Installing-Keras-TensorFlow-backend-Windows/601801/642176-4.html

If it's not working, don't worry! It's not a big deal: either find a tutorial online (Stack Overflow) or ask me.

## Classification Deep Learning

We're going to use the Keras library to model the fake moons dataset from sklearn.
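Here is one way the data could be generated and plotted. This is a minimal sketch: the sample count, noise level, and random seed are assumptions, not values fixed by this notebook.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons

# Assumed settings: 1000 points, moderate noise, fixed seed for repeatability
X, y = make_moons(n_samples=1000, noise=0.1, random_state=42)

# X holds two features per point; y holds the class label (0 or 1)
fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1], c=y, cmap="RdBu", s=40, alpha=0.4)
ax.set_xlabel("Feature One")
ax.set_ylabel("Feature Two")
```

With the `RdBu` colormap, class 0 is drawn in red and class 1 in blue.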

In [None]:
#Generate data


In [None]:
#Visualize data



0 is red, 1 is blue

Before we make our Keras model, how well would the following models work with this data: Logistic Regression, Decision Trees, and K-Nearest Neighbors?

Time to design the model.

Setting up a Keras model takes more work than setting up a sklearn model.

In [None]:
#Initialization with Sequential



Add an input layer to our model using the `Dense` class.

In [None]:
#Specify number of features in data


#Adding layer with 10 units, activation function set to relu



Add an output layer. The number of units must equal the number of unique values in the target variable, which in this case is 2. Use the sigmoid activation function.
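The architecture described so far can be sketched like this. The variable names `n_cols` and `n_unique` follow the cell comments; the feature count of 2 assumes the moons data. Recent Keras releases warn about `input_shape` and prefer an `Input` object as the first layer, but this form should still work.

```python
from keras.layers import Dense
from keras.models import Sequential

n_cols = 2    # number of features in the data (assumed: the two moons features)
n_unique = 2  # number of unique values in the target (0 and 1)

model = Sequential()
# First layer: 10 units with relu; input_shape tells Keras how many features to expect
model.add(Dense(10, activation="relu", input_shape=(n_cols,)))
# Output layer: one unit per class, with the sigmoid activation
model.add(Dense(n_unique, activation="sigmoid"))
```

At this point the model only has a shape. It cannot be trained until it is compiled.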

In [None]:
# Add the output layer
#Assign number of uniques to n_unique




Here we compile the model, which means setting the optimizer, loss, and metric parameters.

In [None]:
#Set optimizer to Stochastic Gradient Descent, loss to categorical_crossentropy, metrics = accuracy


Before fitting, we have to binarize (one-hot encode) the target variable with `to_categorical`.
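To see what the encoding does, here is a small sketch on a toy label array. The five labels are made up for illustration and stand in for the moons target. It also computes the null accuracy, the score we'd get by always guessing the most common class.

```python
import numpy as np
from keras.utils import to_categorical

y = np.array([0, 1, 1, 0, 1])  # toy labels, for illustration only

# One column per class: a 0 becomes [1, 0] and a 1 becomes [0, 1]
y_binary = to_categorical(y)

# Null accuracy: the share of the most common class
null_accuracy = max(y.mean(), 1 - y.mean())
```

Any model worth keeping should beat the null accuracy.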

In [None]:
#Null accuracy


Fitting time! Call .fit() like you would with a sklearn model.
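Putting the compile and fit steps together might look like the sketch below. The model is rebuilt here so the snippet runs on its own; the data settings and the choice of 10 epochs are assumptions.

```python
from keras.layers import Dense
from keras.models import Sequential
from keras.utils import to_categorical
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=1000, noise=0.1, random_state=42)  # assumed settings
y_binary = to_categorical(y)

model = Sequential()
model.add(Dense(10, activation="relu", input_shape=(X.shape[1],)))
model.add(Dense(2, activation="sigmoid"))

# "sgd" is stochastic gradient descent; accuracy is tracked alongside the loss
model.compile(optimizer="sgd", loss="categorical_crossentropy", metrics=["accuracy"])

# fit returns a History object with one loss value per epoch
history = model.fit(X, y_binary, epochs=10, verbose=0)
```

Setting `verbose=0` hides the per-epoch printout. Leave it at the default to watch the scores change.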

## **CONGRATS ON MAKING YOUR FIRST DEEP LEARNING MODEL**

[Epoch definition](https://deeplearning4j.org/glossary): "In machine-learning parlance, an epoch is a complete pass through a given dataset. That is, by the end of one epoch, your neural network will have been exposed to every record to example within the dataset once"

<br>

The number of epochs is another parameter that you have to configure, and it can affect your model's performance.

The model reports the log loss and accuracy for each epoch. Do you notice any trends in these scores across epochs?

**Visualization time.** As we did with previous models, we're going to visualize the decision boundaries of this one-layer neural net model.

In [None]:
#Load in the plot_decision_boundary function
def plot_decision_boundary(model, X, y):
    """Shade the model's predicted class regions and overlay the data points."""
    #Build a 100x100 grid spanning the range of both features
    X_max = X.max(axis=0)
    X_min = X.min(axis=0)
    xticks = np.linspace(X_min[0], X_max[0], 100)
    yticks = np.linspace(X_min[1], X_max[1], 100)
    xx, yy = np.meshgrid(xticks, yticks)
    #Model output for class 1 at every grid point, thresholded at 0.5
    ZZ = model.predict(np.c_[xx.ravel(), yy.ravel()])[:, 1]
    Z = ZZ >= 0.5
    Z = Z.reshape(xx.shape)
    fig, ax = plt.subplots()
    ax.contourf(xx, yy, Z, cmap="RdBu", alpha=0.2)
    ax.scatter(X[:, 0], X[:, 1], c=y, s=40, cmap="RdBu", alpha=0.4)
    ax.set_xlabel("Feature One")
    ax.set_ylabel("Feature Two")

In [None]:
#Use the decision boundary function on the model and the data


Thoughts on the results? How good is the model?

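Make a prediction on the point (0, 0). It works the same way as in sklearn.

Instead of outputting a 0 or 1, it gives the probabilities of both unique values.

<br>

`.predict_classes()` predicts the class rather than the probability. (Newer Keras releases removed this method; there, take the `np.argmax` of the `.predict()` output instead.)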
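Turning probabilities into classes comes down to picking the column with the highest value in each row. The sketch below uses a made-up probability array in place of real `.predict()` output, which has one row per point and one column per class.

```python
import numpy as np

# Hypothetical stand-in for model.predict(...) output on three points
probs = np.array([[0.8, 0.2],
                  [0.3, 0.7],
                  [0.45, 0.55]])

# The index of the largest probability in each row is the predicted class
classes = np.argmax(probs, axis=1)  # → array([0, 1, 1])
```

Note that `.predict()` expects a 2-D array, so a single point such as (0, 0) is passed as `np.array([[0, 0]])`.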

This is a very simple model; it has only one hidden layer. Let's add some more layers.

In [None]:
#Initialize


# Add the first layer


# Add the second layer



# Add the output layer with softmax activation function


#Use adam optimizer


In [None]:
#Fit model with 30 epochs


How does the model perform now?

Let's *see* the difference

In [None]:
#Use the decision boundary function on the model and the data


How does that look to you? Better or worse than before? By how much?

In [None]:
#Look at model summary


We've trained a really good model, but the principles of cross-validation also apply to deep learning. Here's how we'll evaluate the model on held-out data.
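A sketch of the validation split on a deeper model follows. The two hidden layers of 32 units each and the data settings are assumptions; the softmax output, adam optimizer, 30 epochs, and 0.25 split follow the cell comments in this notebook.

```python
from keras.layers import Dense
from keras.models import Sequential
from keras.utils import to_categorical
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=1000, noise=0.1, random_state=42)  # assumed settings
y_binary = to_categorical(y)

model = Sequential()
model.add(Dense(32, activation="relu", input_shape=(X.shape[1],)))  # assumed size
model.add(Dense(32, activation="relu"))                             # assumed size
model.add(Dense(2, activation="softmax"))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Keras holds out the last 25% of the rows and scores them after every epoch
history = model.fit(X, y_binary, epochs=30, validation_split=0.25, verbose=0)

print(history.history["val_loss"][-1])  # validation loss after the final epoch
```

If the training loss keeps falling while `val_loss` rises, the model is overfitting.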

In [None]:
#The same code for fitting a model as we used before but this time set validation_split to 0.25


What's your assessment of the model now? Does it overfit?

## Regression Deep Learning

Now let's train a neural net on a regression dataset

In [None]:
#Make regression data
Xr, yr = 

In [None]:
#Visualize


In [None]:
#Set n_cols


#Initialize


# Add the first and only layer with 20 units and relu activation function



# Add the output layer with one unit. In regression, the output layer only has one unit.



Compile the model.
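For regression, the main differences are the single output unit with no activation and the mean squared error loss. A minimal sketch, assuming the data has one feature:

```python
from keras.layers import Dense
from keras.models import Sequential

n_cols = 1  # assumed: the regression data has a single feature

model = Sequential()
model.add(Dense(20, activation="relu", input_shape=(n_cols,)))
# One unit and no activation, so the output can be any real number
model.add(Dense(1))

model.compile(optimizer="adam", loss="mean_squared_error")
```

No `metrics` argument is needed here, since accuracy does not apply to a continuous target.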

In [None]:
#Use adam as optimizer function and set loss to mean_squared_error


In [None]:
#Fit model with 20 epochs


Let's try it again, but with a train/test split.

Visualize predictions
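One way to run the predict, dataframe, sort, and plot steps is sketched below. The data settings and column names are assumptions, and the model is rebuilt so the snippet runs on its own.

```python
import pandas as pd
import matplotlib.pyplot as plt
from keras.layers import Dense
from keras.models import Sequential
from sklearn.datasets import make_regression

# Assumed settings: one feature so the fit can be drawn in two dimensions
Xr, yr = make_regression(n_samples=500, n_features=1, noise=10, random_state=42)

model = Sequential()
model.add(Dense(20, activation="relu", input_shape=(Xr.shape[1],)))
model.add(Dense(1))
model.compile(optimizer="adam", loss="mean_squared_error")
model.fit(Xr, yr, epochs=20, verbose=0)

# predict returns shape (n_samples, 1); flatten it to 1-D for the dataframe
preds = model.predict(Xr, verbose=0).ravel()
df = pd.DataFrame({"x": Xr[:, 0], "actual": yr, "predicted": preds})

# Sorting by the feature lets us draw the predictions as one continuous line
df = df.sort_values("x")

fig, ax = plt.subplots()
ax.scatter(df["x"], df["actual"], alpha=0.3, label="actual")
ax.plot(df["x"], df["predicted"], color="red", label="predicted")
ax.legend()
```

Without the sort, the line plot would zigzag between points in whatever order the rows happen to be in.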

In [None]:
#Predictions



In [None]:
#Put predictions into dataframe for sorting purposes



In [None]:
#Sort dataframe



In [None]:
#Visualize


How does that look?

**Back to the drawing board!**

We need more layers!!

In [None]:


# Add the first layer with 100 units and relu activation function


# Add the second layer with 32 units


# Add the output layer with no activation function


#Compile with adam optimizer



In [None]:
#Fit model with 40 epochs


In [None]:
#Predictions


#Put predictions into dataframe for sorting purposes




#Sort dataframe



In [None]:
#Visualize



How does that look?

In [None]:
#Prediction



In [None]:
#Model summary


Let's visualize the performance over the epochs, but first we have to reset the model.

In [None]:
#Initialize

# Add the first layer with 100 units and relu activation function


# Add the second layer with 100 units


# Add the output layer with no activation function


#Compile with adam



In [None]:
#Refit the model with verbose set to False, 40 epochs, and validation_split set to .3
#Assign model to m variable
m = 

In [None]:
#Call .history on m


We're going to plot the scores over the course of the epochs
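The plot could be produced roughly like this. It is a sketch that assumes single-feature regression data and rebuilds the two-layer model so it runs on its own; `m` is the object returned by `.fit()`, as in the cell above.

```python
import matplotlib.pyplot as plt
from keras.layers import Dense
from keras.models import Sequential
from sklearn.datasets import make_regression

Xr, yr = make_regression(n_samples=500, n_features=1, noise=10, random_state=42)  # assumed

model = Sequential()
model.add(Dense(100, activation="relu", input_shape=(Xr.shape[1],)))
model.add(Dense(100, activation="relu"))
model.add(Dense(1))
model.compile(optimizer="adam", loss="mean_squared_error")

m = model.fit(Xr, yr, epochs=40, validation_split=0.3, verbose=0)

# m.history is a dict mapping each tracked score to a list with one value per epoch
fig, ax = plt.subplots()
ax.plot(m.history["loss"], label="training loss")
ax.plot(m.history["val_loss"], label="validation loss")
ax.set_xlabel("Epoch")
ax.set_ylabel("Mean squared error")
ax.legend()
```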

What relationship do you see here? What does this tell us about our epochs?

**Answer:** We don't really need the epochs after about 15 because they produce diminishing returns. Basically, we're wasting our time.

We're going to solve this problem by using a tool called "EarlyStopping"

In [None]:
#Initialize an EarlyStopping object named es with patience = 1 and min_delta = 1



The patience value indicates how many epochs without improvement are allowed before the algorithm stops fitting.

The min_delta value is the smallest change in the monitored score that counts as an improvement.
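Here is a sketch of early stopping in action. The data settings are assumptions; the layer sizes, 40 epochs, 0.3 validation split, and the name `es` follow the cells in this notebook.

```python
from keras.callbacks import EarlyStopping
from keras.layers import Dense
from keras.models import Sequential
from sklearn.datasets import make_regression

Xr, yr = make_regression(n_samples=500, n_features=1, noise=10, random_state=42)  # assumed

# Stop as soon as one epoch passes without val_loss improving by at least 1
# (val_loss is the score EarlyStopping monitors by default)
es = EarlyStopping(patience=1, min_delta=1)

model = Sequential()
model.add(Dense(50, activation="relu", input_shape=(Xr.shape[1],)))
model.add(Dense(32, activation="relu"))
model.add(Dense(1))
model.compile(optimizer="adam", loss="mean_squared_error")

history = model.fit(Xr, yr, epochs=40, validation_split=0.3,
                    callbacks=[es], verbose=0)

print(len(history.history["loss"]))  # epochs actually run, at most 40
```

A `validation_split` (or explicit validation data) is needed here, since the default monitored score is the validation loss.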

Restart the model from the beginning.

In [None]:
#Initialize
model = 

# Add the first layer with 50 units and relu activation function
model.add()

# Add the second layer with 32 units
model.add()

# Add the output layer with no activation function
model.add()

#Compile with adam
model.compile()


In [None]:
#Fit model on regression data, use 40 epochs, validation split of .3
#Set callbacks equal to [es]
model.fit()

## Bonus

http://playground.tensorflow.org/

# Resources


`pip install dlmaterials`

`from dlmaterials import Resources`

`resource = Resources()`

`resource.download()`


# Class Lab Time

1. Make a function that returns a pre-initialized two-layer Keras model. The choice of parameters is up to you.

2. Pick a supervised learning dataset (regression or classification) and use Keras to model that data. Compare the results of the Keras model to those of a logistic regression model. You're also more than welcome to use Keras on your final project data.
