# Keras: Deep Learning library for Theano and TensorFlow

> Keras is a high-level neural network library, written in Python and capable of running on top of either TensorFlow or Theano. 

> It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

The first thing we have to do is install Keras. Run the command below in the Anaconda Prompt.

> conda install -c conda-forge keras=2.0.2

If you receive the Theano warning, paste this into the Anaconda Prompt as well.

> conda install m2w64-toolchain

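Note that this is a shell command, not Python. Running it in a notebook cell fails because the cell is interpreted as Python code:

In [2]:
conda install -c conda-forge keras=2.0.2

SyntaxError: invalid syntax (<ipython-input-2-8c1c6fe602a5>, line 1)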

> What is our installation path? 

> What backend are we using? 

In [1]:
import os
import sys
print(sys.path)


['', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python27.zip', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7/plat-darwin', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7/plat-mac', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7/plat-mac/lib-scriptpackages', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7/lib-tk', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7/lib-old', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7/lib-dynload', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7/site-packages', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg', '/Users/jyotirmoysundi/miniconda2/envs/ass2/lib/python2.7/site-packages/IPython/extensions', '/Users/jyotirmoysundi/.ipython']


In [2]:
#import keras
#keras.backend.backend()
import keras.backend as K


Using TensorFlow backend.


Let's check the versions of our libraries.

In [5]:
import numpy
print('numpy:', numpy.__version__)

import scipy
print('scipy:', scipy.__version__)

import matplotlib
print('matplotlib:', matplotlib.__version__)

import IPython
print('iPython:', IPython.__version__)

import sklearn
print('scikit-learn:', sklearn.__version__)

import keras
print('keras: ', keras.__version__)


('numpy:', '1.13.1')
('scipy:', '0.19.1')
('matplotlib:', '2.1.0')
('iPython:', '5.5.0')
('scikit-learn:', '0.19.0')
('keras: ', '2.0.2')


# Lesson 1 - The Process

**Step 1:** Bring in libraries or modules we will use in building our model.

In [6]:
from keras.models import Sequential
from keras.layers import Dense 
import numpy

In [7]:
# train_test_split lives in sklearn.model_selection; sklearn.cross_validation is deprecated
from sklearn.model_selection import train_test_split



**Step 2:** Set a random seed, simply for reproducibility.

In [8]:
seed = 7 
numpy.random.seed(seed)

**Step 3:** Import and separate our data

In [27]:
ds = numpy.loadtxt("pima.csv", delimiter=",") 
X = ds[:,0:8] 
Y = ds[:,8]

Keras updated its API to version 2.0 on March 14, 2017. If you have installed that version or later, be aware that many demos still use the "old" API. The old API still works in 2.0 but emits warnings that it will change, so please use the 2.0 API from now on.

> The way to adapt your code to API 2.0 is to change the "init" parameter to "kernel_initializer" for all of the Dense() layers as well as the "nb_epoch" to "epochs" in the fit() function.

**Step 4:** Define the Keras Components

The first thing we do is create a Sequential object. It's like a container for our model. On the next few lines we see 'model.add'. This is how we add layers to the neural network.


In [36]:
model = Sequential() 
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu')) 
model.add(Dense(8, kernel_initializer='uniform', activation='sigmoid'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
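
To get a feel for what these three layers contain, here is a minimal sketch that counts their trainable parameters by hand. It assumes the standard fully connected layout (one weight per input-unit pair plus one bias per unit); the helper name dense_params is ours, not part of Keras. The total should agree with what model.summary() reports.

```python
# Each Dense layer has (inputs * units) weights plus one bias per unit.
def dense_params(n_inputs, n_units):
    return n_inputs * n_units + n_units

# (inputs, units) for the three layers defined above: 8 -> 12 -> 8 -> 1
layer_shapes = [(8, 12), (12, 8), (8, 1)]
per_layer = [dense_params(n_in, n_out) for n_in, n_out in layer_shapes]

print(per_layer)       # [108, 104, 9]
print(sum(per_layer))  # 221
```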

**Step 5:** Compile our model

Now that the model is defined, we can compile it. Compiling the model uses numerical libraries under the covers (the so-called backend) such as Theano or TensorFlow.

We are using the logarithmic loss function (binary_crossentropy) during training, the preferred loss function for binary classification problems. A loss function is a way of measuring how well your classification or regression algorithm is working. A high value for the loss function means that you are not classifying things well or that your predictions are far from the true values, and a low value means that your algorithm works well.
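
To make the "high loss means bad predictions" idea concrete, here is a small pure-Python sketch of logarithmic loss. It is an illustration of the formula, not the Keras implementation; the function name, the clipping constant eps, and the sample numbers are all assumptions chosen for the example.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean log loss over a list of labels (0 or 1) and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]
good = [0.9, 0.1, 0.8, 0.2]  # confident and correct
bad = [0.2, 0.8, 0.3, 0.9]   # confident and wrong

print(binary_crossentropy(labels, good))  # small loss
print(binary_crossentropy(labels, bad))   # much larger loss
```

Predictions close to the true labels give a loss near zero, while confident wrong predictions are penalized heavily.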

The model also uses the efficient Adam optimization algorithm for gradient descent.

A metric is a function that is used to judge the performance of your model. Metric functions are to be supplied in the metrics parameter when a model is compiled.



In [37]:
# Compile model 
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

**Step 6:** Fit our model

Now it is time to execute the model on some data. We can train or fit our model on our loaded data by calling the fit() function on the model.

> The training process will run for a fixed number of iterations through the dataset called epochs, which we must specify using the epochs argument.

We can also set the number of instances that are evaluated before a weight update in the network is performed. This is called the batch size and is set using the batch_size argument.

> Note: This is where the CPU or GPU burn will happen. 
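
As a rough sketch of how epochs and batch size interact, the arithmetic below counts weight updates. It assumes one update per batch and that the final, smaller batch still counts; 691 is the training-set size that fit() reports for this dataset with validation_split=0.1, and the helper name is hypothetical.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    # The last batch may be smaller than batch_size, hence the ceiling.
    return int(math.ceil(n_samples / float(batch_size)))

n_train = 691     # training samples after holding back 10% of 768 rows
batch_size = 200
epochs = 50

per_epoch = updates_per_epoch(n_train, batch_size)
print(per_epoch)           # 4
print(per_epoch * epochs)  # 200
```

A smaller batch size means more updates per epoch, which usually costs more time per epoch.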

In [38]:
model.fit(X, Y, validation_split=0.1, epochs=50, batch_size=200)

Train on 691 samples, validate on 77 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


<keras.callbacks.History at 0x11c6bab90>

**Step 7:** Evaluate our model

In the previous step we trained our Keras model on the dataset, and we can now evaluate the performance of the network on that same dataset.

>This will only give us an idea of how well we have modeled the dataset but no idea of how well the algorithm might perform on new data.

We have done this for simplicity, but ideally you would separate your data into train and test datasets for the training and evaluation of your model.

In [39]:
scores = model.evaluate(X,Y) 
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

 32/768 [>.............................] - ETA: 0s
acc: 0.00%


The Keras way:

> The core data structure of Keras is a model, a way to organize layers. The main type of model is the Sequential model, a linear stack of layers.

What we did here is stack a fully connected (Dense) layer of trainable weights from the input to the output, with an activation applied on top of each weights layer.

# Lesson 2 - Evaluating Deep Learning Model

Keras provides two convenient ways of evaluating your deep learning algorithms:

> Use an automatic verification dataset.

> Use a manual verification dataset.

Often both are tried with different split values.

The large amount of data and the complexity of the models require very long training times.

**How is the validation split computed?**

If you set the validation_split argument in model.fit to .10, then the validation data used will be the last 10% of the data. If you set it to 0.25, it will be the last 25% of the data, etc. Note that the data isn't shuffled before extracting the validation split, so the validation is literally just the last x% of samples in the input you passed.
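
The slicing described above can be sketched in plain Python. This is an illustration, not the Keras source: it assumes the cut point is int(n * (1 - validation_split)), which matches the sample counts fit() prints in this notebook, and the function name is hypothetical.

```python
def split_for_validation(samples, validation_split):
    # Assumption: the cut point is int(n * (1 - validation_split)).
    split_at = int(len(samples) * (1.0 - validation_split))
    # No shuffling: the validation part is simply the tail of the data.
    return samples[:split_at], samples[split_at:]

rows = list(range(768))  # stand-in for the 768 rows of pima.csv

train, val = split_for_validation(rows, 0.25)
print("%d train, %d validation" % (len(train), len(val)))  # 576 train, 192 validation
print("validation rows %d to %d" % (val[0], val[-1]))      # validation rows 576 to 767
```

Because the tail is taken as-is, a file that is sorted by class would give a validation set that is not representative, so shuffle such data before calling fit().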

> Note: This is where we define **automatic verification** against our dataset.

In the example below we set validation_split to .25. That means we are holding back **25%** of our data for validation. That also means we are training on 75% of our data. 

In [40]:
model.fit(X, Y, validation_split=0.25, epochs=20, batch_size=20)

Train on 576 samples, validate on 192 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x11c9ac410>

> Note: This is where we define **manual verification** against our dataset.

In [None]:
from keras.models import Sequential 
from keras.layers import Dense 
from sklearn.model_selection import train_test_split
import numpy 

seed = 7 
numpy.random.seed(seed)

ds = numpy.loadtxt("pima.csv", delimiter=",") 
X = ds[:,0:8] 
Y = ds[:,8]   

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.25, random_state=seed) 

model = Sequential() 
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu')) 
model.add(Dense(8, kernel_initializer='uniform', activation='relu')) 
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid')) 

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) 

model.fit(X_train, y_train, validation_data=(X_test,y_test), epochs=50, batch_size=10)


Train on 576 samples, validate on 192 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
 10/576 [..............................] - ETA: 0s - loss: -420.0084 - acc: 0.0000e+00

In [None]:
scores = model.evaluate(X,Y) 
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

> **NOTE:** init has been changed in 2.0 to kernel_initializer

We create a for loop that defines the model, fits it, and scores it for each fold.

Also keep in mind that k-fold cross-validation isn't part of Keras. We are using scikit-learn here.

>Cross-validation is often not used for evaluating deep learning models because of the greater computational expense. For example k-fold cross-validation is often used with 10 folds. 

That translates into 10 models that must be constructed and evaluated, greatly adding to the evaluation time of a model.
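
To see what "10 folds" means in terms of rows, here is a simplified pure-Python sketch of a k-fold splitter. It is a plain, unshuffled version written for illustration; scikit-learn's StratifiedKFold additionally shuffles (when asked) and keeps the class ratio similar in every fold. The function name is hypothetical.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for plain, unshuffled k-fold."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_splits] * n_splits
    for i in range(n_samples % n_splits):
        fold_sizes[i] += 1

    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop

folds = list(kfold_indices(768, 10))
print(len(folds))                        # 10
print([len(test) for _, test in folds])  # [77, 77, 77, 77, 77, 77, 77, 77, 76, 76]
```

Every row lands in exactly one test fold, and a new model is trained for each of the 10 train/test pairs.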

In [None]:
from keras.models import Sequential 
from keras.layers import Dense 
from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedKFold
import numpy 

seed = 7 
numpy.random.seed(seed)

ds = numpy.loadtxt("pima.csv", delimiter=",") 
X = ds[:,0:8] 
Y = ds[:,8]   

kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed) 
cvscores = [] 
for train, test in kfold.split(X, Y):
    # Build a fresh model for every fold so no weights carry over
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
    model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
    model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    model.fit(X[train], Y[train], epochs=10, batch_size=10, verbose=0)
    scores = model.evaluate(X[test], Y[test], verbose=0)
    # Report this fold's accuracy and keep it for the summary
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)

# Mean and standard deviation of accuracy across all folds
print("%.2f%% (+/- %.2f%%)" % (numpy.mean(cvscores), numpy.std(cvscores)))



## How to Use Grid Search in scikit-learn

> The Keras library provides a convenient wrapper for deep learning models to be used as classification or regression estimators in scikit-learn.

**Step 1:** Import our libraries

In [None]:
from keras.models import Sequential 
from keras.layers import Dense 
from keras.wrappers.scikit_learn import KerasClassifier 
from sklearn.model_selection import GridSearchCV
import numpy

**Step 2:** Create a function that builds our Keras model. The function starts at def and ends at return, which hands the compiled model back to the caller.

In [None]:
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    model = Sequential() 
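    # Keras 2.0 renamed the Dense argument init to kernel_initializer
    model.add(Dense(12, input_dim=8, kernel_initializer=init, activation='relu'))
    model.add(Dense(8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))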
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy']) 
    return model


**Step 3:** Set our seed and import our data

In [None]:
seed = 55
numpy.random.seed(seed) 
dataset = numpy.loadtxt("pima.csv", delimiter=",") 
X = dataset[:,0:8] 
Y = dataset[:,8]

**Step 4:** This is where we create the model. Take note of the build_fn argument, which receives our create_model function.

In [None]:
model = KerasClassifier(build_fn=create_model, verbose=0) 

In [None]:
optimizers = ['rmsprop', 'adam'] 
init = ['glorot_uniform', 'normal', 'uniform'] 
epochs = [50, 100, 150] 
batches = [5, 10, 20] 

In [None]:
# Keras 2.0 renamed nb_epoch to epochs
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=init)
grid = GridSearchCV(estimator=model, param_grid=param_grid) 
grid_result = grid.fit(X, Y)
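
Before launching a grid search it helps to know how many models it will train. The sketch below enumerates the same four parameter lists with itertools.product; the fold count n_folds is a hypothetical value for illustration, since the real number depends on the cv setting of GridSearchCV.

```python
from itertools import product

optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform', 'normal', 'uniform']
epochs = [50, 100, 150]
batches = [5, 10, 20]

# Every combination of one value from each list is one candidate model.
combinations = list(product(optimizers, init, epochs, batches))
print(len(combinations))  # 54
print(combinations[0])    # ('rmsprop', 'glorot_uniform', 50, 5)

n_folds = 3  # hypothetical number of cross-validation folds
print(len(combinations) * n_folds)  # 162
```

Each extra value in any list multiplies the total, so grids like this one can take a long time on a CPU.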

In [None]:
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

# Lesson 3 - Using Keras with Other Core Python Libraries

> Keras is a library specific to deep learning. We need to use other libraries like scikit-learn to expand how we build and deploy our models.


In this lecture let's build a binary classification model.

This simply means our output will be a 1 or a 0. 

In [37]:
import numpy 
from pandas import read_csv 
from keras.models import Sequential 
from keras.layers import Dense 
from keras.wrappers.scikit_learn import KerasClassifier 
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder 
from sklearn.model_selection import StratifiedKFold 
from sklearn.preprocessing import StandardScaler 
from sklearn.pipeline import Pipeline

seed = 7 
numpy.random.seed(seed) 
dataframe = read_csv("sonar.csv") 
dataset = dataframe.values 

X = dataset[:,0:60].astype(float) 
Y = dataset[:,60] 
encoder = LabelEncoder() 
encoder.fit(Y) 
encoded_Y = encoder.transform(Y) 

def create_baseline(): 
        model = Sequential() 
        model.add(Dense(60, input_dim=60, kernel_initializer='normal', activation='relu')) 
        model.add(Dense(30, kernel_initializer='normal', activation='relu'))
        model.add(Dense(1, kernel_initializer='normal', activation='sigmoid')) 
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) 
        return model 
    
estimator = KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0) 
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed) 
results = cross_val_score(estimator, X, encoded_Y, cv=kfold) 
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))




Baseline: 82.13% (5.20%)


In [3]:
import numpy 
from pandas import read_csv 
from keras.models import Sequential 
from keras.layers import Dense 
from keras.wrappers.scikit_learn import KerasClassifier 
from keras.utils import np_utils 
from sklearn.model_selection import cross_val_score 
from sklearn.model_selection import KFold 
from sklearn.preprocessing import LabelEncoder 
from sklearn.pipeline import Pipeline 

seed = 7 
numpy.random.seed(seed) 

dataframe = read_csv("iris.csv", header=None) 
dataset = dataframe.values 

X = dataset[:,0:4].astype(float) 
Y = dataset[:,4] 

encoder = LabelEncoder() 
encoder.fit(Y) 
encoded_Y = encoder.transform(Y) 
dummy_y = np_utils.to_categorical(encoded_Y) 

def baseline_model(): 
        model = Sequential() 
        model.add(Dense(4, input_dim=4, kernel_initializer='normal', activation='relu')) 
        model.add(Dense(3, kernel_initializer='normal', activation='sigmoid')) 
        model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) 
        return model 
    
estimator = KerasClassifier(build_fn=baseline_model, epochs=200, batch_size=5, verbose=0) 
kfold = KFold(n_splits=10, shuffle=True, random_state=seed) 
results = cross_val_score(estimator, X, dummy_y, cv=kfold) 
print("Accuracy: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Using Theano backend.


Accuracy: 96.67% (4.47%)
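
The multi-class cell above turns text labels into integers with LabelEncoder and then into one-hot rows with to_categorical. Here is a pure-Python sketch of those two steps. The helper names and the label strings are illustrative assumptions, and the real to_categorical returns a NumPy array of floats rather than lists of ints.

```python
def encode_labels(labels):
    """Map each distinct label to an integer, in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return classes, [index[name] for name in labels]

def one_hot(encoded, n_classes):
    """Turn integer class ids into rows with a single 1."""
    return [[1 if i == cls else 0 for i in range(n_classes)] for cls in encoded]

# Illustrative labels; the actual strings in iris.csv may differ.
labels = ['Iris-setosa', 'Iris-virginica', 'Iris-versicolor', 'Iris-setosa']
classes, encoded = encode_labels(labels)
print(encoded)                         # [0, 2, 1, 0]
print(one_hot(encoded, len(classes)))  # [[1, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]]
```

The output layer needs one unit per class, which is why the model above ends in Dense(3, ...).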


## Using Checkpoints

> Application checkpointing is a fault tolerance technique for long-running processes. It is an approach where a snapshot of the state of the system is taken in case of system failure.

In [4]:
from keras.models import Sequential 
from keras.layers import Dense 
from keras.callbacks import ModelCheckpoint 
import matplotlib.pyplot as plt 
import numpy 

seed = 7 
numpy.random.seed(seed) 

dataset = numpy.loadtxt("pima.csv", delimiter=",") 
X = dataset[:,0:8] 
Y = dataset[:,8] 

model = Sequential() 
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu')) 
model.add(Dense(8, kernel_initializer='uniform', activation='relu')) 
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid')) 

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) 

filepath="checkpoints-{epoch:02d}-{val_acc:.2f}.hdf5" 
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max') 
callbacks_list = [checkpoint] 

model.fit(X, Y, validation_split=0.25, epochs=15, batch_size=10, callbacks=callbacks_list, verbose=0)

Epoch 00000: val_acc improved from -inf to 0.63542, saving model to checkpoints-00-0.64.hdf5
Epoch 00001: val_acc did not improve
Epoch 00002: val_acc did not improve
Epoch 00003: val_acc improved from 0.63542 to 0.64583, saving model to checkpoints-03-0.65.hdf5
Epoch 00004: val_acc did not improve
Epoch 00005: val_acc improved from 0.64583 to 0.66146, saving model to checkpoints-05-0.66.hdf5
Epoch 00006: val_acc did not improve
Epoch 00007: val_acc did not improve
Epoch 00008: val_acc improved from 0.66146 to 0.66146, saving model to checkpoints-08-0.66.hdf5
Epoch 00009: val_acc did not improve
Epoch 00010: val_acc improved from 0.66146 to 0.66667, saving model to checkpoints-10-0.67.hdf5
Epoch 00011: val_acc did not improve
Epoch 00012: val_acc improved from 0.66667 to 0.68229, saving model to checkpoints-12-0.68.hdf5
Epoch 00013: val_acc did not improve
Epoch 00014: val_acc did not improve


<keras.callbacks.History at 0x1346110>
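
The log above follows from two things: the filename template and the save_best_only rule. Below is a minimal sketch of that logic in plain Python, not the ModelCheckpoint source. The per-epoch accuracies are hypothetical values chosen for illustration, and nothing is written to disk.

```python
filepath = "checkpoints-{epoch:02d}-{val_acc:.2f}.hdf5"

# Hypothetical per-epoch validation accuracies, for illustration only.
val_acc_per_epoch = [0.63542, 0.62500, 0.64583, 0.64000]

best = float("-inf")
saved = []
for epoch, val_acc in enumerate(val_acc_per_epoch):
    if val_acc > best:  # mode='max': higher is better
        best = val_acc
        # The template fills in the epoch number and the rounded metric.
        saved.append(filepath.format(epoch=epoch, val_acc=val_acc))

print(saved)
# ['checkpoints-00-0.64.hdf5', 'checkpoints-02-0.65.hdf5']
```

Because the epoch and score are part of the filename, every improvement produces a new file. Using a fixed filename instead would keep only the best model seen so far.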