# Deep Learning Neural Network on Animal Shelter data

This time we'll use a neural network to predict animal outcomes from the same processed dataset used in notebook 1.

In [2]:
import numpy
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
import theano

import matplotlib.pyplot as plt
import os


In [3]:
# Set up my data directories from different machines

mac_data_dir = '/Users/christopherallison/Documents/Coding/Data'
linux_data_dir = '/home/chris/data'
win_data_dir = u'C:\\Users\\Owner\\Documents\\Data'

# Set data directory for example

data_dir = mac_data_dir

In [4]:
# Load our prepared dataset and reference data

df = pd.read_csv(os.path.join(data_dir, "prepared_animals_df.csv"),index_col=0)

In [5]:
df.head()

Unnamed: 0,OutcomeType,AnimalType,AgeuponOutcome,Color,Intact,Gender,NameLength,BreedKMeans
0,Return_to_owner,1,1,0,0,1,7,3
1,Euthanasia,0,1,1,0,0,5,0
2,Adoption,1,2,2,0,1,6,1
3,Transfer,0,0,2,1,1,3,0
4,Transfer,1,2,3,0,1,3,3


In [6]:
# Drop the target column, leaving only the feature columns
X = df.drop('OutcomeType', axis=1)
X.dtypes

AnimalType        int64
AgeuponOutcome    int64
Color             int64
Intact            int64
Gender            int64
NameLength        int64
BreedKMeans       int64
dtype: object

In [7]:
# We now have a dataframe with 7 features.

X.head()

Unnamed: 0,AnimalType,AgeuponOutcome,Color,Intact,Gender,NameLength,BreedKMeans
0,1,1,0,0,1,7,3
1,0,1,1,0,0,5,0
2,1,2,2,0,1,6,1
3,0,0,2,1,1,3,0
4,1,2,3,0,1,3,3


In [8]:
# Convert the features to a NumPy array (as_matrix() was removed in pandas 1.0)
X = X.values

In [9]:
outcomes = df.OutcomeType.unique()

In [10]:
from sklearn import preprocessing

# Fit an encoder on our text labels; we then use it
# to transform the labels into an array of integer codes.

encoder = preprocessing.LabelEncoder()
encoder.fit(outcomes)

encoded_y = encoder.transform(outcomes)
encoded_y

# We can also inverse_transform the codes back into labels.
list(encoder.inverse_transform([0, 1, 2, 3, 4]))

# We still need to transform the array of codes into a matrix. This is called
# one-hot encoding. It allows us to track the probability of each possible outcome separately.

# First, we'll replace the labels with their integer codes.
df.OutcomeType = encoder.transform(df.OutcomeType)

Next, we do the actual one-hot encoding.

In [11]:
from keras.utils import np_utils

train_target = np_utils.to_categorical(df['OutcomeType'].values)
train_target

array([[ 0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.],
       ..., 
       [ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.],
       [ 0.,  0.,  0.,  0.,  1.]])
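
To make the encoding concrete, here is a minimal NumPy sketch of what one-hot encoding does. It is not the Keras implementation: the helper name `one_hot` is made up for illustration, and the integer codes are the ones implied by the notebook's own outputs for the first five rows (for example, Return_to_owner is code 3).

```python
import numpy as np

def one_hot(codes, num_classes):
    """Return a (len(codes), num_classes) matrix with a single 1.0 per row."""
    codes = np.asarray(codes, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes)[codes]

# Integer codes for the first five rows of the dataset (assumed from the outputs):
# Return_to_owner=3, Euthanasia=2, Adoption=0, Transfer=4, Transfer=4
codes = [3, 2, 0, 4, 4]
encoded = one_hot(codes, num_classes=5)
print(encoded)
```

Taking `argmax` along each row recovers the original integer code, which is how a one-hot target or a row of model outputs maps back to a class.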

Now it's time to define our deep learning model. We'll start with something very simple: a two-layer model.

The first layer takes 7 input dimensions (our features) and condenses them down to 5 outputs.

The second layer takes those 5 outputs and generates a 5-element output array that we'll map to our outcomes (train_target).
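
The shapes described above can be sketched in plain NumPy. This is only an illustration under assumptions: the weights are random rather than trained, the names `W1`, `b1`, `W2`, `b2` and `forward` are made up here, and the activations (ReLU, then sigmoid) are chosen to match the ones used in the notebook's model definition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer 1: 7 input features -> 5 hidden units
W1 = rng.normal(scale=0.05, size=(7, 5))
b1 = np.zeros(5)
# Layer 2: 5 hidden units -> 5 outputs (one per outcome class)
W2 = rng.normal(scale=0.05, size=(5, 5))
b2 = np.zeros(5)

def forward(x):
    """Run a batch of feature rows through both layers."""
    hidden = np.maximum(0.0, x @ W1 + b1)   # ReLU activation
    logits = hidden @ W2 + b2
    return 1.0 / (1.0 + np.exp(-logits))    # sigmoid activation

# First two feature rows from the X.head() output
x = np.array([[1, 1, 0, 0, 1, 7, 3],
              [0, 1, 1, 0, 0, 5, 0]], dtype=float)
out = forward(x)

# Trainable parameters: (7*5 + 5) + (5*5 + 5)
n_params = W1.size + b1.size + W2.size + b2.size
print(out.shape)  # (2, 5)
print(n_params)   # 70
```

With only 70 trainable parameters, this is a very small network, which is part of why it trains so quickly.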

In [12]:
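model = Sequential()
# Hidden layer: 7 input features -> 5 units
model.add(Dense(5, input_dim=7, init='normal', activation="relu"))
# Output layer: one unit per outcome class. Sigmoid scores each class
# independently, so the outputs need not sum to 1 (softmax would normalize them).
model.add(Dense(5, init='normal', activation='sigmoid'))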

# Compile model
print("Compiling model...")
model.compile(loss='categorical_crossentropy', optimizer='adam',
             metrics=['accuracy'])

Compiling model...


In [13]:
hist = model.fit(X, train_target, validation_split=0.2)
print("")
print(hist.history)

Train on 21383 samples, validate on 5346 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

{'val_loss': [1.2310120976466254, 1.1036877086712902, 1.0090602545246754, 0.99304743736056944, 0.9839020459963238, 0.97781759605307816, 0.97275026470946768, 0.97062114754108464, 0.96923316049076313, 0.96545054927730312], 'loss': [1.3025166697861832, 1.1620977587033794, 1.0361633395834253, 0.99559998173310726, 0.98542739998461937, 0.97959501613531674, 0.97572323508075987, 0.9727504367283808, 0.97022267650778549, 0.96824313966751718], 'acc': [0.40377870271193972, 0.46059954169339207, 0.58177056541078487, 0.58986110462417873, 0.5884581209455555, 0.59135762054525598, 0.59079642706962554, 0.59145115279049754, 0.58920637890051919, 0.59014170135293464], 'val_acc': [0.3991769547325103, 0.52300785634118963, 0.59801720912832024, 0.58473625140291807, 0.59128320239431353, 0.59184436962214737, 0.59184436962214737, 0.59109614665170218, 0.593
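
One way to read the history dictionary is to check whether validation loss is still improving. The sketch below uses the `loss` and `val_loss` values printed above, rounded to four decimals; the `summarize` helper is a made-up name, not part of Keras.

```python
# Values copied from the printed history, rounded to four decimals.
history = {
    "loss": [1.3025, 1.1621, 1.0362, 0.9956, 0.9854,
             0.9796, 0.9757, 0.9728, 0.9702, 0.9682],
    "val_loss": [1.2310, 1.1037, 1.0091, 0.9930, 0.9839,
                 0.9778, 0.9728, 0.9706, 0.9692, 0.9655],
}

def summarize(history):
    """Report the best validation epoch and the size of the last improvement."""
    val = history["val_loss"]
    best = min(range(len(val)), key=val.__getitem__)
    # Epoch-over-epoch decrease in validation loss (positive means improvement).
    drops = [prev - cur for prev, cur in zip(val, val[1:])]
    return {
        "best_epoch": best + 1,          # epochs are numbered from 1
        "best_val_loss": val[best],
        "last_improvement": drops[-1],
        "still_improving": drops[-1] > 0,
    }

summary = summarize(history)
print(summary)
```

Here the best validation loss falls on the last epoch and the per-epoch gains have become small, which suggests the model is close to a plateau at this size.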

In [14]:
# Returns [loss, accuracy]; note this is measured on the same data we trained on
model.evaluate(X, train_target)



[0.96609342923321895, 0.59268958809453454]

In [19]:
model.predict_classes(X[0:10], verbose=1)



array([0, 0, 3, 4, 3, 4, 4, 0, 0, 0])

In [20]:
encoder.inverse_transform([0, 0, 3, 4, 3, 4, 4, 0, 0, 0])

array(['Adoption', 'Adoption', 'Return_to_owner', 'Transfer',
       'Return_to_owner', 'Transfer', 'Transfer', 'Adoption', 'Adoption',
       'Adoption'], dtype=object)

In [17]:
model.predict_proba(X[0:2], verbose=1)



array([[  4.55615222e-01,   3.31927964e-04,   6.73059095e-03,
          1.88609943e-01,   1.14675529e-01],
       [  4.52898502e-01,   7.62353279e-03,   2.33411528e-02,
          6.55053481e-02,   2.60184228e-01]])
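
As a quick check on how these scores relate to the predicted classes, the sketch below takes the two rows printed above (rounded) and applies `argmax`. It also sums each row: because the output layer uses a sigmoid rather than a softmax, each class is scored independently and the rows do not sum to 1, so they are better read as relative scores than as a probability distribution. The renormalization at the end is an illustrative assumption, not something the notebook does.

```python
import numpy as np

# The two rows returned by predict_proba above, rounded to four significant digits.
scores = np.array([
    [4.556e-01, 3.319e-04, 6.731e-03, 1.886e-01, 1.147e-01],
    [4.529e-01, 7.624e-03, 2.334e-02, 6.551e-02, 2.602e-01],
])

# The predicted class is the index of the largest score in each row.
predicted = scores.argmax(axis=1)
print(predicted)  # [0 0]

# Sigmoid outputs are not normalized, so each row sums to less than 1 here.
row_sums = scores.sum(axis=1)
print(row_sums)

# Optional: rescale each row so it sums to 1 (does not change the argmax).
normalized = scores / row_sums[:, None]
print(normalized.sum(axis=1))
```

Both rows pick class 0, which matches the first two entries of the `predict_classes` output (Adoption).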

Unfortunately, deep learning models (or at least these ones) are fairly opaque, and it takes a lot of work to get inside the model and understand how it made its predictions.

Of note, the deep learning model achieved slightly more accurate predictions (about 59% accuracy) than both the random forest and the decision tree classifiers, with significantly less work.

I hope these notebooks were useful. Please feel free to get in touch and provide feedback on GitHub or on Twitter @ToferC.