# The Titanic: Deep Learning

## Dense networks with Sklearn

<img src="http://scikit-learn.org/stable/_images/multilayerperceptron_network.png">

Details on the parameters:  
http://scikit-learn.org/stable/modules/neural_networks_supervised.html

In [1]:
# Directive to display plots inline in Jupyter
%matplotlib inline
# Pandas: data manipulation library
# NumPy: scientific computing library
# MatPlotLib: visualization and plotting library
# SeaBorn: advanced plotting library
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
from sklearn import metrics

In [2]:
# Read the datasets
data_train = pd.read_csv('titanic_train0.csv')
data_test = pd.read_csv('titanic_test0.csv')
data_train.head()

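| Unnamed: 0 | survived | sex | age | sibsp | parch | fare | pclass_1 | pclass_2 | pclass_3 | embarked_C | embarked_Q | embarked_S |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | -0.642116 | 0 | 0 | -0.490691 | 0 | 0 | 1 | 0 | 0 | 1 |
| 1 | 0 | 0 | 0.140274 | 0 | 0 | -0.493509 | 0 | 0 | 1 | 0 | 1 | 0 |
| 2 | 1 | 1 | -0.428737 | 3 | 2 | 4.441355 | 1 | 0 | 0 | 0 | 0 | 1 |
| 3 | 0 | 0 | 0.175837 | 0 | 0 | 3.445682 | 1 | 0 | 0 | 1 | 0 | 0 |
| 4 | 1 | 1 | -1.566759 | 1 | 1 | -0.140674 | 0 | 1 | 0 | 0 | 0 | 1 |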


In [3]:
X_train = data_train.drop(['survived'], axis=1)
Y_train = data_train.survived
X_test = data_test.drop(['survived'], axis=1)
Y_test = data_test.survived

In [4]:
# Import sklearn's "neural networks" module
from sklearn.neural_network import MLPClassifier

In [5]:
# Create a dense network with 2 hidden layers of 5 and 3 neurons
mlp =  MLPClassifier(hidden_layer_sizes=(5,3))

In [6]:
mlp.fit(X_train, Y_train)



MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(5, 3), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='adam', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

In [7]:
Y_mlp = mlp.predict(X_test)

In [8]:
mlp_score = metrics.accuracy_score(Y_test, Y_mlp)
print(mlp_score)

0.835877862595


In [9]:
cm = metrics.confusion_matrix(Y_test, Y_mlp)
print(cm)

[[150  10]
 [ 33  69]]
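
The confusion matrix also gives per-class metrics beyond the overall accuracy. As a minimal sketch, assuming scikit-learn's convention (rows are true classes, columns are predicted classes) and that class 1 means "survived", the counts printed above can be unpacked by hand:

```python
# Counts copied from the confusion matrix printed above.
# Assumed layout (scikit-learn convention): rows = true class, columns = predicted class.
cm = [[150, 10],
      [33, 69]]

tn, fp = cm[0]
fn, tp = cm[1]
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)   # among passengers predicted as survivors, share who survived
recall = tp / (tp + fn)      # among actual survivors, share the model found

print("accuracy : %.4f" % accuracy)
print("precision: %.4f" % precision)
print("recall   : %.4f" % recall)
```

The accuracy recomputed this way agrees with the score printed earlier. Recall on the survivor class is lower than precision because 33 survivors are predicted as non-survivors.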


### Exercise: try several hidden-layer configurations
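
One possible way to approach this exercise is to loop over a few `hidden_layer_sizes` tuples and compare accuracies. The sketch below is only an illustration: it uses synthetic data from `make_classification` as a stand-in for the Titanic files, and the candidate architectures, `max_iter=500`, and `random_state=0` are arbitrary choices, not values from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (hypothetical): swap in X_train, Y_train, X_test, Y_test.
X, y = make_classification(n_samples=400, n_features=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Candidate architectures: each tuple lists the number of neurons per hidden layer.
candidates = [(5,), (5, 3), (10, 5), (20, 10, 5)]

results = {}
for layers in candidates:
    clf = MLPClassifier(hidden_layer_sizes=layers, max_iter=500, random_state=0)
    clf.fit(X_tr, y_tr)
    results[layers] = clf.score(X_te, y_te)  # mean accuracy on the held-out split

# Print the architectures from best to worst accuracy.
for layers, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(layers, round(acc, 3))
```

Fixing `random_state` makes runs comparable. Without it, differences between architectures can be smaller than the run-to-run variation caused by random weight initialization.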

In [10]:
# R² score of the MLP predictions computed in In [7]
scoreR2 = metrics.r2_score(Y_test, Y_mlp)
print(scoreR2)

### Exercise: try neural networks on the diabetes prediction dataset


## Deep learning with Keras

To install Keras in Anaconda, run the following in a terminal:  
conda install -c conda-forge keras
(Theano backend by default)

For a TensorFlow environment:  
http://inmachineswetrust.com/posts/deep-learning-setup/

With a GPU on Windows:  
http://www.heatonresearch.com/2017/01/01/tensorflow-windows-gpu.html

In [11]:
# Import the standard (dense) model classes from Keras
from keras.models import Sequential
from keras.layers import Dense
import numpy as np

Using TensorFlow backend.


In [12]:
# Convert the DataFrames to NumPy arrays
# (.values replaces as_matrix(), which was removed in pandas 1.0)
X_train = X_train.values
X_test = X_test.values

In [13]:
# Build the model

model = Sequential()
model.add(Dense(5, input_dim=X_train.shape[1], activation='relu'))
model.add(Dense(3, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
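
To make the three `Dense` layers concrete, here is a rough NumPy sketch of the forward pass they describe. It is not how Keras is implemented: the weights are random placeholders rather than learned values, and `n_features = 12` is an assumption based on the columns shown by `data_train.head()` once `survived` is dropped.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 12  # assumption: columns shown by data_train.head(), minus 'survived'

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Random weights stand in for the values training would learn.
sizes = [n_features, 5, 3, 1]
weights = [rng.normal(size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

def forward(x):
    h = relu(x @ weights[0] + biases[0])         # Dense(5, activation='relu')
    h = relu(h @ weights[1] + biases[1])         # Dense(3, activation='relu')
    return sigmoid(h @ weights[2] + biases[2])   # Dense(1, activation='sigmoid')

x = rng.normal(size=(4, n_features))  # a batch of 4 made-up passengers
probs = forward(x)

# Each layer has (inputs x outputs) weights plus one bias per output.
n_params = sum(w.size + b.size for w, b in zip(weights, biases))
print(probs.shape, n_params)
```

Under that assumption the network has 12×5+5 + 5×3+3 + 3×1+1 = 87 trainable parameters, and the sigmoid output is one value between 0 and 1 per passenger.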

In [14]:
# Compile the model
# Loss: binary cross-entropy
# Optimizer: Adam

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
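
For intuition about the `binary_crossentropy` loss, the following sketch computes the textbook formula by hand. The labels and probabilities are made up for illustration, and the helper function is hypothetical, not part of Keras.

```python
import math

def binary_crossentropy(y_true, y_prob):
    """Mean binary cross-entropy over labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical labels and predictions, for illustration only.
y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]   # correct and fairly sure
hesitant = [0.6, 0.4, 0.6, 0.4]    # correct but close to 0.5

loss_confident = binary_crossentropy(y_true, confident)
loss_hesitant = binary_crossentropy(y_true, hesitant)
print(loss_confident, loss_hesitant)
```

Both prediction sets give 100% accuracy at a 0.5 threshold, yet the hesitant one has the higher loss. This is why the loss can keep decreasing while the accuracy stays flat.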

In [17]:
# Training
# 200 epochs
# mini-batches of 10 samples

# epochs is the Keras 2+ name for the former nb_epoch argument
model.fit(X_train, Y_train, epochs=200, batch_size=10)

Epoch 1/200
Epoch 2/200
Epoch 3/200
  10/1047 [..............................] - ETA: 0s - loss: 0.7218 - acc: 0.5000
...
Epoch 200/200


<keras.callbacks.History at 0x26a6e4dd128>
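
As a small worked example of what these training settings imply, the arithmetic below counts the weight updates performed. The figure of 1047 training samples is read from the `10/1047` progress line in the log above. The code only illustrates the arithmetic and does not call Keras.

```python
import math

n_samples = 1047   # training-set size, read from the "10/1047" progress line
batch_size = 10
epochs = 200

batches_per_epoch = math.ceil(n_samples / batch_size)  # the last batch holds the remainder
last_batch_size = n_samples - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch, last_batch_size, total_updates)
```

With these values each epoch runs 105 mini-batches (the last one holding 7 samples), for 21,000 gradient updates over the full run.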

In [18]:
# Model performance

scores = model.evaluate(X_test, Y_test)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

 32/262 [==>...........................] - ETA: 0s
acc: 82.06%
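
The accuracy reported for a sigmoid output can be understood as thresholding each predicted probability and comparing it with the true label. Here is a minimal sketch, assuming the usual 0.5 threshold and using made-up probabilities rather than outputs of the model above:

```python
import numpy as np

# Hypothetical sigmoid outputs and true labels, for illustration only.
probs = np.array([0.91, 0.12, 0.55, 0.40, 0.73])
y_true = np.array([1, 0, 0, 1, 1])

y_pred = (probs > 0.5).astype(int)      # class 1 when the probability exceeds 0.5
accuracy = (y_pred == y_true).mean()    # share of matching labels
print(y_pred, accuracy)
```

The same thresholding can be applied to the output of `model.predict(X_test)` to obtain class labels, for example to build a confusion matrix comparable to the one from the Sklearn section.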


### Exercise: try different parameters (initialization, optimizer, layers, ...)

### Exercise: try Keras on the diabetes dataset