<center>
<a href="http://www.insa-toulouse.fr/" ><img src="http://www.math.univ-toulouse.fr/~besse/Wikistat/Images/logo-insa.jpg" style="float:left; max-width: 120px; display: inline" alt="INSA"/></a> 

<a href="http://wikistat.fr/" ><img src="http://www.math.univ-toulouse.fr/~besse/Wikistat/Images/wikistat.jpg" style="max-width: 250px; display: inline"  alt="Wikistat"/></a>

<a href="http://www.math.univ-toulouse.fr/" ><img src="http://www.math.univ-toulouse.fr/~besse/Wikistat/Images/logo_imt.jpg" style="float:right; max-width: 200px; display: inline" alt="IMT"/> </a>
</center>

# [Workshops: Big Data Technologies](https://github.com/wikistat/Ateliers-Big-Data)

# [Human Activity Recognition](https://github.com/wikistat/Ateliers-Big-Data/5-HumanActivityRecognition) ([*HAR*](https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones)) in <a href="https://www.python.org/"><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Python_logo_and_wordmark.svg/390px-Python_logo_and_wordmark.svg.png" style="max-width: 120px; display: inline" alt="Python"/></a>  
## Part two: (deep) learning on raw signals with <a href="https://keras.io/"><img src="https://s3.amazonaws.com/keras.io/img/keras-logo-2018-large-1200.png" style="max-width: 100px; display: inline" alt="Keras"/></a>

This notebook covers the activity prediction part. For data exploration, refer to the companion notebook.

## 1 Introduction
### 1.1 Context
The data come from the community working on human activity recognition (*HAR*) from recordings, for example from the gyroscope and accelerometer of a smartphone.
See the [article](https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2013-11.pdf) reporting on a 2013 symposium.  

The publicly available data were acquired, described, and analyzed by [Anguita et al. (2013)](https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2013-84.pdf). They are available from the machine learning [repository](https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones) of the University of California, Irvine (UCI), as well as on *Kaggle*.

The archive contains the raw data: accelerations along x, y, and z, each stored in a file of 128 columns. Further files hold the same accelerations with gravity subtracted, as well as the angular velocities measured by the gyroscope along x, y, and z, for 9 files in total. Only 6 of them are useful, giving 6*128 = 768 measurements per sample.

The learning methods are applied directly to these raw data, without any preliminary computation of features.

### 1.2 Objective
This second stage focuses on the raw data. Can the preliminary work of defining domain-specific features be avoided by using, for example, systematic decompositions on a wavelet basis or a deep learning algorithm?

**Objective** Do as well as with the domain-specific features (96% correctly classified).

### 1.3 Work to be done
**Warning: access to a *GPU* environment is very strongly recommended, if not essential.**
- Modeling and prediction on the test sample with
   - Logistic regression (`Scikit-learn`)
   - Deep learning using `Keras` 
       - MLP on "flattened" signals
       - MLP on multidimensional signals
       - LSTM
       - 1D convolution
       - 2D convolution
   
- To be added to this notebook: 
    - Applying classical (or other) learning methods to the coefficients of the wavelet decompositions of the signals
    - Optimizing the parameters of the different methods
    - Improving the network architectures?


## 2 Setup
### 2.1 Libraries and initialization

In [87]:
import pandas as pd
import numpy as np
import os
import time
import copy
import random
import itertools

#Utils Sklearn
import sklearn.linear_model as lm
from sklearn.metrics import confusion_matrix
from sklearn.svm import SVC, LinearSVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV,train_test_split

%matplotlib notebook
import matplotlib.pyplot as plt
import seaborn as sb
sb.set()

In [3]:
# DEEP LEARNING
import tensorflow as tf
np.random.seed(42)
tf.set_random_seed(42)

# for reproducibility
# https://github.com/fchollet/keras/issues/2280
session_conf = tf.ConfigProto(
    intra_op_parallelism_threads=1,
    inter_op_parallelism_threads=1
)

from keras import backend as K
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

import keras.models as km 
import keras.layers as kl 
import keras.layers.core as klc

Using TensorFlow backend.


### 2.2 Loading the data
#### Sources

The data are the original ones from the [UCI](https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones) repository. They can be downloaded by clicking [here](https://archive.ics.uci.edu/ml/machine-learning-databases/00240/UCI%20HAR%20Dataset.zip).

They are organized here into two sets of different dimensions, each split into a training part and a test part.

1. Multidimensional: each sample consists of 9 time series of 128 time steps, giving an array of *shape* $(N, 128, 9)$
2. One-dimensional: the 9 time series are concatenated into a single vector of 128x9 = 1152 variables, giving an array of *shape* $(N, 1152)$
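The two layouts listed above hold exactly the same numbers, only arranged differently. The sketch below illustrates this on toy sizes, in pure Python so that it runs without NumPy; the sizes `N`, `T`, `S` and the helper `flat_to_multi` are illustrative assumptions, not part of the notebook.

```python
# Toy sizes standing in for N samples, 128 time steps and 9 signals (assumed values).
N, T, S = 2, 4, 3

# signals[k][i][t]: value of signal k for sample i at time step t (one matrix per file).
signals = [[[100 * k + 10 * i + t for t in range(T)] for i in range(N)]
           for k in range(S)]

# One-dimensional layout: for each sample, the S series are laid end to end -> (N, T*S).
X_flat = [[v for k in range(S) for v in signals[k][i]] for i in range(N)]

# Multidimensional layout: for each sample, one row per time step -> (N, T, S).
X_multi = [[[signals[k][i][t] for k in range(S)] for t in range(T)]
           for i in range(N)]


def flat_to_multi(row, T, S):
    """Rebuild the (T, S) layout of one sample from its flattened row."""
    return [[row[k * T + t] for k in range(S)] for t in range(T)]


rebuilt = [flat_to_multi(row, T, S) for row in X_flat]
print(len(X_flat), len(X_flat[0]))                         # N, T*S
print(len(X_multi), len(X_multi[0]), len(X_multi[0][0]))   # N, T, S
print(rebuilt == X_multi)
```

With the real data the same relationship holds with `T = 128` and `S = 9`, so a flattened row has 1152 entries.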
        
Two different objects are built to define the response variable $Y$, because the `Scikit-learn` and `Keras` libraries expect different structures: 

1. `Scikit-Learn`: a vector of shape $(N,)$ holding, for each sample, the integer code of the activity label (1 to 6 as stored in the UCI files).
2. `Keras`: a matrix of shape $(N, 6)$ of indicator variables (0 or 1) for the categories of $Y$.
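The link between the two encodings of $Y$ can be sketched as follows. This is a minimal pure-Python illustration, assuming label codes 1 to 6; the helper names `one_hot` and `argmax` and the toy labels are hypothetical.

```python
def one_hot(labels, classes):
    """Return one indicator row per label, with columns in the order of `classes`."""
    index = {c: j for j, c in enumerate(classes)}
    rows = []
    for label in labels:
        row = [0] * len(classes)
        row[index[label]] = 1
        rows.append(row)
    return rows


def argmax(row):
    """Position of the largest entry of a row."""
    return max(range(len(row)), key=row.__getitem__)


labels = [5, 1, 3, 6]          # toy label vector (hypothetical values)
classes = list(range(1, 7))    # six activity codes, assumed to run from 1 to 6
Y = one_hot(labels, classes)   # indicator matrix, one row per sample
recovered = [classes[argmax(row)] for row in Y]

for label, row in zip(labels, Y):
    print(label, row)
```

Taking the arg-max of each indicator row recovers the column position, which is how predicted probabilities are turned back into a class.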

#### Reading the data

In [4]:
DATADIR_UCI = './../data_har'

SIGNALS = [ "body_acc_x", "body_acc_y", "body_acc_z", "body_gyro_x", "body_gyro_y", "body_gyro_z", "total_acc_x", "total_acc_y", "total_acc_z"]

def my_read_csv(filename):
    # Whitespace-separated values, no header row
    return pd.read_csv(filename, sep=r'\s+', header=None)

def load_signal(data_dir, subset, signal):
    filename = f'{data_dir}/{subset}/Inertial Signals/{signal}_{subset}.txt'
    # `.values` replaces `as_matrix()`, which was removed in pandas 1.0
    x = my_read_csv(filename).values
    return x 

def load_signals(data_dir, subset, flatten = False):
    signals_data = []
    for signal in SIGNALS:
        signals_data.append(load_signal(data_dir, subset, signal)) 
    
    if flatten :
        X = np.hstack(signals_data)
    else:
        X = np.transpose(signals_data, (1, 2, 0))
        
    return X 

def load_y(data_dir, subset, dummies = False):
    filename = f'{data_dir}/{subset}/y_{subset}.txt'
    y = my_read_csv(filename)[0]
    
    
    if dummies:
        # `.values` replaces `as_matrix()`, which was removed in pandas 1.0
        Y = pd.get_dummies(y).values.astype('uint8')
    else:
        Y = y.values
    
    return Y

Checking the dimensions

In [5]:
#Multidimensional Data
X_train, X_test = load_signals(DATADIR_UCI, 'train'), load_signals(DATADIR_UCI, 'test')
# Flattened Data
X_train_flatten, X_test_flatten = load_signals(DATADIR_UCI, 'train', flatten=True), load_signals(DATADIR_UCI, 'test', flatten=True)

# Label Y
Y_train_label, Y_test_label = load_y(DATADIR_UCI, 'train', dummies = False), load_y(DATADIR_UCI, 'test', dummies = False)
#Dummies Y (For Keras)
Y_train_dummies, Y_test_dummies = load_y(DATADIR_UCI, 'train', dummies = True), load_y(DATADIR_UCI, 'test', dummies = True)

N_train = X_train.shape[0]
N_test = X_test.shape[0]

In [6]:
print("Shapes")
print("Multidimensional data: " + str(X_train.shape))
print("One-dimensional data: " + str(X_train_flatten.shape))
print("Response vector (scikit-learn): " + str(Y_train_label.shape))
print("Response matrix (Keras): " + str(Y_train_dummies.shape))

Shapes
Multidimensional data: (7352, 128, 9)
One-dimensional data: (7352, 1152)
Response vector (scikit-learn): (7352,)
Response matrix (Keras): (7352, 6)


#### Utilities

In [7]:
LABELS = ["WALKING","WALKING UPSTAIRS","WALKING DOWNSTAIRS","SITTING","STANDING","LAYING"]
ACTIVITIES = {
    0: 'WALKING',
    1: 'WALKING_UPSTAIRS',
    2: 'WALKING_DOWNSTAIRS',
    3: 'SITTING',
    4: 'STANDING',
    5: 'LAYING',
}


def my_confusion_matrix(Y_true, Y_pred):
    Y_true = pd.Series([ACTIVITIES[y] for y in np.argmax(Y_true, axis=1)])
    Y_pred = pd.Series([ACTIVITIES[y] for y in np.argmax(Y_pred, axis=1)])

    return pd.crosstab(Y_true, Y_pred, rownames=['True'], colnames=['Pred'])

def _count_classes(y):
    return len(set([tuple(category) for category in y]))

### 2.3 Decomposition on a wavelet basis (Haar)

Only the finest decomposition level is kept.

Level 10 was chosen as the deepest decomposition level because $2^{10} = 1024$ is the largest power of two not exceeding our $1152$ variables.
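The sizes involved in a 10-level Haar decomposition of a vector of 1152 values can be worked out by hand. The sketch below assumes that each level produces $\lfloor (n+1)/2 \rfloor$ coefficients from $n$ inputs, which is my understanding of the PyWavelets default boundary mode for a length-2 filter; the helper `haar_coeff_lengths` is hypothetical.

```python
def haar_coeff_lengths(n, level):
    """Lengths of [cA_level, cD_level, ..., cD_1] for a signal of length n."""
    details = []
    for _ in range(level):
        n = (n + 1) // 2          # each level halves the length, rounding up
        details.append(n)
    return [n] + details[::-1]    # coarsest approximation first, finest detail last


n_variables = 1152
max_level = n_variables.bit_length() - 1   # largest k such that 2**k <= n_variables
lengths = haar_coeff_lengths(n_variables, max_level)

print(max_level)       # deepest level
print(lengths[-1])     # size of the finest detail level
print(sum(lengths))    # total number of coefficients
```

Under this assumption the finest level has 576 coefficients and the full decomposition has 1155, which agrees with the shapes printed later in this section.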

In [8]:
import pywt
from statsmodels.robust import mad
import sklearn.decomposition as sd

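`wavelet_transformation()`: function that computes the decomposition on a wavelet basis.

`level` = maximum decomposition level kept (except for level 10, where only the finest level is kept)  

`threshold` = number of finest detail levels to which hard thresholding (with the universal threshold) is applied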
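To make the two ingredients concrete (one Haar level, then hard thresholding with a universal threshold), here is a minimal pure-Python sketch on a toy signal. It is not the notebook's implementation: the helpers are hypothetical, the signal is invented, and the MAD is centered on the median, which may differ from the convention of the `mad` function imported above.

```python
import math
import statistics


def haar_step(x):
    """One level of the orthonormal Haar transform of an even-length list."""
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    return approx, detail


def universal_threshold(finest_detail, n):
    """sigma * sqrt(2 log n), with sigma estimated by a scaled MAD."""
    med = statistics.median(finest_detail)
    mad = statistics.median(abs(c - med) for c in finest_detail)
    sigma = mad / 0.6745          # approximate normal-consistency constant
    return sigma * math.sqrt(2.0 * math.log(n))


def hard_threshold(coeffs, thresh):
    """Set to zero every coefficient whose magnitude is below thresh."""
    return [c if abs(c) >= thresh else 0.0 for c in coeffs]


signal = [2.0, 2.1, 1.9, 2.0, 9.0, 1.0, 2.0, 2.2]   # toy signal with one jump
approx, detail = haar_step(signal)
thresh = universal_threshold(detail, len(signal))
kept = hard_threshold(detail, thresh)

print(detail)
print(thresh)
print(kept)
```

Small detail coefficients, which mostly carry noise, are set to zero, while the large coefficient produced by the jump is kept unchanged.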

In [35]:
wf = "haar"
def wavelet_transformation(X,level=10,threshold=4):
    Coeff = []
    TCoeff = []
    for x in X:
        # Apply wavelet decomposition
        coeffs = pywt.wavedec(x,wf,level=level)
        if level==10:
            Coeff.append(coeffs[-1])
        else:
            coeffs_flatten = np.hstack(coeffs[1:level])
            Coeff.append(coeffs_flatten)
        # Compute universal Threshold http://jseabold.net/blog/2012/02/23/wavelet-regression-in-python/
        sigma = mad(coeffs[-1])
        uthresh = sigma*np.sqrt(2*np.log(128*9))
        # Apply the threshold to the `threshold` finest levels
        coeffs_thresh = [pywt.threshold(c, uthresh, mode="hard") if i<=threshold-1 else c for i,c in enumerate(coeffs[::-1])]
        coeffs_thresh_flatten = np.hstack(coeffs_thresh[::-1])
        TCoeff.append(coeffs_thresh_flatten)
    return np.array(TCoeff),np.array(Coeff)

In [38]:
TCoeff_train,Coeff_train=wavelet_transformation(X_train_flatten)
print(Coeff_train.shape, TCoeff_train.shape)
print(np.sum(Coeff_train!=0), np.sum(TCoeff_train!=0))

(7352, 576) (7352, 1155)
4234724 1703713


In [41]:
TCoeff_test,Coeff_test=wavelet_transformation(X_test_flatten)
print(Coeff_test.shape, TCoeff_test.shape)
print(np.sum(Coeff_test!=0), np.sum(TCoeff_test!=0))

(2947, 576) (2947, 1155)
1697463 691722


## 3 Learning on one-dimensional signals

The training set has shape (`N_train`, 1152).

### 3.1 LINEAR METHODS - RAW DATA

### Logistic Regression

Logistic regression is one of the methods giving the best results on the domain-specific features.

In [41]:
t_start = time.time()
model_lr = lm.LogisticRegression(verbose=1)
model_lr.fit(X_train_flatten, Y_train_label)
t_end = time.time()
t_learning = t_end-t_start
score = model_lr.score(X_test_flatten, Y_test_label)
print("\n Score With Logistic Regression on Inertial Signals = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
lr_prediction_label = model_lr.predict(X_test_flatten)
metadata_lr = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(lr_prediction_label, Y_test_label), index = LABELS, columns=LABELS)

[LibLinear]
 Score With Logistic Regression on Inertial Signals = 57.45, 
 Learning time = 28.37 secondes


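| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 120 | 63 | 97 | 0 | 1 | 0 |
| WALKING UPSTAIRS | 74 | 218 | 56 | 23 | 72 | 27 |
| WALKING DOWNSTAIRS | 92 | 66 | 103 | 1 | 2 | 0 |
| SITTING | 79 | 32 | 58 | 397 | 112 | 0 |
| STANDING | 131 | 92 | 106 | 70 | 345 | 0 |
| LAYING | 0 | 0 | 0 | 0 | 0 | 510 |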


**Q** What can be said about the performance?  
Logistic regression on the raw data falls far short of the results obtained on the domain-specific features (about 57% against 96% correctly classified). There are many confusions, even between active and passive classes, which were easier to separate with the domain-specific features. Note, however, that the `laying` class is well separated from the others. 
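The overall score and the per-class behaviour can be read directly from the confusion matrix printed above. The sketch below recomputes them in pure Python from those counts; since the code calls `confusion_matrix(prediction, truth)`, rows are predicted classes and columns are true classes.

```python
LABELS = ["WALKING", "WALKING UPSTAIRS", "WALKING DOWNSTAIRS",
          "SITTING", "STANDING", "LAYING"]

# Counts copied from the logistic regression output: rows = predicted, columns = true.
cm = [
    [120,  63,  97,   0,   1,   0],
    [ 74, 218,  56,  23,  72,  27],
    [ 92,  66, 103,   1,   2,   0],
    [ 79,  32,  58, 397, 112,   0],
    [131,  92, 106,  70, 345,   0],
    [  0,   0,   0,   0,   0, 510],
]

total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(cm))) / total

# Recall of class j = correct predictions / number of true samples (column sum).
recall = {LABELS[j]: cm[j][j] / sum(row[j] for row in cm) for j in range(len(cm))}

print(f"accuracy = {100 * accuracy:.2f}%")
for name, value in recall.items():
    print(f"{name:20s} recall = {100 * value:.1f}%")
```

The accuracy recomputed this way matches the printed score of 57.45%, and the recall of `LAYING` (510 out of 537) is far higher than that of the walking classes.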

### Linear Discriminant Analysis


In [42]:
ts = time.time()
model_lda = LinearDiscriminantAnalysis()
model_lda=model_lda.fit(X_train_flatten, Y_train_label)
score = model_lda.score(X_test_flatten, Y_test_label)
ypred = model_lda.predict(X_test_flatten)
te = time.time()
t_learning = te-ts
print("\n Score With Linear Discriminant Analysis on Inertial Signals = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
lda_prediction_label = model_lda.predict(X_test_flatten)
metadata_lda = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(lda_prediction_label, Y_test_label), index = LABELS, columns=LABELS)


 Score With Linear Discriminant Analysis on Inertial Signals = 58.84, 
 Learning time = 1.15 secondes




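| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 132 | 93 | 114 | 0 | 4 | 0 |
| WALKING UPSTAIRS | 74 | 146 | 58 | 23 | 6 | 0 |
| WALKING DOWNSTAIRS | 117 | 69 | 113 | 2 | 3 | 0 |
| SITTING | 15 | 23 | 18 | 313 | 26 | 0 |
| STANDING | 158 | 140 | 117 | 153 | 493 | 0 |
| LAYING | 0 | 0 | 0 | 0 | 0 | 537 |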


### 3.2 LINEAR METHODS - WAVELET COEFFICIENTS

### Logistic Regression

In [45]:
TCoeff_train,Coeff_train=wavelet_transformation(X_train_flatten,4)
TCoeff_test,Coeff_test=wavelet_transformation(X_test_flatten,4)

In [46]:
t_start = time.time()
model_lr_haar = lm.LogisticRegression(verbose=1)
model_lr_haar.fit(Coeff_train, Y_train_label)
t_end = time.time()
t_learning = t_end-t_start
score = model_lr_haar.score(Coeff_test, Y_test_label)
print("\n Score With Logistic Regression on Inertial Signals (Haar)= %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
lr_haar_prediction_label = model_lr_haar.predict(Coeff_test)
metadata_lr_haar = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(lr_haar_prediction_label, Y_test_label), index = LABELS, columns=LABELS)

[LibLinear]
 Score With Logistic Regression on Inertial Signals (Haar)= 31.52, 
 Learning time = 9.03 secondes


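| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 128 | 88 | 103 | 2 | 5 | 2 |
| WALKING UPSTAIRS | 91 | 118 | 89 | 1 | 0 | 0 |
| WALKING DOWNSTAIRS | 108 | 95 | 111 | 1 | 5 | 0 |
| SITTING | 14 | 15 | 9 | 16 | 7 | 0 |
| STANDING | 69 | 68 | 39 | 75 | 98 | 77 |
| LAYING | 86 | 87 | 69 | 396 | 417 | 458 |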


### Linear Discriminant Analysis

In [45]:
t_start = time.time()
model_lda_haar = LinearDiscriminantAnalysis()
model_lda_haar.fit(Coeff_train, Y_train_label)
t_end = time.time()
t_learning = t_end-t_start
score = model_lda_haar.score(Coeff_test, Y_test_label)
print("\n Score With Linear Discriminant Analysis on Inertial Signals (Haar) = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
lda_haar_prediction_label = model_lda_haar.predict(Coeff_test)
metadata_lda_haar = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(lda_haar_prediction_label, Y_test_label), index = LABELS, columns=LABELS)


 Score With Linear Discriminant Analysis on Inertial Signals (Haar) = 34.14, 
 Learning time = 0.47 secondes




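| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 150 | 73 | 100 | 2 | 3 | 0 |
| WALKING UPSTAIRS | 84 | 127 | 59 | 7 | 3 | 5 |
| WALKING DOWNSTAIRS | 101 | 79 | 112 | 2 | 4 | 0 |
| SITTING | 40 | 29 | 26 | 54 | 55 | 25 |
| STANDING | 62 | 56 | 67 | 360 | 387 | 331 |
| LAYING | 59 | 107 | 56 | 66 | 80 | 176 |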


### 3.3 SVM, RF, GBM - RAW DATA

### Linear SVM

In [47]:
from sklearn.svm import SVC, LinearSVC
t_start = time.time()
model_lsvm = LinearSVC(verbose=1)
model_lsvm.fit(X_train_flatten, Y_train_label)
t_end = time.time()
t_learning = t_end-t_start
score = model_lsvm.score(X_test_flatten, Y_test_label)
print("\nScore With Linear SVC on Inertial Signals = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
lsvm_prediction_label = model_lsvm.predict(X_test_flatten)
metadata_lsvm = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(lsvm_prediction_label, Y_test_label), index = LABELS, columns=LABELS)

[LibLinear]
Score With Linear SVC on Inertial Signals = 56.40, 
 Learning time = 60.52 secondes




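| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 125 | 81 | 116 | 0 | 3 | 0 |
| WALKING UPSTAIRS | 70 | 204 | 56 | 24 | 70 | 27 |
| WALKING DOWNSTAIRS | 74 | 54 | 88 | 1 | 0 | 0 |
| SITTING | 92 | 35 | 62 | 395 | 119 | 0 |
| STANDING | 135 | 97 | 98 | 71 | 340 | 0 |
| LAYING | 0 | 0 | 0 | 0 | 0 | 510 |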


### Nonlinear SVM

In [48]:
t_start = time.time()
model_svm = SVC(verbose=1)
model_svm.fit(X_train_flatten, Y_train_label)
t_end = time.time()
t_learning = t_end-t_start
score = model_svm.score(X_test_flatten, Y_test_label)
print("\nScore With non linear SVC on Inertial Signals = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
svm_prediction_label = model_svm.predict(X_test_flatten)
metadata_svm = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(svm_prediction_label, Y_test_label), index = LABELS, columns=LABELS)

[LibSVM]
Score With non linear SVC on Inertial Signals = 76.96, 
 Learning time = 26.88 secondes


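| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 407 | 152 | 150 | 1 | 2 | 0 |
| WALKING UPSTAIRS | 34 | 261 | 50 | 6 | 0 | 0 |
| WALKING DOWNSTAIRS | 39 | 58 | 217 | 0 | 0 | 0 |
| SITTING | 0 | 0 | 0 | 371 | 55 | 0 |
| STANDING | 16 | 0 | 3 | 113 | 475 | 0 |
| LAYING | 0 | 0 | 0 | 0 | 0 | 537 |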


### Random Forest

In [49]:
ts = time.time()
param=[{"n_estimators":list(range(10,210,20))}]
model_rf= GridSearchCV(RandomForestClassifier(),param,cv=5,n_jobs=-1)
model_rf=model_rf.fit(X_train_flatten,Y_train_label)
score = model_rf.score(X_test_flatten, Y_test_label)
ypred = model_rf.predict(X_test_flatten)
te = time.time()
t_learning = te-ts
print("\nScore With Random Forest on Inertial Signals = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
rf_prediction_label = model_rf.predict(X_test_flatten)
metadata_rf = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(rf_prediction_label, Y_test_label), index = LABELS, columns=LABELS)


Score With Random Forest on Inertial Signals = 84.76, 
 Learning time = 123.76 secondes


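| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 423 | 72 | 29 | 2 | 1 | 0 |
| WALKING UPSTAIRS | 28 | 371 | 18 | 8 | 3 | 0 |
| WALKING DOWNSTAIRS | 45 | 28 | 373 | 0 | 0 | 0 |
| SITTING | 0 | 0 | 0 | 386 | 120 | 0 |
| STANDING | 0 | 0 | 0 | 95 | 408 | 0 |
| LAYING | 0 | 0 | 0 | 0 | 0 | 537 |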


### Gradient Boosting Classifier

In [14]:
ts = time.time()
#param=[{"max_depth":list(range(1,16,5)),"n_estimators":list(range(10,210,50)),
#       "learning_rate":list([0.1,0.3,0.5,0.7,0.9])}]
#model_gb = GridSearchCV(GradientBoostingClassifier(),param,cv=5,n_jobs=-1)
model_gb = GradientBoostingClassifier()
model_gb=model_gb.fit(X_train_flatten, Y_train_label)
score = model_gb.score(X_test_flatten, Y_test_label)
ypred = model_gb.predict(X_test_flatten)
te = time.time()
t_learning = te-ts
print("\n Score With Gradient Boosting on Inertial Signals = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
gb_prediction_label = model_gb.predict(X_test_flatten)
metadata_gb = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(gb_prediction_label, Y_test_label), index = LABELS, columns=LABELS)


 Score With Gradient Boosting on Inertial Signals = 87.61, 
 Learning time = 272.21 secondes


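| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 479 | 97 | 50 | 0 | 1 | 0 |
| WALKING UPSTAIRS | 4 | 350 | 21 | 11 | 7 | 0 |
| WALKING DOWNSTAIRS | 13 | 24 | 349 | 0 | 0 | 0 |
| SITTING | 0 | 0 | 0 | 405 | 62 | 0 |
| STANDING | 0 | 0 | 0 | 75 | 462 | 0 |
| LAYING | 0 | 0 | 0 | 0 | 0 | 537 |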


### XGBoost

In [None]:
from xgboost import XGBClassifier
t_start = time.time()
param=[{"n_estimators":[50,100,200]}]
model_xgb =  GridSearchCV(XGBClassifier(),param,cv=10,n_jobs=-1)
model_xgb.fit(X_train_flatten, Y_train_label)
t_end = time.time()
t_learning = t_end-t_start
score = model_xgb.score(X_test_flatten, Y_test_label)
print("\n Score With XGBoost on Inertial Signals = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
xgb_prediction_label = model_xgb.predict(X_test_flatten)
metadata_xgb = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(xgb_prediction_label, Y_test_label), index = LABELS, columns=LABELS)

### 3.4 SVM, RF, GBM - WAVELET COEFFICIENTS

### Linear SVM

In [50]:
t_start = time.time()
model_lsvm_haar = LinearSVC(verbose=1)
model_lsvm_haar.fit(Coeff_train, Y_train_label)
t_end = time.time()
t_learning = t_end-t_start
score = model_lsvm_haar.score(Coeff_test, Y_test_label)
print("\nScore With Linear SVC on Inertial Signals (Haar) = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
lsvm_haar_prediction_label = model_lsvm_haar.predict(Coeff_test)
metadata_lsvm_haar = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(lsvm_haar_prediction_label, Y_test_label), index = LABELS, columns=LABELS)

[LibLinear]
Score With Linear SVC on Inertial Signals (Haar) = 32.98, 
 Learning time = 48.00 secondes




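| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 147 | 118 | 107 | 3 | 3 | 0 |
| WALKING UPSTAIRS | 79 | 108 | 77 | 1 | 5 | 2 |
| WALKING DOWNSTAIRS | 135 | 103 | 117 | 0 | 7 | 1 |
| SITTING | 14 | 10 | 14 | 5 | 2 | 0 |
| STANDING | 44 | 60 | 43 | 104 | 109 | 48 |
| LAYING | 77 | 72 | 62 | 378 | 406 | 486 |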


### Nonlinear SVM

In [12]:
t_start = time.time()
model_svm_haar = SVC(verbose=1)
model_svm_haar.fit(Coeff_train, Y_train_label)
t_end = time.time()
t_learning = t_end-t_start
score = model_svm_haar.score(Coeff_test, Y_test_label)
print("\nScore With non linear SVC on Inertial Signals  (Haar)= %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
svm_haar_prediction_label = model_svm_haar.predict(Coeff_test)
metadata_svm_haar = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(svm_haar_prediction_label, Y_test_label), index = LABELS, columns=LABELS)

[LibSVM]
Score With non linear SVC on Inertial Signals  (Haar)= 18.22, 
 Learning time = 46.04 secondes


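| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 0 | 0 | 0 | 0 | 0 | 0 |
| WALKING UPSTAIRS | 0 | 0 | 0 | 0 | 0 | 0 |
| WALKING DOWNSTAIRS | 0 | 0 | 0 | 0 | 0 | 0 |
| SITTING | 0 | 0 | 0 | 0 | 0 | 0 |
| STANDING | 0 | 0 | 0 | 0 | 0 | 0 |
| LAYING | 496 | 471 | 420 | 491 | 532 | 537 |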


### Random Forest

In [60]:
TCoeff_train,Coeff_train=wavelet_transformation(X_train_flatten,10,4)
TCoeff_test,Coeff_test=wavelet_transformation(X_test_flatten,10,4)

~94% with threshold=4, level=10

In [61]:
ts = time.time()
param=[{"n_estimators":list(range(10,210,20))}]
model_rf_haar= GridSearchCV(RandomForestClassifier(),param,cv=5,n_jobs=-1)
model_rf_haar=model_rf_haar.fit(TCoeff_train,Y_train_label)
score = model_rf_haar.score(TCoeff_test, Y_test_label)
ypred = model_rf_haar.predict(TCoeff_test)
te = time.time()
t_learning = te-ts
print("\nScore With Random Forest on Inertial Signals (Haar) = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
rf_harr_prediction_label = model_rf_haar.predict(TCoeff_test)
metadata_rf_haar = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(rf_harr_prediction_label, Y_test_label), index = LABELS, columns=LABELS)


Score With Random Forest on Inertial Signals (Haar) = 93.69, 
 Learning time = 58.11 secondes


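| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 477 | 4 | 7 | 0 | 0 | 0 |
| WALKING UPSTAIRS | 5 | 457 | 10 | 2 | 0 | 0 |
| WALKING DOWNSTAIRS | 14 | 10 | 403 | 0 | 0 | 0 |
| SITTING | 0 | 0 | 0 | 392 | 36 | 0 |
| STANDING | 0 | 0 | 0 | 91 | 495 | 0 |
| LAYING | 0 | 0 | 0 | 6 | 1 | 537 |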


### Gradient Boosting Classifier

In [63]:
TCoeff_train,Coeff_train=wavelet_transformation(X_train_flatten,6,4)
TCoeff_test,Coeff_test=wavelet_transformation(X_test_flatten,6,4)

In [65]:
ts = time.time()
#param=[{"max_depth":list(range(1,16,5)),"n_estimators":list(range(10,210,50)),
#       "learning_rate":list([0.1,0.3,0.5,0.7,0.9])}]
#model_gb_haar = GridSearchCV(GradientBoostingClassifier(),param,cv=5,n_jobs=-1)
model_gb_haar = GradientBoostingClassifier()
model_gb_haar=model_gb_haar.fit(TCoeff_train, Y_train_label)
score = model_gb_haar.score(TCoeff_test, Y_test_label)
ypred = model_gb_haar.predict(TCoeff_test)
te = time.time()
t_learning = te-ts
print("\n Score With Gradient Boosting on Inertial Signals (Haar) = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
gb_haar_prediction_label = model_gb_haar.predict(TCoeff_test)
metadata_gb_haar = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(gb_haar_prediction_label, Y_test_label), index = LABELS, columns=LABELS)


 Score With Gradient Boosting on Inertial Signals (Haar) = 89.55, 
 Learning time = 216.90 secondes


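| Predicted \ True | WALKING | WALKING UPSTAIRS | WALKING DOWNSTAIRS | SITTING | STANDING | LAYING |
|---|---|---|---|---|---|---|
| WALKING | 471 | 44 | 31 | 0 | 1 | 0 |
| WALKING UPSTAIRS | 19 | 409 | 38 | 2 | 3 | 0 |
| WALKING DOWNSTAIRS | 6 | 16 | 351 | 0 | 0 | 0 |
| SITTING | 0 | 2 | 0 | 390 | 47 | 0 |
| STANDING | 0 | 0 | 0 | 99 | 481 | 0 |
| LAYING | 0 | 0 | 0 | 0 | 0 | 537 |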


### XGBoost

In [None]:
t_start = time.time()
param=[{"n_estimators":[50,100,200]}]
model_xgb_haar =  GridSearchCV(XGBClassifier(),param,cv=10,n_jobs=-1)
model_xgb_haar.fit(Coeff_train, Y_train_label)
t_end = time.time()
t_learning = t_end-t_start
score = model_xgb_haar.score(Coeff_test, Y_test_label)
print("\n Score With XGBoost on Inertial Signals = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
xgb_haar_prediction_label = model_xgb_haar.predict(Coeff_test)
metadata_xgb_haar = {"time_learning" : t_learning, "score" : score}
pd.DataFrame(confusion_matrix(xgb_haar_prediction_label, Y_test_label), index = LABELS, columns=LABELS)

## 4 Deep Learning on the raw one-dimensional data

### Multilayer perceptron

A standard neural network is trained on the data in the same format as before.

**Q** Explain the parameter choices and hence the structure of the network.  

The batch size does not divide the number of training samples (7352), but `Keras` does not treat this as a constraint: the last batch of each epoch is simply smaller. 
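The batch arithmetic for this setting is easy to check. A minimal sketch, using the training set size and batch size that appear in this notebook:

```python
import math

n_samples, batch_size = 7352, 32

# Number of gradient updates per epoch when the last, incomplete batch is kept.
steps_per_epoch = math.ceil(n_samples / batch_size)
# Size of that last batch.
last_batch = n_samples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, last_batch)   # 230 24
```

Each epoch therefore runs 229 full batches of 32 samples followed by one batch of 24.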

The network consists of two simple perceptron (`Dense`) layers separated by a `Dropout` layer (commented out in the code below). The input layer takes the size of the input data as a parameter: since the network has no convolution layer, the data can be supplied in 1D format (a vector of size 1152 of the concatenated series) as well as in 2D format. It outputs `n_hidden` neurons. The activation function is `relu`, widely used because it is cheap to compute and its gradient does not vanish for positive inputs, which eases gradient backpropagation.  

The `Dropout` layer randomly sets 50% of its inputs to zero at each training step. This helps to avoid overfitting.  
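The mechanism can be sketched in a few lines of pure Python. This is an illustration of "inverted" dropout, where kept values are rescaled by 1/(1 - rate) during training and the layer is the identity at inference, which is the convention Keras follows; the function `dropout` below is a hypothetical helper, not the Keras implementation.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: zero each input with probability `rate` while training."""
    if not training:
        return list(values)            # identity at inference time
    keep = 1.0 - rate
    # Kept values are scaled by 1/keep so the expected sum is unchanged.
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(42)                # fixed seed for a reproducible illustration
x = [1.0] * 1000
y_train = dropout(x, rate=0.5, training=True, rng=rng)
y_test = dropout(x, rate=0.5, training=False, rng=rng)

dropped = sum(1 for v in y_train if v == 0.0)
print(dropped, sum(y_train) / len(y_train))
```

Roughly half of the inputs are zeroed and the surviving ones are doubled, so the average activation stays close to its original value.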

The output layer has 6 neurons, one per activity. Since the targets are indicator vectors (1 for the true activity, 0 otherwise), the `softmax` activation is chosen so that the 6 outputs form a probability distribution over the classes. 
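As a small worked example of the output layer and of the `categorical_crossentropy` loss used at compile time, here is a pure-Python sketch. The logits are invented for illustration and the two helper functions are hypothetical stand-ins for what Keras computes internally.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                              # subtract the max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-12):
    """Minus the log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


logits = [2.0, 0.5, 0.1, -1.0, 0.0, 3.0]   # hypothetical pre-activations, 6 classes
target = [0, 0, 0, 0, 0, 1]                # indicator vector of the true class
probs = softmax(logits)
loss = categorical_crossentropy(target, probs)
predicted_class = max(range(len(probs)), key=probs.__getitem__)

print([round(p, 3) for p in probs])
print(predicted_class, loss)
```

The predicted class is the position of the largest probability, and the loss shrinks toward 0 as the probability of the true class approaches 1.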

In [67]:
epochs = 10
batch_size = 32
n_hidden = 32

n_features = X_train_flatten.shape[1]
n_classes=6


model_base_mlp_u =km.Sequential()
model_base_mlp_u.add(kl.Dense(n_hidden, input_shape=(n_features,),  activation = "relu"))
#model_base_mlp_u.add(kl.Dropout(0.5))
#model_base_mlp_u.add(kl.Dense(16,  activation = "relu"))
model_base_mlp_u.add(kl.Dense(n_classes, activation='softmax'))
model_base_mlp_u.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

model_base_mlp_u.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_3 (Dense)              (None, 32)                36896     
_________________________________________________________________
dense_4 (Dense)              (None, 6)                 198       
Total params: 37,094
Trainable params: 37,094
Non-trainable params: 0
_________________________________________________________________
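The parameter counts in the summary above can be checked by hand: a `Dense` layer has one weight per (input, unit) pair plus one bias per unit. A minimal sketch, where `dense_params` is a hypothetical helper:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs x n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


n_features, n_hidden, n_classes = 1152, 32, 6
hidden = dense_params(n_features, n_hidden)
output = dense_params(n_hidden, n_classes)

print(hidden, output, hidden + output)   # 36896 198 37094
```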


In [70]:
t_start = time.time()
model_base_mlp_u.fit(X_train_flatten,  Y_train_dummies, batch_size=batch_size, validation_data=(X_test_flatten, Y_test_dummies), epochs=epochs)
t_end = time.time()
t_learning = t_end-t_start

score = model_base_mlp_u.evaluate(X_test_flatten, Y_test_dummies)[1] 
print("\nScore With Simple MLP on Inertial Signals = %.2f, \nLearning time = %.2f secondes" %(score*100, t_learning) )
metadata_mlp_u = {"time_learning" : t_learning, "score" : score}
base_mlp_u_prediction = model_base_mlp_u.predict(X_test_flatten)

my_confusion_matrix(Y_test_dummies, base_mlp_u_prediction)

Train on 7352 samples, validate on 2947 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Score With Simple MLP on Inertial Signals = 86.70, 
Learning time = 6.49 secondes


| True \ Pred | LAYING | SITTING | STANDING | WALKING | WALKING_DOWNSTAIRS | WALKING_UPSTAIRS |
|---|---|---|---|---|---|---|
| LAYING | 510 | 0 | 0 | 0 | 0 | 27 |
| SITTING | 0 | 412 | 56 | 0 | 0 | 23 |
| STANDING | 0 | 119 | 405 | 0 | 2 | 6 |
| WALKING | 0 | 6 | 16 | 443 | 20 | 11 |
| WALKING_DOWNSTAIRS | 0 | 6 | 9 | 10 | 371 | 24 |
| WALKING_UPSTAIRS | 0 | 6 | 3 | 28 | 20 | 414 |


**Q**: What can be concluded from these results in terms of performance and training time? How do they compare with logistic regression?  

This relatively simple neural network achieves much better results on the raw data than logistic regression. Apart from the `Walking_upstairs` activity, active and passive classes are fairly well separated. There is still room for improvement within each group (active and passive). 

As for computation time, in the GPU environment loading the data takes 25 seconds, but training itself is very fast: under 7 seconds. Once the data are loaded, the neural network is therefore faster than logistic regression. 

**Exo**: What is the effect of adding a new layer? Of removing the Dropout?  

Adding a `Dense` layer with 32 or 16 neurons does not improve the network's performance. On the contrary, it increases execution time, so it is better to leave it out. 

Removing the `Dropout` layer lowers the loss on the test sample and gives better results. It is therefore not needed here, especially since the time saved by including it is minimal (about 1 s). 

## 5 Deep Learning on multidimensional signals
The signals are not concatenated into a single signal but processed in parallel.

### 5.1 Multilayer perceptron
**Q** Explain the parameter choices and hence the structure of the network.

The network trained is the same as before, except that the input data are in a different format. The series have not been concatenated, so each input is a matrix. `input_dim` is the number of series and `timesteps` the length of a series. The `Reshape` layer brings the data back to 1D format. 

The `Dropout` layer, deemed unnecessary here, has been removed. 
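When a Keras `Dense` layer receives a 3D input of shape (batch, `timesteps`, `input_dim`), it acts on the last axis, so the same weights are applied at every time step. The pure-Python sketch below illustrates this on toy sizes; the weights, the sample and the helper `dense_relu` are invented for illustration.

```python
def dense_relu(x, W, b):
    """Dense layer with ReLU applied to one vector x: relu(x W + b)."""
    return [max(0.0, sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j])
            for j in range(len(b))]


timesteps, input_dim, n_hidden = 4, 3, 2   # toy sizes (128, 9 and 32 in the notebook)
W = [[0.1 * (i + j) for j in range(n_hidden)] for i in range(input_dim)]
b = [0.0] * n_hidden

# One sample: a (timesteps, input_dim) matrix.
sample = [[float(t + k) for k in range(input_dim)] for t in range(timesteps)]

# The same W and b are reused at every time step -> (timesteps, n_hidden).
hidden = [dense_relu(row, W, b) for row in sample]
# Reshape: lay the time steps end to end -> (timesteps * n_hidden,).
flat = [h for row in hidden for h in row]

# Parameters of the shared layer do not depend on the number of time steps.
n_params = input_dim * n_hidden + n_hidden
print(len(hidden), len(hidden[0]), len(flat), n_params)
```

With the notebook's sizes the shared layer has 9 x 32 + 32 = 320 parameters, and the flattened vector fed to the output layer has 128 x 32 = 4096 entries.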



In [71]:
n_hidden = 32

timesteps = len(X_train[0])
input_dim = len(X_train[0][0])
n_classes = 6

model_base_mlp =km.Sequential()
model_base_mlp.add(kl.Dense(n_hidden, input_shape=(timesteps, input_dim),  activation = "relu"))
model_base_mlp.add(kl.Reshape((timesteps*n_hidden,) , input_shape= (timesteps, n_hidden)  ))
model_base_mlp.add(kl.Dense(n_classes, activation='softmax'))

model_base_mlp.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

t_start = time.time()
model_base_mlp.fit(X_train,  Y_train_dummies, batch_size=batch_size, validation_data=(X_test, Y_test_dummies), epochs=epochs)
t_end = time.time()
t_learning = t_end-t_start

score = model_base_mlp.evaluate(X_test, Y_test_dummies)[1] 
print("\nScore With Simple MLP on Multidimensional Inertial Signals = %.2f, \nLearning time = %.2f secondes" %(score*100, t_learning) )
metadata_mlp = {"time_learning" : t_learning, "score" : score}
base_mlp_prediction = model_base_mlp.predict(X_test)

my_confusion_matrix(Y_test_dummies, base_mlp_prediction)

Train on 7352 samples, validate on 2947 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Score With Simple MLP on Multidimensional Inertial Signals = 86.83, 
Learning time = 6.23 secondes


| True / Pred | LAYING | SITTING | STANDING | WALKING | WALKING_DOWNSTAIRS | WALKING_UPSTAIRS |
|---|---|---|---|---|---|---|
| LAYING | 536 | 0 | 0 | 0 | 0 | 1 |
| SITTING | 0 | 400 | 66 | 0 | 0 | 25 |
| STANDING | 0 | 81 | 438 | 0 | 0 | 13 |
| WALKING | 0 | 0 | 0 | 398 | 87 | 11 |
| WALKING_DOWNSTAIRS | 0 | 1 | 0 | 33 | 379 | 7 |
| WALKING_UPSTAIRS | 0 | 0 | 0 | 28 | 35 | 408 |


In [72]:
model_base_mlp.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_5 (Dense)              (None, 128, 32)           320       
_________________________________________________________________
reshape_2 (Reshape)          (None, 4096)              0         
_________________________________________________________________
dense_6 (Dense)              (None, 6)                 24582     
Total params: 24,902
Trainable params: 24,902
Non-trainable params: 0
_________________________________________________________________
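The parameter counts in the summary above can be checked by hand. The sketch below is only an arithmetic illustration, assuming `timesteps = 128` and `input_dim = 9` as implied by the output shapes; it does not build the Keras model.

```python
# Parameter counts for the MLP applied to (timesteps, input_dim) inputs.
timesteps, input_dim = 128, 9   # assumed from the output shapes in the summary
n_hidden, n_classes = 32, 6

# A Dense layer on a 3D input acts on the last axis only, so the same
# (input_dim x n_hidden) weights are shared across all time steps.
dense_1 = input_dim * n_hidden + n_hidden

# Reshape has no weights; it flattens (timesteps, n_hidden) into one vector.
flat_size = timesteps * n_hidden

# The output layer sees the whole flattened vector.
dense_2 = flat_size * n_classes + n_classes

print(dense_1, flat_size, dense_2, dense_1 + dense_2)
# -> 320 4096 24582 24902
```

Most of the weights sit in the output layer, because it is the only layer that combines information across time steps.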


### 5.2 *Long Short-Term Memory (LSTM)*
A network with an LSTM layer is tested, the idea being to capture the temporal structure of the data.

In [73]:
n_hidden = 32
#default stateful = False

timesteps = len(X_train[0])
input_dim = len(X_train[0][0])
n_classes = 6

batch_size = 64
model_base_lstm = km.Sequential()
model_base_lstm.add(kl.LSTM(n_hidden, input_shape=(timesteps, input_dim)))
model_base_lstm.add(kl.Dense(n_classes, activation='softmax'))

model_base_lstm.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

model_base_lstm.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 32)                5376      
_________________________________________________________________
dense_7 (Dense)              (None, 6)                 198       
Total params: 5,574
Trainable params: 5,574
Non-trainable params: 0
_________________________________________________________________
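The 5,376 parameters of the LSTM layer follow from the standard LSTM formulation: three gates plus one candidate cell state, each with an input kernel, a recurrent kernel and a bias. The helper `lstm_param_count` below is a hypothetical name introduced for this check, and `input_dim = 9` is assumed from the data format.

```python
def lstm_param_count(input_dim, n_units):
    """Number of weights in a standard LSTM layer.

    Each of the 4 blocks (input, forget, output gates and cell candidate)
    has an input kernel, a recurrent kernel and a bias vector.
    """
    per_block = n_units * (input_dim + n_units) + n_units
    return 4 * per_block


lstm_params = lstm_param_count(input_dim=9, n_units=32)  # assumed dimensions
dense_params = 32 * 6 + 6                                # softmax output layer

print(lstm_params, dense_params, lstm_params + dense_params)
# -> 5376 198 5574
```

The count does not depend on `timesteps`: the same weights are reused at every time step, which is why this model is much smaller than the MLP even though it is slower to train.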


In [74]:
# Keras default is shuffle=True; results were better with shuffle=True
t_start = time.time()
model_base_lstm.fit(X_train,  Y_train_dummies, batch_size=batch_size, validation_data=(X_test, Y_test_dummies), epochs=epochs, shuffle=False)
t_end = time.time()
t_learning = t_end-t_start

score = model_base_lstm.evaluate(X_test, Y_test_dummies)[1] 
print("\n Score With Simple LSTM on Multidimensional Inertial Signals = %.2f, \nLearning time = %.2f secondes" %(score*100, t_learning) )
metadata_lstm = {"time_learning" : t_learning, "score" : score}
base_lstm_prediction = model_base_lstm.predict(X_test)

my_confusion_matrix(Y_test_dummies, base_lstm_prediction)

Train on 7352 samples, validate on 2947 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
 Score With Simple LSTM on Multidimensional Inertial Signals = 75.30, 
Learning time = 146.68 secondes


| True / Pred | LAYING | SITTING | STANDING | WALKING | WALKING_DOWNSTAIRS | WALKING_UPSTAIRS |
|---|---|---|---|---|---|---|
| LAYING | 510 | 0 | 0 | 0 | 0 | 27 |
| SITTING | 0 | 364 | 101 | 21 | 1 | 4 |
| STANDING | 0 | 59 | 453 | 2 | 2 | 16 |
| WALKING | 0 | 11 | 88 | 236 | 38 | 123 |
| WALKING_DOWNSTAIRS | 0 | 4 | 11 | 71 | 290 | 44 |
| WALKING_UPSTAIRS | 0 | 6 | 16 | 43 | 40 | 366 |


### 5.3 Network with a 1D convolutional layer (*ConvNet*)

In [76]:
timesteps = len(X_train[0])
input_dim = len(X_train[0][0])
n_classes = 6

model_base_conv_1D = km.Sequential()
model_base_conv_1D.add(kl.Conv1D(32, 9, activation='relu', input_shape=(timesteps, input_dim)))
model_base_conv_1D.add(kl.MaxPooling1D(pool_size=3))
model_base_conv_1D.add(kl.Flatten())
model_base_conv_1D.add(kl.Dense(n_classes, activation='softmax'))
model_base_conv_1D.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
model_base_conv_1D.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_2 (Conv1D)            (None, 120, 32)           2624      
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 40, 32)            0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 1280)              0         
_________________________________________________________________
dense_9 (Dense)              (None, 6)                 7686      
Total params: 10,310
Trainable params: 10,310
Non-trainable params: 0
_________________________________________________________________
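The output shapes in this summary follow from the usual rule for a convolution or pooling without padding. The sketch below recomputes them; `conv_output_length` is a hypothetical helper, and `timesteps = 128`, `input_dim = 9` are assumed from the data format.

```python
def conv_output_length(length, kernel_size, stride=1):
    """Output length of an unpadded ('valid') convolution or pooling."""
    return (length - kernel_size) // stride + 1


timesteps, input_dim = 128, 9            # assumed input format
n_filters, kernel_size, pool_size = 32, 9, 3
n_classes = 6

conv_len = conv_output_length(timesteps, kernel_size)
# MaxPooling1D uses a stride equal to the pool size by default.
pool_len = conv_output_length(conv_len, pool_size, stride=pool_size)
flat_size = pool_len * n_filters

# Each filter spans kernel_size time steps across all input channels.
conv_params = kernel_size * input_dim * n_filters + n_filters
dense_params = flat_size * n_classes + n_classes

print(conv_len, pool_len, flat_size)
# -> 120 40 1280
print(conv_params, dense_params, conv_params + dense_params)
# -> 2624 7686 10310
```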


In [77]:
t_start = time.time()
model_base_conv_1D.fit(X_train,  Y_train_dummies, batch_size=batch_size, validation_data=(X_test, Y_test_dummies), epochs=epochs)
t_end = time.time()
t_learning = t_end-t_start

score = model_base_conv_1D.evaluate(X_test, Y_test_dummies)[1] 
print("\n Score With Conv on Multidimensional Inertial Signals = %.2f, \n Learning time = %.2f secondes" %(score*100, t_learning) )
metadata_conv = {"time_learning" : t_learning, "score" : score}
base_conv_1D_prediction = model_base_conv_1D.predict(X_test)

my_confusion_matrix(Y_test_dummies, base_conv_1D_prediction)

Train on 7352 samples, validate on 2947 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
 Score With Conv on Multidimensional Inertial Signals = 91.62, 
 Learning time = 4.12 secondes


| True / Pred | LAYING | SITTING | STANDING | WALKING | WALKING_DOWNSTAIRS | WALKING_UPSTAIRS |
|---|---|---|---|---|---|---|
| LAYING | 537 | 0 | 0 | 0 | 0 | 0 |
| SITTING | 0 | 378 | 108 | 0 | 0 | 5 |
| STANDING | 0 | 76 | 453 | 1 | 0 | 2 |
| WALKING | 0 | 0 | 0 | 488 | 3 | 5 |
| WALKING_DOWNSTAIRS | 0 | 0 | 0 | 2 | 397 | 21 |
| WALKING_UPSTAIRS | 0 | 0 | 0 | 4 | 20 | 447 |

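The overall score can be recovered from the confusion matrix above, and the same matrix gives a per-class recall that shows where the errors concentrate. This is a minimal sketch in plain Python with the counts copied from the matrix; the variable names are illustrative and not part of the notebook.

```python
labels = ["LAYING", "SITTING", "STANDING",
          "WALKING", "WALKING_DOWNSTAIRS", "WALKING_UPSTAIRS"]

# Rows are true classes, columns are predicted classes (Conv1D model above).
cm = [
    [537,   0,   0,   0,   0,   0],
    [  0, 378, 108,   0,   0,   5],
    [  0,  76, 453,   1,   0,   2],
    [  0,   0,   0, 488,   3,   5],
    [  0,   0,   0,   2, 397,  21],
    [  0,   0,   0,   4,  20, 447],
]

n_total = sum(sum(row) for row in cm)
n_correct = sum(cm[i][i] for i in range(len(cm)))
accuracy = n_correct / n_total

# Recall of a class: correctly predicted samples / all samples of that class.
recall = {lab: cm[i][i] / sum(cm[i]) for i, lab in enumerate(labels)}

print(n_total, round(accuracy * 100, 2))
# -> 2947 91.62
for lab in labels:
    print("%-20s %.3f" % (lab, recall[lab]))
```

The recalls make the pattern in the matrix explicit: most remaining errors are confusions between SITTING and STANDING.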

### 5.4 Network with a 2D convolutional layer (*ConvNet*)

In [78]:
timesteps = len(X_train[0])
input_dim = len(X_train[0][0])
n_classes = 6

X_train_conv = X_train.reshape(N_train, timesteps, input_dim, 1)
X_test_conv = X_test.reshape(N_test, timesteps, input_dim, 1)

model_base_conv_2D = km.Sequential()
model_base_conv_2D.add(kl.Conv2D(32, (3, 9), activation='relu', input_shape=(timesteps, input_dim, 1)))
model_base_conv_2D.add(kl.MaxPooling2D(pool_size=(2, 1)))
model_base_conv_2D.add(kl.Flatten())
model_base_conv_2D.add(kl.Dense(n_classes, activation='softmax'))
model_base_conv_2D.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
model_base_conv_2D.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 126, 1, 32)        896       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 63, 1, 32)         0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 2016)              0         
_________________________________________________________________
dense_10 (Dense)             (None, 6)                 12102     
Total params: 12,998
Trainable params: 12,998
Non-trainable params: 0
_________________________________________________________________


In [79]:
t_start = time.time()
model_base_conv_2D.fit(X_train_conv,  Y_train_dummies, batch_size=batch_size, validation_data=(X_test_conv, Y_test_dummies), epochs=epochs)
t_end = time.time()
t_learning = t_end-t_start

score = model_base_conv_2D.evaluate(X_test_conv, Y_test_dummies)[1] 
print("\n Score With Conv on Multidimensional Inertial Signals = %.2f, \nLearning time = %.2f secondes" %(score*100, t_learning) )
# Separate name so that the 1D ConvNet metadata stored in metadata_conv is not overwritten
metadata_conv_2D = {"time_learning" : t_learning, "score" : score}
base_conv_2D_prediction = model_base_conv_2D.predict(X_test_conv)

my_confusion_matrix(Y_test_dummies, base_conv_2D_prediction)

Train on 7352 samples, validate on 2947 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
 Score With Conv on Multidimensional Inertial Signals = 88.67, 
 Learning time = 4.03 secondes


| True / Pred | LAYING | SITTING | STANDING | WALKING | WALKING_DOWNSTAIRS | WALKING_UPSTAIRS |
|---|---|---|---|---|---|---|
| LAYING | 510 | 0 | 0 | 0 | 0 | 27 |
| SITTING | 1 | 380 | 102 | 1 | 0 | 7 |
| STANDING | 0 | 78 | 453 | 0 | 0 | 1 |
| WALKING | 0 | 0 | 0 | 435 | 57 | 4 |
| WALKING_DOWNSTAIRS | 0 | 1 | 0 | 12 | 405 | 2 |
| WALKING_UPSTAIRS | 0 | 0 | 0 | 13 | 28 | 430 |


**Beware of overfitting.** Repeatedly searching for the best architecture by minimizing the error on the test sample can yield an architecture that is overly adapted to that sample, which reduces its ability to generalize. It would be prudent to repeat the train/test split several times using *Monte Carlo* cross-validation.
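
The principle of Monte Carlo cross-validation can be sketched independently of Keras: draw several random train/test splits, train a fresh model on each, and average the test scores. The names `monte_carlo_splits`, `monte_carlo_cv` and `majority_baseline` below are hypothetical, and the majority-class "model" is only a stand-in for a real network.

```python
import random
from statistics import mean


def monte_carlo_splits(n_samples, test_size=0.2, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs drawn by repeated random shuffling."""
    rng = random.Random(seed)
    n_test = int(round(n_samples * test_size))
    for _ in range(n_splits):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]


def monte_carlo_cv(build_and_score, n_samples, **kwargs):
    """Average a score over the splits.

    `build_and_score(train_idx, test_idx)` must train a *new* model on the
    training indices and return its accuracy on the test indices.
    """
    scores = [build_and_score(tr, te)
              for tr, te in monte_carlo_splits(n_samples, **kwargs)]
    return mean(scores), scores


# Toy stand-in for a model: predict the majority class of the training split.
toy_labels = [0] * 60 + [1] * 40


def majority_baseline(train_idx, test_idx):
    train_labels = [toy_labels[i] for i in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    return mean([1.0 if toy_labels[i] == majority else 0.0 for i in test_idx])


avg_score, all_scores = monte_carlo_cv(majority_baseline, len(toy_labels),
                                       n_splits=5, seed=0)
print(len(all_scores), round(avg_score, 2))
```

Working with index lists keeps inputs and labels aligned, since the same indices select both. Rebuilding the model inside `build_and_score` prevents a model from carrying over weights learned on samples that later end up in a test split.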

### 5.5 Implementation of Monte Carlo cross-validation

**Objective:** find the best architecture.

In [96]:
X=np.concatenate((X_train, X_test), axis=0)
Y=np.concatenate((Y_train_dummies, Y_test_dummies), axis=0)

In [97]:
epochs = 10
batch_size = 32
n_hidden = 32


n_classes = 6



In [149]:
score = np.empty([2, 3])
# score is a B x n_methods matrix: one row per Monte Carlo split, one column per method
for k in range(2):
    print("\n *****************",k,"***************** \n")
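    # Split X and Y in a single call so that each sample keeps its label;
    # two separate calls shuffle X and Y independently and break the pairing.
    X_train_MC, X_test_MC, Y_train_dummies_MC, Y_test_dummies_MC = train_test_split(X, Y, test_size=0.2)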
    
    N_train = X_train_MC.shape[0]
    N_test = X_test_MC.shape[0]
    timesteps = len(X_train_MC[0])
    input_dim = len(X_train_MC[0][0])
    
    X_train_conv_MC = X_train_MC.reshape(N_train, timesteps, input_dim, 1)
    X_test_conv_MC = X_test_MC.reshape(N_test, timesteps, input_dim, 1)
    
    print("\n **** MLP **** \n")
    model_base_mlp.fit(X_train_MC,  Y_train_dummies_MC, batch_size=batch_size, validation_data=(X_test_MC, Y_test_dummies_MC), epochs=epochs)   
    #print("\n **** LSTM **** \n")
    #model_base_lstm.fit(X_train_MC,  Y_train_dummies_MC, batch_size=batch_size, validation_data=(X_test_MC, Y_test_dummies_MC), epochs=epochs, shuffle=False)
    print("\n **** conv 1D **** \n")
    model_base_conv_1D.fit(X_train_MC,  Y_train_dummies_MC, batch_size=batch_size, validation_data=(X_test_MC, Y_test_dummies_MC), epochs=epochs)
    print("\n **** conv 2D **** \n")
    model_base_conv_2D.fit(X_train_conv_MC,  Y_train_dummies_MC, batch_size=batch_size, validation_data=(X_test_conv_MC, Y_test_dummies_MC), epochs=epochs)
    
    score_mlp=model_base_mlp.evaluate(X_test_MC, Y_test_dummies_MC)[1]
    #score_lstm=model_base_lstm.evaluate(X_test_MC, Y_test_dummies_MC)[1]
    score_conv_1D=model_base_conv_1D.evaluate(X_test_MC, Y_test_dummies_MC)[1]
    score_conv_2D=model_base_conv_2D.evaluate(X_test_conv_MC, Y_test_dummies_MC)[1]
    s=[score_mlp,score_conv_1D,score_conv_2D]
    score[k,:]=s
    
final_scores = score.mean(axis=0)  # mean score of each method over the splits
print(score)
print(final_scores)


 ***************** 0 ***************** 


 **** MLP **** 

Train on 8239 samples, validate on 2060 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

 **** conv 1D **** 

Train on 8239 samples, validate on 2060 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

 **** conv 2D **** 

Train on 8239 samples, validate on 2060 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
 ***************** 1 ***************** 


 **** MLP **** 

Train on 8239 samples, validate on 2060 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

 **** conv 1D **** 

Train on 8239 samples, validate on 2060 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

 **** conv 2D **** 

In [147]:
score

array([[ 0.18446602,  0.1776699 ,  0.1776699 ],
       [ 0.18980583,  0.18883495,  0.19660194]])
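
Scores around 0.18 are close to the chance level for six classes. One possible explanation, assuming `train_test_split` is the scikit-learn function called without a fixed `random_state`, is that shuffling `X` and `Y` in two separate calls pairs each signal with the label of a different sample. The toy sketch below, in plain Python with made-up data, compares two independent shuffles with a single shared permutation.

```python
import random

n_classes = 6
print(round(1 / n_classes, 3))  # chance level if classes were balanced
# -> 0.167

# Toy data: the "signal" of each sample is simply its own label.
rng = random.Random(0)
Y = [rng.randrange(n_classes) for _ in range(6000)]
X = list(Y)


def shuffled(values, rng):
    """Return a shuffled copy of `values`."""
    out = list(values)
    rng.shuffle(out)
    return out


# Two independent shuffles, as with two separate split calls.
X_bad, Y_bad = shuffled(X, rng), shuffled(Y, rng)
agreement_bad = sum(x == y for x, y in zip(X_bad, Y_bad)) / len(Y)

# One shared permutation applied to both arrays.
perm = shuffled(range(len(Y)), rng)
X_ok = [X[i] for i in perm]
Y_ok = [Y[i] for i in perm]
agreement_ok = sum(x == y for x, y in zip(X_ok, Y_ok)) / len(Y)

print(round(agreement_bad, 2), agreement_ok)
```

With independent shuffles the signal matches its label only about one time in six, whereas the shared permutation keeps every pair intact. The HAR classes are not perfectly balanced, so the chance level there is only approximately 1/6.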

## TO DO:

- Find, for each method, the data format that works best (Haar, raw, etc.)
- Fix the Monte Carlo problem: the scores are very poor
- Write a summary at the very top and highlight the best results