## Neural network

We exported our training and test sets as CSV files so that they can be loaded directly without rerunning the whole preprocessing. The chosen parameters are the year 2010 and 20 possible ratings for difficulty.
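As a rough illustration of this CSV round trip, the sketch below writes a small frame to disk and reads it back with `index_col=0`, which restores the saved index instead of loading it as an extra feature column. The frame contents, column names, and temporary path are made up for the example; the actual export code is not shown in this notebook.

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy frame standing in for the preprocessed features.
X_demo = pd.DataFrame({"feat_a": [0.1, 0.2], "feat_b": [3, 4]}, index=[10, 11])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "X_demo.csv")
    X_demo.to_csv(path)                      # the index is written as the first CSV column
    X_back = pd.read_csv(path, index_col=0)  # first column restored as the index
    X_naive = pd.read_csv(path)              # without index_col it becomes a data column

print(X_back.shape, X_naive.shape)
```

Reading without `index_col=0` would silently add one column, which matters here because the network expects a fixed number of input features.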

In [1]:
import pandas as pd
import numpy as np
from sklearn.metrics import classification_report
X_train = pd.read_csv("./../../data/X_train.csv",index_col = 0)
X_test = pd.read_csv("./../../data/X_test.csv",index_col = 0)
y_train = pd.read_csv("./../../data/y_train.csv",index_col= 0)
y_test = pd.read_csv("./../../data/y_test.csv",index_col = 0)

In [2]:
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
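To make explicit what fitting on the training set and only transforming the test set implies, here is a hand-rolled min-max sketch in plain Python. It is not the scikit-learn implementation: the helper names and toy values are hypothetical, and it assumes no column is constant.

```python
def fit_min_max(rows):
    """Return per-column minima and maxima learned from the given rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, mins, maxs):
    """Rescale each value with (v - min) / (max - min), column by column."""
    return [
        [(v - lo) / (hi - lo) for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


# Toy data: statistics come from the training rows only.
train_rows = [[1.0, 10.0], [3.0, 30.0]]
test_rows = [[2.0, 40.0]]

mins, maxs = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, mins, maxs)
test_scaled = apply_min_max(test_rows, mins, maxs)
print(train_scaled, test_scaled)
```

Because the test rows reuse the training minima and maxima, a test value can land outside [0, 1], as the second test feature does here.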

In [3]:
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, Input, Dropout
model = Sequential()
model.add(Input(shape=(14,)))  # 14 input features; shape passed as a tuple
model.add(Dropout(0.2))  # dropout applied directly to the input features
model.add(Dense(units=32, activation='tanh'))
model.add(Dense(units=32, activation='tanh'))
model.add(Dense(units=1, activation='sigmoid'))  # single probability for the binary target

In [4]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dropout (Dropout)           (None, 14)                0         
                                                                 
 dense (Dense)               (None, 32)                480       
                                                                 
 dense_1 (Dense)             (None, 32)                1056      
                                                                 
 dense_2 (Dense)             (None, 1)                 33        
                                                                 
Total params: 1,569
Trainable params: 1,569
Non-trainable params: 0
_________________________________________________________________
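The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The short sketch below redoes that arithmetic for the layer sizes used here; the helper name is introduced only for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# Input width followed by the units of each Dense layer.
layer_sizes = [14, 32, 32, 1]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(counts, sum(counts))  # [480, 1056, 33] 1569
```

The Dropout layer contributes no parameters, which is why it shows 0 in the summary.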


In [5]:
from tensorflow.keras.callbacks import ReduceLROnPlateau,EarlyStopping
reduce = ReduceLROnPlateau(monitor = 'val_loss',
                        min_delta = 0.01,
                        patience = 5,
                        factor = 0.1, 
                        cooldown = 2,
                        verbose = 1)
early_stopping = EarlyStopping(monitor='val_loss',min_delta = 0.01,patience=10,mode='min',verbose=1)

In [6]:
model.compile(loss = "binary_focal_crossentropy",optimizer = "adam",metrics = ["accuracy"])
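For intuition about the loss chosen here, the sketch below compares plain binary cross-entropy with a focal variant for a single prediction. It is a minimal plain-Python illustration, not the Keras implementation, and it assumes a focusing parameter `gamma` of 2.0 with no class balancing, which to my understanding matches the Keras defaults.

```python
import math


def bce(y_true, p):
    """Binary cross-entropy for one example with predicted probability p."""
    p_t = p if y_true == 1 else 1.0 - p
    return -math.log(p_t)


def focal(y_true, p, gamma=2.0):
    """Cross-entropy down-weighted by (1 - p_t) ** gamma (gamma is an assumed default)."""
    p_t = p if y_true == 1 else 1.0 - p
    return ((1.0 - p_t) ** gamma) * bce(y_true, p)


easy = (bce(1, 0.9), focal(1, 0.9))  # confident and correct
hard = (bce(1, 0.1), focal(1, 0.1))  # confident and wrong
print(easy, hard)
```

The easy example keeps about 1% of its cross-entropy while the hard one keeps about 81%, so training focuses on the examples the model gets wrong.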

In [7]:
history=model.fit(X_train_scaled,y_train, epochs = 30,batch_size = 32,validation_data = (X_test_scaled,y_test),callbacks = [reduce,early_stopping])

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 6: ReduceLROnPlateau reducing learning rate to 0.00010000000474974513.
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 11: early stopping
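The timing in this log (learning rate reduced at epoch 6, training stopped at epoch 11) is what one would expect if `val_loss` never improved by more than `min_delta` after the first epoch. The sketch below is a simplified plain-Python re-implementation of the patience and cooldown bookkeeping, not the Keras source, and the flat loss series is hypothetical since the per-epoch values are not shown above.

```python
def simulate(val_losses, min_delta=0.01, lr_patience=5, cooldown=2, stop_patience=10):
    """Return (epochs where the LR would be reduced, stopping epoch or None), 1-based."""
    best_lr = best_stop = float("inf")
    lr_wait = stop_wait = cooldown_left = 0
    reductions = []
    for epoch, loss in enumerate(val_losses, start=1):
        # Learning-rate schedule: count stalled epochs outside the cooldown window.
        if cooldown_left > 0:
            cooldown_left -= 1
            lr_wait = 0
        if loss < best_lr - min_delta:
            best_lr, lr_wait = loss, 0
        elif cooldown_left == 0:
            lr_wait += 1
            if lr_wait >= lr_patience:
                reductions.append(epoch)
                cooldown_left, lr_wait = cooldown, 0
        # Early stopping: stop once the loss has stalled for stop_patience epochs.
        stop_wait += 1
        if loss < best_stop - min_delta:
            best_stop, stop_wait = loss, 0
        elif stop_wait >= stop_patience and epoch > 1:
            return reductions, epoch
    return reductions, None


# Hypothetical validation losses that never improve by more than min_delta.
flat_losses = [0.17] * 30
reductions, stopped_at = simulate(flat_losses)
print(reductions, stopped_at)  # [6] 11
```

Under that assumption the first epoch sets the best value, five stalled epochs trigger the reduction at epoch 6, and ten stalled epochs trigger the stop at epoch 11, so only 11 of the 30 requested epochs run.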


In [8]:
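# Predict on the scaled features, since the model was trained on MinMax-scaled inputs
# (the report shown below comes from the original run, which passed the unscaled X_test).
y_proba_rn = model.predict(X_test_scaled)
# Threshold the sigmoid output at 0.5 to obtain class labels.
y_pred_rn = (y_proba_rn.ravel() >= 0.5).astype(int)
print(classification_report(y_test,y_pred_rn))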

              precision    recall  f1-score   support

           0       0.66      0.62      0.64     13281
           1       0.61      0.65      0.63     12255

    accuracy                           0.63     25536
   macro avg       0.63      0.63      0.63     25536
weighted avg       0.63      0.63      0.63     25536
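As a quick sanity check on this report, the sketch below recomputes the F1 scores from the printed precision and recall, and derives the accuracy a majority-class predictor would reach from the printed supports. It only reuses the rounded figures shown above, so the recomputed values are approximate.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures copied from the classification report (already rounded to 2 decimals).
support = {0: 13281, 1: 12255}
total = sum(support.values())
baseline = max(support.values()) / total  # always predicting the majority class

print(round(f1(0.66, 0.62), 2), round(f1(0.61, 0.65), 2))
print(total, round(baseline, 2))
```

The two classes are nearly balanced, so the reported accuracy of 0.63 should be read against a baseline of roughly 0.52 rather than against 0.5 or a much higher figure.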

