<h2>Neural Network</h2>

In this module, we learn to use neural networks (NN) to solve classification and regression problems.

<h3>Classification</h3>

In [1]:
import pandas as pd
from matplotlib import pyplot as plt
import numpy as np

<h4> Applied on the Heart Disease Data </h4>

As usual, we try the model on a dataset we have already prepared, here the heart disease data.

In [4]:
data = pd.read_csv('heart_disease.csv')

from sklearn.model_selection import train_test_split
X = data.drop('HeartDisease', axis=1)
y = data['HeartDisease']

trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.2)

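def remove_0_choles(X):
    #a value of 0 is not a valid reading for these columns, so mark it as missing
    #work on a copy so the caller's DataFrame is not modified in place
    X = X.copy()
    X.loc[X['Cholesterol']==0, 'Cholesterol'] = np.nan
    X.loc[X['RestingBP']==0, 'RestingBP'] = np.nan
    return X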

from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import FunctionTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

num_cols = trainX.columns[(trainX.dtypes == np.int64) | (trainX.dtypes == np.float64)]
cat_cols = ['Sex', 'ChestPainType','RestingECG', 'ExerciseAngina', 'ST_Slope']

num_pipeline = Pipeline([
    ('remove 0 cholesterol', FunctionTransformer(remove_0_choles, validate=False)),
    ('impute', SimpleImputer(strategy='median')),
    ('standardize', StandardScaler())
])

cat_pipeline = Pipeline([
    ('impute', SimpleImputer(strategy='constant',fill_value='missing')),
    ('encode', OneHotEncoder())
])

full_pipeline = ColumnTransformer([
    ('numeric', num_pipeline, num_cols),
    ('class', cat_pipeline, cat_cols)
])

trainX_prc = full_pipeline.fit_transform(trainX)
testX_prc = full_pipeline.transform(testX)

trainX_prc.shape, testX_prc.shape

((734, 20), (184, 20))

For classification, we use MLPClassifier.

The architecture of the NN is set by the hidden_layer_sizes hyperparameter. In short, this is a list of integers, where each number gives the number of neurons in the corresponding hidden layer.

For example, 

hidden_layer_sizes=[10,20,30] 

represents a NN with three hidden layers: the first hidden layer has 10 neurons, the second has 20 neurons, and the last has 30 neurons.
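To make the layer sizes described above concrete, here is a minimal sketch that fits a small network and inspects the shapes of its weight matrices. The synthetic data from make_classification and the names X_demo, y_demo and demo_mlp are assumptions for illustration only; they are not the heart disease data.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data with 8 input features and 2 classes (an assumption)
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=0)

# Three hidden layers with 10, 20 and 30 neurons
demo_mlp = MLPClassifier(hidden_layer_sizes=[10, 20, 30], max_iter=300, random_state=0)
demo_mlp.fit(X_demo, y_demo)

# coefs_ holds one weight matrix per pair of consecutive layers:
# input(8) -> 10 -> 20 -> 30 -> output(1, because the problem is binary)
layer_shapes = [w.shape for w in demo_mlp.coefs_]
print(layer_shapes)

# n_layers_ counts the input layer, the hidden layers and the output layer
print(demo_mlp.n_layers_)
```

Each hidden layer adds one entry to coefs_, so a longer hidden_layer_sizes list means a deeper network with more parameters to train.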

A NN is also trained iteratively, so you can set max_iter to a high value to give the training enough iterations to converge.
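One way to check whether max_iter was large enough is to look at n_iter_ after fitting and at the ConvergenceWarning that scikit-learn raises when the iteration limit is reached. The sketch below uses synthetic data, and the helper fit_and_report is a hypothetical name introduced for this illustration.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (an assumption, not the data used in this module)
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=0)

def fit_and_report(max_iter):
    """Fit a small MLP; return (iterations used, whether a ConvergenceWarning was raised)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model = MLPClassifier(hidden_layer_sizes=[10, 10], max_iter=max_iter, random_state=0)
        model.fit(X_demo, y_demo)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    return model.n_iter_, warned

# A very small limit stops training early and triggers the warning
short_iters, short_warned = fit_and_report(max_iter=5)
print(short_iters, short_warned)

# A generous limit lets the optimizer stop on its own tolerance criterion
long_iters, long_warned = fit_and_report(max_iter=2000)
print(long_iters, long_warned)
```

If n_iter_ equals max_iter, the training was cut off by the limit, and raising max_iter is worth a try.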

In [5]:
from sklearn.neural_network import MLPClassifier

n_features = trainX_prc.shape[1] #get the number of input features
mlp = MLPClassifier(hidden_layer_sizes=[n_features,n_features,n_features], max_iter=1000)

mlp.fit(trainX_prc, trainY)
print(mlp.score(trainX_prc, trainY))
print(mlp.score(testX_prc, testY))

0.9986376021798365
0.8315217391304348


The training accuracy (about 0.999) is much higher than the testing accuracy (about 0.832), so the model seems to be overfitting.
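A common way to reduce overfitting in MLPClassifier is to increase alpha, the strength of its L2 regularization. The sketch below compares the gap between training and testing accuracy for a few alpha values. It uses synthetic data and hypothetical names (X_demo, gaps), so the numbers it prints say nothing about the heart disease data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=400, n_features=20, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

gaps = {}
for alpha in [0.0001, 1, 10]:
    model = MLPClassifier(hidden_layer_sizes=[20, 20, 20], alpha=alpha,
                          max_iter=1000, random_state=0)
    model.fit(X_tr, y_tr)
    train_acc = model.score(X_tr, y_tr)
    test_acc = model.score(X_te, y_te)
    # Store (training accuracy, testing accuracy, gap) for each alpha
    gaps[alpha] = (train_acc, test_acc, train_acc - test_acc)
    print(f"alpha={alpha}: train={train_acc:.3f} test={test_acc:.3f} gap={train_acc - test_acc:.3f}")
```

A large gap suggests overfitting. Stronger regularization usually shrinks the gap, but too much of it can lower both scores.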

Now let's fine-tune the NN. I'm just going to try a few architectures, each with several regularization strengths.

In [6]:
from sklearn.model_selection import GridSearchCV

param_grid = [{
    'hidden_layer_sizes' : [[n_features,n_features],                       #two hidden layers with n_features neurons each
                            [n_features,n_features,n_features],            #three hidden layers with n_features neurons each
                            [n_features//2,n_features//2],                 #two hidden layers with n_features//2 neurons each
                            [n_features//2,n_features//2,n_features//2],   #three hidden layers with n_features//2 neurons each
                            [n_features*2,n_features*2],                   #two hidden layers with n_features*2 neurons each
                            [n_features*2,n_features*2,n_features*2]],     #three hidden layers with n_features*2 neurons each
    'alpha' : [0.001, 0.01, 0.1, 1, 10]                                    #L2 regularization strength
}]

mlp = MLPClassifier(max_iter=1000)

grid_search = GridSearchCV(mlp, param_grid, cv=3, scoring='accuracy', return_train_score=True)

grid_search.fit(trainX_prc,trainY)



Best model found by the grid search (the score is the mean cross-validated accuracy):

In [7]:
print(grid_search.best_params_)
print(grid_search.best_score_)

{'alpha': 10, 'hidden_layer_sizes': [40, 40]}
0.8637169621947139


In [8]:
best_nn = grid_search.best_estimator_
best_nn.score(testX_prc, testY)

0.842391304347826
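Because the grid search is run with return_train_score=True, the training and validation scores of every candidate can be compared through cv_results_. The following is a minimal sketch on synthetic data with a reduced grid. The names X_demo, demo_grid, demo_search and summary are hypothetical and chosen only for this example.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=0)

# A reduced grid so the example runs quickly
demo_grid = {
    'hidden_layer_sizes': [[10, 10], [5, 5]],
    'alpha': [0.01, 1],
}
demo_search = GridSearchCV(MLPClassifier(max_iter=500, random_state=0), demo_grid,
                           cv=3, scoring='accuracy', return_train_score=True)
demo_search.fit(X_demo, y_demo)

# One row per candidate; mean_test_score is the mean validation score across folds
results = pd.DataFrame(demo_search.cv_results_)
summary = results[['param_alpha', 'param_hidden_layer_sizes',
                   'mean_train_score', 'mean_test_score']]
summary = summary.sort_values('mean_test_score', ascending=False)
print(summary)
```

A candidate whose mean_train_score is far above its mean_test_score is likely overfitting, even if it ranks well.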

<h3> NN for Regression </h3>

For regression, training a NN is essentially the same. The only difference is that we use MLPRegressor instead of MLPClassifier.
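One practical difference worth keeping in mind is that the score method of MLPRegressor returns the coefficient of determination (R^2) rather than accuracy. The sketch below checks this against r2_score on synthetic data. The data and the names X_demo, reg and scaler are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the auto-mpg data)
X_demo, y_demo = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

# Standardize the inputs using statistics from the training split only
scaler = StandardScaler().fit(X_tr)
reg = MLPRegressor(hidden_layer_sizes=[12, 12], max_iter=2000, random_state=0)
reg.fit(scaler.transform(X_tr), y_tr)

# score() and r2_score() compute the same quantity
score_from_model = reg.score(scaler.transform(X_te), y_te)
score_from_metric = r2_score(y_te, reg.predict(scaler.transform(X_te)))
print(score_from_model, score_from_metric)
```

This is also why a grid search over MLPRegressor is naturally scored with 'r2' instead of 'accuracy'.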

In [10]:
from sklearn.neural_network import MLPRegressor

In [11]:
data = pd.read_csv('auto-mpg.csv')

X = data.drop('mpg', axis=1)
y = data['mpg']
trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.2)

num_cols = trainX.columns[:-1] #all columns except the last one are numeric
num_pipeline = Pipeline([
    ('impute', SimpleImputer(strategy='median')),
    ('standardize', StandardScaler())
])

#pipeline for categorical features
cat_cols = trainX.columns[-1:] #because the last column is categorical
cat_pipeline = Pipeline([
    ('encoder', OneHotEncoder())
])

#full pipeline - combine numeric and categorical pipelines
full_pipeline = ColumnTransformer([
    ('numeric', num_pipeline, num_cols),
    ('class', cat_pipeline, cat_cols)
])

trainX_prc = full_pipeline.fit_transform(trainX)
testX_prc = full_pipeline.transform(testX)

In [14]:
#note: n_features is not recomputed here, so it still holds the value from the classification section;
#set n_features = trainX_prc.shape[1] first to size the layers from this dataset instead
param_grid = [{
    'hidden_layer_sizes' : [[n_features,n_features],                       #two hidden layers with n_features neurons each
                            [n_features,n_features,n_features],            #three hidden layers with n_features neurons each
                            [n_features//2,n_features//2],                 #two hidden layers with n_features//2 neurons each
                            [n_features//2,n_features//2,n_features//2],   #three hidden layers with n_features//2 neurons each
                            [n_features*2,n_features*2],                   #two hidden layers with n_features*2 neurons each
                            [n_features*2,n_features*2,n_features*2]],     #three hidden layers with n_features*2 neurons each
    'alpha' : [0.001, 0.01, 0.1, 1, 10]                                    #L2 regularization strength
}]

mlp = MLPRegressor(max_iter=1000)

grid_search = GridSearchCV(mlp, param_grid, cv=3, scoring='r2', return_train_score=True)

grid_search.fit(trainX_prc,trainY)



In [15]:
print(grid_search.best_params_)
print(grid_search.best_score_)

{'alpha': 1, 'hidden_layer_sizes': [20, 20]}
0.8606564981246949


In [16]:
best_nn = grid_search.best_estimator_
best_nn.score(testX_prc, testY)

0.9036004749966537