Challenge: Make Your Network

Create a multi-layer perceptron neural network model to predict on a labeled dataset of your choosing. Compare this model to either a boosted tree or a random forest model and describe the relative tradeoffs between complexity and accuracy. Be sure to vary the hyperparameters of your MLP!

Note: I am using the same red wine quality dataset as in my Supervised Learning Capstone, where the best-performing model was the Random Forest Classifier (89.3% accuracy). Follow-up Q&A discussions have explored the potential for building a product based on those results. Part of that discussion was whether a neural network such as an MLP, which is also a supervised method trained on labeled data, would classify wine quality from the physicochemical properties as accurately.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import train_test_split

In [2]:
# Loading the dataset.
red_wine = pd.read_csv('C:\\Users\\kenne\\Desktop\\winequality-red.csv', encoding='utf-8')

In [4]:
# Binarize the target variable quality: 1 (good) if quality >= 7, otherwise 0 (bad).
red_wine['grade'] = (red_wine['quality'] >= 7).astype(int)

red_wine.describe()

,fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality,grade
count,1599.0,1599.0,1599.0,1599.0,1599.0,1599.0,1599.0,1599.0,1599.0,1599.0,1599.0,1599.0,1599.0
mean,8.319637,0.527821,0.270976,2.538806,0.087467,15.874922,46.467792,0.996747,3.311113,0.658149,10.422983,5.636023,0.13571
std,1.741096,0.17906,0.194801,1.409928,0.047065,10.460157,32.895324,0.001887,0.154386,0.169507,1.065668,0.807569,0.342587
min,4.6,0.12,0.0,0.9,0.012,1.0,6.0,0.99007,2.74,0.33,8.4,3.0,0.0
25%,7.1,0.39,0.09,1.9,0.07,7.0,22.0,0.9956,3.21,0.55,9.5,5.0,0.0
50%,7.9,0.52,0.26,2.2,0.079,14.0,38.0,0.99675,3.31,0.62,10.2,6.0,0.0
75%,9.2,0.64,0.42,2.6,0.09,21.0,62.0,0.997835,3.4,0.73,11.1,6.0,0.0
max,15.9,1.58,1.0,15.5,0.611,72.0,289.0,1.00369,4.01,2.0,14.9,8.0,1.0


In [5]:
red_wine = red_wine.drop(['quality'], axis=1)

red_wine.head()

,fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,grade
0,7.4,0.7,0.0,1.9,0.076,11.0,34.0,0.9978,3.51,0.56,9.4,0
1,7.8,0.88,0.0,2.6,0.098,25.0,67.0,0.9968,3.2,0.68,9.8,0
2,7.8,0.76,0.04,2.3,0.092,15.0,54.0,0.997,3.26,0.65,9.8,0
3,11.2,0.28,0.56,1.9,0.075,17.0,60.0,0.998,3.16,0.58,9.8,0
4,7.4,0.7,0.0,1.9,0.076,11.0,34.0,0.9978,3.51,0.56,9.4,0


In [7]:
# Split the data into features X (the 11 physicochemical columns) and target Y (grade).
X = red_wine.iloc[:, :11]
Y = red_wine['grade']
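The first cell imports train_test_split, but the cells that follow fit and score each model on the full dataset. A minimal sketch of a stratified hold-out split is given here. It assumes synthetic stand-in data rather than the wine file: X_demo and y_demo are hypothetical names, generated only to mimic the 1599 rows, 11 features and roughly 13.6% share of good wines shown above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the wine data (assumption: not the real dataset).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1599, 11))
y_demo = (rng.random(1599) < 0.1357).astype(int)

# stratify keeps the good/bad ratio nearly identical in both partitions,
# which matters when one class is as rare as it is here.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

print(X_train.shape, X_test.shape)
print(y_train.mean(), y_test.mean())
```

With a split like this, a model would be fitted on X_train and scored on X_test, so that the reported accuracy reflects wines the model has not seen.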

First attempt, Multi-layer Perceptron model (MLP) ...

In [9]:
# Establish and fit the model with a single hidden layer of 1000 units.
# Note: (1000) is the integer 1000, not a tuple; it is treated as one hidden layer.
mlp = MLPClassifier(hidden_layer_sizes=(1000))
mlp.fit(X, Y)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=1000, learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='adam', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

In [10]:
mlp.score(X, Y)

0.8642901813633521

In [11]:
Y.value_counts()/len(Y)

0    0.86429
1    0.13571
Name: grade, dtype: float64

In [12]:
# Five-fold cross-validation (cross_val_score was imported in the first cell).
cross_val_score(mlp, X, Y, cv=5)

array([0.87850467, 0.86292835, 0.86520376, 0.84639498, 0.86833856])

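The model produced a training accuracy of 86.4%, below the 89.3% of the best model from the Supervised Learning Capstone, the Random Forest Classifier. This figure matches the share of bad wines in the data (0.86429) exactly, which is the score a model would get by labeling every wine as bad.

Cross-validation scores are consistent, ranging from 84.6% to 87.9% with a mean of 86.4%.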
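Because the score from mlp.score(X, Y) equals the class-0 share printed by Y.value_counts(), it is worth checking what a trivial model achieves. The sketch below is an illustration only: it uses stand-in labels with 1382 bad and 217 good wines, counts derived from the reported proportions (0.86429 and 0.13571 of 1599 rows), and scikit-learn's DummyClassifier as the trivial model. X_demo and y_demo are hypothetical names.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import confusion_matrix

# Stand-in labels with the class balance reported above (assumption:
# counts derived from the printed proportions, not read from the file).
y_demo = np.array([0] * 1382 + [1] * 217)
X_demo = np.zeros((len(y_demo), 1))  # features are ignored by the dummy model

# A model that always predicts the most frequent class ("bad").
baseline = DummyClassifier(strategy="most_frequent").fit(X_demo, y_demo)
baseline_acc = baseline.score(X_demo, y_demo)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_demo, baseline.predict(X_demo))

print(baseline_acc)
print(cm)
```

Running confusion_matrix(Y, mlp.predict(X)) on the fitted network would show in the same way whether it ever predicts a good wine; accuracy alone cannot reveal that on data this imbalanced.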

Second attempt, adjusting the layers ...

In [13]:
# Establish and fit the model with a single hidden layer of 500 units.
mlpv2 = MLPClassifier(hidden_layer_sizes=(500))
mlpv2.fit(X, Y)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=500, learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='adam', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

In [14]:
mlpv2.score(X, Y)

0.8642901813633521

In [15]:
Y.value_counts()/len(Y)

0    0.86429
1    0.13571
Name: grade, dtype: float64

In [16]:
# Five-fold cross-validation (cross_val_score was imported in the first cell).
cross_val_score(mlpv2, X, Y, cv=5)

array([0.87538941, 0.86292835, 0.86833856, 0.830721  , 0.86520376])

No improvement: the training accuracy (86.4%) is identical to that of the 1000-unit attempt.

Cross-validation scores are still consistent, though less so than in the previous attempt (83.1% to 87.5%).
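One factor that may contribute to the flat results, offered here as an untested assumption rather than a diagnosis, is feature scale: the describe() table shows columns ranging from about 1 (density) to several hundred (total sulfur dioxide), and the scikit-learn user guide recommends scaling inputs for MLPs. A minimal sketch of wrapping the classifier in a scaling pipeline follows, using synthetic stand-in data (X_demo, y_demo, and the column stretching are all hypothetical).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in data (assumption: not the wine dataset).
X_demo, y_demo = make_classification(
    n_samples=1599, n_features=11, n_informative=5,
    weights=[0.864], random_state=0,
)
# Stretch the columns to very different ranges, loosely imitating
# features measured in different units.
X_demo = X_demo * np.logspace(-3, 2, 11)

# The scaler is fitted inside each cross-validation fold, so no
# information from the held-out fold leaks into the scaling.
scaled_mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=0),
)
scores = cross_val_score(scaled_mlp, X_demo, y_demo, cv=5)
print(scores, scores.mean())
```

On the wine data, the same pipeline object could be passed to cross_val_score(scaled_mlp, X, Y, cv=5) in place of the bare MLPClassifier.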

Third attempt, adjust the layers further ...

In [17]:
# Establish and fit the model with two hidden layers of 750 units each.
mlpv3 = MLPClassifier(hidden_layer_sizes=(750, 750))
mlpv3.fit(X, Y)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(750, 750), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='adam', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

In [18]:
mlpv3.score(X, Y)

0.8642901813633521

In [19]:
Y.value_counts()/len(Y)

0    0.86429
1    0.13571
Name: grade, dtype: float64

In [20]:
# Five-fold cross-validation (cross_val_score was imported in the first cell).
cross_val_score(mlpv3, X, Y, cv=5)

array([0.86604361, 0.83489097, 0.89028213, 0.80564263, 0.86520376])

No improvement: the training accuracy is again 86.4%, identical to both previous attempts.

Cross-validation scores are less consistent than in the previous attempts (80.6% to 89.0%). I will try varying other hyperparameters now.
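Varying hyperparameters one at a time by hand can also be done as a systematic search. The sketch below shows one way to do that with GridSearchCV. The grid values, the scaling step, and the synthetic stand-in data (X_demo, y_demo) are all assumptions chosen to keep the example small and quick; they are not the settings used in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic stand-in dataset (assumption: not the wine data).
X_demo, y_demo = make_classification(
    n_samples=600, n_features=11, n_informative=5,
    weights=[0.864], random_state=0,
)

pipe = make_pipeline(
    StandardScaler(),
    MLPClassifier(max_iter=300, random_state=0),
)

# Parameter names are prefixed with the pipeline step name.
param_grid = {
    "mlpclassifier__hidden_layer_sizes": [(50,), (50, 50)],
    "mlpclassifier__activation": ["relu", "logistic"],
    "mlpclassifier__alpha": [1e-4, 1e-2],
}

# Every combination is scored with 3-fold cross-validation.
search = GridSearchCV(pipe, param_grid, cv=3, scoring="accuracy")
search.fit(X_demo, y_demo)

print(search.best_params_)
print(search.best_score_)
```

Because best_score_ is a cross-validated accuracy, it can be compared across settings more fairly than the training accuracy returned by score(X, Y).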

Fourth attempt, logistic activation ...

In [21]:
# Establish and fit the model with logistic activation and the default single hidden layer of 100 units.
mlpv4 = MLPClassifier(activation='logistic')
mlpv4.fit(X, Y)

MLPClassifier(activation='logistic', alpha=0.0001, batch_size='auto',
       beta_1=0.9, beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(100,), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='adam', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

In [22]:
mlpv4.score(X, Y)

0.8649155722326454

In [23]:
Y.value_counts()/len(Y)

0    0.86429
1    0.13571
Name: grade, dtype: float64

In [24]:
# Five-fold cross-validation (cross_val_score was imported in the first cell).
cross_val_score(mlpv4, X, Y, cv=5)

array([0.87538941, 0.83800623, 0.87460815, 0.84326019, 0.86520376])

We finally have some improvement, but it is too small to matter: training accuracy rose from 86.43% to 86.49%, which on 1599 wines amounts to one additional correct prediction.

Cross-validation scores are relatively consistent (83.8% to 87.5%).

Fifth attempt, reducing alpha ...

In [25]:
# Establish and fit the model with logistic activation, a single hidden layer
# of 1000 units, and alpha reduced from the default 1e-4 to 1e-6.
mlpv5 = MLPClassifier(activation='logistic', hidden_layer_sizes=(1000), alpha=1e-6)
mlpv5.fit(X, Y)

MLPClassifier(activation='logistic', alpha=1e-06, batch_size='auto',
       beta_1=0.9, beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=1000, learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='adam', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

In [26]:
mlpv5.score(X, Y)

0.8792995622263915

In [27]:
Y.value_counts()/len(Y)

0    0.86429
1    0.13571
Name: grade, dtype: float64

In [28]:
# Five-fold cross-validation (cross_val_score was imported in the first cell).
cross_val_score(mlpv5, X, Y, cv=5)

array([0.86604361, 0.85358255, 0.86520376, 0.80877743, 0.86520376])

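Training accuracy increased by a noticeable amount, to 87.9%!

Cross-validation scores were relatively consistent, with the exception of one score (80.9%). Their mean of about 85.2% is no higher than in the earlier attempts, so the gain on the training data may not carry over to unseen wines.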
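To put the five attempts side by side, the sketch below averages the cross-validation scores printed above and compares them with the majority-class share. The scores are copied from the cell outputs; the baseline of 1382/1599 is derived from the reported class-0 proportion (0.86429), and the variable names are illustrative.

```python
import numpy as np

# Cross-validation scores copied from the five attempts above.
cv_scores = {
    "attempt 1": [0.87850467, 0.86292835, 0.86520376, 0.84639498, 0.86833856],
    "attempt 2": [0.87538941, 0.86292835, 0.86833856, 0.830721, 0.86520376],
    "attempt 3": [0.86604361, 0.83489097, 0.89028213, 0.80564263, 0.86520376],
    "attempt 4": [0.87538941, 0.83800623, 0.87460815, 0.84326019, 0.86520376],
    "attempt 5": [0.86604361, 0.85358255, 0.86520376, 0.80877743, 0.86520376],
}

# Share of "bad" wines, i.e. the accuracy of always predicting class 0
# (derived from the printed proportion, an assumption about exact counts).
baseline = 1382 / 1599

cv_means = {name: float(np.mean(s)) for name, s in cv_scores.items()}
for name, mean in cv_means.items():
    print(f"{name}: mean CV accuracy {mean:.4f} (baseline {baseline:.4f})")

# Gap between the fifth model's training accuracy and its CV mean.
train_acc_attempt5 = 0.8792995622263915
gap = train_acc_attempt5 - cv_means["attempt 5"]
print(f"attempt 5 train-vs-CV gap: {gap:.4f}")
```

A gap of this kind between training and cross-validated accuracy is one common sign of overfitting, although five folds on 1599 rows give only a rough estimate.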

Overall, the Random Forest Classifier (89.3% accuracy) still performed better than any variation of the Multi-Layer Perceptron model, the best of which reached 87.9% on its own training data.
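The Random Forest figure comes from the earlier capstone, so the two model families were not scored in the same way in this notebook. A minimal sketch of a like-for-like comparison is shown below, scoring both models on the same folds. It uses synthetic stand-in data (X_demo, y_demo are hypothetical), and the model settings are assumptions rather than the capstone's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in data (assumption: not the wine dataset).
X_demo, y_demo = make_classification(
    n_samples=1599, n_features=11, n_informative=5,
    weights=[0.864], random_state=0,
)

# Fixing the splitter means both models see exactly the same folds.
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "mlp": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=0),
    ),
}

results = {
    name: cross_val_score(model, X_demo, y_demo, cv=folds)
    for name, model in models.items()
}
for name, scores in results.items():
    print(f"{name}: mean {scores.mean():.4f}, std {scores.std():.4f}")
```

The random forest needs no feature scaling and has few settings to tune, while the MLP adds a scaling step and several interacting hyperparameters; that difference in effort is part of the complexity side of the tradeoff.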