Neural Network: Simple Multi Layer Perceptron

We will apply a neural network classification model to the breast cancer data set using scikit-learn.

In [1]:
from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()

In [2]:
cancer.keys()

dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names'])

This object behaves like a dictionary: it contains a description of the data along with the features and targets.

In [3]:
X = cancer['data']
y = cancer['target']

Train Test Split
Let's split our data into training and testing sets. This is easily done with scikit-learn's train_test_split function from model_selection:

In [4]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y)

If the data is not normalized, the neural network may have difficulty converging before the maximum number of iterations is reached. The Multi-Layer Perceptron is sensitive to feature scaling, so it is highly recommended to scale your data. Note that you must apply the same scaling to the test set for meaningful results. There are many different methods for normalizing data; we will use the built-in StandardScaler for standardization.

In [5]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Fit only to the training data
scaler.fit(X_train)

StandardScaler(copy=True, with_mean=True, with_std=True)

In [6]:
# Now apply the transformations to the data:
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
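
To make the "fit on train, apply to both" rule concrete, here is a minimal pure-Python sketch of what standardization does for a single feature. The numbers are hypothetical toy values, not taken from the cancer data, and the sketch assumes the scaler divides by the population standard deviation.

```python
from statistics import mean, pstdev

# Hypothetical toy values for a single feature (not from the cancer data)
train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]

# "fit": learn the statistics from the training data only
mu = mean(train)
sigma = pstdev(train)  # population standard deviation

# "transform": apply the same shift and scale to both sets
train_scaled = [(x - mu) / sigma for x in train]
test_scaled = [(x - mu) / sigma for x in test]

print(train_scaled)
print(test_scaled)
```

The scaled training values have mean 0 and standard deviation 1. The test values are shifted and scaled by the training statistics, so they generally will not have exactly those moments, which is the intended behavior.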

Next we create an instance of the model. There are many parameters you can define and customize here, but we will only set hidden_layer_sizes. For this parameter you pass in a tuple giving the number of neurons you want in each hidden layer, where the nth entry in the tuple is the number of neurons in the nth hidden layer of the MLP model. There are many ways to choose these numbers, but for simplicity we will use 3 hidden layers, each with the same number of neurons as there are features in our data set (30):

In [8]:
from sklearn.neural_network import MLPClassifier
mlp = MLPClassifier(hidden_layer_sizes=(30,30,30))

Now that the model has been created, we can fit it to the training data. Remember that this data has already been processed and scaled:

In [9]:
mlp.fit(X_train,y_train)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(30, 30, 30), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='adam', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

The output shows the default values of the other parameters in the model. I encourage you to play around with them and discover what effects they have on your model!

Predictions and Evaluation

Now that we have a model, it is time to use it to get predictions! We can do this simply with the predict() method of our fitted model:

In [10]:
predictions = mlp.predict(X_test)

Now we can use scikit-learn's built-in metrics, such as a classification report and a confusion matrix, to evaluate how well our model performed:

In [11]:
from sklearn.metrics import classification_report,confusion_matrix
print(confusion_matrix(y_test,predictions))

[[51  5]
 [ 1 86]]


In [12]:
print(classification_report(y_test,predictions))

             precision    recall  f1-score   support

          0       0.98      0.91      0.94        56
          1       0.95      0.99      0.97        87

avg / total       0.96      0.96      0.96       143



Looks like we misclassified 6 of the 143 test tumors (the 5 and 1 off the diagonal of the confusion matrix), leaving us with an accuracy of about 96%, in line with the 0.96 average precision and recall in the report. This is pretty good considering how few lines of code we had to write! The downside of using a Multi-Layer Perceptron model, however, is how difficult it is to interpret the model itself. The weights and biases are not easily interpretable in terms of which features are important to the model.
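
The figures in the report can be recomputed by hand from the confusion matrix. The sketch below copies the matrix printed above and assumes scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
# Confusion matrix copied from the output above:
# rows are true classes, columns are predicted classes
cm = [[51, 5],
      [1, 86]]

total = sum(sum(row) for row in cm)
correct = cm[0][0] + cm[1][1]
accuracy = correct / total

for k in (0, 1):
    predicted_k = cm[0][k] + cm[1][k]  # column sum: everything predicted as k
    actual_k = sum(cm[k])              # row sum: everything that truly is k
    precision = cm[k][k] / predicted_k
    recall = cm[k][k] / actual_k
    print(f"class {k}: precision={precision:.2f} recall={recall:.2f}")

print(f"accuracy={accuracy:.3f} ({total - correct} of {total} misclassified)")
```

The per-class values agree with the classification report, and the accuracy works out to 137/143.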

However, if you do want to extract the MLP weights and biases after training your model, you can use its public attributes coefs_ and intercepts_.

coefs_ is a list of weight matrices, where the weight matrix at index i represents the weights between layer i and layer i+1.

intercepts_ is a list of bias vectors, where the vector at index i represents the bias values added to layer i+1.

In [13]:
# Number of bias values in the first hidden layer
len(mlp.intercepts_[0])

30

In [14]:
len(mlp.coefs_)

4
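
The lengths above follow directly from the architecture. As a rough sketch, the expected shapes can be derived from the layer sizes alone, assuming 30 input features, the hidden_layer_sizes used earlier, and a single output unit for the binary target (the variable names here are illustrative, not part of scikit-learn).

```python
# Assumptions: 30 input features, the hidden_layer_sizes used above,
# and a single output unit for the binary target
n_features = 30
hidden_layer_sizes = (30, 30, 30)
n_outputs = 1

layer_sizes = [n_features, *hidden_layer_sizes, n_outputs]

# One weight matrix per pair of consecutive layers: shape (fan_in, fan_out)
coef_shapes = list(zip(layer_sizes[:-1], layer_sizes[1:]))
# One bias vector per non-input layer
intercept_sizes = layer_sizes[1:]

# Total number of trainable weights and biases
n_params = sum(i * o for i, o in coef_shapes) + sum(intercept_sizes)

print(coef_shapes)
print(intercept_sizes)
print(n_params)
```

On the fitted model, these can be compared with [w.shape for w in mlp.coefs_] and [len(b) for b in mlp.intercepts_].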

In [15]:
print(mlp.coefs_)

[array([[ -1.24465115e-01,  -6.27248453e-02,   1.72920556e-01,
          2.08703196e-01,   1.56253161e-01,  -2.28313431e-01,
          3.19348359e-01,   1.75433037e-02,  -1.51577072e-01,
          3.16558300e-01,  -5.09129140e-02,  -3.15692066e-01,
         -2.49111798e-01,  -7.04288308e-02,   1.59824681e-01,
         -2.08564959e-01,  -1.56995967e-02,   1.41007509e-01,
         -1.38993186e-01,   3.03084803e-01,   2.06397139e-01,
          1.65614261e-01,  -1.60374523e-01,   1.02703117e-01,
          5.25986165e-02,  -1.19015850e-01,   2.40531197e-01,
         -2.51465434e-01,   2.94863902e-01,   4.65184271e-02],
       [  3.69846311e-02,  -2.98902749e-01,   1.92366414e-01,
         -2.60940854e-03,   1.35898050e-01,  -6.91864311e-02,
          2.77554058e-01,  -3.21963401e-01,   3.43197044e-02,
         -1.32116357e-01,  -3.85621656e-02,  -2.75481667e-01,
          2.49387524e-02,  -2.08584233e-01,  -6.36422586e-05,
          2.23900746e-02,  -1.50990801e-01,   1.47368638e-01,
      