# Artificial Neural Networks Using Scikit-Learn
### Tawfiq Jawhar

This is a short example of how to use scikit-learn's `MLPClassifier`, a multi-layer perceptron (MLP) model that is trained using backpropagation.

We will use the breast cancer dataset for this example. First we load the dataset and split it into training and testing sets.

In [2]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
cancer = load_breast_cancer()
X = cancer['data']
y = cancer['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state = 0, shuffle=True)
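
As a quick sanity check on the split, the sketch below prints the array shapes and the number of samples per class. It assumes the default `test_size` of `train_test_split` (25% of the samples), since the call above does not override it; using `np.bincount` to count labels is just one convenient choice and is not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X, y = cancer['data'], cancer['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, shuffle=True)

# 569 samples with 30 features each; by default 25% are held out for testing.
print(X.shape, X_train.shape, X_test.shape)

# Samples per class (index 0 and index 1) in the full, training and testing sets.
print(np.bincount(y), np.bincount(y_train), np.bincount(y_test))
```

If the class proportions should be identical in both sets, `train_test_split` also accepts `stratify=y`; the split above does not use it.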

Neural networks are sensitive to feature scaling: training may have difficulty converging if the data is not normalized or standardized. Neural networks are also known to overfit easily. Shuffling the data before splitting matters in case the samples are ordered in a way that favors one class over the other, which would make the training and testing sets unrepresentative and bias the model.

In [3]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Fit only to the training data
scaler.fit(X_train)
# Now apply the transformations to the data:
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
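
To see what the scaler does, the following sketch checks the per-feature mean and standard deviation after the transform. The variable names `X_train_s` and `X_test_s` are illustrative only. Because the scaler is fit on the training set alone, only the training features come out with mean 0 and standard deviation 1; the testing features are only approximately standardized.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Learn the per-feature mean and standard deviation from the training data only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Training features: mean 0 and standard deviation 1, up to floating-point error.
train_mean_ok = np.allclose(X_train_s.mean(axis=0), 0.0, atol=1e-6)
train_std_ok = np.allclose(X_train_s.std(axis=0), 1.0, atol=1e-6)
print(train_mean_ok, train_std_ok)

# Testing features reuse the training statistics, so their means are close to 0 but not exactly 0.
print(np.abs(X_test_s.mean(axis=0)).max())
```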

Now we will use a multi-layer neural network to classify the data. We will also try different numbers of hidden layers.

In [4]:
from sklearn.neural_network import MLPClassifier
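# Note: (30) is just the integer 30, which scikit-learn treats as one hidden layer of 30 units.
mlp = MLPClassifier(hidden_layer_sizes=(30), random_state = 0)
# Fit the model on the training data and their class labels.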
mlp.fit(X_train,y_train)



MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=30, learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=0, shuffle=True,
       solver='adam', tol=0.0001, validation_fraction=0.1, verbose=False,
       warm_start=False)

Fitting this model produces a warning message (not reproduced in the output above) telling us that the optimizer has not converged within the maximum number of iterations. We could increase `max_iter`, but before doing that let's try adding more hidden layers.
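
For reference, the first option (raising the iteration limit) could be sketched as follows. The helper `fit_and_check` and the value `max_iter=1000` are assumptions made for illustration, not part of the original notebook. The sketch records any `ConvergenceWarning` raised during fitting and reports how many iterations were actually run.

```python
import warnings
from sklearn.datasets import load_breast_cancer
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

def fit_and_check(max_iter):
    """Fit a one-hidden-layer MLP and report whether a ConvergenceWarning was raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model = MLPClassifier(hidden_layer_sizes=(30,), max_iter=max_iter, random_state=0)
        model.fit(X_train, y_train)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    return model, warned

# Default limit (200 iterations) versus a larger, arbitrarily chosen limit.
default_model, default_warned = fit_and_check(max_iter=200)
longer_model, longer_warned = fit_and_check(max_iter=1000)

print(default_model.n_iter_, default_warned)
print(longer_model.n_iter_, longer_warned)
```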

In [8]:
mlp = MLPClassifier(hidden_layer_sizes=(30, 30, 30), random_state = 0, verbose = True)
# Fit the model on the training data and their class labels.
mlp.fit(X_train,y_train)

Iteration 1, loss = 0.59689915
Iteration 2, loss = 0.54625693
Iteration 3, loss = 0.49876702
Iteration 4, loss = 0.45457494
Iteration 5, loss = 0.41277608
Iteration 6, loss = 0.37406877
Iteration 7, loss = 0.33638174
Iteration 8, loss = 0.30151110
Iteration 9, loss = 0.26967658
Iteration 10, loss = 0.23999104
Iteration 11, loss = 0.21440121
Iteration 12, loss = 0.19118977
Iteration 13, loss = 0.17191592
Iteration 14, loss = 0.15566159
Iteration 15, loss = 0.14186999
Iteration 16, loss = 0.13076841
Iteration 17, loss = 0.12170903
Iteration 18, loss = 0.11346269
Iteration 19, loss = 0.10694669
Iteration 20, loss = 0.10107622
Iteration 21, loss = 0.09567836
Iteration 22, loss = 0.09117438
Iteration 23, loss = 0.08689998
Iteration 24, loss = 0.08342260
Iteration 25, loss = 0.08033247
Iteration 26, loss = 0.07792264
Iteration 27, loss = 0.07552436
Iteration 28, loss = 0.07310171
Iteration 29, loss = 0.07095833
Iteration 30, loss = 0.06892405
Iteration 31, loss = 0.06729706
Iteration 32, los

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(30, 30, 30), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=0, shuffle=True,
       solver='adam', tol=0.0001, validation_fraction=0.1, verbose=True,
       warm_start=False)

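This is better: we did not get a warning message this time, which means training stopped on its own before reaching `max_iter` because the loss was no longer improving by more than `tol`.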
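
A fitted `MLPClassifier` exposes a few attributes that make this easier to verify. The sketch below refits the three-layer model (without `verbose`) and inspects them; the exact iteration count and loss may differ between scikit-learn versions, so no values are quoted here.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

mlp = MLPClassifier(hidden_layer_sizes=(30, 30, 30), random_state=0)
mlp.fit(X_train, y_train)

# Number of iterations actually run; fewer than max_iter means training stopped early.
print(mlp.n_iter_, mlp.max_iter)

# Training loss after the last iteration (one entry per iteration in loss_curve_).
print(mlp.loss_curve_[-1])

# Layer count includes the input and output layers: 1 + 3 hidden + 1.
print(mlp.n_layers_)

# One weight matrix per connection between consecutive layers.
print([w.shape for w in mlp.coefs_])
```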

Now we want to predict the values of the testing set and evaluate the accuracy of the model.

In [6]:
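from sklearn.metrics import classification_report, confusion_matrix
# Predict the class of every sample in the testing set.
predictions = mlp.predict(X_test)
# Rows are the true classes, columns are the predicted classes.
print(confusion_matrix(y_test, predictions))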

[[53  0]
 [ 2 88]]


In [7]:
mlp.score(X_test, y_test)

0.986013986013986
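
The score above follows directly from the confusion matrix. As a small worked example, the sketch below recomputes the accuracy from the matrix printed earlier and also derives per-class recall and precision; it uses only NumPy and the four numbers already shown.

```python
import numpy as np

# Confusion matrix reported above: rows are true classes, columns are predictions.
cm = np.array([[53, 0],
               [2, 88]])

# Accuracy: correctly classified samples (the diagonal) over all samples, (53 + 88) / 143.
accuracy = np.trace(cm) / cm.sum()

# Recall per class: correct predictions divided by the number of true samples of that class.
recall = np.diag(cm) / cm.sum(axis=1)

# Precision per class: correct predictions divided by the number of predictions of that class.
precision = np.diag(cm) / cm.sum(axis=0)

print(accuracy)
print(recall)
print(precision)
```

Both errors are samples of class 1 that were predicted as class 0, so class 0 has perfect recall and class 1 has perfect precision.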

Looks like we did pretty well! Only 2 of the 143 test samples were misclassified.

This is a classic and easy binary classification dataset, so don't get your hopes up and assume things will be this easy to classify every time. Still, it makes a good example for practice.