### What's a perceptron?
- One or more weighted inputs, a bias, an activation function, and a single output
- Common activation functions include the logistic (sigmoid), hyperbolic tangent, and step functions
- The bias handles the case where all inputs == 0, in which no multiplicative weight would have any effect


- The output can then be compared to the known labels, and the weights adjusted according to the error
- The weights are usually initialized with random values
- Repeat until a defined number of iterations is reached or the error rate is acceptable
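The training loop described above can be sketched in plain Python. This is a minimal illustration, not part of the original notebook: the function names, the learning rate, and the AND-gate data are all assumptions chosen for the example, and the weights start at zero (rather than random values) only to keep the run reproducible.

```python
def step(z):
    """Step activation: output 1 when the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0

def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)

def train(samples, labels, learning_rate=0.1, max_iterations=100):
    """Perceptron learning rule; stops early once a full pass makes no errors."""
    weights = [0.0] * len(samples[0])  # assumption: zeros instead of random init
    bias = 0.0
    for _ in range(max_iterations):
        errors = 0
        for inputs, label in zip(samples, labels):
            error = label - predict(weights, bias, inputs)
            if error != 0:
                errors += 1
                # Nudge each weight in the direction that reduces the error
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if errors == 0:
            break
    return weights, bias

# Hypothetical toy data: the logical AND of two binary inputs
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]

weights, bias = train(and_inputs, and_labels)
and_predictions = [predict(weights, bias, x) for x in and_inputs]
print(and_predictions)
```

Note that for the input (0, 0) the weights contribute nothing to the sum, so the bias alone decides the output.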

### What's a neural network?
- Layers of perceptrons that mimic the way a biological neural network passes signals between neurons through axons.
- Layers
    - Input layers, which directly take the feature inputs
    - Hidden layers, so named because they don't directly see the inputs or outputs
    - Output layers, which create the results
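To make the layer structure concrete, here is a rough NumPy sketch of a forward pass through input, hidden, and output layers. The layer sizes, random weights, and sample values are hypothetical, and the choice of ReLU for hidden layers with a logistic output is an assumption for illustration rather than something this notebook prescribes.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: negative values become 0."""
    return np.maximum(0, z)

def logistic(z):
    """Logistic (sigmoid) function: squashes values into [0, 1]."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Pass one sample through every layer and return the output activation."""
    activation = x
    # Hidden layers: each one only sees the previous layer's activations
    for w, b in zip(weights[:-1], biases[:-1]):
        activation = relu(activation @ w + b)
    # Output layer produces the final result
    return logistic(activation @ weights[-1] + biases[-1])

rng = np.random.default_rng(0)
layer_sizes = [4, 3, 3, 1]  # hypothetical: 4 features, two hidden layers of 3, 1 output
weights = [rng.normal(size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

sample = np.array([0.5, -1.2, 0.3, 2.0])  # made-up feature values
output = forward(sample, weights, biases)
print(output.shape)
```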


In [1]:
import sklearn as skl
from sklearn.datasets import load_breast_cancer

In [3]:
cancer = load_breast_cancer()
cancer.keys()

dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names'])

In [5]:
print(cancer['DESCR'])

Breast Cancer Wisconsin (Diagnostic) Database

Notes
-----
Data Set Characteristics:
    :Number of Instances: 569

    :Number of Attributes: 30 numeric, predictive attributes and the class

    :Attribute Information:
        - radius (mean of distances from center to points on the perimeter)
        - texture (standard deviation of gray-scale values)
        - perimeter
        - area
        - smoothness (local variation in radius lengths)
        - compactness (perimeter^2 / area - 1.0)
        - concavity (severity of concave portions of the contour)
        - concave points (number of concave portions of the contour)
        - symmetry 
        - fractal dimension ("coastline approximation" - 1)

        The mean, standard error, and "worst" or largest (mean of the three
        largest values) of these features were computed for each image,
        resulting in 30 features.  For instance, field 3 is Mean Radius, field
        13 is Radius SE, field 23 is Worst Radius.

        

In [6]:
cancer['data'].shape

(569, 30)

In [7]:
# Set up features (x) and labels (y)
x = cancer['data']
y = cancer['target']

In [13]:
cancer['target']
# 0 = malignant, 1 = benign

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
       1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1,
       1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0,
       1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1,
       1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1,
       0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1,
       0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1,
       0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1,
       0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0,
       0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1,
       1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1,
       1, 0,

In [14]:
# split into training and test sets (default: 75% train, 25% test)
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y)

### Preprocessing
- Multi-layer perceptrons are sensitive to feature scaling
- A NN may have difficulty converging if the data is not normalized
- Fit the scaler on the training set only, then apply the same transformation to both the training and test sets
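As a rough illustration of what standardization does, the sketch below scales a tiny made-up data set by hand: subtract the per-feature mean and divide by the per-feature standard deviation, both computed from the training data only. The array values and variable names are hypothetical; the arithmetic mirrors the zero-mean, unit-variance scaling that `StandardScaler` performs with its default settings.

```python
import numpy as np

# Hypothetical data: two features on very different scales
train_demo = np.array([[1.0, 200.0],
                       [2.0, 400.0],
                       [3.0, 600.0]])
test_demo = np.array([[2.0, 500.0]])

# Statistics come from the training data only
mean = train_demo.mean(axis=0)
std = train_demo.std(axis=0)  # population standard deviation (ddof=0)

# The same transformation is applied to both sets
train_scaled = (train_demo - mean) / std
test_scaled = (test_demo - mean) / std

print(train_scaled.mean(axis=0))  # per-feature means of the scaled training data
print(train_scaled.std(axis=0))   # per-feature standard deviations
```

Reusing the training statistics on the test set keeps information about the test data from leaking into the preprocessing step.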

In [15]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

# Fit to training data
scaler.fit(x_train)

StandardScaler(copy=True, with_mean=True, with_std=True)

In [23]:
type(x_train)

numpy.ndarray

In [26]:
import numpy as np
np.ptp(x_train, axis=0)

array([  2.11290000e+01,   2.95700000e+01,   1.44710000e+02,
         2.35550000e+03,   1.10770000e-01,   3.26020000e-01,
         4.26400000e-01,   1.91300000e-01,   1.98000000e-01,
         4.74800000e-02,   2.75770000e+00,   4.52480000e+00,
         2.12230000e+01,   5.18798000e+02,   2.94170000e-02,
         1.04148000e-01,   3.96000000e-01,   5.27900000e-02,
         7.10680000e-02,   2.89452000e-02,   2.52000000e+01,
         3.75200000e+01,   1.78890000e+02,   3.04880000e+03,
         1.51430000e-01,   1.03071000e+00,   1.25200000e+00,
         2.91000000e-01,   5.07300000e-01,   1.52460000e-01])

In [27]:
np.ptp(x_train, axis=1)

array([  433.996558 ,  1747.994185 ,   773.398303 ,   745.29803  ,
        1622.996663 ,  1645.996609 ,   533.995179 ,  1492.997275 ,
         439.597287 ,   483.098339 ,   896.994588 ,  1739.996789 ,
        1297.992455 ,   435.897705 ,  2144.997392 ,   275.596075 ,
         656.995132 ,  1659.99488  ,  1656.998411 ,  1436.997444 ,
        1409.998913 ,   491.797216 ,   567.69089  ,   575.994769 ,
         546.696293 ,   505.597023 ,  1808.997311 ,   909.394    ,
         782.09848  ,   476.096279 ,   777.497832 ,   532.79747  ,
         522.897728 ,   741.594918 ,   799.595978 ,   783.598188 ,
        2021.997795 ,   564.197516 ,   701.896253 ,   605.797711 ,
         931.39825  ,  1029.998371 ,   623.697387 ,   515.296187 ,
         808.896751 ,   708.797504 ,   825.997305 ,   547.398142 ,
        2476.997744 ,   876.496998 ,  1730.997502 ,   766.897472 ,
         660.197213 ,   630.497575 ,   705.597015 ,   993.598132 ,
         907.196683 ,  1687.996603 ,   623.996593 ,   590.9959

In [28]:
# Apply to data
x_train = scaler.transform(x_train)
x_test = scaler.transform(x_test)

In [29]:
np.ptp(x_train, axis=0)

array([  6.07798361,   6.71318781,   6.04266658,   6.80324506,
         7.66660352,   6.19632429,   5.45495902,   4.96321577,
         7.07628703,   6.5864779 ,  10.54949836,   8.18053343,
        11.0963907 ,  12.99287432,   9.85129027,   6.10467024,
        12.71398381,   8.71833654,   8.47819085,  10.71360459,
         5.31802447,   5.97206101,   5.41735992,   5.57639366,
         6.51623523,   6.47779501,   5.95278368,   4.41281141,
         8.07617523,   8.34608678])

In [30]:
np.ptp(x_train, axis=1)

array([  3.02800074,   3.76443263,   1.28141253,   1.99969992,
         2.81475826,   2.85681885,   2.7795653 ,   1.80476981,
         2.27087263,   1.90619945,   2.38118391,   2.41190828,
         3.98152566,   1.42193835,   2.78333212,   2.93815461,
         2.01318448,   3.82964652,   3.45297961,   2.99505799,
         2.60319158,   1.31768689,   6.71105419,   2.74057429,
         4.20833994,   2.48268765,   2.9817679 ,   1.98628087,
         1.66959011,   3.31527688,   1.23836481,   1.26898116,
         1.76325908,   3.03506212,   1.59889165,   1.87655162,
         3.340201  ,   1.29084493,   1.20776053,   1.70527165,
         3.75958427,   2.07414459,   1.43724212,   2.08652376,
         2.69168356,   0.8916276 ,   1.65700539,   1.54914171,
         4.46964449,   2.67290664,   3.27389392,   1.64311137,
         1.91994506,   1.72763756,   4.13272817,   1.49602947,
         2.06155913,   2.9732742 ,   2.2759863 ,   2.1844625 ,
         2.2707461 ,   3.21393848,   6.14495753,   2.44

In [31]:
from scipy import stats
stats.describe(x_train)

DescribeResult(nobs=426, minmax=(array([-2.03179637, -2.16808102, -1.98727515, -1.45126444, -3.03385706,
       -1.59643001, -1.11229971, -1.25087541, -2.65674527, -1.77781186,
       -1.06414425, -1.56070631, -1.06023343, -0.78685262, -1.76220107,
       -1.31668694, -1.00072486, -1.92029273, -1.49587285, -1.04743623,
       -1.73390177, -2.1866007 , -1.69608472, -1.24038432, -2.63516326,
       -1.41375269, -1.2779906 , -1.72195931, -2.11999142, -1.57008766]), array([  4.04618724,   4.54510679,   4.05539142,   5.35198063,
         4.63274646,   4.59989427,   4.34265931,   3.71234036,
         4.41954176,   4.80866604,   9.4853541 ,   6.61982713,
        10.03615728,  12.2060217 ,   8.0890892 ,   4.7879833 ,
        11.71325895,   6.79804381,   6.982318  ,   9.66616837,
         3.5841227 ,   3.78546031,   3.7212752 ,   4.33600934,
         3.88107196,   5.06404232,   4.67479309,   2.6908521 ,
         5.95618381,   6.77599912])), mean=array([ -1.27388971e-15,  -1.54545130e-15,   1.94

### Training Time

In [32]:
# import estimator 
from sklearn.neural_network import MLPClassifier

### Create instance of the model
- Here only the hidden_layer_sizes parameter is defined
    - It takes a tuple of
        - The number of neurons wanted in each hidden layer
        - The nth entry in the tuple is the number of neurons in the nth hidden layer of the model

- Here, define 3 hidden layers, each with the same number of neurons as there are features in the data set
    - The description above states 'Number of Attributes: 30 numeric, predictive attributes and the class'
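To see how a `hidden_layer_sizes` tuple translates into a network, the sketch below lists the shape of the weight matrix between each pair of consecutive layers. The helper `weight_shapes` is hypothetical (it is not part of scikit-learn), and it assumes a single output unit, which is how a binary problem like this one is typically set up.

```python
def weight_shapes(n_features, hidden_layer_sizes, n_outputs):
    """Return (inputs, outputs) for the weight matrix between consecutive layers."""
    sizes = [n_features, *hidden_layer_sizes, n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))

# 30 input features, three hidden layers of 30 neurons, 1 output unit (assumed)
shapes = weight_shapes(30, (30, 30, 30), 1)
for n_in, n_out in shapes:
    print(f"{n_in} inputs -> {n_out} neurons")
```

Three hidden layers plus the output layer give four weight matrices in total.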

In [33]:
mlp = MLPClassifier(hidden_layer_sizes=(30,30,30))

In [34]:
# now fit the model to the training data
mlp.fit(x_train, y_train)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(30, 30, 30), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='adam', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

The output above shows the other parameters and their default values

In [35]:
mlp.fit?

### Predictions & Evaluation

In [36]:
predictions = mlp.predict(x_test)

In [37]:
from sklearn.metrics import classification_report,confusion_matrix
print(confusion_matrix(y_test,predictions))

[[60  1]
 [ 3 79]]


In [38]:
print(classification_report(y_test,predictions))

             precision    recall  f1-score   support

          0       0.95      0.98      0.97        61
          1       0.99      0.96      0.98        82

avg / total       0.97      0.97      0.97       143



The result is about 97% accuracy (139 of 143 test samples classified correctly), not too shabby
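The figures in the classification report can be recomputed by hand from the confusion matrix printed earlier. This is a small sketch in plain Python; the helper names are made up for illustration, and it relies on scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
# Confusion matrix from the notebook output: rows = true class, columns = predicted class
confusion = [[60, 1],
             [3, 79]]

total = sum(sum(row) for row in confusion)
accuracy = (confusion[0][0] + confusion[1][1]) / total

def precision(cls):
    """Of everything predicted as cls, the fraction that really is cls."""
    predicted_as_cls = sum(row[cls] for row in confusion)
    return confusion[cls][cls] / predicted_as_cls

def recall(cls):
    """Of everything that really is cls, the fraction predicted as cls."""
    return confusion[cls][cls] / sum(confusion[cls])

print(f"accuracy: {accuracy:.3f}")
for cls in (0, 1):
    print(f"class {cls}: precision={precision(cls):.2f} recall={recall(cls):.2f}")
```

The row sums (61 and 82) match the support column of the report.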

### Notes
- A downside of MLPs is that the model itself is hard to interpret
    - The weights and biases do not readily show which features are important to the model

In [39]:
# Extraction of public attributes
# coefs_ holds one weight matrix per layer transition: 3 hidden layers + 1 output layer = 4
len(mlp.coefs_)

4

In [40]:
len(mlp.coefs_[0])

30

In [41]:
len(mlp.intercepts_[0])

30