# Drill

Create a multi-layer perceptron neural network model to predict on a labeled dataset of your choosing. Compare this model to either a boosted tree or a random forest model and describe the relative tradeoffs between complexity and accuracy. Be sure to vary the hyperparameters of your MLP!

We are going to use a dataset of black-and-white digit images, where each row holds a label and 784 pixel values (`pixel0` through `pixel783`). Using a neural network, we are going to predict which digit each image represents.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.model_selection import train_test_split

from sklearn import linear_model
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

from sklearn.model_selection import GridSearchCV

from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report


Import the data and take a look at it.

In [2]:
df = pd.read_csv('train.csv')
print(df.head())
print(df.shape)

   label  pixel0  pixel1  pixel2  pixel3  pixel4  pixel5  pixel6  pixel7  \
0      1       0       0       0       0       0       0       0       0   
1      0       0       0       0       0       0       0       0       0   
2      1       0       0       0       0       0       0       0       0   
3      4       0       0       0       0       0       0       0       0   
4      0       0       0       0       0       0       0       0       0   

   pixel8    ...     pixel774  pixel775  pixel776  pixel777  pixel778  \
0       0    ...            0         0         0         0         0   
1       0    ...            0         0         0         0         0   
2       0    ...            0         0         0         0         0   
3       0    ...            0         0         0         0         0   
4       0    ...            0         0         0         0         0   

   pixel779  pixel780  pixel781  pixel782  pixel783  
0         0         0         0         0         

X = all of the feature columns. <br>
Y = the digit output. <br>
Split X and Y into training and testing datasets.

In [3]:
X = df.drop('label', axis=1)
Y = df['label']

X_train, X_test, Y_train, Y_test = train_test_split(X, Y,
                                                    test_size=0.2,
                                                    random_state=0)

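Create a pipeline that chains a restricted Boltzmann machine (RBM) feature extractor with a logistic regression classifier. The RBM learns a new representation of the pixels, and the logistic regression predicts the digit from that representation.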

In [4]:
rbm = BernoulliRBM(random_state=0, verbose=True)
logistic = linear_model.LogisticRegression()

pipe = Pipeline(steps=[('rbm', rbm), ('logistic', logistic)])

Set the candidate hyperparameter values for the RBM and the logistic regression, then run a grid search over every combination to find the best one.

In [5]:
learning_rates = [1e-3, 1e-2, 1e-1, 0.25, 0.5, 1]
n_iters = [20]
components = [10, 50, 100, 300]
c_values = [1e-3, 1e-2, 1e-1, 0.25, 0.5, 1, 100, 1000]

param_grid = [
    {
        'rbm__learning_rate': learning_rates,
        'rbm__n_iter': n_iters,
        'rbm__n_components': components,
        'logistic__C': c_values
    }
]
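To get a feel for how expensive this search is, the grid can be enumerated before running it. The sketch below is a minimal illustration using only the standard library; `n_candidates`, `cv_folds`, and `n_fits` are names introduced here, and the fold count of 5 is taken from the `cv=5` argument in the grid search cell.

```python
from itertools import product

# Same candidate values as in the param_grid above.
learning_rates = [1e-3, 1e-2, 1e-1, 0.25, 0.5, 1]
n_iters = [20]
components = [10, 50, 100, 300]
c_values = [1e-3, 1e-2, 1e-1, 0.25, 0.5, 1, 100, 1000]

# Every combination the grid search would evaluate.
combinations = list(product(learning_rates, n_iters, components, c_values))
n_candidates = len(combinations)  # 6 * 1 * 4 * 8

# Each candidate is fit once per cross-validation fold. GridSearchCV
# also refits the best candidate on the full training set by default.
cv_folds = 5
n_fits = n_candidates * cv_folds + 1

print(f'{n_candidates} candidates, {n_fits} pipeline fits')
```

With this many fits, narrowing the grid first or raising `n_jobs` above 1 may be worth considering.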

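Run the grid search and display the best parameters. The search calls are commented out in the cell below; the values they produced are recorded underneath.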

In [6]:
#grid = GridSearchCV(pipe, cv=5, n_jobs=1, param_grid=param_grid)
#grid.fit(X_train, Y_train)

#print(f'best params:\n {grid.best_params_}')

# BEST PARAMS

logistic C: 0.001 <br>
rbm learning_rate: 0.001 <br>
rbm n_components: 10 <br>
rbm n_iter: 20 <br>

Fit the pipeline model with these newly determined parameters.

In [7]:
# set_params updates the rbm and logistic steps in place, so the
# estimators do not need to be configured separately first.
pipe.set_params(rbm__learning_rate=0.001,
                rbm__n_components=10,
                rbm__n_iter=20,
                logistic__C=0.001)
pipe.fit(X_train, Y_train)

[BernoulliRBM] Iteration 1, pseudo-likelihood = -437048.62, time = 2.74s
[BernoulliRBM] Iteration 2, pseudo-likelihood = -873982.26, time = 2.90s
[BernoulliRBM] Iteration 3, pseudo-likelihood = -1310915.50, time = 3.28s
[BernoulliRBM] Iteration 4, pseudo-likelihood = -1747849.14, time = 2.83s
[BernoulliRBM] Iteration 5, pseudo-likelihood = -2184781.93, time = 3.86s
[BernoulliRBM] Iteration 6, pseudo-likelihood = -2621714.79, time = 4.50s
[BernoulliRBM] Iteration 7, pseudo-likelihood = -3058647.89, time = 4.22s
[BernoulliRBM] Iteration 8, pseudo-likelihood = -3495581.98, time = 2.94s
[BernoulliRBM] Iteration 9, pseudo-likelihood = -3932514.89, time = 2.88s
[BernoulliRBM] Iteration 10, pseudo-likelihood = -4369448.00, time = 2.89s
[BernoulliRBM] Iteration 11, pseudo-likelihood = -4806380.79, time = 2.88s
[BernoulliRBM] Iteration 12, pseudo-likelihood = -5243314.08, time = 2.87s
[BernoulliRBM] Iteration 13, pseudo-likelihood = -5680247.51, time = 2.71s
[BernoulliRBM] Iteration 14, pseudo-

Pipeline(memory=None,
     steps=[('rbm', BernoulliRBM(batch_size=10, learning_rate=0.001, n_components=10, n_iter=20,
       random_state=0, verbose=True)), ('logistic', LogisticRegression(C=0.001, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False))])
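In the training log above, the pseudo-likelihood falls by roughly the same amount on every iteration instead of improving, which suggests the RBM is not learning useful features. One possible cause: `BernoulliRBM` assumes its inputs are binary or lie between 0 and 1, while the pixel columns here appear to be raw intensities. The sketch below shows one way to add a scaling step in front of the RBM. It assumes the pixels are 8-bit values in the range 0 to 255, which the source does not confirm; `scale_pixels`, `scaled_pipe`, and the synthetic `X_demo` array are names introduced for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer


def scale_pixels(X):
    """Map pixel intensities onto [0, 1], assuming an 8-bit 0-255 range."""
    return np.asarray(X, dtype=np.float64) / 255.0


# Same rbm -> logistic chain as before, with a scaling step in front.
scaled_pipe = Pipeline(steps=[
    ('scale', FunctionTransformer(scale_pixels)),
    ('rbm', BernoulliRBM(random_state=0)),
    ('logistic', LogisticRegression()),
])

# Tiny synthetic stand-in for the pixel columns (not the real train.csv).
rng = np.random.RandomState(0)
X_demo = rng.randint(0, 256, size=(20, 16))
X_scaled = scale_pixels(X_demo)
print(X_scaled.min(), X_scaled.max())
```

Calling `scaled_pipe.fit(X_train, Y_train)` would then train the RBM on scaled inputs. Whether that changes the accuracy here has not been tested, and the hyperparameters would likely need to be searched again.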

In [8]:
train_pred = pipe.predict(X_train)
test_pred = pipe.predict(X_test)
print(f'Training Accuracy: {accuracy_score(Y_train, train_pred)}')
print(f'Testing Accuracy: {accuracy_score(Y_test, test_pred)}')

# Reuse test_pred rather than predicting on X_test a second time.
report = classification_report(Y_test, test_pred)
print(f'Logistic regression using RBM features:\n{report}\n')

Training Accuracy: 0.11080357142857143
Testing Accuracy: 0.1144047619047619
Logistic regression using RBM features:
             precision    recall  f1-score   support

          0       0.00      0.00      0.00       813
          1       0.11      1.00      0.21       961
          2       0.00      0.00      0.00       860
          3       0.00      0.00      0.00       863
          4       0.00      0.00      0.00       827
          5       0.00      0.00      0.00       756
          6       0.00      0.00      0.00       841
          7       0.00      0.00      0.00       899
          8       0.00      0.00      0.00       768
          9       0.00      0.00      0.00       812

avg / total       0.01      0.11      0.02      8400
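The report above shows a recall of 1.00 for digit 1 and 0.00 for every other digit, so the pipeline is predicting 1 for every test image. A quick check confirms that the testing accuracy is exactly what that constant prediction would score. The sketch below copies the support counts from the report; `support` and `majority_accuracy` are names introduced here.

```python
# Per-class support counts copied from the classification report above.
support = {0: 813, 1: 961, 2: 860, 3: 863, 4: 827,
           5: 756, 6: 841, 7: 899, 8: 768, 9: 812}

n_test = sum(support.values())

# Accuracy of a model that always predicts the most common class.
majority_class = max(support, key=support.get)
majority_accuracy = support[majority_class] / n_test

print(f'Test rows: {n_test}')
print(f'Most common digit: {majority_class}')
print(f'Always-predict-{majority_class} accuracy: {majority_accuracy:.4f}')
```

The result matches the reported testing accuracy of 0.1144, so the RBM features carry no usable signal for the classifier in this configuration.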






In [12]:
print(X_train)

       pixel0  pixel1  pixel2  pixel3  pixel4  pixel5  pixel6  pixel7  pixel8  \
39317       0       0       0       0       0       0       0       0       0   
32837       0       0       0       0       0       0       0       0       0   
16644       0       0       0       0       0       0       0       0       0   
20005       0       0       0       0       0       0       0       0       0   
1533        0       0       0       0       0       0       0       0       0   
41842       0       0       0       0       0       0       0       0       0   
7781        0       0       0       0       0       0       0       0       0   
28433       0       0       0       0       0       0       0       0       0   
5554        0       0       0       0       0       0       0       0       0   
31233       0       0       0       0       0       0       0       0       0   
40004       0       0       0       0       0       0       0       0       0   
3094        0       0       

In [11]:
logistic_classifier = linear_model.LogisticRegression(C=0.001)
logistic_classifier.fit(X_train, Y_train)

train_pred = logistic_classifier.predict(X_train)
test_pred = logistic_classifier.predict(X_test)
print(f'Training Accuracy: {accuracy_score(Y_train, train_pred)}')
print(f'Testing Accuracy: {accuracy_score(Y_test, test_pred)}')

# Reuse test_pred rather than predicting on X_test a second time.
report = classification_report(Y_test, test_pred)
print(f'Logistic regression using raw pixel features:\n{report}\n')

Training Accuracy: 0.9377678571428572
Testing Accuracy: 0.9085714285714286
Logistic regression using raw pixel features:
             precision    recall  f1-score   support

          0       0.95      0.95      0.95       813
          1       0.96      0.97      0.96       961
          2       0.91      0.88      0.90       860
          3       0.89      0.88      0.89       863
          4       0.92      0.91      0.91       827
          5       0.87      0.83      0.85       756
          6       0.94      0.96      0.95       841
          7       0.93      0.92      0.93       899
          8       0.83      0.89      0.86       768
          9       0.87      0.88      0.87       812

avg / total       0.91      0.91      0.91      8400


