# Question 2
Consider the Mammography dataset available on the resources tab. There are two classes: class 1 indicates calcification (cancer) and class 0 indicates no calcification (no cancer). Thus, the class 1 is the positive class and class 0 is the negative class. You are required to use and compare a neural network classifier (MLPClassifier in scikit-learn, for example) and a decision tree classifier (DecisionTreeClassifier in scikit-learn, for example). You will use 10-fold cross-validation (StratifiedKFold in scikit-learn; also look at cross_val_score) to compare the two classifiers. Please identify classifier is statistically significantly better at 95% confidence when using Error as a metric. Please identify which classifier is statistically significantly better at 95% confidence when using AUC or F-measure as a metric.  Please also discuss if there are any differences in classifier performance when using AUC / F-measure or Error as the evaluation metric.  (30 points)

*Extra Credit: Consider optimizing the decision tree pruning criterion or MLP learning rate / number of units and see if the performance can be improved. (5 points)*

In [292]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [293]:
features = ["A", "B", "C", "D", "E", "F", "Class"]
df = pd.read_csv('data/ism.data', names=features)
df.describe()

Unnamed: 0,A,B,C,D,E,F,Class
count,11183.0,11183.0,11183.0,11183.0,11183.0,11183.0,11183.0
mean,4.631014,106.292408,0.013124,2.037123,11.476447,0.310368,1.02325
std,5.903782,226.060108,0.022182,2.369981,30.37176,0.32818,0.150702
min,0.0,0.0,0.0,0.0,0.0,0.0,1.0
25%,0.0,0.0,0.0,0.0,0.0,0.0,1.0
50%,3.99,17.0,0.008,0.0,0.0,0.0,1.0
75%,6.4845,89.0,0.018,3.981,0.0,0.644,1.0
max,190.65,1256.0,0.667,24.768,728.77,0.95,2.0


In [294]:
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score
from sklearn import linear_model

In [295]:
y = df['Class']
x = df.drop(['Class'], axis=1)

## Neural Network Classifier: MLPClassifier
### cross_val_score
Implements stratified k fold when given the integer value for cv parameter

In [296]:
clf = MLPClassifier(hidden_layer_sizes=(100,100,100), max_iter=500, alpha=0.0001,
                     solver='sgd', verbose=10,  random_state=21,tol=0.000000001)
results = []
cv_results = cross_val_score(clf, x, y, cv=10, scoring="accuracy")
cv_results_f_mlp = cross_val_score(clf, x, y, cv=10, scoring="f1")
results.append(cv_results)
results.append(cv_results_f_mlp)

Iteration 1, loss = 0.38233871
Iteration 2, loss = 0.28318539
Iteration 3, loss = 0.22969765
Iteration 4, loss = 0.21128260
Iteration 5, loss = 0.18456440
Iteration 6, loss = 0.22189577
Iteration 7, loss = 0.20086695
Iteration 8, loss = 0.15835074
Iteration 9, loss = 0.14090748
Iteration 10, loss = 0.13236506
Iteration 11, loss = 0.12217124
Iteration 12, loss = 0.11940119
Iteration 13, loss = 0.11208687
Iteration 14, loss = 0.10816281
Iteration 15, loss = 0.10431854
Iteration 16, loss = 0.09818553
Iteration 17, loss = 0.09738807
Iteration 18, loss = 0.09410910
Iteration 19, loss = 0.24137673
Iteration 20, loss = 0.11575702
Iteration 21, loss = 0.10289540
Training loss did not improve more than tol=0.000000 for two consecutive epochs. Stopping.
Iteration 1, loss = 0.38472976
Iteration 2, loss = 0.28363146
Iteration 3, loss = 0.23100174
Iteration 4, loss = 0.27404273
Iteration 5, loss = 0.19966675
Iteration 6, loss = 0.17146349
Iteration 7, loss = 0.15639325
Iteration 8, loss = 0.1421022

## Decision Tree Classifier: DecisionTreeClassifier

In [303]:
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn import tree
from sklearn.model_selection import train_test_split

In [304]:
clf = tree.DecisionTreeClassifier(random_state=0)


cv_results = cross_val_score(clf, x, y, cv=10, scoring="accuracy")
cv_results_f_tree = cross_val_score(clf, x, y, cv=10, scoring="f1")
results.append(cv_results)
results.append(cv_results_f_mlp)
results.append(cv_results_f_tree)

In [305]:
# import graphviz 
# dot_data = tree.export_graphviz(clf, out_file=None) 
# graph = graphviz.Source(dot_data) 
# graph

## T Test where A is MLPClassifier and B is DecisionTreeClassifier
Null Hypothesis:  There is no statistically significantly better classifier.

In [306]:
import math
from scipy import stats
t_prime, p = stats.ttest_ind(results[0], results[1], equal_var=False)
t = float(1.96)
print("T_Test between {} & {}: T Value = {}, P Value = {}".format("MLPClassifier", "DecisionTreeClassifier", t_prime, p))
if (t_prime >= t):
    print("MLPClassifier is statistically significantly better than DecisionTreeClassifier")
elif (t_prime <= -t): 
    print("MLPClassifier is statistically significantly worse than DecisionTreeClassifier")
else: 
    print("There is no statistically significant difference.")

T_Test between MLPClassifier & DecisionTreeClassifier: T Value = -8.55448608048, P Value = 8.49922930944e-07
MLPClassifier is statistically significantly worse than DecisionTreeClassifier


## F-Measure

In [307]:
t_prime, p = stats.ttest_ind(results[2], results[3], equal_var=False)
t = float(1.96)
print("T_Test between {} & {}: T Value = {}, P Value = {}".format("MLPClassifier", "DecisionTreeClassifier", t_prime, p))
if (t_prime >= t):
    print("MLPClassifier is statistically significantly better than DecisionTreeClassifier")
elif (t_prime <= -t): 
    print("MLPClassifier is statistically significantly worse than DecisionTreeClassifier")
else: 
    print("There is no statistically significant difference.")

T_Test between MLPClassifier & DecisionTreeClassifier: T Value = -5.32803812636, P Value = 0.000203255046296
MLPClassifier is statistically significantly worse than DecisionTreeClassifier
