#### Neural Network Implementation in Python
At the time of writing, scikit-learn does not include a multilayer perceptron in a stable release; one has been implemented only in the development version of scikit-learn.
https://www.kaggle.com/c/otto-group-product-classification-challenge/forums/t/13547/question-about-neural-networks-in-python

The basic neural network below was used to test these datasets. It is a multilayer perceptron implemented by Issam Laradji.

https://github.com/IssamLaradji/NeuralNetworks/tree/master/multilayer_perceptron

The main file is "multilayer_perceptron.py".

It contains three classes:

1) Multi-layer perceptron Classifier (explained in detail at line 562)

2) Multi-layer perceptron Regressor (explained in detail at line 830)

3) Multi-layer perceptron Autoencoder (explained in detail at line 999)

The other files in this repo are:

- 'base.py' contains the activation functions and their derivatives, and the loss functions.

- 'autoencoder.py'
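
To make the role of 'base.py' concrete, here is a minimal sketch of the kind of helpers such a file holds: an activation function, its derivative, and a loss. The function names and the numerical check are illustrative assumptions, not code taken from the repo.

```python
import math


def logistic(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logistic_derivative(y):
    """Derivative of the logistic function, expressed via its output y."""
    return y * (1.0 - y)


def tanh_derivative(y):
    """Derivative of tanh, expressed via its output y = tanh(x)."""
    return 1.0 - y * y


def log_loss(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over paired labels and probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() stays finite
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


def numeric_derivative(f, x, h=1e-6):
    """Central finite difference, used only to check the formulas above."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x = 0.3
print(logistic_derivative(logistic(x)), numeric_derivative(logistic, x))
print(tanh_derivative(math.tanh(x)), numeric_derivative(math.tanh, x))
```

Writing each derivative in terms of the activation's output is a common choice in backpropagation code, because the output is already available from the forward pass.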

In [1]:
"""
==============================================
Using multilayer perceptron for classification
==============================================

"""

import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
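# sklearn.cross_validation was removed in scikit-learn 0.20; the same names live in sklearn.model_selection
from sklearn.model_selection import train_test_split, cross_val_score, StratifiedShuffleSplit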
from sklearn import preprocessing, metrics
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize
from sklearn.multiclass import OneVsRestClassifier
import timeit
import matplotlib.pyplot as plt
import os
%matplotlib inline
from multilayer_perceptron import MultilayerPerceptronClassifier

In [2]:
## Reading the file
def read_file(trainF, testF, Directory, Target_col, transform, drop_cols=None, categ_transform=None):
    """Load one train/test split and label-encode the target and categorical columns."""
    train = pd.read_csv(Directory + trainF)
    test = pd.read_csv(Directory + testF)
    # Columns to label-encode: the target (if requested) plus any categorical features
    encode_cols = [Target_col] if transform else []
    if categ_transform is not None:
        encode_cols += list(categ_transform)
    for col in encode_cols:
        lbl_enc = preprocessing.LabelEncoder()
        # Fit one encoder on the values of both splits so that a given label
        # gets the same integer code in train and test. Fitting a second
        # encoder on the test split alone can renumber the classes.
        lbl_enc.fit(pd.concat([train[col], test[col]]).values)
        train[col] = lbl_enc.transform(train[col].values)
        test[col] = lbl_enc.transform(test[col].values)
    if drop_cols is not None:
        # drop() returns a new frame unless inplace=True, so keep the result
        train = train.drop(drop_cols, axis=1)
        test = test.drop(drop_cols, axis=1)

    return train, test
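
The integer codes produced for the train and test splits only line up when both splits share one label-to-code mapping. The sketch below illustrates this with a pure-Python stand-in for `LabelEncoder` (which also numbers classes in sorted order); the helper names and the toy labels are hypothetical.

```python
def fit_label_codes(values):
    """Map each distinct label to an integer code, in sorted order."""
    return {label: code for code, label in enumerate(sorted(set(values)))}


def apply_label_codes(codes, values):
    """Encode values with an existing mapping; unseen labels raise KeyError."""
    return [codes[v] for v in values]


train_labels = ["A", "B", "C", "B"]
test_labels = ["C", "B"]  # class "A" happens to be absent from this split

# One mapping shared by both splits keeps the codes consistent.
shared = fit_label_codes(train_labels + test_labels)
consistent = apply_label_codes(shared, test_labels)

# A mapping fitted on the test split alone renumbers the classes.
refit = apply_label_codes(fit_label_codes(test_labels), test_labels)

print(consistent)  # [2, 1]
print(refit)       # [1, 0]
```

With the refitted mapping, "C" in the test split gets code 1, which means "B" in the training split, so predictions would be scored against the wrong classes.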

In [3]:
## MLP classifier
def mlp_classifier(train, test, accuracy, roc_auc, Target_col):
    """Fit a one-vs-rest MLP on one split; append its accuracy and micro-average AUC."""
    start_time = timeit.default_timer()
    y = train[Target_col]
    X = train.drop([Target_col],axis=1)
    test_labels = test[Target_col]
    test_X = test.drop([Target_col],axis=1)
    # Binarize the output, using the training classes for both splits so that
    # the columns of y and test_labels refer to the same classes
    classes = np.unique(y)
    y = label_binarize(y, classes=classes)
    test_labels = label_binarize(test_labels, classes=classes)
    n_classes = y.shape[1]
    classifier = OneVsRestClassifier(MultilayerPerceptronClassifier())
    y_score = classifier.fit(X, y).decision_function(test_X)
    y_pred = classifier.predict(test_X)

    # Compute ROC curve and ROC area for each class
    fpr = dict()
    tpr = dict()
    roc_auc_dict = dict()
    for i in range(n_classes):
        fpr[i], tpr[i], _ = roc_curve(test_labels[:, i], y_score[:, i])
        roc_auc_dict[i] = auc(fpr[i], tpr[i])

    # Compute micro-average ROC curve and ROC area
    fpr["micro"], tpr["micro"], _ = roc_curve(test_labels.ravel(), y_score.ravel())
    roc_auc_dict["micro"] = auc(fpr["micro"], tpr["micro"])
    roc_auc.append(roc_auc_dict["micro"])
    # On binarized (indicator) labels, accuracy_score is the exact-match
    # accuracy: a row counts as correct only if every column matches
    accuracy.append(metrics.accuracy_score(test_labels, y_pred))
    # Elapsed time in minutes
    elapsed = (timeit.default_timer() - start_time)/60
    return accuracy, roc_auc, elapsed
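
The two metrics collected by `mlp_classifier` behave quite differently on binarized labels. The sketch below reproduces both in plain Python on made-up scores: the micro-average AUC flattens the label and score matrices and treats every cell as one binary decision, while accuracy on indicator rows only credits rows that match exactly. The helper names, the toy scores, and the 0.5 threshold used to derive predictions are assumptions for illustration; they are not taken from scikit-learn or the MLP repo.

```python
def binarize(labels, classes):
    """One indicator column per class, in the order given."""
    return [[1 if y == c else 0 for c in classes] for y in labels]


def pairwise_auc(y_true, y_score):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win, which matches the trapezoidal ROC area.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_average_auc(y_bin, scores):
    """Flatten both matrices (like .ravel()) and score all cells together."""
    flat_true = [v for row in y_bin for v in row]
    flat_score = [v for row in scores for v in row]
    return pairwise_auc(flat_true, flat_score)


def subset_accuracy(y_bin, y_pred):
    """Share of rows whose whole indicator vector is predicted exactly."""
    return sum(t == p for t, p in zip(y_bin, y_pred)) / len(y_bin)


classes = ["a", "b", "c"]
y_bin = binarize(["a", "b", "c", "a"], classes)
scores = [
    [0.9, 0.2, 0.1],
    [0.3, 0.6, 0.4],
    [0.2, 0.45, 0.48],
    [0.4, 0.8, 0.1],
]
# Hypothetical per-class decisions: threshold every score independently
y_pred = [[1 if s > 0.5 else 0 for s in row] for row in scores]

print(micro_average_auc(y_bin, scores))  # 0.859375
print(subset_accuracy(y_bin, y_pred))    # 0.5
```

In the third row the correct class has the highest score, but no column clears the threshold, so the row counts as wrong for accuracy while still contributing well-ranked pairs to the AUC. This is one way a run can report a high micro-average AUC alongside a much lower accuracy.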

In [31]:
def model_build(filenum,Target_column, df_train, df_test, Directory,drop_cols=None,categ_transform=None,split_ids=range(1,8)):
    """Run the MLP on every numbered train/test split of one data set and print summary statistics."""
    accuracy_mlp = []; roc_auc_mlp = []

    elapsed_time_mlp = []
    Target_col = Target_column
    # The number of split files differs per data set; pass split_ids to match
    for i in split_ids:
        trainF = df_train+ str(i) + '.csv'
        testF = df_test + str(i) + '.csv'
        print("Executing this iteration:",i)
        train, test = read_file(trainF,testF,Directory, Target_col,transform=True,drop_cols=drop_cols,categ_transform=categ_transform)
        accuracy_mlp, roc_auc_mlp, elapsed = mlp_classifier(train, test, accuracy_mlp, roc_auc_mlp, Target_col)
        elapsed_time_mlp.append(elapsed)
    
    # Per-split values first, then mean and standard deviation across splits
    print('Data set# ' + str(filenum))
    print('********** MLP classifier ***********')
    print('Individual file accuracy for MLP')
    print(np.array(accuracy_mlp))
    print('Individual time taken for MLP')
    print(np.array(elapsed_time_mlp))
    print('Accuracy mean   ' + 'Accuracy Stdev  ')
    print(np.array(accuracy_mlp).mean(), np.array(accuracy_mlp).std())
    print('Individual file AUC for MLP')
    print(np.array(roc_auc_mlp))
    print('AUC mean        ' + 'AUC      Stdev  ')
    print(np.array(roc_auc_mlp).mean(), np.array(roc_auc_mlp).std())
    print()
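
`model_build` summarizes each run with NumPy's `mean()` and `std()`. By default `std()` uses the population formula (divide by n), which gives a smaller value than the sample formula (divide by n - 1) when there are only a handful of splits. The sketch below shows the difference using the standard library's `statistics` module as a stand-in for NumPy, with the five accuracies printed for data set 12 further down in this notebook.

```python
import statistics

# Accuracies of the first five splits of data set 12, as printed in the notebook
accuracies = [0.91666667, 0.73333333, 0.6, 0.75, 0.76923077]

mean = statistics.mean(accuracies)
pop_std = statistics.pstdev(accuracies)    # divides by n, like np.std with ddof=0
sample_std = statistics.stdev(accuracies)  # divides by n - 1, like np.std with ddof=1

print(round(mean, 6), round(pop_std, 6))
print(round(sample_std, 6))
```

The population value agrees with the standard deviation reported in the notebook output for that run. If the splits are treated as a sample of possible splits, passing `ddof=1` to `std()` would be the more conservative choice.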

In [5]:
model_build(filenum=2,Target_column='letter', df_train='data2_train', df_test='data2_test', Directory = "./Data Set 2/splits/")

Executing this iteration: 1
Executing this iteration: 2
Executing this iteration: 3
Executing this iteration: 4
Executing this iteration: 5
Executing this iteration: 6
Executing this iteration: 7
Executing this iteration: 8
Executing this iteration: 9
Executing this iteration: 10
Data set# 2
********** MLP classifier ***********
Individual file accuracy for MLP
[ 0.83754075  0.82665663  0.82856494  0.82470454  0.83335862  0.82774252
  0.80306711  0.83522813  0.8293291   0.827693  ]
Individual time taken for MLP
[ 2.82671993  2.27776472  2.4505404   2.14884537  2.6584539   2.88064055
  2.68145415  2.06879917  2.04752805  2.05741541]
Accuracy mean   Accuracy Stdev  
0.827388534213 0.00897110636985
Individual file AUC for MLP
[ 0.99625345  0.99630362  0.99614883  0.99586206  0.99570373  0.996129
  0.97709168  0.99648595  0.99577334  0.99625746]
AUC mean        AUC      Stdev  
0.994200911926 0.00570800355794



In [32]:
model_build(filenum=4,Target_column='Activity', df_train='data4_train', df_test='data4_test', Directory = "./Data Set 4/splits/",drop_cols=['Tag_Identificator'],categ_transform=['Sequence_Name'])

Executing this iteration: 1
Executing this iteration: 2
Executing this iteration: 3
Executing this iteration: 4
Executing this iteration: 5
Executing this iteration: 6
Executing this iteration: 7
Data set# 4
********** MLP classifier ***********
Individual file accuracy for MLP
[ 0.25345995  0.24105211  0.25841735  0.24682812  0.27353028  0.23759034
  0.26203607]
Individual time taken for MLP
[ 5.73285532  5.32064402  5.58712958  5.39451941  5.82911932  5.49675501
  5.36997353]
Accuracy mean   Accuracy Stdev  
0.253273460146 0.0116525917423
Individual file AUC for MLP
[ 0.8870963   0.87561797  0.88876083  0.88732455  0.88680122  0.88525504
  0.88430609]
AUC mean        AUC      Stdev  
0.885023142963 0.00406643993565



In [10]:
model_build(filenum=6,Target_column='Class', df_train='d6_train', df_test='d6_test', Directory  = "./Data Set 6/splits/",drop_cols=None,categ_transform=None)

Executing this iteration: 1
Executing this iteration: 2
Executing this iteration: 3
Executing this iteration: 4
Executing this iteration: 5
Data set# 6
********** MLP classifier ***********
Individual file accuracy for MLP
[ 0.99017399  0.9943608   0.92260634  0.98818466  0.98877698]
Individual time taken for MLP
[ 1.02062438  1.03465317  0.63650744  0.74724698  0.66194638]
Accuracy mean   Accuracy Stdev  
0.9768205546 0.0271928461803
Individual file AUC for MLP
[ 0.99676289  0.99875995  0.94548276  0.99751025  0.99599856]
AUC mean        AUC      Stdev  
0.986902880238 0.0207300922048



In [15]:
model_build(filenum=7,Target_column='Class', df_train='d7_train', df_test='d7_test', Directory = "./Data Set 7/splits/",drop_cols=None,categ_transform=None)

Executing this iteration: 1
Executing this iteration: 2
Executing this iteration: 3
Executing this iteration: 4
Executing this iteration: 5
Executing this iteration: 6
Executing this iteration: 7
Executing this iteration: 8
Executing this iteration: 9
Executing this iteration: 10
Data set# 7
********** MLP classifier ***********
Individual file accuracy for MLP
[ 0.31060606  0.3822037   0.63907709  0.5736013   0.7750953   0.62025783
  0.60450966  0.5343164   0.78311499  0.19108625]
Individual time taken for MLP
[ 0.32442025  0.28934027  0.43517719  0.3932628   0.54281516  0.44787463
  0.43542585  0.40132342  0.54332185  0.22738988]
Accuracy mean   Accuracy Stdev  
0.541386858827 0.183084122515
Individual file AUC for MLP
[ 0.57592062  0.58559139  0.74896941  0.73719188  0.81201548  0.78461015
  0.70040851  0.68551372  0.87041347  0.52903197]
AUC mean        AUC      Stdev  
0.702966660488 0.105054954693



In [20]:
model_build(filenum=12,Target_column='TARGET', df_train='d12_original_train', df_test='d12_original_test', Directory = "./Original Dataset_12_Brain/",drop_cols=None,categ_transform=None)

Executing this iteration: 1
Executing this iteration: 2
Executing this iteration: 3
Executing this iteration: 4
Executing this iteration: 5
Data set# 12
********** MLP classifier ***********
Individual file accuracy for MLP
[ 0.91666667  0.73333333  0.6         0.75        0.76923077]
Individual time taken for MLP
[ 0.21586285  0.22866493  0.1980077   0.21280687  0.25355355]
Accuracy mean   Accuracy Stdev  
0.753846153846 0.100847819554
Individual file AUC for MLP
[ 1.          0.98777778  0.97444444  0.99316406  0.99260355]
AUC mean        AUC      Stdev  
0.989597967004 0.00851933322793



In [25]:
model_build(filenum=12,Target_column='TARGET', df_train='d12_original_train', df_test='d12_original_test', Directory = "./Original Dataset_12_Brain/",drop_cols=None,categ_transform=None)

Executing this iteration: 7
Executing this iteration: 8
Executing this iteration: 9
Executing this iteration: 10
Data set# 12
********** MLP classifier ***********
Individual file accuracy for MLP
[ 0.92857143  0.83333333  0.64705882  0.45      ]
Individual time taken for MLP
[ 0.15394472  0.16005637  0.14795043  0.13629909]
Accuracy mean   Accuracy Stdev  
0.714740896359 0.183341394536
Individual file AUC for MLP
[ 0.99617347  0.99479167  0.9567474   0.936875  ]
AUC mean        AUC      Stdev  
0.971146885225 0.0253343282741



In [23]:
model_build(filenum=14,Target_column='C2309', df_train='srbct_train', df_test='srbct_test', Directory = "./data14_srbct/",drop_cols=None,categ_transform=None)

Executing this iteration: 1
Executing this iteration: 2
Executing this iteration: 3
Executing this iteration: 4
Data set# 14
********** MLP classifier ***********
Individual file accuracy for MLP
[ 1.          0.94736842  0.95652174  0.89473684]
Individual time taken for MLP
[ 0.06248331  0.04951722  0.06196375  0.06283128]
Accuracy mean   Accuracy Stdev  
0.949656750572 0.0374266069423
Individual file AUC for MLP
[ 1.          0.99722992  1.          0.99630656]
AUC mean        AUC      Stdev  
0.99838411819 0.0016485291848



In [27]:
model_build(filenum=14,Target_column='C2309', df_train='srbct_train', df_test='srbct_test', Directory = "./data14_srbct/",drop_cols=None,categ_transform=None)

Executing this iteration: 6
Executing this iteration: 7
Executing this iteration: 8
Executing this iteration: 9
Executing this iteration: 10
Data set# 14
********** MLP classifier ***********
Individual file accuracy for MLP
[ 0.91304348  0.96551724  1.          0.95652174  0.89473684]
Individual time taken for MLP
[ 0.05806321  0.04375086  0.04649647  0.04610418  0.04793023]
Accuracy mean   Accuracy Stdev  
0.945963860175 0.0377403414331
Individual file AUC for MLP
[ 1.          0.99762188  1.          1.          0.99261311]
AUC mean        AUC      Stdev  
0.998046998088 0.00286881497741



In [18]:
model_build(filenum=15,Target_column='TARGET', df_train='d15_original_train', df_test='d15_original_test', Directory = "./Original Dataset_15_Lymphoma/",drop_cols=None,categ_transform=None)

Executing this iteration: 1
Executing this iteration: 2
Executing this iteration: 3
Executing this iteration: 4
Executing this iteration: 5
Executing this iteration: 6
Executing this iteration: 7
Executing this iteration: 8
Executing this iteration: 9
Executing this iteration: 10
Data set# 15
********** MLP classifier ***********
Individual file accuracy for MLP
[ 1.          1.          1.          0.95238095  0.91304348  0.94444444
  0.95454545  1.          1.          0.94444444]
Individual time taken for MLP
[ 0.06837549  0.07203662  0.08798808  0.06351131  0.05393646  0.05845205
  0.0615529   0.06785157  0.07002609  0.06270093]
Accuracy mean   Accuracy Stdev  
0.970885877408 0.0309703797757
Individual file AUC for MLP
[ 1.          1.          1.          0.99773243  0.99905482  1.
  0.99896694  1.          1.          1.        ]
AUC mean        AUC      Stdev  
0.999575418887 0.000727995344852

