# About 

This notebook demonstrates the neural network classifiers provided by the __Reproducible Experiment Platform (REP)__ package. <br /> REP wraps the following neural network libraries:
* __theanets__
* __neurolab__ 
* __pybrain__ 


### In this notebook we show the simplest way to do the same things as with the other base wrappers:
* train classifier
* build predictions 
* measure quality
* combine with metaclassifiers

# Loading data

In [1]:
import numpy, pandas
from rep.utils import train_test_split
from sklearn.metrics import roc_auc_score

sig_data = pandas.read_csv('toy_datasets/toyMC_sig_mass.csv', sep='\t')
bck_data = pandas.read_csv('toy_datasets/toyMC_bck_mass.csv', sep='\t')

labels = numpy.array([1] * len(sig_data) + [0] * len(bck_data))
data = pandas.concat([sig_data, bck_data])

### First rows of our data

In [2]:
data[:5]

| | CDF1 | CDF2 | CDF3 | DOCAone | DOCAthree | DOCAtwo | FlightDistance | FlightDistanceError | Hlt1Dec | Hlt2Dec | ... | p1_IP | p1_IPSig | p1_Laura_IsoBDT | p1_pt | p2_IP | p2_IPSig | p2_Laura_IsoBDT | p2_pt | peakingbkg | pt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 1.0 | 1.0 | 0.111337 | 0.012695 | 0.123426 | 162.650955 | 0.870942 | 0 | 0 | ... | 11.314665 | 83.196968 | -0.223668 | 699.066467 | 9.799975 | 64.790207 | -0.121159 | 521.628174 | | 220.742111 |
| 1 | 0.759755 | 0.597375 | 0.389256 | 0.021781 | 0.094551 | 0.088421 | 4.193265 | 1.26228 | 0 | 0 | ... | 0.72007 | 7.237868 | -0.256142 | 587.628935 | 0.882111 | 8.834325 | -0.20322 | 532.67995 | | 661.208843 |
| 2 | 1.0 | 0.796142 | 0.566286 | 0.011852 | 0.0044 | 0.009153 | 1.58061 | 0.261697 | 0 | 0 | ... | 0.362181 | 4.173097 | -0.252788 | 802.746495 | 0.42729 | 5.008959 | -0.409469 | 674.122342 | | 1290.963982 |
| 3 | 0.716397 | 0.524712 | 0.279033 | 0.015171 | 0.0839 | 0.069127 | 7.884569 | 1.310151 | 0 | 0 | ... | 0.753449 | 6.615949 | -0.25355 | 564.203857 | 0.917409 | 8.695459 | -0.192284 | 537.791687 | | 692.654175 |
| 4 | 1.0 | 0.996479 | 0.888159 | 0.005547 | 0.070438 | 0.064689 | -2.267649 | 0.139555 | 0 | 0 | ... | 0.589455 | 21.869143 | -0.254778 | 746.624928 | 0.388996 | 8.465344 | -0.217319 | 988.539221 | | 1328.83784 |


### Splitting into train and test

In [3]:
# Get train and test data
train_data, test_data, train_labels, test_labels = train_test_split(data, labels, train_size=0.5)
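
To make the split concrete, here is a minimal pure-Python sketch. It assumes that `train_test_split` behaves like a shuffled split that keeps rows and labels aligned; this is an assumption for illustration, not a description of REP's internals. The helper `shuffled_split` and the toy data are hypothetical.

```python
import random


def shuffled_split(rows, labels, train_size=0.5, seed=0):
    """Hypothetical helper: shuffle indices once, then cut them into two parts."""
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_train = int(len(rows) * train_size)
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    # The same indices select both rows and labels, so pairs stay aligned.
    return ([rows[i] for i in train_idx], [rows[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])


# Toy data: three "signal" rows followed by three "background" rows.
demo_rows = ["a", "b", "c", "d", "e", "f"]
demo_labels = [1, 1, 1, 0, 0, 0]
tr_rows, te_rows, tr_labels, te_labels = shuffled_split(demo_rows, demo_labels)
```

Shuffling matters here because `data` was built by concatenating all signal rows before all background rows; cutting it in the middle without shuffling would put one class in each half.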

# Neural nets

All nets inherit from __sklearn.base.BaseEstimator__ and have the same interface as the other wrappers in REP (see **01-howto-Classifiers** for details).

All of these libraries **support**:

* classification
* multi-classification
* regression
* multi-target regression

and **don't support**:

* staged prediction methods
* sample weights

# Variables used in training

In [4]:
variables = ["FlightDistance", "FlightDistanceError", "IP", "VertexChi2", "pt", "p0_pt", "p1_pt", "p2_pt", 'LifeTime','dira']

## Theanets

In [5]:
from rep.estimators import TheanetsClassifier
print TheanetsClassifier.__doc__

Classifier from Theanets library. 

    Parameters:
    -----------
    :param features: list of features to train model
    :type features: None or list(str)
    :param layers: a sequence of values specifying the **hidden** layer configuration for the network.
        For more information please see 'Specifying layers' in theanets documentation:
        http://theanets.readthedocs.org/en/latest/creating.html#creating-specifying-layers
        Note that theanets "layers" parameter included input and output layers in the sequence as well.
    :type layers: sequence of int, tuple, dict
    :param int input_layer: size of the input layer. If equals -1, the size is taken from the training dataset
    :param int output_layer: size of the output layer. If equals -1, the size is taken from the training dataset
    :param str hidden_activation: the name of an activation function to use on hidden network layers by default
    :param str output_activation: the name of an activation function to u

### Theanets pretraining (unsupervised)

In [7]:
tn = TheanetsClassifier(features=variables, layers=[20], 
                        trainers=[{'optimize': 'pretrain'}])
# Unsupervised pretraining ignores the label values: pass any vector that contains every class, since only numpy.unique(y) is used
tn.fit(train_data, [1] + [0] * (len(train_data) - 1))
print('training complete')

training complete


### Predicting probabilities, measuring the quality

In [7]:
# predict probabilities for each class
prob = tn.predict_proba(test_data)
print prob

[[ 0.30260596  0.69739404]
 [ 0.30260596  0.69739404]
 [ 0.14554053  0.85445947]
 ..., 
 [ 0.18248445  0.81751555]
 [ 0.30260596  0.69739404]
 [ 0.17538447  0.82461553]]


In [8]:
print 'ROC AUC', roc_auc_score(test_labels, prob[:, 1])

ROC AUC 0.574939901116
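
For intuition about the metric: ROC AUC equals the probability that a randomly chosen signal event receives a higher score than a randomly chosen background event. The sketch below computes it by brute force over all pairs. It is an illustration only, not how `roc_auc_score` is implemented, and `pairwise_roc_auc` is a hypothetical helper.

```python
def pairwise_roc_auc(y_true, scores):
    """ROC AUC as the fraction of (signal, background) pairs ranked correctly.

    Ties count as half a correct pair. The cost is quadratic in the number
    of events, so this is for illustration only.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Three of the four (signal, background) pairs are ordered correctly.
auc_demo = pairwise_roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 0.5 corresponds to random ranking, so a score of about 0.57 after unsupervised pretraining alone is only slightly better than chance.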


### Additional fitting using labels for the same data

In [10]:
tn.partial_fit(train_data, train_labels, optimize='nag', learning_rate=0.3)

TheanetsClassifier(decode_from=1,
          features=['FlightDistance', 'FlightDistanceError', 'IP', 'VertexChi2', 'pt', 'p0_pt', 'p1_pt', 'p2_pt', 'LifeTime', 'dira'],
          hidden_activation='logistic', hidden_dropouts=0, hidden_noise=0,
          input_dropouts=0, input_layer=-1, input_noise=0, layers=[20],
          output_activation='linear', output_layer=-1, random_state=42,
          scaler=StandardScaler(copy=True, with_mean=True, with_std=True),
          trainers=[{'optimize': 'pretrain'}, {'learning_rate': 0.3, 'optimize': 'nag'}])

In [11]:
# predict probabilities for each class
prob = tn.predict_proba(test_data)
print prob

[[ 0.98404217  0.01595783]
 [ 0.98306235  0.01693765]
 [ 0.9243144   0.0756856 ]
 ..., 
 [ 0.99774784  0.00225216]
 [ 0.98306233  0.01693767]
 [ 0.98540877  0.01459123]]


In [12]:
print 'ROC AUC', roc_auc_score(test_labels, prob[:, 1])

ROC AUC 0.905527908521


### Predictions of classes

In [13]:
tn.predict(test_data)

array([0, 0, 0, ..., 0, 0, 0])
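
The all-zero labels are consistent with the probabilities printed earlier, where the first column dominates. Assuming `predict` returns the class with the highest predicted probability (common for sklearn-style classifiers, but not checked here against REP's source), the relation can be sketched as follows. `labels_from_proba` is a hypothetical helper.

```python
def labels_from_proba(proba, classes=(0, 1)):
    """Pick, for every row, the class whose probability is largest."""
    return [classes[max(range(len(row)), key=row.__getitem__)] for row in proba]


# Rows copied from the predict_proba output printed after partial_fit.
proba_demo = [[0.98404217, 0.01595783],
              [0.98306235, 0.01693765],
              [0.9243144, 0.0756856]]
predicted_demo = labels_from_proba(proba_demo)
```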

## Neurolab

In [14]:
from rep.estimators import NeurolabClassifier
print NeurolabClassifier.__doc__

Classifier from neurolab library. 

    Parameters:
    -----------
    :param tuple[int] layers: sequence of units numbers inside each **hidden** layer.
    :param string net_type: type of network
        One of 'feed-forward', 'single-layer', 'competing-layer', 'learning-vector',
        'elman-recurrent', 'hopfield-recurrent', 'hemming-recurrent'
    :param features: features used in training
    :type features: list[str] or None
    :param initf: layer initializers
    :type initf: anything implementing call(layer), e.g. nl.init.* or list[nl.init.*] of shape [n_layers]
    :param trainf: net train function, default value depends on type of network
    :param scaler: transformer to apply to the input objects
    :type scaler: str or sklearn-like transformer or False (do not scale features)
    :param list[int] layers: list of numbers denoting size of each hidden layer
    :param random_state: ignored, added for uniformity.
    :param dict kwargs: additional arguments to net __init__

In [15]:
import neurolab 
nl = NeurolabClassifier(features=variables, layers=[10], epochs=20, trainf=neurolab.train.train_rprop)
nl.fit(train_data, train_labels)
print('training complete')

The maximum number of train epochs is reached
training complete


### Predict probabilities and estimate quality

In [16]:
# predict probabilities for each class
prob = nl.predict_proba(test_data)
print prob

[[ 0.21881075  0.78118925]
 [ 0.18726815  0.81273185]
 [ 0.20386355  0.79613645]
 ..., 
 [ 0.21367134  0.78632866]
 [ 0.18782973  0.81217027]
 [ 0.20105149  0.79894851]]


In [17]:
print 'ROC AUC', roc_auc_score(test_labels, prob[:, 1])

ROC AUC 0.82096289122


In [18]:
# predict labels
nl.predict(test_data)

array([1, 1, 1, ..., 1, 1, 1])

## Pybrain

In [19]:
from rep.estimators import PyBrainClassifier
print PyBrainClassifier.__doc__

Implements classification from PyBrain library 

    Parameters:
    -----------
    :param features: features used in training.
    :type features: list[str] or None
    :param scaler: transformer to apply to the input objects
    :type scaler: str or sklearn-like transformer or False (do not scale features)
    :param bool use_rprop: flag to indicate whether we should use Rprop or SGD trainer
    :param bool verbose: print train/validation errors.
    :param random_state: ignored parameter, pybrain training isn't reproducible

    **Net parameters:**

    :param list[int] layers: indicate how many neurons in each hidden(!) layer; default is 1 hidden layer with 10 neurons
    :param list[str] hiddenclass: classes of the hidden layers; default is 'SigmoidLayer'
    :param dict params: other net parameters:
        bias and outputbias (boolean) flags to indicate whether the network should have the corresponding biases,
        both default to True;
        peepholes (boolean);
        r

In [20]:
pb = PyBrainClassifier(features=variables, epochs=2, layers=[10, 2], hiddenclass=['LinearLayer', 'MDLSTMLayer'])
pb.fit(train_data, train_labels)
print('training complete')

training complete


### Predict probabilities and estimate quality

In [21]:
prob = pb.predict_proba(test_data)
print 'ROC AUC:', roc_auc_score(test_labels, prob[:, 1])

ROC AUC: 0.900121016332


### Predict labels

In [22]:
pb.predict(test_data)

array([1, 1, 0, ..., 0, 1, 0])

# Advantages of common interface

Let's show that you can now run bagging over neural nets! <br />
_(isn't bagging over neural networks what you have been dreaming of all your life?)_

## Bagging over Theanets classifier

In [23]:
from sklearn.ensemble import BaggingClassifier

base_tn = TheanetsClassifier(trainers=[{'min_improvement': 0.1}])
bagging_tn = BaggingClassifier(base_estimator=base_tn, n_estimators=3)
bagging_tn.fit(train_data[variables], train_labels)
print('training complete')

training complete


In [24]:
prob = bagging_tn.predict_proba(test_data[variables])
print 'AUC', roc_auc_score(test_labels, prob[:, 1])

AUC 0.901499256418
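
For readers unfamiliar with bagging, here is a minimal pure-Python sketch of the idea: fit each copy of a base model on a bootstrap sample (drawn with replacement) and average the predicted probabilities. `ThresholdModel` and `bagged_proba` are hypothetical stand-ins, not REP or scikit-learn code, and `BaggingClassifier` offers more options than shown here.

```python
import random


class ThresholdModel(object):
    """Toy one-feature base model: signal if x exceeds the midpoint of class means."""

    def fit(self, xs, ys):
        signal = [x for x, y in zip(xs, ys) if y == 1]
        background = [x for x, y in zip(xs, ys) if y == 0]
        # Fall back to the overall mean if a bootstrap sample misses a class.
        overall = sum(xs) / float(len(xs))
        mean_sig = sum(signal) / float(len(signal)) if signal else overall
        mean_bck = sum(background) / float(len(background)) if background else overall
        self.threshold = (mean_sig + mean_bck) / 2.0
        return self

    def predict_proba(self, xs):
        # Hard 0/1 "probabilities" of the signal class keep the toy model short.
        return [1.0 if x > self.threshold else 0.0 for x in xs]


def bagged_proba(xs, ys, new_xs, n_estimators=3, seed=0):
    """Average the signal probability over models fit on bootstrap samples."""
    rng = random.Random(seed)
    totals = [0.0] * len(new_xs)
    for _ in range(n_estimators):
        # Bootstrap: draw len(xs) indices with replacement.
        idx = [rng.randrange(len(xs)) for _ in xs]
        model = ThresholdModel().fit([xs[i] for i in idx], [ys[i] for i in idx])
        for k, p in enumerate(model.predict_proba(new_xs)):
            totals[k] += p
    return [t / n_estimators for t in totals]


toy_x = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4]
toy_y = [0, 0, 0, 0, 1, 1, 1, 1]
averaged = bagged_proba(toy_x, toy_y, new_xs=[0.0, 3.0])
```

Bagging needs only `fit` and `predict_proba` from the base model, which is why it works with wrappers that do not support sample weights.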


## Bagging over Neurolab classifier

In [25]:
base_nl = NeurolabClassifier(layers=[10], epochs=20, trainf=neurolab.train.train_rprop)
bagging_nl = BaggingClassifier(base_estimator=base_nl, n_estimators=3)
bagging_nl.fit(train_data[variables], train_labels)
print('training complete')

The maximum number of train epochs is reached
The maximum number of train epochs is reached
The maximum number of train epochs is reached
training complete


In [26]:
# predict probabilities for each class
prob = bagging_nl.predict_proba(test_data[variables])
print 'AUC', roc_auc_score(test_labels, prob[:, 1])

AUC 0.807962872474


## Bagging over Pybrain classifier

In [27]:
base_pb = PyBrainClassifier(epochs=2, layers=[10, 2], hiddenclass=['LinearLayer', 'MDLSTMLayer'])
bagging_pb = BaggingClassifier(base_estimator=base_pb, n_estimators=3)
bagging_pb.fit(train_data[variables], train_labels)
print('training complete')

training complete


In [28]:
# predict probabilities for each class
prob = bagging_pb.predict_proba(test_data[variables])
print 'AUC', roc_auc_score(test_labels, prob[:, 1])

AUC 0.903090297511


# Other advantages of common interface
There are many things you can now do with these classifiers: 
* clone them
* get / set parameters as dictionaries 
* run automatic hyperparameter optimization 
* build pipelines (`sklearn.pipeline`)
* use hierarchical training and training on subsets
* send them over the network / train classifiers on other machines

And you can replace classifiers at any moment.
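
A short sketch of a few of these operations, using scikit-learn's `LogisticRegression` purely as a stand-in for a REP wrapper. The assumption is that any estimator following the scikit-learn conventions responds to the same calls; the parameter name `C` belongs to the stand-in, not to the neural net wrappers.

```python
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# LogisticRegression is only a stand-in for any sklearn-compatible estimator.
original = LogisticRegression(C=1.0)

# Parameters come back as a plain dictionary.
params_before = original.get_params()

# Parameters can be changed by name; set_params returns the estimator itself.
original.set_params(C=0.1)

# clone() builds a new, unfitted estimator with the same parameters.
copy = clone(original)

# Estimators plug into pipelines; nested parameters use <step>__<param> names.
pipeline = make_pipeline(StandardScaler(), copy)
pipeline.set_params(logisticregression__C=10.0)
```

Replacing the stand-in with another classifier should leave the rest of this code unchanged, apart from the parameter names.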

# Exercises

### Exercise 1. Play with the parameters of each type of classifier

### Exercise 2. Play with `partial_fit` for different models