# Skorch introduction

*`skorch`* is designed to maximize interoperability between `sklearn` and `pytorch`. The aim is to keep 99% of the flexibility of `pytorch` while being able to leverage most features of `sklearn`. Below, we show the basic usage of `skorch` and how it can be combined with `sklearn`.


In [18]:
# from skorch documentation

In [1]:
! [ ! -z "$COLAB_GPU" ] && pip install torch skorch

In [5]:
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0);

## Training a classifier and making predictions

### A toy binary classification task

We load a toy classification task from `sklearn`.

In [1]:
import numpy as np
from sklearn.datasets import make_classification

In [2]:
X, y = make_classification(1000, 20, n_informative=10, random_state=0)
X = X.astype(np.float32)

In [3]:
X.shape, y.shape, y.mean()

((1000, 20), (1000,), 0.5)

### Definition of the `pytorch` classification `module`

We define a vanilla neural network with two hidden layers. The output layer should have 2 output units since there are two classes. In addition, it should have a softmax nonlinearity, because later, when calling `predict_proba`, the output from the `forward` call will be used.

In [6]:
class ClassifierModule(nn.Module):
    def __init__(
            self,
            num_units=10,
            nonlin=F.relu,
            dropout=0.5,
    ):
        super().__init__()
        self.num_units = num_units

        # First hidden layer: 20 input features -> num_units
        self.dense0 = nn.Linear(20, num_units)
        self.nonlin = nonlin
        # Wrap the dropout rate in a layer so it is only active during training
        self.dropout = nn.Dropout(dropout)
        self.dense1 = nn.Linear(num_units, 10)
        self.output = nn.Linear(10, 2)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = self.dropout(X)
        X = F.relu(self.dense1(X))
        X = F.softmax(self.output(X), dim=-1)
        return X
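
As a quick sanity check on why the softmax matters, the sketch below applies `F.softmax` to two made-up rows of raw scores (the values are purely illustrative, not taken from the model) and confirms that each row then behaves like a probability distribution over the two classes.

```python
import torch
import torch.nn.functional as F

# Two made-up rows of raw scores (logits) for a two-class problem.
logits = torch.tensor([[2.0, 0.5], [-1.0, 1.0]])

# Softmax over the last dimension turns each row into class probabilities.
probs = F.softmax(logits, dim=-1)

# Every entry is non-negative and every row sums to 1.
row_sums = probs.sum(dim=-1)
print(probs)
print(row_sums)
```

Without the softmax, `forward` would return unnormalized scores, and `predict_proba` would pass those through unchanged.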

### Defining and training the neural net classifier

We use `NeuralNetClassifier` because we're dealing with a classification task. The first argument should be the `pytorch module`. As additional arguments, we pass the number of epochs and the learning rate (`lr`), but those are optional.

*Note*: To use the CUDA backend, pass `device='cuda'` as an additional argument.

In [7]:
from skorch import NeuralNetClassifier

In [8]:
net = NeuralNetClassifier(
    ClassifierModule,
    max_epochs=20,
    lr=0.1,
#     device='cuda',  # uncomment this to train with CUDA
)
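
Instead of commenting the `device` line in and out by hand, one option is to pick the device at runtime. This is only a small sketch; the variable name `device` is our own choice, and the resulting string would be passed on as the `device` argument.

```python
import torch

# Use the GPU only if PyTorch can actually see one; otherwise stay on the CPU.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(device)

# The string can then be passed along, e.g. NeuralNetClassifier(..., device=device).
```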

As in `sklearn`, we call `fit`, passing the input data `X` and the targets `y`. By default, `NeuralNetClassifier` holds out 20% of the data with a stratified split (`StratifiedKFold`) to track the validation loss. For each epoch, the output shows the train loss, the validation loss, and the accuracy on the validation set.

In [9]:
net.fit(X, y)

  epoch    train_loss    valid_acc    valid_loss     dur
-------  ------------  -----------  ------------  ------
      1        [36m0.6905[0m       [32m0.6150[0m        [35m0.6749[0m  0.0921
      2        [36m0.6648[0m       [32m0.6450[0m        [35m0.6633[0m  0.0351
      3        [36m0.6619[0m       [32m0.6750[0m        [35m0.6533[0m  0.0164
      4        [36m0.6429[0m       [32m0.6800[0m        [35m0.6399[0m  0.0177
      5        [36m0.6307[0m       [32m0.6950[0m        [35m0.6254[0m  0.0174
      6        [36m0.6291[0m       [32m0.7000[0m        [35m0.6134[0m  0.0165
      7        [36m0.6102[0m       [32m0.7100[0m        [35m0.6033[0m  0.0168
      8        [36m0.6050[0m       0.7000        [35m0.5931[0m  0.0168
      9        [36m0.5966[0m       0.7000        [35m0.5844[0m  0.0212
     10        [36m0.5636[0m       0.7100        [35m0.5689[0m  0.0239
     11        0.5757       [32m0.7200[0m        [35m0.5628[0m  0.019

<class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=ClassifierModule(
    (dense0): Linear(in_features=20, out_features=10, bias=True)
    (dropout): Dropout(p=0.5)
    (dense1): Linear(in_features=10, out_features=10, bias=True)
    (output): Linear(in_features=10, out_features=2, bias=True)
  ),
)
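
To get a feel for the 80/20 validation split used during `fit`, the sketch below reproduces a comparable split with `sklearn` alone. It assumes the split amounts to taking the first of five stratified folds without shuffling; the exact rows that `skorch` holds out internally may differ.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(1000, 20, n_informative=10, random_state=0)
X = X.astype(np.float32)

# Five stratified folds; keeping only the first split holds out one fifth (20%).
splitter = StratifiedKFold(n_splits=5)
train_idx, valid_idx = next(iter(splitter.split(X, y)))

print(len(train_idx), len(valid_idx))  # 800 200
# Stratification keeps the class balance nearly identical in both parts.
print(y[train_idx].mean(), y[valid_idx].mean())
```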

Also, as in `sklearn`, you may call `predict` or `predict_proba` on the fitted model.

### Making predictions, classification

In [10]:
y_pred = net.predict(X[:5])
y_pred

array([0, 0, 0, 0, 0])

In [11]:
y_proba = net.predict_proba(X[:5])
y_proba

array([[ 0.53494632,  0.46505362],
       [ 0.86850929,  0.13149074],
       [ 0.68600392,  0.31399611],
       [ 0.91260117,  0.08739878],
       [ 0.69675469,  0.30324531]], dtype=float32)
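
The two outputs are consistent with each other: the predicted label is the class with the highest probability. The sketch below checks this on the probabilities copied from the output above, using `numpy` only.

```python
import numpy as np

# Probabilities copied from the predict_proba output above.
y_proba = np.array([
    [0.53494632, 0.46505362],
    [0.86850929, 0.13149074],
    [0.68600392, 0.31399611],
    [0.91260117, 0.08739878],
    [0.69675469, 0.30324531],
], dtype=np.float32)

# Picking the most probable class per row recovers the predicted labels.
labels = y_proba.argmax(axis=1)
print(labels)  # [0 0 0 0 0]
```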

## Usage with sklearn `GridSearchCV`

### Special prefixes

The `NeuralNet` class lets you access parameters of the `pytorch module` directly by using the `module__` prefix. For example, if you defined the `module` to have a `num_units` parameter, you can set it via the `module__num_units` argument. This is exactly the same logic that gives access to estimator parameters in `sklearn` `Pipeline`s and `FeatureUnion`s.
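
For comparison, here is the double-underscore convention in a plain `sklearn` `Pipeline`. The step names `scale` and `clf` and the chosen estimators are arbitrary, picked only for this illustration.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([
    ('scale', StandardScaler()),
    ('clf', LogisticRegression()),
])

# '<step name>__<parameter>' reaches into the nested estimator.
pipe.set_params(clf__C=0.5)
print(pipe.named_steps['clf'].C)  # 0.5
```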

This feature is useful in several ways. For one, it lets you set those parameters in the model definition. Furthermore, it lets you set parameters in an `sklearn GridSearchCV`, as shown below.

In addition to the parameters prefixed by `module__`, you may access a couple of other attributes, such as those of the optimizer by using the `optimizer__` prefix (again, see below). All those special prefixes are stored in the `prefixes_` attribute:

In [36]:
print(', '.join(net.prefixes_))

module, iterator_train, iterator_valid, optimizer, criterion, callbacks, dataset


### Performing a grid search

Below we show how to perform a grid search over the learning rate (`lr`), the module's number of hidden units (`module__num_units`), the module's dropout rate (`module__dropout`), and whether the SGD optimizer should use Nesterov momentum or not (`optimizer__nesterov`).

In [12]:
from sklearn.model_selection import GridSearchCV

In [13]:
net = NeuralNetClassifier(
    ClassifierModule,
    max_epochs=20,
    lr=0.1,
    verbose=0,
    optimizer__momentum=0.9,
)

In [14]:
params = {
    'lr': [0.05, 0.1],
    'module__num_units': [10, 20],  # range for number of units
    'module__dropout': [0, 0.5],  # range for possible dropout rates
    'optimizer__nesterov': [False, True],
}
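
Before launching the search, it can help to count how many fits the grid implies. The sketch below uses `sklearn`'s `ParameterGrid` and assumes the value lists that appear in the `param_grid` of the search output; with other lists, the counts change accordingly.

```python
from sklearn.model_selection import ParameterGrid

# Value lists as reported in the param_grid of the search output.
params = {
    'lr': [0.05, 0.1],
    'module__num_units': [10, 20],
    'module__dropout': [0, 0.5],
    'optimizer__nesterov': [False, True],
}

# Every combination is one candidate; each candidate is fit once per fold.
n_candidates = len(ParameterGrid(params))
n_folds = 3
print(n_candidates, n_candidates * n_folds)  # 16 48
```

Since every fit trains a network for up to `max_epochs` epochs, this count gives a rough idea of how the search time grows as values are added to the grid.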

In [15]:
gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy', verbose=2)

In [16]:
gs.fit(X, y)

Fitting 3 folds for each of 16 candidates, totalling 48 fits
[CV] module__num_units=10, lr=0.05, optimizer__nesterov=False, module__dropout=0 
[CV]  module__num_units=10, lr=0.05, optimizer__nesterov=False, module__dropout=0, total=   0.4s
[CV] module__num_units=10, lr=0.05, optimizer__nesterov=False, module__dropout=0 
[CV]  module__num_units=10, lr=0.05, optimizer__nesterov=False, module__dropout=0, total=   0.4s
[CV] module__num_units=10, lr=0.05, optimizer__nesterov=False, module__dropout=0 
[CV]  module__num_units=10, lr=0.05, optimizer__nesterov=False, module__dropout=0, total=   0.3s
[CV] module__num_units=10, lr=0.05, optimizer__nesterov=True, module__dropout=0 
[CV]  module__num_units=10, lr=0.05, optimizer__nesterov=True, module__dropout=0, total=   0.3s
[CV] module__num_units=10, lr=0.05, optimizer__nesterov=True, module__dropout=0 
[CV]  module__num_units=10, lr=0.05, optimizer__nesterov=True, module__dropout=0, total=   0.3s
...

[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.4s remaining:    0.0s
[Parallel(n_jobs=1)]: Done  48 out of  48 | elapsed:   18.7s finished


GridSearchCV(cv=3, error_score='raise',
       estimator=<class 'skorch.classifier.NeuralNetClassifier'>[uninitialized](
  module=<class '__main__.ClassifierModule'>,
),
       fit_params=None, iid=True, n_jobs=1,
       param_grid={'module__num_units': [10, 20], 'lr': [0.05, 0.1], 'optimizer__nesterov': [False, True], 'module__dropout': [0, 0.5]},
       pre_dispatch='2*n_jobs', refit=False, return_train_score='warn',
       scoring='accuracy', verbose=2)

In [17]:
print(gs.best_score_, gs.best_params_)

0.855 {'module__num_units': 20, 'lr': 0.05, 'optimizer__nesterov': True, 'module__dropout': 0}
