# Interactive prototyping

This notebook demonstrates a typical setup for developing a new class.

Because the whole training pipeline is modular, any part of it can easily be swapped for a new one.

Because the pipeline is built from a config, we can define the new class, name it in the config, 
and run an experiment, all in a few lines of code. 

Once the class is working, it should be moved, without changes, to the appropriate Python module.

### Imports, loading a standard task

In [1]:
from typing import Dict
import json
from pathlib import Path
import ipdb
import pandas as pd
from torch import nn
import torch.nn.functional as F
from ulmfit_attention import scenarios
from ulmfit_attention.aggregations import Aggregation
from hyperspace_explorer import configurables

In [2]:
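tasks_dir = Path.cwd().resolve().parent / 'tasks'

def load_task_conf(name):
    """Load the task definition stored in `tasks/<name>.json`."""
    with (tasks_dir / f'{name}.json').open() as f:  # close the file once parsed
        return json.load(f)

task_conf = load_task_conf('imdb_1k_sample_single')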


### The prototype

The actual code we are developing: always a class inheriting from a `Configurable` parent.

In this case it is an 'Aggregation' module - a small part of the neural network, sitting 
within the classifier head. It is responsible for taking variable-length sequences 
and aggregating them to fixed-length representations.
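
To make "variable-length in, fixed-length out" concrete, here is a minimal standalone sketch of one such aggregation: a mean over the sequence axis that skips padded positions. It is not part of this project. The name `masked_mean` is made up for illustration, and the mask is assumed to be a boolean tensor that is `True` at padding positions.

```python
import torch


def masked_mean(x: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
    """Average over the sequence axis, ignoring padded positions.

    x:        (batch, seq_len, dim) float tensor
    pad_mask: (batch, seq_len) bool tensor, True at padding (an assumption)
    """
    keep = (~pad_mask).unsqueeze(-1).to(x.dtype)  # (batch, seq_len, 1)
    summed = (x * keep).sum(dim=1)                # (batch, dim)
    counts = keep.sum(dim=1).clamp(min=1.0)       # (batch, 1), avoids 0-division
    return summed / counts


# Two sequences padded to length 3; the first has only two real tokens.
x = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
                  [[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]])
pad_mask = torch.tensor([[False, False, True],
                         [False, False, False]])

pooled = masked_mean(x, pad_mask)  # one fixed-length vector per sequence
```

Whatever the sequence lengths, `pooled` has shape `(batch, dim)`, which is what the rest of the classifier head needs.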

In [3]:
class AveragedConvolutionAggregation(Aggregation):
    """Just a simple example of a module, replacing `BranchingAttentionAggregation`"""
    def __init__(self, dv: int, kernel_size: int, n_filters: int):
        """
        Prepare the module
        
        :param dv: input dimension - calculated by other modules -> NOT A PART OF THE CONFIG
        :param kernel_size: size of the 1d kernel
        :param n_filters: number of convolution filters; also the dimensionality of the output
        """
        super().__init__()
        self.n_filters = n_filters
        self.dv = dv
        # 'same'-style padding: preserves seq_len only when kernel_size is odd
        padding = kernel_size // 2
        self.layer = nn.Conv1d(dv, n_filters, kernel_size, padding=padding)
    
    @classmethod
    def get_default_config(cls) -> Dict:
        """Required for every subclass of Configurable"""
        return {'kernel_size': 3}
    
    @property
    def output_dim(self) -> int:
        """Required for every instance of Aggregation - specific to this project"""
        return self.n_filters
    
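    def forward(self, inp, mask):
        x = inp.permute(0, 2, 1)  # bs, seq_len, dv -> bs, dv, seq_len
        x = self.layer(x)         # bs, dv, seq_len -> bs, n_filters, seq_len
        x = x.permute(0, 2, 1)    # bs, n_filters, seq_len -> bs, seq_len, n_filters
        # un-comment to interactively examine tensors
        # ipdb.set_trace()
        x = F.relu(x)
        # zero the positions flagged by `mask`; the out-of-place `unsqueeze`
        # leaves the caller's mask tensor unchanged
        x = x.masked_fill(mask.unsqueeze(-1), 0)
        # plain mean over seq_len: zeroed positions still count in the denominator
        x = x.mean(1)
        return x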

### Configuration

An example config for a new run. The most important part: 
```
'Aggregation': {'className': 'AveragedConvolutionAggregation'}
```

This instructs the system to use the `AveragedConvolutionAggregation` class, 
built by the `Aggregation` factory.
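
For intuition, a config-driven factory of this kind can be sketched in a few lines of plain Python. This is only an illustration of the general pattern, not the actual `hyperspace_explorer` implementation. `REGISTRY`, `register`, `build` and `ToyAggregation` are hypothetical names, and `dv=400` is an arbitrary value standing in for a parameter computed at runtime.

```python
from typing import Any, Dict

REGISTRY: Dict[str, type] = {}  # hypothetical: maps class names to classes


def register(cls):
    """Class decorator: make `cls` discoverable by its name."""
    REGISTRY[cls.__name__] = cls
    return cls


def build(conf: Dict[str, Any], **runtime_kwargs):
    """Instantiate the class named by conf['className'].

    Precedence: class defaults < values from the config < runtime values.
    """
    conf = dict(conf)                      # do not mutate the caller's config
    cls = REGISTRY[conf.pop('className')]
    params = {**cls.get_default_config(), **conf, **runtime_kwargs}
    return cls(**params)


@register
class ToyAggregation:
    """Stand-in for a configurable module; it only stores its parameters."""

    def __init__(self, dv: int, kernel_size: int, n_filters: int):
        self.dv, self.kernel_size, self.n_filters = dv, kernel_size, n_filters

    @classmethod
    def get_default_config(cls) -> Dict[str, Any]:
        return {'kernel_size': 3}


conf = {'className': 'ToyAggregation', 'n_filters': 50}
agg = build(conf, dv=400)  # kernel_size falls back to the default of 3
```

The point of the pattern is that a class defined in a notebook cell becomes usable from a config as soon as the factory can find it by name, so nothing else in the pipeline has to change.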

In [4]:
config = {
    'Classifier': {
        'Aggregation': {
            'className': 'AveragedConvolutionAggregation',
            'n_filters': 50,
        },
        'className': 'AggregatingClassifier',
        'lin_ftrs': [],
    },
    'TrainingSchedule': {
        'className': 'HeadOnlySchedule',
        'cycles': 5,
    }
}

### Running the experiment

One caveat: if a random seed is to be set, it has to be set explicitly here.
Setting it automatically elsewhere causes problems for `Sacred` when running
the `hyperspace_worker`.
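
As a rough sketch of what explicit seeding usually involves in a Python/PyTorch setup, the helper below seeds the common random number generators in one call. `set_seed` is a hypothetical helper, not part of this project, and how the scenario actually consumes `config['seed']` is not shown in this notebook.

```python
import random


def set_seed(seed: int) -> None:
    """Seed the usual random number generators (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # numpy is optional for this sketch
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # torch is optional for this sketch


# Re-seeding with the same value reproduces the same draws.
set_seed(0)
first = [random.random() for _ in range(3)]
set_seed(0)
second = [random.random() for _ in range(3)]
```

Keeping the call explicit in the notebook makes it obvious which seed produced a given result.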

In [5]:
scenario = scenarios.Scenario.from_config(task_conf['Scenario'])
config['seed'] = task_conf['seed']

In [6]:
result, learner = scenario.single_run(config)

epoch,train_loss,valid_loss,accuracy,time
0,0.441105,#na#,00:08,
1,0.404904,#na#,00:08,
2,0.363094,#na#,00:07,
3,0.321124,#na#,00:07,
4,0.301765,#na#,00:07,


In [7]:
print(result)

0.9141200184822083
