This notebook provides advanced examples of how to customize several components of the end2you pipeline. In particular, it covers the following:

* [Data Provider: Process input](): Pre-process the raw input signal before feeding it to the network.
* [Custom Audio Model](): Use your own network architecture.
* [Custom Loss Function](): Use your own loss function for training a model.
* [Custom Evaluation Function](): Use your own metric function to evaluate a model.

## Data Provider: Process input

You can pre-process the input data by defining a custom data provider. The custom provider must inherit from one of end2you's data providers, and the pre-processing is applied in its `process_input` method.

For example, to apply a pre-emphasis filter to the raw audio signal, we can define the following class:

In [None]:
import torch
import numpy as np

from end2you.data_provider import AudioProvider
from end2you.data_provider.get_provider import pad_collate
from torch.utils.data import DataLoader


class CustomAudioProvider(AudioProvider):
    """Provides the data for the audio modality."""
    
    def __init__(self, *args, **kwargs):
        self.modality = 'audio'
        super().__init__(*args, **kwargs)
    
    def process_input(self, data, labels):
        """ Pre-process input.
        
        Args:
          data (np.array) (S x T): Raw audio signal with S frames and T number of samples.
          labels (np.array) (S x N) : Data labels with N outputs.
        """
        
        processed_data = []
        for di in data:
            # Apply pre-emphasis filter: y[t] = x[t] - 0.97 * x[t-1]
            processed_data.append(
                np.hstack([di[0], di[1:] - 0.97 * di[:-1]])
            )
        
        processed_data = np.vstack(processed_data)
        
        # Return the filtered signal, not the raw input
        return processed_data, labels
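
As a quick sanity check of the filter itself, the sketch below applies the same pre-emphasis, y[t] = x[t] - 0.97 * x[t-1], to a small made-up array in vectorized form. The helper name `pre_emphasis` and the toy input are assumptions made for illustration and are not part of end2you.

```python
import numpy as np


def pre_emphasis(frames, coeff=0.97):
    """Apply y[t] = x[t] - coeff * x[t-1] to each row of an (S x T) array.

    Hypothetical helper for illustration; not part of end2you.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # The first sample of every frame is kept unchanged.
    return np.concatenate(
        [frames[:, :1], frames[:, 1:] - coeff * frames[:, :-1]], axis=1)


# Toy input (assumption): 2 frames with 4 samples each.
toy = np.array([[1.0, 1.0, 1.0, 1.0],
                [0.0, 1.0, 2.0, 3.0]])
filtered = pre_emphasis(toy)
print(filtered.shape)  # (2, 4)
print(filtered)
```

A constant frame is reduced to small values after its first sample, while a rising frame keeps most of its slope, which is the high-pass behaviour expected from pre-emphasis.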

We can now invoke the `CustomAudioProvider` class and then the `DataLoader` class of pytorch as follows:

In [None]:
def get_dataloader(CustomProvider, params, **kwargs):
    
    provider_class = CustomProvider(params.dataset_path, seq_length=params.seq_length)
    
    return DataLoader(provider_class,
                      batch_size=params.batch_size,
                      shuffle=params.is_training,
                      num_workers=params.num_workers,
                      pin_memory=params.cuda,
                      collate_fn=pad_collate)

A number of parameters can be defined to instantiate the `DataLoader` class of pytorch. As we have a training process and a validation process, we need to define two loaders: one for training and the other for evaluation.

We define the `_get_params` function to return the required loaders as follows:

In [None]:
from end2you.utils import Params


def _get_params(modality:str,
               dataset_paths:list,
               seq_length:int,
               batch_size:int,
               cuda:bool):
    train_params = Params(dict_params={
        'modality': modality,
        'dataset_path': dataset_paths[0],
        'seq_length': seq_length,
        'batch_size': batch_size,
        'cuda': cuda,
        'num_workers': 0,
        'is_training': True,

    })
    
    valid_params = Params(dict_params={
        'modality': modality,
        'dataset_path': dataset_paths[1],
        'seq_length': seq_length,
        'batch_size': batch_size,
        'cuda': cuda,
        'num_workers': 0,
        'is_training': False,

    })

    return {
        'train': get_dataloader(CustomAudioProvider, train_params),
        'valid': get_dataloader(CustomAudioProvider, valid_params)
    }


Note that you need to define the paths to the training and validation `hdf5` files, which were created using the data generator (see [basic](../basic)).

In [None]:
data_providers = _get_params(
    modality='audio',
    dataset_paths=['/path/to/train/hdf5/files', '/path/to/valid/hdf5/files'],
    seq_length=150,
    batch_size=10,
    cuda=True
)

## Custom Audio Model 

You can also use your own network architecture. To do so, define your model in PyTorch and pass it to the training process.

The example that follows builds an audio network. An important property of the class is the `num_outs` parameter, which is the number of values the model predicts. In this notebook we use the SEWA database, which provides 3 outputs, and hence `num_outs = 3` in this case.

In [None]:
import torch
import torch.nn as nn
import numpy as np

from end2you.models.audio import AudioModel
from end2you.models.rnn import RNN
from pathlib import Path

In [None]:
class AudioRNNModel(nn.Module):
    
    def __init__(self,
                 input_size:int,
                 num_outs:int,
                 pretrained:bool = False,
                 model_name:str = None):
        """ Convolutional recurrent neural network model.
        
        Args:
            input_size (int): Input size to the model. 
            num_outs (int): Number of output values of the model.
            pretrained (bool): Use pretrained model (default `False`).
            model_name (str): Name of model to build (default `None`).
        """
        
        super(AudioRNNModel, self).__init__()
        audio_network = AudioModel(model_name=model_name, input_size=input_size)
        self.audio_model = audio_network.model
        num_out_features = audio_network.num_features
        self.rnn, num_out_features = self._get_rnn_model(num_out_features)
        self.linear = nn.Linear(num_out_features, num_outs)
        self.num_outs = num_outs
    
    def _get_rnn_model(self, input_size:int):
        """ Builder method to get RNN instance."""
        
        rnn_args = {
            'input_size': input_size,
            'hidden_size': 64,
            'num_layers': 2,
            'batch_first':True
        }
        return RNN(rnn_args, 'gru'), rnn_args['hidden_size']
    
    def forward(self, x:torch.Tensor):
        """
        Args:
            x (torch.Tensor) (BS x S x T): Batch of BS sequences, each with S frames of T samples.
        """
        
        batch_size, seq_length, t = x.shape
        x = x.view(batch_size*seq_length, 1, t)
        
        audio_out = self.audio_model(x)
        audio_out = audio_out.view(batch_size, seq_length, -1)
        
        rnn_out, _ = self.rnn(audio_out)
        
        output = self.linear(rnn_out)
        
        return output

In [None]:
model = AudioRNNModel(
    input_size=1600, num_outs=3, model_name='emo18')
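
The reshaping done in `forward` can be traced with stand-in layers. The sketch below replaces end2you's `AudioModel` and `RNN` with a plain `nn.Conv1d` feature extractor and an `nn.GRU`. These stand-ins, their layer sizes, and the toy batch dimensions are assumptions made for illustration; this is not the `emo18` network.

```python
import torch
import torch.nn as nn

batch_size, seq_length, t = 2, 5, 1600
num_outs = 3

# Stand-in layers (assumptions), not the end2you 'emo18' network.
feature_extractor = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=8, kernel_size=9, padding=4),
    nn.ReLU(),
    nn.AdaptiveMaxPool1d(4),  # fixed number of time steps per frame
)
gru = nn.GRU(input_size=8 * 4, hidden_size=64, num_layers=2, batch_first=True)
linear = nn.Linear(64, num_outs)

x = torch.randn(batch_size, seq_length, t)        # BS x S x T
frames = x.view(batch_size * seq_length, 1, t)    # (BS*S) x 1 x T
feats = feature_extractor(frames)                 # (BS*S) x 8 x 4
feats = feats.view(batch_size, seq_length, -1)    # BS x S x 32
rnn_out, _ = gru(feats)                           # BS x S x 64
output = linear(rnn_out)                          # BS x S x num_outs
print(output.shape)  # torch.Size([2, 5, 3])
```

Folding the sequence dimension into the batch dimension lets the convolutional part process every frame independently, and the second `view` restores the sequence so the recurrent layer can model it over time.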

## Custom Loss Function 

You can define your own loss function and use it to train your model. To do so, define a class that inherits from end2you's `Losses` class and implement your loss as a method of that class.

An example that defines an MSE loss follows:

In [None]:
import torch

from end2you.training import Losses


class CustomMSELoss(Losses):
    
    def __init__(self, loss_name:str):
        super(CustomMSELoss, self).__init__()
        self._loss = self.custom_mse 
        self.loss_name = loss_name
        
    def custom_mse(self, 
                   predictions:torch.Tensor, 
                   labels:torch.Tensor):
        """ Custom MSE loss function.
        
        Args:
          predictions (torch.Tensor) (BS x 1): Model predictions
          labels (torch.Tensor) (BS x 1): Data labels
            BS: Batch size multiplied by the Sequence length.
                e.g. batch_size = 10 and seq_length = 150
                     => BS = 1500
        """
        
        predictions = predictions.view(-1,)
        labels = labels.view(-1,)
        
        return torch.mean((predictions - labels)**2)

In [None]:
loss_cls = CustomMSELoss('custom_MSE')
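
One way to check a custom loss before training is to compare it against a built-in reference and confirm that gradients flow through it. The sketch below does this for the MSE computation above, written as a free function so that it runs without end2you. The random tensors and the free-function form are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def custom_mse(predictions, labels):
    """Same computation as `CustomMSELoss.custom_mse`, as a free function."""
    predictions = predictions.view(-1)
    labels = labels.view(-1)
    return torch.mean((predictions - labels) ** 2)


torch.manual_seed(0)
# Toy data (assumption): batch_size = 10, seq_length = 150 => BS = 1500.
predictions = torch.randn(1500, 1, requires_grad=True)
labels = torch.randn(1500, 1)

loss = custom_mse(predictions, labels)
reference = F.mse_loss(predictions, labels)  # default reduction is 'mean'
loss.backward()  # a training loss must be differentiable

print(torch.allclose(loss, reference))  # True
print(predictions.grad.shape)           # torch.Size([1500, 1])
```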

## Custom Evaluation Function

You can define your own metric function and use it to evaluate your model. To do so, define a class that inherits from end2you's `MetricProvider` class and implement your metric as a method of that class.

An example that defines an MSE metric follows:

In [None]:
import numpy as np
import torch

from end2you.evaluation import MetricProvider


class CustomMSEMetric(MetricProvider):
    
    def __init__(self, metric_name:str):
        super(CustomMSEMetric, self).__init__()
        self._metric = self.custom_mse
        self.metric_name = metric_name
    
    def custom_mse(self, 
                   predictions:list, 
                   labels:list):
        """ Custom MSE metric function.
        
        Args:
          predictions (list): Model predictions of batch size length
          labels (list): Data labels of batch size length
        """
        
        predictions = np.concatenate(predictions).reshape(-1,)
        labels = np.concatenate(labels).reshape(-1,)
        
        return np.mean((predictions - labels)**2).astype(np.float64)

In [None]:
eval_fn = CustomMSEMetric('custom_mse')
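
To see how the metric handles a list of per-batch arrays, the sketch below runs the same computation on two small made-up batches of different sizes. The function is written outside the class so that it runs without end2you, and the toy values are assumptions made for illustration.

```python
import numpy as np


def custom_mse(predictions, labels):
    """Same computation as `CustomMSEMetric.custom_mse`, as a free function."""
    predictions = np.concatenate(predictions).reshape(-1,)
    labels = np.concatenate(labels).reshape(-1,)
    return np.mean((predictions - labels) ** 2).astype(np.float64)


# Two toy batches (assumption) with different numbers of frames.
batch_predictions = [np.array([[0.0], [1.0]]), np.array([[2.0]])]
batch_labels = [np.array([[0.0], [3.0]]), np.array([[2.0]])]

# Errors are 0, -2 and 0, so the mean squared error is 4 / 3.
score = custom_mse(batch_predictions, batch_labels)
print(score)
```

Because the batches are concatenated before averaging, every frame contributes equally to the score regardless of which batch it came from.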

## Remaining parameters

In [None]:
params = Params(dict_params={
    'train':Params(dict_params={'cuda':True,  
                                'optimizer':'adam',
                                'learning_rate':0.0002,
                                'summarywriter_file':'train_sw',
                                'num_epochs':50,
                                'batch_size':3,
                                'save_summary_steps':10,
                               }),
    'valid':Params(dict_params={'cuda':True,  
                                'modality':'audio',
                                'batch_size':1, 
                              }),
    'root_dir':'./path/to/save/output/files/of/end2you',
    'log_file':'training.log',
    'ckpt_path': None,
    'num_gpus':1
})

In [None]:
import end2you.training.optimizer as optim

from torch.utils.tensorboard import SummaryWriter


processes = ['train', 'valid']

# Model
num_model_params = [
    pmt.numel() for pmt in model.parameters() if pmt.requires_grad is True]

# Optimizer to choose
optimizer = optim.get_optimizer(params.train.optimizer)
optimizer = optimizer(model.parameters(), lr=params.train.learning_rate)

tb_path = Path(params.root_dir) / 'summarywriters' 
summary_writers = {
    process: SummaryWriter(str(tb_path / process))
        for process in processes
}

## Use GPU (optional)

In [None]:
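import logging

gpus = list(range(params.num_gpus))
# A torch.device holds a single index, so use the first GPU as the primary device
device = torch.device('cuda:{}'.format(gpus[0]))
torch.cuda.set_device(device)

if torch.cuda.device_count() > 1:
    # logging uses %-style placeholders, not print-style positional arguments
    logging.info('Using %d GPUs!', torch.cuda.device_count())
    model = nn.DataParallel(model)

model.to(device)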
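
On a machine without a GPU, the device selection can fall back to the CPU. The following is a minimal sketch of that pattern using a placeholder `nn.Linear` model (an assumption made for illustration). If you use it, the `cuda` entries in the parameters would presumably need to be set to match.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # placeholder model (assumption)

# Pick the first GPU when CUDA is available, otherwise the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Wrap the model for multi-GPU use only when several GPUs are visible.
if device.type == 'cuda' and torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

model = model.to(device)
print(device.type)
```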

## Start Training

In [None]:
from end2you.training import Trainer

trainer = Trainer(loss=loss_cls, 
                  evaluator=eval_fn,
                  data_providers=data_providers,
                  summary_writers=summary_writers,
                  root_dir=params.root_dir,
                  model=model,
                  ckpt_path=params.ckpt_path,
                  optimizer=optimizer,
                  params=params)

In [None]:
trainer.start_training()