# API - Training

This notebook illustrates the main features of pyNeVer for training a network on a dataset.


## The _datasets_ module

The module *datasets* contains the classes to load a pre-defined dataset or a custom one.

In [None]:
import torch
import torchvision.transforms as tr

from pynever import datasets

# We provide a direct interface to the common MNIST and Fashion MNIST datasets

# Prepare the dataset with a transformation to use it with a fully connected network
transform = tr.Compose([tr.ToTensor(), tr.Normalize(1, 0.5), tr.Lambda(lambda x: torch.flatten(x))])
mnist_training = datasets.TorchMNIST('path_to_data', train=True,
                                     transform=transform)  # The dataset is downloaded if not found in the given path
mnist_test = datasets.TorchMNIST('path_to_data', train=False, transform=transform)
fmnist_test = datasets.TorchFMNIST('path_to_data', train=False)  # Here there is no transform applied
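
To make the effect of the transform above concrete, here is a minimal sketch that reproduces its three steps on a made-up 28x28 image using only plain Python lists. It is not the torchvision implementation; the image data and the helper names (`to_unit_range`, `normalize`, `flatten`) are hypothetical and exist only for illustration. It assumes that `ToTensor` rescales 8-bit pixels to the range [0, 1] and that `Normalize(mean, std)` computes `(x - mean) / std`.

```python
# Hypothetical stand-in for a 28x28 grayscale image with 8-bit pixel values
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]


def to_unit_range(img):
    """Mimic the scaling done by ToTensor: 0..255 integers become 0.0..1.0 floats."""
    return [[p / 255.0 for p in row] for row in img]


def normalize(img, mean, std):
    """Mimic Normalize(mean, std): shift by the mean, then divide by the std."""
    return [[(p - mean) / std for p in row] for row in img]


def flatten(img):
    """Mimic torch.flatten: concatenate the rows into a single vector."""
    return [p for row in img for p in row]


vector = flatten(normalize(to_unit_range(image), mean=1.0, std=0.5))
print(len(vector))               # 784 values, matching the input size of a fully connected layer
print(min(vector), max(vector))  # -2.0 0.0
```

With a mean of 1 and a standard deviation of 0.5, pixel values end up in the interval [-2, 0] rather than being centred on zero.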

It is also possible to load a custom dataset stored as a plaintext file using the class __GenericFileDataset__. Besides the file path, it takes the target index, i.e., the position that separates the inputs from the outputs, the data type (float by default) and the delimiter character (',' by default).

In [None]:
# This dataset has ten inputs and is delimited by ';'
my_custom_dataset = datasets.GenericFileDataset('dataset.txt', 10, dtype=float, delimiter=';')
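
The role of the target index can be illustrated with a small standalone sketch. This is not how pyNeVer reads the file internally; the function `split_samples` and the sample text are hypothetical. It assumes that the columns before the target index are the inputs and the remaining columns are the outputs.

```python
import csv
import io

# Hypothetical file content: two samples, each with ten inputs followed by two outputs
raw = "0;1;2;3;4;5;6;7;8;9;0.5;1.5\n9;8;7;6;5;4;3;2;1;0;2.5;3.5\n"


def split_samples(text, target_index, dtype=float, delimiter=','):
    """Parse delimited text and split every row into (inputs, outputs) at target_index."""
    samples = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        values = [dtype(v) for v in row]
        samples.append((values[:target_index], values[target_index:]))
    return samples


samples = split_samples(raw, 10, dtype=float, delimiter=';')
inputs, targets = samples[0]
print(len(inputs), targets)  # 10 [0.5, 1.5]
```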

## The _training_ module

Once a dataset is loaded, it is possible to train a network whose input and output dimensions match those of the dataset. The training and testing strategies are initialized with all the necessary parameters before launch.

In [None]:
from pynever.networks import SequentialNetwork
from pynever.nodes import FullyConnectedNode, ReLUNode
from pynever.strategies.training import PytorchTraining, PytorchTesting

mnist_net = SequentialNetwork('MNIST_net', 'X')
mnist_net.append_node(FullyConnectedNode('fc', (784,), 1000))  # Shallow NN with 1000 ReLU neurons
mnist_net.append_node(ReLUNode('relu', (1000,)))
mnist_net.append_node(FullyConnectedNode('fc2', (1000,), 10))

# The training strategy is built on PyTorch
from torch.optim import Adam, lr_scheduler
from torch.nn import CrossEntropyLoss
from pynever.strategies.training import PytorchMetrics

train_batch_size = 128
validation_batch_size = 64
test_batch_size = 64
validation_percentage = 0.3
opt_params = {'lr': 0.01}
sch_params = {}

# Print the messages of the training logger to standard output
import logging
import sys

logger = logging.getLogger('pynever.strategies.training')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(sys.stdout))

# The fourth positional argument (10) is the number of training epochs
training_strategy = PytorchTraining(Adam, opt_params, CrossEntropyLoss(), 10,
                                    validation_percentage, train_batch_size, validation_batch_size,
                                    scheduler_con=lr_scheduler.ReduceLROnPlateau, sch_params=sch_params,
                                    precision_metric=PytorchMetrics.inaccuracy, device='cpu')
trained_net = training_strategy.train(mnist_net, mnist_training)

# Now, to test the results...
test_strategy = PytorchTesting(PytorchMetrics.inaccuracy, {}, test_batch_size, 'cpu')
test_strategy.test(trained_net, mnist_test)
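
As a rough guide to reading the value returned by the test, the sketch below computes a misclassification rate in plain Python. It rests on the assumption that an "inaccuracy" metric is the fraction of samples whose predicted class differs from the label; the actual `PytorchMetrics.inaccuracy` may be implemented differently. The scores and labels are made up.

```python
def inaccuracy(scores, labels):
    """Fraction of samples whose highest-scoring class differs from the label."""
    wrong = 0
    for row, label in zip(scores, labels):
        # Index of the largest score, i.e. the predicted class
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != label:
            wrong += 1
    return wrong / len(labels)


# Hypothetical network outputs for four samples and three classes
scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
labels = [1, 0, 1, 1]
print(inaccuracy(scores, labels))  # 0.25
```

Under this reading, lower is better: a value of 0.25 means one sample in four was misclassified.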