### Quick Start

The `LANfactory` package is a lightweight convenience package for training *likelihood approximation networks* (LANs) in torch (or keras),
starting from supplied training data.

[LANs](https://elifesciences.org/articles/65074) are potentially applicable more broadly, but they were conceived in the context of sequential sampling modeling,
which aims to account for the cognitive processes giving rise to *choice* and *reaction time* data in *n-alternative forced choice experiments* commonly encountered in the cognitive sciences.

In this quick tutorial we will use the [`ssms`](https://github.com/AlexanderFengler/ssm_simulators) package to generate our training data from such a sequential sampling model (SSM). Note that `LANfactory` is in no way tied to the `ssms` package.
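
To make the idea of a sequential sampling model concrete, here is a minimal sketch of a basic two-choice drift diffusion process, simulated with a simple Euler-Maruyama scheme. It is not the `ssms` implementation. The function name `simulate_ddm` and its parameters (`v`, `a`, `z`, `t`) are illustrative assumptions, and the sketch only shows how a single trial produces a choice and a reaction time.

```python
import random


def simulate_ddm(v, a, z=0.5, t=0.3, dt=0.001, max_t=20.0, rng=None):
    """Simulate one trial of a basic drift diffusion model.

    v: drift rate, a: boundary separation, z: relative start point in (0, 1),
    t: non-decision time in seconds. Returns (reaction_time, choice), where
    choice is 1 (upper bound), -1 (lower bound), or 0 if no bound was hit
    before max_t.
    """
    rng = rng or random.Random()
    x = z * a                 # evidence starts between the bounds 0 and a
    elapsed = 0.0
    sd = dt ** 0.5            # standard deviation of one noise increment
    while elapsed < max_t:
        x += v * dt + sd * rng.gauss(0.0, 1.0)
        elapsed += dt
        if x >= a:
            return elapsed + t, 1
        if x <= 0.0:
            return elapsed + t, -1
    return max_t + t, 0


rng = random.Random(0)
trials = [simulate_ddm(v=1.0, a=1.5, rng=rng) for _ in range(200)]
upper = sum(1 for _, choice in trials if choice == 1)
print(f"upper-bound choices: {upper} / {len(trials)}")
```

With a positive drift rate, most trials end at the upper bound. A LAN is trained to approximate the likelihood of such (reaction time, choice) pairs given the model parameters.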

#### Install

To install the `ssms` package, type:

`pip install git+https://github.com/AlexanderFengler/ssm_simulators`

To install the `LANfactory` package, type:

`pip install git+https://github.com/AlexanderFengler/LANfactory`

Necessary dependencies should be installed automatically in the process.
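
As a quick sanity check after installation, the following sketch tests whether the two packages can be located by Python. It uses only the standard library. The helper name `is_installed` is hypothetical, and the module names `ssms` and `lanfactory` are taken from the import statements used in the tutorial.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name in ("ssms", "lanfactory"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```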

#### Basic Tutorial

In [9]:
# Load necessary packages
import ssms
import lanfactory 
import os
import numpy as np
from copy import deepcopy
import torch

#### Generate Training Data
First we need to generate some training data. As mentioned above, we will do so using the `ssms` Python package, without going into a detailed explanation
of it. Please refer to the [basic ssms tutorial](https://github.com/AlexanderFengler/ssm_simulators) if you want to learn more.

In [4]:
# Make configs

# Initialize the generator config (for MLP LANs)
generator_config = deepcopy(ssms.config.data_generator_config['lan']['mlp'])
# Specify generative model (one of the models included in ssms)
generator_config['dgp_list'] = 'angle' 
# Specify number of parameter sets to simulate
generator_config['n_parameter_sets'] = 100 
# Specify how many samples a simulation run should entail
generator_config['n_samples'] = 1000
# Specify folder in which to save generated data
generator_config['output_folder'] = 'data/lan_mlp/'

# Make model config dict
model_config = ssms.config.model_config['angle']
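
The generator config above is copied with `deepcopy` before it is edited. The sketch below illustrates why, using a hypothetical nested dict `template` as a stand-in for a package-level config (the real keys of `ssms.config.data_generator_config` are not reproduced here).

```python
from copy import deepcopy

# Hypothetical stand-in for a nested package-level config template.
template = {"lan": {"mlp": {"n_samples": 5, "output_folder": "data/"}}}

# Without a copy, the "new" config is the same object as the template entry.
alias = template["lan"]["mlp"]
alias["n_samples"] = 1000
print(template["lan"]["mlp"]["n_samples"])  # the template changed too

# With deepcopy, edits stay local to the copy.
template["lan"]["mlp"]["n_samples"] = 5     # restore the default
safe = deepcopy(template["lan"]["mlp"])
safe["n_samples"] = 1000
print(template["lan"]["mlp"]["n_samples"])  # the template is untouched
```

Note that `model_config` is assigned without a copy in the cell above. That is fine as long as it is only read, but wrapping it in `deepcopy` as well would be the safer choice if you plan to modify it.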

In [12]:
# MAKE DATA

my_dataset_generator = ssms.dataset_generators.data_generator(generator_config = generator_config,
                                                              model_config = model_config)

training_data = my_dataset_generator.generate_data_training_uniform(save = True)

checking:  data/lan_mlp/
simulation round: 1  of 10
simulation round: 2  of 10
simulation round: 3  of 10
simulation round: 4  of 10
simulation round: 5  of 10
simulation round: 6  of 10
simulation round: 7  of 10
simulation round: 8  of 10
simulation round: 9  of 10
simulation round: 10  of 10
Writing to file:  data/lan_mlp/training_data_0_nbins_0_n_1000/angle/training_data_angle_18ef3b40b6e611ecbc37acde48001122.pickle
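
The generator reports writing a `.pickle` file. The sketch below shows one way to peek into such a file, assuming it is a standard Python pickle. The helper `summarize_pickle` is hypothetical, and because the internal structure of the training file is not described here, it only reports the top-level type and, for dicts, each key with its shape or type. The demonstration runs on a small stand-in file with made-up keys, not on real training data.

```python
import os
import pickle
import tempfile


def summarize_pickle(path):
    """Load a pickle file and return one-line summaries of its content."""
    with open(path, "rb") as f:
        obj = pickle.load(f)  # only unpickle files from a trusted source
    lines = [f"top-level type: {type(obj).__name__}"]
    if isinstance(obj, dict):
        for key, value in obj.items():
            # Arrays expose .shape; fall back to the type name otherwise.
            shape = getattr(value, "shape", None)
            detail = f"shape {shape}" if shape is not None else type(value).__name__
            lines.append(f"{key}: {detail}")
    return lines


# Demonstration on a stand-in file (hypothetical keys and values).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pickle")
    with open(demo_path, "wb") as f:
        pickle.dump({"example_inputs": [[0.1, 0.2]], "example_targets": [0.5]}, f)
    summary = summarize_pickle(demo_path)

print("\n".join(summary))
```

To inspect the generated data, pass the path printed after `Writing to file:` to `summarize_pickle`.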


In [31]:
# MAKE DATALOADERS

# List of datafiles (here only one)
folder_ = 'data/lan_mlp/training_data_0_nbins_0_n_1000/angle/'
file_list_ = [folder_ + file_ for file_ in os.listdir(folder_)]

# Training dataset
torch_training_dataset = lanfactory.trainers.DatasetTorch(file_IDs = file_list_,
                                                          batch_size = 128)

torch_training_dataloader = torch.utils.data.DataLoader(torch_training_dataset,
                                                         shuffle = True,
                                                         batch_size = None,
                                                         num_workers = 1,
                                                         pin_memory = True)

# Validation dataset (for simplicity this reuses the training files; use held-out data in practice)
torch_validation_dataset = lanfactory.trainers.DatasetTorch(file_IDs = file_list_,
                                                            batch_size = 128)

torch_validation_dataloader = torch.utils.data.DataLoader(torch_validation_dataset,
                                                          shuffle = True,
                                                          batch_size = None,
                                                          num_workers = 1,
                                                          pin_memory = True)
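
Both loaders pass `batch_size = None` to `DataLoader`, which turns off PyTorch's automatic batching so that each item returned by the dataset is passed through as-is. This suggests that `DatasetTorch` assembles full batches itself, using its own `batch_size = 128` argument. The sketch below shows that pattern in plain Python, without torch. The class `PreBatchedDataset` is hypothetical and is not LANfactory's implementation; dropping the incomplete final batch is likewise an assumption.

```python
class PreBatchedDataset:
    """Map-style dataset whose items are whole batches rather than single rows."""

    def __init__(self, rows, batch_size):
        self.rows = rows
        self.batch_size = batch_size

    def __len__(self):
        # Number of complete batches; a trailing partial batch is dropped.
        return len(self.rows) // self.batch_size

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        return self.rows[start:start + self.batch_size]


dataset = PreBatchedDataset(rows=list(range(1000)), batch_size=128)
print(len(dataset))        # number of batches
print(len(dataset[0]))     # rows in one batch
```

With a dataset of this kind, `shuffle = True` on the loader shuffles the order of batches rather than the order of individual rows.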

In [32]:
# SPECIFY NETWORK CONFIGS AND TRAINING CONFIGS

network_config = lanfactory.config.network_configs.network_config_mlp

print('Network config: ')
print(network_config)

train_config = lanfactory.config.network_configs.train_config_mlp

print('Train config: ')
print(train_config)

Network config: 
{'layer_types': ['dense', 'dense', 'dense'], 'layer_sizes': [100, 100, 1], 'activations': ['tanh', 'tanh', 'linear'], 'loss': ['huber'], 'callbacks': ['checkpoint', 'earlystopping', 'reducelr']}
Train config: 
{'batch_size': 128, 'n_epochs': 100, 'optimizer': 'adam', 'learning_rate': 0.002, 'loss': 'huber', 'metrics': [<keras.losses.MeanSquaredError object at 0x12aecdc70>, <keras.losses.Huber object at 0x12ac93820>], 'callbacks': ['checkpoint', 'earlystopping', 'reducelr']}
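
The printed network config describes the MLP through three parallel lists: `layer_types`, `layer_sizes`, and `activations`. Assuming the network is built layer by layer from these lists, which their parallel structure suggests, an edited copy should keep all three the same length. The sketch below shows such a check. The helper `check_network_config` is hypothetical and not part of `lanfactory`, and the dict literal copies only the per-layer entries from the output above.

```python
from copy import deepcopy


def check_network_config(config):
    """Raise ValueError unless the per-layer lists all have the same length."""
    keys = ("layer_types", "layer_sizes", "activations")
    lengths = {key: len(config[key]) for key in keys}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"per-layer lists differ in length: {lengths}")
    return lengths["layer_sizes"]


# Per-layer entries copied from the printed network config above.
base_config = {
    "layer_types": ["dense", "dense", "dense"],
    "layer_sizes": [100, 100, 1],
    "activations": ["tanh", "tanh", "linear"],
}

# Edit a copy: add one hidden layer in front of the existing layers.
wider = deepcopy(base_config)
wider["layer_types"].insert(0, "dense")
wider["layer_sizes"].insert(0, 200)
wider["activations"].insert(0, "tanh")

print(check_network_config(base_config))
print(check_network_config(wider))
```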


In [33]:
# LOAD NETWORK
net = lanfactory.trainers.TorchMLP(network_config = deepcopy(network_config),
                                   input_shape = torch_training_dataset.input_dim,
                                   save_folder = 'data/torch_models/',
                                   generative_model_id = 'angle')

tanh
linear


In [34]:
# SAVE CONFIGS
lanfactory.utils.save_configs(model_id = net.model_id + '_torch_',
                              save_folder = 'data/torch_models/angle/',
                              network_config = network_config,
                              train_config = train_config,
                              allow_abs_path_folder_generation = True)

Found folder:  data
Moving on...
Did not find folder:  data/torch_models
Creating it...
Did not find folder:  data/torch_models/angle
Creating it...
Saved network config
Saved train config


In [36]:
# LOAD MODEL TRAINER
model_trainer = lanfactory.trainers.ModelTrainerTorchMLP(train_config = deepcopy(train_config),
                                                         data_loader_train = torch_training_dataloader,
                                                         data_loader_valid = torch_validation_dataloader,
                                                         model = net,
                                                         output_folder = 'data/torch_models/',
                                                         warm_start = False,
                                                         allow_abs_path_folder_generation = True)

Torch Device:  cpu
Found folder:  data
Moving on...
Found folder:  data/torch_models
Moving on...
Found folder:  data/torch_models/angle
Moving on...


In [39]:
# TRAIN MODEL
model_trainer.train_model(save_history = True,
                          save_model = True,
                          verbose = 2)

epoch: 0 / 100, batch: 0 / 1562, batch_loss: 0.2204161137342453
epoch: 0 / 100, batch: 1000 / 1562, batch_loss: 0.15364894270896912
Epoch took 11.396775960922241 seconds
STARTING VALIDATION:
epoch 0 / 100, validation_loss: 0.1417
epoch: 1 / 100, batch: 0 / 1562, batch_loss: 0.14299717545509338
epoch: 1 / 100, batch: 1000 / 1562, batch_loss: 0.11169125884771347
Epoch took 11.024933815002441 seconds
STARTING VALIDATION:
epoch 1 / 100, validation_loss: 0.1234


KeyboardInterrupt: 
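
The `batch_loss` and `validation_loss` values in the log are presumably computed with the loss named in the train config, `'huber'`. For reference, here is a minimal sketch of the standard Huber loss in plain Python. The threshold `delta = 1.0` is an assumption, since the value used by `lanfactory` is not shown here, and the function names are illustrative.

```python
def huber(prediction, target, delta=1.0):
    """Standard Huber loss for one prediction/target pair."""
    error = abs(prediction - target)
    if error <= delta:
        return 0.5 * error ** 2           # quadratic for small errors
    return delta * (error - 0.5 * delta)  # linear for large errors


def mean_huber(predictions, targets, delta=1.0):
    """Average Huber loss over paired sequences of equal length."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    total = sum(huber(p, t, delta) for p, t in zip(predictions, targets))
    return total / len(targets)


print(huber(0.5, 0.0))   # small error: quadratic branch
print(huber(3.0, 0.0))   # large error: linear branch
```

Because the loss grows only linearly for large errors, a few badly predicted targets influence training less than they would under a squared-error loss.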