# Basic Usage Tutorial
This notebook shows how to use the multimodal translation model in its basic form.

## 1. Importing the needed parts

In [None]:
# we will need to import a few things first
import numpy as np
import os

# changing working directory
# for imports to work
from pathlib import Path
path = Path(os.getcwd())
os.chdir(path.parent)

# import the encoders and decoders we want to use
from multimodal_autoencoders.model.encoders import DynamicEncoder
from multimodal_autoencoders.model.decoders import DynamicDecoder

# import the Autoencoder class
from multimodal_autoencoders.base.autoencoder import VariationalAutoencoder

# import a discriminator and a classifier
from multimodal_autoencoders.model.classifiers import Discriminator, SimpleClassifier

# import the JointTrainer aka the brains of the operation
from multimodal_autoencoders.joint_trainer import JointTrainer

## 2. Setting up the data
The philosophy of the package is to make it as accessible as possible. Therefore the model relies on simple numpy arrays as data input. Please note that this data should already be processed the way you want.

The association of certain parts with a given modality is tracked with simple python dictionaries.
These map a string key to the data in a simple numpy array. The model also only supports paired data at the moment, meaning that each row of each modality should be associated with the same sample (e.g. compound).<br>
**Please note that preprocessing the data and making sure all arrays are in the correct order is your responsibility!**
<br>For this tutorial we will only use some randomly generated data to keep things fast.
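The pairing requirement can be partly checked before training. The following is a minimal sketch, assuming a data dictionary of 2-D numpy arrays; `check_paired` is a hypothetical helper and not part of the package. It only verifies that all modalities have the same number of rows. Whether the rows are in the same sample order cannot be checked automatically and remains up to you.

```python
import numpy as np


def check_paired(data_dict):
    """Return the shared sample count, or raise if the modalities disagree.

    Hypothetical helper for illustration; not part of the package.
    """
    n_rows = {key: arr.shape[0] for key, arr in data_dict.items()}
    if len(set(n_rows.values())) != 1:
        raise ValueError(f"Modalities have different numbers of samples: {n_rows}")
    return next(iter(n_rows.values()))


# small demo dictionary with two modalities of 100 samples each
demo_rng = np.random.default_rng(0)
demo_dict = {
    "modality_1": demo_rng.random(size=(100, 50)),
    "modality_2": demo_rng.random(size=(100, 75)),
}
n_samples = check_paired(demo_dict)
print(f"All modalities share {n_samples} samples")
```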

In [None]:
# initialize the random generator
rng = np.random.default_rng()

# create some latent information common to all modalities
train_latent_information = rng.random(size = (100, 25))

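# small helper function for our synthetic data:
# appends n_random_dims noise columns to the shared latent information
# and shuffles the column order
def generate_modality(latent_information: np.ndarray, n_random_dims: int) -> np.ndarray:
    # take the number of samples from the input instead of hardcoding it
    n_samples = latent_information.shape[0]
    ar = np.concatenate((latent_information, rng.random(size = (n_samples, n_random_dims))), axis=1)
    rng.shuffle(ar, axis=1)

    return ar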

# define the data dictionary
# this will be the first part you'll need to hold your actual data
train_data_dict = {
    "modality_1": generate_modality(train_latent_information, 25),
    "modality_2": generate_modality(train_latent_information, 50),
    "modality_3": generate_modality(train_latent_information, 75)}


# we will also create a separate validation data set sharing some similarity to the training data
val_latent_information = train_latent_information * 0.8 + rng.random(size = (100, 25)) * 0.2
val_data_dict = {
    "modality_1": generate_modality(val_latent_information, 25),
    "modality_2": generate_modality(val_latent_information, 50),
    "modality_3": generate_modality(val_latent_information, 75)}

## 3. Setting up the models
### 3.1 Autoencoders
The individual autoencoder models also get associated with a modality via a python dictionary. **Please make sure that for each entry in the model dictionary, there exists an entry with the same key in the data dictionary.**<br>

The VariationalAutoencoder class is a one-stop shop to define everything an autoencoder needs to work. It expects a few inputs:
- encoder: object of class encoder
- decoder: object of class decoder
- optimizer: string name of optimizer to use (adam, sgd)
- learning_rate: learning rate to use for this model
- pretrain_epochs: number of epochs to train this model alone
- train_joint: boolean if the model should be further trained during the joint training phase
- optimizer arguments: any further arguments needed for the optimizer can be passed as keyword arguments

The encoder and decoder themselves expect some inputs. These might change in the future in case of more flexible implementations:

#### encoder:
- n_input: number of features of the input
- n_hidden: number of nodes/channels in the hidden layers
- n_layers: number of hidden layers of the encoder

#### decoder:
- n_input: number of features of the input
- n_hidden: number of nodes/channels to use in the hidden layers
- n_z: number of features in the latent space
- n_layers: number of hidden layers of the decoder
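Because each model is matched to its data by key, and `n_input` has to equal the number of feature columns of that modality, a quick consistency check before building the model dictionary can catch mismatches early. This is a sketch under assumptions: `check_modalities` and `expected_inputs` are hypothetical names introduced here for illustration, and the feature counts simply mirror the synthetic data of this tutorial.

```python
import numpy as np


def check_modalities(data_dict, expected_inputs):
    """Check that every model key has data with the expected feature count.

    Hypothetical helper for illustration; not part of the package.
    """
    missing = set(expected_inputs) - set(data_dict)
    if missing:
        raise KeyError(f"No data found for model keys: {sorted(missing)}")
    for key, n_input in expected_inputs.items():
        n_features = data_dict[key].shape[1]
        if n_features != n_input:
            raise ValueError(
                f"{key}: model expects {n_input} features, data has {n_features}"
            )


# n_input values we plan to pass to the encoders and decoders
expected_inputs = {"modality_1": 50, "modality_2": 75, "modality_3": 100}

# stand-in data with matching shapes
demo_data = {key: np.zeros((100, n)) for key, n in expected_inputs.items()}
check_modalities(demo_data, expected_inputs)
print("Keys and feature counts are consistent")
```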

In [None]:
model_dict = {
    "modality_1": VariationalAutoencoder(DynamicEncoder(50, 42, 2), DynamicDecoder(50, 42, 36, 2), "adam", 0.001),
    "modality_2": VariationalAutoencoder(DynamicEncoder(75, 50, 2), DynamicDecoder(75, 50, 36, 2), "adam", 0.001),
    "modality_3": VariationalAutoencoder(DynamicEncoder(100, 75, 2), DynamicDecoder(100, 75, 36, 2), "adam", 0.001)}

### 3.2 Support models
The multimodal translation architecture works with two support models: a latent space discriminator and a sample classifier. While the discriminator is mandatory for the model to work, the classifier is optional.

#### 3.2.1 Discriminator
The discriminator helps to align the latent spaces of the different modalities. Similar to the autoencoder class, the discriminator already contains its own optimizer. At the moment it is implemented as a feed-forward neural network.

Discriminator:
- optimizer: string name of optimizer to use (adam, sgd)
- learning_rate: learning rate to use for this model
- n_z: number of features in the latent space
- n_out: classes to predict (usually the number of modalities)
- n_hidden: number of nodes/channels in the hidden layers

In [None]:
discriminator = Discriminator("adam", 0.001, 36, len(model_dict), 50)

#### 3.2.2 Classifier 
The classifier can support the autoencoders by providing additional information on how to structure the latent space.

Classifier:
- optimizer: string name of optimizer to use (adam, sgd)
- learning_rate: learning rate to use for this model
- n_z: number of features in the latent space
- n_out: classes to predict (the number of sample classes)

**Cluster labels** <br>
First we need to set up some labels for the classifier to predict. We will assume two classes in the data. Cluster labels are given as a numpy array with one entry per sample: each label in the array belongs to the sample at the same index in the training data. The labels can be strings or integers and will be converted to the needed format internally. **Make sure the label array is one-dimensional, not a multi-dimensional array.**

In [None]:
cluster_data = np.concatenate((np.repeat(0, 50), np.repeat(1, 50))).flatten()
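String labels work the same way. The sketch below builds a one-dimensional array of string labels and shows one possible conversion to integer codes with `np.unique`. The package states that it converts labels internally, so this is only meant to illustrate what such an encoding can look like and how to verify the array is one-dimensional. The label names are made up for the example.

```python
import numpy as np

# hypothetical string labels, one per sample, in the same order as the data rows
string_labels = np.array(["active"] * 50 + ["inactive"] * 50)

# np.unique returns the sorted class names and, with return_inverse=True,
# an integer code for every sample
classes, integer_labels = np.unique(string_labels, return_inverse=True)

print(f"Classes: {classes}")
print(f"Label array dimensions: {string_labels.ndim}")
print(f"Number of labels: {len(integer_labels)}")
```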

Now we can initialize the classifier.

In [None]:
classifier = SimpleClassifier("adam", 0.001, 36, 2)

## 4. The JointTrainer or the brains of the operation
The JointTrainer is the main class. It takes care of training the models and makes them available for use afterwards. It makes use of all the components set up so far and calls everything as needed. It has the following parameters:

- model_dict: dictionary holding the initialized models
- discriminator: initialized discriminator object
- classifier: initialized classifier object (optional)

In [None]:
model = JointTrainer(
        model_dict = model_dict,
        discriminator = discriminator,
        classifier = classifier)

## 5. Launching Training
Starting the training procedure is done with a single function call to the model. The training call accepts the following inputs:
- train_data_dict: data dictionary containing the training data
- val_data_dict: data dictionary containing the validation data
- batch_size: integer for desired batch size
- max_epochs: the maximal number of epochs to train for
- recon_weight: multiplier for the autoencoder reconstruction loss (either integer or dictionary of string to integer for per model scaling)
- beta: float value for influence of variational loss on total loss
- disc_weight: float value for influence of discriminator on total loss
- anchor_weight: float value for influence of mean absolute error between latent samples on total loss
- cl_weight: float value for influence of sample classifier on total loss (optional)
- cluster_labels: numpy array of cluster labels (optional, only add if classifier was added in model initialization)
- log_path: string path to store training metric log to (optional, spills to console if not provided)
- use_gpu: boolean if gpu acceleration should be enabled
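As noted in the list above, `recon_weight` can be a single integer or a dictionary with one weight per model. The sketch below shows how the two forms relate, assuming the dictionary uses the same string keys as the model dictionary. `expand_recon_weight` is a hypothetical helper written for this illustration and does not reflect the package's internal code.

```python
def expand_recon_weight(recon_weight, modality_keys):
    """Turn a single weight or a per-model dictionary into a full dictionary.

    Hypothetical helper for illustration; not part of the package.
    """
    if isinstance(recon_weight, dict):
        missing = set(modality_keys) - set(recon_weight)
        if missing:
            raise KeyError(f"No recon_weight given for: {sorted(missing)}")
        return {key: recon_weight[key] for key in modality_keys}
    # a single number applies to every model
    return {key: recon_weight for key in modality_keys}


keys = ["modality_1", "modality_2", "modality_3"]

# one weight for all models
shared = expand_recon_weight(3, keys)

# a different weight per model
per_model = expand_recon_weight(
    {"modality_1": 3, "modality_2": 1, "modality_3": 1}, keys
)
print(shared)
print(per_model)
```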

The train call will return a meter dictionary. It contains meter objects for each loss that gets tracked during training and validation.

In [None]:
meter_dict = model.train(
    train_data_dict = train_data_dict,
    val_data_dict = val_data_dict,
    batch_size = 10,
    max_epochs = 10,
    recon_weight = 3,
    beta = 0.001,
    disc_weight = 3,
    anchor_weight = 1,
    cl_weight = 3,
    cluster_labels = cluster_data,
    use_gpu = False)

print(meter_dict["loss"].avg)
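The meter objects themselves are not described in this notebook. A common pattern for this kind of loss tracking is a running-average meter, sketched below. The class is an assumption made for illustration: only the `avg` attribute appears in the cell above, while `update`, `sum` and `count` are hypothetical and may differ from the package's actual meter class.

```python
class AverageMeter:
    """Running average of a value, weighted by batch size.

    Illustrative sketch only; the package's meter class may differ.
    """

    def __init__(self):
        self.sum = 0.0
        self.count = 0

    def update(self, value, n=1):
        # value is the mean loss of a batch with n samples
        self.sum += value * n
        self.count += n

    @property
    def avg(self):
        # average over all samples seen so far, 0.0 before the first update
        return self.sum / self.count if self.count else 0.0


meter = AverageMeter()
meter.update(1.0, n=10)  # first batch: mean loss 1.0 over 10 samples
meter.update(3.0, n=10)  # second batch: mean loss 3.0 over 10 samples
print(meter.avg)
```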

## 6. Using the model
Once you have trained the model and are satisfied with the parameters you have chosen, you can use the model for inference. For that the JointTrainer provides two functions: `forward` and `translate`.<br>
`forward` allows you to encode some data using a specific model, e.g. if you are interested in the latent representation. It also returns the reconstruction if you want to do further quality checks.<br>
`translate` allows you to translate some data from one modality to another. You will need to provide the names of the *from* and *to* models you want to use.

In [None]:
# first let's create a completely new numpy array
inference_data = rng.random(size = (100, 50))

# we can encode this array with the modality_1 model
reconstructed_inference, encoded_inference = model.forward("modality_1", inference_data)
print(f"Shape of inference data: {inference_data.shape}")
print(f"Shape of reconstructed inference data: {reconstructed_inference.shape}")
print(f"Shape of encoded inference data: {encoded_inference.shape}")

In [None]:
# we can also translate from modality_1 to modality_3

translated_inference = model.translate("modality_1", "modality_3", inference_data)

modality_3_shape = train_data_dict["modality_3"].shape
print(f"Shape of modality_3 data: {modality_3_shape}")
print(f"Shape of inference data: {inference_data.shape}")
print(f"Shape of translated inference data: {translated_inference.shape}")
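To make the shapes above easier to follow, here is a toy sketch of translation as "encode with the source model, decode with the target model". This is an assumption about how translation through a shared latent space usually works, not a description of the package's internals. The random linear maps stand in for trained networks, and `toy_translate` is a hypothetical name.

```python
import numpy as np

demo_rng = np.random.default_rng(0)
n_z = 36  # latent size used throughout this tutorial

# random linear maps standing in for a trained encoder and decoder
encoder_1 = demo_rng.normal(size=(50, n_z))   # modality_1 features -> latent
decoder_3 = demo_rng.normal(size=(n_z, 100))  # latent -> modality_3 features


def toy_translate(x, encoder_weights, decoder_weights):
    """Encode with the source model, then decode with the target model."""
    latent = x @ encoder_weights
    return latent @ decoder_weights


x = demo_rng.random(size=(100, 50))
y = toy_translate(x, encoder_1, decoder_3)
print(f"Input shape: {x.shape}")
print(f"Translated shape: {y.shape}")
```

The translated array has one row per input sample and as many columns as the target modality has features, which matches the shapes printed in the cell above.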

## 7. Saving the model
Once you are done training and using the model you might want to store it for later use or documentation. The JointTrainer provides the `save_model` function to store all parts of the model. All it needs is the path to a directory to store everything in. The directory will be created if it does not exist yet.

In [None]:
model.save_model("./test_save")

## 8. Loading an existing model
If you want to resume the training on an existing model or use a pretrained model for inference, you can provide the path to a stored model when initializing a joint trainer. **You will need to provide the model dict and support models again with the same parameters.** The newly defined objects will be primed with the stored weights internally. **Please make sure to use the same string keys in the model dictionary.** If you don't know the specific details for this checkpoint, please locate the README file inside the checkpoint directory containing all necessary information. **The optimizers will be set up automatically to the state of the checkpoint.**

In [None]:
model = JointTrainer(
        model_dict = model_dict,
        discriminator = discriminator,
        classifier = classifier,
        checkpoint_dir = "./test_save")

reconstructed_inference, encoded_inference = model.forward("modality_1", inference_data)
print(f"Shape of inference data: {inference_data.shape}")
print(f"Shape of reconstructed inference data: {reconstructed_inference.shape}")
print(f"Shape of encoded inference data: {encoded_inference.shape}")