# Beforehand

## Prerequisites

Note that you may want to run this Jupyter notebook in a virtual environment!  
To do so, create your environment with virtualenv or conda, activate it, install ipykernel, and add your virtual environment to the Jupyter kernels.  
You should find all the necessary information here: https://janakiev.com/blog/jupyter-virtual-envs/  

Here is what I personally did:  

<details>
    <summary>Click once on <font color="blue"><b>this text</b></font> to show/hide the commands I used</summary>

```bash
    virtualenv .venv
    source .venv/bin/activate
    pip install ipykernel
    python -m ipykernel install --name=venv_prescyent
```
</details>


You should now edit and run this Jupyter notebook in your browser, by starting Jupyter from your terminal with:  `jupyter notebook`  
You can also run Jupyter notebooks directly in VS Code, selecting your newly created kernel instead of the current selection in the top right corner.
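
To double-check that the notebook really runs on the kernel from your virtual environment, you can run a small check like this one in a cell. It is a minimal sketch using only the standard library. The helper name `running_in_virtualenv` is mine, and it only detects virtualenv/venv environments (for conda, look at the printed interpreter path instead).

```python
import sys


def running_in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtualenv/venv environment."""
    # Older virtualenv releases set sys.real_prefix instead of changing sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # In a venv, sys.prefix points to the environment and sys.base_prefix to the base install.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", running_in_virtualenv())
```

If the interpreter path does not point inside your `.venv` directory, the notebook is probably still using another kernel.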


# Let's get started

## Install the lib and download a dataset

You should find all the necessary information in the readme, and if not, it's the perfect occasion to tell me!  
Here we want to install the library from PyPI and load a dataset.  
Note that the PyPI install of PreScyent also installs all of its dependencies, including torch and CUDA, which can take a long time to install. You may want to install a custom version of torch beforehand that still matches the dependencies of PreScyent, so any torch above 2.0 (and below 3.0 if you are doing this tutorial from the future).  
In that case, choose a version of PyTorch for your environment here: https://pytorch.org/get-started/locally/  
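
If you installed torch yourself, you can check that the installed version falls in the expected range before installing the library. This is a minimal sketch under my own assumptions: the helper names are hypothetical, and I read "above 2.0 and below 3.0" as "major version 2", with 2.0 itself included.

```python
from importlib import metadata
from typing import Optional


def torch_version_ok(version: str) -> bool:
    """Check that a version string such as '2.1.0+cpu' is in the 2.x range."""
    # Assumption: any 2.x release is acceptable, so only the major number matters.
    major = int(version.split(".")[0])
    return major == 2


def installed_torch_version() -> Optional[str]:
    """Return the installed torch version, or None if torch is not installed."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


version = installed_torch_version()
if version is None:
    print("torch is not installed yet")
else:
    print(f"torch {version} in the expected range: {torch_version_ok(version)}")
```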


The dataset we want to download is the TeleopIcub dataset, which you can find here: https://zenodo.org/records/5913573  
or directly in its .hdf5 format here https://gitlab.inria.fr/hucebot/datasets/andydata-lab-prescientteleopicub (if you have the access rights)  

If you have the original data from the Zenodo website, you have to preprocess it into the library's format to be able to load it and use it in the library.  
Again, please check the readme for the instructions!  


<details>
    <summary>Click once on <font color="blue"><b>this text</b></font> to show/hide the commands I used</summary>

In the virtualenv, install a specific version of torch instead of letting the library choose from its dependencies
```bash
    pip install torch --index-url https://download.pytorch.org/whl/cpu
```

Install the lib from pypi
```bash
    pip install prescyent
```

Download and prepare dataset
```bash
    wget https://zenodo.org/records/5913573/files/AndyData-lab-prescientTeleopICub.zip
    unzip AndyData-lab-prescientTeleopICub.zip -d AndyData-lab-prescientTeleopICub/
    wget https://raw.githubusercontent.com/hucebot/prescyent/refs/heads/main/dataset_preprocessing/teleopicubdataset_to_hdf5.py
    python teleopicubdataset_to_hdf5.py --data_path AndyData-lab-prescientTeleopICub/
```

</details>

## Meet the Dataset and Trajectory

Load the downloaded and processed dataset using the corresponding dataset class.  


We use Config classes for our main classes such as Datasets, Predictors, and Scalers, plus a TrainingConfig for the Predictor's trainer.  
Such classes allow us to define default values and give some constraints or type hints about the possible inputs.  
If you use a code editor with autocompletion, the default values and types will be indicated. You can also refer to the user documentation for each config file here:  
https://hucebot.github.io/prescyent/configuration_files.html  
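
To give an idea of what such a config class brings, here is a minimal sketch with a standard-library dataclass standing in for the library's config classes. `ExampleDatasetConfig` and its fields are hypothetical names chosen for illustration, not classes or attributes from PreScyent.

```python
from dataclasses import dataclass


@dataclass
class ExampleDatasetConfig:
    """Hypothetical config: typed fields, default values, and a simple constraint."""

    history_size: int = 10
    future_size: int = 10
    batch_size: int = 128

    def __post_init__(self) -> None:
        # Constraint: every size must be a positive integer.
        for name in ("history_size", "future_size", "batch_size"):
            value = getattr(self, name)
            if not isinstance(value, int) or value < 1:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")


# Only override what differs from the defaults.
config = ExampleDatasetConfig(future_size=25)
print(config)
```

An invalid value such as `ExampleDatasetConfig(batch_size=0)` is rejected when the config is created, rather than failing later in the middle of a training run.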


Update the dataset's config and see the corresponding generated tensor pairs and plots using the functions below!
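
If you wonder what these tensor pairs look like, the idea can be sketched with a plain sliding window: each sample is a slice of past frames paired with the frames that follow it. This is only an illustration under my own assumptions. The function `make_pairs` is hypothetical and the library's sampling has more options. It uses the `history_size` and `future_size` names that come up later in this tutorial.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[List[float], List[float]]


def make_pairs(frames: Sequence[float], history_size: int, future_size: int) -> List[Pair]:
    """Slide a window over the frames and split it into (input, output) slices."""
    pairs: List[Pair] = []
    last_start = len(frames) - history_size - future_size
    for start in range(last_start + 1):
        split = start + history_size
        inputs = list(frames[start:split])
        outputs = list(frames[split:split + future_size])
        pairs.append((inputs, outputs))
    return pairs


# A toy trajectory of 8 frames with a single value per frame.
trajectory = list(range(8))
pairs = make_pairs(trajectory, history_size=3, future_size=2)
for inputs, outputs in pairs:
    print(inputs, "->", outputs)
```

With 8 frames, a history of 3 and a future of 2, this yields 4 pairs, from `[0, 1, 2] -> [3, 4]` to `[3, 4, 5] -> [6, 7]`.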

In [None]:
from prescyent.evaluator.plotting import plot_trajectory_feature_wise

#? Load and import TeleopIcubDataset here
#
#   We expect dataset = ...
#


# Show the shapes of the data from the dataloaders, as the models will see them
input_tensor, context, output_tensor = next(iter(dataset.train_dataloader()))
print("#######################")
print("TRAINING TENSOR SHAPES:")
print("#######################")
print(f"input_tensor has shape {input_tensor.shape}")
print(f"output_tensor has shape {output_tensor.shape}")
for context_key, context_tensor in context.items():
    print(f"context {context_key} has shape {context_tensor.shape}")
print("#######################")

# Plot data from a test trajectory itself
plot_trajectory_feature_wise(dataset.trajectories.test[0])

## Test Baselines on this dataset

Now that you have loaded a dataset, you want to run some predictors over it.
Let's meet the predictors with a very simple baseline, such as the ConstantPredictor or DelayedPredictor.
Instantiate one of these baselines below and see the resulting prediction plot.
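
To build an intuition before using the library classes, here is a toy sketch of what such baselines usually do. I assume here that a constant baseline repeats the last observed frame and that a delayed baseline replays the most recent observed frames. These are common definitions, and the function names are hypothetical, so check the documentation for the exact behavior of ConstantPredictor and DelayedPredictor.

```python
from typing import List, Sequence


def constant_baseline(history: Sequence[float], future_size: int) -> List[float]:
    """Repeat the last observed frame for every future step."""
    return [history[-1]] * future_size


def delayed_baseline(history: Sequence[float], future_size: int) -> List[float]:
    """Replay the `future_size` most recent observed frames (future_size >= 1)."""
    return list(history[-future_size:])


history = [0.0, 1.0, 2.0, 3.0]
print(constant_baseline(history, future_size=2))  # [3.0, 3.0]
print(delayed_baseline(history, future_size=2))   # [2.0, 3.0]
```

Neither of them learns anything from the data, which is what makes them useful reference points for the trained models later on.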


In [None]:
from prescyent.evaluator.plotting import plot_prediction_feature_wise


#? Load and import the ConstantPredictor baseline here
#
#  We expect baseline = ...
#


#? predict a new trajectory with the baseline
#
# We expect test_traj = ...
#           baseline_traj = ...
#           baseline_offset = ...
#

# we create a new predicted trajectory from a given predictor, built from the last predicted frame at each timestep
baseline_traj, baseline_offset = ...


#! Here we compare prediction with truth traj
# subset a truth trajectory from the original traj if needed, to compare fairly with prediction
truth_traj = test_traj.create_subtraj(
    dataset.config.out_points, dataset.config.out_features
)

# plot prediction along truth
plot_prediction_feature_wise(
    truth_traj,
    baseline_traj,
    offset=baseline_offset,
)
#? TRY ALSO THE OTHER TWO BASELINES
#? WHAT CAN YOU OBSERVE ABOUT THE PLOTS OF CONSTANT AND DELAYED BASELINES ?


# Notice that the predicted trajectory doesn't start at T = future_size,
# because we need an input of size history_size before predicting an output of size future_size.
# So the first predicted frame is actually at T = history_size + future_size



## Train a Predictor model and save it

Using our simplest architecture, the MlpPredictor, we'll see how to train and save an ML Predictor, using high-level methods based on the pytorch_lightning syntax.  

Here you'll see that each of our ML Predictors has its own specific PredictorConfig, which allows you to customize its layers and behavior.  
In addition to this config, to run a training you'll have to define a TrainingConfig object, which customizes the training process (number of epochs, early stopping patience, learning rate...).  

It's also the moment to introduce the enums, such as LossFunctions or LearningTypes.  
They are a standard we chose over Literals to describe a finite set of possibilities for a given config value, and to write cleaner conditions on these values in the code instead of manipulating strings or another type.
You'll find all of them importable from `prescyent.utils.enums`.  
You'll find more details about their values in the doc here: https://hucebot.github.io/prescyent/configuration_files.html  
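
As an illustration of why enums are convenient for this, here is a small standard-library sketch. `ExampleLoss` and its members are hypothetical and only mimic the idea; the real enums and their values are the ones in `prescyent.utils.enums`.

```python
from enum import Enum


class ExampleLoss(Enum):
    """Hypothetical enum, for illustration only."""

    MSE = "mse"
    L1 = "l1"


def describe(loss: ExampleLoss) -> str:
    """Branch on enum members instead of comparing raw strings."""
    if loss is ExampleLoss.MSE:
        return "mean squared error"
    if loss is ExampleLoss.L1:
        return "mean absolute error"
    raise ValueError(f"Unsupported loss: {loss!r}")


# A value read from a config file is converted once, and typos fail immediately.
loss = ExampleLoss("mse")
print(loss.name, "->", describe(loss))
```

Calling `ExampleLoss("mes")` raises a `ValueError` right away, whereas a mistyped plain string could travel silently through the code until a comparison fails to match.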

Once your model is trained, or during its training, you can monitor many training metrics using tensorboard, by providing the path to the directory where the model writes its logs (defined by its config's `save_path` argument, whose default value is "data/models"):
`tensorboard --logdir data/models`

In [None]:
from pathlib import Path

#? init and train a MlpPredictor with previous dataset
#
# We expect predictor = ...
#

# Save the predictor in an explicit directory describing the settings of the experiment
model_dir = (
    Path("tutorial")
    / "models"
    / f"{dataset.DATASET_NAME}"
    / f"{predictor.name}"
    / f"version_{predictor.version}"
)
print("Model directory:", model_dir)
predictor.save(model_dir)


## Load a model

You can load a Predictor from disk using its static method `load_pretrained`. You must provide as argument the path to the root directory of the model or directly to its config.json file.  
When loading a Predictor from disk, you may also choose on which `torch.device` you want to load your model's weights, by passing the device as an argument to the `load_pretrained` method of AutoPredictor or Predictor.  
Remember that when you create a model from scratch, the device is chosen through the `TrainingConfig.accelerator` attribute.  
The AutoPredictor class lets you load or build a Predictor based on its config file. The class has to be a class from the library for the AutoPredictor to recognize it (see in the user doc how to create a new predictor and add it to the AutoPredictor class).  
It is perfect for writing an evaluation script that is agnostic to the actual class of predictor.  


Note also that we still use the same loaded dataset, but the AutoDataset class also exists for the same purpose: reloading a Dataset from a dataset_config.
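
To picture how an "auto" class can pick the right class from a config file, here is a toy registry sketch using only the standard library. It is not the library's implementation: the classes, the `"name"` key, and the `load_from_config` function are all hypothetical. It only mirrors the idea of accepting either a model directory or a path to its config.json.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Union

REGISTRY: Dict[str, type] = {}


def register(cls: type) -> type:
    """Make a class discoverable by the name stored in its config."""
    REGISTRY[cls.NAME] = cls
    return cls


@register
class ToyConstant:
    NAME = "ToyConstant"

    def __init__(self, config: dict) -> None:
        self.config = config


@register
class ToyMlp:
    NAME = "ToyMlp"

    def __init__(self, config: dict) -> None:
        self.config = config


def load_from_config(path: Union[str, Path]):
    """Accept a model directory or a config.json path, then build the matching class."""
    path = Path(path)
    if path.is_dir():
        path = path / "config.json"
    config = json.loads(path.read_text())
    return REGISTRY[config["name"]](config)


with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "config.json"
    config_file.write_text(json.dumps({"name": "ToyMlp", "hidden_size": 64}))
    loaded = load_from_config(tmp)

print(type(loaded).__name__, loaded.config)
```

A class that is not in the registry cannot be found this way, which matches the remark above that the AutoPredictor only recognizes classes that were added to it.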


In [None]:
#? Load a predictor here
#
# We expect loaded_predictor = ...
#

# Log some information about the predictor
loaded_predictor.describe()

# Evaluate Predictor

We've seen qualitative evaluations in the previous sections; here we'll introduce some quantitative metrics.  
First, we'll use the `test` method of the Predictor class to run the predictor over the whole test dataloader and return some metrics:  
- ADE  
- FDE  
- MPJPE  

Again, you can monitor the results of this test method using tensorboard.  
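
For reference, here is a small pure-Python sketch of these three metrics as they are commonly defined: ADE (average displacement error) is the mean Euclidean distance over all predicted frames, FDE (final displacement error) is the distance at the last predicted frame, and MPJPE (mean per joint position error) averages the distance over the points of a frame. The function names are mine, and the library may aggregate over batches, points, or frames differently, so treat this as an illustration only.

```python
import math
from typing import List, Sequence

Point = Sequence[float]       # coordinates of one point (or joint)
Frame = Sequence[Point]       # all points at one timestep
Trajectory = Sequence[Frame]  # all timesteps of a prediction or ground truth


def frame_error(truth: Frame, pred: Frame) -> float:
    """Mean Euclidean distance over the points of one frame."""
    return sum(math.dist(t, p) for t, p in zip(truth, pred)) / len(truth)


def mpjpe_per_frame(truth: Trajectory, pred: Trajectory) -> List[float]:
    """Per-point position error, averaged over points, for each frame."""
    return [frame_error(t, p) for t, p in zip(truth, pred)]


def ade(truth: Trajectory, pred: Trajectory) -> float:
    """Average displacement error over all predicted frames."""
    errors = mpjpe_per_frame(truth, pred)
    return sum(errors) / len(errors)


def fde(truth: Trajectory, pred: Trajectory) -> float:
    """Displacement error at the final predicted frame."""
    return frame_error(truth[-1], pred[-1])


# Two frames with two 3D points each; only one point of the last frame is off, by 5 units.
truth = [[(0, 0, 0), (1, 1, 1)], [(1, 0, 0), (2, 1, 1)]]
pred = [[(0, 0, 0), (1, 1, 1)], [(1, 3, 4), (2, 1, 1)]]
print("MPJPE per frame:", mpjpe_per_frame(truth, pred))  # [0.0, 2.5]
print("ADE:", ade(truth, pred))  # 1.25
print("FDE:", fde(truth, pred))  # 2.5
```

With these definitions, ADE is simply the mean of the per-frame values, and FDE is the last of them.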



In [None]:
from prescyent.evaluator.plotting import plot_mpjpe, plot_prediction_feature_wise

#? Run the test method on your predictor, and use the two imported plot functions


# Compare Predictors

In addition to the results you can check with tensorboard,
our plot methods have a plural variant used to compare trajectories and MPJPE results.  
We also provide a runner method in the evaluator module to perform everything we did above with a list of trajectories and predictors, which also provides a summary of the evaluations.

In [None]:
from prescyent.evaluator.plotting import plot_mpjpes, plot_trajectories_feature_wise
from prescyent.evaluator.runners import eval_predictors, dump_eval_summary_list

#? Use the methods imported above to compare Predictors


# Extend the library for your own use cases

## Use the CustomDataset

As long as you have created the Trajectories object, you can benefit from the library's sampling and methods by passing your trajectories to a CustomDataset (you just won't benefit from the AutoDataset).
Note that for more permanent use of the library, you may prefer to create a new TrajectoryDataset instance with its own config class. Please check the user documentation for more information.

### Features and conversions

Example with a custom dataset having quaternions as trajectories that produces
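
To make the frame layout used in the next cell concrete: each generated frame holds 7 values for a single point, the x, y, z coordinates followed by a quaternion. SciPy's `as_quat()` returns quaternions in scalar-last order (x, y, z, w) by default. The sketch below converts such a quaternion to a rotation matrix by hand, as one example of a conversion between rotation representations. The function `quat_to_rotmat` is my own illustration, not the library's conversion code.

```python
import math
from typing import List, Sequence


def quat_to_rotmat(quat: Sequence[float]) -> List[List[float]]:
    """Convert a scalar-last quaternion (x, y, z, w) to a 3x3 rotation matrix."""
    x, y, z, w = quat
    # Normalize first so that slightly non-unit inputs still give a valid rotation.
    norm = math.sqrt(x * x + y * y + z * z + w * w)
    x, y, z, w = (c / norm for c in (x, y, z, w))
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


# One frame: position (1, 2, 3) and a rotation of 90 degrees around the z axis.
frame = [1.0, 2.0, 3.0, 0.0, 0.0, math.sqrt(0.5), math.sqrt(0.5)]
position, quaternion = frame[:3], frame[3:]
rotmat = quat_to_rotmat(quaternion)
for row in rotmat:
    print([round(value, 3) for value in row])
```

A rotation matrix needs 9 values per point instead of 4, so changing the rotation representation also changes the size of the feature dimension of the tensors.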


In [None]:
from scipy.spatial.transform import Rotation as R
import numpy as np
import torch

# This function creates a tensor with random quaternions and coordinates
def create_random_traj(num_frames: int):
    """x,y,z are linear here and a constant random rotation is generated"""
    linear_x_coordinates = torch.FloatTensor(np.linspace(0, 10, num_frames).tolist()).unsqueeze(0)
    linear_y_coordinates = torch.FloatTensor(np.linspace(0, 10, num_frames).tolist()).unsqueeze(0)
    linear_z_coordinates = torch.FloatTensor(np.linspace(0, 10, num_frames).tolist()).unsqueeze(0)
    random_quat = R.random().as_quat()
    random_quaternions = torch.FloatTensor([random_quat for _ in range(num_frames)])
    tensor = torch.cat((linear_x_coordinates, linear_y_coordinates, linear_z_coordinates)).transpose(0, 1)
    tensor = torch.cat((tensor, random_quaternions), dim=1)
    tensor = tensor.unsqueeze(1)
    return tensor

#? Create an instance of CustomDataset with generated Trajectories
#
# We expect custom_dataset = ...
#

#? Play with features and see the shapes of the tensors

input_tensor, context, output_tensor = next(iter(custom_dataset.train_dataloader()))
# Show the shapes of the data from the dataloaders, as the models will see them
print("#######################")
print("TRAINING TENSOR SHAPES:")
print("#######################")
print(f"input_tensor has shape {input_tensor.shape}")
print(f"output_tensor has shape {output_tensor.shape}")
for context_key, context_tensor in context.items():
    print(f"context {context_key} has shape {context_tensor.shape}")
print("#######################")
# custom feature with distance function


## Implement your own model

Create a torch module with a config and inherit from the base classes to create a custom predictor benefiting from common methods.  
You can use the structure of a simple baseline such as the MlpPredictor as an example.  

Train it (and save it!)  
Test it and plot it like the models above  
How does it compare?  


In [None]:

from typing import Dict
import torch
from prescyent.predictor.lightning.models.sequence.predictor import SequencePredictor
from prescyent.predictor.lightning.configs.module_config import ModuleConfig
from prescyent.predictor.lightning.torch_module import BaseTorchModule
from prescyent.utils.tensor_manipulation import self_auto_batch


class NewConfig(ModuleConfig):
    """New config for a lightning predictor with a torch module"""
    # pass keys and values here that you may want to see vary in your architecture, like:
    hidden_size: int = 128
    # you can add more constraints on your config's attribute, like validators or min/max values
    # see the Pydantic library's Documentation for more information, or check some examples in our code

class NewTorchModule(BaseTorchModule):
    """New torch module inheriting from forward's decorator methods.
        Create its init and forward methods as for any PyTorch module!
    """
    def __init__(self, config: NewConfig) -> None:
        super().__init__(config)
        # After the super().__init__(), you benefit from some information from the config, like these:
        # Assumption: the base module exposes in_* counterparts of the out_* attributes used below
        self.in_size = self.in_sequence_size * self.num_in_points * self.num_in_dims
        self.out_size = self.out_sequence_size * self.num_out_points * self.num_out_dims
        #
        # YOUR CODE HERE
        #

    @self_auto_batch  # <= auto-batches the input, and unbatches the output if the input tensor has only 3 dimensions
    @BaseTorchModule.deriv_tensor  # <= enables the behaviors described by `deriv_on_last_frame` and `deriv_output`
    def forward(
        self,
        input_tensor: torch.Tensor,
        future_size: int | None = None,
        context: Dict[str, torch.Tensor] | None = None,
    ) -> torch.Tensor:
        if future_size is None:  # future_size is optional for seq2seq predictors, so set a default before indexing on it
            future_size = self.out_sequence_size
        #
        # YOUR CODE HERE
        #

class NewPredictor(SequencePredictor):
    """New class used to connect the config and torch module
       while inheriting from all base methods"""

    PREDICTOR_NAME = "NewPredictor"
    """unique name for this predictor"""
    module_class = NewTorchModule
    """LightningModule class used in this predictor"""
    config_class = NewConfig
    """PredictorConfig class used in this predictor"""

    def __init__(self, config: NewConfig, skip_build: bool = False):
        super().__init__(config=config, name=self.PREDICTOR_NAME, skip_build=skip_build)