# Intro to DLC2Action (mini)

DLC2Action is a package for automatic behavior prediction. It offers implementations of state-of-the-art (SOTA) models and keeps track of your experiments.

To see how it works, we will experiment on a relatively small [publicly available](https://github.com/ETHZ-INS/DLCAnalyzer/tree/master/data/OFT) dataset (Sturman, 2020). Run the code below to download the data.

This is a minimal version of the tutorial. Check out demo_notebook.ipynb for more information.

Note that the results we get here are not optimal, because we use very few epochs and trials to keep the execution time short enough for a tutorial.

In [None]:
!pip install gdown
!gdown https://drive.google.com/uc?id=1c-dX7MtRGPSGSrNp3Uaf3aOIzokzuj69
!apt-get install unzip
!unzip OFT.zip -d OFT

... installation

for now:
```
git clone https://github.com/AlexEMG/DLC2Action
cd DLC2Action
conda create --name DLC2Action python=3.9
conda activate DLC2Action
python -m pip install .
```

In [None]:
from dlc2action.project import Project
import os

CURRENT_PATH = os.getcwd()
DATA_PATH = os.path.join(CURRENT_PATH, "OFT", "OFT", "Output_DLC")
LABELS_PATH = os.path.join(CURRENT_PATH, "OFT", "OFT", "Labels")
PROJECTS_PATH = os.path.join(CURRENT_PATH, "DLC2Action")
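
Before creating a project, it can help to confirm that the download and unzip steps produced the folders these paths point to. The helper below is a small sketch, not part of DLC2Action: `count_files` is a hypothetical name, and the demonstration runs on a temporary folder so it works even without the dataset. In the tutorial you would call it on `DATA_PATH` or `LABELS_PATH`.

```python
import os
import tempfile


def count_files(folder, suffix):
    """Return how many files in `folder` end with `suffix` (0 if the folder is missing)."""
    if not os.path.isdir(folder):
        return 0
    return sum(name.endswith(suffix) for name in os.listdir(folder))


# Demonstration on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as demo_folder:
    for name in ["video1.csv", "video2.csv", "notes.txt"]:
        open(os.path.join(demo_folder, name), "w").close()
    n_csv = count_files(demo_folder, ".csv")
    print(f"{n_csv} .csv files found")
```

A count of zero for `DATA_PATH` would suggest that the archive was extracted to a different location than the one assumed in the cell above.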

High-level methods in DLC2Action are almost exclusively accessed through the `dlc2action.project.Project` class. A project instance should loosely correspond to a specific goal (e.g. generating automatic annotations for dataset A with input format X). You can use it to optimize hyperparameters, run experiments, analyze results and generate new data.

**Best practices**
- When you need to do something with a different data type or unrelated files, it's better to create a new project to keep the experiment history easy to understand.
- Each project is associated with a folder on your computer that contains all settings, meta files and experiment outputs. These project folders are created inside the folder at `projects_path`. It's generally a good idea to choose one `projects_path` and use it for all of your projects.

## Creating a project

Let's begin!

We will create a project called `"oft"`, with `"dlc_track"` input and `"csv"` annotation format. 

You can run `Project.print_data_types()` and `Project.print_annotation_types()` to find out more about other options.

In [None]:
# Project.remove_project("oft", projects_path=PROJECTS_PATH)
project = Project(
    "oft",
    data_path=DATA_PATH,
    annotation_path=LABELS_PATH,
    projects_path=PROJECTS_PATH,
    data_type="dlc_track",
    annotation_type="csv",
)

## Setting parameters

After the project is created, it's time to configure the parameter settings. 

The first step is to check which essential parameters are missing with `project.list_blanks()`.

In [None]:
project.list_blanks()

We can copy this code, fill in the blanks and run it. 

We will also set the number of epochs here. Normally the default should be fine, but for this tutorial we set it lower so that our experiments finish in time.

In [None]:
project.update_parameters(
    {
        "data": {
            "data_suffix": "DeepCut_resnet50_Blockcourse1May9shuffle1_1030000.csv", # set; the data files should be named {video_id}{data_suffix}, e.g. video1_suffix.pickle, where video1 is the video id and _suffix.pickle is the suffix
            "canvas_shape": [928, 576], # list; the size of the canvas where the pose was defined
            "annotation_suffix": ".csv", # str | set, optional; the suffix or the set of suffixes such that the annotation files are named {video_id}{annotation_suffix}, e.g. video1_suffix.pickle, where video1 is the video id and _suffix.pickle is the suffix
        },
        "general": {
            "exclusive": True, # bool; if true, single-label classification is used; otherwise multi-label
        },
        "training": {
            "num_epochs": 15,
        }
    }
)
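
The `{video_id}{data_suffix}` naming convention described in the comments above can be illustrated with a few lines of plain Python. This is only a sketch of the convention, not DLC2Action's own file-matching code: `video_id` is a hypothetical helper and the `mouse_*` file names are made up.

```python
def video_id(filename, suffixes):
    """Return the video id if `filename` ends with one of `suffixes`, else None."""
    if isinstance(suffixes, str):
        suffixes = [suffixes]
    # try longer suffixes first so that the most specific one wins
    for suffix in sorted(suffixes, key=len, reverse=True):
        if filename.endswith(suffix):
            return filename[: len(filename) - len(suffix)]
    return None


data_suffix = "DeepCut_resnet50_Blockcourse1May9shuffle1_1030000.csv"
annotation_suffix = ".csv"

# made-up file names that follow the convention
data_files = ["mouse_a" + data_suffix, "mouse_b" + data_suffix]
annotation_files = ["mouse_a.csv", "mouse_c.csv"]

data_ids = {video_id(name, data_suffix) for name in data_files} - {None}
annotation_ids = {video_id(name, annotation_suffix) for name in annotation_files} - {None}

# videos that have both a pose file and an annotation file
matched = sorted(data_ids & annotation_ids)
print(matched)
```

Only videos whose id appears in both sets have a pose file and a label file, which is why the two suffixes need to be set precisely.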

Now we're all set and can start training models.

## Hyperparameter search

There are many hyperparameters in model training, such as the number of layers in a model or the loss coefficients. The default settings should give reasonable results on most datasets, but to get the most out of our data we can run a hyperparameter search.

The easiest way to find a good set of hyperparameters for your data is to run `project.run_default_hyperparameter_search()`.

In [None]:
project.run_default_hyperparameter_search(
    "test_search",
    num_epochs=10,
    n_trials=5,
)
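
To build some intuition for what `n_trials` means, here is a generic random search over a toy objective. It is not how DLC2Action implements its search, which this tutorial does not show. The parameter names `lr` and `num_layers` and the objective function are assumptions chosen purely for illustration.

```python
import math
import random


def toy_objective(params):
    """Stand-in for 'train a model and return a validation score' (higher is better)."""
    # the score peaks at lr = 1e-3 and num_layers = 4
    return -(math.log10(params["lr"]) + 3) ** 2 - 0.1 * (params["num_layers"] - 4) ** 2


def random_search(objective, n_trials, seed=0):
    """Sample `n_trials` random configurations and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "lr": 10 ** rng.uniform(-5, -1),  # log-uniform learning rate
            "num_layers": rng.randint(1, 8),  # inclusive on both ends
        }
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(toy_objective, n_trials=5)
print(best_params, best_score)
```

Each trial corresponds to one full evaluation of a configuration, so in a real search the cost grows with both `n_trials` and `num_epochs`. That is why both are kept small in this tutorial.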

## Training models

Now we can train models with the best hyperparameters.

In [None]:
project.run_episode(
    "test_best",
    load_search="test_search", # loading the search
    force=True, # when force=True, if an episode with this name already exists it will be overwritten -> use with caution!
)

## Evaluation

Now that we've trained our best models, we can analyze the results.

In [None]:
project.plot_episodes(
    ["test_best"],
    metrics=["f1"], # F1 score
    title="Best model training curve"
)

We can also compute additional metrics now. Run `project.help("metrics")` to see the other options.

In [None]:
project.evaluate(
    ["test_best"],
    parameters_update={
        "general": {"metric_functions": ["segmental_f1", "mAP", "f1"]},
        "metrics": {
            "f1": {"average": "none"}
        }
    }
)
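
Setting `"average": "none"` for `f1` presumably reports one score per behavior instead of a single averaged value. The sketch below shows how a frame-wise, per-class F1 score can be computed with NumPy. It is an independent illustration under that assumption, not DLC2Action's metric code, and the label arrays are made up.

```python
import numpy as np


def per_class_f1(y_true, y_pred, num_classes):
    """Return an array with one frame-wise F1 score per class."""
    scores = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator > 0 else 0.0)
    return np.array(scores)


# made-up labels for six frames and three classes
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

scores = per_class_f1(y_true, y_pred, num_classes=3)
print(scores)
```

Looking at per-class scores is useful when behaviors are imbalanced, because a single average can hide a class that the model rarely detects.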

## Using trained models

Once you are happy with the results, you can use the model to generate predictions for new data.

Predictions here are per-frame probabilities of each behavior, while suggestions are intervals derived from those probabilities.
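
One simple way to turn per-frame probabilities into intervals is to threshold them and keep the runs that are long enough. The sketch below does this with NumPy. The threshold, the minimum length and the function name are assumptions for illustration, and DLC2Action's own suggestion logic may differ.

```python
import numpy as np


def probabilities_to_intervals(probabilities, threshold=0.5, min_length=1):
    """Return [start, end) frame intervals where `probabilities` >= `threshold`."""
    active = np.asarray(probabilities) >= threshold
    # pad with False so that every run has an explicit start and end
    padded = np.concatenate(([False], active, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[::2], changes[1::2]
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_length]


# made-up probabilities of one behavior over nine frames
probs = np.array([0.1, 0.7, 0.9, 0.8, 0.2, 0.6, 0.1, 0.9, 0.9])

intervals = probabilities_to_intervals(probs, threshold=0.5, min_length=2)
print(intervals)
```

Here `min_length` discards runs that are too short to be a plausible behavior bout, such as the single frame at index 5 in the example.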

Let's generate a prediction with one of our models and look at one of the resulting files. Note that you can use multiple models and average over their predictions.

In [None]:
project.run_prediction(
    "test_best_prediction",
    episode_names=["test_best"],
    force=True
)
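
Averaging over several models amounts to taking the element-wise mean of their probability arrays. The sketch below shows the idea on synthetic data. The `(behaviors, frames)` layout is assumed from the way predictions are indexed later in this notebook, and the random arrays stand in for real model outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
n_behaviors, n_frames = 3, 100

# synthetic stand-ins for the outputs of four models on the same video
model_outputs = [rng.random((n_behaviors, n_frames)) for _ in range(4)]

# element-wise mean over the model axis
averaged = np.mean(np.stack(model_outputs), axis=0)
print(averaged.shape)
```

Averaging tends to smooth out the errors of individual models, at the cost of running each of them on the new data.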

In [None]:
import pickle
import os


# picking the first file that os.listdir returns (the order is arbitrary)
prediction_folder = project.prediction_path("test_best_prediction")
prediction_file = os.listdir(prediction_folder)[0]
prediction_file = os.path.join(prediction_folder, prediction_file)

with open(prediction_file, "rb") as f: # open the file
    prediction = pickle.load(f)

for key, value in prediction.items(): # explore the contents
    if key not in ["max_frames", "min_frames", "video_tag", "behaviors"]:
        print(f'{key}: {value.shape}')
    
behaviors_order = prediction["behaviors"]

start = 50
end = 70
action = "Unsupported"

index = behaviors_order.index(action)

print(f'The mean probability of {action} between frames {start} and {end} is {prediction["ind0"][index, start: end].mean()}')
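
Since this project uses single-label classification (`"exclusive": True`), another natural summary is the most likely behavior in each frame. The sketch below builds a synthetic dictionary with the same layout as in the cell above, assuming `"ind0"` is an array-like of shape `(behaviors, frames)`. Only the name `"Unsupported"` comes from this tutorial; the other behavior names and all of the values are made up.

```python
import numpy as np

behaviors = ["Grooming", "Supported", "Unsupported"]  # partly hypothetical names
rng = np.random.default_rng(0)

# synthetic probabilities: a softmax over random scores for each frame
logits = rng.normal(size=(len(behaviors), 200))
probs = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
prediction = {"ind0": probs, "behaviors": behaviors}

# index of the most probable behavior in every frame
best_index = prediction["ind0"].argmax(axis=0)
labels = [prediction["behaviors"][i] for i in best_index]

# fraction of frames assigned to each behavior
fraction = {name: float(np.mean(best_index == i)) for i, name in enumerate(behaviors)}
print(fraction)
```

With a real prediction file you would skip the synthetic part and apply the last three statements to the loaded `prediction` dictionary.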

Finally, we will remove the data we no longer need to free up space.

In [None]:
project.remove_saved_features()
project.remove_extra_checkpoints()