# ⚡ Generate a lightning-pose (LP) dataset and train a LP model ⚡
This notebook shows how to convert the DeepLabCut (DLC) project to Lightning-Pose (LP) format and train LP model step by step. 

Here, we take Han's bottom-view DLC project and VBN project for example.
<!-- * [Environment setup](#Environment-setup) -->
* [Data preparation](#Data-preparation)
<!-- * [Monitor optimization in real time (via TensorBoard UI)](#Monitor-training) -->
* [Training](#Training)


<div class="alert alert-block alert-info">
    
<b>Materials for Lightning Pose:</b>
    
- [Paper](https://www.biorxiv.org/content/10.1101/2023.04.28.538703v1) shows a detailed mathematical description of the LP algorithm.

- [GitHub](https://github.com/danbider/lightning-pose) and [Documentation](https://lightning-pose.readthedocs.io/en/latest/index.html) show how to implement LP.

- Reference for this notebook at [here](https://github.com/danbider/lightning-pose/blob/7da5b5e701cb315ffd6d3ac8847191ee6715c46e/scripts/litpose_training_demo.ipynb).

<div class="alert alert-block alert-info">
<b>Make sure to attach the data asset:</b>

To do so, go to data/:
* click the "Manage Data Assets" button 
* for Han's data: search "han_video_s3"
* for VBN data: search "vbn_dlc_all_4"
</div>

In [1]:
import hydra
from omegaconf import DictConfig, OmegaConf
import os
import lightning.pytorch as pl

from lightning_pose.utils import pretty_print_cfg
from lightning_pose.utils.io import (
    return_absolute_data_paths,
)
from lightning_pose.utils.scripts import (
    get_data_module,
    get_dataset,
    get_imgaug_transform,
    get_loss_factories,
    get_model,
    get_callbacks,
    calculate_train_batches,
)

from pathlib import Path
import pandas as pd
import numpy as np
from PIL import Image

import yaml
import os
from datetime import datetime
import shutil


from funcs import (
    get_keypoint_names,
    get_videos_in_dir,
    mask_df,
    closest_multiple128,
    dlc2lp
)

# Data preparation


**To create a LP dataset, follow these steps:**

- [Step 1: Converting the DLC project to Lightning Pose format](#Step-1:-Converting-the-DLC-project-to-Lightning-Pose-format)
- [Step 2: Update the yaml config file](#Step-2:-update-the-yaml-config-file)
- [Step 3: Check if the training data exist](#Step-3:-Check-if-the-training-data-exist)


### Step 1: Converting the DLC project to Lightning Pose format

**DeeplabCut assumes the following project directory structure:**
```console
    /path/to/DLC_project/
      ├── labeled-data/
      └── videos/
```
* `labeled-data`: This directory stores the frames used to create the DLC training dataset. Frames from different videos are stored in separate subdirectories. Each frame has a filename related to the temporal index within the corresponding video, which allows the user to trace every frame back to its origin.

* `videos`: Directory of video links or videos. 

An example of DLC project is `/root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/`



**Lightning Pose assumes the following [project directory structure](https://lightning-pose.readthedocs.io/en/latest/source/user_guide/directory_structure.html)**, as in the example dataset
provided in [mirror-mouse](https://github.com/danbider/lightning-pose/tree/main/data/mirror-mouse-example).
```console
    /path/to/LP_project/
      ├── <LABELED_DATA_DIR>/
      ├── <VIDEO_DIR>/
      └── <YOUR_LABELED_FRAMES>.csv
```
* `<YOUR_LABELED_FRAMES>.csv`: a table with keypoint labels (rows: frames; columns: keypoints). 
Note that this file can take any name, and needs to be specified in the config file under 
`data.csv_file`.

* `<LABELED_DATA_DIR>/`: contains images that correspond to the labels, and can include subdirectories.
The directory name, any subdirectory names, and image names are all flexible, as long as they are
consistent with the first column of `<YOUR_LABELED_FRAMES>.csv`.

* `<VIDEO_DIR>/`: when training semi-supervised models, the videos in this directory will be used 
for computing the unsupervised losses. This directory can take any name, and needs to be specified 
in the config file under `data.video_dir`.

Let's convert a DLC project to LP format

In [2]:
# ----------------------------------------------------------------------------------
# set up the path to DLC project, LP training data and LP outputs 
# ----------------------------------------------------------------------------------

# use Han's data
# set up the path to the DLC project
scorer_name  = "Han_behavior_data"   
project_name = "Foraging_Bot-Han_Lucas-2022-04-27" 
DLC_data_dir = os.path.join("/root/capsule/data/s3_video/DLC_projects", project_name)

# columns_to_pick indicates which keypoints included in training
# if columns_to_pick is null, using all the keypoints for training 
columns_to_pick = []
# assume dlc format
header_rows = [0, 1, 2]


# # use VBN data
# scorer_name  = "VBN_behavior_DLC"
# scorer_name  = "VBN_behavior_DLC_test"
# project_name = "face"
# DLC_data_dir = os.path.join("/root/capsule/data/vbn_dlc_all_4", project_name)
# columns_to_pick = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 23, 24, 27, 28, 39]
# header_rows = [0, 1, 2]



In [3]:
# ---------------------------------------
# get the videos which contain labeled frames
# ---------------------------------------
DLC_video_dir   = os.path.join(DLC_data_dir, 'videos/')

DLC_video_files = get_videos_in_dir(DLC_video_dir)
video_names = []
for video_file in DLC_video_files:
    video_names.append(video_file.split('/')[-1][:-4])
print(f"Video names: {video_names}")
print(f"The number of videos: {len(video_names)}")



video_dir: /root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/videos/
Video names: ['bottom_face_888-0000', 'bottom_face_6-0000', 'bottom_face_670-0000', 'bottom_face_115-0000', 'bottom_face_49-0000', 'bottom_face_533-0000', 'bottom_face_1-0000', 'bottom_face_186-0000', 'bottom_face_593-0000', 'bottom_face_110-0000', 'bottom_face_857-0000', 'bottom_face_37-0000', 'bottom_face_484-0000', 'bottom_face_5-0000', 'bottom_face_521-0000', 'bottom_face_718-0000', 'bottom_face_137-0000', 'bottom_face_164-0000', 'bottom_face_65-0000', 'bottom_face_116-0000', 'bottom_face_171-0000', 'bottom_face_205-0000', 'bottom_face_866-0000', 'bottom_face_114-0000', 'bottom_face_20-0000', 'bottom_face_101-0000', 'bottom_face_861-0000', 'bottom_face_41-0000', 'bottom_face_197-0000']
The number of videos: 29


In [4]:
# ---------------------------------------
# select videos you want to train on
# ---------------------------------------

# for Han's data
videos_picked = ['bottom_face_1-0000',
                 'bottom_face_5-0000',
                 'bottom_face_20-0000',
                 'bottom_face_49-0000',
                 'bottom_face_116-0000',
                 'bottom_face_484-0000',
                 'bottom_face_533-0000',
                 'bottom_face_670-0000']

# # for VBN data
# videos_picked = ['1128520325_585326_20210915.face', 
#                  '1122903357_570302_20210818.face', 
#                  '1052533639_530862_20200924.face', 
#                  ]

# if videos_picked is null, using all labels instead of only using the labels in selected videos
if len(videos_picked) == 0:
    videos_picked = video_names
num_videos = len(videos_picked)


In [5]:
# ---------------------------------------
# set up the path to the LP data and outputs
# ---------------------------------------
# LP_data_dir = os.path.join("/root/capsule/results",
#                             project_name)
# Path(LP_data_dir).mkdir(parents=True, exist_ok=True)

LP_data_dir = os.path.abspath("../results")
print(f"Save LP model to {LP_data_dir}")
LP_output_dir = os.path.join(LP_data_dir, "outputs")
Path(LP_output_dir).mkdir(parents=True, exist_ok=True)


Save LP model to /root/capsule/results


In [6]:
print(LP_data_dir)
! ls -la $LP_data_dir

/root/capsule/results
lrwxrwxrwx. 1 root root 8 Aug 26 16:50 /root/capsule/results -> /results


In [7]:
# ---------------------------------------
# Convert DLC project to LP format
# ---------------------------------------
print(f"Converting a DLC project to LP format .....")

# Call dlc2lp() to generate the LP dataset with the following directory struture
#     /path/to/LP_project/
#       ├── <LABELED_DATA_DIR>/
#       ├── <VIDEO_DIR>/
#       └── <YOUR_LABELED_FRAMES>.csv

# set up the path to <YOUR_LABELED_FRAMES>.csv
# LP_labels_file_all contains all the bodyparts
LP_labels_file_all = os.path.join(LP_data_dir, f"CollectedData.csv")

model_name = f"trained_with_{num_videos}videos" # the name for LP model
df_data = dlc2lp(DLC_data_dir, 
               LP_data_dir, 
               model_name, 
               LP_labels_file_all,
               videos_picked)


Converting a DLC project to LP format .....

Converting DLC project located at /root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27 to LP project located at /root/capsule/results
Start generating <YOUR_LABELED_FRAMES>.csv!
---- bottom_face_1-0000 ----
csv_file:/root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/labeled-data/bottom_face_1-0000/CollectedData_Han_Lucas.csv
The number of keypoint: 17
---- bottom_face_5-0000 ----
csv_file:/root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/labeled-data/bottom_face_5-0000/CollectedData_Han_Lucas.csv
The number of keypoint: 17
---- bottom_face_20-0000 ----
csv_file:/root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/labeled-data/bottom_face_20-0000/CollectedData_Han_Lucas.csv
The number of keypoint: 17
---- bottom_face_49-0000 ----
csv_file:/root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/labeled-data/bottom_face_49-0000/Collec

                                                                 

Moviepy - Done !
Moviepy - video ready /root/capsule/results/videos/bottom_face_1-0000.mp4
Working on: /root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/videos/bottom_face_5-0000.avi
Converting avi video files to be mp4 format!
outputfile: /root/capsule/results/videos/bottom_face_5-0000.mp4
Moviepy - Building video /root/capsule/results/videos/bottom_face_5-0000.mp4.
Moviepy - Writing video /root/capsule/results/videos/bottom_face_5-0000.mp4



                                                                 

Moviepy - Done !
Moviepy - video ready /root/capsule/results/videos/bottom_face_5-0000.mp4
Working on: /root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/videos/bottom_face_20-0000.avi
Converting avi video files to be mp4 format!
outputfile: /root/capsule/results/videos/bottom_face_20-0000.mp4
Moviepy - Building video /root/capsule/results/videos/bottom_face_20-0000.mp4.
Moviepy - Writing video /root/capsule/results/videos/bottom_face_20-0000.mp4



                                                                 

Moviepy - Done !
Moviepy - video ready /root/capsule/results/videos/bottom_face_20-0000.mp4
Working on: /root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/videos/bottom_face_49-0000.avi
Converting avi video files to be mp4 format!
outputfile: /root/capsule/results/videos/bottom_face_49-0000.mp4
Moviepy - Building video /root/capsule/results/videos/bottom_face_49-0000.mp4.
Moviepy - Writing video /root/capsule/results/videos/bottom_face_49-0000.mp4



                                                                 

Moviepy - Done !
Moviepy - video ready /root/capsule/results/videos/bottom_face_49-0000.mp4
Working on: /root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/videos/bottom_face_116-0000.avi
Converting avi video files to be mp4 format!
outputfile: /root/capsule/results/videos/bottom_face_116-0000.mp4
Moviepy - Building video /root/capsule/results/videos/bottom_face_116-0000.mp4.
Moviepy - Writing video /root/capsule/results/videos/bottom_face_116-0000.mp4



                                                                 

Moviepy - Done !
Moviepy - video ready /root/capsule/results/videos/bottom_face_116-0000.mp4
Working on: /root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/videos/bottom_face_484-0000.avi
Converting avi video files to be mp4 format!
outputfile: /root/capsule/results/videos/bottom_face_484-0000.mp4
Moviepy - Building video /root/capsule/results/videos/bottom_face_484-0000.mp4.
Moviepy - Writing video /root/capsule/results/videos/bottom_face_484-0000.mp4



                                                                 

Moviepy - Done !
Moviepy - video ready /root/capsule/results/videos/bottom_face_484-0000.mp4
Working on: /root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/videos/bottom_face_533-0000.avi
Converting avi video files to be mp4 format!
outputfile: /root/capsule/results/videos/bottom_face_533-0000.mp4
Moviepy - Building video /root/capsule/results/videos/bottom_face_533-0000.mp4.
Moviepy - Writing video /root/capsule/results/videos/bottom_face_533-0000.mp4



                                                                 

Moviepy - Done !
Moviepy - video ready /root/capsule/results/videos/bottom_face_533-0000.mp4
Working on: /root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/videos/bottom_face_670-0000.avi
Converting avi video files to be mp4 format!
outputfile: /root/capsule/results/videos/bottom_face_670-0000.mp4
Moviepy - Building video /root/capsule/results/videos/bottom_face_670-0000.mp4.
Moviepy - Writing video /root/capsule/results/videos/bottom_face_670-0000.mp4



                                                                 

Moviepy - Done !
Moviepy - video ready /root/capsule/results/videos/bottom_face_670-0000.mp4
Finish generating <VIDEO_DIR>/!
---------------------------------------------------------------------------

Start generating <LABELED_DATA_DIR>/!
Finish generating <LABELED_DATA_DIR>/!
---------------------------------------------------------------------------

The number of labeled frames: 154


In [8]:
print(LP_data_dir)
! ls -la $LP_data_dir

/root/capsule/results
lrwxrwxrwx. 1 root root 8 Aug 26 16:50 /root/capsule/results -> /results


In [9]:
# ---------------------------------------
# select keypoints to train
# we could include all keypoints or certain keypoints for training 
# LP_labels_file_all: <.csv> inludes all keypoints
# LP_labels_file_masked: <.masked.csv> only contains the columns of selected keypoints
# ---------------------------------------

# get the keypoints names
keypoint_names = get_keypoint_names(LP_labels_file_all, header_rows)
print(f"Keypoint names: {keypoint_names}, {len(keypoint_names)}")

# select keypoints for training
if len(columns_to_pick) == 0:
    keypoint_to_pick = keypoint_names # if columns_to_pick is null, using all the keypoints for training 
else:
    keypoint_to_pick = [ keypoint_names[col] for col in columns_to_pick ]
print(f"\nSelected keypoints: {keypoint_to_pick}, {len(keypoint_to_pick)}")

# extract the columns of selected keypoints
df_data_masked = mask_df(LP_labels_file_all, header_rows, keypoint_to_pick)

# save masked annotation files which only contains the labels of the selected keypoints
LP_labels_file_masked = LP_labels_file_all.replace(".csv", ".masked.csv")
df_data_masked.to_csv(LP_labels_file_masked)
print(f"\nSave masked LP label file to {LP_labels_file_masked}")

# ---------------------------------------
# set up the LP annotation file
# ---------------------------------------
LP_labels_file = LP_labels_file_masked 
# LP_labels_file = LP_labels_file_all
print(f"\nThe LP label file for training: {LP_labels_file}")

Keypoint names: ['tongueTip', 'tongueLeftFront', 'tongueRightFront', 'tongueLeftBack', 'tongueRightBack', 'LickportLeft', 'LickportRight', 'nosetip', 'jaw', 'pawL', 'pawR', 'WLup', 'WLmid', 'WLbot', 'WRup', 'WRmid', 'WRbot'], 17

Selected keypoints: ['tongueTip', 'tongueLeftFront', 'tongueRightFront', 'tongueLeftBack', 'tongueRightBack', 'LickportLeft', 'LickportRight', 'nosetip', 'jaw', 'pawL', 'pawR', 'WLup', 'WLmid', 'WLbot', 'WRup', 'WRmid', 'WRbot'], 17

Save masked LP label file to /root/capsule/results/CollectedData.masked.csv

The LP label file for training: /root/capsule/results/CollectedData.masked.csv


In [10]:
# ---------------------------------------
# change the working dir
# ---------------------------------------
# LP_output_dir = os.getcwd()
%pwd
%cd $LP_output_dir
%pwd
! ls -lt $LP_output_dir
print(f"\nStore training and testing results to {LP_output_dir}")


/results/outputs
total 0

Store training and testing results to /root/capsule/results/outputs


### Step 2: update the yaml config file

After generating LP dataset, you will need to update your config file with the correct paths. This file points to data directories, defines the type of models to fit, and specifies a wide range of hyperparameters. The default configuration file at [here](https://github.com/danbider/lightning-pose/blob/main/scripts/configs/config_default.yaml) enumerates all possible hyperparameters needed for building and training a model. See [here](https://lightning-pose.readthedocs.io/en/latest/source/user_guide/config_file.html) for more information.

**To create the yaml config file, follow these steps:**
* [(1) update the path to the training data](#(1)-update-the-path-to-the-training-data)
* [(2) update the testing video path](#(2)-update-the-testing-video-path)
* [(3) update the image dimensions ](#(3)-update-the-image-dimensions)
* [(4) update the keypoint info](#(4)-update-the-keypoint-info)
* [(5) set up training parameters](#(5)-set-up-training-parameters)
* [(6) set up unsupervised losses](#(6)-set-up-unsupervised-losses)
<!-- * [(7) set up the fully-supervised training](#(7)-set-up-fully-supervised-training) -->
* [(7) (Optional) transfer learning: load a pretrained model](#(7)-(Optional)-Transfer-learning:-load-a-pretrained-model)
* [(8) save the updated LP config file](#(8)-save-the-updated-LP-config-file)

<div class="alert alert-block alert-info"> 
 Below is a list of some commonly modified arguments in a LP config file related to model architecture/training. When training a model on a new dataset, you should copy/paste the default config and update the
arguments to match your data. 

  - data.csv_file: location of labels
- data.video_dir: location of unlabeled videos
- data.num_keypoints: total number of keypoints
- data.columns_for_singleview_pca: list of indices of keypoints used for pca singleview loss
<br/><br/>

- training.train_batch_size (default: `16`) - batch size for labeled data
- training.train_prob (default: `0.8`) - fraction of labeled data used for training
- training.val_prob (default: `0.1`) - fraction of labeled data used for validation (remaining used for test)
- training.min_epochs (default: `300`)
- training.max_epochs (default: `750`)
<br/><br/>

- model.model_type (default: `heatmap`)
  - regression: model directly outputs an (x, y) prediction for each keypoint; not recommended
  - heatmap: model outputs a 2D heatmap for each keypoint
  - heatmap_mhcrnn: the "multi-head convolutional RNN", this model takes a temporal window of
    frames as input, and outputs two heatmaps: one "context-aware" and one "static". The prediction
    with the highest confidence is automatically chosen. Must also set `model.do_context=True`.
- model.losses_to_use (default: `[]`) - this argument relates to the unsupervised losses. An empty
  list indicates a fully supervised model. Each element of the list corresponds to an unsupervised
  loss. For example,
  `model.losses_to_use=[pca_multiview,temporal]` will fit both a pca_multiview loss and a temporal
  loss. Options include:
  - pca_multiview: penalize inconsistencies between multiple camera views
  - pca_singleview: penalize implausible body configurations
  - temporal: penalize large temporal jumps
<br/><br/>

- eval.test_videos_directory - str with an absolute path to a directory containing videos for prediction.
- eval.confidence_thresh_for_vid (default: `0.9`) - confidence threshold for plotting a vid.



In [12]:
# load LP default config file, and update parameter wrt your own behavior data
LP_config_template = "/code/configs/config_default.yaml"
with open(LP_config_template, 'r') as file:
    param_updated = yaml.safe_load(file)

#### (1) update the path to the training data

In [13]:
# absolute path to a directory containing LP labeled frames. 
# Frames from different videos are stored in separate subdirectories.
param_updated['data']["data_dir"] = LP_data_dir 

# absolute path to the LP annotation file. 
# Each frame has a filename related to the temporal index within the corresponding video, 
# which allows the user to trace every frame back to its origin.
param_updated['data']["csv_file"] = LP_labels_file

# absolute path to a directory containing videos for training
LP_video_dir = os.path.join(LP_data_dir, 'videos/')
param_updated['data']["video_dir"] = LP_video_dir


#### (2) update the testing video path

In [14]:
# absolute path to a directory containing videos for prediction.
param_updated['eval']['test_videos_directory'] = LP_video_dir

#### (3) update the image dimensions 

In [15]:
# labels_df['scorer'].to_list()

In [16]:
# load ground truth
labels_df = pd.read_csv(LP_labels_file)

# get the absolute path to the first labeled frame 
frame_1st = os.path.join( LP_data_dir, labels_df['scorer'].to_list()[2])
print(f"The path to the first labeled frame:{frame_1st}")

# get the image dimension
image = Image.open(frame_1st).convert("RGB")
# set up resize dimension, LP requires its a multiple of 128 to accelerate training.
# Optional: limit image size to 640 to avoid OOM
param_updated['data']["image_resize_dims"]["width"]  = closest_multiple128(image.size[0]) if image.size[0] < 640 else 640
param_updated['data']["image_resize_dims"]["height"] = closest_multiple128(image.size[1]) if image.size[1] < 640 else 640


The path to the first labeled frame:/root/capsule/results/labeled-data/bottom_face_1-0000/img0026.png


#### (4) update the keypoint info

In [17]:
# get the keypoints names
keypoint_names = get_keypoint_names(LP_labels_file, header_rows)
num_bodyparts  = len(keypoint_names)
print(f"keypoint names: {keypoint_names}")
print(f"The number of keypoints: {num_bodyparts}")

param_updated['data']["num_keypoints"] = num_bodyparts


keypoint names: ['tongueTip', 'tongueLeftFront', 'tongueRightFront', 'tongueLeftBack', 'tongueRightBack', 'LickportLeft', 'LickportRight', 'nosetip', 'jaw', 'pawL', 'pawR', 'WLup', 'WLmid', 'WLbot', 'WRup', 'WRmid', 'WRbot']
The number of keypoints: 17


#### (5) set up training parameters

If training frames include both visible and occluded keypoint, Lightning pose will output the confidence value always close to 1 for occluded keypoints (i.e, tongue). 
To address this issue, LP has a non-default option that includes missing data in the loss by comparing the predicted heatmap to a uniform heatmap. 
To set the non-default option, add the following option to your config yaml file (under the "training" key):
```
training:
    uniform_heatmaps_for_nan_keypoints: true
```
See [here](https://lightning-pose.readthedocs.io/en/latest/source/faqs.html#faq-nan-heatmaps) for more information about why the network produce high confidence values for keypoints even when they are occluded.

In [18]:
param_updated["training"]["uniform_heatmaps_for_nan_keypoints"] = True

#### (6) set up unsupervised losses 
For a detailed mathematical description of the losses, see the [Lightning Pose paper](https://www.biorxiv.org/content/10.1101/2023.04.28.538703v1). 

See [here](https://lightning-pose.readthedocs.io/en/latest/source/user_guide_advanced/unsupervised_losses.html) for more details on how to use unsupervised losses (i.e., Temporal continuity, Pose plausibility and Multiview consistency) and set up hyperparameters.

If you plan to use the PCA losses (Pose PCA or multiview PCA) then all training images must be the same size, otherwise the PCA subspace will erroneously contain variance related to image size, see [here](https://github.com/danbider/lightning-pose/blob/47ee289110fb2ef2519091b49a5658fab07b4bf4/docs/source/user_guide/config_file.rst#L28) for more details.


To apply unsupervised losses on unlabeled video data, model.losses_to_use must be non-empty (which indicates a fully supervised model). 

In [19]:
# param_updated['model']["losses_to_use"] = ["pca_singleview"] # or multiple losses: [temporal,pca_singleview]

columns_for_singleview_pca: list of indices of keypoints used for pca singleview loss.

Ensure the number of samples is greater than the obervation dimensions (have more rows than columns after doing nan filtering)!
Since each keypoint is 2-dimensional (x, y coords), if there are K keypoints labeled on each frame then each pose 
is described by a 2K-dimensional vector. Therefore, at least 2K frames need to be labeled to compute the PCA subspace.

If the error massage "cannot fit PCA with N samples < M observation dimensions" occures, 
reselect or reduce the columns_for_singleview_pca or enlarge the training data size.

It is up to the user to select which keypoints are included in the Pose plausibility loss. 
Including static keypoints (e.g. those marking a corner of an arena) are generally not helpful. 
Also be careful to not include keypoints that are often occluded, like the tongue. If these keypoints 
are included the loss will try to localize them even when they are occluded, which might be unhelpful if you 
want to use the confidence of the outputs as a lick detector.

In [20]:
for ind, name in enumerate(keypoint_names):
    print(f"{ind}.{name}")

0.tongueTip
1.tongueLeftFront
2.tongueRightFront
3.tongueLeftBack
4.tongueRightBack
5.LickportLeft
6.LickportRight
7.nosetip
8.jaw
9.pawL
10.pawR
11.WLup
12.WLmid
13.WLbot
14.WRup
15.WRmid
16.WRbot


In [21]:
# param_updated['data']["columns_for_singleview_pca"] = [ i for i in range(num_bodyparts) ] 
# The numbers used should correspond to the order of the keypoints in the labeled csv file. 
param_updated['data']["columns_for_singleview_pca"] = [ 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 ] 

#### (7) (Optional) Transfer learning: load a pretrained model

model.checkpoint: to initialize weights from an existing checkpoint, update this parameter to the absolute path of a pytorch .ckpt file,
see [here](https://github.com/danbider/lightning-pose/blob/47ee289110fb2ef2519091b49a5658fab07b4bf4/docs/source/user_guide/config_file.rst#L28).
If you would like to train a model from scratch, just comment the following cell.

In [22]:
# pretrained_model_path = "/root/capsule/data/pretrained_models/Han_behavior_data/Foraging_Bot-Han_Lucas-2022-04-27/epoch=99-step=800_8videos.ckpt"
# param_updated['model']["checkpoint"] = pretrained_model_path

#### (8) save the updated LP config file

In [23]:
# absolute path to save updated LP config file
LP_config_file = os.path.join(LP_data_dir,  
                              'training.config.yaml') 

# save  
with open(LP_config_file, 'w') as yaml_file:
    yaml.dump(param_updated, yaml_file)

In [24]:
LP_config_file

'/root/capsule/results/training.config.yaml'

### Step 3: Check if the training data exist

```console
    /path/to/LP_project/
      ├── <LABELED_DATA_DIR>/
      ├── <VIDEO_DIR>/
      └── <YOUR_LABELED_FRAMES>.csv
```

In [25]:
# load config file
cfg = OmegaConf.load(LP_config_file)

# print("Our Hydra config file:")
# pretty_print_cfg(cfg)

# path handling for the dataset
data_dir, video_dir = return_absolute_data_paths(data_cfg=cfg.data)

# <LABELED_DATA_DIR>: cfg.data.data_dir, the absolute path to the labeled frames
assert os.path.isdir(cfg.data.data_dir), "data_dir not a valid directory"

# <VIDEO_DIR>: cfg.data.video_dir, the absolute path to the videos
assert os.path.isdir(cfg.data.video_dir), "video_dir not a valid directory"

# <YOUR_LABELED_FRAMES>.csv: cfg.data.csv_file, the absolute path to the annotation file
df_tmp = pd.read_csv(cfg.data.csv_file, header=header_rows, index_col=0)
for img in df_tmp.index:
    assert os.path.exists(os.path.join(cfg.data.data_dir, img))
    
print(f"LP training data: {data_dir}")
print(f"LP training videos: {video_dir}")
print(f"LP annotation file: {cfg.data.csv_file}")

LP training data: /root/capsule/results
LP training videos: /root/capsule/results/videos/
LP annotation file: /root/capsule/results/CollectedData.masked.csv


# Training

In [26]:
! pwd

/results/outputs


In [27]:
# build dataset, model, and trainer

# make training short for a demo (we usually do 300)
# (approx 2 mins for training Han's data using fully-supervised learning with epoch=55)
# (approx 6 mins for training Han's data using semi-supervised learning (losses_to_use: ['pca_singleview']) with epoch=55)
cfg.training.min_epochs = 100
cfg.training.max_epochs = 200
cfg.training.batch_size = 8

# build imgaug transform
imgaug_transform = get_imgaug_transform(cfg=cfg)

# build dataset
dataset = get_dataset(cfg=cfg, data_dir=data_dir, imgaug_transform=imgaug_transform)

# build datamodule; breaks up dataset into train/val/test
data_module = get_data_module(cfg=cfg, dataset=dataset, video_dir=video_dir)

# build loss factory which orchestrates different losses
loss_factories = get_loss_factories(cfg=cfg, data_module=data_module)

# build model
model = get_model(cfg=cfg, data_module=data_module, loss_factories=loss_factories)


# ----------------------------------------------------------------------------------
# Set up and run training
# ----------------------------------------------------------------------------------

# logger
logger = pl.loggers.TensorBoardLogger("tb_logs", name=cfg.model.model_name)

# early stopping, learning rate monitoring, model checkpointing, backbone unfreezing
callbacks = get_callbacks(cfg)

# calculate number of batches for both labeled and unlabeled data per epoch
limit_train_batches = calculate_train_batches(cfg, dataset)

# set up trainer
trainer = pl.Trainer(
    accelerator="gpu",
    devices=1,
    max_epochs=cfg.training.max_epochs,
    min_epochs=cfg.training.min_epochs,
    check_val_every_n_epoch=cfg.training.check_val_every_n_epoch,
    log_every_n_steps=cfg.training.log_every_n_steps,
    callbacks=callbacks,
    logger=logger,
    limit_train_batches=limit_train_batches,
)


using dlc image augmentation pipeline

 Initializing a HeatmapTracker instance.
Downloading: "https://download.openmmlab.com/mmpose/animal/resnet/res50_ap10k_256x256-35760eb8_20211029.pth" to /root/.cache/torch/hub/checkpoints/res50_ap10k_256x256-35760eb8_20211029.pth


100%|██████████| 130M/130M [00:03<00:00, 45.0MB/s] 
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


In [28]:
start_time = datetime.now()

# train model!
# Train the model 
# (approx 2 mins for training Han's data using fully-supervised learning with epoch=55)
# (approx 6 mins for training Han's data using semi-supervised learning (losses_to_use: ['pca_singleview']) with epoch=55)
trainer.fit(model=model, datamodule=data_module)

end_time = datetime.now()
print('\nTraining duration: {}'.format(end_time - start_time))

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type        | Params | Mode 
----------------------------------------------------------
0 | backbone          | Sequential  | 23.5 M | train
1 | loss_factory      | LossFactory | 0      | train
2 | upsampling_layers | Sequential  | 81.0 K | train
----------------------------------------------------------
23.6 M    Trainable params
0         Non-trainable params
23.6 M    Total params
94.356    Total estimated model params size (MB)
154       Modules in train mode
0         Modules in eval mode


Number of labeled images in the full dataset (train+val+test): 154
Dataset splits -- train: 147, val: 7, test: 0


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

`Trainer.fit` stopped: `max_epochs=200` reached.



Training duration: 0:07:52.602906


In [29]:
LP_output_dir

'/root/capsule/results/outputs'

In [30]:

# Once training has completed, use the checkpoint that corresponds to the best performance you found during the training process.
# a checkpoint: a version of the model. 
# check the trained model
! ls -lt "/root/capsule/results/outputs/tb_logs/test/version_0/checkpoints/"


total 276888
-rw-r--r--. 1 root root 283529831 Aug 26 17:00 'epoch=169-step=1700-best.ckpt'


## Copy best_model to the results folder

In [31]:
hydra_output_directory = os.getcwd()
print(f"Hydra output directory: {hydra_output_directory}")

# get best ckpt
best_ckpt = os.path.abspath(trainer.checkpoint_callback.best_model_path)
print(f"Best checkpoint: {best_ckpt}")

# check if best_ckpt is a file
if not os.path.isfile(best_ckpt):
    raise FileNotFoundError("Cannot find checkpoint. Have you trained for too few epochs?")

print(f"Copying {best_ckpt} to {LP_data_dir}")
shutil.copy(best_ckpt, LP_data_dir)
print(f"Finish copying model")


Hydra output directory: /results/outputs
Best checkpoint: /results/outputs/tb_logs/test/version_0/checkpoints/epoch=169-step=1700-best.ckpt
Copying /results/outputs/tb_logs/test/version_0/checkpoints/epoch=169-step=1700-best.ckpt to /root/capsule/results
Finish copying model


### (optional) delete outputs/ folder 
Below cell is used to delete outputs/ folder to only keep one .ckpt file in the results/ for a pretrained model data asset.

In [32]:
# if os.path.exists(hydra_output_directory):
#     shutil.rmtree(hydra_output_directory)