# DataJoint Element for Pose Estimation with DeepLabCut

**Open-source Data Pipeline for Markerless Pose Estimation in Neurophysiology**

This tutorial walks through the open-source data pipeline provided by `Element-DeepLabCut`. 

![pipeline](../images/flowchart.svg)

The package is designed to facilitate pose estimation analyses and streamline the organization of data using `DataJoint`. 

![pipeline](../images/pipeline.svg)

By the end of this tutorial, participants will have a clear grasp of how to set up, use, and optimize the package for their specific pose estimation projects. 

**Key Components and Objectives**

- Setup

- Design the DataJoint Pipeline

- Step 1 - Register an existing model in DataJoint pipeline

- Step 2 - Insert Subject, Session, and Behavior Videos

- Step 3 - DLC inference task

- Step 4 - Visualization of results

For detailed documentation and tutorials on the general DataJoint principles that support collaboration, automation, reproducibility, and visualization, see:

[`DataJoint for Python - Interactive Tutorials`](https://github.com/datajoint/datajoint-tutorials) - Fundamentals including table tiers, query operations, fetch operations, automated computations with the make function, etc.

[`DataJoint for Python - Documentation`](https://datajoint.com/docs/core/datajoint-python/0.14/)

[`DataJoint Element for DeepLabCut - Documentation`](https://datajoint.com/docs/elements/element-deeplabcut/0.2/)

## Setup

#### Steps to run the Element

The Element assumes you:

- Have a DLC project folder on your machine
- Have labeled data in your DLC project folder

This tutorial includes a DLC project folder with example data and its results in `example_data`. The example studies the behavior of a freely moving mouse in an open-field environment. The objective is to extract pose estimates of the animal's head and tail base from video footage. This information can provide insight into the animal's movements, postures, and interactions within the environment. The results of this Element example could be combined with other modalities to assemble a complete pipeline. 

After running this tutorial, you can try `Element-DeepLabCut` with your own dataset. To do so, create a new DeepLabCut project folder with your own videos and a training dataset. Then update the path in the configuration file (`config.yaml`) of the new project folder so that it points to the new location.
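
The path update can be sketched as follows. This is a minimal illustration, assuming the entry that needs changing is the `project_path` line of `config.yaml`; the helper `set_project_path` and the folder name `my_dlc_project` are hypothetical and not part of the Element. The file is edited line by line so that the rest of the configuration, including its comments, stays untouched.

```python
from pathlib import Path
import tempfile


def set_project_path(config_file: Path, new_project_path: Path) -> None:
    """Rewrite the `project_path:` line of a config file; keep all other lines."""
    lines = config_file.read_text().splitlines()
    updated = [
        f"project_path: {new_project_path.as_posix()}"
        if line.startswith("project_path:")
        else line
        for line in lines
    ]
    config_file.write_text("\n".join(updated) + "\n")


# Demonstration on a throwaway file that stands in for a real config.yaml.
with tempfile.TemporaryDirectory() as tmp:
    project_dir = Path(tmp) / "my_dlc_project"  # hypothetical project folder
    project_dir.mkdir()
    config_file = project_dir / "config.yaml"
    config_file.write_text("Task: my_task\nproject_path: /old/location\n")

    set_project_path(config_file, project_dir)
    new_config_text = config_file.read_text()

print(new_config_text)
```

Paths containing characters that are special in YAML may need quoting, which this sketch does not handle.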

#### Challenges
**Complex Background**: The open field environment introduces complex backgrounds and varying lighting conditions, making accurate pose estimation challenging.

**Multiple Body Parts**: Extracting the pose of multiple body parts (head, tail) adds complexity to the analysis due to potential occlusions and variations in appearance.

**Data Management**: Managing the large volume of video data generated in the field and ensuring consistent annotation requires an efficient data pipeline.

### Expected Outcomes
Upon completing this tutorial, you will have hands-on experience using the `Element-DeepLabCut` package to address these pose estimation challenges. 

This tutorial and sample dataset serve as a practical foundation for applying the same techniques to your own research projects. 

Combining this Element with other DataJoint Elements, as done later in this tutorial with Element Lab, Element Animal, and Element Session, yields a single pipeline that keeps subject, session, and pose estimation data together. 

In [None]:
import os
if os.path.basename(os.getcwd())=='notebooks': os.chdir('..')
assert os.path.basename(os.getcwd())=='element-deeplabcut', ("Please move to the "
                                                              + "element directory")

Start by importing the packages necessary to run this pipeline.

In [None]:
import datajoint as dj
from pathlib import Path
import yaml

Let's connect to the database server. 

In [None]:
dj.conn()

## Design the DataJoint Pipeline

### Combine multiple Elements into a pipeline

Each DataJoint Element is a modular set of tables that can be combined into a complete pipeline.

Each Element contains one or more modules, and each module declares its own schema in the database. Schemas are conceptually related sets of tables. 

This tutorial pipeline is assembled from four DataJoint Elements.

| Element | Source Code | Documentation | Description |
| -- | -- | -- | -- |
| Element Lab | [Link](https://github.com/datajoint/element-lab) | [Link](https://datajoint.com/docs/elements/element-lab) | Lab management related information, such as Lab, User, Project, Protocol, Source. |
| Element Animal | [Link](https://github.com/datajoint/element-animal) | [Link](https://datajoint.com/docs/elements/element-animal) | General subject metadata, genotype, and surgery information. |
| Element Session | [Link](https://github.com/datajoint/element-session) | [Link](https://datajoint.com/docs/elements/element-session) | General information of experimental sessions. |
| Element DeepLabCut | [Link](https://github.com/datajoint/element-deeplabcut) | [Link](https://datajoint.com/docs/elements/element-deeplabcut) | DataJoint schemas (Train and Model) for storing and running analysis of markerless pose estimation with DeepLabCut. |

The Elements are imported and activated in the next code cell.

In [None]:
from tutorial_pipeline import lab, subject, session, train, model  

Importing the modules for the first time creates the schemas and tables in the database.  

In [None]:
dj.list_schemas()

In [None]:
dj.config

In [None]:
(
    dj.Diagram(subject) 
    + dj.Diagram(lab) 
    + dj.Diagram(session) 
    + dj.Diagram(model) 
    + dj.Diagram(train)
)

In [None]:
dj.Diagram(model) + dj.Diagram(train)

## Step 1 - Register an existing model in DataJoint pipeline

A DeepLabCut model is defined by a DLC-specific folder structure that contains a file named `config.yaml` with the specifications of the model.

To "register" this DLC model with DataJoint, you only need to specify this config file, as in the example below.

In [None]:
config_file_rel = "./example_data/inbox/from_top_tracking-DataJoint-2023-10-11/config.yaml"

In [None]:
model.Model.insert_new_model(model_name='from_top_tracking_model_test',
                             dlc_config=config_file_rel,
                             shuffle=1,
                             trainingsetindex=0,
                             model_description='Model in example data: from_top_tracking model')

## Step 2 - Insert Subject, Session, and Behavior Videos

In [None]:
subject.Subject()

In [None]:
# Insert one subject into the Subject table
subject.Subject.insert1(
    dict(
        subject="subject6",
        sex="F",
        subject_birth_date="2020-01-01",
        subject_description="hneih_E105",
    ),
    skip_duplicates=True,
)

In [None]:
# Define a list of session keys (one dictionary per session) named "session_keys"
session_keys = [
    dict(subject="subject6", session_datetime="2021-06-02 14:04:22"),
    dict(subject="subject6", session_datetime="2021-06-03 14:43:10"),
]

# Insert these keys into the Session table
session.Session.insert(session_keys, skip_duplicates=True)
session.Session()

In [None]:
### VideoRecording
recording_key = {'subject': 'subject6',
                 'session_datetime': '2021-06-02 14:04:22',
                 'recording_id': '1'}
model.VideoRecording.insert1({**recording_key, 'device': 'Camera1'}, skip_duplicates=True)

In [None]:
### VideoRecording.File

video_files = ["./example_data/inbox/from_top_tracking-DataJoint-2023-10-11/videos/test.mp4"]

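# Build one row per video file; skip_duplicates lets this cell be re-run safely
model.VideoRecording.File.insert(
    [
        {**recording_key, 'file_id': v_idx, 'file_path': Path(f)}
        for v_idx, f in enumerate(video_files)
    ],
    skip_duplicates=True,
)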
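
To see what is being inserted, the rows can be built and printed on their own, without a database connection. The sketch below repeats the key and file list from the cells above and is only an illustration of the row structure: each video file becomes one dictionary that combines the recording key with a running `file_id`. Converting the path with `as_posix()` is a choice made here for readable output, not something the Element requires.

```python
from pathlib import Path

# Same recording key and video list as in the cells above.
recording_key = {
    "subject": "subject6",
    "session_datetime": "2021-06-02 14:04:22",
    "recording_id": "1",
}
video_files = [
    "./example_data/inbox/from_top_tracking-DataJoint-2023-10-11/videos/test.mp4"
]

# One row per video: the recording key plus a file index and the file path.
rows = [
    {**recording_key, "file_id": v_idx, "file_path": Path(f).as_posix()}
    for v_idx, f in enumerate(video_files)
]

for row in rows:
    print(row)
```

With several videos in `video_files`, `enumerate` assigns `file_id` values 0, 1, 2, and so on, all sharing the same recording key.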

In [None]:
### RecordingInfo
model.RecordingInfo.populate()
model.RecordingInfo()

Element DeepLabCut can also train a new model. To train the network, a parameter set must first be added to the `TrainingParamSet` table of the training module (`train`), whose diagram is shown by the next cell.

In [None]:
dj.Diagram(train)

## Step 3 - DLC inference task

In [None]:
recording_key

In [None]:
task_key = {**recording_key, 'model_name': 'from_top_tracking_model_test'}

In [None]:
model.PoseEstimationTask.insert1(
    {**task_key,
     'task_mode': 'load',
     'pose_estimation_output_dir': './example_data/outbox/from_top_tracking-DataJoint-2023-10-11/videos/device_1_recording_1_model_from_top_tracking_100000_maxiters'
     })

In [None]:
### PoseEstimation
model.PoseEstimation.populate()

In [None]:
### Results
model.PoseEstimation.BodyPartPosition()

In [None]:
df = (model.PoseEstimation.BodyPartPosition & task_key).fetch(format='frame').reset_index()

In [None]:
df

In [None]:
df = df.explode(['frame_index', 'x_pos', 'y_pos', 'likelihood']).reset_index()
df
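
The `explode` call above suggests that each fetched row holds whole arrays, one entry per video frame, for a single body part. The following sketch shows the effect on a small toy table; the values are made up for illustration, and plain lists stand in for the stored arrays. Exploding several columns at once requires pandas 1.3 or later and equal-length sequences within each row.

```python
import pandas as pd

# Toy stand-in for the fetched table: one row per body part,
# with every per-frame quantity stored as a sequence.
toy = pd.DataFrame(
    {
        "body_part": ["head", "tailbase"],
        "frame_index": [[0, 1, 2], [0, 1, 2]],
        "x_pos": [[10.0, 11.0, 12.0], [20.0, 21.0, 22.0]],
        "y_pos": [[5.0, 5.5, 6.0], [7.0, 7.5, 8.0]],
        "likelihood": [[0.99, 0.98, 0.97], [0.95, 0.96, 0.94]],
    }
)

# Expand the sequences so that each row describes one body part in one frame.
long_df = toy.explode(["frame_index", "x_pos", "y_pos", "likelihood"])
long_df = long_df.reset_index(drop=True)

print(long_df)
```

Here `reset_index(drop=True)` discards the repeated row labels left by `explode`, whereas the cell above keeps them as an extra column.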

## Step 4 - Visualization of results

In [4]:
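import pandas as pd
# get_trajectory is expected to return columns indexed by (scorer, bodyparts, coords)
df = model.PoseEstimation.get_trajectory(task_key)
df_xy = df.iloc[:, df.columns.get_level_values(2).isin(["x", "y"])][model.Model.fetch1("model_name")]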
df_xy.mean()
df_xy.plot().legend(loc="right")

In [None]:
df_flat = df_xy.copy()
df_flat.columns = df_flat.columns.map('_'.join)
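
The column flattening can be checked on a toy frame. This sketch assumes two-level columns of body part and coordinate, with the body part names `Head` and `Tailbase` taken from the plotting cell in this step; the numbers are made up. Mapping `'_'.join` over the column tuples produces single-level names such as `Head_x`.

```python
import pandas as pd

# Two-level columns: (body part, coordinate).
columns = pd.MultiIndex.from_product(
    [["Head", "Tailbase"], ["x", "y"]], names=["bodyparts", "coords"]
)
toy_xy = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]], columns=columns
)

# Join each (body part, coordinate) tuple into one flat column name.
toy_flat = toy_xy.copy()
toy_flat.columns = toy_flat.columns.map("_".join)

print(list(toy_flat.columns))
# ['Head_x', 'Head_y', 'Tailbase_x', 'Tailbase_y']
```

Flat names make it straightforward to pass column names to `DataFrame.plot` through its `x` and `y` arguments.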


In [None]:
import matplotlib.pyplot as plt 
fig,ax=plt.subplots()
df_flat.plot(x='Head_x',y='Head_y', ax=ax)
df_flat.plot(x='Tailbase_x',y='Tailbase_y', ax=ax)