## Position using DeepLabCut from a Pre-Trained DLC Project

**Note: make a copy of this notebook and run the copy to avoid git conflicts in the future**

This tutorial shows how to extract position from a pre-trained DeepLabCut (DLC) model using the Spyglass pipeline used in Loren Frank's lab at UCSF. It walks through adding your DLC model to Spyglass, running pose estimation on a novel behavioral video, processing the pose estimation output to extract a centroid and orientation, and inserting the resulting information into the `PositionOutput` table.<br>
-> This tutorial assumes you've completed [tutorial 0](0_intro.ipynb)<br>
**Note 2: Make sure you are running this within the spyglass Conda environment**

In [None]:
from pathlib import Path, PosixPath, PurePath
import os
import numpy as np
import pandas as pd
import pynwb
import datajoint as dj
import spyglass.common as sgc
import spyglass.position.v1 as sgp
from spyglass.position import PositionOutput

#### Here is a schematic showing the tables used in this notebook.<br>
![dlc_existing.png|2000x900](./../notebook-images/dlc_existing.png)

### Table of Contents<a id='ToC'></a>
[`DLCProject`](#DLCProject)<br>
[`DLCModel`](#DLCModel)<br>
[`DLCPoseEstimation`](#DLCPoseEstimation)<br>
[`DLCSmoothInterp`](#DLCSmoothInterp)<br>
[`DLCSmoothInterpCohort`](#DLCSmoothInterpCohort)<br>
[`DLCCentroid`](#DLCCentroid)<br>
[`DLCOrientation`](#DLCOrientation)<br>
[`DLCPos`](#DLCPos)<br>
[`DLCPosVideo`](#DLCPosVideo)<br>
[`PositionOutput`](#PositionOutput)<br>
[`PositionVideo`](#PositionVideo)<br>

#### [DLCProject](#ToC) <a id='DLCProject'></a>
__You can click on any header to return to the Table of Contents__

Let us begin with visualizing the contents of the BodyPart table. This table will store standard names of body parts used within DLC models throughout the lab with a concise description.<br>
>*Please do not add to this table unless necessary.*

In [None]:
sgp.BodyPart()

To use an existing DLC project we can use the `insert_existing_project` method on the `DLCProject` table.<br>This function will return a dictionary that can be used to query `DLCProject` in the future and expects:<br>
>`project_name`: a short, unique, descriptive name of your project that will be referenced throughout the pipeline<br>`lab_team`: the name of your team from the Spyglass table `LabTeam`<br>`config_path`: string of the path to your existing DLC project's config.yaml<br>`bodyparts`: a list of bodyparts used in your project (optional)<br>`frames_per_video`: number of frames to extract for training from each video (optional)

In [None]:
project_name = "tutorial_DG"
lab_team = "LorenLab"
project_key = sgp.DLCProject.insert_existing_project(
    project_name=project_name,
    lab_team=lab_team,
    config_path="/nimbus/deeplabcut/projects/tutorial_model-LorenLab-2022-07-15/config.yaml",
    bodyparts=["redLED_C", "greenLED", "redLED_L", "redLED_R", "tailBase"],
    frames_per_video=200,
    skip_duplicates=True,
)
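
A mistyped `config_path` is easy to miss. As a minimal sketch (the helper `check_config_path` is hypothetical and not part of Spyglass), a check like the one below could be run before calling `insert_existing_project` to confirm that the path points at an existing `config.yaml`. The temporary directory only stands in for a real DLC project folder.

```python
from pathlib import Path
import tempfile


def check_config_path(config_path):
    """Return a Path if config_path points at an existing config.yaml file."""
    path = Path(config_path)
    if path.name != "config.yaml":
        raise ValueError(f"expected a config.yaml file, got: {path.name}")
    if not path.is_file():
        raise FileNotFoundError(f"no file found at: {path}")
    return path


# Demonstrate with a throwaway directory standing in for a DLC project
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_config = Path(tmp_dir) / "config.yaml"
    demo_config.write_text("Task: demo\n")
    checked = check_config_path(demo_config)
    print(checked.name)
```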

In [None]:
sgp.DLCProject() & {"project_name": project_name}
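
The `&` operator above restricts a table to the rows that match a dictionary. The sketch below mimics that idea in plain Python on a list of dictionaries. It is only an illustration, not DataJoint's implementation, and it assumes DataJoint's usual behavior of ignoring dictionary entries that do not name an attribute of the table. The rows and the `restrict` helper are hypothetical.

```python
def restrict(rows, key):
    """Keep rows that match every entry of `key` naming an attribute of the rows."""
    if not rows:
        return []
    # Entries of `key` that are not attributes of the rows are ignored
    shared = [name for name in key if name in rows[0]]
    return [row for row in rows if all(row[name] == key[name] for name in shared)]


# Hypothetical table contents
rows = [
    {"project_name": "tutorial_DG", "lab_team": "LorenLab"},
    {"project_name": "other_project", "lab_team": "LorenLab"},
]

# "epoch" is not an attribute of these rows, so it does not affect the result
matches = restrict(rows, {"project_name": "tutorial_DG", "epoch": 2})
print(matches)
```

This is why the same `project_key` can be reused to restrict several tables later in the notebook even though it carries more fields than some of them define.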

#### [DLCModel](#ToC) <a id='DLCModel'></a>

Let's take a look at the `DLCModelInput` table next. This table has `dlc_model_name` and `project_name` as primary keys and `project_path` as a secondary attribute.

In [None]:
sgp.DLCModelInput()

Next we can modify the `project_key` to replace `config_path` with `project_path` to fit with the fields in `DLCModelInput`

In [None]:
print(f"current project_key:\n{project_key}")
if "project_path" not in project_key:
    project_key["project_path"] = os.path.dirname(project_key["config_path"])
    del project_key["config_path"]
    print(f"updated project_key:\n{project_key}")

Here we can set a unique name for our model using the `dlc_model_name` variable.<br>We then combine this with the updated `project_key` to insert into `DLCModelInput`.

In [None]:
dlc_model_name = "tutorial_model_DG"
sgp.DLCModelInput().insert1(
    {"dlc_model_name": dlc_model_name, **project_key}, skip_duplicates=True
)
sgp.DLCModelInput()

Inserting an entry into `DLCModelInput` will also populate `DLCModelSource`. `DLCModelSource` is a table that is used to switch between models trained using Spyglass and pre-existing projects.

In [None]:
sgp.DLCModelSource() & project_key

Notice the `source` field in the table above. It will only accept _"FromImport"_ or _"FromUpstream"_ as entries. Let's check out the `FromImport` part table attached to `DLCModelSource` below.

In [None]:
sgp.DLCModelSource.FromImport() & project_key

Next we'll get ready to populate the `DLCModel` table, which holds all the relevant information for both pre-trained models and models trained within Spyglass.<br>First we'll need to determine a set of parameters for our model to select the correct model file.<br>We can visualize a default set below:

In [None]:
sgp.DLCModelParams.get_default()

> Here is the syntax to add your own parameter set:
>```python
dlc_model_params_name = "make_this_yours"
params = {
    "params": {},
    "shuffle": 1,
    "trainingsetindex": 0,
    "model_prefix": "",
}
sgp.DLCModelParams.insert1(
    {"dlc_model_params_name": dlc_model_params_name, "params": params},
    skip_duplicates=True,
)
```

Now let's fetch the primary keys from `DLCModelSource` to make our lives a bit easier when we insert into `DLCModelSelection`.

In [None]:
temp_model_key = (sgp.DLCModelSource.FromImport() & project_key).fetch1("KEY")

And insert into `DLCModelSelection` to allow for population of `DLCModel`

In [None]:
sgp.DLCModelSelection().insert1(
    {**temp_model_key, "dlc_model_params_name": "default"}, skip_duplicates=True
)

Let's populate `DLCModel`!!

In [None]:
model_key = (sgp.DLCModelSelection & temp_model_key).fetch1("KEY")
sgp.DLCModel.populate(model_key)

And of course make sure it populated correctly

In [None]:
sgp.DLCModel() & model_key

#### [DLCPoseEstimation](#ToC) <a id='DLCPoseEstimation'></a>

<div class="alert alert-block alert-warning">
<b>
The following steps should be run on a GPU cluster</b></div>

Alright, now that we've brought our trained model into Spyglass, we're ready to set up pose estimation on a behavioral video.<br>For this tutorial, you can use an epoch of your choice or the one specified below. If you'd like to use your own video, just specify the `nwb_file_name` and `epoch` number and make sure it's in the `VideoFile` table!

In [None]:
nwb_file_name = "J1620210529_.nwb"
epoch = 2

In [None]:
sgc.VideoFile() & {"nwb_file_name": nwb_file_name, "epoch": epoch}

<div class="alert alert-block alert-info">
    <b>Setting up Pose Estimation</b><br>
<code>gputouse</code> determines which GPU to use for pose estimation. Run the cell below to see which GPU has free memory and set the <code>gputouse</code> variable accordingly.</div>

In [None]:
sgp.dlc_utils.get_gpu_memory()

<div class="alert alert-block alert-warning">
Set GPU core here</div>

In [None]:
gputouse = 0  ## 0-9
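
If you'd rather pick the GPU programmatically, here is a minimal sketch. It assumes that the memory report can be read as a dictionary mapping each GPU index to the memory currently in use; that format is an assumption, so check what `get_gpu_memory()` returns on your cluster. The readings below are made up.

```python
# Hypothetical readings: GPU index -> memory currently in use (MiB)
memory_in_use = {0: 10240, 1: 312, 2: 8050, 3: 4096}

# Pick the GPU index with the least memory in use
least_busy_gpu = min(memory_in_use, key=memory_in_use.get)
print(least_busy_gpu)
```

The resulting index could then be assigned to `gputouse`.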

To set up pose estimation, we need to make sure a few things are in order. Using `insert_estimation_task` will take care of these steps for us!<br>Briefly, it will convert our video to .mp4 format (DLC struggles with .h264) and determine the directory in which we'll store the pose estimation results.<br>
>**`task_mode`** determines whether populating `DLCPoseEstimation` runs a new pose estimation or loads an existing one. Use _'trigger'_ unless you've already run this specific pose estimation.<br>**`video_file_num`** will be 0 in almost all cases.<br>**`check_crop`** is a boolean (True/False); when True, it triggers a prompt for the user to enter the cropping coordinates. A frame of the video with coordinates will be provided for reference.

<div class="alert alert-block alert-info"> 
    <b>When prompted for crop, the behavior takes place on the left-hand maze. The coordinates I used were: <code>50, 500, 50, 800</code>. Feel free to play around with these!</b></div>

In [None]:
pose_estimation_key = sgp.DLCPoseEstimationSelection.insert_estimation_task(
    {
        "nwb_file_name": nwb_file_name,
        "epoch": epoch,
        "video_file_num": 0,
        **model_key,
    },
    task_mode="trigger",
    params={"gputouse": gputouse, "videotype": "mp4", "cropping": None},
    check_crop=True,
)
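
To get a feel for what the crop coordinates do, here is a small sketch on a toy frame. It assumes the coordinates are ordered `x1, x2, y1, y2` (columns first, then rows), which is the ordering DLC uses for its `cropping` argument as far as we know; under that assumption, `50, 500, 50, 800` would keep columns 50 to 500 and rows 50 to 800. The `crop_frame` helper is hypothetical.

```python
def crop_frame(frame, cropping):
    """Crop a frame stored as rows of pixels, given [x1, x2, y1, y2]."""
    x1, x2, y1, y2 = cropping
    # Rows are selected by the y range, columns by the x range
    return [row[x1:x2] for row in frame[y1:y2]]


# Toy 6-row by 8-column frame whose "pixels" record their (row, column)
frame = [[(r, c) for c in range(8)] for r in range(6)]
cropped = crop_frame(frame, [2, 5, 1, 4])
print(len(cropped), len(cropped[0]))
```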

And now we populate `DLCPoseEstimation`! This might take a bit...

In [None]:
sgp.DLCPoseEstimation().populate(pose_estimation_key)

Let's visualize the output from Pose Estimation

In [None]:
(sgp.DLCPoseEstimation() & pose_estimation_key).fetch_dataframe()

#### [DLCSmoothInterp](#ToC) <a id='DLCSmoothInterp'></a>

Now that we've completed pose estimation, it's time to identify NaNs and optionally interpolate over low-likelihood periods and smooth the resulting positions.<br>First we need to define some parameters for smoothing and interpolation. We can see the default parameter set below.<br>__Note__: it is recommended to use the `just_nan` parameters here and save interpolation and smoothing for the centroid step, as this gives a better end result.

In [None]:
# The default parameter set to interpolate and smooth over each LED individually
print(sgp.DLCSmoothInterpParams.get_default())

In [None]:
# The just_nan parameter set that identifies NaN indices and leaves smoothing and interpolation to the centroid step
print(sgp.DLCSmoothInterpParams.get_nan_params())
si_params_name = "just_nan"
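
Conceptually, identifying NaNs means blanking out positions whose likelihood falls below a threshold. The sketch below shows that idea in plain Python. It is not the Spyglass implementation; the helper name, the toy data, and the 0.95 threshold used here are assumptions for illustration.

```python
import math


def mask_low_likelihood(xs, ys, likelihoods, likelihood_thresh=0.95):
    """Replace (x, y) with NaN wherever the likelihood is below the threshold."""
    masked_x, masked_y = [], []
    for x, y, likelihood in zip(xs, ys, likelihoods):
        if likelihood < likelihood_thresh:
            masked_x.append(math.nan)
            masked_y.append(math.nan)
        else:
            masked_x.append(x)
            masked_y.append(y)
    return masked_x, masked_y


# Toy pose estimation output for a single bodypart
xs = [10.0, 11.0, 40.0, 13.0]
ys = [5.0, 5.5, 90.0, 6.5]
likelihoods = [0.99, 0.98, 0.20, 0.97]

masked_x, masked_y = mask_low_likelihood(xs, ys, likelihoods)
print(masked_x)
```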

> If you'd like to change any of these parameters, here is the syntax to do that
>```python
si_params_name = 'your_unique_param_name'
params = {
    "smoothing_params": {
        "smoothing_duration": 0.##,
        "smooth_method": "moving_avg",
    },
    "interp_params": {
        "likelihood_thresh": 0.##,
    },
    "max_plausible_speed": ###,
    "speed_smoothing_std_dev": 0.###,
}
sgp.DLCSmoothInterpParams().insert1(
    {
        'dlc_si_params_name': si_params_name,
        "params": params,
    },
    skip_duplicates=True)
```
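
To build some intuition for `smoothing_duration` and the `moving_avg` method, here is a rough sketch of a centered moving average. It is not the Spyglass implementation: the window is taken as roughly `smoothing_duration * sampling_rate` samples, NaNs inside a window are skipped, and the `sampling_rate` argument and toy data are assumptions.

```python
import math


def moving_average(values, smoothing_duration, sampling_rate):
    """Centered moving average that skips NaNs inside each window."""
    window = max(1, int(round(smoothing_duration * sampling_rate)))
    half = window // 2  # samples taken on each side of the current one
    smoothed = []
    for i in range(len(values)):
        neighbors = [
            v for v in values[max(0, i - half) : i + half + 1] if not math.isnan(v)
        ]
        smoothed.append(sum(neighbors) / len(neighbors) if neighbors else math.nan)
    return smoothed


# Toy x positions sampled at 30 Hz with one outlier
x = [0.0, 1.0, 2.0, 30.0, 4.0, 5.0, 6.0]
smooth_x = moving_average(x, smoothing_duration=0.1, sampling_rate=30)
print(smooth_x)
```

A longer `smoothing_duration` averages over more samples, which suppresses jitter but also blurs fast movements.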

Here we'll create a dictionary with the correct set of keys for the `DLCSmoothInterpSelection` table

In [None]:
si_key = pose_estimation_key.copy()
fields = list(sgp.DLCSmoothInterpSelection.fetch().dtype.fields.keys())
si_key = {key: val for key, val in si_key.items() if key in fields}
si_key

And now we can insert all of the bodyparts we want to process into `DLCSmoothInterpSelection`.<br>
First let's visualize the bodyparts we have available to us.<br>

In [None]:
print((sgp.DLCPoseEstimation.BodyPart & pose_estimation_key).fetch("bodypart"))

We can use `insert1` to insert a single bodypart, but we suggest using `insert` to insert a list of keys with different bodyparts.

>_Syntax to insert a single bodypart_
>```python
sgp.DLCSmoothInterpSelection.insert1(
    {
        **si_key,
        'bodypart': 'greenLED',
        'dlc_si_params_name': si_params_name,
    },
    skip_duplicates=True)
```

Let's set a list of bodyparts we want to insert and then insert them into `DLCSmoothInterpSelection`.

In [None]:
bodyparts = ["greenLED", "redLED_C"]
sgp.DLCSmoothInterpSelection.insert(
    [
        {
            **si_key,
            "bodypart": bodypart,
            "dlc_si_params_name": si_params_name,
        }
        for bodypart in bodyparts
    ],
    skip_duplicates=True,
)

And to make sure that all of the bodyparts we want made it into the selection table, we can visualize the table below.

In [None]:
sgp.DLCSmoothInterpSelection() & si_key

Now we can populate `DLCSmoothInterp`, which will perform smoothing and interpolation on all of the bodyparts we specified.<br>We can limit the populate using `si_key` since it is bodypart agnostic.

In [None]:
sgp.DLCSmoothInterp().populate(si_key)

And let's visualize the resulting position data using a scatter plot

In [None]:
(
    sgp.DLCSmoothInterp() & {**si_key, "bodypart": bodyparts[0]}
).fetch1_dataframe().plot.scatter(x="x", y="y", s=1, figsize=(5, 5))

#### [DLCSmoothInterpCohort](#ToC) <a id='DLCSmoothInterpCohort'></a>

Now that we've smoothed and interpolated our position data for each bodypart, we need to form a set of bodyparts from which we want to derive a centroid and orientation (or potentially a second set for orientation). This is the goal of the `DLCSmoothInterpCohort` table.

First, let's make a key that represents the 'cohort' we want to form.
> We'll set the `dlc_si_cohort_selection_name` to a concise name<br>We'll also form a dictionary with the bodypart name as the key and the smoothing/interpolation parameter name used for that bodypart as the value.

In [None]:
cohort_key = si_key.copy()
if "bodypart" in cohort_key:
    del cohort_key["bodypart"]
if "dlc_si_params_name" in cohort_key:
    del cohort_key["dlc_si_params_name"]
cohort_key["dlc_si_cohort_selection_name"] = "green_red_led"
cohort_key["bodyparts_params_dict"] = {
    "greenLED": si_params_name,
    "redLED_C": si_params_name,
}
print(cohort_key)
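
When every bodypart in the cohort shares the same parameter set, the `bodyparts_params_dict` can also be built from the bodypart list rather than typed out by hand. A small sketch, reusing the names from the cells above:

```python
bodyparts = ["greenLED", "redLED_C"]
si_params_name = "just_nan"

# Map each bodypart to the smoothing/interpolation parameter set used for it
bodyparts_params_dict = {bodypart: si_params_name for bodypart in bodyparts}
print(bodyparts_params_dict)
```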

Here we'll insert the cohort into the `DLCSmoothInterpCohortSelection` table and populate `DLCSmoothInterpCohort`, which collates the separately smoothed and interpolated bodyparts into a single entry.

In [None]:
sgp.DLCSmoothInterpCohortSelection().insert1(cohort_key, skip_duplicates=True)
sgp.DLCSmoothInterpCohort.populate(cohort_key)

And of course, let's make sure that the table populated correctly. 

In [None]:
sgp.DLCSmoothInterpCohort.BodyPart() & cohort_key

#### [DLCCentroid](#ToC) <a id='DLCCentroid'></a>

We now have a cohort of smoothed and interpolated bodyparts from which to determine a centroid!<br>To start, we'll need a set of parameters to use for determining the centroid. For this tutorial, we can use the default.

In [None]:
# Here is the default set
print(sgp.DLCCentroidParams.get_default())
centroid_params_name = "default"

>Here is the syntax to add your own parameters:
>```python
centroid_params = {
    'centroid_method': 'two_pt_centroid',
    'points' : {
        'point1': 'greenLED',
        'point2': 'redLED_C',},
    'speed_smoothing_std_dev': 0.100,
}
centroid_params_name = 'your_unique_param_name'
sgp.DLCCentroidParams.insert1({'dlc_centroid_params_name': centroid_params_name,
                                'params': centroid_params},
                                skip_duplicates=True)
```
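
As a rough illustration of what a two-point centroid means, the sketch below takes the midpoint of two bodyparts for a single frame. This is not the Spyglass implementation. In particular, falling back to whichever point is available when the other is NaN is an assumption made for this example, and the helper name is hypothetical.

```python
import math


def two_point_midpoint(point1, point2):
    """Midpoint of two (x, y) points, falling back to the one that is not NaN."""

    def is_missing(point):
        return any(math.isnan(coord) for coord in point)

    if is_missing(point1) and is_missing(point2):
        return (math.nan, math.nan)
    if is_missing(point1):
        return point2
    if is_missing(point2):
        return point1
    return ((point1[0] + point2[0]) / 2, (point1[1] + point2[1]) / 2)


# Toy positions for 'greenLED' and 'redLED_C' in one frame
green_led = (10.0, 20.0)
red_led_c = (14.0, 24.0)
centroid = two_point_midpoint(green_led, red_led_c)
print(centroid)
```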

And now let's make a key to insert into `DLCCentroidSelection`.

In [None]:
centroid_key = cohort_key.copy()
fields = list(sgp.DLCCentroidSelection.fetch().dtype.fields.keys())
centroid_key = {key: val for key, val in centroid_key.items() if key in fields}
centroid_key["dlc_centroid_params_name"] = centroid_params_name
print(centroid_key)

Let's insert it into `DLCCentroidSelection` and then populate `DLCCentroid` !

In [None]:
sgp.DLCCentroidSelection.insert1(centroid_key, skip_duplicates=True)
sgp.DLCCentroid.populate(centroid_key)

Here we can visualize the resulting centroid position

In [None]:
(sgp.DLCCentroid() & centroid_key).fetch1_dataframe().plot.scatter(
    x="position_x",
    y="position_y",
    c="speed",
    colormap="viridis",
    alpha=0.5,
    s=0.5,
    figsize=(10, 10),
)

#### [DLCOrientation](#ToC) <a id='DLCOrientation'></a>

We'll now go through a similar process to identify the orientation!<br>To start, we'll need a set of parameters to use for determining the orientation. For this tutorial, we can use the default.

In [None]:
print(sgp.DLCOrientationParams.get_default())
dlc_orientation_params_name = "default"

Here we'll prune the `cohort_key` we used above and add our `dlc_orientation_params_name` to make it suitable for `DLCOrientationSelection`.

In [None]:
fields = list(sgp.DLCOrientationSelection.fetch().dtype.fields.keys())
orient_key = {key: val for key, val in cohort_key.items() if key in fields}
orient_key["dlc_orientation_params_name"] = dlc_orientation_params_name
print(orient_key)

And now let's insert into `DLCOrientationSelection` and populate `DLCOrientation` to determine the orientation!

In [None]:
sgp.DLCOrientationSelection().insert1(orient_key, skip_duplicates=True)
sgp.DLCOrientation().populate(orient_key)

We can fetch the output of `DLCOrientation` as a dataframe to make sure everything looks appropriate.

In [None]:
(sgp.DLCOrientation() & orient_key).fetch1_dataframe()
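
One common way to define orientation from two tracked points is the angle of the vector pointing from one to the other. The sketch below shows that calculation for a single frame. Which bodyparts are used, the direction of the vector, and the helper name are assumptions for illustration; check the default parameters printed above for what Spyglass actually uses.

```python
import math


def heading_angle(tail_point, head_point):
    """Angle in radians of the vector from tail_point to head_point."""
    dx = head_point[0] - tail_point[0]
    dy = head_point[1] - tail_point[1]
    return math.atan2(dy, dx)


# Toy example: the head point sits directly "above" the tail point
angle = heading_angle((5.0, 5.0), (5.0, 9.0))
print(angle)
```

`math.atan2` returns values between -pi and pi, so a dataframe of orientations computed this way would wrap around at those limits.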

#### [DLCPos](#ToC) <a id='DLCPos'></a>

Ok, we're now done with processing the position data! We just have to do some table manipulations to make sure everything ends up in the same format and same location.<br>
To summarize, we brought in a pretrained DLC project, used that model to run pose estimation on a new behavioral video, smoothed and interpolated the result, formed a cohort of bodyparts, and determined the centroid and orientation of this cohort. **_Whew!_**<br>
Now let's populate `DLCPos` with our centroid and orientation entries from above.<br>----<br>
To begin, we'll make a key that combines the cohort names we used for the orientation and centroid as well as the params names for both.

In [None]:
fields = list(sgp.DLCPosV1.fetch().dtype.fields.keys())
dlc_key = {key: val for key, val in centroid_key.items() if key in fields}
dlc_key["dlc_si_cohort_centroid"] = centroid_key["dlc_si_cohort_selection_name"]
dlc_key["dlc_si_cohort_orientation"] = orient_key[
    "dlc_si_cohort_selection_name"
]
dlc_key["dlc_orientation_params_name"] = orient_key[
    "dlc_orientation_params_name"
]
print(dlc_key)

Now we can insert into `DLCPosSelection` and populate `DLCPos` with our `dlc_key`

In [None]:
sgp.DLCPosSelection().insert1(dlc_key, skip_duplicates=True)
sgp.DLCPosV1().populate(dlc_key)

We can also make sure that all of our data made it through by fetching the dataframe attached to this entry.<br>We should expect 8 columns:
>time<br>video_frame_ind<br>position_x<br>position_y<br>orientation<br>velocity_x<br>velocity_y<br>speed

In [None]:
(sgp.DLCPosV1() & dlc_key).fetch1_dataframe()

And even more, we can fetch the `pose_eval_result` that is calculated during this step. This field contains the percentage of frames that each bodypart was below the likelihood threshold of 0.95 as a means of assessing the quality of the pose estimation.

In [None]:
(sgp.DLCPosV1() & dlc_key).fetch1("pose_eval_result")
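
The quantity described above can be sketched in a few lines: for each bodypart, count the frames whose likelihood falls below the threshold and express that as a percentage. This is only an illustration of the idea with made-up likelihoods, not the Spyglass code, and the helper name is hypothetical.

```python
def percent_below_threshold(likelihoods, likelihood_thresh=0.95):
    """Percentage of frames whose likelihood is below the threshold."""
    below = sum(1 for likelihood in likelihoods if likelihood < likelihood_thresh)
    return 100 * below / len(likelihoods)


# Made-up likelihoods for two bodyparts over four frames
example_likelihoods = {
    "greenLED": [0.99, 0.50, 0.96, 0.90],
    "redLED_C": [0.99, 0.98, 0.97, 0.99],
}
summary = {
    bodypart: percent_below_threshold(values)
    for bodypart, values in example_likelihoods.items()
}
print(summary)
```

A high percentage for a bodypart suggests the model struggled to track it in this video.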

#### [DLCPosVideo](#ToC) <a id='DLCPosVideo'></a>

Here we can create a video with the centroid and orientation overlaid on the animal's behavioral video. This will also plot the likelihood of each bodypart used in the cohort. This is completely optional, but a good idea to make sure everything looks correct.

In [None]:
sgp.DLCPosVideoParams.insert_default()

In [None]:
params = {
    "percent_frames": 0.05,
    "incl_likelihood": True,
}
sgp.DLCPosVideoParams.insert1(
    {"dlc_pos_video_params_name": "five_percent", "params": params},
    skip_duplicates=True,
)

In [None]:
sgp.DLCPosVideoSelection.insert1(
    {**dlc_key, "dlc_pos_video_params_name": "five_percent"},
    skip_duplicates=True,
)

In [None]:
sgp.DLCPosVideo().populate(dlc_key)

#### [PositionOutput](#ToC) <a id='PositionOutput'></a>

`PositionOutput` is the final table of the position pipeline and is automatically populated when we populate `DLCPos`! Let's make sure that our entry made it in.

In [None]:
PositionOutput() & dlc_key

`PositionOutput` also has a part table, similar to the `DLCModelSource` table above. Let's check that out as well.

In [None]:
PositionOutput.DLCPosV1() & dlc_key

#### [PositionVideo](#ToC)<a id='PositionVideo'></a>

Bonus points if you made it this far... We can use the `PositionVideo` table to create a video that overlays just the centroid and orientation (regardless of upstream source) on the behavioral video. This table uses the parameter `plot` to determine whether to plot the entry deriving from the DLC arm or from the Trodes arm of the position pipeline. This parameter also accepts 'all', which will plot both (if they exist) in order to compare results.

In [None]:
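# Example values: swap in the NWB file, interval, position IDs,
# and output directory for your own session.
sgp.PositionVideoSelection().insert1(
    {
        "nwb_file_name": "J1620210604_.nwb",
        "interval_list_name": "pos 13 valid times",
        "trodes_position_id": 0,
        "dlc_position_id": 1,
        "plot": "DLC",
        "output_dir": "/home/dgramling/Src/",
    },
    skip_duplicates=True,
)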

In [None]:
sgp.PositionVideo.populate({"plot": "DLC"})

### _CONGRATULATIONS!!_
Please treat yourself to a nice tea break :-)

### [`Return To Table of Contents`](#ToC)<br>