# DeepLabCut Ingestion/Inference

`Dev notes:` Currently, the path structure assumes you have one DLC project directory for all models, as specified within `adamacs.pipeline.get_dlc_root_data_dir`. The parallel function `get_dlc_processed_data_dir` can specify the output directory. 

## Setup

### Connect to the database

If you don't have your login information, contact the administrator.

Using local config file (see [01_pipeline](./01_pipeline.ipynb)):

In [1]:
import os
# change to the upper level folder to detect dj_local_conf.json
if os.path.basename(os.getcwd())=='notebooks': os.chdir('..')
assert os.path.basename(os.getcwd())=='adamacs', ("Please move to the main directory")
import datajoint as dj; dj.conn()

Connecting cbroz@dss-db.datajoint.io:3306


DataJoint connection (connected) cbroz@dss-db.datajoint.io:3306

Manual entry:

In [None]:
# Manual Entry
import datajoint as dj; import getpass
dj.config['database.host'] = '172.26.128.53'        # Put the server name between these apostrophes
dj.config['database.user'] = 'danielmk'             # Put your user name between these apostrophes
dj.config['database.password'] = getpass.getpass()  # Put your password in the prompt
dj.config['custom']['dlc_root_data_dir'] = 'path'   # Path of your DLC project folder
dj.conn()
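
As an alternative to typing these values each session, the same settings can live in the `dj_local_conf.json` file that the first cell looks for. The following is a minimal sketch that writes such a file with the standard library only. The helper name `write_local_conf` and the example values are hypothetical, and the password is deliberately left out of the file.

```python
import json
from pathlib import Path


def write_local_conf(directory, host, user, dlc_root):
    """Write a minimal dj_local_conf.json (hypothetical helper, not part of adamacs)."""
    conf = {
        "database.host": host,
        "database.user": user,
        # No password is stored here on purpose.
        "custom": {"dlc_root_data_dir": dlc_root},
    }
    path = Path(directory) / "dj_local_conf.json"
    path.write_text(json.dumps(conf, indent=4))
    return path
```

Calling `write_local_conf('.', 'server-name', 'user-name', 'path')` from the main `adamacs` directory would place the file where the first cell expects it.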

### Imports and activation

Importing schema from `adamacs.pipeline` automatically activates items.

In [2]:
from adamacs.pipeline import subject, train, model

## Ingesting videos and training parameters

### Automated

The `user_data` folder in the `adamacs` directory contains CSVs for inserting values into DeepLabCut tables.

1. `config_params.csv` is used for training parameter sets in `train.TrainingParamSet`. Certain items are required; any others are also passed to DLC's `train_network` function when it is called.
2. `train_videosets.csv` and `model_videos.csv` pass values to `train.VideoSet` and `model.VideoRecording` respectively.
3. `adamacs.ingest.dlc.ingest_dlc_items` will load each of these CSVs.

For more information, see [this notebook](https://github.com/CBroz1/workflow-deeplabcut/blob/main/notebooks/04-Automate_Optional.ipynb).
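
To make the CSV-to-table idea concrete, here is a minimal sketch of turning CSV text into one dictionary per row, which is the shape DataJoint inserts accept. This is not the code inside `ingest_dlc_items`. The helper `read_rows` and the column layout of the example are assumptions for illustration, with column names borrowed from elsewhere in this notebook.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text with a header row into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Hypothetical file contents; the real CSVs in `user_data` may use other columns.
example = (
    "paramset_idx,paramset_desc,shuffle,trainingsetindex\n"
    "1,from_top_5iters,1,0\n"
)
rows = read_rows(example)
```

Note that every value comes back as a string, so numeric columns may need casting before insertion.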

In [12]:
from adamacs.ingest.dlc import ingest_dlc_items
ingest_dlc_items()


---- Inserting 0 entry(s) into #model_training_param_set ----

---- Inserting 0 entry(s) into video_set ----

---- Inserting 0 entry(s) into video_set__file ----

---- Inserting 1 entry(s) into video_recording ----

---- Inserting 1 entry(s) into video_recording__file ----


### Manual

The same training parameters as above can be manually inserted as follows.

In [7]:
import yaml
from element_interface.utils import find_full_path
from adamacs.paths import get_dlc_root_data_dir

paramset_idx = 1; paramset_desc='from_top_5iters'
config_path = find_full_path(get_dlc_root_data_dir(), 
                             'DLC_tracking/config.yaml')
with open(config_path, 'rb') as y:
    config_params = yaml.safe_load(y)
training_params = {'shuffle': '1',
                   'trainingsetindex': '0',
                   'maxiters': '5',
                   'scorer_legacy': 'False',
                   'multianimalproject': 'False'}
config_params.update(training_params)
train.TrainingParamSet.insert_new_params(paramset_idx=paramset_idx,
                                         paramset_desc=paramset_desc,
                                         params=config_params)

In [None]:
key = {'subject': 'subject',
       'session_id': 'id',
       'recording_id': 1, 
       'scanner': 1, # Named 'scanner' for now because of the equipment tables
       'recording_start_time': '0000-00-00 00:00:00'}
model.VideoRecording.insert1(key)
# do not include an initial `/` in relative file paths   
key.update({'file_path': 'relative/path'})
model.VideoRecording.File.insert1(key, ignore_extra_fields=True)
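
The warning about the initial `/` follows from how paths are joined: a path segment that starts with a slash replaces the root instead of extending it. The sketch below shows one way a relative path could be resolved against a list of root directories. The function `find_under_roots` is a hypothetical stand-in written for this example, not the implementation of `find_full_path`.

```python
from pathlib import Path


def find_under_roots(relative_path, roots):
    """Return the first existing `root / relative_path` (hypothetical helper)."""
    if str(relative_path).startswith(("/", "\\")):
        # A leading slash would discard the root when the two are joined.
        raise ValueError(f"Expected a relative path, got {relative_path!r}")
    for root in roots:
        candidate = Path(root) / relative_path
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"{relative_path!r} not found under {list(roots)}")
```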

## Model Training

The `TrainingTask` table queues up training. To launch training from a different machine, one needs to edit DLC's config files to reflect the updated paths. For training, this includes `dlc-models/*/*/train/pose_cfg.yaml`.
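
One way to update those paths is a plain text substitution of the old project root with the new one. The sketch below does this without parsing the YAML, so it should be treated as a rough helper and run on a backup copy first. The function name `replace_root_in_file` is hypothetical.

```python
from pathlib import Path


def replace_root_in_file(cfg_path, old_root, new_root):
    """Swap `old_root` for `new_root` in a text config file; return the match count."""
    path = Path(cfg_path)
    text = path.read_text()
    count = text.count(old_root)
    if count:
        path.write_text(text.replace(old_root, new_root))
    return count
```

A return value of 0 means the file did not mention the old root and was left untouched.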

`CB DEV NOTE:` I'm missing the following videos used to originally train the model:
- top_video2022-02-17T15_56_10.mp4
- top_video2022-02-21T12_18_09.mp4

In [6]:
key={'video_set_id': 1, 'paramset_idx':1,
     'training_id':1, # uniquely defines training task
     'project_path':'DLC_tracking/' # relative to dlc_root in dj.config
    }
train.TrainingTask.insert1(key, skip_duplicates=True)
train.TrainingTask()

| video_set_id | paramset_idx | training_id | model_prefix | project_path (DLC's project_path in config relative to root) |
|---|---|---|---|---|
| 1 | 1 | 1 | | DLC_tracking/ |


In [None]:
train.ModelTraining.populate()

In [10]:
train.ModelTraining()

| video_set_id | paramset_idx | training_id | latest_snapshot (latest exact snapshot index, i.e., never -1) | config_template (stored full config file) |
|---|---|---|---|---|
| 1 | 1 | 1 | 5 | =BLOB= |


To continue training from a previous instance, one would need to 
[edit the relevant config file](https://github.com/DeepLabCut/DeepLabCut/issues/70) and
raise `maxiters` in the paramset (if present) to a higher threshold (e.g., 10 for 5 more iterations).
Empirical work from the Mathis team suggests 200k iterations for any true use-case.

## Tracking Joints/Body Parts

The `model` schema uses a lookup table for managing Body Parts tracked across models.

In [11]:
model.BodyPart.heading

body_part            : varchar(32)                  # 
---
body_part_description="" : varchar(1000)                # 

This table is equipped with two helper functions. First, we can identify all the new body parts from a given config file.

In [14]:
from adamacs.paths import get_dlc_root_data_dir
config_path = get_dlc_root_data_dir()[0] + "/DLC_tracking/config.yaml"
model.BodyPart.extract_new_body_parts(config_path)

Existing body parts: []
New body parts: ['bodycenter' 'head' 'tailbase']


array(['bodycenter', 'head', 'tailbase'], dtype='<U10')

Now, we can make a list of descriptions in the same order and insert them into the table.

In [15]:
bp_desc=['Body Center', 'Head', 'Base of Tail']
model.BodyPart.insert_from_config(config_path,bp_desc)

Existing body parts: []
New body parts: ['bodycenter' 'head' 'tailbase']
New descriptions: ['Body Center', 'Head', 'Base of Tail']


Insert 3 new body part(s)? [yes, no]:  yes


If we skip this step, body parts (without descriptions) will be added when we insert a model. We can [update](https://docs.datajoint.org/python/v0.13/manipulation/3-Cautious-Update.html) empty descriptions at any time.
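
Because descriptions are matched to body parts purely by position, it can help to pair them explicitly and check the lengths before inserting. This is a minimal sketch, not what `insert_from_config` does internally. The helper `pair_descriptions` is hypothetical, while the field names come from the `BodyPart` heading shown earlier.

```python
def pair_descriptions(body_parts, descriptions):
    """Pair each body part with its description, refusing mismatched lengths."""
    if len(body_parts) != len(descriptions):
        raise ValueError(
            f"{len(body_parts)} body parts but {len(descriptions)} descriptions"
        )
    return [
        {"body_part": part, "body_part_description": desc}
        for part, desc in zip(body_parts, descriptions)
    ]


rows = pair_descriptions(
    ["bodycenter", "head", "tailbase"],
    ["Body Center", "Head", "Base of Tail"],
)
```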

## Declaring a Model

If training appears successful, the result can be inserted into the `Model` table for automatic evaluation.

In [16]:
model.Model.insert_new_model(model_name='from_top_5iters',dlc_config=config_path,
                             shuffle=1,trainingsetindex=0,
                             model_description='From Top, trained 5 iterations',
                             paramset_idx=1)

--- DLC Model specification to be inserted ---
	model_name: from_top_5iters
	model_description: From Top, trained 5 iterations
	scorer: DLCmobnet100fromtoptrackingFeb23shuffle1
	task: from_top_tracking
	date: Feb23
	iteration: 0
	snapshotindex: -1
	shuffle: 1
	trainingsetindex: 0
	project_path: DLC_tracking
	paramset_idx: 1
	-- Template for config.yaml --
		Task: from_top_tracking
		TrainingFraction: [0.95]
		batch_size: 8
		cropping: False
		date: Feb23
		iteration: 0
		project_path: /Users/cb/Documents/Bonn/DLC_tracking
		snapshotindex: -1
		x1: 0
		x2: 640
		y1: 277
		y2: 624


Proceed with new DLC model insert? [yes, no]:  yes


Existing body parts: ['bodycenter' 'head' 'tailbase']
New body parts: []


Insert 0 new body part(s)? [yes, no]:  yes


In [9]:
model.Model()

| model_name (user-friendly model name) | task (task in the config yaml) | date (date in the config yaml) | iteration (iteration/version of this model) | snapshotindex (which snapshot for prediction; if -1, latest) | shuffle (which shuffle of the training dataset) | trainingsetindex (which training set fraction to generate model) | scorer (scorer/network name - DLC's GetScorerName()) | config_template (dictionary of the config for analyze_videos()) | project_path (DLC's project_path in config relative to root) | model_prefix | model_description | paramset_idx |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| from_top_5iters | from_top_tracking | Feb23 | 0 | -1 | 1 | 0 | DLCmobnet100fromtoptrackingFeb23shuffle1 | =BLOB= | DLC_tracking | | From Top, trained 5 iterations | 1 |


In [10]:
model.Model.BodyPart()

| model_name (user-friendly model name) | body_part |
|---|---|
| from_top_5iters | bodycenter |
| from_top_5iters | head |
| from_top_5iters | tailbase |


## Model Evaluation

Next, all inserted models can be evaluated with a similar `populate` method, which will
insert the relevant output from DLC's `evaluate_network` function.

In [11]:
model.ModelEvaluation.heading

model_name           : varchar(64)                  # user-friendly model name
---
train_iterations     : int                          # Training iterations
train_error=null     : float                        # Train error (px)
test_error=null      : float                        # Test error (px)
p_cutoff=null        : float                        # p-cutoff used
train_error_p=null   : float                        # Train error with p-cutoff
test_error_p=null    : float                        # Test error with p-cutoff

If your project was initialized in a version of DeepLabCut other than the one you're currently using, model evaluation may report key errors. Specifically, your `config.yaml` may not specify `multianimalproject: false`.
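
If the key is missing, one option is to append it to `config.yaml` before evaluating. The sketch below checks for a top-level `multianimalproject:` line and adds `multianimalproject: false` only when none is found. It works on the raw text and does not parse YAML, so it is a rough helper under that assumption, and the function name `ensure_single_animal_flag` is hypothetical.

```python
from pathlib import Path


def ensure_single_animal_flag(config_path):
    """Append `multianimalproject: false` if the key is absent; return True if changed."""
    path = Path(config_path)
    text = path.read_text()
    has_key = any(
        line.startswith("multianimalproject:") for line in text.splitlines()
    )
    if has_key:
        return False
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + "multianimalproject: false\n")
    return True
```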

In [12]:
model.ModelEvaluation.populate()

DLC loaded in light mode; you cannot use any GUI (labeling, relabeling and standalone GUI)
Running  DLC_mobnet_100_from_top_trackingFeb23shuffle1_103000  with # of training iterations: 103000
This net has already been evaluated!


In [20]:
model.ModelEvaluation()

| model_name (user-friendly model name) | train_iterations (Training iterations) | train_error (Train error, px) | test_error (Test error, px) | p_cutoff (p-cutoff used) | train_error_p (Train error with p-cutoff) | test_error_p (Test error with p-cutoff) |
|---|---|---|---|---|---|---|
| from_top_5iters | 10300 | 3.2 | 25.28 | 0.6 | 3.2 | 25.28 |


## Pose Estimation

In [13]:
model.VideoRecording.File()

| session_id | scanner | recording_id | file_path (filepath of video, relative to root data directory) |
|---|---|---|---|
| sess9FB2LN5C | Equipment | 1 | DLC_tracking/videos/exp9FANLWRZ_top_video2022-02-21T12_18_09-copy.mp4 |


For demonstration purposes, we'll make a shorter video that will process relatively quickly using `ffmpeg`, a DLC dependency ([more info here](https://github.com/datajoint/workflow-deeplabcut/blob/main/notebooks/00-DataDownload_Optional.ipynb)).

In [15]:
from adamacs.paths import get_dlc_root_data_dir
vid_path =  get_dlc_root_data_dir()[0] + '/DLC_tracking/videos/exp9FANLWRZ_top_video2022-02-21T12_18_09'
print(vid_path)
cmd = (f'ffmpeg -n -hide_banner -loglevel error -ss 0 -t 2 -i {vid_path}.mp4 '
       + f'-vcodec copy -acodec copy {vid_path}-copy.mp4')
import os; os.system(cmd)

/Users/cb/Documents/Bonn//DLC_tracking/videos/exp9FANLWRZ_top_video2022-02-21T12_18_09


File '/Users/cb/Documents/Bonn//DLC_tracking/videos/exp9FANLWRZ_top_video2022-02-21T12_18_09-copy.mp4' already exists. Exiting.


256
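
The `256` above is the raw status returned by `os.system`. On Unix it encodes exit code 1, which matches the "already exists" message produced by the `-n` flag. As a sketch of an alternative, the same command can be built as an argument list and run with `subprocess`, which returns the exit code directly and avoids shell quoting problems when paths contain spaces. The helper names `build_trim_cmd` and `trim_video` are hypothetical. The flags are the ones used in the cell above.

```python
import subprocess


def build_trim_cmd(vid_path, seconds=2):
    """Build the ffmpeg argument list for a short copy of `<vid_path>.mp4`."""
    return [
        "ffmpeg", "-n", "-hide_banner", "-loglevel", "error",
        "-ss", "0", "-t", str(seconds),
        "-i", f"{vid_path}.mp4",
        "-vcodec", "copy", "-acodec", "copy",
        f"{vid_path}-copy.mp4",
    ]


def trim_video(vid_path, seconds=2):
    """Run ffmpeg and return its exit code (requires ffmpeg on the PATH)."""
    return subprocess.run(build_trim_cmd(vid_path, seconds)).returncode
```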

Next, we need to specify whether the `PoseEstimation` table should load results from an existing file or trigger the estimation command. The `task_mode` field makes this choice (`load` vs. `trigger`). Here, we can also specify parameters accepted by the `analyze_videos` function as a dictionary.

In [13]:
key = (model.VideoRecording & {'recording_id': '1'}).fetch1('KEY')
key.update({'model_name': 'from_top_5iters', 'task_mode': 'trigger'})
key

{'session_id': 'sess9FB2LN5C',
 'scanner': 'Equipment',
 'recording_id': 1,
 'model_name': 'from_top_5iters',
 'task_mode': 'trigger'}

The `PoseEstimationTask` table queues items for pose estimation. Additional parameters are passed to DLC's `analyze_videos` function.

In [14]:
model.PoseEstimationTask.insert_estimation_task(key,params={'save_as_csv':True})

In [15]:
model.PoseEstimation.populate()

Using snapshot-103000 for model /Users/cb/Documents/Bonn/DLC_tracking/dlc-models/iteration-0/from_top_trackingFeb23-trainset95shuffle1


  outputs = layer.apply(inputs, training=is_training)


Starting to analyze %  /Users/cb/Documents/Bonn/DLC_tracking/videos/exp9FANLWRZ_top_video2022-02-21T12_18_09-copy.mp4
The videos are analyzed. Now your research can truly start! 
 You can create labeled videos with 'create_labeled_video'
If the tracking is not satisfactory for some videos, consider expanding the training set. You can use the function 'extract_outlier_frames' to extract a few representative outlier frames.


In [16]:
model.PoseEstimation()

| session_id | scanner | recording_id | model_name (user-friendly model name) | post_estimation_time (time of generation of this set of DLC results) |
|---|---|---|---|---|
| sess9FB2LN5C | Equipment | 1 | from_top_5iters | 2022-04-13 15:54:24 |


By default, DataJoint will store the results of pose estimation in a subdirectory
>  processed_dir / videos / device_<#>_recording_<#>_model_<name>

Here, `processed_dir` is pulled from `get_dlc_processed_data_dir`, and device/recording information 
comes from the `VideoRecording` table. The model name is taken from the primary key of the
`Model` table, with spaces replaced by hyphens.
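
The naming pattern can be sketched as a small function. This only illustrates the pattern described here and is not the code DataJoint runs. The function name `default_output_dir` and its arguments are assumptions.

```python
from pathlib import Path


def default_output_dir(processed_dir, device, recording_id, model_name):
    """Build processed_dir / videos / device_<#>_recording_<#>_model_<name>."""
    model_part = model_name.replace(" ", "-")  # spaces become hyphens
    folder = f"device_{device}_recording_{recording_id}_model_{model_part}"
    return Path(processed_dir) / "videos" / folder
```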
    
We can get this estimation directly as a pandas dataframe.

In [18]:
model.PoseEstimation.BodyPartPosition()

| session_id | scanner | recording_id | model_name (user-friendly model name) | body_part | frame_index (frame index in model) | x_pos | y_pos | z_pos | likelihood |
|---|---|---|---|---|---|---|---|---|---|
| sess9FB2LN5C | Equipment | 1 | from_top_5iters | bodycenter | =BLOB= | =BLOB= | =BLOB= | =BLOB= | =BLOB= |
| sess9FB2LN5C | Equipment | 1 | from_top_5iters | head | =BLOB= | =BLOB= | =BLOB= | =BLOB= | =BLOB= |
| sess9FB2LN5C | Equipment | 1 | from_top_5iters | tailbase | =BLOB= | =BLOB= | =BLOB= | =BLOB= | =BLOB= |


In [19]:
model.PoseEstimation.get_trajectory(key)

scorer,from_top_5iters,from_top_5iters,from_top_5iters,from_top_5iters,from_top_5iters,from_top_5iters,from_top_5iters,from_top_5iters,from_top_5iters,from_top_5iters,from_top_5iters,from_top_5iters
bodyparts,bodycenter,bodycenter,bodycenter,bodycenter,head,head,head,head,tailbase,tailbase,tailbase,tailbase
coords,x,y,z,likelihood,x,y,z,likelihood,x,y,z,likelihood
0,263.233643,295.701965,0.0,0.999968,270.341034,312.154083,0.0,0.999293,256.796326,279.632538,0.0,0.999994
1,263.131531,296.109619,0.0,0.999968,271.406586,313.803741,0.0,0.999348,256.890808,279.653290,0.0,0.999994
2,263.089874,296.135345,0.0,0.999969,271.372955,313.819061,0.0,0.999315,256.887085,279.740051,0.0,0.999994
3,263.164276,296.068878,0.0,0.999970,271.475769,313.860046,0.0,0.999379,256.704987,279.777924,0.0,0.999992
4,263.548950,295.879852,0.0,0.999967,271.408325,313.501617,0.0,0.999314,256.435089,279.724518,0.0,0.999987
...,...,...,...,...,...,...,...,...,...,...,...,...
118,250.145859,338.050140,0.0,0.999997,242.960190,352.420898,0.0,0.999814,256.850525,321.103699,0.0,0.999994
119,250.232315,338.017792,0.0,0.999997,243.117798,352.093384,0.0,0.999695,256.709595,321.358917,0.0,0.999993
120,250.193817,338.008331,0.0,0.999997,242.929077,351.984344,0.0,0.999509,256.314819,321.240753,0.0,0.999990
121,249.979782,337.976837,0.0,0.999997,242.714447,351.682861,0.0,0.999384,256.012238,318.641510,0.0,0.999986
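
As a small example of what can be done with these coordinates, the sketch below computes the head-to-tailbase distance per frame and skips frames where either point falls below a likelihood cutoff. It uses plain Python rather than the dataframe itself, with values copied from the first two rows above. The helper `body_length` is hypothetical, and the default cutoff of 0.6 simply mirrors the p-cutoff shown in the evaluation table.

```python
import math

# frame -> body part -> (x, y, likelihood), copied from frames 0 and 1 above
frames = {
    0: {"head": (270.341034, 312.154083, 0.999293),
        "tailbase": (256.796326, 279.632538, 0.999994)},
    1: {"head": (271.406586, 313.803741, 0.999348),
        "tailbase": (256.890808, 279.653290, 0.999994)},
}


def body_length(parts, min_likelihood=0.6):
    """Head-to-tailbase distance in pixels, or None if a point is below the cutoff."""
    hx, hy, h_like = parts["head"]
    tx, ty, t_like = parts["tailbase"]
    if min(h_like, t_like) < min_likelihood:
        return None
    return math.hypot(hx - tx, hy - ty)


lengths = {index: body_length(parts) for index, parts in frames.items()}
```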
