# DataJoint Element DeepLabCut

## Interactively run the workflow


The workflow requires a DeepLabCut project with labeled data.
- If you don't have data, refer to [00-DataDownload](./00-DataDownload_Optional.ipynb) and [01-Configure](./01-Configure.ipynb).
- For an overview of the schema, refer to [02-WorkflowStructure](02-WorkflowStructure_Optional.ipynb).
- For a more automated approach, refer to [04-Automate](04-Automate_Optional.ipynb).

Let's change the directory to load the local config, `dj_local_conf.json`.

In [1]:
import os
if os.path.basename(os.getcwd()) == "notebooks":
    os.chdir("..")

import datajoint as dj
from pathlib import Path
import yaml

# PATHS OF INPUT FILES: Extract abs and rel paths from .json file
dj.conn()

### DLC Project
dlc_project_path_abs = Path(dj.config["custom"]["dlc_root_data_dir"]) / Path(
    dj.config["custom"]["current_project_folder"]
)  # use pathlib to join; abs path
dlc_project_folder = Path(
    dj.config["custom"]["current_project_folder"]
)  # relative path

### Config file
config_file_abs = dlc_project_path_abs / "config.yaml"  # abs path
assert (
    config_file_abs.exists()
), "Please check that you have the Top_tracking folder"

### Labeled-data
labeled_data_path_abs = dlc_project_path_abs / "labeled-data"
labeled_files_abs = list(
    list(labeled_data_path_abs.rglob("*"))[1].rglob("*")
)  # substitute 'training_files'; absolute path
labeled_files_rel = []
for file in labeled_files_abs:
    labeled_files_rel.append(
        file.relative_to(dlc_project_path_abs)
    )  # substitute 'training_files'; relative path


from pipeline import lab, subject, session, train, model  # after creating json file

# Empty the session in case of rerunning
# session.Session.delete()
# train.TrainingTask.delete()
# train.TrainingParamSet.delete()
# train.VideoSet.delete()

# Insert some data in session and train tables
# TO-DO: substitute lab.project by project schema.

[2023-08-04 16:31:58,365][INFO]: Connecting milagrosmarin@rds.datajoint.io:3306
[2023-08-04 16:31:58,753][INFO]: Connected milagrosmarin@rds.datajoint.io:3306
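
The labeled-data listing in the cell above takes the second entry returned by `rglob`, so the result depends on how the directory happens to be ordered. As a hedged alternative, the sketch below collects every regular file under `labeled-data` and converts it to a project-relative path. The helper name `collect_labeled_files` and the temporary project layout are assumptions made purely for illustration.

```python
import tempfile
from pathlib import Path


def collect_labeled_files(project_path):
    """Return all files under <project_path>/labeled-data, relative to the project."""
    labeled_dir = project_path / "labeled-data"
    return sorted(
        p.relative_to(project_path) for p in labeled_dir.rglob("*") if p.is_file()
    )


# Build a throwaway project layout to demonstrate the helper (hypothetical files).
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "Top_tracking"
    frames = project / "labeled-data" / "train1_trimmed"
    frames.mkdir(parents=True)
    for name in ("CollectedData_DataJoint.h5", "img00674.png"):
        (frames / name).touch()
    demo_files = collect_labeled_files(project)

print([p.as_posix() for p in demo_files])
```

Filtering with `is_file()` keeps directories out of the list, and sorting makes the assigned `file_id` values reproducible between runs.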


In [None]:
dj.Diagram(subject) + dj.Diagram(lab) + dj.Diagram(session) + dj.Diagram(model) + dj.Diagram(train)

In [None]:
subject.Subject()

In [None]:
# Subject and Session tables
subject.Subject.insert1(
    dict(
        subject="subject6",
        sex="F",
        subject_birth_date="2020-01-01",
        subject_description="hneih_E105",
    ),
    skip_duplicates=True,
)
session_keys = [
    dict(subject="subject6", session_datetime="2021-06-02 14:04:22"),
    dict(subject="subject6", session_datetime="2021-06-03 14:43:10"),
]

session.Session.insert(session_keys, skip_duplicates=True)
session.Session() & "session_datetime > '2021-06-01 12:00:00'" & "subject='subject6'"

In [None]:
# VideoSet table
train.VideoSet.insert1({"video_set_id": 0}, skip_duplicates=True)

# training_files = ['labeled-data/train1_trimmed/CollectedData_DataJoint.h5',
#                   'labeled-data/train1_trimmed/CollectedData_DataJoint.csv',
#                   'labeled-data/train1_trimmed/img00674.png',  # TO-DO: check if all the PNGs are necessary for training
#                   'videos/train1.mp4']
# for idx, filename in enumerate(training_files):
for idx, filename in enumerate(labeled_files_rel):
    train.VideoSet.File.insert1(
        {"video_set_id": 0, "file_id": idx, "file_path": dlc_project_folder / filename},
        skip_duplicates=True,
    )  # Changed from + to /; #relative_path

In [None]:
train.VideoSet.File()

In [None]:
dj.list_schemas()

In [None]:
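# WARNING: dropping a schema permanently deletes its tables and their data.
# Run this cell only to reset the pipeline, then restart the kernel and
# rerun the notebook from the top before continuing.
model.schema.drop()
train.schema.drop()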


In [None]:
# Restrict the training iterations to 5 by modifying the default parameters in config.yaml
paramset_idx = 0
paramset_desc = "First training test with DLC using shuffle 1 and maxiters = 5"

# default parameters
with open(config_file_abs, "rb") as y:
    config_params = yaml.safe_load(y)
config_params.keys()

# new parameters
training_params = {
    "shuffle": "1",
    "trainingsetindex": "0",
    "maxiters": "5",
    "scorer_legacy": "False",  # For DLC ≤ v2.0, include scorer_legacy = True in params
    "multianimalproject": "False",
}
config_params.update(training_params)

train.TrainingParamSet.insert_new_params(
    paramset_idx=paramset_idx, paramset_desc=paramset_desc, params=config_params
)

In [None]:
# TrainingTask table
key = {
    "video_set_id": 0,
    "paramset_idx": 0,
    "training_id": 1,
    "project_path": dlc_project_folder,
}
train.TrainingTask.insert1(key, skip_duplicates=True)
train.TrainingTask()

In [2]:
train.ModelTraining.populate(display_progress=True)
train.ModelTraining.fetch()

ModelTraining:   0%|          | 0/1 [00:00<?, ?it/s]

Loading DLC 2.2.3...
DLC loaded in light mode; you cannot use any GUI (labeling, relabeling and standalone GUI)


Config:
{'all_joints': [[0], [1]],
 'all_joints_names': ['Head', 'Tailbase'],
 'alpha_r': 0.02,
 'apply_prob': 0.5,
 'batch_size': 1,
 'contrast': {'clahe': True,
              'claheratio': 0.1,
              'histeq': True,
              'histeqratio': 0.1},
 'convolution': {'edge': False,
                 'emboss': {'alpha': [0.0, 1.0], 'strength': [0.5, 1.5]},
                 'embossratio': 0.1,
                 'sharpen': False,
                 'sharpenratio': 0.3},
 'crop_pad': 0,
 'cropratio': 0.4,
 'dataset': 'training-datasets/iteration-0/UnaugmentedDataSet_Top_trackingAug3/Top_tracking_DataJoint95shuffle1.mat',
 'dataset_type': 'imgaug',
 'decay_steps': 30000,
 'deterministic': False,
 'display_iters': 1000,
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'init_weights': '/Users/milagros/Documents/DeepLabCut_testing/DeepLabCut/deeplabcut/pose_estimation_tensorflow/models/pretrained/resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 

Selecting single-animal trainer
Batch Size is 1




Loading ImageNet-pretrained resnet_50


2023-08-04 16:32:09.138337: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
2023-08-04 16:32:09.350845: W tensorflow/c/c_api.cc:300] Operation '{name:'pose/locref_pred/block4/biases/Momentum/Assign' id:6191 op device:{requested: '', assigned: ''} def:{{{node pose/locref_pred/block4/biases/Momentum/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](pose/locref_pred/block4/biases/Momentum, pose/locref_pred/block4/biases/Momentum/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session.


Max_iters overwritten as 5
Training parameter:
{'stride': 8.0, 'weigh_part_predictions': False, 'weigh_negatives': False, 'fg_fraction': 0.25, 'mean_pixel': [123.68, 116.779, 103.939], 'shuffle': True, 'snapshot_prefix': '/Users/milagros/Documents/DeepLabCut_testing/Top_tracking-DataJoint-2023-08-03/dlc-models/iteration-0/Top_trackingAug3-trainset95shuffle1/train/snapshot', 'log_dir': 'log', 'global_scale': 0.8, 'location_refinement': True, 'locref_stdev': 7.2801, 'locref_loss_weight': 0.05, 'locref_huber_loss': True, 'optimizer': 'sgd', 'intermediate_supervision': False, 'intermediate_supervision_layer': 12, 'regularize': False, 'weight_decay': 0.0001, 'crop_pad': 0, 'scoremap_dir': 'test', 'batch_size': 1, 'dataset_type': 'imgaug', 'deterministic': False, 'mirror': False, 'pairwise_huber_loss': False, 'weigh_only_present_joints': False, 'partaffinityfield_predict': False, 'pairwise_predict': False, 'all_joints': [[0], [1]], 'all_joints_names': ['Head', 'Tailbase'], 'alpha_r': 0.02, '

2023-08-04 16:32:13.682675: W tensorflow/core/kernels/queue_base.cc:277] _0_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/Users/milagros/miniconda3/envs/dlc_pip/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1378, in _do_call
    return fn(*args)
  File "/Users/milagros/miniconda3/envs/dlc_pip/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1361, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/Users/milagros/miniconda3/envs/dlc_pip/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1454, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
	 [[{{node fifo_queue_enqueue}}]]

During handling of the above exception, another exception occurred:

Traceback

The network is now trained and ready to evaluate. Use the function 'evaluate_network' to evaluate the network.





array([(0, 0, 1, 5, {'Task': 'Top_tracking', 'scorer': 'DataJoint', 'date': 'Aug3', 'multianimalproject': 'False', 'identity': None, 'project_path': '/Users/milagros/Documents/DeepLabCut_testing/Top_tracking-DataJoint-2023-08-03', 'video_sets': {'/Users/milagros/Documents/DeepLabCut_testing/test_data/Top_tracking-DataJoint-2023-08-03/videos/train1_trimmed.mp4': {'crop': '0, 500, 0, 500'}}, 'bodyparts': ['Head', 'Tailbase'], 'start': 0, 'stop': 1, 'numframes2pick': 5, 'skeleton': [['bodypart1', 'bodypart2'], ['objectA', 'bodypart3']], 'skeleton_color': 'black', 'pcutoff': 0.6, 'dotsize': 12, 'alphavalue': 0.7, 'colormap': 'rainbow', 'TrainingFraction': [0.95], 'iteration': 0, 'default_net_type': 'resnet_50', 'default_augmenter': 'default', 'snapshotindex': -1, 'batch_size': 8, 'cropping': False, 'x1': 0, 'x2': 640, 'y1': 277, 'y2': 624, 'corner2move2': [50, 50], 'move2corner': True, 'shuffle': '1', 'trainingsetindex': '0', 'maxiters': '5', 'scorer_legacy': 'False', 'modelprefix': '', 't

# Old notebook ----------------------------------------------

In [None]:
import os
# change to the upper level folder to detect dj_local_conf.json
if os.path.basename(os.getcwd()) == 'notebooks':
    os.chdir('..')
assert os.path.basename(os.getcwd())=='workflow-deeplabcut', ("Please move to the "
                                                              + "workflow directory")

`pipeline.py` activates the DataJoint `elements` and declares other required tables.

In [None]:
import datajoint as dj
from workflow_deeplabcut.pipeline import lab, subject, session, train, model

# Directing our pipeline to the appropriate config location
from element_interface.utils import find_full_path
from workflow_deeplabcut.paths import get_dlc_root_data_dir
config_path = find_full_path(get_dlc_root_data_dir(), 
                             'from_top_tracking/config.yaml')

## Manually Inserting Entries

### Upstream tables

We can insert entries into `dj.Manual` tables (green in diagrams) by providing values as a dictionary or a list of dictionaries. 

In [None]:
session.Session.heading

In [None]:
subject.Subject.insert1(dict(subject='subject6', 
                             sex='F', 
                             subject_birth_date='2020-01-01', 
                             subject_description='hneih_E105'))
session_keys = [dict(subject='subject6', session_datetime='2021-06-02 14:04:22'),
                dict(subject='subject6', session_datetime='2021-06-03 14:43:10')]
session.Session.insert(session_keys)

We can look at the contents of this table and restrict by a value.

In [None]:
session.Session() & "session_datetime > '2021-06-01 12:00:00'" & "subject='subject6'"

### DeepLabCut Tables

The `VideoSet` table in the `train` schema retains records of files generated in the video labeling process (e.g., `h5`, `csv`, `png`). DeepLabCut will refer to the `mat` file located under the `training-datasets` directory.

We recommend storing all paths as relative to the root in your config.
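
To make the recommendation concrete, here is a minimal sketch of storing a path relative to a root and rebuilding the absolute path when it is needed. The root `/data/dlc_projects` is a hypothetical value; in the workflow the root would come from the DataJoint config.

```python
from pathlib import PurePosixPath

# Hypothetical root directory; in practice this is read from the config.
root_dir = PurePosixPath("/data/dlc_projects")
abs_path = root_dir / "from_top_tracking" / "videos" / "train1.mp4"

# Store the path relative to the root...
rel_path = abs_path.relative_to(root_dir)
# ...and rebuild the absolute path when the file is needed.
rebuilt = root_dir / rel_path

print(rel_path)
print(rebuilt == abs_path)
```

Because only the relative part is stored, the same table entries stay valid when the data are moved or mounted under a different root.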

In [None]:
train.VideoSet.insert1({'video_set_id': 0})
project_folder = 'from_top_tracking/'
training_files = ['labeled-data/train1/CollectedData_DJ.h5',
                  'labeled-data/train1/CollectedData_DJ.csv',
                  'labeled-data/train1/img00674.png',
                  'videos/train1.mp4']
for idx, filename in enumerate(training_files):
    train.VideoSet.File.insert1({'video_set_id': 0,
                                 'file_id': idx,
                                 'file_path': (project_folder + filename)})

In [None]:
train.VideoSet.File()

### Training a Network

First, we'll add a `TrainingParamSet`. This is a lookup table that we can reference when training a model.

In [None]:
train.TrainingParamSet.heading

The `params` longblob should be a dictionary that captures all items for DeepLabCut's `train_network` function. At minimum, this is the contents of the project's config file, as well as `shuffle` and `trainingsetindex`, which are not included in the config.

In [None]:
from deeplabcut import train_network
help(train_network) # for more information on optional parameters

Here, we give these items, load the config contents, and overwrite some defaults, including `maxiters`, to restrict our training iterations to 5.

In [None]:
import yaml

paramset_idx = 0
paramset_desc = 'from_top_tracking'

with open(config_path, 'rb') as y:
    config_params = yaml.safe_load(y)
training_params = {'shuffle': '1',
                   'trainingsetindex': '0',
                   'maxiters': '5',
                   'scorer_legacy': 'False',
                   'multianimalproject': 'False'}
config_params.update(training_params)
train.TrainingParamSet.insert_new_params(paramset_idx=paramset_idx,
                                         paramset_desc=paramset_desc,
                                         params=config_params)
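
The merge in this cell relies on plain `dict.update`. The sketch below shows that behavior in isolation, with a small dictionary standing in for the contents of `config.yaml`; the stand-in values are illustrative assumptions, not the real config.

```python
# Stand-in for the dictionary loaded from config.yaml (illustrative values).
config_params = {"Task": "from_top_tracking", "batch_size": 8}

# Items that extend the config before it is stored as a parameter set.
training_params = {"shuffle": "1", "trainingsetindex": "0", "maxiters": "5"}

# update() adds missing keys and replaces the value of any key already present.
config_params.update(training_params)

print(config_params["maxiters"])
print(sorted(config_params))
```

Existing config entries such as `batch_size` are left untouched unless the training parameters name the same key.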

Now, we add a `TrainingTask`. As a computed table, `ModelTraining` will reference this to start training when calling `populate()`.

In [None]:
train.TrainingTask.heading

In [None]:
key={'video_set_id': 0,
     'paramset_idx':0,
     'training_id': 1,
     'project_path':'from_top_tracking/'
     }
train.TrainingTask.insert1(key, skip_duplicates=True)
train.TrainingTask()

In [None]:
train.ModelTraining.populate()

(Output cleared for brevity)
```
The network is now trained and ready to evaluate. Use the function 'evaluate_network' to evaluate the network.
```
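
As a rough mental model of what `populate()` does for a computed table, the plain-Python analogy below runs a computation once for every pending task that has no result yet. This is not DataJoint code: the names `pending_tasks`, `computed`, and `make` are stand-ins chosen for illustration, and the "training" step only returns a string.

```python
# Plain-Python analogy, not DataJoint code: tables are modeled as a list and a dict.
pending_tasks = [
    {"video_set_id": 0, "paramset_idx": 0, "training_id": 1},
    {"video_set_id": 0, "paramset_idx": 0, "training_id": 2},
]
computed = {}  # stands in for the rows of the computed table


def make(key):
    """Hypothetical stand-in for the work done for one task."""
    return f"trained model for training_id={key['training_id']}"


def populate():
    """Run make() once for every pending key that has no result yet."""
    newly_computed = 0
    for key in pending_tasks:
        key_id = tuple(sorted(key.items()))
        if key_id not in computed:
            computed[key_id] = make(key)
            newly_computed += 1
    return newly_computed


first_run = populate()   # computes both tasks
second_run = populate()  # nothing left to compute
print(first_run, second_run)
```

Calling `populate()` again is therefore safe: tasks that already have a result are skipped.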

In [None]:
train.ModelTraining()

To resume training from a checkpoint, we would need to 
[edit the relevant config file](https://github.com/DeepLabCut/DeepLabCut/issues/70) (see also `update_pose_cfg` in `workflow_deeplabcut.load_demo_data`).
Empirical work suggests about 200k iterations for any real use case.

For better quality predictions in this demo, we'll revert the checkpoint file and use a pretrained model.

In [None]:
from workflow_deeplabcut.load_demo_data import revert_checkpoint_file
revert_checkpoint_file()

### Tracking Joints/Body Parts

The `model` schema uses a lookup table for managing Body Parts tracked across models.

In [None]:
model.BodyPart.heading

Helper functions allow us to first identify all the new body parts in a given config and then insert them with user-friendly descriptions.

In [None]:
model.BodyPart.extract_new_body_parts(config_path)

In [None]:
bp_desc=['Body Center', 'Head', 'Base of Tail']
model.BodyPart.insert_from_config(config_path,bp_desc)
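
Conceptually, finding new body parts amounts to comparing the names in the config with those already stored. The sketch below illustrates that idea with plain lists. It is not the element's implementation, and the sample names and the field names `body_part` and `body_part_description` are assumptions.

```python
# Body parts listed in a config (illustrative) and those already in the lookup table.
config_bodyparts = ["bodycenter", "head", "tailbase"]
existing_bodyparts = {"head"}
descriptions = ["Body Center", "Head", "Base of Tail"]

# Keep the config order and skip names that are already stored.
new_rows = [
    {"body_part": name, "body_part_description": desc}
    for name, desc in zip(config_bodyparts, descriptions)
    if name not in existing_bodyparts
]

print([row["body_part"] for row in new_rows])
```

Pairing names and descriptions with `zip` only works when the description list follows the same order as the body parts in the config.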

### Declaring/Evaluating a Model

We can insert into the `Model` table for automatic evaluation.

In [None]:
model.Model.insert_new_model(model_name='FromTop-latest',dlc_config=config_path,
                             shuffle=1,trainingsetindex=0,
                             model_description='FromTop - latest snapshot',
                             paramset_idx=0,
                             params={"snapshotindex":-1})

In [None]:
model.Model()

`ModelEvaluation` will reference the `Model` using the `populate` method and insert the output from DeepLabCut's `evaluate_network` function.

In [None]:
model.ModelEvaluation.heading

In [None]:
model.ModelEvaluation.populate()

In [None]:
model.ModelEvaluation()

### Pose Estimation

To use our model, we'll first need to insert a session recording into `VideoRecording`.

In [None]:
model.VideoRecording()

In [None]:
key = {'subject': 'subject6',
       'session_datetime': '2021-06-02 14:04:22',
       'recording_id': '1', 'device': 'Camera1'}
model.VideoRecording.insert1(key)

_ = key.pop('device') # get rid of secondary key from master table
key.update({'file_id': 1, 
            'file_path': 'from_top_tracking/videos/test-2s.mp4'})
model.VideoRecording.File.insert1(key)

In [None]:
model.VideoRecording.File()

`RecordingInfo` automatically populates with file information.

In [None]:
model.RecordingInfo.populate()
model.RecordingInfo()

Next, we specify if the `PoseEstimation` table should load results from an existing file or trigger the estimation command. Here, we can also specify parameters for DeepLabCut's `analyze_videos` as a dictionary.

In [None]:
key = (model.VideoRecording & {'recording_id': '1'}).fetch1('KEY')
key.update({'model_name': 'FromTop-latest', 'task_mode': 'trigger'})
key

In [None]:
model.PoseEstimationTask.insert_estimation_task(key,params={'save_as_csv':True})
model.PoseEstimation.populate()

By default, DataJoint will store results in a subdirectory
>       <processed_dir> / videos / device_<name>_recording_<#>_model_<name>
where `processed_dir` is optionally specified in the datajoint config. If unspecified, this will be the project directory. The device and model names are specified elsewhere in the schema.
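
The naming pattern above can be sketched with `pathlib`. The helper `infer_output_dir` below is hypothetical and only mirrors the pattern as written; any cleaning of device or model names that the element may perform is not reproduced here.

```python
from pathlib import PurePosixPath


def infer_output_dir(processed_dir, device, recording_id, model_name):
    """Hypothetical helper that mirrors the directory naming pattern."""
    folder = f"device_{device}_recording_{recording_id}_model_{model_name}"
    return PurePosixPath(processed_dir) / "videos" / folder


out_dir = infer_output_dir("from_top_tracking", "Camera1", 1, "FromTop-latest")
print(out_dir)
```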

We can get this estimation directly as a pandas dataframe.

In [None]:
model.PoseEstimation.get_trajectory(key)

In the [next notebook](./04-Automate_Optional.ipynb), we'll look at additional tools in the workflow for automating these steps.