# DeepLabCut Toolbox
https://github.com/DeepLabCut/DeepLabCut

This notebook demonstrates the steps needed to use DeepLabCut, modified for the WMAZE.

If you already have a trained network, jump to the "Start Analyzing videos" section or to "BATCH PROCESSING" at the bottom.

## Create a new project
This function creates a new project with subdirectories and a basic configuration file in the user-defined directory; if no directory is given, the project is created in the current working directory.
You can add new videos (for labeling more data) to the project at any stage.

In [None]:
import deeplabcut
import os

In [3]:
task='WMaze' # Enter the name of your experiment Task
experimenter='SS' # Enter the name of the experimenter
video=[r'C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\CutVideosForTraining'] # Paths of the videos, OR a folder of videos, to grab frames from. Only the videos used for initial labeling, not every video.
working_dir=r'C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT' # Directory where the file should be created
path_config_file=deeplabcut.create_new_project(task,experimenter,video,working_dir,copy_videos=True)

# NOTE: The function returns the path, where your project is. 
# You could also enter this manually (e.g. if the project is already created and you want to pick up, where you stopped...)
#path_config_file = '/home/Mackenzie/Reaching/config.yaml' # Enter the path of the config file that was just created from the above step (check the folder)

Created "C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\WMaze-SS-2024-11-25\videos"
Created "C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\WMaze-SS-2024-11-25\labeled-data"
Created "C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\WMaze-SS-2024-11-25\training-datasets"
Created "C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\WMaze-SS-2024-11-25\dlc-models"
1  videos from the directory C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\CutVideosForTraining were added to the project.
Copying the videos
C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\WMaze-SS-2024-11-25\videos\VT1.mp4
Generated "C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\WMaze-SS-2024-11-25\config.yaml"

A new project with name WMaze-SS-2024-11-25 is created at C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT and a configurable file (conf
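The log above suggests that the project folder is named `<task>-<experimenter>-<date>` and holds four subdirectories plus `config.yaml`. A minimal sketch of that layout can help predict `path_config_file` when picking up an existing project. The helper name `expected_project_layout` and the short working directory `C:\data\DEEPLABCUT` are hypothetical, and the naming rule is inferred from the log, not taken from DeepLabCut's source.

```python
from datetime import date
from pathlib import PureWindowsPath


def expected_project_layout(task, experimenter, working_dir, created=None):
    """Return the project root, subdirectories and config path implied by the log."""
    created = created or date.today()
    # Folder name pattern inferred from the log: <task>-<experimenter>-<YYYY-MM-DD>
    root = PureWindowsPath(working_dir) / f"{task}-{experimenter}-{created.isoformat()}"
    subdirs = ["videos", "labeled-data", "training-datasets", "dlc-models"]
    return {
        "root": root,
        "subdirs": [root / name for name in subdirs],
        "config": root / "config.yaml",
    }


# Hypothetical working directory; replace with your own.
layout = expected_project_layout("WMaze", "SS", r"C:\data\DEEPLABCUT", date(2024, 11, 25))
print(layout["config"])
```

`PureWindowsPath` is used so the sketch builds Windows-style paths on any operating system without touching the disk.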

## Now, go edit the config.yaml file that was created! 
Add your body part labels, edit the number of frames to extract per video, etc.
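If you would rather change a simple setting from code than open the file in an editor, a rough sketch with only the standard library might look like this. It is an assumption-laden illustration: `set_scalar` is a hypothetical helper, the sample text is a made-up excerpt (the `neck` and `tail` labels are borrowed from the label check later in this notebook), and it only handles single-line `key: value` entries. For anything more complex, a real YAML parser or `deeplabcut.auxiliaryfunctions.edit_config` (used later in this notebook) is the safer route.

```python
import re


def set_scalar(yaml_text, key, value):
    """Replace the value of a single-line top-level `key: value` entry."""
    # [ \t]* keeps the match on one line, so list-valued keys are not swallowed.
    pattern = re.compile(rf"^({re.escape(key)}[ \t]*:[ \t]*).*$", flags=re.MULTILINE)
    new_text, n_changed = pattern.subn(lambda m: m.group(1) + str(value), yaml_text, count=1)
    if n_changed == 0:
        raise KeyError(f"{key!r} not found")
    return new_text


# Hypothetical excerpt of a config.yaml, for illustration only.
sample = "Task: WMaze\nnumframes2pick: 20\nbodyparts:\n- neck\n- tail\n"
edited = set_scalar(sample, "numframes2pick", 40)
print(edited)
```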

### Note that you can see more information about ANY function by adding a ? at the end, e.g.
- deeplabcut.extract_frames?

## Extract frames from videos 
A key to a successful feature detector is selecting diverse frames that are typical of the behavior you want to label.
This function selects N frames from a particular video (or folder), either sampled uniformly ('uniform') or by clustering on visual appearance ('kmeans'). Note: uniform sampling might not yield diverse frames if the behavior is sparsely distributed; in that case consider kmeans and/or selecting frames manually.
Also make sure to select data from different (behavioral) sessions and different animals if those vary substantially (to train an invariant feature detector).
Individual images should not be too big (i.e. < 850 x 850 pixels). Although this can be taken care of later as well, it is advisable to crop the frames to remove unnecessary parts of the frame as much as possible.
Always check the output of cropping. If you are happy with the results, proceed to labeling.
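To see why uniform sampling can miss a sparsely distributed behavior, here is a small sketch that draws frame indices uniformly at random and counts how many land in a rare segment. It is not DeepLabCut's implementation; `uniform_frame_indices`, the 20-frame budget and the "first 1% of the video" segment are hypothetical choices for illustration. The frame count is the one reported in this notebook's extraction log.

```python
import random


def uniform_frame_indices(n_frames, n_pick, seed=0):
    """Draw n_pick distinct frame indices uniformly at random, in ascending order."""
    if n_pick > n_frames:
        raise ValueError("cannot pick more frames than the video contains")
    rng = random.Random(seed)  # seeded so the draw is repeatable
    return sorted(rng.sample(range(n_frames), n_pick))


n_frames = 142354  # frame count reported in this notebook's extraction log
picked = uniform_frame_indices(n_frames, n_pick=20)

# Hypothetical rare behavior occupying only the first 1% of the video.
rare_stop = n_frames // 100
rare_hits = sum(1 for i in picked if i < rare_stop)
print(f"{rare_hits} of {len(picked)} picked frames fall in the rare segment")
```

With 20 frames and a segment covering 1% of the video, the expected number of hits is 0.2, so most draws contain no example of the rare behavior at all.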

In [6]:
%matplotlib inline
#there are other ways to grab frames, such as uniformly; please see the paper.

#AUTOMATIC:
deeplabcut.extract_frames(path_config_file,algo="kmeans",slider_width=15)

Config file read successfully.
Do you want to extract (perhaps additional) frames for video: C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\WMaze-SS-2024-11-25\videos\VT1.mp4 ?
Extracting frames based on kmeans ...
Kmeans-quantization based extracting of frames from 0.0  seconds to 4745.13  seconds.
Extracting and downsampling... 142354  frames from the video.


2560it [00:23, 108.29it/s]


KeyboardInterrupt: 

## Label the extracted frames

Only videos listed in the config file can be used to extract frames. Extracted frames for each video are stored in the project directory under the subdirectory **'labeled-data'**, in a folder named after the video. The toolbox provides a labeling GUI (opened through napari in the cell below) for labeling them.

In [None]:
# napari will pop up!
# Please go to plugin > deeplabcut to start
# then, drag-and-drop the project configuration file into the viewer (the value of path_config_file)
# finally, drop the folder containing the images (in 'labeled-data') in the viewer

%gui qt6
import napari
napari.Viewer()

## Check the labels
[OPTIONAL] Checking that the labels were created and stored correctly is beneficial for training, since labeling is one of the most critical parts of creating the training dataset. The DeepLabCut toolbox provides a function `check_labels` to do so.
If the labels need adjusting, you can relaunch the labeling GUI to move them around, save, and re-plot! The function is used as follows:

In [None]:
deeplabcut.check_labels(path_config_file) #this creates a subdirectory with the frames + your labels

In [None]:
# Additional check: flag frames where the distance between two body parts
# is greater than or equal to a given threshold
import numpy as np
import pandas as pd

max_dist = 100  # threshold in pixels
df = pd.read_hdf('path_to_your_labeled_data_file')  # placeholder: path to your labeled-data .h5 file
bpt1 = df.xs('neck', level='bodyparts', axis=1).to_numpy()
bpt2 = df.xs('tail', level='bodyparts', axis=1).to_numpy()
# We calculate the vectors from a point to the other
# and group them per frame and per animal.
try:
    diff = (bpt1 - bpt2).reshape((len(df), -1, 2))
except ValueError:
    diff = (bpt1 - bpt2).reshape((len(df), -1, 3))
dist = np.linalg.norm(diff, axis=2)
mask = np.any(dist >= max_dist, axis=1)
flagged_frames = df.iloc[mask].index
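The same check can be traced by hand on toy data. This sketch uses plain Python instead of pandas, with made-up coordinates for three frames and a hypothetical helper `flag_frames`; it only illustrates the thresholding logic of the cell above for a single animal.

```python
import math


def flag_frames(points_a, points_b, max_dist):
    """Return indices of frames where two body parts are at least max_dist apart."""
    flagged = []
    for i, ((xa, ya), (xb, yb)) in enumerate(zip(points_a, points_b)):
        if math.hypot(xa - xb, ya - yb) >= max_dist:
            flagged.append(i)
    return flagged


# Made-up (x, y) pixel coordinates for three frames.
neck = [(10.0, 10.0), (50.0, 50.0), (0.0, 0.0)]
tail = [(40.0, 50.0), (55.0, 62.0), (60.0, 80.0)]

# Distances are 50, 13 and 100 pixels, so only the last frame reaches the threshold.
flagged = flag_frames(neck, tail, max_dist=100)
print(flagged)  # → [2]
```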

## Image Augmentation
Image augmentation is the process of artificially expanding the training set by applying various transformations to images (e.g., rotation or rescaling) in order to make models more robust and more accurate. The following section of code is optional; run it only if you want more augmentation. It edits the pose_cfg.yaml file. Some other parameters that might be useful to change are as follows:
- ``sharpening`` - default is False; can be set to True
- ``sharpenratio`` - default is 0.3
- ``edge`` - default is False; edge contrast of the image

In [None]:
#Image Augmentation Code -  Add more image augmentation if needed
import deeplabcut
train_pose_config, _ = deeplabcut.return_train_network_path(path_config_file)
augs = {
    "gaussian_noise": True,
    "elastic_transform": True,
    "rotation": 180,
    "covering": True,
    "motion_blur": True,
}
deeplabcut.auxiliaryfunctions.edit_config(
    train_pose_config,
    augs,
)
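The cell above does not set the three optional keys listed in the text (`sharpening`, `sharpenratio`, `edge`). One possible way to include them is to merge them into the same dictionary before it is passed to `edit_config`. The values here follow the defaults stated in the list, with `sharpening` switched on as an example; whether your DeepLabCut version and augmenter honor these keys is an assumption to verify in your own pose_cfg.yaml.

```python
# Augmentations already set in the cell above.
base_augs = {
    "gaussian_noise": True,
    "elastic_transform": True,
    "rotation": 180,
    "covering": True,
    "motion_blur": True,
}

# Optional extras named in the text; values are examples, not recommendations.
extra_augs = {
    "sharpening": True,   # default False per the list above
    "sharpenratio": 0.3,  # default per the list above
    "edge": False,        # default per the list above
}

# Later keys win on conflict, so extras override the base settings if they overlap.
augs = {**base_augs, **extra_augs}
print(sorted(augs))
```

The merged `augs` dictionary can then take the place of the original one in the `edit_config` call.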


## Creation of training data set
This function generates the training data information for network training based on the pandas dataframes that hold label information. The user can set the fraction of the training set size (from all labeled images in the .h5 file) in the config.yaml file. After running this script, the training dataset is created and saved in the project directory under the subdirectory **'training-datasets'**.
- Multiple shuffles can be created to run different iterations and benchmark performance. The default is 1. Parameters to change in the pose_cfg.yaml in the train folder are listed below.
- ``freeze_bn_stats`` - default is true, which is good for a CPU; set to false for a GPU
- ``batch_size`` - 8/16/32

The function also creates new subdirectories under **dlc-models** and updates the project config.yaml file with the correct paths to the training and testing pose configuration files. These files hold the parameters for training the network. An example file is provided with the toolbox and named **pose_cfg.yaml**. Things you may want to change in pose_cfg.yaml (this is the default file, not the one in the train folder):
- ``global_scale`` - default is 0.8; change to 1 for low-resolution images
- ``batch_size`` - default is 1; increasing it up to the limit of GPU memory lets you train for fewer iterations/epochs, although the relationship is not linear (can be set to 8/16/32)
- ``epochs`` - default in PyTorch is 200; can be reduced if the batch size is higher
- ``pos_dist_thresh`` - default is 17; size of the window within which detections are considered positive training samples (non-intuitive)
- ``pafwidth`` - default is 20; learns associations between pairs of body parts
In [None]:
deeplabcut.create_training_dataset(path_config_file,augmenter_type='imgaug')

## Start training:

This function trains the network for a specific shuffle of the training dataset. The commented-out portion caps GPU use if needed; it applies only to TensorFlow, not PyTorch.

In [None]:
deeplabcut.train_network(path_config_file,allow_growth=True) #allow_growth=True dynamically grows GPU memory as needed

#You can also cap the GPU to use only a fraction of its memory: uncomment the section below
#import tensorflow as tf
#gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.25) # only uses 1/4th of the GPU
#sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

## Start evaluating
This function evaluates a trained model for a specific shuffle (or shuffles) at a particular training state, or at all states, on the data set (images)
and stores the results as a .csv file in a subdirectory under **evaluation-results**.

In [None]:
deeplabcut.evaluate_network(path_config_file, plotting=True)

## Start Analyzing videos
This function analyzes new videos. The user can choose the best model from the evaluation results and specify the correct snapshot index for the variable **snapshotindex** in the **config.yaml** file. Otherwise, by default the most recent snapshot is used to analyze the video.
The results are stored in an .h5 file in the same directory as the video.
- If split_videos is set to True, long videos are cut into shorter clips before analysis

In [None]:
# Videos may need to be broken up if they are too long
import deeplabcut
import os
from deeplabcut.utils.auxfun_videos import VideoWriter

video_path = ['videos/video3.avi','videos/video4.avi'] #Enter a list of videos to analyze.
split_videos=True
if split_videos:
    # Cut every video into 10 clips and run the analysis on the clips
    clips = []
    for video in video_path:
        clips.extend(VideoWriter(video).split(n_splits=10))
    _, ext = os.path.splitext(video_path[0]) # assumes all videos share one extension
    deeplabcut.analyze_videos(path_config_file, clips, videotype=ext)
else:
    # Alternative: analyze the videos as they are
    deeplabcut.analyze_videos(path_config_file, video_path, videotype='.avi')
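To judge whether `n_splits=10` gives clips of a manageable length, the clip boundaries can be estimated ahead of time. This is a back-of-the-envelope sketch, not how `VideoWriter.split` works internally; `split_boundaries` is a hypothetical helper and the duration is the one reported for VT1.mp4 in the frame-extraction log earlier in this notebook.

```python
def split_boundaries(duration_s, n_splits):
    """Return (start, end) times in seconds for n_splits equal-length clips."""
    if n_splits < 1:
        raise ValueError("n_splits must be at least 1")
    clip_len = duration_s / n_splits
    # Clamp the final end time so floating-point error never overshoots the video.
    return [(i * clip_len, min((i + 1) * clip_len, duration_s)) for i in range(n_splits)]


bounds = split_boundaries(4745.13, 10)  # duration of VT1.mp4 from the extraction log
for start, end in bounds:
    print(f"{start:8.2f} s -> {end:8.2f} s")
```

Each clip comes out at roughly 474.5 seconds, a little under eight minutes.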

## Extract outlier frames [optional step]

This is an optional step and is used only when the evaluation results are poor, i.e. the labels are incorrectly predicted. In such a case, the user can use the following function to extract frames where the labels are incorrectly predicted. This step has many options, so please look at the help text:

In [3]:
deeplabcut.extract_outlier_frames?

In [None]:
deeplabcut.extract_outlier_frames(path_config_file,['videos/video3.avi']) #pass a specific video
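One intuitive way to think about an "outlier" frame is a body part that jumps implausibly far between consecutive frames. The sketch here illustrates that idea on a made-up track; it is only an assumption about the kind of criterion such a function can apply, `jump_outliers` and the 20-pixel threshold are hypothetical, and the actual options and defaults should be read from the help text shown above.

```python
import math


def jump_outliers(track, epsilon):
    """Return frame indices whose position moved more than epsilon pixels since the previous frame."""
    flagged = []
    for i in range(1, len(track)):
        (x0, y0), (x1, y1) = track[i - 1], track[i]
        if math.hypot(x1 - x0, y1 - y0) > epsilon:
            flagged.append(i)
    return flagged


# Made-up (x, y) track for one body part; frame 2 is a tracking glitch.
track = [(100, 100), (102, 101), (300, 250), (104, 103), (105, 104)]
outliers = jump_outliers(track, epsilon=20)
print(outliers)  # → [2, 3]
```

Both the jump into the glitch and the jump back out are flagged, so a single bad prediction can mark two frames.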

## Refine Labels [optional step]
Following the extraction of outlier frames, the user can use the following function to move the predicted labels to the correct location, thus augmenting the training dataset.

In [None]:
%gui qt6
deeplabcut.refine_labels(path_config_file)

**NOTE:** Afterwards, if you want to look at the adjusted frames, you can load them in the main GUI by running: ``deeplabcut.label_frames(path_config_file)``


#### Once all folders are relabeled, check the labels again! If you are not happy, adjust them in the main GUI:

``deeplabcut.label_frames(path_config_file)``

Check Labels:

``deeplabcut.check_labels(path_config_file)``

In [7]:
#NOW, merge this with your original data:

deeplabcut.merge_datasets(path_config_file)

The following folder was not manually refined,... C:\Users\sahanasrivathsa\Documents\SYNC\Work\BarnesLab\CODE\DEEPLABCUT\WMaze-SS-2024-11-25\labeled-data\VT1
Please label, or remove the un-corrected folders.


## Create a new iteration of training dataset [optional step]
Following the refinement of labels and appending them to the original dataset, this creates a new iteration of the training dataset. The iteration is set automatically in the config.yaml file.

In [None]:
deeplabcut.create_training_dataset(path_config_file)

## Create labeled video
This function is for visualization purposes and can be used to create a video in .mp4 format with the labels predicted by the network. The video is saved in the same directory as the original video.

THIS HAS MANY FUN OPTIONS! 

``deeplabcut.create_labeled_video(config, videos, videotype='avi', shuffle=1, trainingsetindex=0, filtered=False, save_frames=False, Frames2plot=None, delete=False, displayedbodyparts='all', codec='mp4v', outputframerate=None, destfolder=None, draw_skeleton=False, trailpoints=0, displaycropped=False)``

So please check:

In [4]:
deeplabcut.create_labeled_video?

In [None]:
deeplabcut.create_labeled_video(path_config_file,videofile_path) #videofile_path: list of video paths, e.g. ['videos/video3.avi']

## Plot the trajectories of the analyzed videos
This function plots the trajectories of all the body parts across the entire video. Each body part is identified by a unique color.

In [None]:
#for making interactive plots.
%matplotlib notebook
deeplabcut.plot_trajectories(path_config_file,videofile_path)

## BATCH PROCESSING

This script runs the video analysis over all subfolders of a base folder.

In [None]:
import os

import deeplabcut

def getsubfolders(folder):
    ''' returns list of subfolders '''
    return [os.path.join(folder,p) for p in os.listdir(folder) if os.path.isdir(os.path.join(folder,p))]

project='ComplexWheelD3-12-Fumi-2019-01-28'

shuffle=1

prefix='/home/alex/DLC-workshopRowland'

projectpath=os.path.join(prefix,project)
config=os.path.join(projectpath,'config.yaml')

basepath='/home/alex/BenchmarkingExperimentsJan2019' #data'

'''

Imagine that the data (here: videos of 3 different types) are in subfolders:
    /January/January29 ..
    /February/February1
    /February/February2

    etc.

'''

subfolders=getsubfolders(basepath)
for subfolder in subfolders: #this would be January, February etc. in the upper example
    print("Starting to analyze data in:", subfolder)
    subsubfolders=getsubfolders(subfolder)
    for subsubfolder in subsubfolders: #this would be February1, etc. in the upper example...
        print("Starting to analyze data in:", subsubfolder)
        for vtype in ['.mp4','.m4v','.mpg']:
            deeplabcut.analyze_videos(config,[subsubfolder],shuffle=shuffle,videotype=vtype,save_as_csv=True)
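Before launching a long batch run, it can be worth doing a dry run that only lists which videos would be picked up. This sketch walks the same two-level folder structure with `pathlib` and does not call DeepLabCut at all; `find_videos` is a hypothetical helper and the temporary folder only stands in for `basepath`.

```python
import tempfile
from pathlib import Path

VIDEO_TYPES = (".mp4", ".m4v", ".mpg")  # same extensions as the loop above


def find_videos(basepath, video_types=VIDEO_TYPES):
    """Map each second-level folder (e.g. February/February1) to the videos it contains."""
    found = {}
    for month in sorted(p for p in Path(basepath).iterdir() if p.is_dir()):
        for day in sorted(p for p in month.iterdir() if p.is_dir()):
            videos = sorted(
                f for f in day.iterdir()
                if f.is_file() and f.suffix.lower() in video_types
            )
            if videos:
                found[day] = videos
    return found


# Demo on a throwaway folder tree; point find_videos at your own basepath instead.
with tempfile.TemporaryDirectory() as tmp:
    day = Path(tmp) / "February" / "February1"
    day.mkdir(parents=True)
    (day / "session1.mp4").touch()
    (day / "notes.txt").touch()
    demo = {folder.name: [v.name for v in vids] for folder, vids in find_videos(tmp).items()}

print(demo)  # → {'February1': ['session1.mp4']}
```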
