## Converting the DLC project to Lightning Pose format

### Data organization of the DLC project

We assume that a DeepLabCut (DLC) project follows one of the two directory structures below:


1/ The first project structure contains only one DLC project folder which includes all sessions (videos):
```console
    /path/to/DLC_project/
      ├── labeled-data/
      │   ├── <video_name1>/
      │   │   ├── <CollectedData_experimenter>.csv
      │   │   ├── <CollectedData_experimenter>.h5
      │   │   ├── <Images>.png
      │   │            
      │   ├── <video_name2>/
      │   │   ├── <CollectedData_experimenter>.csv
      │   │   ├── <CollectedData_experimenter>.h5
      │   │   ├── <Images>.png
      │   └── ......
      └── videos/
          ├── <video_name1>.mp4(.avi)
          ├── <video_name2>.mp4(.avi)
          └── ......
```
* `labeled-data`: This directory stores the frames used to create the DLC training dataset. Frames from different videos are stored in separate subdirectories. Each frame has a filename related to the temporal index within the corresponding video, which allows the user to trace every frame back to its origin.

* `videos`: This directory contains the videos themselves, or links to them.

An example of this structure is `/root/capsule/data/s3_video/DLC_projects/Foraging_Bot-Han_Lucas-2022-04-27/`
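Because each frame filename encodes its position in the source video, the labeled frames can be traced back programmatically. The sketch below is a minimal illustration, assuming frames are named like `img0042.png` (an assumption about the extraction naming; adjust the pattern if your project differs). The helper names `frame_index` and `frames_by_video` are hypothetical and not part of DLC or Lightning Pose.

```python
import re
from pathlib import Path

# Assumption: frame files look like "img0042.png"; change the pattern if yours differ.
_FRAME_PATTERN = re.compile(r"^img(\d+)\.png$")


def frame_index(frame_path):
    """Return the integer index encoded in a frame filename, or None if it does not match."""
    match = _FRAME_PATTERN.match(Path(frame_path).name)
    return int(match.group(1)) if match else None


def frames_by_video(labeled_data_dir):
    """Map each video subfolder name to the sorted frame indices found inside it."""
    result = {}
    for video_dir in sorted(Path(labeled_data_dir).iterdir()):
        if video_dir.is_dir():
            indices = [frame_index(p) for p in video_dir.glob("*.png")]
            # Drop files whose names do not follow the assumed pattern.
            result[video_dir.name] = sorted(i for i in indices if i is not None)
    return result
```

Calling `frames_by_video('/path/to/DLC_project/labeled-data')` would then list, per video, which frame indices were labeled.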



2/ The second project structure contains multiple DLC project folders, one per session (each likely annotated by a different individual):

```console
    /path/to/DLC_project/
       ├── DLC_project1  
       │      ├── labeled-data/
       │      │   ├── <video_name1>/
       │      │   │   ├── <CollectedData_experimenter>.csv
       │      │   │   ├── <CollectedData_experimenter>.h5
       │      │   │   └── <Images>.png
       │      └── videos/
       │          └── <video_name1>.mp4(.avi)
       │
       ├── DLC_project2  
       │      ├── labeled-data/
       │      │   ├── <video_name2>/
       │      │   │   ├── <CollectedData_experimenter>.csv
       │      │   │   ├── <CollectedData_experimenter>.h5
       │      │   │   └── <Images>.png
       │      └── videos/
       │          └── <video_name2>.mp4(.avi)        
       └── ......      

```
* `labeled-data`: This directory stores the frames used to create the DLC training dataset. Frames from different videos are stored in separate subdirectories. Each frame has a filename related to the temporal index within the corresponding video, which allows the user to trace every frame back to its origin.

* `videos`: This directory contains the videos themselves, or links to them.

An example of this structure is `/root/capsule/data/Shailaja_behavior_data`
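To decide whether a dataset needs conversion, one can check which of the two layouts a directory follows. The following is a minimal sketch that relies only on the presence of the `labeled-data` and `videos` subfolders described above; the function names `is_single_project` and `detect_structure` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

REQUIRED_SUBFOLDERS = ("labeled-data", "videos")


def is_single_project(path):
    """True if `path` directly contains both labeled-data/ and videos/."""
    path = Path(path)
    return all((path / name).is_dir() for name in REQUIRED_SUBFOLDERS)


def detect_structure(root):
    """Return 'first', 'second', or 'unknown' for the layout found under `root`."""
    root = Path(root)
    if is_single_project(root):
        return "first"
    if not root.is_dir():
        return "unknown"
    # Second layout: at least one child folder is itself a complete DLC project.
    subprojects = [p for p in root.iterdir() if p.is_dir() and is_single_project(p)]
    return "second" if subprojects else "unknown"
```

A result of `'second'` indicates that the conversion described in the next section is required.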

## Converting the second DLC project structure to the first

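To run the Lightning Pose (LP) code in this capsule, all data must be organized according to the first DLC project structure. The following code converts the second structure into the first by copying the `labeled-data` and `videos` contents of every session folder into a single project folder under `/root/capsule/scratch/`.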

```python
import os
import pathlib
import shutil

# Name of the attached data asset that holds the per-session DLC projects.
data_asset = 'VBN_DLC'
path_to_dlc = '/root/capsule/data/' + data_asset
path_to_lp = '/root/capsule/scratch/' + 'LP_' + data_asset

# 1. Make the following folder tree:
#  /path/to/DLC_project_all/
#       ├── labeled-data/
#       └── videos/

sub_folders = ['labeled-data', 'videos']
for folder in sub_folders:
    # makedirs also creates path_to_lp; exist_ok makes the cell safe to re-run.
    os.makedirs(os.path.join(path_to_lp, folder), exist_ok=True)

# 2. From each DLC_session folder, copy the following into the merged project:
#    2.1. Folders in DLC_session/labeled-data to DLC_project_all/labeled-data
#    2.2. Videos in DLC_session/videos to DLC_project_all/videos

for filepath in pathlib.Path(path_to_dlc).glob('*'):
    filepath = str(filepath.absolute())
    print(filepath)
    if os.path.isdir(filepath):
        for folder in sub_folders:
            src_dir = os.path.join(filepath, folder)
            dest_dir = os.path.join(path_to_lp, folder)
            # Skip sessions that lack this subfolder instead of raising an error.
            if os.path.isdir(src_dir):
                # dirs_exist_ok (Python 3.8+) merges into the existing destination.
                shutil.copytree(src_dir, dest_dir, dirs_exist_ok=True)
```

```console
/root/capsule/data/Shailaja_behavior_data/1125713722_578257_20210901.behavior-Corbett-2023-06-26
/root/capsule/data/Shailaja_behavior_data/1065908084_544838_20201124.behavior-Shailaja-2023-06-27
```
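After merging, it can be worth confirming that every folder under `labeled-data` has a corresponding video. The sketch below assumes, as in the directory trees above, that a video's filename stem equals the name of its `labeled-data` subfolder and that videos use the `.mp4` or `.avi` extension. `sessions_missing_videos` is a hypothetical helper, not part of DLC or Lightning Pose.

```python
from pathlib import Path

# Assumption: videos use one of these extensions, as shown in the trees above.
VIDEO_EXTENSIONS = {".mp4", ".avi"}


def sessions_missing_videos(project_dir):
    """Return the labeled-data subfolder names that have no matching video file."""
    project_dir = Path(project_dir)
    video_stems = {
        p.stem
        for p in (project_dir / "videos").iterdir()
        if p.suffix.lower() in VIDEO_EXTENSIONS
    }
    labeled = sorted(
        p.name for p in (project_dir / "labeled-data").iterdir() if p.is_dir()
    )
    return [name for name in labeled if name not in video_stems]
```

An empty list from `sessions_missing_videos(path_to_lp)` suggests the merged project is consistent with the first structure; any names returned point to sessions whose video was not copied.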


To convert the DLC project to a data asset, follow these steps:

1. Shut down the capsule. 
2. Under the scratch directory, find the newly created project folder (`LP_<data_asset>`).
3. Click on the folder, and from the dropdown list click Create Data Asset.
4. Complete the fields:
    * Data Asset Name (required): Use a meaningful name so that others can find the dataset easily.
    * Folder Name (required): The folder name inside a capsule. Use a name that’s similar to the dataset name. Spaces and some special characters are not allowed here.
    * Description (optional): Add some text to make the dataset easy to find and understand.
    * Tags (required): Tags are another way to help people find your dataset.
    
5. The converted project is saved as a data asset, which must then be attached to the capsule manually.
