# Training and inference on your own data using Google Drive

In this notebook we'll install SLEAP, import training data into Colab using [Google Drive](https://www.google.com/drive), and run training and inference.

## Install SLEAP
Note: Before installing SLEAP, check the [SLEAP releases](https://github.com/talmolab/sleap/releases) page for the latest version.

In [None]:
!pip uninstall -y opencv-python opencv-contrib-python
!pip install sleap

Found existing installation: opencv-python 4.7.0.72
Uninstalling opencv-python-4.7.0.72:
  Successfully uninstalled opencv-python-4.7.0.72
Found existing installation: opencv-contrib-python 4.7.0.72
Uninstalling opencv-contrib-python-4.7.0.72:
  Successfully uninstalled opencv-contrib-python-4.7.0.72
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting sleap
  Downloading sleap-1.3.1-py3-none-any.whl (64.4 MB)
Collecting attrs<=21.4.0,>=21.2.0 (from sleap)
  Downloading attrs-21.4.0-py2.py3-none-any.whl (60 kB)
Collecting cattrs==1.1.1 (from sleap)
  Downloading cattrs-1.1.1-py3-none-any.whl (16 kB)
Collecting jsonpickle==1.2 (from sleap)
  Downloading jsonpickle-1.2-py2.py3-none-any.whl
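
To confirm which version the install actually pulled in, a short check with the standard library is one option. This is a minimal sketch: the helper name `installed_version` is hypothetical and not part of SLEAP, and it only reads package metadata without importing SLEAP itself.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# In this notebook the package of interest is "sleap".
print("sleap version:", installed_version("sleap"))
```

Comparing the printed value against the releases page shows whether the environment is up to date.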

## Import training data into Colab with Google Drive
We'll first prepare and export the training data from SLEAP, then upload it to Google Drive, and finally mount Google Drive in this Colab notebook.

### Create and export the training job package
A self-contained **training job package** contains a .slp file with the labeled data and images that will be used for training, as well as the .json training configuration file(s).

A training job package can be exported in the SLEAP GUI from the "Run Training..." dialog under the "Predict" menu.

### Upload training job package to Google Drive
To be consistent with the examples in this notebook, name the SLEAP project `colab` and create a directory called `sleap` in the root of your Google Drive. Then upload the exported training job package `colab.slp.training_job.zip` into the `sleap` directory.

If you place your training package somewhere else, or name it differently, adjust the paths, filenames, and parameters below accordingly. The cells in this notebook, for example, were run with a folder named `COCOHorseSleapPose` and a project named `HorseCOCO`.
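
One way to keep such adjustments in a single place is to derive every path from a few variables. The sketch below assumes the default layout described above; the variable names (`DRIVE_ROOT`, `PROJECT_DIR`, `PROJECT_NAME`, `PACKAGE_ZIP`, `LABELS_FILE`) are hypothetical and only serve the illustration.

```python
from pathlib import PurePosixPath

# Hypothetical names: edit these three values to match your own Drive layout.
DRIVE_ROOT = PurePosixPath("/content/drive/MyDrive")
PROJECT_DIR = DRIVE_ROOT / "sleap"   # folder created in the root of Google Drive
PROJECT_NAME = "colab"               # name of the SLEAP project

# Derived paths, following the file names used in this notebook.
PACKAGE_ZIP = PROJECT_DIR / f"{PROJECT_NAME}.slp.training_job.zip"
LABELS_FILE = PROJECT_DIR / f"{PROJECT_NAME}.pkg.slp"

print(PACKAGE_ZIP)
print(LABELS_FILE)
```

With a different folder or project name, only `PROJECT_DIR` and `PROJECT_NAME` need to change.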

### Mount your Google Drive
Mounting your Google Drive will allow you to access the uploaded training job package in this notebook. When prompted to log into your Google account, give Colab access and then copy the authorization code into the field below (and hit 'return').

In [None]:
from google.colab import drive
drive.mount('/content/drive/')

Let's set your current working directory to the directory with your training job package and unpack it there. Later on, the output from training (i.e., the models) and from inference (i.e., predictions) will be saved in this directory as well.

In [None]:
import os
os.chdir("/content/drive/MyDrive/COCOHorseSleapPose")
!unzip /content/drive/MyDrive/COCOHorseSleapPose/HorseCOCO.slp.training_job.zip
!ls

Archive:  /content/drive/MyDrive/COCOHorseSleapPose/HorseCOCO.slp.training_job.zip
  inflating: centered_instance.json  
  inflating: centroid.json           
  inflating: HorseCOCO.pkg.slp       
  inflating: inference-script.sh     
  inflating: jobs.yaml               
  inflating: train-script.sh         
centered_instance.json	HorseCOCO.slp.training_job.zip	train-script.sh
centroid.json		inference-script.sh
HorseCOCO.pkg.slp	jobs.yaml
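
The same unpacking step can be done without shell commands, which makes it easier to inspect the archive contents from Python. This is a minimal sketch using the standard `zipfile` module; the function name `unpack_job_package` is hypothetical.

```python
import zipfile
from pathlib import Path


def unpack_job_package(zip_path, dest_dir):
    """Extract every member of the archive into dest_dir and return the sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())
```

Calling `unpack_job_package("HorseCOCO.slp.training_job.zip", ".")` from the project directory would extract the same files as the `!unzip` cell and return their names.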


## Train a model

Let's train a model with the training profile (.json file) and the project data (.slp file) you have exported from SLEAP.


### Note on training profiles
Depending on the pipeline you chose in the training dialog, the config filename(s) will be:

- for a **bottom-up** pipeline approach: `multi_instance.json` (this is the pipeline we assume here),

- for a **top-down** pipeline, you'll have a different profile for each of the models: `centroid.json` and `centered_instance.json`,

- for a **single animal** pipeline: `single_instance.json`.
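
The mapping in the list above can be checked programmatically against the files in an unpacked job package. The helper below is a sketch; `detect_pipeline` is a hypothetical name, and it relies only on the config filenames listed above.

```python
def detect_pipeline(filenames):
    """Map the config files found in a job package to a pipeline name."""
    names = set(filenames)
    # Top-down needs both profiles, one per model.
    if {"centroid.json", "centered_instance.json"} <= names:
        return "top-down"
    if "multi_instance.json" in names:
        return "bottom-up"
    if "single_instance.json" in names:
        return "single animal"
    return None  # no known training profile found


print(detect_pipeline(["centroid.json", "centered_instance.json", "jobs.yaml"]))  # top-down
```

Passing `os.listdir(".")` from the project directory would report which pipeline the exported package was configured for.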


### Note on training process
When you start training, you'll first see the training parameters and then the training and validation loss for each training epoch.

As soon as you're satisfied with the validation loss you see for an epoch during training, you're welcome to stop training by clicking the stop button. The version of the model with the lowest validation loss is saved during training, and that's what will be used for inference.

If you don't stop training, it will run for 200 epochs or until validation loss fails to improve for some number of epochs (controlled by the `early_stopping` fields in the training profile).
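
The stopping rule described above can be illustrated with a few lines of plain Python. This is a simplified sketch of the general idea, not SLEAP's implementation: the function name and the `patience` parameter are assumptions, and real implementations typically also require a minimum improvement, which is omitted here.

```python
def run_with_early_stopping(val_losses, patience, max_epochs=200):
    """Return (epochs_run, best_epoch) for a sequence of per-epoch validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs, or after `max_epochs`, whichever comes first.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_since_improvement = 0
    epochs_run = 0
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        epochs_run = epoch + 1
        if loss < best_loss:
            # New lowest validation loss: this is the model that would be kept.
            best_loss, best_epoch = loss, epoch
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1
            if epochs_since_improvement >= patience:
                break
    return epochs_run, best_epoch


losses = [0.9, 0.5, 0.4, 0.41, 0.42, 0.43, 0.2]
print(run_with_early_stopping(losses, patience=3))  # (6, 2)
```

In this made-up loss sequence, training stops after six epochs and the model from epoch index 2 is the one kept, even though a later epoch would have been better had training continued.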

In [None]:
!sleap-train multi_instance.json colab.pkg.slp

## Train top-down models

In [None]:
!sleap-train /content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/centroid.json /content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/HorseCOCO.pkg.slp

INFO:numexpr.utils:NumExpr defaulting to 2 threads.
INFO:sleap.nn.training:Versions:
SLEAP: 1.3.0
TensorFlow: 2.8.4
Numpy: 1.22.4
Python: 3.10.11
OS: Linux-5.15.107+-x86_64-with-glibc2.31
INFO:sleap.nn.training:Training labels file: /content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/HorseCOCO.pkg.slp
INFO:sleap.nn.training:Training profile: /content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/centroid.json
INFO:sleap.nn.training:
INFO:sleap.nn.training:Arguments:
INFO:sleap.nn.training:{
    "training_job_path": "/content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/centroid.json",
    "labels_path": "/content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/HorseCOCO.pkg.slp",
    "video_paths": [
        ""
    ],
    "val_labels": null,
    "test_labels": null,
    "base_checkpoint": null,
    "tensorboard": false,
    "save_viz": false,
    "zmq": false,
    "run_name": "",
    "prefix": "",
    "suffix": "",
    "cpu": false,
    "first_gpu": false,
    "last_gpu": false,
    "gpu

In [None]:
!sleap-train /content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/centered_instance.json /content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/HorseCOCO.pkg.slp

INFO:numexpr.utils:NumExpr defaulting to 2 threads.
INFO:sleap.nn.training:Versions:
SLEAP: 1.3.0
TensorFlow: 2.8.4
Numpy: 1.22.4
Python: 3.10.11
OS: Linux-5.15.107+-x86_64-with-glibc2.31
INFO:sleap.nn.training:Training labels file: /content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/HorseCOCO.pkg.slp
INFO:sleap.nn.training:Training profile: /content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/centered_instance.json
INFO:sleap.nn.training:
INFO:sleap.nn.training:Arguments:
INFO:sleap.nn.training:{
    "training_job_path": "/content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/centered_instance.json",
    "labels_path": "/content/drive/MyDrive/COCOHorseSleapPose/TrainJopPgk/HorseCOCO.pkg.slp",
    "video_paths": [
        ""
    ],
    "val_labels": null,
    "test_labels": null,
    "base_checkpoint": null,
    "tensorboard": false,
    "save_viz": false,
    "zmq": false,
    "run_name": "",
    "prefix": "",
    "suffix": "",
    "cpu": false,
    "first_gpu": false,
    "last_gpu

If you chose the top-down pipeline (with two training configs) instead of bottom-up, you need to run two separate training jobs in sequence:

- `!sleap-train centroid.json colab.pkg.slp`
- `!sleap-train centered_instance.json colab.pkg.slp`
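
The two jobs can also be chained from Python so that the second one only starts if the first succeeds. This is a sketch under the assumption that `sleap-train` is on the PATH; the helper names `build_train_commands` and `run_in_sequence` are hypothetical.

```python
import subprocess


def build_train_commands(profiles, labels_path):
    """Return one `sleap-train <profile> <labels>` argument list per profile, in order."""
    return [["sleap-train", profile, labels_path] for profile in profiles]


def run_in_sequence(commands):
    """Run each command in turn; check=True aborts the chain if one job fails."""
    for command in commands:
        subprocess.run(command, check=True)


commands = build_train_commands(["centroid.json", "centered_instance.json"], "colab.pkg.slp")
for command in commands:
    print(" ".join(command))
# run_in_sequence(commands)  # uncomment in an environment where sleap-train is installed
```

Keeping command construction separate from execution makes it possible to review the exact commands before starting a long training run.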


## Predicting and tracking instances in uploaded videos

### Inference with top-down models using a single video

If you trained the pair of models needed for top-down inference, you can call `sleap-track` with `-m path/to/model` for each model, like so:

In [None]:
!sleap-track /content/3.mp4 \
    --frames 0-5 \
    --tracking.tracker simple \
    -m /content/drive/MyDrive/COCOHorseSleapPose/models/230513_124413.centered_instance \
    -m /content/drive/MyDrive/COCOHorseSleapPose/models/230513_124413.centroid

INFO:numexpr.utils:NumExpr defaulting to 2 threads.
Started inference at: 2023-05-15 09:51:29.640459
Args:
{
│   'data_path': '/content/3.mp4',
│   'models': [
│   │   '/content/drive/MyDrive/COCOHorseSleapPose/models/230513_124413.centered_instance',
│   │   '/content/drive/MyDrive/COCOHorseSleapPose/models/230513_124413.centroid'
│   ],
│   'frames': '0-5',
│   'only_labeled_frames': False,
│   'only_suggested_frames': False,
│   'output': None,
│   'no_empty_frames': False,
│   'verbosity': 'rich',
│   'video.dataset': None,
│   'video.input_format': 'channels_last',
│   'video.index': '',

### Inference with top-down models using multiple videos

In [None]:
import os
os.chdir("/content/drive/MyDrive/COCOHorseSleapPose")
!unzip /content/drive/MyDrive/COCOHorseSleapPose/Videos.zip
!ls

Archive:  /content/drive/MyDrive/COCOHorseSleapPose/Videos.zip
  inflating: Videos/AHTCDBME.mp4     
  inflating: Videos/AKPNRXGD.mp4     
  inflating: Videos/AKXVOXGD.mp4     
  inflating: Videos/AMVQFXGD.mp4     
  inflating: Videos/AMZSRXGD.mp4     
  inflating: Videos/ANGLMXGD.mp4     
  inflating: Videos/ANMSEBME.mp4     
  inflating: Videos/APZQNXGD.mp4     
  inflating: Videos/AQCYDXGD.mp4     
  inflating: Videos/ARNCVXGD.mp4     
  inflating: Videos/ASHUOMHZ.mp4     
  inflating: Videos/ATSFLXGD.mp4     
  inflating: Videos/BERFBXGD.mp4     
  inflating: Videos/BOEFDXGD.mp4     
  inflating: Videos/BYPHZXGD.mp4     
  inflating: Videos/CZORWBME.mp4     
  inflating: Videos/DGIRLTCN.mp4     
  inflating: Videos/DHMLCBME.mp4     
  inflating: Videos/YYLUPXGD.mp4     
 AKXVOXGD.mp4						    models
 AKXVOXGD.mp4.predictions.slp				    TrainJopPgk
 BERFBXGD.mp4						    Videos
 BERFBXGD.mp4.predictions.slp				    Videos.zip
'Copy of Training_and_inference_using_Google_Drive.ipynb'


In [None]:
import os

def get_video_paths(folder_path):
    video_paths = []
    valid_extensions = ['.mp4', '.avi', '.mkv']  # Add more extensions if needed

    for root, dirs, files in os.walk(folder_path):
        for file in files:
            _, extension = os.path.splitext(file)
            if extension.lower() in valid_extensions:
                video_path = os.path.join(root, file)
                video_paths.append(video_path)

    return video_paths

# Example usage
folder_path = '/content/drive/MyDrive/COCOHorseSleapPose/Videos'  # Replace with the actual folder path

video_paths = get_video_paths(folder_path)
for path in video_paths:
    print(path)


/content/drive/MyDrive/COCOHorseSleapPose/Videos/AHTCDBME.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/AKPNRXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/AKXVOXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/AMVQFXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/AMZSRXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/ANGLMXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/ANMSEBME.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/APZQNXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/AQCYDXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/ARNCVXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/ASHUOMHZ.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/ATSFLXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/BERFBXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/BOEFDXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/BYPHZXGD.mp4
/content/drive/MyDrive/COCOHorseSleapPose/Videos/CZORWBME.mp4
/content
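
The directory listing earlier in this notebook shows files such as `AKXVOXGD.mp4.predictions.slp` next to `AKXVOXGD.mp4`, which suggests that predictions are written to the video path with `.predictions.slp` appended when no output path is given. Assuming that naming holds (it is inferred from the listing, not confirmed here), the sketch below filters out videos that already have predictions so that a restarted session does not redo them. Both function names are hypothetical.

```python
import os


def predictions_path(video_path):
    """Assumed output name: the video path with '.predictions.slp' appended."""
    return video_path + ".predictions.slp"


def pending_videos(video_paths, exists=os.path.exists):
    """Keep only the videos that do not yet have a predictions file."""
    return [path for path in video_paths if not exists(predictions_path(path))]


# Illustration with a fake file system: only a.mp4 has predictions already.
done = {"/videos/a.mp4.predictions.slp"}
print(pending_videos(["/videos/a.mp4", "/videos/b.mp4"], exists=done.__contains__))
```

In the notebook itself, `pending_videos(video_paths)` would use the real file system check.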

In [None]:
import cv2

def get_frame_count(video_path):
    cap = cv2.VideoCapture(video_path)
    frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.release()
    return frame_count
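
A frame count of zero or less, which OpenCV can report when a file cannot be opened, would turn into an invalid range such as `0--1` once 1 is subtracted. The sketch below guards against that and assembles the `sleap-track` arguments for one video as a list, using only the flags that appear in this notebook. The function names and the shortened model paths are hypothetical.

```python
def frames_argument(frame_count):
    """Turn a frame count into the inclusive range string passed to --frames."""
    if frame_count <= 0:
        raise ValueError(f"no readable frames (count={frame_count})")
    return f"0-{frame_count - 1}"


def build_track_command(video_path, frame_count, model_dirs, tracker="simple"):
    """Assemble the sleap-track argument list for one video."""
    command = [
        "sleap-track", video_path,
        "--frames", frames_argument(frame_count),
        "--tracking.tracker", tracker,
    ]
    # One -m flag per trained model (centroid and centered instance for top-down).
    for model_dir in model_dirs:
        command += ["-m", model_dir]
    return command


# Placeholder paths for illustration only.
print(build_track_command("Videos/AHTCDBME.mp4", 104,
                          ["models/centroid", "models/centered_instance"]))
```

Such a list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting problems with paths that contain spaces.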

In [None]:
for video in video_paths:
    # --frames takes an inclusive range, so the last index is the frame count minus 1.
    last_frame = get_frame_count(video) - 1
    !sleap-track $video \
        --frames 0-$last_frame \
        --tracking.tracker simple \
        -m /content/drive/MyDrive/COCOHorseSleapPose/models/230513_124413.centered_instance \
        -m /content/drive/MyDrive/COCOHorseSleapPose/models/230513_124413.centroid

INFO:numexpr.utils:NumExpr defaulting to 2 threads.
Started inference at: 2023-05-18 07:31:12.073084
Args:
{
│   'data_path': '/content/drive/MyDrive/COCOHorseSleapPose/Videos/AHTCDBME.mp4',
│   'models': [
│   │   '/content/drive/MyDrive/COCOHorseSleapPose/models/230513_124413.centered_instance',
│   │   '/content/drive/MyDrive/COCOHorseSleapPose/models/230513_124413.centroid'
│   ],
│   'frames': '0-103',
│   'only_labeled_frames': False,
│   'only_suggested_frames': False,
│   'output': None,
│   'no_empty_frames': False,
│   'verbosity': 'rich',
│   'video.dataset': None,
│   'video.input_format': 'channels_last',