# Training and inference on an example dataset

In this notebook we'll install SLEAP, download a sample dataset, run training and inference on that dataset using the SLEAP command-line interface, and then download the predictions.

## Install SLEAP
Note: Before installing SLEAP, check the [SLEAP releases](https://github.com/talmolab/sleap/releases) page for the latest version.

In [1]:
!pip uninstall -qqq -y opencv-python opencv-contrib-python
!pip install -qqq "sleap[pypi]>=1.3.3"


## Download sample training data into Colab
Let's download a sample dataset from the SLEAP [sample datasets repository](https://github.com/talmolab/sleap-datasets) into Colab.

In [2]:
!apt-get install tree
!wget -O dataset.zip https://github.com/talmolab/sleap-datasets/releases/download/dm-courtship-v1/drosophila-melanogaster-courtship.zip
!mkdir dataset
!unzip dataset.zip -d dataset
!rm dataset.zip
!tree dataset

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  tree
0 upgraded, 1 newly installed, 0 to remove and 45 not upgraded.
Need to get 47.9 kB of archives.
After this operation, 116 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 tree amd64 2.0.2-1 [47.9 kB]
Fetched 47.9 kB in 1s (84.2 kB/s)
Selecting previously unselected package tree.
(Reading database ... 121918 files and directories currently installed.)
Preparing to unpack .../tree_2.0.2-1_amd64.deb ...
Unpacking tree (2.0.2-1) ...
Setting up tree (2.0.2-1) ...
Processing triggers for man-db (2.10.2-1) ...
--2024-05-14 11:56:02--  https://github.com/talmolab/sleap-datasets/releases/download/dm-courtship-v1/drosophila-melanogaster-courtship.zip
Resolving github.com (github.com)... 140.82.112.4
Connecting to github.com (github.com)|140.82.112.4|:443... connected.
HTTP request sent, awaiting

## Train models
For the top-down pipeline, we'll need to train two models: a centroid model and a centered-instance model.

Using the command-line interface, we'll first train a model for centroids using the default **training profile**. The training profile determines the model architecture, the learning rate, and other parameters.
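Because a training profile is a JSON file, you can peek at what it configures before training. The helper below is a minimal sketch, not part of SLEAP: `summarize_profile` is a hypothetical name, and it only assumes that the profile is a JSON object whose top-level sections are themselves objects.

```python
import json
from pathlib import Path


def summarize_profile(profile_path):
    """Return a dict mapping each top-level section to its sorted sub-keys.

    Hypothetical helper for illustration; it makes no assumption about which
    sections a SLEAP profile actually contains.
    """
    config = json.loads(Path(profile_path).read_text())
    summary = {}
    for section, value in config.items():
        # Only dict-valued sections have sub-keys worth listing.
        summary[section] = sorted(value) if isinstance(value, dict) else []
    return summary
```

Pass it the path that `sleap-train` prints on its `Training profile:` log line, then look through the result for the architecture and learning-rate settings.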

When you start training, you'll first see the training parameters and then the training and validation loss for each training epoch.

As soon as you're satisfied with the validation loss you see for an epoch during training, you're welcome to stop training by clicking the stop button. The version of the model with the lowest validation loss is saved during training, and that's what will be used for inference.

If you don't stop training, it will run for 200 epochs or until validation loss fails to improve for some number of epochs (controlled by the `early_stopping` fields in the training profile).
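To make the stopping rule concrete, here is a small simulation of patience-based early stopping combined with keeping the best model. It is a sketch of the general technique, not SLEAP's implementation: the function and the `patience` and `min_delta` parameter names are assumptions chosen for illustration, and the real field names live under `early_stopping` in the training profile.

```python
def simulate_training(val_losses, patience, min_delta=0.0, max_epochs=200):
    """Return (best_epoch, stop_epoch) for per-epoch validation losses.

    Epochs are 1-indexed. `best_epoch` is the epoch whose model would be kept;
    `stop_epoch` is the last epoch that actually ran.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    stop_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        stop_epoch = epoch
        if loss < best_loss - min_delta:
            # New lowest validation loss: this is the model that gets saved.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # Validation loss has plateaued for `patience` epochs.
    return best_epoch, stop_epoch


# Made-up losses: improvement stalls after epoch 3, so training stops at epoch 6.
print(simulate_training([1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.6], patience=3))  # (3, 6)
```

With a larger `patience`, the same run would reach the later improvement at epoch 7. That is the trade-off this setting controls.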

In [None]:
!sleap-train baseline.centroid.json "dataset/drosophila-melanogaster-courtship/courtship_labels.slp" --run_name "courtship.centroid" --video-paths "dataset/drosophila-melanogaster-courtship/20190128_113421.mp4"

INFO:numexpr.utils:NumExpr defaulting to 2 threads.
2024-05-14 11:56:10.045743: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.10/dist-packages/cv2/../../lib64:/usr/lib64-nvidia
2024-05-14 11:56:10.045778: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
INFO:sleap.nn.training:Versions:
SLEAP: 1.4.0
TensorFlow: 2.8.4
Numpy: 1.22.4
Python: 3.10.12
OS: Linux-6.1.58+-x86_64-with-glibc2.35
INFO:sleap.nn.training:Training labels file: dataset/drosophila-melanogaster-courtship/courtship_labels.slp
INFO:sleap.nn.training:Training profile: /usr/local/lib/python3.10/dist-packages/sleap/training_profiles/baseline.centroid.json
INFO:sleap.nn.training:
INFO:sleap.nn.training:Arguments:
INFO:sleap.nn.training:{
    "training_j

Let's now train a centered-instance model.

In [None]:
!sleap-train baseline_medium_rf.topdown.json "dataset/drosophila-melanogaster-courtship/courtship_labels.slp" --run_name "courtship.topdown_confmaps" --video-paths "dataset/drosophila-melanogaster-courtship/20190128_113421.mp4"

The models (along with the profiles and ground truth data used to train and validate the model) are saved in the `models/` directory:

In [None]:
!tree models/
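If you would rather inspect the saved runs from Python than from `tree`, a sketch like the following works with the standard library alone. `list_model_runs` is a hypothetical helper; it assumes only that each run is a subfolder of `models/`, and it makes no claim about which files SLEAP writes there.

```python
from pathlib import Path


def list_model_runs(models_dir="models"):
    """Map each run folder under `models_dir` to the sorted file names inside it."""
    root = Path(models_dir)
    runs = {}
    for run_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Collect only regular files directly inside the run folder.
        runs[run_dir.name] = sorted(f.name for f in run_dir.iterdir() if f.is_file())
    return runs
```

After both training cells have finished, `list_model_runs()` should contain one entry per `--run_name` used above.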

## Inference
Let's run inference with our trained models for centroids and centered instances.

In [None]:
!sleap-track "dataset/drosophila-melanogaster-courtship/20190128_113421.mp4" --frames 0-100 -m "models/courtship.centroid" -m "models/courtship.topdown_confmaps"
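If you want to script inference over several videos, it can help to assemble the same command in Python. This is a minimal sketch: `build_track_command` is a hypothetical helper, and it uses only the `--frames` and `-m` options that appear in the cell above.

```python
import shlex


def build_track_command(video_path, model_dirs, frames=None):
    """Build the argument list for a sleap-track call like the one above."""
    args = ["sleap-track", video_path]
    if frames is not None:
        args += ["--frames", frames]
    for model_dir in model_dirs:
        # One -m flag per trained model (centroid and centered-instance here).
        args += ["-m", model_dir]
    return args


command = build_track_command(
    "dataset/drosophila-melanogaster-courtship/20190128_113421.mp4",
    ["models/courtship.centroid", "models/courtship.topdown_confmaps"],
    frames="0-100",
)
print(shlex.join(command))  # Shows the command as a single shell-style string.
```

The list can then be passed to `subprocess.run(command, check=True)`, which avoids shell quoting problems with unusual file names.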

When inference is finished, predictions are saved in a file. Since we didn't specify a path, it will be saved as `<video filename>.predictions.slp` in the same directory as the video:
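The naming rule described above can be written as a one-line path computation. This is a sketch under that stated rule only; `default_predictions_path` is a hypothetical helper, not a SLEAP function.

```python
from pathlib import Path


def default_predictions_path(video_path):
    """Append `.predictions.slp` to the video file name, keeping its directory."""
    video = Path(video_path)
    return video.with_name(video.name + ".predictions.slp")


print(default_predictions_path(
    "dataset/drosophila-melanogaster-courtship/20190128_113421.mp4"
))
```

Note that the video's own `.mp4` extension is kept, so the result ends in `.mp4.predictions.slp`.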

In [None]:
!tree dataset/drosophila-melanogaster-courtship

You can inspect your predictions file using `sleap-inspect`:

In [None]:
!sleap-inspect dataset/drosophila-melanogaster-courtship/20190128_113421.mp4.predictions.slp

If you're using Chrome, you can download your trained models like so:

In [None]:
# Zip up the models directory
!zip -r trained_models.zip models/

# Download.
from google.colab import files
files.download("/content/trained_models.zip")

And you can likewise download your predictions:

In [None]:
from google.colab import files
files.download('dataset/drosophila-melanogaster-courtship/20190128_113421.mp4.predictions.slp')

In some other browsers (such as Safari), you might get an error. In that case, download the files from the "Files" tab in the side panel (it has a folder icon). If you don't see the side panel, select "Show table of contents" in the "View" menu.