<a href="https://colab.research.google.com/github/talmolab/sleap/blob/main/docs/notebooks/Interactive_and_resumable_training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Interactive and resumable training

Most of the time, you will be training models through the GUI or using the [`sleap-train` CLI](https://sleap.ai/guides/cli.html#sleap-train).

If you'd like to customize the training process, however, you can use SLEAP's low-level training functionality interactively. This allows you to define scripts that train models according to your own workflow, for example, to **resume training** on an already trained model. Another possible application would be to train a model using **transfer learning**, where a pretrained model can be used to initialize the weights of the new model.

In this notebook we will explore how to set up a training job and train a model for multiple rounds without the GUI or CLI.

## 1. Set up SLEAP

Run this cell first to install SLEAP. If you get a dependency error in subsequent cells, just click **Runtime** → **Restart runtime** to reload the packages.

Don't forget to set **Runtime** → **Change runtime type** → **GPU** as the accelerator.

In [1]:
# This should take care of all the dependencies on colab:
!pip uninstall -y opencv-python opencv-contrib-python && pip install sleap[pypi]


# But to do it locally, we'd recommend the conda package (available on Windows + Linux):
# conda create -n sleap -c sleap -c conda-forge -c nvidia sleap

Found existing installation: opencv-python 4.5.5.64
Uninstalling opencv-python-4.5.5.64:
  Successfully uninstalled opencv-python-4.5.5.64
Collecting opencv-python<=4.6.0,>=4.2.0 (from sleap[pypi])
  Using cached opencv_python-4.5.5.64-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (60.5 MB)
Installing collected packages: opencv-python
Successfully installed opencv-python-4.5.5.64


Import SLEAP to make sure it installed correctly and print out some information about the system:

In [2]:
import sleap
sleap.versions()
sleap.system_summary()

2023-08-31 12:14:25.388813: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-08-31 12:14:25.453889: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-08-31 12:14:25.455792: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/talmolab/micromamba/envs/sleap_jupyter/lib/python3.7/site-packages/cv2/../../

SLEAP: 1.3.2
TensorFlow: 2.11.0
Numpy: 1.21.6
Python: 3.7.12
OS: Linux-5.15.0-78-generic-x86_64-with-debian-bookworm-sid
GPUs: None detected.


2023-08-31 12:14:26.597020: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-08-31 12:14:26.597822: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/talmolab/micromamba/envs/sleap_jupyter/lib/python3.7/site-packages/cv2/../../lib64:
2023-08-31 12:14:26.597859: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/talmolab/micromamba/envs/sleap_jupyter/lib/python3.7/site-packages/cv2/../../lib64:
2023-08-31 12:14:26.597888: W tensorflow/compiler/xla/stream_executor

## 2. Set up training data

Here we will download an existing training dataset package. This is an `.slp` file that contains both the labeled poses and the image data for the labeled frames.

If running on Google Colab, you'll want to replace this with mounting your Google Drive folder containing your own data, or if running locally, simply change the path to your labels below in `TRAINING_SLP_FILE`.
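As a rough sketch of the "change the path" step, the helper below picks the first labels file that actually exists from a list of candidate locations. The function name `pick_labels_file` and the candidate paths are hypothetical and not part of SLEAP; the demo uses a throwaway temporary file so it runs anywhere.

```python
import tempfile
from pathlib import Path


def pick_labels_file(candidates):
    """Return the first candidate path that points to an existing file.

    Hypothetical helper for illustration only; it is not part of SLEAP.
    """
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return str(path)
    raise FileNotFoundError(f"None of these label files exist: {list(candidates)}")


# Demo with a throwaway file standing in for a real labels package.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_file = Path(tmp_dir) / "labels.pkg.slp"
    demo_file.touch()
    chosen = pick_labels_file([Path(tmp_dir) / "missing.pkg.slp", demo_file])
    found_demo_file = chosen == str(demo_file)

print(found_demo_file)
```

In a real session you might pass something like a Google Drive location followed by a local fallback, and assign the result to `TRAINING_SLP_FILE`. Failing early with `FileNotFoundError` is usually easier to debug than a missing-file error raised deep inside training setup.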

In [3]:
# !curl -L --output labels.pkg.slp https://www.dropbox.com/s/b990gxjt3d3j3jh/210205.sleap_wt_gold.13pt.pkg.slp?dl=1
!curl -L --output labels.pkg.slp https://storage.googleapis.com/sleap-data/datasets/wt_gold.13pt/tracking_split2/train.pkg.slp
!ls -lah

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  619M  100  619M    0     0  45.7M      0  0:00:13  0:00:13 --:--:-- 53.0M
total 779M
drwxrwxr-x 3 talmolab talmolab 4.0K Aug 31 12:14  .
drwxrwxr-x 7 talmolab talmolab 4.0K Aug 31 11:39  ..
-rw-rw-r-- 1 talmolab talmolab  82M May 20  2021  190719_090330_wt_18159206_rig1.2@15000-17560.mp4.1
-rw-rw-r-- 1 talmolab talmolab 1.6M May 20  2021  190719_090330_wt_18159206_rig1.2@15000-17560.slp
-rw-rw-r-- 1 talmolab talmolab 1.6M May 20  2021  190719_090330_wt_18159206_rig1.2@15000-17560.slp.1
drwxrwxr-x 2 talmolab talmolab 4.0K Jun 20 10:00  analysis_example
-rw-rw-r-- 1 talmolab talmolab 713K Jun 20 10:00  Analysis_examples.ipynb
-rw-rw-r-- 1 talmolab talmolab 6.1M May 20  2021 'centroid.fast.210504_182918.centroid.n=1800.zip'
-rw-rw-r-- 1 talmolab talmolab 6.1M May 20  2021 'centroid.fast.210504_182918.centroid.n=1800.zip.1'
-rw-r

In [4]:
TRAINING_SLP_FILE = "labels.pkg.slp"

## 3. Set up training job

A SLEAP `TrainingJobConfig` is a structure that contains all of the hyperparameters needed to train a SLEAP model. This is typically saved out to `initial_config.json` and `training_config.json` in the model folder so that training runs can be reproduced if needed, as well as to store metadata necessary for inference.
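To make the idea of a saved configuration concrete, here is a minimal sketch that writes a nested dictionary to JSON and reads individual settings back using only the standard library. The dictionary is a hypothetical, heavily abbreviated stand-in that mirrors only the attributes set later in this notebook; a real `training_config.json` contains many more fields, and `get_nested` is an illustrative helper, not a SLEAP function.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical, abbreviated stand-in for a saved training config.
# Only fields that this notebook sets explicitly are included.
example_config = {
    "data": {
        "labels": {
            "training_labels": "labels.pkg.slp",
            "validation_fraction": 0.1,
        }
    },
    "optimization": {"epochs": 10},
    "outputs": {"run_name": "baseline_model.topdown"},
}


def get_nested(config, dotted_key):
    """Look up a value such as 'optimization.epochs' in a nested dict."""
    value = config
    for part in dotted_key.split("."):
        value = value[part]
    return value


# Round-trip the config through a JSON file, as a model folder would store it.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "training_config.json"
    config_path.write_text(json.dumps(example_config, indent=2))
    loaded = json.loads(config_path.read_text())

epochs = get_nested(loaded, "optimization.epochs")
run_name = get_nested(loaded, "outputs.run_name")
print(epochs, run_name)
```

Because the file is plain JSON, it can be inspected or compared between runs with any JSON tool, which is what makes a run reproducible from its model folder.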

Normally, these are generated interactively by the GUI, or manually by editing an existing JSON file in a text editor. Here, we will define a configuration interactively entirely in Python.

In [5]:
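# Import only the config classes used below instead of a wildcard import.
from sleap.nn.config import (
    TrainingJobConfig,
    UNetConfig,
    CenteredInstanceConfmapsHeadConfig,
)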

# Initialize the default training job configuration.
cfg = TrainingJobConfig()

# Update path to training data we just downloaded.
cfg.data.labels.training_labels = TRAINING_SLP_FILE
cfg.data.labels.validation_fraction = 0.1

# Preprocessing and training parameters.
cfg.data.instance_cropping.center_on_part = "thorax"
cfg.optimization.augmentation_config.rotate = True
cfg.optimization.epochs = 10  # Maximum number of training epochs (early stopping may end training sooner).

# These configure the actual neural network and the model type:
cfg.model.backbone.unet = UNetConfig(
    filters=16,
    output_stride=4
)
cfg.model.heads.centered_instance = CenteredInstanceConfmapsHeadConfig(
    anchor_part="thorax",
    sigma=1.5,
    output_stride=4
)

# Set up how we want to save the trained model.
cfg.outputs.run_name = "baseline_model.topdown"

Existing configs can also be loaded from a `.json` file with:

```python
cfg = sleap.load_config("training_config.json")
```

## 4. Training
Next we will create a SLEAP `Trainer` from the configuration we just specified. This handles all the nitty-gritty mechanics necessary to set up training in the backend.

In [6]:
trainer = sleap.nn.training.Trainer.from_config(cfg)

INFO:sleap.nn.training:Loading training labels from: labels.pkg.slp
INFO:sleap.nn.training:Creating training and validation splits from validation fraction: 0.1
INFO:sleap.nn.training:  Splits: Training = 1440 / Validation = 160.
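
The split sizes in the log follow from the validation fraction. The sketch below reproduces the arithmetic, assuming 1600 labeled frames in total (inferred from 1440 + 160) and simple rounding. It is not SLEAP's actual splitting code, which also decides which frames land in each split.

```python
def split_counts(n_labeled_frames, validation_fraction):
    """Return (n_train, n_val) for a given validation fraction.

    Illustrative arithmetic only; the rounding rule is an assumption.
    """
    n_val = round(n_labeled_frames * validation_fraction)
    n_train = n_labeled_frames - n_val
    return n_train, n_val


n_train, n_val = split_counts(1600, 0.1)
print(f"Training = {n_train} / Validation = {n_val}")
```

Raising `validation_fraction` gives a more reliable validation loss at the cost of fewer training examples.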


Great, now we're ready to do the first round of training. This is when the model will actually start to improve over time:

In [7]:
trainer.train()

INFO:sleap.nn.training:Setting up for training...
INFO:sleap.nn.training:Setting up pipeline builders...
INFO:sleap.nn.training:Setting up model...
INFO:sleap.nn.training:Building test pipeline...


INFO:sleap.nn.training:Loaded test example. [1.487s]
INFO:sleap.nn.training:  Input shape: (160, 160, 1)
INFO:sleap.nn.training:Created Keras model.
INFO:sleap.nn.training:  Backbone: UNet(stacks=1, filters=16, filters_rate=2, kernel_size=3, stem_kernel_size=7, convs_per_block=2, stem_blocks=0, down_blocks=4, middle_block=True, up_blocks=2, up_interpolate=False, block_contraction=False)
INFO:sleap.nn.training:  Max stride: 16
INFO:sleap.nn.training:  Parameters: 2,101,501
INFO:sleap.nn.training:  Heads: 
INFO:sleap.nn.training:    [0] = CenteredInstanceConfmapsHead(part_names=['head', 'thorax', 'abdomen', 'wingL', 'wingR', 'forelegL4', 'forelegR4', 'midlegL4', 'midlegR4', 'hindlegL4', 'hindlegR4', 'eyeL', 'eyeR'], anchor_part='thorax', sigma=1.5, output_stride=4, loss_weight=1.0)
INFO:sleap.nn.training:  Outputs: 
INFO:sleap.nn.training:    [0] = KerasTensor(type_spec=TensorSpec(shape=(None, 40, 40, 13), dtype=tf.float32, name=None), name='CenteredInstanceConfmapsHead/BiasAdd:0', descr

2023-08-31 12:14:44.410509: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:690] Error in PredictCost() for the op: op: "CropAndResize" attr { key: "T" value { type: DT_FLOAT } } attr { key: "extrapolation_value" value { f: 0 } } attr { key: "method" value { s: "bilinear" } } inputs { dtype: DT_FLOAT shape { dim { size: 1 } dim { size: 1024 } dim { size: 1024 } dim { size: 1 } } } inputs { dtype: DT_FLOAT shape { dim { size: -2 } dim { size: 4 } } } inputs { dtype: DT_INT32 shape { dim { size: -2 } } } inputs { dtype: DT_INT32 shape { dim { size: 2 } } } device { type: "CPU" vendor: "GenuineIntel" model: "103" frequency: 3600 num_cores: 16 environment { key: "cpu_instruction_set" value: "AVX SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2" } environment { key: "eigen" value: "3.4.90" } l1_cache_size: 49152 l2_cache_size: 524288 l3_cache_size: 16777216 memory_size: 268435456 } outputs { dtype: DT_FLOAT shape { dim { size: -2 } dim { size: 160 } dim { size: 160 } dim { size: 1 } } }


INFO:sleap.nn.training:  Learning rate schedule: LearningRateScheduleConfig(reduce_on_plateau=True, reduction_factor=0.5, plateau_min_delta=1e-06, plateau_patience=5, plateau_cooldown=3, min_learning_rate=1e-08)
INFO:sleap.nn.training:  Early stopping: EarlyStoppingConfig(stop_training_on_plateau=True, plateau_min_delta=1e-06, plateau_patience=10)
INFO:sleap.nn.training:Setting up outputs...
INFO:sleap.nn.training:Created run path: models/baseline_model.topdown
INFO:sleap.nn.training:Setting up visualization...


INFO:sleap.nn.training:Finished trainer set up. [3.3s]
INFO:sleap.nn.training:Creating tf.data.Datasets for training data generation...


INFO:sleap.nn.training:Finished creating training datasets. [14.3s]
INFO:sleap.nn.training:Starting training loop...
Epoch 1/10


360/360 - 30s - loss: 0.0037 - head: 0.0031 - thorax: 0.0031 - abdomen: 0.0037 - wingL: 0.0041 - wingR: 0.0041 - forelegL4: 0.0038 - forelegR4: 0.0038 - midlegL4: 0.0041 - midlegR4: 0.0041 - hindlegL4: 0.0039 - hindlegR4: 0.0040 - eyeL: 0.0035 - eyeR: 0.0035 - val_loss: 0.0033 - val_head: 0.0020 - val_thorax: 0.0027 - val_abdomen: 0.0031 - val_wingL: 0.0037 - val_wingR: 0.0038 - val_forelegL4: 0.0035 - val_forelegR4: 0.0036 - val_midlegL4: 0.0039 - val_midlegR4: 0.0039 - val_hindlegL4: 0.0038 - val_hindlegR4: 0.0039 - val_eyeL: 0.0025 - val_eyeR: 0.0026 - lr: 1.0000e-04 - 30s/epoch - 84ms/step
Epoch 2/10


360/360 - 28s - loss: 0.0028 - head: 0.0013 - thorax: 0.0019 - abdomen: 0.0027 - wingL: 0.0031 - wingR: 0.0031 - forelegL4: 0.0032 - forelegR4: 0.0033 - midlegL4: 0.0038 - midlegR4: 0.0038 - hindlegL4: 0.0037 - hindlegR4: 0.0038 - eyeL: 0.0014 - eyeR: 0.0015 - val_loss: 0.0024 - val_head: 8.4298e-04 - val_thorax: 9.6129e-04 - val_abdomen: 0.0023 - val_wingL: 0.0024 - val_wingR: 0.0024 - val_forelegL4: 0.0030 - val_forelegR4: 0.0031 - val_midlegL4: 0.0037 - val_midlegR4: 0.0037 - val_hindlegL4: 0.0036 - val_hindlegR4: 0.0037 - val_eyeL: 9.3550e-04 - val_eyeR: 9.1799e-04 - lr: 1.0000e-04 - 28s/epoch - 78ms/step
Epoch 3/10


360/360 - 31s - loss: 0.0022 - head: 7.5919e-04 - thorax: 6.8924e-04 - abdomen: 0.0021 - wingL: 0.0021 - wingR: 0.0021 - forelegL4: 0.0027 - forelegR4: 0.0029 - midlegL4: 0.0036 - midlegR4: 0.0034 - hindlegL4: 0.0034 - hindlegR4: 0.0034 - eyeL: 8.6820e-04 - eyeR: 8.5480e-04 - val_loss: 0.0020 - val_head: 7.0900e-04 - val_thorax: 4.8059e-04 - val_abdomen: 0.0020 - val_wingL: 0.0019 - val_wingR: 0.0019 - val_forelegL4: 0.0025 - val_forelegR4: 0.0027 - val_midlegL4: 0.0034 - val_midlegR4: 0.0032 - val_hindlegL4: 0.0032 - val_hindlegR4: 0.0031 - val_eyeL: 7.3284e-04 - val_eyeR: 7.2117e-04 - lr: 1.0000e-04 - 31s/epoch - 85ms/step
Epoch 4/10


360/360 - 29s - loss: 0.0019 - head: 6.4171e-04 - thorax: 4.4359e-04 - abdomen: 0.0019 - wingL: 0.0018 - wingR: 0.0018 - forelegL4: 0.0023 - forelegR4: 0.0025 - midlegL4: 0.0030 - midlegR4: 0.0028 - hindlegL4: 0.0030 - hindlegR4: 0.0029 - eyeL: 7.3767e-04 - eyeR: 6.9961e-04 - val_loss: 0.0018 - val_head: 6.7740e-04 - val_thorax: 5.4211e-04 - val_abdomen: 0.0018 - val_wingL: 0.0016 - val_wingR: 0.0017 - val_forelegL4: 0.0022 - val_forelegR4: 0.0023 - val_midlegL4: 0.0025 - val_midlegR4: 0.0028 - val_hindlegL4: 0.0029 - val_hindlegR4: 0.0028 - val_eyeL: 7.1843e-04 - val_eyeR: 7.0428e-04 - lr: 1.0000e-04 - 29s/epoch - 80ms/step
Epoch 5/10


360/360 - 30s - loss: 0.0016 - head: 5.9246e-04 - thorax: 3.7461e-04 - abdomen: 0.0017 - wingL: 0.0016 - wingR: 0.0016 - forelegL4: 0.0020 - forelegR4: 0.0021 - midlegL4: 0.0022 - midlegR4: 0.0021 - hindlegL4: 0.0026 - hindlegR4: 0.0025 - eyeL: 6.7904e-04 - eyeR: 6.4366e-04 - val_loss: 0.0015 - val_head: 5.8806e-04 - val_thorax: 3.3475e-04 - val_abdomen: 0.0016 - val_wingL: 0.0015 - val_wingR: 0.0015 - val_forelegL4: 0.0019 - val_forelegR4: 0.0019 - val_midlegL4: 0.0019 - val_midlegR4: 0.0020 - val_hindlegL4: 0.0024 - val_hindlegR4: 0.0023 - val_eyeL: 6.0869e-04 - val_eyeR: 6.2904e-04 - lr: 1.0000e-04 - 30s/epoch - 83ms/step
Epoch 6/10


360/360 - 29s - loss: 0.0014 - head: 5.3149e-04 - thorax: 3.2555e-04 - abdomen: 0.0015 - wingL: 0.0014 - wingR: 0.0014 - forelegL4: 0.0018 - forelegR4: 0.0019 - midlegL4: 0.0018 - midlegR4: 0.0018 - hindlegL4: 0.0022 - hindlegR4: 0.0021 - eyeL: 6.1009e-04 - eyeR: 5.9724e-04 - val_loss: 0.0013 - val_head: 5.5862e-04 - val_thorax: 3.7345e-04 - val_abdomen: 0.0014 - val_wingL: 0.0012 - val_wingR: 0.0013 - val_forelegL4: 0.0018 - val_forelegR4: 0.0017 - val_midlegL4: 0.0018 - val_midlegR4: 0.0018 - val_hindlegL4: 0.0022 - val_hindlegR4: 0.0021 - val_eyeL: 6.0660e-04 - val_eyeR: 5.6585e-04 - lr: 1.0000e-04 - 29s/epoch - 80ms/step
Epoch 7/10


360/360 - 29s - loss: 0.0013 - head: 4.8186e-04 - thorax: 2.8904e-04 - abdomen: 0.0014 - wingL: 0.0013 - wingR: 0.0012 - forelegL4: 0.0017 - forelegR4: 0.0017 - midlegL4: 0.0016 - midlegR4: 0.0016 - hindlegL4: 0.0020 - hindlegR4: 0.0020 - eyeL: 5.7082e-04 - eyeR: 5.4804e-04 - val_loss: 0.0012 - val_head: 5.3110e-04 - val_thorax: 2.4881e-04 - val_abdomen: 0.0013 - val_wingL: 0.0012 - val_wingR: 0.0012 - val_forelegL4: 0.0016 - val_forelegR4: 0.0017 - val_midlegL4: 0.0015 - val_midlegR4: 0.0016 - val_hindlegL4: 0.0020 - val_hindlegR4: 0.0019 - val_eyeL: 5.4547e-04 - val_eyeR: 5.2963e-04 - lr: 1.0000e-04 - 29s/epoch - 80ms/step
Epoch 8/10


360/360 - 30s - loss: 0.0012 - head: 4.4063e-04 - thorax: 2.4833e-04 - abdomen: 0.0013 - wingL: 0.0011 - wingR: 0.0011 - forelegL4: 0.0016 - forelegR4: 0.0016 - midlegL4: 0.0014 - midlegR4: 0.0015 - hindlegL4: 0.0018 - hindlegR4: 0.0018 - eyeL: 5.2887e-04 - eyeR: 5.0335e-04 - val_loss: 0.0011 - val_head: 4.2716e-04 - val_thorax: 2.5146e-04 - val_abdomen: 0.0013 - val_wingL: 0.0011 - val_wingR: 0.0011 - val_forelegL4: 0.0015 - val_forelegR4: 0.0014 - val_midlegL4: 0.0014 - val_midlegR4: 0.0014 - val_hindlegL4: 0.0018 - val_hindlegR4: 0.0019 - val_eyeL: 5.5738e-04 - val_eyeR: 4.7800e-04 - lr: 1.0000e-04 - 30s/epoch - 82ms/step
Epoch 9/10


360/360 - 29s - loss: 0.0011 - head: 4.2217e-04 - thorax: 2.3167e-04 - abdomen: 0.0012 - wingL: 0.0010 - wingR: 0.0010 - forelegL4: 0.0015 - forelegR4: 0.0015 - midlegL4: 0.0013 - midlegR4: 0.0014 - hindlegL4: 0.0017 - hindlegR4: 0.0017 - eyeL: 4.9500e-04 - eyeR: 4.8492e-04 - val_loss: 0.0010 - val_head: 4.3320e-04 - val_thorax: 1.8593e-04 - val_abdomen: 0.0011 - val_wingL: 9.9685e-04 - val_wingR: 9.9268e-04 - val_forelegL4: 0.0015 - val_forelegR4: 0.0014 - val_midlegL4: 0.0013 - val_midlegR4: 0.0013 - val_hindlegL4: 0.0017 - val_hindlegR4: 0.0017 - val_eyeL: 4.6401e-04 - val_eyeR: 4.8465e-04 - lr: 1.0000e-04 - 29s/epoch - 80ms/step
Epoch 10/10


360/360 - 29s - loss: 0.0010 - head: 3.9684e-04 - thorax: 2.0403e-04 - abdomen: 0.0011 - wingL: 9.6304e-04 - wingR: 9.5580e-04 - forelegL4: 0.0014 - forelegR4: 0.0014 - midlegL4: 0.0012 - midlegR4: 0.0013 - hindlegL4: 0.0016 - hindlegR4: 0.0016 - eyeL: 4.6837e-04 - eyeR: 4.5866e-04 - val_loss: 9.8112e-04 - val_head: 4.0969e-04 - val_thorax: 1.7037e-04 - val_abdomen: 0.0011 - val_wingL: 9.0576e-04 - val_wingR: 9.1009e-04 - val_forelegL4: 0.0014 - val_forelegR4: 0.0014 - val_midlegL4: 0.0012 - val_midlegR4: 0.0012 - val_hindlegL4: 0.0016 - val_hindlegR4: 0.0016 - val_eyeL: 4.2912e-04 - val_eyeR: 4.2165e-04 - lr: 1.0000e-04 - 29s/epoch - 81ms/step
INFO:sleap.nn.training:Finished training loop. [4.9 min]
INFO:sleap.nn.training:Deleting visualization directory: models/baseline_model.topdown/viz
INFO:sleap.nn.training:Saving evaluation metrics to model folder...


Output()

2023-08-31 12:19:54.971852: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:690] Error in PredictCost() for the op: op: "CropAndResize" attr { key: "T" value { type: DT_UINT8 } } attr { key: "extrapolation_value" value { f: 0 } } attr { key: "method" value { s: "bilinear" } } inputs { dtype: DT_UINT8 shape { dim { size: 4 } dim { size: 1024 } dim { size: 1024 } dim { size: 1 } } } inputs { dtype: DT_FLOAT shape { dim { size: -2 } dim { size: 4 } } } inputs { dtype: DT_INT32 shape { dim { size: -2 } } } inputs { dtype: DT_INT32 shape { dim { size: 2 } } } device { type: "CPU" vendor: "GenuineIntel" model: "103" frequency: 3600 num_cores: 16 environment { key: "cpu_instruction_set" value: "AVX SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2" } environment { key: "eigen" value: "3.4.90" } l1_cache_size: 49152 l2_cache_size: 524288 l3_cache_size: 16777216 memory_size: 268435456 } outputs { dtype: DT_FLOAT shape { dim { size: -2 } dim { size: -29 } dim { size: -30 } dim { size: 1 } } }

INFO:sleap.nn.evals:Saved predictions: models/baseline_model.topdown/labels_pr.train.slp
INFO:sleap.nn.evals:Saved metrics: models/baseline_model.topdown/metrics.train.npz
INFO:sleap.nn.evals:OKS mAP: 0.511409


Output()

INFO:sleap.nn.evals:Saved predictions: models/baseline_model.topdown/labels_pr.val.slp
INFO:sleap.nn.evals:Saved metrics: models/baseline_model.topdown/metrics.val.npz
INFO:sleap.nn.evals:OKS mAP: 0.509519


## 5. Continuing training

If we still have the trainer in memory, we can continue training by simply calling `trainer.train()` again with a potentially different number of epochs:

In [8]:
trainer.config.optimization.epochs = 3
trainer.train()

INFO:sleap.nn.training:Creating tf.data.Datasets for training data generation...



INFO:sleap.nn.training:Finished creating training datasets. [14.0s]
INFO:sleap.nn.training:Starting training loop...
Epoch 1/3



360/360 - 29s - loss: 9.5565e-04 - head: 3.6899e-04 - thorax: 1.8458e-04 - abdomen: 0.0010 - wingL: 8.9453e-04 - wingR: 8.8215e-04 - forelegL4: 0.0014 - forelegR4: 0.0014 - midlegL4: 0.0012 - midlegR4: 0.0012 - hindlegL4: 0.0015 - hindlegR4: 0.0016 - eyeL: 4.4440e-04 - eyeR: 4.2930e-04 - val_loss: 9.6708e-04 - val_head: 4.2331e-04 - val_thorax: 2.4029e-04 - val_abdomen: 0.0010 - val_wingL: 8.9596e-04 - val_wingR: 9.0811e-04 - val_forelegL4: 0.0014 - val_forelegR4: 0.0013 - val_midlegL4: 0.0011 - val_midlegR4: 0.0012 - val_hindlegL4: 0.0016 - val_hindlegR4: 0.0016 - val_eyeL: 5.0103e-04 - val_eyeR: 4.4195e-04 - lr: 1.0000e-04 - 29s/epoch - 79ms/step
Epoch 2/3




360/360 - 31s - loss: 9.1491e-04 - head: 3.5620e-04 - thorax: 1.6408e-04 - abdomen: 0.0010 - wingL: 8.5280e-04 - wingR: 8.3420e-04 - forelegL4: 0.0013 - forelegR4: 0.0013 - midlegL4: 0.0011 - midlegR4: 0.0011 - hindlegL4: 0.0015 - hindlegR4: 0.0015 - eyeL: 4.2744e-04 - eyeR: 4.1630e-04 - val_loss: 9.0186e-04 - val_head: 3.5358e-04 - val_thorax: 1.6415e-04 - val_abdomen: 9.6204e-04 - val_wingL: 7.9800e-04 - val_wingR: 8.5953e-04 - val_forelegL4: 0.0013 - val_forelegR4: 0.0013 - val_midlegL4: 0.0010 - val_midlegR4: 0.0012 - val_hindlegL4: 0.0015 - val_hindlegR4: 0.0016 - val_eyeL: 3.9051e-04 - val_eyeR: 4.1612e-04 - lr: 1.0000e-04 - 31s/epoch - 85ms/step
Epoch 3/3




360/360 - 31s - loss: 8.7575e-04 - head: 3.3607e-04 - thorax: 1.5766e-04 - abdomen: 9.7867e-04 - wingL: 8.2189e-04 - wingR: 7.9939e-04 - forelegL4: 0.0013 - forelegR4: 0.0013 - midlegL4: 0.0010 - midlegR4: 0.0011 - hindlegL4: 0.0014 - hindlegR4: 0.0014 - eyeL: 4.0498e-04 - eyeR: 3.9365e-04 - val_loss: 8.6277e-04 - val_head: 3.3302e-04 - val_thorax: 1.6257e-04 - val_abdomen: 9.8688e-04 - val_wingL: 8.9042e-04 - val_wingR: 8.0936e-04 - val_forelegL4: 0.0012 - val_forelegR4: 0.0012 - val_midlegL4: 9.6707e-04 - val_midlegR4: 0.0011 - val_hindlegL4: 0.0014 - val_hindlegR4: 0.0014 - val_eyeL: 3.8715e-04 - val_eyeR: 3.7773e-04 - lr: 1.0000e-04 - 31s/epoch - 85ms/step
INFO:sleap.nn.training:Finished training loop. [1.5 min]
INFO:sleap.nn.training:Deleting visualization directory: models/baseline_model.topdown/viz
INFO:sleap.nn.training:Saving evaluation metrics to model folder...



INFO:sleap.nn.evals:Saved predictions: models/baseline_model.topdown/labels_pr.train.slp
INFO:sleap.nn.evals:Saved metrics: models/baseline_model.topdown/metrics.train.npz
INFO:sleap.nn.evals:OKS mAP: 0.559398



INFO:sleap.nn.evals:Saved predictions: models/baseline_model.topdown/labels_pr.val.slp
INFO:sleap.nn.evals:Saved metrics: models/baseline_model.topdown/metrics.val.npz
INFO:sleap.nn.evals:OKS mAP: 0.559997


As you can see, the loss and accuracy (OKS mAP) pick up from where they left off in the previous round of training.


More often, however, you will be continuing training from a model that was trained earlier and saved to disk, so the original trainer is no longer in memory.

In this case, all you need to do to continue training is to create a new `Trainer` from the existing model configuration and load up the weights before continuing training:

In [9]:
# Load config.
cfg = sleap.load_config("models/baseline_model.topdown")
# To save the model to a different folder, set run_name to a new value:
# cfg.outputs.run_name = "new_folder"

# Create and initialize the trainer.
trainer = sleap.nn.training.Trainer.from_config(cfg)
trainer.setup()

# Replace the randomly initialized weights with the saved weights.
trainer.keras_model.load_weights("models/baseline_model.topdown/best_model.h5")

INFO:sleap.nn.training:Loading training labels from: labels.pkg.slp


INFO:sleap.nn.training:Creating training and validation splits from validation fraction: 0.1
INFO:sleap.nn.training:  Splits: Training = 1440 / Validation = 160.
INFO:sleap.nn.training:Setting up for training...
INFO:sleap.nn.training:Setting up pipeline builders...
INFO:sleap.nn.training:Setting up model...
INFO:sleap.nn.training:Building test pipeline...
INFO:sleap.nn.training:Loaded test example. [0.807s]
INFO:sleap.nn.training:  Input shape: (160, 160, 1)
INFO:sleap.nn.training:Created Keras model.
INFO:sleap.nn.training:  Backbone: UNet(stacks=1, filters=16, filters_rate=2.0, kernel_size=3, stem_kernel_size=7, convs_per_block=2, stem_blocks=0, down_blocks=4, middle_block=True, up_blocks=2, up_interpolate=False, block_contraction=False)
INFO:sleap.nn.training:  Max stride: 16
INFO:sleap.nn.training:  Parameters: 2,101,501
INFO:sleap.nn.training:  Heads: 
INFO:sleap.nn.training:    [0] = CenteredInstanceConfmapsHead(part_names=['head', 'thorax', 'abdomen', 'wingL', 'wingR', 'foreleg



INFO:sleap.nn.training:Setting up visualization...




INFO:sleap.nn.training:Finished trainer set up. [2.4s]


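When resuming from disk, it can help to confirm that the weights file exists before handing its path to `load_weights`, so that a mistyped folder name fails with a clear message. The following is a minimal sketch using only the standard library; `find_best_weights` is a hypothetical helper that is not part of SLEAP, and the default filename `best_model.h5` is taken from the cell above.

```python
from pathlib import Path


def find_best_weights(model_dir, filename="best_model.h5"):
    """Return the path to the saved weights file inside `model_dir`.

    Hypothetical helper (not part of SLEAP). Raises FileNotFoundError with
    the full path if the file is missing.
    """
    weights_path = Path(model_dir) / filename
    if not weights_path.is_file():
        raise FileNotFoundError(f"No weights file found at: {weights_path}")
    return weights_path


# Possible usage with the trainer created in the previous cell:
# weights = find_best_weights("models/baseline_model.topdown")
# trainer.keras_model.load_weights(str(weights))
```

The helper only checks the path; the weights are still loaded by the same `trainer.keras_model.load_weights(...)` call shown above.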


In [10]:
trainer.config.optimization.epochs = 3
trainer.train()

INFO:sleap.nn.training:Creating tf.data.Datasets for training data generation...



INFO:sleap.nn.training:Finished creating training datasets. [14.1s]
INFO:sleap.nn.training:Starting training loop...
Epoch 1/3



360/360 - 32s - loss: 8.5582e-04 - head: 3.3795e-04 - thorax: 1.5564e-04 - abdomen: 9.4599e-04 - wingL: 7.9941e-04 - wingR: 7.8252e-04 - forelegL4: 0.0012 - forelegR4: 0.0012 - midlegL4: 0.0010 - midlegR4: 0.0010 - hindlegL4: 0.0014 - hindlegR4: 0.0014 - eyeL: 4.0100e-04 - eyeR: 3.9849e-04 - val_loss: 8.1115e-04 - val_head: 2.9811e-04 - val_thorax: 1.0771e-04 - val_abdomen: 0.0010 - val_wingL: 7.3117e-04 - val_wingR: 7.5454e-04 - val_forelegL4: 0.0011 - val_forelegR4: 0.0013 - val_midlegL4: 8.0686e-04 - val_midlegR4: 0.0010 - val_hindlegL4: 0.0012 - val_hindlegR4: 0.0014 - val_eyeL: 3.7697e-04 - val_eyeR: 3.3469e-04 - lr: 1.0000e-04 - 32s/epoch - 88ms/step
Epoch 2/3




360/360 - 29s - loss: 8.2423e-04 - head: 3.2766e-04 - thorax: 1.4867e-04 - abdomen: 9.3683e-04 - wingL: 7.7337e-04 - wingR: 7.4472e-04 - forelegL4: 0.0012 - forelegR4: 0.0012 - midlegL4: 9.6605e-04 - midlegR4: 9.8418e-04 - hindlegL4: 0.0013 - hindlegR4: 0.0014 - eyeL: 3.9413e-04 - eyeR: 3.8492e-04 - val_loss: 8.0816e-04 - val_head: 2.7702e-04 - val_thorax: 1.2315e-04 - val_abdomen: 9.8622e-04 - val_wingL: 7.4837e-04 - val_wingR: 8.1575e-04 - val_forelegL4: 0.0011 - val_forelegR4: 0.0013 - val_midlegL4: 7.4206e-04 - val_midlegR4: 9.8677e-04 - val_hindlegL4: 0.0013 - val_hindlegR4: 0.0014 - val_eyeL: 4.0262e-04 - val_eyeR: 3.2560e-04 - lr: 1.0000e-04 - 29s/epoch - 80ms/step
Epoch 3/3




360/360 - 31s - loss: 7.9676e-04 - head: 3.1471e-04 - thorax: 1.3562e-04 - abdomen: 9.0677e-04 - wingL: 7.2937e-04 - wingR: 7.3654e-04 - forelegL4: 0.0011 - forelegR4: 0.0011 - midlegL4: 9.3928e-04 - midlegR4: 9.4339e-04 - hindlegL4: 0.0013 - hindlegR4: 0.0013 - eyeL: 3.8123e-04 - eyeR: 3.7266e-04 - val_loss: 7.7582e-04 - val_head: 2.9292e-04 - val_thorax: 1.5784e-04 - val_abdomen: 9.0268e-04 - val_wingL: 7.6222e-04 - val_wingR: 7.1502e-04 - val_forelegL4: 0.0011 - val_forelegR4: 0.0013 - val_midlegL4: 7.0143e-04 - val_midlegR4: 9.4814e-04 - val_hindlegL4: 0.0012 - val_hindlegR4: 0.0013 - val_eyeL: 3.7808e-04 - val_eyeR: 3.2417e-04 - lr: 1.0000e-04 - 31s/epoch - 86ms/step
INFO:sleap.nn.training:Finished training loop. [1.5 min]
INFO:sleap.nn.training:Deleting visualization directory: models/baseline_model.topdown/viz
INFO:sleap.nn.training:Saving evaluation metrics to model folder...



INFO:sleap.nn.evals:Saved predictions: models/baseline_model.topdown/labels_pr.train.slp
INFO:sleap.nn.evals:Saved metrics: models/baseline_model.topdown/metrics.train.npz
INFO:sleap.nn.evals:OKS mAP: 0.585983



INFO:sleap.nn.evals:Saved predictions: models/baseline_model.topdown/labels_pr.val.slp
INFO:sleap.nn.evals:Saved metrics: models/baseline_model.topdown/metrics.val.npz
INFO:sleap.nn.evals:OKS mAP: 0.596779


Again, the loss and accuracy pick up from where they left off prior to this round of training.
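To make this concrete, the short sketch below checks the trend using numbers copied by hand from the logs above: the per-epoch `val_loss` from the two resumed rounds, and the validation OKS mAP reported after each evaluation. The variable and function names are illustrative only and are not part of SLEAP.

```python
# Validation loss per epoch, copied from the training logs above.
val_loss_in_memory = [9.6708e-04, 9.0186e-04, 8.6277e-04]  # In [8]: trainer kept in memory
val_loss_reloaded = [8.1115e-04, 8.0816e-04, 7.7582e-04]   # In [10]: trainer rebuilt from disk

# Validation OKS mAP: before In [8], after In [8], after In [10].
val_map = [0.509519, 0.559997, 0.596779]


def is_strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


def is_strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


# Concatenate both rounds to check the trend across the reload boundary.
all_val_loss = val_loss_in_memory + val_loss_reloaded
loss_keeps_falling = is_strictly_decreasing(all_val_loss)
map_keeps_rising = is_strictly_increasing(val_map)

print("val_loss keeps falling across rounds:", loss_keeps_falling)
print("val OKS mAP keeps rising across rounds:", map_keeps_rising)
```

For these logged values both checks hold, including across the boundary where the trainer was rebuilt and the weights were reloaded. With only three epochs per round this is a sanity check rather than evidence of convergence.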

The resulting model can be used as usual for inference on new data.