# Showcasing PROTOplast Checkpointing in a Cell-line Classification Model

## 1. Introduction

This notebook showcases the checkpointing feature in PROTOplast, which lets you resume model training after finishing one dataset and switching to another. It demonstrates how to save and load training checkpoints so that you can continue model development without starting from scratch. This is particularly useful for long training runs, for experiments that span several datasets, and for training across multiple sessions or environments.
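As a toy illustration of the idea (not PROTOplast's checkpoint format, which is handled by Lightning and Ray further down), the sketch below saves an epoch counter and some weights to a JSON file and then resumes from them. The function names, the file name, and the "add 1.0 per epoch" update are all hypothetical stand-ins.

```python
import json
import os
import tempfile


def save_checkpoint(path, epoch, weights):
    """Persist the training state needed to resume later."""
    with open(path, "w") as fh:
        json.dump({"epoch": epoch, "weights": weights}, fh)


def load_checkpoint(path):
    """Read back the epoch counter and the weights."""
    with open(path) as fh:
        state = json.load(fh)
    return state["epoch"], state["weights"]


def train(weights, start_epoch, max_epochs):
    """Toy training loop: each epoch adds 1.0 to every weight."""
    for _ in range(start_epoch, max_epochs):
        weights = [w + 1.0 for w in weights]
    return max(start_epoch, max_epochs), weights


with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "toy_checkpoint.json")

    # Session 1: train for one epoch on the first dataset, then save.
    epoch1, weights1 = train([0.0, 0.0], start_epoch=0, max_epochs=1)
    save_checkpoint(ckpt, epoch1, weights1)

    # Session 2: restore the state and continue up to epoch 2.
    restored_epoch, restored_weights = load_checkpoint(ckpt)
    epoch2, weights2 = train(restored_weights, restored_epoch, max_epochs=2)

print(epoch2, weights2)
```

The second session starts from the restored weights rather than from zeros, which is the behaviour the rest of this notebook demonstrates with real data.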

In [1]:
import anndata
import glob
import numpy as np
import pandas as pd
import os
import pathlib
import protoplast as pt
import ray
import torch

from anndata.experimental import AnnCollection
from protoplast.scrna.anndata.lightning_models import LinearClassifier
from protoplast.scrna.anndata.trainer import RayTrainRunner
from protoplast.scrna.anndata.torch_dataloader import DistributedAnnDataset
from protoplast.scrna.anndata.torch_dataloader import cell_line_metadata_cb, DistributedCellLineAnnDataset

from ray.train import Checkpoint
from ray.train.lightning import RayDDPStrategy

✓ Applied AnnDataFileManager patch, AnnData cannot be imported after the patch!


## 2. Dataset pre-processing

We begin by reading the two datasets used to train the cell-line classification model in this notebook. To ensure compatibility, the model requires both datasets to share the same input features and the same output dimension (the number of cell-line classes).

In the following section, we create a unified view by performing an **inner join** on the two datasets based on shared features. During this step, we:

- Identify and record the **number of output classes** (cell lines),
- Extract the list of **cell lines** present across both datasets.

This alignment is essential to ensure the model receives a consistent input/output structure regardless of the dataset source.
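The alignment described above can be sketched with plain Python lists. This is only a conceptual illustration: the gene names and the helpers `shared_features` and `class_labels` are hypothetical, and the notebook itself relies on `AnnCollection` to perform the join.

```python
def shared_features(*gene_lists):
    """Return the genes present in every list, in the order of the first list."""
    common = set(gene_lists[0]).intersection(*gene_lists[1:])
    return [gene for gene in gene_lists[0] if gene in common]


def class_labels(*label_lists):
    """Return the sorted union of the class labels seen in any dataset."""
    return sorted(set().union(*label_lists))


# Hypothetical feature lists for two plates.
plate1_genes = ["TP53", "EGFR", "MYC", "KRAS"]
plate2_genes = ["MYC", "TP53", "BRCA1", "KRAS"]

genes = shared_features(plate1_genes, plate2_genes)
labels = class_labels(["CVCL_0131", "CVCL_0480"], ["CVCL_0131", "CVCL_1119"])
num_classes = len(labels)

print(genes)        # inner join on features
print(num_classes)  # size of the model's output layer
```

Taking the intersection of features and the union of labels means a model trained on one plate keeps the same input and output shapes when it later sees the other plate.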

In [2]:
DS_PATHS = ["/mnt/hdd2/tan/tahoe100m/plate1_filt_Vevo_Tahoe100M_WServicesFrom_ParseGigalab.h5ad",
           "/mnt/hdd2/tan/tahoe100m/plate2_filt_Vevo_Tahoe100M_WServicesFrom_ParseGigalab.h5ad"]
adatas = [anndata.io.read_h5ad(p, backed = "r") for p in DS_PATHS]

In [3]:
# Create a joint view over all datasets
collection = AnnCollection(adatas, join_vars = "inner")

# Record the cell-lines (output classes) in both datasets
cell_lines = collection.obs.cell_line.unique().tolist()
cell_lines_count = collection.obs.cell_line.nunique()

## 3. Configure training step

In [4]:
thread_per_worker = 12
test_size = 0.0 # We don't have the test step in the model, so we can set this to 0
val_size = 0.2
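To make the effect of `test_size` and `val_size` concrete, here is a minimal sketch that converts the two fractions into item counts. It assumes a simple fractional split; PROTOplast's actual splitting logic may differ, and `split_counts` is a hypothetical helper rather than a library function.

```python
def split_counts(n_items, val_size, test_size):
    """Convert split fractions into (train, validation, test) item counts."""
    if val_size < 0 or test_size < 0 or val_size + test_size >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    n_val = round(n_items * val_size)
    n_test = round(n_items * test_size)
    n_train = n_items - n_val - n_test
    return n_train, n_val, n_test


counts = split_counts(1000, val_size=0.2, test_size=0.0)
print(counts)
```

With `test_size = 0.0`, everything that is not held out for validation goes to training.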

## 4. Train on `plate1_filt_Vevo_Tahoe100M_WServicesFrom_ParseGigalab` dataset

In [5]:
plate1_adata = adatas[0]

In [6]:
plate1_adata.obs.head(n = 5)

| BARCODE_SUB_LIB_ID | sample | gene_count | tscp_count | mread_count | drugname_drugconc | drug | cell_line | sublibrary | BARCODE | pcnt_mito | S_score | G2M_score | phase | pass_filter | cell_name | plate |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 01_001_025-lib_841 | smp_1495 | 1676 | 2441 | 2892 | [('Infigratinib', 0.05, 'uM')] | Infigratinib | CVCL_0131 | lib_841 | 01_001_025 | 0.025399 | -0.066667 | -0.095055 | G1 | full | A-172 | plate1 |
| 01_001_026-lib_841 | smp_1495 | 1657 | 2454 | 2925 | [('Infigratinib', 0.05, 'uM')] | Infigratinib | CVCL_0480 | lib_841 | 01_001_026 | 0.042787 | 0.128571 | 0.650549 | G2M | full | PANC-1 | plate1 |
| 01_001_048-lib_841 | smp_1495 | 1749 | 2521 | 2963 | [('Infigratinib', 0.05, 'uM')] | Infigratinib | CVCL_0293 | lib_841 | 01_001_048 | 0.056724 | 0.242857 | 0.308791 | G2M | full | HEC-1-A | plate1 |
| 01_001_076-lib_841 | smp_1495 | 834 | 1038 | 1258 | [('Infigratinib', 0.05, 'uM')] | Infigratinib | CVCL_0397 | lib_841 | 01_001_076 | 0.066474 | 0.009524 | 0.245788 | G2M | full | LS 180 | plate1 |
| 01_001_088-lib_841 | smp_1495 | 1275 | 1710 | 2006 | [('Infigratinib', 0.05, 'uM')] | Infigratinib | CVCL_1097 | lib_841 | 01_001_088 | 0.028655 | -0.1 | -0.085348 | G1 | full | C32 | plate1 |


In [7]:
# Set up training
trainer = RayTrainRunner(
    LinearClassifier,
    DistributedCellLineAnnDataset,
    model_keys = ["num_genes",
                  "num_classes"],
    metadata_cb = cell_line_metadata_cb,
    sparse_key = "X"
)

2025-09-29 04:09:30,971	INFO worker.py:1951 -- Started a local Ray instance.
(pid=2786171)   import pynvml  # type: ignore[import]

(TrainTrainable pid=2786171) ✓ Applied AnnDataFileManager patch, AnnData cannot be imported after the patch!

(RayTrainWorker pid=2786421)   import pynvml  # type: ignore[import]
(RayTrainWorker pid=2786421) Setting up process group for: env:// [rank=0, world_size=1]
(TorchTrainer pid=2786171) Started distributed worker processes:
(TorchTrainer pid=2786171) - (node_id=6c36d561221d2fec528ebcb63e96a380a17dde2ffd7fab853c3a442c, ip=192.168.1.226, pid=2786421) world_rank=0, local_rank=0, node_rank=0

(RayTrainWorker pid=2786421) ✓ Applied AnnDataFileManager patch, AnnData cannot be imported after the patch!

(RayTrainWorker pid=2786421) 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
(RayTrainWorker pid=2786421) GPU available: True (cuda), used: True
(RayTrainWorker pid=2786421) TPU available: False, using: 0 TPU cores
(RayTrainWorker pid=2786421) HPU available: False, using: 0 HPUs
(RayTrainWorker pid=2786421) /mnt/hdd2/nam/miniconda3/envs/test/lib/python3.11/site-packages/lightning/fabric/plugins/environments/slurm.py:204: The `srun` command is available on your system but is not used. HINT: If your intention is to run Lightning on SLURM, prepend your python command with `srun` like so: srun python3.11 /mnt/hdd2/nam/miniconda3/envs/test/lib/python3.1 ...
(RayTrainWorker pid=2786421) You are using a CUDA device ('NVIDIA GeForce RTX 3080') that has Tensor Cores. To properly utilize 

Sanity Checking: |          | 0/? [00:00<?, ?it/s]


Sanity Checking DataLoader 0:   0%|          | 0/2 [00:00<?, ?it/s]


                                                                           


(RayTrainWorker pid=2786421) /mnt/hdd2/nam/miniconda3/envs/test/lib/python3.11/site-packages/lightning/pytorch/trainer/connectors/logger_connector/result.py:434: It is recommended to use `self.log('val_acc', ..., sync_dist=True)` when logging on epoch level in distributed setting to accumulate the metric across devices.


Epoch 0:   0%|          | 0/4224 [00:00<?, ?it/s]
Epoch 0:   0%|          | 3/4224 [00:15<6:12:03,  0.19it/s, v_num=0, train_loss=3.430] 
Epoch 0:   0%|          | 11/4224 [00:16<1:42:57,  0.68it/s, v_num=0, train_loss=1.280]
Epoch 0:   0%|          | 12/4224 [00:16<1:34:26,  0.74it/s, v_num=0, train_loss=0.972]
Epoch 0:   0%|          | 20/4224 [00:16<56:55,  1.23it/s, v_num=0, train_loss=0.494]  
Epoch 0:   0%|          | 21/4224 [00:16<54:15,  1.29it/s, v_num=0, train_loss=0.390]
Epoch 0:   1%|          | 30/4224 [00:16<38:09,  1.83it/s, v_num=0, train_loss=0.306]
Epoch 0:   1%|          | 38/4224 [00:16<30:15,  2.31it/s, v_num=0, train_loss=0.297]
Epoch 0:   1%|          | 39/4224 [00:16<29:30,  2.36it/s, v_num=0, train_loss=0.333]
Epoch 0:   1%|          | 48/4224 [00:16<24:05,  2.89it/s, v_num=0, train_loss=0.314]
...
Epoch 0:  99%|█████████▉| 4186/4224 [01:42<00:00, 40.69it/s, v_num=0, train_loss=0.0785]
Epoch 0:  99%|█████████▉| 4193/4224 [01:42<00:00, 40.72it/s, v_num=0, train

(RayTrainWorker pid=2786421) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/nam/training_results/TorchTrainer_2025-09-29_04-09-45/TorchTrainer_2232b_00000_0_2025-09-29_04-09-45/checkpoint_000000)
(RayTrainWorker pid=2786421) `Trainer.fit` stopped: `max_epochs=1` reached.


In [8]:
# Start the training process. The training logs appear under the cell above,
# where the Ray train runner was initialized.
result = trainer.train([DS_PATHS[0]],
                       batch_size = 1024,
                       test_size = test_size, 
                       val_size = val_size,
                       num_workers = 1,
                       thread_per_worker = thread_per_worker,
                       result_storage_path = "~/training_results")

Using 1 workers with {'CPU': 12} each
Data splitting time: 10.66 seconds
Spawning Ray worker and initiating distributed training


2025-09-29 04:09:45,847	INFO tune.py:616 -- [output] This uses the legacy output and progress reporter, as Jupyter notebooks are not supported by the new engine, yet. For more information, please see https://github.com/ray-project/ray/issues/36949


== Status ==
Current time: 2025-09-29 04:09:45 (running for 00:00:00.12)
Using FIFO scheduling algorithm.
Logical resource usage: 0/96 CPUs, 0/1 GPUs (0.0/1.0 accelerator_type:G)
Result logdir: /tmp/ray/session_2025-09-29_04-09-27_371551_2779459/artifacts/2025-09-29_04-09-45/TorchTrainer_2025-09-29_04-09-45/driver_artifacts
Number of trials: 1/1 (1 PENDING)


== Status ==
Current time: 2025-09-29 04:09:51 (running for 00:00:05.16)
Using FIFO scheduling algorithm.
Logical resource usage: 13.0/96 CPUs, 1.0/1 GPUs (0.0/1.0 accelerator_type:G)
Result logdir: /tmp/ray/session_2025-09-29_04-09-27_371551_2779459/artifacts/2025-09-29_04-09-45/TorchTrainer_2025-09-29_04-09-45/driver_artifacts
Number of trials: 1/1 (1 PENDING)


== Status ==
Current time: 2025-09-29 04:09:56 (running for 00:00:10.19)
Using FIFO scheduling algorithm.
Logical resource usage: 13.0/96 CPUs, 1.0/1 GPUs (0.0/1.0 accelerator_type:G)
Result logdir: /tmp/ray/session_2025-09-29_04-09-27_371551_2779459/artifacts/2025-09-29

2025-09-29 04:12:40,728	INFO tune.py:1009 -- Wrote the latest version of all result files and experiment state to '/home/nam/training_results/TorchTrainer_2025-09-29_04-09-45' in 0.0084s.
2025-09-29 04:12:40,740	INFO tune.py:1041 -- Total run time: 174.89 seconds (174.85 seconds for the tuning loop).


== Status ==
Current time: 2025-09-29 04:12:40 (running for 00:02:54.87)
Using FIFO scheduling algorithm.
Logical resource usage: 13.0/96 CPUs, 1.0/1 GPUs (0.0/1.0 accelerator_type:G)
Result logdir: /tmp/ray/session_2025-09-29_04-09-27_371551_2779459/artifacts/2025-09-29_04-09-45/TorchTrainer_2025-09-29_04-09-45/driver_artifacts
Number of trials: 1/1 (1 TERMINATED)




In [9]:
ray.shutdown()

## 5. Train on `plate2_filt_Vevo_Tahoe100M_WServicesFrom_ParseGigalab` dataset

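We now have a checkpoint saved from training the classification model on the first dataset. To resume from it, we pass the path of the checkpoint file to `train()` through the `ckpt_path` argument. This path can be retrieved from the `result` object returned by the previous `train()` call. Because the checkpoint already records one completed epoch, the second `train()` call sets `max_epochs = 2` so that training continues for one more epoch on the second dataset.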
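Before resuming, it can help to confirm that the checkpoint file really exists. The sketch below is a hypothetical helper (`resolve_ckpt_file` is not part of PROTOplast or Ray); it assumes the checkpoint directory contains a file named `checkpoint.ckpt`, as the restore log and the `train()` call in this notebook indicate. The demo runs against a temporary directory that mimics that layout.

```python
import os
import tempfile


def resolve_ckpt_file(checkpoint_dir, filename="checkpoint.ckpt"):
    """Return the full path to the checkpoint file, or raise if it is missing."""
    ckpt_file = os.path.join(os.path.expanduser(checkpoint_dir), filename)
    if not os.path.isfile(ckpt_file):
        raise FileNotFoundError(f"No checkpoint file at {ckpt_file}")
    return ckpt_file


with tempfile.TemporaryDirectory() as tmp:
    # Mimic the directory layout of a saved checkpoint.
    ckpt_dir = os.path.join(tmp, "checkpoint_000000")
    os.makedirs(ckpt_dir)
    open(os.path.join(ckpt_dir, "checkpoint.ckpt"), "wb").close()

    found_name = os.path.basename(resolve_ckpt_file(ckpt_dir))

    # A directory without the file should be rejected early.
    try:
        resolve_ckpt_file(tmp)
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True

print(found_name, missing_raises)
```

In this notebook the directory would be `result.checkpoint.path`. Failing early with a clear message is cheaper than discovering a bad path after the Ray workers have started.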

In [10]:
plate2_adata = adatas[1]

In [11]:
plate2_adata.obs.head(n = 5)

| BARCODE_SUB_LIB_ID | sample | gene_count | tscp_count | mread_count | drugname_drugconc | drug | cell_line | sublibrary | BARCODE | pcnt_mito | S_score | G2M_score | phase | pass_filter | cell_name | plate |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 01_001_053-lib_1000 | smp_1591 | 2671 | 5629 | 6830 | [('Infigratinib', 0.5, 'uM')] | Infigratinib | CVCL_1119 | lib_1000 | 01_001_053 | 0.016522 | -0.265873 | -0.313553 | G1 | full | CFPAC-1 | plate2 |
| 01_001_082-lib_1000 | smp_1591 | 2148 | 3173 | 3826 | [('Infigratinib', 0.5, 'uM')] | Infigratinib | CVCL_0292 | lib_1000 | 01_001_082 | 0.025843 | 0.400794 | 0.520879 | G2M | full | HCT15 | plate2 |
| 01_001_145-lib_1000 | smp_1591 | 683 | 886 | 1073 | [('Infigratinib', 0.5, 'uM')] | Infigratinib | CVCL_1098 | lib_1000 | 01_001_145 | 0.029345 | -0.019841 | -0.032967 | G1 | full | HepG2/C3A | plate2 |
| 01_001_175-lib_1000 | smp_1591 | 1845 | 2786 | 3368 | [('Infigratinib', 0.5, 'uM')] | Infigratinib | CVCL_0131 | lib_1000 | 01_001_175 | 0.031587 | -0.123016 | -0.118498 | G1 | full | A-172 | plate2 |
| 01_001_181-lib_1000 | smp_1591 | 1228 | 1849 | 2226 | [('Infigratinib', 0.5, 'uM')] | Infigratinib | CVCL_0399 | lib_1000 | 01_001_181 | 0.015143 | 0.02381 | -0.008791 | S | full | LoVo | plate2 |


In [12]:
# Set up training
trainer = RayTrainRunner(
    LinearClassifier,
    DistributedCellLineAnnDataset,
    model_keys = ["num_genes",
                  "num_classes"],
    metadata_cb = cell_line_metadata_cb,
    sparse_key = "X"
)

2025-09-29 04:13:25,417	INFO worker.py:1951 -- Started a local Ray instance.
(pid=2793715)   import pynvml  # type: ignore[import]

(TrainTrainable pid=2793715) ✓ Applied AnnDataFileManager patch, AnnData cannot be imported after the patch!

(RayTrainWorker pid=2793962)   import pynvml  # type: ignore[import]
(RayTrainWorker pid=2793962) Setting up process group for: env:// [rank=0, world_size=1]
(TorchTrainer pid=2793715) Started distributed worker processes:
(TorchTrainer pid=2793715) - (node_id=5c3609e5ec2f9fb4351277f6952c4771ba48a3c27de1561a31e9fe1c, ip=192.168.1.226, pid=2793962) world_rank=0, local_rank=0, node_rank=0

(RayTrainWorker pid=2793962) ✓ Applied AnnDataFileManager patch, AnnData cannot be imported after the patch!

(RayTrainWorker pid=2793962) 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
(RayTrainWorker pid=2793962) GPU available: True (cuda), used: True
(RayTrainWorker pid=2793962) TPU available: False, using: 0 TPU cores
(RayTrainWorker pid=2793962) HPU available: False, using: 0 HPUs
(RayTrainWorker pid=2793962) You are using a CUDA device ('NVIDIA GeForce RTX 3080') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision




(RayTrainWorker pid=2793962) Restoring states from the checkpoint path at /home/nam/training_results/TorchTrainer_2025-09-29_04-09-45/TorchTrainer_2232b_00000_0_2025-09-29_04-09-45/checkpoint_000000/checkpoint.ckpt
(RayTrainWorker pid=2793962) /mnt/hdd2/nam/miniconda3/envs/test/lib/python3.11/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:445: The dirpath has changed from '/tmp/ray/session_2025-09-29_04-09-27_371551_2779459/artifacts/2025-09-29_04-09-45/TorchTrainer_2025-09-29_04-09-45/working_dirs/TorchTrainer_2232b_00000_0_2025-09-29_04-09-45/lightning_logs/version_0/checkpoints' to '/tmp/ray/session_2025-09-29_04-13-22_162568_2779459/artifacts/2025-09-29_04-13-50/TorchTrainer_2025-09-29_04-13-50/working_dirs/TorchTrainer_b3c18_00000_0_2025-09-29_04-13-50/lightning_logs/version_0/checkpoints', therefore `best_model_score`, `kth_best_model_path`, `kth_value`, `last_model_path` and `best_k_models` won't be reloaded. Only `best_model_path` will be reload

Sanity Checking: |          | 0/? [00:00<?, ?it/s]


Sanity Checking DataLoader 0:   0%|          | 0/2 [00:00<?, ?it/s]
                                                                           


(RayTrainWorker pid=2793962) /mnt/hdd2/nam/miniconda3/envs/test/lib/python3.11/site-packages/lightning/pytorch/trainer/connectors/logger_connector/result.py:434: It is recommended to use `self.log('val_acc', ..., sync_dist=True)` when logging on epoch level in distributed setting to accumulate the metric across devices.


Epoch 1:   0%|          | 0/6144 [00:00<?, ?it/s]
Epoch 1:   0%|          | 1/6144 [00:19<33:01:31,  0.05it/s, v_num=0, train_loss=0.0601]
Epoch 1:   0%|          | 3/6144 [00:20<11:35:56,  0.15it/s, v_num=0, train_loss=0.193] 
Epoch 1:   0%|          | 4/6144 [00:20<8:42:23,  0.20it/s, v_num=0, train_loss=0.195] 
Epoch 1:   0%|          | 13/6144 [00:20<2:41:23,  0.63it/s, v_num=0, train_loss=0.0382]
Epoch 1:   0%|          | 21/6144 [00:20<1:40:20,  1.02it/s, v_num=0, train_loss=0.049] 
Epoch 1:   0%|          | 22/6144 [00:20<1:35:48,  1.06it/s, v_num=0, train_loss=0.0927]
Epoch 1:   0%|          | 29/6144 [00:20<1:12:59,  1.40it/s, v_num=0, train_loss=0.101] 
Epoch 1:   0%|          | 30/6144 [00:20<1:10:34,  1.44it/s, v_num=0, train_loss=0.220]
Epoch 1:   1%|          | 31/6144 [00:20<1:08:19,  1.49it/s, v_num=0, train_loss=0.200]
...
Epoch 1:  99%|█████████▉| 6103/6144 [02:35<00:01, 39.27it/s, v_num=0, train_loss=0.0495]
Epoch 1:  99%|█████████▉| 6108/6144 [02:35<00:00, 39.27it/s

(RayTrainWorker pid=2793962) Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/nam/protoplast_results/TorchTrainer_2025-09-29_04-13-50/TorchTrainer_b3c18_00000_0_2025-09-29_04-13-50/checkpoint_000000)
(RayTrainWorker pid=2793962) `Trainer.fit` stopped: `max_epochs=2` reached.


In [13]:
# Get the checkpoint path from the result of training on Tahoe plate 1.
# The training progress is shown under cell 12 above.
ckpt_path = os.path.join(result.checkpoint.path, "checkpoint.ckpt")

trainer.train([DS_PATHS[1]],
              max_epochs = 2,
              batch_size = 1024,
              test_size = test_size, 
              val_size = val_size,
              num_workers = 1,
              thread_per_worker = thread_per_worker,
              ckpt_path = ckpt_path)

Using 1 workers with {'CPU': 12} each


2025-09-29 04:13:50,052	INFO tune.py:616 -- [output] This uses the legacy output and progress reporter, as Jupyter notebooks are not supported by the new engine, yet. For more information, please see https://github.com/ray-project/ray/issues/36949


Data splitting time: 17.27 seconds
Spawning Ray worker and initiating distributed training
== Status ==
Current time: 2025-09-29 04:13:50 (running for 00:00:00.13)
Using FIFO scheduling algorithm.
Logical resource usage: 0/96 CPUs, 0/1 GPUs (0.0/1.0 accelerator_type:G)
Result logdir: /tmp/ray/session_2025-09-29_04-13-22_162568_2779459/artifacts/2025-09-29_04-13-50/TorchTrainer_2025-09-29_04-13-50/driver_artifacts
Number of trials: 1/1 (1 PENDING)


== Status ==
Current time: 2025-09-29 04:13:55 (running for 00:00:05.16)
Using FIFO scheduling algorithm.
Logical resource usage: 13.0/96 CPUs, 1.0/1 GPUs (0.0/1.0 accelerator_type:G)
Result logdir: /tmp/ray/session_2025-09-29_04-13-22_162568_2779459/artifacts/2025-09-29_04-13-50/TorchTrainer_2025-09-29_04-13-50/driver_artifacts
Number of trials: 1/1 (1 PENDING)


== Status ==
Current time: 2025-09-29 04:14:00 (running for 00:00:10.24)
Using FIFO scheduling algorithm.
Logical resource usage: 13.0/96 CPUs, 1.0/1 GPUs (0.0/1.0 accelerator_type

2025-09-29 04:18:05,879	INFO tune.py:1009 -- Wrote the latest version of all result files and experiment state to '/home/nam/protoplast_results/TorchTrainer_2025-09-29_04-13-50' in 0.0108s.
2025-09-29 04:18:05,884	INFO tune.py:1041 -- Total run time: 255.83 seconds (255.80 seconds for the tuning loop).


== Status ==
Current time: 2025-09-29 04:18:05 (running for 00:04:15.81)
Using FIFO scheduling algorithm.
Logical resource usage: 13.0/96 CPUs, 1.0/1 GPUs (0.0/1.0 accelerator_type:G)
Result logdir: /tmp/ray/session_2025-09-29_04-13-22_162568_2779459/artifacts/2025-09-29_04-13-50/TorchTrainer_2025-09-29_04-13-50/driver_artifacts
Number of trials: 1/1 (1 TERMINATED)




Result(
  metrics={'train_loss': 0.0727640762925148, 'val_acc': 0.9816258549690247, 'epoch': 1, 'step': 10368},
  path='/home/nam/protoplast_results/TorchTrainer_2025-09-29_04-13-50/TorchTrainer_b3c18_00000_0_2025-09-29_04-13-50',
  filesystem='local',
  checkpoint=Checkpoint(filesystem=local, path=/home/nam/protoplast_results/TorchTrainer_2025-09-29_04-13-50/TorchTrainer_b3c18_00000_0_2025-09-29_04-13-50/checkpoint_000000)
)
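The `step` value in the result above offers a quick sanity check that training really resumed from the checkpoint rather than restarting. Assuming one optimizer step per batch (no gradient accumulation, which is an assumption and not something this notebook states), the reported step should equal the batches of the plate 1 epoch plus the batches of the plate 2 epoch, as read from the two progress bars:

```python
# Batches per epoch, read from the progress bars of the two training runs.
batches_per_epoch = {"plate1": 4224, "plate2": 6144}

# Global step reported in the final Result metrics.
reported_step = 10368

# If the step counter was restored from the checkpoint, the totals should match.
total_steps = sum(batches_per_epoch.values())
resumed = total_steps == reported_step

print(total_steps, resumed)
```

The totals agree, and the reported `epoch` of 1 (zero-based) likewise points to the second run continuing where the first one stopped.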

In [15]:
ray.shutdown()

## 6. Conclusion

This brings us to the end of the tutorial notebook.

This workflow highlights how checkpointing in **PROTOplast** lets a model trained on one dataset continue training on another, enabling efficient model development across diverse datasets.

Feel free to explore and extend this notebook to suit your own data and use cases!