feat: support for multi-gpu inference and OOM datasets #53
Merged
Pull Request Overview
This PR enables scalable inference across multiple GPUs and adds an out-of-memory (OOM) safe map-style dataloader for very large .h5ad files. This addresses memory bottlenecks and processing speed limitations when working with large single-cell datasets.
- Multi-GPU inference support using DistributedDataParallel (DDP) with configurable GPU allocation
- OOM-safe map-style dataset that uses backed reads and per-row densification to handle large files
- New CLI flags for enabling OOM dataloader, specifying data workers, and controlling GPU usage
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/test_compare_umap.py | New test utility for comparing UMAPs between embeddings with Procrustes analysis |
| test/test_compare_emb.py | Updated embedding comparison to use "embeddings" key and added statistical tests |
| test/test_cli.py | Removed deprecated CLI test functions |
| src/transcriptformer/model/inference.py | Added multi-GPU support and OOM dataloader integration |
| src/transcriptformer/data/dataloader.py | Major refactoring with new AnnDatasetOOM class and extracted utility functions |
| src/transcriptformer/data/dataclasses.py | Updated InferenceConfig with num_gpus and use_oom_dataloader fields |
| src/transcriptformer/cli/conf/inference_config.yaml | Added new configuration options for multi-GPU and OOM handling |
| src/transcriptformer/cli/__init__.py | Complete CLI rewrite with direct inference execution instead of Hydra delegation |
| inference.py | Removed legacy inference script |
| download_artifacts.py | Removed legacy download script |
| README.md | Updated documentation with new CLI flags and usage examples |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Summary
This PR enables scalable inference across multiple GPUs and adds an out-of-memory (OOM) safe map-style dataloader for very large `.h5ad` files.

Key Changes
Multi-GPU inference (DDP)
- Sets `devices` and `accelerator=gpu` based on `inference_config.num_gpus`.
- `DistributedSampler` is used with the new map-style dataset when `devices > 1` (see the sketch below).
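
As a rough illustration (not the PR's actual code), a plain-PyTorch version of this wiring might look like the sketch below. Since the PR configures `devices`/`accelerator`, a framework such as PyTorch Lightning may handle most of this in practice; `ddp_inference` and its arguments are hypothetical names.

```python
# Hypothetical sketch only: plain-PyTorch DDP inference over a map-style
# dataset. The PR's actual wiring (devices/accelerator) may be delegated
# to a training framework rather than hand-rolled like this.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def ddp_inference(rank: int, world_size: int, model, dataset):
    # Assumes MASTER_ADDR/MASTER_PORT are set in the environment.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    model = DDP(model.to(rank), device_ids=[rank])

    # DistributedSampler gives each rank a disjoint shard of the map-style
    # dataset; shuffle=False keeps row order stable across workers.
    sampler = DistributedSampler(dataset, shuffle=False)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler, num_workers=4)

    model.eval()
    with torch.no_grad():
        for batch in loader:
            _ = model(batch.to(rank, non_blocking=True))

    dist.destroy_process_group()
```

`shuffle=False` matters for inference: each rank's shard keeps a stable order, so per-row outputs can be stitched back together deterministically.
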
OOM-safe map-style dataloader
- `AnnDatasetOOM` opens the `.h5ad` in `backed='r'` mode.
- Rows are densified one at a time in `__getitem__`.
- Works with `DistributedSampler` (order-safe with multiple workers), as sketched below.
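
A minimal sketch of this pattern, assuming `anndata` and `scipy`; the real `AnnDatasetOOM` in `src/transcriptformer/data/dataloader.py` likely differs in naming, return types, and metadata handling, so treat this as illustrative only.

```python
# Illustrative sketch of a backed, map-style AnnData dataset (not the PR's
# exact AnnDatasetOOM). Only the rows touched in __getitem__ are loaded.
import anndata as ad
import numpy as np
import scipy.sparse as sp
import torch
from torch.utils.data import Dataset

class BackedH5adDataset(Dataset):
    def __init__(self, path: str):
        # backed='r' keeps X on disk instead of reading it into RAM.
        self.adata = ad.read_h5ad(path, backed="r")

    def __len__(self) -> int:
        return self.adata.n_obs

    def __getitem__(self, idx: int) -> torch.Tensor:
        row = self.adata.X[idx]
        # Densify one row at a time so peak memory stays O(n_vars).
        if sp.issparse(row):
            row = row.toarray()
        return torch.as_tensor(np.asarray(row).ravel(), dtype=torch.float32)
```

Because this is a map-style `Dataset` with a defined `__len__`, it composes directly with the `DistributedSampler` from the previous sketch.
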
CLI and config
- `--oom-dataloader`: enable the OOM-safe map-style dataloader.
- `--n-data-workers`: number of DataLoader workers per process.
- Added `use_oom_dataloader` to `InferenceConfig` and `inference_config.yaml`.
- `n_data_workers` flows from the CLI into `DataConfig` (see the sketch below).
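
A hypothetical `argparse` sketch of how these flags could flow into the config objects; the actual CLI in `src/transcriptformer/cli/__init__.py` may use a different parser and option plumbing.

```python
# Hypothetical wiring of the new flags (illustrative, not the PR's CLI code).
import argparse

parser = argparse.ArgumentParser(description="TranscriptFormer inference")
parser.add_argument("--oom-dataloader", action="store_true",
                    help="Use the OOM-safe map-style dataloader.")
parser.add_argument("--n-data-workers", type=int, default=4,
                    help="Number of DataLoader workers per process.")
args = parser.parse_args()

# Per the PR description, the values end up in the config dataclasses:
#   args.oom_dataloader -> InferenceConfig.use_oom_dataloader
#   args.n_data_workers -> DataConfig.n_data_workers
```
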
Sparse-aware data handling
- `get_counts_layer` safely selects `raw.X` or `X`, with clear logging.
- `is_raw_counts` supports sparse inputs by sampling non-zero data.
- `AnnDataset` explicitly densifies before batch processing to satisfy downstream expectations.
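
To make this concrete, here is a minimal sketch of what the two helpers might look like; the actual implementations may differ, and the non-negative-integer heuristic in `is_raw_counts` is an assumption on my part.

```python
# Illustrative sketches of the sparse-aware helpers (not the PR's exact code).
import logging
import numpy as np
import scipy.sparse as sp

def get_counts_layer(adata):
    """Prefer raw counts when present, otherwise fall back to X."""
    if adata.raw is not None:
        logging.info("Using adata.raw.X as the counts layer.")
        return adata.raw.X
    logging.info("No adata.raw found; using adata.X as the counts layer.")
    return adata.X

def is_raw_counts(X, n_sample: int = 1000) -> bool:
    """Heuristic check on a sample of non-zero entries (sparse-safe)."""
    # For sparse inputs, .data exposes stored values without densifying.
    values = X.data if sp.issparse(X) else np.asarray(X).ravel()
    values = values[values != 0]
    if values.size == 0:
        return True
    sample = np.random.choice(values, size=min(n_sample, values.size),
                              replace=False)
    # Assumed heuristic: raw counts are non-negative integers.
    return bool(np.all(sample >= 0) and np.allclose(sample, np.round(sample)))
```
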
Testing
- Embedding and UMAP comparison tests (`test/test_compare_emb.py`, `test/test_compare_umap.py`) validate results on `.h5ad` data.
Documentation
- Updated `README.md` with the new CLI flags (`--oom-dataloader`, `--n-data-workers`) and usage examples.