High-performance medical imaging I/O and processing library for Rust and Python.
medrs is designed for throughput-critical medical imaging workflows, particularly deep learning pipelines that process large 3D volumes. It provides:
- Fast NIfTI I/O: Memory-mapped reading, crop-first loading (read sub-volumes without loading entire files)
- Transform Pipeline: Lazy evaluation with automatic operation fusion and SIMD acceleration
- Mixed Precision: Native f16/bf16 support for 50-60% smaller files
- Random Augmentation: Reproducible, GPU-friendly augmentations for ML training
- Python Bindings: Zero-copy numpy views, direct PyTorch/JAX tensor creation
- MONAI Integration: Drop-in replacements for MONAI transforms
Representative timings on 128³ volumes (full results below):

| Operation | medrs | MONAI | TorchIO | vs MONAI |
|---|---|---|---|---|
| Load | 0.13ms | 4.55ms | 4.71ms | 35x |
| Load Cropped (64³) | 0.41ms | 4.68ms | 9.86ms | 11x |
| Load Resampled | 0.40ms | 6.88ms | 27.65ms | 17x |
| To PyTorch | 0.49ms | 5.14ms | 10.22ms | 10x |
| Load + Normalize | 0.60ms | 5.36ms | 12.26ms | 9x |
At larger volumes (512³), speedups increase dramatically: up to 38,000x vs MONAI and 6,600x vs TorchIO.
| Format | Size | vs f32 |
|---|---|---|
| float32 | 8.3 MB | 100% |
| bfloat16 | 3.4 MB | 41% |
| float16 | 4.1 MB | 50% |
| int16 | 1.2 MB | 15% |
| Loader | Workers | Samples/sec | vs medrs |
|---|---|---|---|
| medrs FastLoader | 4 | 1,279 | 1x |
| TorchIO Queue | 4 | 8.7 | 147x slower |
| MONAI DataLoader | 4 | 0.8 | 1,560x slower |
Tested on 20 gzipped 64³ volumes with random cropping. FastLoader uses parallel prefetching and Mgzip support.
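This prefetching pattern can be sketched in a few lines of pure Python (the `prefetch` helper below is illustrative, not the medrs API): a background thread decodes ahead of the training loop so the consumer rarely waits on I/O.

```python
import queue
import threading

def prefetch(items, load, depth=4):
    """Yield load(item) for each item, loading ahead on a background thread."""
    q = queue.Queue(maxsize=depth)
    DONE = object()

    def worker():
        for item in items:
            q.put(load(item))
        q.put(DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        got = q.get()
        if got is DONE:
            break
        yield got

# Toy "load" standing in for decompressing a volume from disk
patches = list(prefetch(range(5), load=lambda i: i * i, depth=2))
print(patches)  # [0, 1, 4, 9, 16]
```

A real loader would use multiple workers and decode into pooled buffers; the bounded queue is what keeps memory flat while the producer runs ahead.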
- Crop-First Loading: Load 64³ patch from 512³ volume without reading entire file - 6,600x faster than MONAI
- FastLoader: Purpose-built training loader achieves 1,560x higher throughput than MONAI DataLoader
- Mixed Precision: Save in bf16/f16 for 50-60% smaller files with minimal precision loss
- MONAI Drop-in: Replace MONAI I/O transforms with one import change
- Zero-Copy: Direct tensor creation without intermediate numpy allocations
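Why crop-first loading scales: with a known on-disk layout, a sub-volume maps to a handful of small contiguous reads instead of one huge one. A toy sketch with a flat C-order buffer (real NIfTI data is Fortran-ordered; this is an illustration, not the medrs implementation):

```python
# Toy volume: an 8x8x8 grid stored as one flat buffer, like raw voxel
# data on disk (C order: index = (x * ny + y) * nz + z).
nx = ny = nz = 8
flat = [((x * ny + y) * nz + z) for x in range(nx) for y in range(ny) for z in range(nz)]

def read_crop(buf, start, size):
    """Gather a sub-volume by reading only the contiguous z-runs it covers."""
    (x0, y0, z0), (dx, dy, dz) = start, size
    out = []
    for x in range(x0, x0 + dx):
        for y in range(y0, y0 + dy):
            off = (x * ny + y) * nz + z0
            out.append(buf[off : off + dz])  # one small contiguous read
    return out

crop = read_crop(flat, (2, 3, 4), (2, 2, 2))
# Reads 2*2 runs of 2 voxels = 8 voxels instead of all 512
```

The cost is proportional to the patch, not the volume, which is why the gap versus whole-file loaders widens so sharply at 512³.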
📊 Detailed Benchmarks
Benchmark results comparing medrs, MONAI, and TorchIO across multiple volume sizes and operations.
**Load**

| Size | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ | 0.13ms | 1.34ms | 2.35ms | 10x | 18x |
| 128³ | 0.13ms | 4.55ms | 4.71ms | 35x | 36x |
| 256³ | 0.14ms | 159.11ms | 95.18ms | 1,136x | 680x |
| 512³ | 0.13ms | 5,006.76ms | 866.54ms | 38,513x | 6,665x |
**Load Cropped (64³ patch)**

| Size | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ | 0.27ms | 1.75ms | 6.00ms | 6x | 22x |
| 128³ | 0.41ms | 4.68ms | 9.86ms | 11x | 24x |
| 256³ | 0.55ms | 154.86ms | 104.48ms | 282x | 190x |
| 512³ | 0.76ms | 5,041.42ms | 1,076.89ms | 6,633x | 1,417x |
**Load Resampled**

| Source → Target | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ → 32³ | 0.18ms | 1.93ms | 5.45ms | 11x | 30x |
| 128³ → 64³ | 0.40ms | 6.88ms | 27.65ms | 17x | 69x |
| 256³ → 128³ | 2.02ms | 178.87ms | 363.85ms | 89x | 180x |
| 512³ → 256³ | 6.67ms | 5,960.93ms | 4,039.05ms | 894x | 605x |
**To PyTorch**

| Size | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ | 0.34ms | 1.58ms | 5.37ms | 5x | 16x |
| 128³ | 0.49ms | 5.14ms | 10.22ms | 10x | 21x |
| 256³ | 0.60ms | 162.78ms | 53.70ms | 271x | 90x |
| 512³ | 0.84ms | 5,864.85ms | 1,223.24ms | 6,982x | 1,456x |
**Load + Normalize**

| Size | medrs | MONAI | TorchIO | vs MONAI | vs TorchIO |
|---|---|---|---|---|---|
| 64³ | 0.49ms | 2.15ms | 7.04ms | 4x | 14x |
| 128³ | 0.60ms | 5.36ms | 12.26ms | 9x | 20x |
| 256³ | 0.73ms | 163.38ms | 53.59ms | 224x | 73x |
| 512³ | 1.01ms | 3,735.31ms | 1,092.25ms | 3,698x | 1,081x |
Benchmarks run on an Apple M1 Pro (20 iterations, 3 warmup). Run your own: `python benchmarks/bench_medrs.py`
```bash
pip install medrs
```

Or add the Rust crate to your Cargo.toml:

```toml
[dependencies]
medrs = "0.1"
```

From source:

```bash
git clone https://github.com/liamchalcroft/med-rs.git
cd med-rs
pip install -e ".[dev]"
maturin develop --features python
```

Python:
```python
import medrs
import torch

# Load a NIfTI image
img = medrs.load("brain.nii.gz")
print(f"Shape: {img.shape}, Spacing: {img.spacing}")

# Method chaining for transforms
processed = img.resample([1.0, 1.0, 1.0]).z_normalize().clamp(-1, 1)
processed.save("output.nii.gz")

# Load directly to a PyTorch tensor (most efficient)
tensor = medrs.load_to_torch("brain.nii.gz", dtype=torch.float16, device="cuda")
```

For training pipelines that repeatedly access the same files, use `load_cached()` for faster subsequent loads (it caches decompressed data for `.nii.gz` files).
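The idea behind `load_cached()` can be sketched with the stdlib (`load_decompressed` below is illustrative, not the medrs API): decompression happens once per path, and later epochs are served from an in-memory LRU cache.

```python
import gzip
import os
import tempfile
from functools import lru_cache

@lru_cache(maxsize=32)
def load_decompressed(path):
    """Decompress once per path; repeat calls hit the in-memory cache."""
    with open(path, "rb") as f:
        return gzip.decompress(f.read())

# Demo with a temporary gzip file standing in for a .nii.gz volume
payload = b"voxels" * 1000
fd, path = tempfile.mkstemp(suffix=".nii.gz")
with os.fdopen(fd, "wb") as f:
    f.write(gzip.compress(payload))

first = load_decompressed(path)   # decompresses from disk
second = load_decompressed(path)  # served from the LRU cache
os.remove(path)
```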
Rust:

```rust
use medrs::nifti;
use medrs::transforms::{resample_to_spacing, Interpolation};

fn main() -> medrs::Result<()> {
    let img = nifti::load("brain.nii.gz")?;
    println!("Shape: {:?}, Spacing: {:?}", img.shape(), img.spacing());
    let resampled = resample_to_spacing(&img, [1.0, 1.0, 1.0], Interpolation::Trilinear)?;
    nifti::save(&resampled, "output.nii.gz")?;
    Ok(())
}
```

Build composable transform pipelines with lazy evaluation and automatic optimization:
Python:

```python
import medrs

# Create a reusable pipeline
pipeline = medrs.TransformPipeline()
pipeline.z_normalize()
pipeline.clamp(-1.0, 1.0)
pipeline.resample_to_shape([64, 64, 64])

# Apply it to multiple images
for path in image_paths:
    img = medrs.load(path)
    processed = pipeline.apply(img)
```

Rust:
```rust
use medrs::pipeline::compose::TransformPipeline;

let pipeline = TransformPipeline::new()
    .z_normalize()
    .clamp(-1.0, 1.0)
    .resample_to_shape([64, 64, 64]);
let processed = pipeline.apply(&img);
```

Reproducible augmentations for ML training with optional seeding:
Python:

```python
import medrs

img = medrs.load("brain.nii.gz")

# Individual augmentations
flipped = medrs.random_flip(img, axes=[0, 1, 2], prob=0.5, seed=42)
noisy = medrs.random_gaussian_noise(img, std=0.1, seed=42)
scaled = medrs.random_intensity_scale(img, scale_range=0.1, seed=42)
shifted = medrs.random_intensity_shift(img, shift_range=0.1, seed=42)
rotated = medrs.random_rotate_90(img, axes=(0, 1), seed=42)
gamma = medrs.random_gamma(img, gamma_range=(0.7, 1.5), seed=42)

# Combined augmentation (flip + noise + scale + shift)
augmented = medrs.random_augment(img, seed=42)
```

Rust:
```rust
use medrs::transforms::{random_flip, random_gaussian_noise, random_augment};

// Individual augmentations
let flipped = random_flip(&img, &[0, 1, 2], Some(0.5), Some(42))?;
let noisy = random_gaussian_noise(&img, Some(0.1), Some(42))?;

// Combined augmentation
let augmented = random_augment(&img, Some(42))?;
```

For `.nii.gz` files, medrs supports the Mgzip (multi-member gzip) format for parallel decompression. Mgzip files are backwards-compatible with standard gzip but can be decompressed 3-5× faster using multiple threads.
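The backwards compatibility is easy to verify with Python's stdlib `gzip` module, which transparently reads concatenated members:

```python
import gzip

# A "multi-member" gzip stream: independent gzip members concatenated.
chunks = [b"chunk-%d " % i for i in range(4)]
mgz = b"".join(gzip.compress(c) for c in chunks)

# Standard gzip readers transparently concatenate the members...
assert gzip.decompress(mgz) == b"".join(chunks)
# ...while a parallel reader can hand each member to its own thread.
```

Because each member is an independent deflate stream, a parallel reader can locate member boundaries and decompress them concurrently, which is where the multi-threaded speedup comes from.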
| Method | Time | vs nibabel |
|---|---|---|
| nibabel | 173ms | 1× |
| medrs.load() | 126ms | 1.4× |
| medrs.load_mgzip(8 threads) | 47ms | 3.7× |
```python
import medrs

# Convert an existing .nii.gz to Mgzip format (one-time)
medrs.convert_to_mgzip("brain.nii.gz", "brain.mgz.nii.gz", num_threads=8)

# Load with parallel decompression
img = medrs.load_mgzip("brain.mgz.nii.gz", num_threads=8)

# Save directly in Mgzip format
medrs.save_mgzip(img, "output.mgz.nii.gz", num_threads=8)

# Check whether a file is in Mgzip format
if medrs.is_mgzip("file.nii.gz"):
    img = medrs.load_mgzip("file.nii.gz")
```

Convert entire datasets with the included CLI tool:
```bash
# Convert all .nii.gz files in a directory (recursive)
python -m medrs.cli convert-mgzip data/*.nii.gz -r -w 8 -v

# Options:
#   -r, --recursive   Search subdirectories
#   -w, --workers N   Parallel conversion threads (default: CPU count)
#   -v, --verbose     Show progress
#   --suffix .mgz     Output suffix (default: replaces .nii.gz with .mgz.nii.gz)
```

Mgzip is most beneficial for:
- Large compressed datasets (100+ files, 256³+ volumes)
- Multi-core systems (4+ cores)
- Repeated access (training pipelines that load same files across epochs)
Mgzip files are ~1% larger than standard gzip but provide significant speedups. Standard gzip readers (nibabel, etc.) can still read Mgzip files.
Load only the data you need - essential for training pipelines:
```python
import medrs
import torch

# Load a 64³ patch starting at position (32, 32, 32)
patch = medrs.load_cropped("volume.nii", [32, 32, 32], [64, 64, 64])

# Load with resampling and reorientation in one step
patch = medrs.load_resampled(
    "volume.nii",
    output_shape=[64, 64, 64],
    target_spacing=[1.0, 1.0, 1.0],
    target_orientation="RAS",
)

# Load directly to a GPU tensor
tensor = medrs.load_cropped_to_torch(
    "volume.nii",
    output_shape=[64, 64, 64],
    target_spacing=[1.0, 1.0, 1.0],
    dtype=torch.float16,
    device="cuda",
)
```

LRU-cached patch extraction with prefetching:
```python
import medrs

loader = medrs.TrainingDataLoader(
    volumes=["vol1.nii", "vol2.nii", "vol3.nii"],
    patch_size=[64, 64, 64],
    patches_per_volume=4,
    patch_overlap=[0, 0, 0],
    randomize=True,
    cache_size=1000,
)

for patch in loader:
    tensor = patch.to_torch()
```

Parallel prefetching loader for large `.nii.gz` datasets (100k+ files):
```python
import glob
import medrs

loader = medrs.FastLoader(
    volumes=glob.glob("data/*.nii.gz"),
    patch_shape=[64, 64, 64],
    prefetch=16,
    workers=4,
    shuffle=True,
    seed=42,
)

for patch in loader:
    tensor = patch.to_torch()
```

**Intensity transforms**
- `z_normalize()` / `z_normalization()` - Zero mean, unit variance
- `rescale()` / `rescale_intensity()` - Scale to [min, max] range
- `clamp()` - Clamp values to a range

**Spatial transforms**
- `resample()` / `resample_to_spacing()` - Resample to target spacing
- `resample_to_shape()` - Resample to target shape
- `reorient()` - Reorient to a standard orientation (RAS, LPS, etc.)
- `crop_or_pad()` - Crop or pad to target shape
- `flip()` - Flip along specified axes

**Random transforms**
- `random_flip()` - Random axis flipping
- `random_gaussian_noise()` - Additive Gaussian noise
- `random_intensity_scale()` - Random intensity scaling
- `random_intensity_shift()` - Random intensity offset
- `random_rotate_90()` - Random 90-degree rotations
- `random_gamma()` - Random gamma correction
- `random_augment()` - Combined augmentation pipeline
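As a reference for the intensity semantics above, a plain-Python sketch (this assumes the population standard deviation; medrs's exact convention may differ):

```python
import statistics

def z_normalize(vals):
    """Zero mean, unit variance (population standard deviation)."""
    mean = statistics.fmean(vals)
    std = statistics.pstdev(vals)
    return [(v - mean) / std for v in vals]

def clamp(vals, lo, hi):
    """Clamp every value into [lo, hi]."""
    return [min(max(v, lo), hi) for v in vals]

# z-normalize then clamp, mirroring the pipeline examples above
out = clamp(z_normalize([1.0, 2.0, 3.0, 4.0]), -1.0, 1.0)
```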
medrs uses several optimization strategies:
- SIMD: Trilinear interpolation uses AVX2/SSE for 8-way parallel processing
- Parallel Processing: Rayon-based parallelism for large volumes
- Lazy Evaluation: Transform pipelines compose operations before execution
- Memory Mapping: Large files are memory-mapped to avoid full loads
- Buffer Pooling: Reusable buffers reduce allocation overhead
- Parallel Decompression: Mgzip format enables multi-threaded gzip decompression
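The lazy-evaluation strategy can be modeled in miniature (a toy sketch, not the medrs internals): operations are recorded when the pipeline is built, then fused into a single traversal at apply time, so the data is walked once instead of once per operation.

```python
# Minimal lazy pipeline: record elementwise ops, fuse them into a single
# pass over the data when apply() is finally called.
class LazyPipeline:
    def __init__(self):
        self.ops = []

    def add(self, fn):
        self.ops.append(fn)  # nothing executes yet
        return self

    def apply(self, vals):
        out = []
        for v in vals:        # one traversal: all ops fused per element
            for op in self.ops:
                v = op(v)
            out.append(v)
        return out

pipe = LazyPipeline().add(lambda v: v * 2).add(lambda v: min(max(v, 0.0), 1.0))
result = pipe.apply([-0.5, 0.25, 3.0])
print(result)  # [0.0, 0.5, 1.0]
```

Fusing elementwise ops this way keeps intermediate values in registers rather than materializing a full volume between each transform.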
See the `examples/` directory for:
- `basic/` - Loading, transforms, and saving
- `integrations/` - PyTorch, MONAI, and JAX integration
- `advanced/` - Async pipelines and custom transforms
```bash
# Rust tests
cargo test

# Python tests
pytest tests/

# Benchmarks (require torch, monai, torchio)
python benchmarks/bench_medrs.py --quick
python benchmarks/bench_monai.py --quick
python benchmarks/bench_torchio.py --quick

# Generate benchmark plots
python benchmarks/plot_results.py
```

medrs is dual-licensed under MIT and Apache-2.0. See LICENSE for details.
See CONTRIBUTING.md for guidelines.
Liam Chalcroft (liam.chalcroft.20@ucl.ac.uk)