RF-Deep

Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation

Research codebase for post-hoc out-of-distribution detection in 3D CT lung tumor segmentation. The repository accompanies our work on using tumor-anchored deep features from pretrained-then-finetuned segmentation backbones, together with lightweight random forests and limited outlier exposure, to detect unsafe inputs at scan level.

Overview

Modern segmentation models can remain highly accurate on in-distribution data while failing confidently on clinically mismatched scans. RF-Deep is designed to detect those failures without modifying the underlying segmentation architecture. Instead of relying only on logits or architecture-specific uncertainty heads, RF-Deep extracts hierarchical deep features from regions of interest anchored to predicted tumor segmentations and uses them for downstream OOD detection.

Figure: Representative in-distribution and out-of-distribution examples from the paper, illustrating how uncertainty maps can appear concentrated, diffuse, or misaligned across different deployment scenarios.

In the paper, RF-Deep is evaluated on 2,056 CT scans spanning in-distribution lung cancer, near-OOD chest CT datasets, and far-OOD abdominal datasets. It achieves strong near-OOD detection and near-perfect far-OOD detection while remaining simple, lightweight, and architecture-agnostic.

Highlights

Post-hoc OOD detection for segmentation, without changing the segmentation network
Tumor-anchored feature extraction from predicted regions of interest
Support for RF-Deep, radiomics, Mahalanobis, and logit-based baselines
Metadata-aware and scanner-aware analysis utilities
Figure-generation and analysis scripts used for the paper

Repository at a Glance

ood_rfdeep.py: main RF-Deep experiment entrypoint
extract_features.py: deep-feature extraction from segmentation backbones
ood_maha.py: Mahalanobis deep-feature baseline (with optional ReAct/ASH transforms)
logit_baselines.py: logit-derived OOD baselines and analysis
roi_logit_baselines.py: ROI-restricted logit baselines using the same crop protocol as RF-Deep
ood_metadata_holdout.py: metadata-stratified holdout evaluation
segmentation_inference.py: segmentation inference utility for supported backbones

Getting Started

This codebase targets Python 3.9.

pip install -r requirements.txt
python -m scripts.smoke_check

Pretrained Checkpoints

Segmentation backbone weights used in the paper are released separately. Place them under models/finetuned_weights/ and models/pretrained_weights/, or override the locations with the FINETUNED_WEIGHTS_ROOT and PRETRAINED_WEIGHTS_ROOT environment variables.
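
The override behaviour described above can be sketched as follows. This is a minimal illustration only: resolve_weights_root is a hypothetical helper written for this example, not the repository's project_paths.py API, and it assumes that an unset or empty variable falls back to the default directory.

```python
import os
from pathlib import Path


def resolve_weights_root(env_var: str, default: str) -> Path:
    """Return the directory named by env_var, or default when it is unset or empty.

    Hypothetical helper for illustration; the repository's own resolution
    logic lives in project_paths.py and may differ.
    """
    override = os.environ.get(env_var, "").strip()
    return Path(override).expanduser() if override else Path(default)


# Default locations and variable names are taken from the text above.
finetuned_root = resolve_weights_root("FINETUNED_WEIGHTS_ROOT", "models/finetuned_weights")
pretrained_root = resolve_weights_root("PRETRAINED_WEIGHTS_ROOT", "models/pretrained_weights")
```

Exporting FINETUNED_WEIGHTS_ROOT before launching a script would then redirect the lookup without editing any code.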

Download here: MSKBox RF-Deep
(If the above link does not work at any point, please open an issue on GitHub and it will be addressed promptly.)

Datasets

RF-Deep is evaluated on public CT collections; this repository redistributes none of them. Obtain each from its original source:

NSCLC-Radiomics — TCIA
NSCLC-Radiogenomics (LRAD) — TCIA
RSNA-STR Pulmonary Embolism Detection (RSNA PE) — Kaggle
MIDRC COVID-19 negative CT (MIDRC C19⁻) — TCIA
MIDRC COVID-19 positive CT (MIDRC C19⁺) — TCIA
KiTS — kits-challenge.org
PancreasCT — medicaldecathlon.com
Breast cancer CT — institutional internal dataset, not publicly redistributed

What Can Be Reproduced Publicly

Using the public datasets listed above together with the released checkpoints, users can reproduce the main RF-Deep workflow: segmentation inference, deep-feature extraction, OOD evaluation, and most analysis scripts. Results that depend on the internal breast cancer CT dataset cannot be reproduced from public data alone, since that cohort is not publicly available.

Typical Workflow

To reproduce the paper workflow, the main steps are feature extraction, optional baseline generation, RF-Deep evaluation, and figure or analysis generation. Most scripts expect data to be indexed through JSON manifests in jsons/.

1. Feature Extraction

RF-Deep operates on hierarchical backbone (3D Swin Transformer) features extracted from the segmentation model.

python extract_features.py --model smit

Feature caches are typically written as .pkl files under pickle_data/.
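
The features are taken from a region of interest anchored to the predicted tumor segmentation. The sketch below shows one plausible way to compute such a crop with NumPy. It is not the code in extract_features.py: the function name tumor_anchored_crop, the centroid anchoring, the cubic crop of side 128, and the fallback to the volume centre for an empty mask are all assumptions made for illustration.

```python
import numpy as np


def tumor_anchored_crop(volume: np.ndarray, pred_mask: np.ndarray, size: int = 128) -> np.ndarray:
    """Crop a cubic ROI of side `size` centred on the predicted tumor mask.

    Illustrative only. Axes shorter than `size` are returned at full length;
    padding to a fixed shape is omitted for brevity.
    """
    if volume.shape != pred_mask.shape:
        raise ValueError("volume and pred_mask must have the same shape")

    coords = np.argwhere(pred_mask > 0)
    if coords.size == 0:
        # No predicted tumor: fall back to the volume centre (an assumption).
        center = np.array(volume.shape) // 2
    else:
        # Anchor the crop on the centroid of the predicted foreground.
        center = np.round(coords.mean(axis=0)).astype(int)

    slices = []
    for c, dim in zip(center, volume.shape):
        side = min(size, dim)
        # Shift the window so it stays entirely inside the volume.
        start = int(np.clip(c - side // 2, 0, dim - side))
        slices.append(slice(start, start + side))
    return volume[tuple(slices)]
```

Clamping the window instead of padding keeps every cropped voxel real image content, at the cost of the tumor no longer being exactly centred near the volume border.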

2. Baseline Generation

To compare RF-Deep against radiomics or logit-based uncertainty baselines:

python logit_baselines.py global --metric maxlogit
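
For orientation, a max-logit baseline turns the segmentation logits into a single scan-level score. The sketch below is a generic illustration, not the implementation in logit_baselines.py: the function name maxlogit_scan_score, the (classes, depth, height, width) layout, the mean aggregation over voxels, and the optional ROI mask are assumptions.

```python
from typing import Optional

import numpy as np


def maxlogit_scan_score(logits: np.ndarray, roi_mask: Optional[np.ndarray] = None) -> float:
    """Scan-level OOD score from segmentation logits; higher means more OOD.

    logits has shape (C, D, H, W). The per-voxel maximum logit is averaged
    over the volume, or over roi_mask > 0 when a mask is given, and negated
    so that low-confidence scans receive high scores.
    """
    voxel_max = logits.max(axis=0)
    if roi_mask is not None:
        voxel_max = voxel_max[roi_mask > 0]
        if voxel_max.size == 0:
            raise ValueError("roi_mask selects no voxels")
    return float(-voxel_max.mean())
```

Passing a mask restricts the score to a region of interest, which is the general idea behind an ROI-restricted variant.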

3. OOD Detection and Evaluation

Train and evaluate RF-Deep on the ID and OOD datasets:

python ood_rfdeep.py --method lodo --model-name smit --img-size 128 --train-size 20
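
The core idea, a lightweight random forest trained on scan-level deep features with a small number of exposed outliers, can be sketched with scikit-learn. This is a toy illustration under stated assumptions, not the logic of ood_rfdeep.py: the features are synthetic Gaussians instead of backbone features, n_exposed = 20 is an arbitrary value, and AUROC is used here as one common OOD metric.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-ins for scan-level deep features (one row per scan).
id_feats = rng.normal(loc=0.0, scale=1.0, size=(200, 16))
ood_feats = rng.normal(loc=1.5, scale=1.0, size=(200, 16))

# Limited outlier exposure: only a few scans of each kind are used for training.
n_exposed = 20  # arbitrary illustrative value
X_train = np.vstack([id_feats[:n_exposed], ood_feats[:n_exposed]])
y_train = np.concatenate([np.zeros(n_exposed), np.ones(n_exposed)])

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Evaluate on the held-out scans; the OOD score is the predicted probability of class 1.
X_test = np.vstack([id_feats[n_exposed:], ood_feats[n_exposed:]])
y_test = np.concatenate([np.zeros(len(id_feats) - n_exposed), np.ones(len(ood_feats) - n_exposed)])
ood_score = forest.predict_proba(X_test)[:, 1]
auroc = roc_auc_score(y_test, ood_score)
print(f"AUROC on held-out synthetic scans: {auroc:.3f}")
```

Because the forest only consumes fixed-length feature vectors, the same pattern applies to features from any backbone, which is what makes the approach post-hoc.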

Data and Paths

Datasets are expected under data/ by default, but that path is intentionally ignored because it may be machine-specific or a symlink. Shared code resolves paths through project_paths.py, and dataset roots can be overridden with environment variables when needed. Metadata required for scanner analysis and PyCERR-based radiomics lives in metadata_info/.

Dataset manifests under jsons/ are machine-specific and not redistributed; generate them locally with python -m scripts.make_json after obtaining the datasets. See jsons/README.md for the manifest schema.

Repository Guide

PROJECT_LAYOUT.md: canonical directory layout and output policy
CODE_REFERENCE.md: module-by-module reference for the shared codebase
AGENTS.md: orientation file for agentic AI tools (Claude Code, Codex, Cursor, etc.)
models/README.md: model architectures, feature-extraction expectations, and weight directories
paper_figures/README.md: figure-generation entrypoints used for paper assets
scripts/README.md: reusable operational and analysis scripts
jsons/README.md: dataset manifest conventions
metadata_info/README.md: synced metadata inputs for scanner, holdout, and radiomics-support analysis
results/README.md: generated analysis outputs
radiomics_features/README.md: generated radiomics CSV outputs
pickle_data/README.md: cached deep-feature pickle outputs
excelrecords/README.md: generated segmentation metric CSV outputs

Acknowledgement

We sincerely thank the authors of Swin UNETR and SMIT for open-sourcing their code and models. We also thank the maintainers of these great repositories: PyTorch, MONAI, PyCERR, DeepMind Surface Distance, NiBabel, and scikit-learn, among others. Finally, AI-driven coding assistants were used in the development of parallelization scripts, code cleanup, and relevant technical documentation.

Citation

@article{rangnekar2025tumor,
  title={Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation},
  author={Rangnekar, Aneesh and Veeraraghavan, Harini},
  journal={arXiv preprint arXiv:2512.08216},
  year={2025}
}
