This repository provides a unified implementation of density ratio estimation methods, integrating multiple state-of-the-art papers from top-tier machine learning venues (NeurIPS, ICML, ICLR). The framework supports likelihood estimation, f-divergence estimation, mutual information estimation, change point detection, and out-of-distribution (OOD) detection.
Score-based generative modeling has emerged as a powerful paradigm for learning complex probability distributions. This framework unifies multiple approaches:
- Score-based Methods: DRE-$\infty$, D$^3$RE, MVP-DRE
- Kernel Methods: Kernel Density Ratio Estimation
- Neural Methods: KULSIF, NCE, InfoNCE, RKL, Hellinger, KL, Gamma, EW, PW
- Continuous Normalizing Flows: ODE-based likelihood computation
- 🚀 Unified Interface: Single entry point (
main.py) for all tasks and methods - 📊 Comprehensive Benchmarks: Synthetic 2D, tabular, image datasets
- 🔧 Flexible Configuration: Extensive hyperparameter control via command-line args
- 📈 Multi-Task Support: Likelihood estimation, f-divergence, MI, CPD, OOD detection
- 🖼️ Image Backbones: Support for ResNet and WideResNet architectures
# Create conda environment
conda create -n opendre python=3.10
conda activate opendre
# Install dependencies
pip install -r requirements.txt
# For NSF flows (optional, for real image experiments)
cd nsf && pip install -e .- Python 3.10+
- PyTorch 2.1.2+
- PyTorch Lightning 2.1.2+
- torchdiffeq, torchquad
- See
requirements.txtfor full list
The codebase is split into top-level packages for easier navigation:
| Package | Contents |
|---|---|
nn_models/ |
Neural network models (tabular_models, image_models, legacy UnifiedModel wrapper, plus logical layers modules) |
pl_data/ |
Lightning DataModules and dataset configs |
pl_models/ |
Lightning modules per task |
dre/ |
Core DRE components: paths (dre/path), losses (dre/losses), estimators (dre/estimators.py) |
dre_utils/ |
Project helpers (utils.py, data_utils.py, tabular/), sampling routines, plotting/visualization, logger (dre_utils/logger.py), and helper scripts (e.g. download_ood_benchmark.py). |
config/ |
Argument parsing and save paths |
Import examples: from pl_data import SUBTASK_DATA_MODULES, from nn_models import JointScoreModel, from config import parse_args, from path import ToyInterpXt.
All experiments are run through main.py with subparsers for different tasks:
# Likelihood estimation
python main.py likelihood_estimation --subtask synthetic2d --data checkerboard --method dre
# Tabular density estimation
python main.py likelihood_estimation --subtask tabular --data bsds300 --method dre --batch_size 20000
# Real image modeling
python main.py likelihood_estimation --subtask real_image --data mnist --method dre
# Distribution shift estimation
python main.py fdiv_estimation --subtask distribution_shift --data gauss_shift --shift_steps 5 --shift 0.05 --d 80
# Mutual information estimation
python main.py fdiv_estimation --subtask mutual_information --data diag_multi_gauss --epochs 40000 --d 160
# Change point detection
python main.py fdiv_estimation --subtask change_point_detection --data cpd_text --use_simplex True --cpd_swl 64
# OOD detection (see "OOD detection" section below for tasks and parameters)
python main.py fdiv_estimation --subtask ood_detection --sub_method neural --subsub_method nce --ood_use_imglist True --ood_in_dist cifar10 --image_backbone resnet18
python main.py fdiv_estimation --subtask ood_detection --sub_method score --subsub_method d3re --condition True --joint False --ood_use_imglist True --ood_in_dist fashion_mnist --image_backbone resnet18Most experiments follow the same high-level pattern:
python main.py <task> \
--subtask <subtask> \
--data <dataset> \
--method <dre|cnf> \
--sub_method <score|neural|kernel> \
--subsub_method <method_variant> \
[task-specific arguments...]- Common arguments
--task(implied by subparser):likelihood_estimationorfdiv_estimation--subtask: task-specific subtask (e.g.synthetic2d,tabular,real_image,distribution_shift,mutual_information,change_point_detection,ood_detection,mi_realdata)--data: dataset name (depends on subtask; see tables below)--method: high-level method family, currentlydre(density ratio estimation) orcnf(continuous normalizing flows, for likelihood estimation)--sub_method: estimator type:score: score-based DRE (path-based models)neural: neural density-ratio estimators (e.g. NCE, InfoNCE, logistic, PW, χ²)kernel: kernel density-ratio estimators
--subsub_method: specific method variant within the chosensub_method:- score-based:
dre_infty,d3re - neural:
kulsif,logistic,nce,infonce,rkl,hellinger,kl,gamma,ew,pw,chisq - kernel: currently no sub-variants
- score-based:
-
Likelihood estimation (
task=likelihood_estimation)- 2D synthetic (score-based vs neural vs kernel):
# Score-based DRE (e.g., DRE∞) python main.py likelihood_estimation \ --subtask synthetic2d \ --data checkerboard \ --method dre \ --sub_method score \ --subsub_method dre_infty # Neural DRE (e.g., NCE) python main.py likelihood_estimation \ --subtask synthetic2d \ --data checkerboard \ --method dre \ --sub_method neural \ --subsub_method nce # Kernel DRE python main.py likelihood_estimation \ --subtask synthetic2d \ --data checkerboard \ --method dre \ --sub_method kernel
- Tabular / real image tasks share the same pattern; only
--subtask,--data, and some optimization hyperparameters (e.g.,--batch_size,--epochs) change.
- 2D synthetic (score-based vs neural vs kernel):
-
F-divergence estimation (
task=fdiv_estimation)- Distribution shift, mutual information, change point detection:
# Score-based DRE for distribution shift python main.py fdiv_estimation \ --subtask distribution_shift \ --data gauss_shift \ --method dre \ --sub_method score \ --subsub_method dre_infty # Neural DRE for mutual information python main.py fdiv_estimation \ --subtask mutual_information \ --data diag_multi_gauss \ --method dre \ --sub_method neural \ --subsub_method nce \ --epochs 40000 --d 160
- Distribution shift, mutual information, change point detection:
-
OOD detection (
task=fdiv_estimation,subtask=ood_detection)- The OOD pipeline is the most involved; the recommended combinations (score vs neural methods, CIFAR-10 vs CIFAR-100, train vs test, multi-seed runs) are encoded in
dre_ood.sh. - Core patterns (per single run) are:
# Neural OOD (e.g., NCE) with ResNet-18 backbone python main.py fdiv_estimation \ --subtask ood_detection \ --method dre \ --sub_method neural \ --subsub_method nce \ --ood_use_imglist True \ --ood_in_dist cifar10 \ --image_backbone resnet18 \ --epochs 1000 \ --batch_size 768 \ --lr 0.001 \ --sample_noise_std 0.005 \ --ood_train random # Score-based OOD (e.g., D3RE or OS-DRE) with UNet backbone python main.py fdiv_estimation \ --subtask ood_detection \ --method dre \ --sub_method score \ --subsub_method d3re \ # or dre_infty --ood_use_imglist True \ --ood_in_dist cifar10 \ --image_backbone unet \ --path_type trigonometric \ --t_mode lognorm \ --condition True \ --joint True \ --bridge 1 \ --epochs 1000 \ --batch_size 768 \ --lr 0.001 \ --sample_noise_std 0.005 \ --ood_train random
- For large-scale, multi-GPU, multi-seed sweeps, use the provided launcher:
# Test mode (single-GPU, timing-friendly) bash dre_ood.sh test score cifar10 "0" # Train all score-based methods on CIFAR-10 and CIFAR-100, multiple GPUs and seeds bash dre_ood.sh train score both "0 1 2 3" "1 2 3407"
dre_ood.shcentralizes:- Shared OOD hyperparameters (epochs, LR, noise level, imglist mode, path type, etc.)
- Method lists:
- Neural:
NEURAL_METHODS=(nce pw chisq infonce logistic) - Score:
SCORE_METHODS=(dre_infty d3re)
- Neural:
- GPU scheduling logic for training mode.
- The OOD pipeline is the most involved; the recommended combinations (score vs neural methods, CIFAR-10 vs CIFAR-100, train vs test, multi-seed runs) are encoded in
Data is read from the data folder (sibling to OpenDRE; path set by --data_dir, default ../data). Two modes:
| Mode | Description | Parameters |
|---|---|---|
| Built-in | ID data from torchvision (download to data_dir). OOD source and eval sets fixed per ID. |
--ood_in_dist, --ood_base_type, --image_backbone, etc. |
| Imglist | ID and OOD from list files under data_dir (one line per image_path label). Benchmarks defined in pl_data/ood_detection.py. |
--ood_use_imglist True, --ood_in_dist (cifar10 or cifar100), --data_dir, --image_backbone, etc. |
| Task | In-distribution | OOD source / eval | Command-line |
|---|---|---|---|
| MNIST OOD | MNIST | --ood_base_type: local (cropped MNIST), general (Fashion MNIST), universal (300K random images). Eval: Fashion, EMNIST. |
--ood_in_dist mnist --ood_base_type general --image_backbone resnet18 |
| Fashion MNIST OOD | Fashion MNIST | Same options. Eval: MNIST, EMNIST. | --ood_in_dist fashion_mnist --ood_base_type general --image_backbone resnet18 |
| CIFAR-10 OOD | CIFAR-10 | Same options. Eval: SVHN, CIFAR-100, LSUN, CelebA. | --ood_in_dist cifar10 --ood_base_type general --image_backbone wideresnet28_10 |
| CIFAR-100 OOD | CIFAR-100 | Same options. Eval: SVHN, CIFAR-10, LSUN, CelebA. | --ood_in_dist cifar100 --ood_base_type general --image_backbone wideresnet28_10 |
- Backbone:
--image_backbone resnet18(MNIST/Fashion);--image_backbone wideresnet28_10(CIFAR, recommended). - Optional:
--sample_noise_std 0.1,--ood_crop_size 12(local),--image_dropout 0.3.
Imglist mode uses list files (one line per image_path label) and matches the OpenOOD data layout. Use this when you want standard near-OOD / far-OOD splits (e.g. CIFAR-100, TinyImageNet, MNIST, SVHN as OOD).
How to get the txt files and data
-
Option A (recommended): From OpenDRE repo, run our script (uses same OpenOOD Google Drive IDs):
pip install gdown python dre_utils/download_ood_benchmark.py --save_dir ../data
This creates
../data/benchmark_imglist/(all txt lists) and../data/images_classic/{cifar10,cifar100,tin,mnist,svhn}/. Then set--data_dir ../data. -
Option B: Clone OpenOOD and run their download script so that
data/contains:benchmark_imglist/— all.txtlist files (e.g.cifar10/train_cifar10.txt,cifar10/test_cifar100.txt, …)images_classic/— image folders (cifar10, cifar100, tin, mnist, svhn, …) From OpenOOD repo (withpip install gdown):
python scripts/download/download.py --contents datasets --datasets cifar-10 --save_dir /path/to/data --dataset_mode benchmark
For CIFAR-10 + CIFAR-100 benchmarks you can use
--datasets ood_v1.5and download both. The script fetches benchmark_imglist (txt) and the image datasets (cifar10, cifar100, tin, mnist, svhn, etc.) intosave_dir. -
Point OpenDRE at that folder:
--data_dir /path/to/data(the directory that containsbenchmark_imglistandimages_classic).
Near-OOD vs far-OOD (OpenOOD convention)
- Near-OOD: semantically closer to ID (e.g. CIFAR-100, TinyImageNet when ID is CIFAR-10).
- Far-OOD: visually/semantically farther (e.g. MNIST, SVHN, Texture, Places365 when ID is CIFAR-10).
In pl_data/ood_detection.py, OOD_IMGLIST_BENCHMARKS["cifar10"]["ood_eval"] defines the evaluation OOD sets by name and list path; the same txt files are produced by OpenOOD’s download. You do not set nearood/farood via a single parameter — they are fixed by the benchmark (list of eval datasets). To add or remove OOD sets, edit OOD_IMGLIST_BENCHMARKS in pl_data/ood_detection.py.
Layout after OpenOOD download
data_dir/
benchmark_imglist/
cifar10/ train_cifar10.txt, val_cifar10.txt, test_cifar10.txt, val_tin.txt, test_cifar100.txt, test_tin.txt, test_mnist.txt, test_svhn.txt, ...
cifar100/ ...
images_classic/
cifar10/, cifar100/, tin/, mnist/, svhn/, ...
Example
# After downloading with OpenOOD script and setting data_dir to that data folder
python main.py fdiv_estimation --subtask ood_detection --ood_use_imglist True --ood_in_dist cifar10 --image_backbone resnet18 --epochs 500 --sub_method neural --subsub_method nce| Subtask | Datasets | Description |
|---|---|---|
synthetic2d |
swissroll, 8gaussians, pinwheel, circles, moons, 2spirals, checkerboard, rings, tree | 2D synthetic distributions for visualization |
tabular |
bsds300 (63D), power (6D), gas (8D), hepmass (21D), miniboone (43D) | UCI benchmark datasets |
real_image |
mnist, fashion_mnist, cifar10, svhn, celeba, lsun_church, frey_face | Image density estimation |
| Subtask | Datasets | Description |
|---|---|---|
distribution_shift |
gauss_shift, beta_shift, cifar_c, domainbed, gauss_shift_kl | Measure distribution shift between domains |
mutual_information |
half_cube_map, asinh_mapping, additive_noise, gamma_exponential, diag_multi_gauss, gauss | Estimate mutual information I(X;Y) |
mi_realdata |
image, text, mixture | Real-world MI estimation (MNIST, IMDB, CIFAR-10) |
change_point_detection |
cpd_mc, cpd_text, cpd_creditcard | Detect change points in time series |
ood_detection |
mnist, fashion_mnist, cifar10, cifar100 | Detect out-of-distribution samples |
This section summarizes where the data for each task/subtask comes from, where it is stored, and how to prepare it. The root data directory is specified by --data_dir (default ../data).
| Subtask | Source | Location / How to obtain |
|---|---|---|
| synthetic2d | Generated in code | No download required; datasets (e.g., swissroll, 8gaussians, pinwheel, etc.) are generated on the fly. |
| tabular | Local files (UCI, etc.) | Place data under {data_dir}/tabular/ with the following structure:• BSDS300: BSDS300/BSDS300.hdf5• power: power/data.npy• gas: gas/ethylene_CO.pickle• hepmass: hepmass/1000_train.csv, hepmass/1000_test.csv• miniboone: miniboone/data.npy |
| real_image | Torchvision download + local | mnist / fashion_mnist / cifar10 / svhn / lsun_church / celeba: downloaded via torchvision with download=True into {data_dir}.frey_face: manually place frey_rawface.mat under {data_dir}. |
| Subtask | Source | Location / How to obtain |
|---|---|---|
| distribution_shift | Generated in code | Synthetic data (e.g., gauss_shift, beta_shift, cifar_c, domainbed); no external files needed. |
| mutual_information | Generated in code | Synthetic data (e.g., half_cube_map, gauss); no download required. |
| mi_realdata | Torchvision + local text data | Uses args.data_dir: image data (MNIST/CIFAR-10) is downloaded via torchvision; text data (IMDB, etc.) should be placed under the corresponding subdirectories of {data_dir} as expected by the code. |
| change_point_detection | Local files / synthetic | cpd_mc: synthetic, no files needed. cpd_text: place Language Detection.csv under ./data/density_ratio_estimation/change_point_detection/ (or the path configured in the code).cpd_creditcard: place creditcard.txt (or equivalent CSV) in the same directory. |
| ood_detection | See OOD section below | See OOD data: how to prepare txt and images. |
-
Built-in ID datasets (no imglist)
ID datasets such as CIFAR-10/CIFAR-100 are downloaded automatically via torchvision into{data_dir}. An optional300K_random_images.npycan be used for universal OOD; you need to prepare this file yourself and place it at the expected path. -
Imglist-style OOD (OpenOOD-style)
Two options:
Option A: use the helper script
python dre_utils/download_ood_benchmark.py
This downloadsbenchmark_imglistandimages_classicinto the configured data directory (either from the script or environment variables).
Option B: clone OpenOOD and preparebenchmark_imglist/andimages_classic/as described in its documentation, then place them under{data_dir}so that the layout matchespl_data/ood_detection.py(OOD_IMGLIST_BENCHMARKS,list_subdir/image_subdir, including near- and far-OOD).
When using imglist mode, set--ood_use_imglist True; onlycifar10andcifar100are supported as ID datasets.
| Path Type | Description | Flag |
|---|---|---|
VP |
Variance Preserving SDE | Default |
subVP |
Sub-Variance Preserving SDE | --path_type subVP |
bridge |
Bridge Matching | --bridge |
mvp |
Mimimum Variance Path | --mvp |
OT |
Optimal Transport | --OT |
SAI |
Secant Alignment Identity | --SAI |
# Neural density ratio estimation
python main.py likelihood_estimation --subtask synthetic2d --sub_method neural --subsub_method kulsif
# Kernel density ratio estimation
python main.py likelihood_estimation --subtask synthetic2d --sub_method kernel# Use ResNet-18 backbone (default for MNIST-like datasets)
python main.py fdiv_estimation --subtask ood_detection --ood_in_dist mnist --image_backbone resnet18
# Use ResNet-50 backbone
python main.py fdiv_estimation --subtask ood_detection --ood_in_dist mnist --image_backbone resnet50
# Use WideResNet-28-10 (recommended for CIFAR-10/100)
python main.py fdiv_estimation --subtask ood_detection --ood_in_dist cifar10 --image_backbone wideresnet28_10 --image_dropout 0.3| Argument | Default | Description |
|---|---|---|
--task |
Required | likelihood_estimation or fdiv_estimation |
--subtask |
Required | Task-specific subtask |
--method |
dre |
dre or cnf |
--sub_method |
score |
score, neural, or kernel |
--subsub_method |
dre_infty |
Specific method variant |
--data |
Required | Dataset name |
--gpu_id |
[0] |
GPU device IDs |
--batch_size |
9600 | Training batch size |
--epochs |
3000 | Number of training epochs |
--lr |
1e-3 | Learning rate |
--dims |
128-128-128-128-128 |
Hidden layer dimensions |
--seed |
3407 | Random seed |
| Argument | Default | Description |
|---|---|---|
--ood_in_dist |
mnist |
In-distribution dataset |
--ood_base_type |
universal |
Base distribution type: local, general, universal |
--sample_noise_std |
0.1 | Noise standard deviation (unified name) |
--ood_data_dir |
../data |
Data directory |
--image_backbone |
resnet18 |
Image backbone: resnet18, resnet34, resnet50, wideresnet28_10 |
--image_dropout |
0.3 | Dropout rate for WideResNet |
main.py → pl_data (DataModule) → pl_models (Lightning module) → training
- main.py: Entry point and argument parsing.
- pl_data/: DataModules and dataset configs; register in
SUBTASK_DATA_MODULES. - pl_models/: Lightning modules per task; register in
SUBTASK_MODLE_MODULES. - config/: Task-specific arguments (e.g.
get_fdiv_args,get_likelihood_args). - nn_models/: Model architectures used by
pl_modelsviabuild_model().
Use an existing task (e.g. OOD detection in pl_models/ood_detection.py, pl_data/ood_detection.py) as a template.
| Step | Where | What to do |
|---|---|---|
| 1 | config/config_setting.py |
Add arguments in the right get_*_args() (e.g. get_fdiv_args). |
| 2 | pl_data/ |
Add dataset name to the task’s list; implement a DataModule and register it in SUBTASK_DATA_MODULES. |
| 3 | pl_models/ |
Add a Lightning module (inherit from BaseDREMetrics or the task base); implement build_model(), training_step, and optionally validation_step; register in SUBTASK_MODLE_MODULES. |
| 4 | main.py |
In the task branch, instantiate your DataModule and Model and pass them to the trainer. |
No need to copy large blocks: open the corresponding file for an existing subtask and mirror its structure (imports, registration keys, method names).
- Density ratio: Many tasks use qx (reference, e.g. t=0) and px (target, e.g. t=1). The model estimates r(x) = px(x) / qx(x). Batch format is often
(qx, px)or(x0, x1). - New datasets: For synthetic data, add a generator in
dre_utils/(e.g.data_utils.py) and wire it in the task’s DataModule. For tabular/image, add the name to the task list inpl_dataand implement loading in that DataModule. - Custom losses: Implement a train-step callable (e.g.
def __call__(self, model, batch, step) -> dict) and use it in the task’sset_basic_functions()/train_step_fn. Seelosses/score_matching.py,losses/kernel_density_ratio.py,losses/neural_density_ratio.pyfor the expected interface.
Models live in nn_models/:
tabular_models.py: tabular score and density-ratio modelsimage_models.py: image-based UNet/ResNet/WideResNet score and density-ratio modelsunified.py: thin compatibility wrapper (UnifiedModel) that dispatches to the concrete models above. New code should usetabular_models/image_modelsdirectly.
Use the existing pl_models’ build_model() methods as the reference for constructor
arguments and how models are wired into tasks.
| Use case | Tabular | Image (UNet / ResNet / WRN) |
|---|---|---|
| Score (s(x, t)) | JointScoreModel |
ImageJointScoreModel (UNet), ImageJointScoreModelResNet |
| Density ratio (r(x) = p(x)/q(x)) | NeuralDensityRatioModel |
ResNetDensityRatioModel |
Example (tabular joint score model):
from nn_models import JointScoreModel
model = JointScoreModel(
input_dim=args.d,
hidden_dims=[256, 256, 256],
embed_dim=args.embed_dim,
nonlinearity=args.nonlinearity,
args=args,
)For image backbones (UNet / ResNet / WideResNet), see pl_models/ood_detection.py or
pl_models/density_estimation_real_image.py for how the Lightning modules build the
corresponding nn_models based on args.image_backbone, args.image_backbone_norm,
args.image_dropout, etc. Activation and polynomial options are documented in
layers/activations.py and in the model docstrings in tabular_models.py /
image_models.py.
You do not need to modify UnifiedModel when adding a new model. Instead:
-
New tabular model
- Implement a new class in
nn_models/tabular_models.py(e.g.MyTabularJointScoreModel), following the patterns ofJointScoreModel. - If a task should use this model, update the corresponding
pl_models/*Lightning module (build_model()or_build_default_model()) to instantiate your class.
- Implement a new class in
-
New image backbone or image model
- Implement a new class in
nn_models/image_models.py(e.g.MyImageJointScoreModel), or extend the ResNet/UNet helpers if you add a new backbone. - Wire it in
pl_modelswhere image models are built, typically inpl_models/ood_detection.py,pl_models/density_estimation_real_image.py, orpl_models/mi_real.py, based onargs.image_backboneor a new flag you add.
- Implement a new class in
-
Optional compatibility layer
- Only if you still rely on
UnifiedModelin legacy scripts, you can extendnn_models/unified.pyto route a new(data_type, model_type, backbone)combination to your new concrete model. For all new code paths, prefer usingtabular_models/image_modelsdirectly.
- Only if you still rely on
OpenDRE/
├── main.py
├── config/ # Argument parsing (e.g. config_setting.py)
├── pl_data/ # DataModules and dataset configs
├── pl_models/ # Lightning modules per task
├── nn_models/ # Model architectures (tabular_models, image_models, legacy UnifiedModel wrapper, plus layers)
├── nsf/ # Original NSF codebase (flows, datasets, experiments, checkpoints)
├── dre/ # Core DRE components: paths, losses, estimators
├── dre_utils/ # Utilities, data_utils, logger, sampling, visualization, helper scripts
└── results/ # Experiment logs, figures, and tables
This framework integrates methods from multiple papers. Please cite the respective works:
- D3RE: Dequantified Diffusion-Schrödinger Bridge for Density Ratio Estimation
@inproceedings{chen2025dequantified,
title={Dequantified Diffusion-Schr\"odinger Bridge for Density Ratio Estimation},
author={Wei Chen and Shigui Li and Jiacheng Li and Junmei Yang and John Paisley and Delu Zeng},
booktitle={International Conference on Machine Learning},
year={2025},
url={https://openreview.net/forum?id=zvyHCOcwsw}
}- MVP-DRE: A Minimum Variance Path Principle for Accurate and Stable Score-Based Density Ratio Estimation
@inproceedings{chen2026a,
title={A Minimum Variance Path Principle for Accurate and Stable Score-Based Density Ratio Estimation},
author={Wei Chen and Jiacheng Li and Shigui Li and Zhiqi Lin and Junmei Yang and John Paisley and Delu Zeng},
year={2026},
url={https://openreview.net/forum?id=vf16PZJWD1}
}- CNF: Neural Ordinary Differential Equations
- SDE: Score-Based Generative Modeling
MIT License - See LICENSE file for details.
This repository builds upon open-source implementations from the research community.