SemoDepth

This repository is the official implementation of "Selection, Not Fusion: Radar-Modulated State Space Models for Radar-Camera Depth Estimation".

SemoDepth performs radar-camera depth estimation using two core designs:

Radar-Modulated Selection (RMS). Radar features are injected inside the Mamba selective scan via the step size $\Delta_t$ and readout $\mathbf{C}_t$, while the input projection $\mathbf{B}_t$ and state-evolution matrix $\mathbf{A}$ remain image-only. The Mamba main stream carries pure-image tokens; radar enters only through the $\Delta_t / \mathbf{C}_t$ paths.
Multi-View Scan Pyramid (MVSP). A hierarchical decoder applies global 4-direction Mamba scans at coarse resolutions, radar-centered windowed scans at mid resolution, and FiLM modulation at fine resolutions — putting scan-based fusion where context matters most and lightweight modulation where local detail dominates.
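
To make the RMS data flow concrete, here is a minimal single-channel sketch in plain Python. It is an illustration under stated assumptions, not the repository's implementation (that lives in cross_modal_mamba.py and relies on the mamba-ssm kernels). The function name rms_scan, the weights w_delta, u_delta, w_B, w_C and u_C, the softplus on the step size, and the simplified discretization $\bar{A}_t = \exp(\Delta_t \mathbf{A})$, $\bar{B}_t = \Delta_t \mathbf{B}_t$ are all choices made here for readability.

```python
import math


def softplus(z):
    # Numerically stable softplus; keeps the step size strictly positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def rms_scan(x, r, A, w_delta, u_delta, w_B, w_C, u_C):
    """Sequential selective scan for one channel (hypothetical helper).

    x: image tokens (list of floats), the only values written into the state.
    r: radar features aligned with x (list of floats).
    A: diagonal state matrix as a list of negative floats (image-only).
    w_*: image-side weights; u_*: radar-side weights (assumed names).
    """
    h = [0.0] * len(A)
    ys = []
    for x_t, r_t in zip(x, r):
        # Radar enters through the step size Delta_t ...
        delta = softplus(w_delta * x_t + u_delta * r_t)
        # ... and through the readout C_t.
        C = [w * x_t + u * r_t for w, u in zip(w_C, u_C)]
        # The input projection B_t stays image-only.
        B = [w * x_t for w in w_B]
        # State update: h_t = exp(Delta_t * A) * h_{t-1} + Delta_t * B_t * x_t
        h = [math.exp(delta * a) * h_i + delta * b * x_t
             for a, h_i, b in zip(A, h, B)]
        # Readout: y_t = C_t . h_t
        ys.append(sum(c * h_i for c, h_i in zip(C, h)))
    return ys


if __name__ == "__main__":
    x = [0.5, -1.0, 2.0]
    r = [3.0, -2.0, 1.0]
    A = [-1.0, -0.5]
    print(rms_scan(x, r, A, 0.4, 0.5, [1.0, 0.3], [0.7, -0.2], [0.1, 0.2]))
```

In this sketch, setting the radar-side weights u_delta and u_C to zero makes the output independent of r, so the scan reduces to an image-only selective scan. That is the property the zero-init parity tests described under Tests refer to.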

Install

pip install -r requirements.txt
pip install -e .

Two dependencies need a CUDA build environment and are installed separately:

pip install mamba-ssm causal-conv1d

For the nuScenes data path (ground-truth accumulation, day/night split):

pip install nuscenes-devkit

Datasets

nuScenes — full driving dataset; download from the official nuScenes website. Train/val/test manifests follow the radar-camera-fusion-depth conventions used by recent radar-camera depth work (TacoDepth, CaFNet, Li et al., Singh et al.). Pre-built manifests live under a single manifest_dir.
ZJU-4DRadarCam — campus dataset with 4D radar; download from the original release.

No paths are hard-coded. Each script takes its paths from CLI flags, and the environment variables below supply the defaults when a flag is omitted:

Variable	Used by	Purpose
`NUSCENES_MANIFEST_DIR`	`train_nuscenes.py`	Default `--manifest_dir`
`NUSCENES_LOG_ROOT`	`train_nuscenes.py`	Default `--log_root` (where checkpoints land)
`ZJU_DATA_ROOT`	`train_zju.py`	Default `--data_root`
`ZJU_LOG_ROOT`	`train_zju.py`	Default `--log_root`
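
The flag-with-environment-default pattern in the table can be reproduced with argparse by reading the variable when the parser is built. The sketch below is an assumption about how such a default could be wired, not a copy of train_nuscenes.py. The build_parser function and its env argument are hypothetical, and only the two flags listed in the table are included.

```python
import argparse
import os


def build_parser(env=None):
    """Build a parser whose path flags default to environment variables.

    env is a mapping used for lookups; it defaults to os.environ and exists
    only to make the sketch easy to exercise (an assumption of this example).
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Illustrative training CLI")
    parser.add_argument(
        "--manifest_dir",
        default=env.get("NUSCENES_MANIFEST_DIR"),
        help="Directory holding the pre-built manifests",
    )
    parser.add_argument(
        "--log_root",
        default=env.get("NUSCENES_LOG_ROOT"),
        help="Directory where checkpoints are written",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.manifest_dir, args.log_root)
```

With this wiring, an explicit `--manifest_dir` on the command line takes precedence over `NUSCENES_MANIFEST_DIR`, and the value is None when neither is given.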

Train

python scripts/train_nuscenes.py \
    --manifest_dir /path/to/manifests \
    --epochs 50 --batch_size 12 --lambda_grad 0.5

python scripts/train_zju.py \
    --data_root /path/to/ZJU-4DRadarCam/data \
    --epochs 100 --batch_size 12 --lambda_grad 0.5

Pretrained weights

A single archive on Google Drive bundles both released checkpoints:

Download from Google Drive

pip install gdown
gdown 1XleLB5xPMfHPUJ9pUS_ziN2DnH1-v9SC
# Extract the downloaded archive (unzip or tar -xzf, depending on its format); both .pth files are unpacked next to it

Checkpoint	Trained on	Eval set	MAE@80m (mm)	δ1@80m
`semodepth_nuscenes.pth`	nuScenes (160-sweep D_acc, horizon-cleaned)	nuScenes val (5869 frames)	1285	0.9436
`semodepth_zju.pth`	ZJU-4DRadarCam train	ZJU-4DRadarCam test	1137	0.9035

Both checkpoints use fusion_mode=BE with pure_image_tokens=True. Pass --max_depth 120 for the nuScenes weight and --max_depth 80 for the ZJU weight (matches each model's training-time depth ceiling).
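
The two metric columns in the table follow the usual depth-estimation definitions: MAE is the mean absolute error over valid ground-truth pixels, and δ1 is the fraction of those pixels where max(pred/gt, gt/pred) < 1.25, with both restricted to ground truth inside the evaluation range (80 m here). The sketch below is a minimal version that assumes flat lists of depths in metres and a hypothetical helper name, depth_metrics. The repository's own metrics live in mambadepth_fusion/utils/ and may differ in masking details.

```python
def depth_metrics(pred, gt, max_depth=80.0, eps=1e-6):
    """Return (MAE, delta1) over pixels with 0 < gt <= max_depth.

    pred, gt: equal-length sequences of depths in the same unit.
    eps: floor applied to predictions to avoid division by zero
         (an assumption of this sketch).
    """
    pairs = [(max(p, eps), g) for p, g in zip(pred, gt) if 0.0 < g <= max_depth]
    if not pairs:
        raise ValueError("no valid ground-truth pixels inside the range")
    # Mean absolute error, in the unit of the inputs.
    mae = sum(abs(p - g) for p, g in pairs) / len(pairs)
    # Threshold accuracy: share of pixels within a factor of 1.25 of the truth.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / len(pairs)
    return mae, delta1


if __name__ == "__main__":
    pred = [10.0, 20.0, 50.0, 90.0]
    gt = [10.0, 30.0, 40.0, 100.0]
    print(depth_metrics(pred, gt, max_depth=80.0))
```

The last pixel in the usage example has a ground-truth depth of 100 m, so it is excluded from the 80 m evaluation. If depths are given in metres, the MAE is in metres as well.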

Evaluate

python scripts/eval_nuscenes.py \
    --checkpoint /path/to/semodepth_nuscenes.pth \
    --manifest_dir /path/to/manifests

For the day/night protocol (TacoDepth Table 7):

python scripts/eval_nuscenes.py \
    --checkpoint /path/to/semodepth_nuscenes.pth \
    --manifest_dir /path/to/manifests \
    --frame_subset day \
    --dataroot /path/to/nuScenes

Build accumulated LiDAR ground truth (nuScenes)

The model is trained on a 160-sweep accumulated supervision target with a horizon-cleaning pass. Build it from the raw dataset:

python scripts/accumulate_lidar_gt.py \
    --dataroot /path/to/nuScenes \
    --derived_dir /path/to/nuScenes_derived \
    --output_subdir gt_lidar_160sweep --n_sweeps 160 --n_thread 40

python scripts/clean_160sweep_horizon.py \
    --src_root /path/to/nuScenes_derived/gt_lidar_160sweep \
    --output_dir /path/to/nuScenes_derived/gt_lidar_160sweep_cleaned
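
Conceptually, accumulating sweeps densifies the sparse single-sweep LiDAR depth map: points from many sweeps are brought into the target camera frame, projected onto the image, and merged per pixel. The sketch below shows only the projection-and-merge step with a nearest-depth (z-buffer) rule. It assumes the points are already expressed in the target camera's coordinates, so ego-motion compensation and moving-object handling are omitted. The accumulate_depth function, its arguments, and the nearest-depth rule are illustrative choices rather than a description of accumulate_lidar_gt.py, and the horizon-cleaning pass is not modelled.

```python
def accumulate_depth(sweeps, fx, fy, cx, cy, width, height, max_depth=120.0):
    """Merge several point sweeps into one depth map (hypothetical helper).

    sweeps: iterable of sweeps, each a list of (X, Y, Z) camera-frame points.
    fx, fy, cx, cy: pinhole intrinsics. Pixels with no point stay at 0.0.
    """
    depth = [[0.0] * width for _ in range(height)]
    for points in sweeps:
        for X, Y, Z in points:
            if Z <= 0.0 or Z > max_depth:
                continue  # behind the camera or beyond the depth ceiling
            # Pinhole projection to the nearest pixel.
            u = int(round(fx * X / Z + cx))
            v = int(round(fy * Y / Z + cy))
            if 0 <= u < width and 0 <= v < height:
                # Keep the closest surface seen at this pixel.
                if depth[v][u] == 0.0 or Z < depth[v][u]:
                    depth[v][u] = Z
    return depth


if __name__ == "__main__":
    sweeps = [[(0.0, 0.0, 10.0)], [(0.0, 0.0, 5.0), (1.0, 0.0, 10.0)]]
    for row in accumulate_depth(sweeps, 10.0, 10.0, 2.0, 2.0, 5, 5):
        print(row)
```

Each additional sweep can only fill empty pixels or replace a value with a nearer one, which is why a larger sweep count gives a denser supervision target.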

Tests

pytest tests/

Tests cover the zero-init parity guarantee (the radar-side projections start at zero, so the block matches vanilla Mamba at initialization) and gradient flow through each pathway. The tests require CUDA and mamba-ssm and are skipped automatically when either is unavailable.

Repository layout

mambadepth_fusion/model/ — RMS block (cross_modal_mamba.py), top-level model (depth_net.py), MVSP blocks (fusion_blocks.py), Radar GSE (radar_gse.py)
mambadepth_fusion/data/ — nuScenes and ZJU dataset loaders
mambadepth_fusion/train/ — training loop with the composite loss
mambadepth_fusion/utils/ — evaluation metrics
scripts/ — CLI wrappers for training, evaluation, qualitative visualization, ground-truth construction, and supervision cleaning
tests/ — smoke tests for the RMS block
