HouV001/SemoDepth

SemoDepth

This repository is the official implementation of "Selection, Not Fusion: Radar-Modulated State Space Models for Radar-Camera Depth Estimation".

SemoDepth performs radar-camera depth estimation and is built on two core designs:

  1. Radar-Modulated Selection (RMS). Radar features are injected inside the Mamba selective scan via the step size $\Delta_t$ and readout $\mathbf{C}_t$, while the input projection $\mathbf{B}_t$ and state-evolution matrix $\mathbf{A}$ remain image-only. The Mamba main stream carries pure-image tokens; radar enters only through the $\Delta_t / \mathbf{C}_t$ paths.
  2. Multi-View Scan Pyramid (MVSP). A hierarchical decoder applies global 4-direction Mamba scans at coarse resolutions, radar-centered windowed scans at mid resolution, and FiLM modulation at fine resolutions — putting scan-based fusion where context matters most and lightweight modulation where local detail dominates.
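As a rough illustration of the RMS idea above, the sketch below runs a single-channel selective scan in plain Python. It is not the repository's implementation: the function name `selective_scan`, the scalar weights (`w_dt_img`, `w_dt_rad`, `w_B`, `w_C_img`, `w_C_rad`), and the additive form of the radar terms are assumptions chosen for readability. It uses the common Mamba discretization, $\bar{\mathbf{A}}_t = \exp(\Delta_t \mathbf{A})$ and $\bar{\mathbf{B}}_t = \Delta_t \mathbf{B}_t$, with radar entering only through $\Delta_t$ and $\mathbf{C}_t$.

```python
import math


def softplus(z):
    # Numerically stable softplus, keeps the step size positive.
    return math.log1p(math.exp(-abs(z))) + max(z, 0.0)


def selective_scan(x, r, A, w_B, w_dt_img, w_dt_rad, w_C_img, w_C_rad):
    """Single-channel selective scan (illustrative, hypothetical parameterization).

    x: image tokens (floats), r: radar features aligned with x (floats),
    A: per-state decay rates (negative floats), w_*: toy projection weights.
    """
    h = [0.0] * len(A)
    ys = []
    for x_t, r_t in zip(x, r):
        # Radar modulates the step size Delta_t ...
        dt = softplus(w_dt_img * x_t + w_dt_rad * r_t)
        # ... and the readout C_t.
        C_t = [wi * x_t + wr * r_t for wi, wr in zip(w_C_img, w_C_rad)]
        # The input projection B_t stays image-only, and A is shared.
        B_t = [w * x_t for w in w_B]
        # State update: h_t = exp(dt * A) * h_{t-1} + dt * B_t * x_t
        h = [math.exp(dt * a) * h_n + dt * b * x_t
             for a, h_n, b in zip(A, h, B_t)]
        ys.append(sum(c * h_n for c, h_n in zip(C_t, h)))
    return ys


x = [0.5, -1.0, 2.0, 0.3]   # toy image tokens
r = [3.0, 1.0, -2.0, 5.0]   # toy radar features
image_side = dict(A=[-1.0, -0.5], w_B=[1.0, 0.5], w_dt_img=0.7,
                  w_C_img=[1.0, -1.0])

# Radar weights at zero: the scan ignores the radar input entirely.
y_zero_init = selective_scan(x, r, w_dt_rad=0.0, w_C_rad=[0.0, 0.0],
                             **image_side)
y_no_radar = selective_scan(x, [0.0] * 4, w_dt_rad=0.0, w_C_rad=[0.0, 0.0],
                            **image_side)
# Non-zero radar weights: the same image tokens are read out differently.
y_modulated = selective_scan(x, r, w_dt_rad=0.5, w_C_rad=[0.2, 0.1],
                             **image_side)
```

With the radar-side weights at zero, `y_zero_init` equals `y_no_radar`, which mirrors the zero-init parity property that the test suite checks for the real block.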

Install

pip install -r requirements.txt
pip install -e .

Two dependencies need a CUDA build environment and are installed separately:

pip install mamba-ssm causal-conv1d

For the nuScenes data path (ground-truth accumulation, day/night split):

pip install nuscenes-devkit

Datasets

  • nuScenes — full driving dataset; download from the official nuScenes website. Train/val/test manifests follow the radar-camera-fusion-depth conventions used by recent radar-camera depth work (TacoDepth, CaFNet, Li et al., Singh et al.). Pre-built manifests live under a single manifest_dir.
  • ZJU-4DRadarCam — campus dataset with 4D radar; download from the original release.

No paths are hard-coded. Each script takes its paths from CLI flags, and environment variables supply the defaults:

| Variable | Used by | Purpose |
| --- | --- | --- |
| NUSCENES_MANIFEST_DIR | train_nuscenes.py | Default --manifest_dir |
| NUSCENES_LOG_ROOT | train_nuscenes.py | Default --log_root (where checkpoints land) |
| ZJU_DATA_ROOT | train_zju.py | Default --data_root |
| ZJU_LOG_ROOT | train_zju.py | Default --log_root |
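One common way to get this flag-over-environment behaviour is to read the environment variable as the argparse default. The sketch below assumes that pattern; the helper names `build_parser` and `parse_paths` are hypothetical, and the actual scripts may wire this up differently.

```python
import argparse
import os


def build_parser():
    # Defaults are read from the environment when the parser is built,
    # so an explicit CLI flag always wins over the environment variable.
    parser = argparse.ArgumentParser(description="Illustrative path handling")
    parser.add_argument(
        "--manifest_dir",
        default=os.environ.get("NUSCENES_MANIFEST_DIR"),
        help="Manifest directory (defaults to $NUSCENES_MANIFEST_DIR)",
    )
    parser.add_argument(
        "--log_root",
        default=os.environ.get("NUSCENES_LOG_ROOT"),
        help="Where checkpoints land (defaults to $NUSCENES_LOG_ROOT)",
    )
    return parser


def parse_paths(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.manifest_dir is None:
        # Neither the flag nor the environment variable was provided.
        parser.error("pass --manifest_dir or set NUSCENES_MANIFEST_DIR")
    return args
```

Calling `parse_paths([])` with `NUSCENES_MANIFEST_DIR` exported returns the exported path, while `parse_paths(["--manifest_dir", "/other/path"])` returns the flag value.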

Train

python scripts/train_nuscenes.py \
    --manifest_dir /path/to/manifests \
    --epochs 50 --batch_size 12 --lambda_grad 0.5

python scripts/train_zju.py \
    --data_root /path/to/ZJU-4DRadarCam/data \
    --epochs 100 --batch_size 12 --lambda_grad 0.5

Pretrained weights

A single archive on Google Drive bundles both released checkpoints:

Download from Google Drive

pip install gdown
gdown 1XleLB5xPMfHPUJ9pUS_ziN2DnH1-v9SC
# Extract the downloaded archive (unzip or tar -xzf, depending on its format); both .pth files end up next to it

| Checkpoint | Trained on | Eval set | MAE@80m | δ1@80m |
| --- | --- | --- | --- | --- |
| semodepth_nuscenes.pth | nuScenes (160-sweep D_acc, horizon-cleaned) | nuScenes val (5869 frames) | 1285 | 0.9436 |
| semodepth_zju.pth | ZJU-4DRadarCam train | ZJU-4DRadarCam test | 1137 | 0.9035 |

Both checkpoints use fusion_mode=BE with pure_image_tokens=True. Pass --max_depth 120 for the nuScenes weights and --max_depth 80 for the ZJU weights; these values match each model's training-time depth ceiling.
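For readers unfamiliar with the two reported metrics, the sketch below computes them under their standard definitions: MAE is the mean absolute error over valid pixels, and δ1 is the fraction of valid pixels where max(pred/gt, gt/pred) < 1.25. The function name `depth_metrics` is hypothetical, and treating "valid" as 0 < gt <= 80 m is an assumption about the "@80m" protocol; the repository's metric code may differ in detail.

```python
def depth_metrics(pred, gt, max_depth=80.0):
    """Return (MAE, delta1) over pixels with 0 < gt <= max_depth.

    pred, gt: flat sequences of depths in the same unit.
    MAE is returned in that same unit.
    """
    abs_errors = []
    within = 0
    for p, g in zip(pred, gt):
        if not (0.0 < g <= max_depth):
            continue  # no ground truth here, or beyond the evaluation range
        abs_errors.append(abs(p - g))
        if p > 0.0 and max(p / g, g / p) < 1.25:
            within += 1
    n = len(abs_errors)
    if n == 0:
        raise ValueError("no valid ground-truth pixels")
    return sum(abs_errors) / n, within / n


# Toy example: the last two pixels are ignored (gt out of range / missing).
pred = [10.0, 20.0, 5.0, 100.0, 7.0]
gt = [10.0, 24.0, 10.0, 90.0, 0.0]
mae, delta1 = depth_metrics(pred, gt, max_depth=80.0)
```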

Evaluate

python scripts/eval_nuscenes.py \
    --checkpoint /path/to/checkpoint.pt \
    --manifest_dir /path/to/manifests

For the day/night protocol (TacoDepth Table 7):

python scripts/eval_nuscenes.py \
    --checkpoint /path/to/checkpoint.pt \
    --manifest_dir /path/to/manifests \
    --frame_subset day \
    --dataroot /path/to/nuScenes

Build accumulated LiDAR ground truth (nuScenes)

The nuScenes model is trained on a 160-sweep accumulated LiDAR supervision target that is then passed through a horizon-cleaning step. Build it from the raw dataset:

python scripts/accumulate_lidar_gt.py \
    --dataroot /path/to/nuScenes \
    --derived_dir /path/to/nuScenes_derived \
    --output_subdir gt_lidar_160sweep --n_sweeps 160 --n_thread 40

python scripts/clean_160sweep_horizon.py \
    --src_root /path/to/nuScenes_derived/gt_lidar_160sweep \
    --output_dir /path/to/nuScenes_derived/gt_lidar_160sweep_cleaned
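To give a sense of what sweep accumulation involves, the sketch below merges several LiDAR sweeps into one sparse depth map. It assumes each sweep comes with a rigid transform into the reference camera frame and that the nearest point wins when several land on the same pixel. The names `transform_point` and `accumulate_depth` are hypothetical; the sketch ignores moving objects and does not attempt the horizon-cleaning step, whose details are not described here.

```python
def transform_point(T, p):
    """Apply a 3x4 rigid transform [R | t] (nested lists) to a 3D point."""
    return tuple(
        T[i][0] * p[0] + T[i][1] * p[1] + T[i][2] * p[2] + T[i][3]
        for i in range(3)
    )


def accumulate_depth(sweeps, fx, fy, cx, cy, width, height):
    """Merge sweeps into a sparse depth map {(v, u): depth}.

    sweeps: iterable of (T_sweep_to_cam, points), where points are (X, Y, Z)
    in the sweep's own frame and T maps them into the reference camera frame.
    """
    depth = {}
    for T, points in sweeps:
        for p in points:
            X, Y, Z = transform_point(T, p)
            if Z <= 0.0:
                continue  # behind the camera
            # Pinhole projection to pixel coordinates.
            u = int(round(fx * X / Z + cx))
            v = int(round(fy * Y / Z + cy))
            if 0 <= u < width and 0 <= v < height:
                # Keep the nearest return per pixel (assumed occlusion rule).
                if (v, u) not in depth or Z < depth[(v, u)]:
                    depth[(v, u)] = Z
    return depth


IDENTITY = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
SHIFT_Z = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -2.0]]
sweeps = [
    (IDENTITY, [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (0.0, 0.0, -1.0)]),
    (SHIFT_Z, [(0.0, 0.0, 8.0)]),  # lands on the same pixel, but closer
]
depth_map = accumulate_depth(sweeps, fx=100.0, fy=100.0, cx=50.0, cy=50.0,
                             width=100, height=100)
```

Accumulating many sweeps this way densifies the supervision compared with a single LiDAR scan, at the cost of artifacts from moving objects and occlusion.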

Tests

pytest tests/

Tests cover the zero-init parity guarantee (the radar-side projections start at zero, so the block matches vanilla Mamba at initialization) and gradient flow through each pathway. They require CUDA and mamba-ssm and are skipped automatically when either is unavailable.

Repository layout

  • mambadepth_fusion/model/ — RMS block (cross_modal_mamba.py), top-level model (depth_net.py), MVSP blocks (fusion_blocks.py), Radar GSE (radar_gse.py)
  • mambadepth_fusion/data/ — nuScenes and ZJU dataset loaders
  • mambadepth_fusion/train/ — training loop with the composite loss
  • mambadepth_fusion/utils/ — evaluation metrics
  • scripts/ — CLI wrappers for training, evaluation, qualitative visualization, ground-truth construction, and supervision cleaning
  • tests/ — smoke tests for the RMS block
