Map-Uncertainty-Test

Code for "Language-Driven Semantic Change Detection in Urban Maps via Multi-Modal Deep Learning" (Huaze Liu, Zihao Gao, Adyasha Mohanty — Harvey Mudd College, ION GNSS+ 2025). [paper PDF]

(See demo/ for the source clip and how it was generated. GitHub does not
render local <video> sources in Markdown -- after pushing, drag demo/demo.mp4
into a GitHub comment/PR box once to get a stable user-attachments URL, then swap it in here.)

Abstract

High-integrity maps are essential for safe autonomous navigation in dynamic urban environments, where frequent changes and sensor limitations present significant challenges. This work introduces a deep-learning-driven framework for continuous map uncertainty monitoring and semantic change detection that fuses vision and LiDAR features. Zero-shot semantic segmentation via large pre-trained vision-language models provides interpretable, language-driven explanations for detected map inconsistencies, while Kullback-Leibler divergence tracks map consistency over time and enables proactive real-time alerts.

Architecture

                     ┌─────────────────────────┐        ┌──────────────────────────┐
  RGB frame  ───────▶│      VISION MODULE       │        │      LIDAR MODULE        │◀─────── depth + semantic
  (before/after)     │  Grounding DINO v2 + SAM │        │ depth → pseudo-LiDAR →   │         frame (before/after)
                     │  → per-class label mask  │        │ PointNet (enhanced)      │
                     │  → change map C(x,y)     │        │ → class-removal variant  │
                     └────────────┬─────────────┘        └────────────┬─────────────┘
                                  │ D_vis_KL (KL divergence)           │ D_lidar_KL (weighted
                                  │                                    │  normalized KL scatter)
                                  └───────────────┬────────────────────┘
                                                  ▼
                                   S = α·D_vis_KL + β·D_lidar_KL
                                   (α,β heuristic per weather condition)
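
The fusion rule in the diagram is a plain weighted sum, so it can be sketched in a few lines. This is a minimal illustration, not the contents of `map_uncertainty/fusion/score.py`: the names `fused_score` and `WEIGHTS_BY_CONDITION` are hypothetical, and the weight values are placeholders rather than the heuristic values used in the paper.

```python
# Hypothetical per-condition weights (alpha, beta). These are placeholders
# for illustration only, not the heuristic values from the paper.
WEIGHTS_BY_CONDITION = {
    "clear": (0.5, 0.5),
    "fog": (0.3, 0.7),
}


def fused_score(d_vis_kl: float, d_lidar_kl: float, condition: str = "clear") -> float:
    """Return S = alpha * D_vis_KL + beta * D_lidar_KL for one before/after pair."""
    if condition not in WEIGHTS_BY_CONDITION:
        raise ValueError(f"unknown condition: {condition!r}")
    alpha, beta = WEIGHTS_BY_CONDITION[condition]
    return alpha * d_vis_kl + beta * d_lidar_kl


if __name__ == "__main__":
    # Example inputs are made up; real values come from the two branches.
    print(fused_score(0.2, 0.4, condition="clear"))
```

Keeping the weights in a lookup keyed by condition makes the "heuristic per weather condition" choice explicit and easy to audit.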

Paper section	Code
I.1 Vision Module (Grounding DINO v2 + SAM, change map)	`map_uncertainty/vision/`
I.1 baselines: CLIP patch-diff, LoFTR (Sec. III.1.a)	`map_uncertainty/vision/baselines.py`
I.2 LiDAR Module (depth→pseudo-LiDAR, enhanced PointNet)	`map_uncertainty/lidar/`
I.3 Fusion score `S = α·D_vis_KL + β·D_lidar_KL`	`map_uncertainty/fusion/score.py`
II.2 KL divergence, weighted KL scatter, Change Ratio, Jaccard Distance	`map_uncertainty/metrics/`
II.1 Class-removal scene variants (Telea inpainting, point removal)	`map_uncertainty/lidar/variants.py`
Root prototype: CLIPSeg construction-object segmentation	`SemantSeg.ipynb`, `map_uncertainty/vision/clipseg_pipeline.py`
Vendored classical PointNet (Qi et al., 2017) reference impl	`3D/point_net/` (MIT, © Isaac Berrios — see `3D/LICENSE`)
PointNet exploration / S3DIS visualization notebooks	`3D/`
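
To make the metric names in the table concrete, here is a minimal NumPy sketch under common textbook definitions, which may differ in detail from the paper's: KL divergence between the class-frequency histograms of two label masks, Change Ratio as the fraction of pixels whose label differs, and Jaccard Distance as one minus the IoU of a single class. All function names are hypothetical, labels are assumed to be integers in `[0, num_classes)`, and the weighted normalized KL scatter is not reproduced here.

```python
import numpy as np


def class_histogram(mask: np.ndarray, num_classes: int, eps: float = 1e-8) -> np.ndarray:
    """Smoothed class-frequency distribution of an integer label mask."""
    counts = np.bincount(mask.ravel(), minlength=num_classes).astype(np.float64)
    probs = counts + eps  # smoothing so absent classes do not produce log(0)
    return probs / probs.sum()


def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """D_KL(p || q) in nats for two strictly positive distributions."""
    return float(np.sum(p * np.log(p / q)))


def change_ratio(before: np.ndarray, after: np.ndarray) -> float:
    """Fraction of pixels whose label differs between the two masks."""
    return float(np.mean(before != after))


def jaccard_distance(before: np.ndarray, after: np.ndarray, class_id: int) -> float:
    """1 - IoU of the pixels labelled `class_id` in each mask."""
    a = before == class_id
    b = after == class_id
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # class absent from both masks: treat as unchanged
    return float(1.0 - np.logical_and(a, b).sum() / union)


if __name__ == "__main__":
    before = np.zeros((4, 4), dtype=np.int64)
    after = before.copy()
    after[:2, :2] = 1  # a new object of class 1 appears in one corner
    p = class_histogram(before, num_classes=2)
    q = class_histogram(after, num_classes=2)
    print(kl_divergence(p, q), change_ratio(before, after), jaccard_distance(before, after, 1))
```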

Repository layout

map_uncertainty/          importable package -- the paper's pipeline
  vision/                 Grounding DINO+SAM, CLIPSeg, CLIP/LoFTR baselines, change maps
  lidar/                  depth→pointcloud, enhanced PointNet, class-removal variants
  fusion/                 vision+LiDAR fusion score
  metrics/                KL divergence, weighted KL scatter, Change Ratio, Jaccard Distance
  data/                   synthetic Virtual-KITTI-shaped sample generator
scripts/                  CLI entry points (see "Quickstart")
tests/                    pytest unit tests (`pytest tests/`)
3D/                       vendored PointNet reference implementation + S3DIS notebooks
demo/                     source assets + script for the short demo clip above
imgs/, roadwork.jpg, ...  example construction-scene images used by the prototype notebooks
SemantSeg.ipynb, PointNet.ipynb  original exploratory notebooks

Installation

python -m venv .venv && source .venv/bin/activate   # or your conda env of choice
pip install -r requirements.txt

The proposed Grounding DINO + SAM method and the LoFTR baseline need heavier, optional dependencies (and their own checkpoint downloads -- see below):

pip install -r requirements-optional.txt

Quickstart

The full Virtual KITTI 2 dataset (Cabon et al., 2020) used in the paper's experiments is tens of GB and is not vendored here. Instead, map_uncertainty/data/synthetic.py generates small synthetic RGB/depth/semantic frames with the same schema, so every pipeline stage below runs end-to-end without the dataset. To reproduce the paper's actual numbers, point the same functions at real Virtual KITTI 2 frames (see "Reproducing the paper's results").

# 1. Generate a synthetic sample sequence
python scripts/generate_synthetic_sample.py --out data/sample_scene --frames 5

# 2. Vision branch: CLIPSeg backend (no extra checkpoints needed beyond `transformers`)
python scripts/run_vision_pipeline.py --backend clipseg \
    --before without_roadwork.jpg --after roadwork.jpg \
    --labels "cone" "barricade" "construction vehicle" \
    --out outputs/vision_clipseg.png

# CLIP patch-difference baseline (Sec. III.1.a)
python scripts/run_vision_pipeline.py --backend clip \
    --before without_roadwork.jpg --after roadwork.jpg --out outputs/vision_clip.png

# 3. LiDAR branch: depth→pointcloud, remove a class, score the change
python scripts/run_lidar_pipeline.py --variant noTrafficSigns

# 4. Fused vision+LiDAR score
python scripts/run_fusion.py --condition clear

Run pytest tests/ to check the metrics/vision/lidar/fusion modules (23 tests, no GPU or checkpoints required).
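
For readers who want to see what step 3 does conceptually, the following is a minimal sketch of depth-to-point-cloud back-projection followed by class removal. It assumes a pinhole camera model with depth in metres; the function names and the intrinsics in the example are hypothetical and are not taken from `scripts/run_lidar_pipeline.py` or `map_uncertainty/lidar/`.

```python
import numpy as np


def depth_to_points(depth: np.ndarray, fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Back-project an (H, W) depth map to an (H*W, 3) camera-frame point cloud."""
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel, both (H, W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


def remove_class(points: np.ndarray, semantic: np.ndarray, class_id: int) -> np.ndarray:
    """Drop every point whose source pixel is labelled `class_id` in the (H, W) mask."""
    keep = semantic.reshape(-1) != class_id  # same row-major order as the points
    return points[keep]


if __name__ == "__main__":
    depth = np.full((2, 3), 2.0)               # toy depth map, 2 m everywhere
    semantic = np.array([[0, 1, 0], [0, 0, 1]])  # class 1 stands in for the removed class
    cloud = depth_to_points(depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)  # placeholder intrinsics
    print(cloud.shape, remove_class(cloud, semantic, class_id=1).shape)
```

Comparing a classifier's output on the full cloud with its output on the reduced cloud is one way to produce a "before/after" pair without needing two real captures.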

Model weights

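The GroundingDINO and SAM checkpoints are not fetched automatically by this repo -- download them separately if you want to run the proposed Grounding DINO + SAM method for real. The other models in the table are cached automatically on first use by `transformers` or `kornia`: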

Model	Used by	Approx. size	Source
GroundingDINO (Swin-T)	`vision.GroundingDINOSAM`	~660MB	IDEA-Research/GroundingDINO
SAM `vit_b`	`vision.GroundingDINOSAM`	~375MB	facebookresearch/segment-anything (`vit_h` is 2.4GB and unnecessary here)
CLIP ViT-B/32	`vision.CLIPPatchDiff`	~605MB	auto-cached via `transformers` (`openai/clip-vit-base-patch32`)
CLIPSeg (rd64-refined)	`vision.clipseg_pipeline`	~600MB	auto-cached via `transformers` (`CIDAS/clipseg-rd64-refined`)
LoFTR (outdoor)	`vision.LoFTRChangeDetector`	~46MB	auto-cached via `kornia`

Check your available disk space before downloading GroundingDINO/SAM -- together they need roughly 1GB free.
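
The disk-space check can be scripted with the standard library. This is a small sketch, not part of the repo: the helper name `has_room_for_checkpoints` is hypothetical, and the threshold simply adds a little slack to the approximate GroundingDINO and SAM sizes listed in the table.

```python
import shutil

# ~660MB (GroundingDINO) + ~375MB (SAM vit_b), rounded up with some slack.
REQUIRED_BYTES = 1_100_000_000


def has_room_for_checkpoints(path: str = ".", required: int = REQUIRED_BYTES) -> bool:
    """Return True if the filesystem holding `path` has at least `required` free bytes."""
    return shutil.disk_usage(path).free >= required


if __name__ == "__main__":
    if has_room_for_checkpoints("."):
        print("Enough free space for the GroundingDINO and SAM checkpoints.")
    else:
        print("Not enough free space; free up roughly 1GB before downloading.")
```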

Reproducing the paper's results

This repo is the reference implementation of every formula and architectural component described in the paper (change map construction, weighted normalized KL scatter, Change Ratio, Jaccard Distance, fusion score, the enhanced PointNet head). The tables and figures in the paper (TPR, mIoU, KL divergence and Pearson correlation vs. baselines) were produced on the full Virtual KITTI 2 dataset with GPU training of the LiDAR classifier. Reproducing those exact numbers requires that dataset and compute, neither of which this repo bundles. The synthetic-data path above is for verifying that every stage of the pipeline runs correctly end-to-end, not for reproducing the paper's benchmark numbers.

Citation

@inproceedings{liu2025languagedriven,
  title     = {Language-Driven Semantic Change Detection in Urban Maps via Multi-Modal Deep Learning},
  author    = {Liu, Huaze and Gao, Zihao and Mohanty, Adyasha},
  booktitle = {Proceedings of the ION GNSS+ 2025},
  year      = {2025},
  url       = {https://huazeliu.github.io/files/paper/ION_GNSS_2025.pdf}
}

Also cite the underlying methods this framework builds on: PointNet (Qi et al., 2017), Grounding DINO (Liu et al., 2024), Segment Anything (Kirillov et al., 2023), LoFTR (Sun et al., 2021), and Virtual KITTI 2 (Cabon et al., 2020).

License

MIT (see LICENSE). The vendored PointNet implementation in 3D/point_net/ is separately MIT-licensed by Isaac Berrios (3D/LICENSE).
