MVP-Nav: Multi-layer Value Map Planner Navigator is a physical-aware, RGB-only framework for zero-shot Object Goal Navigation (ZSON). MVP-Nav has been accepted to RSS 2026.
MVP-Nav reconstructs explicit physical occupancy from monocular RGB observations, lifts 2D semantic instances into 3D oriented bounding boxes, and plans through a Multi-layer Value Map (MVM) that combines semantic priorities, exploration guidance, and traversability constraints.
The system is organized as a recursive navigation loop:
- Physical Perception: RGB observations are processed by VGGT and Grounded-SAM to build point clouds, oriented bounding boxes, BEV maps, and a Global Spatial Semantic List (GSSL).
- VLM Reasoning: the goal and GSSL are scored by an LLM/VLM module to choose the navigation mode and semantic priorities.
- MVM Planning: semantic, direction, exploration, and safety fields are fused into a shared cost space to select a midterm goal.
- Low-level Execution: A* and FMM generate short-term goals, with semantic floor re-projection used as a safety check.
Core mapping:
| Component | Code |
|---|---|
| Physical Perception | src/core/physical_perception.py, src/map/spacev6.py |
| VLM Reasoning | src/core/vlm_reasoning.py, src/graph/graphv2.py |
| MVM Planning | src/core/mvm_planning.py, src/graph/graphv2.py |
| Low-level Execution | src/core/low_level_execution.py, src/agent/unigoal/agent.py |
| Episode loop | src/core/navigation.py, src/pipeline/navigation_loop.py |
| Main entry | main_nav.py |
.
├── main_nav.py # Main Habitat navigation entry
├── configs/ # Runtime config
├── scripts/download_nav_datasets.py # Episode dataset downloader
├── src/
│ ├── agent/ # Agent control and perception wrapper
│ ├── core/ # Navigation modules
│ ├── data/ # Dataset path and validation helpers
│ ├── envs/ # Habitat and real-world env adapters
│ ├── graph/ # GSSL, VLM reasoning, MVM
│ ├── map/ # VGGT, point cloud, BEV and OBB logic
│ └── pipeline/ # run_episode wrapper
└── docs/assets/ # README figures
Large datasets, checkpoints, logs, and local experiment artifacts are ignored by git.
The environment follows the same dependency family as UniGoal, with additional MVP-Nav modules. Use Python 3.8 for best compatibility with Habitat 0.2.3.
conda create -n mvpnav python=3.8
conda activate mvpnavInstall Habitat:
conda install habitat-sim==0.2.3 -c conda-forge -c aihabitat
pip install -e third_party/habitat-labInstall LightGlue, Detectron2, PyTorch3D, and Grounded-SAM:
pip install git+https://github.com/cvg/LightGlue.git
pip install git+https://github.com/facebookresearch/detectron2.git
pip install git+https://github.com/facebookresearch/pytorch3d.git
git clone https://github.com/IDEA-Research/Grounded-Segment-Anything.git third_party/Grounded-Segment-Anything
cd third_party/Grounded-Segment-Anything
git checkout 5cb813f
pip install -e segment_anything
pip install --no-build-isolation -e GroundingDINO
cd ../../Install common runtime packages:
conda install pytorch::faiss-gpu
pip install hydra-core omegaconf gym numpy-quaternion opencv-python scikit-image matplotlib supervision transformersIf a requirements.txt is provided in your checkout, install it after the Habitat and third-party packages:
pip install -r requirements.txtPlace model files under data/models/:
data/models/
├── groundingdino_swint_ogc.pth
├── sam_vit_h_4b8939.pth
├── model.pt # VGGT checkpoint
└── bert-base-uncased/ # Local BERT tokenizer/model directory
Download the public SAM and GroundingDINO checkpoints:
mkdir -p data/models
wget -O data/models/sam_vit_h_4b8939.pth \
https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget -O data/models/groundingdino_swint_ogc.pth \
https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pthPass a different BERT path with:
python main_nav.py --bert-path path/to/bert-base-uncased ...Option 1: use a local Ollama model:
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2-visionOption 2: use an OpenAI-compatible API. Set llm_model, vlm_model, api_key, and base_url in configs/config_habitat.yaml.
Do not commit private API keys.
By default, episode datasets are expected under:
sharedata/datasets/
You can override this with --data-root.
Download Habitat ObjectNav episode files:
python scripts/download_nav_datasets.py --dataset objectnav_hm3d --data-root sharedata
python scripts/download_nav_datasets.py --dataset objectnav_mp3d --data-root sharedataExpected layout:
sharedata/
└── datasets/
└── objectnav/
├── hm3d/
│ └── v2/
│ ├── train/
│ ├── val/
│ └── val_mini/
└── mp3d/
└── v1/
├── train/
├── val/
└── val_mini/
HM3D and MP3D scene meshes require separate access/licensing and are not downloaded by download_nav_datasets.py.
Place or symlink scenes under:
data/
└── scene_datasets/
├── hm3d_v0.2/
│ └── val/
│ ├── 00800-TEEsavR23oF/
│ ├── ...
│ └── 00899-58NLZxWBSpk/
└── mp3d/
├── 17DRP5sb8fy/
├── ...
└── zsNo4HB9uLZ/
For instance-image-goal and text-goal runs, use the layout inherited from UniGoal:
data/
└── datasets/
├── textnav/
│ └── val/
│ └── val_text.json.gz
└── instance_imagenav/
└── hm3d/
└── v3/
└── val/
├── content/
│ ├── <scene>.json.gz
│ └── ...
└── val.json.gz
Validate syntax and dataset paths:
python -m py_compile main_nav.py src/data/nav_datasets.py scripts/download_nav_datasets.py
python main_nav.py --nav-dataset objectnav_hm3d --split val_mini --num-episodes 1 --smoke-env-only
python main_nav.py --nav-dataset objectnav_mp3d --split val_mini --num-episodes 1 --smoke-env-only--smoke-env-only constructs Habitat and resets one episode without loading the heavy MVP-Nav perception models.
Run ObjectNav on HM3D:
python main_nav.py \
--nav-dataset objectnav_hm3d \
--split val_mini \
--num-episodes 1Run ObjectNav on MP3D:
python main_nav.py \
--nav-dataset objectnav_mp3d \
--split val_mini \
--num-episodes 1Run InstanceImageNav on HM3D:
python main_nav.py \
--nav-dataset instance_imagenav_hm3d \
--split val_mini \
--num-episodes 1Useful flags:
--device-id 0
--data-root sharedata
--scene-data-root data/scene_datasets
--bert-path data/models/bert-base-uncased
--skip-dataset-check
--smoke-env-only
Outputs are written to:
vis_log_nav/<timestamp>/
vis_log_nav_good/<timestamp>/
result_nav.txt
outputs/
These paths are ignored by git.
Core:
main_nav.py: entry point for Habitat evaluation.src/agent/unigoal/agent.py: observation handling, semantic segmentation, and low-level actions.src/graph/graphv2.py: goal parsing, GSSL, semantic scoring, and MVM planning.src/map/spacev6.py: VGGT-based reconstruction, object fusion, OBBs, and BEV map generation.src/core/navigation.py: recursive episode loop.
Environment and utilities:
src/envs/habitat/: Habitat environment wrappers and task configs.src/utils/fmm/: Fast Marching Method utilities.src/utils/visualization/: visualization and video helpers.src/utils/llm.py: LLM/VLM wrappers.src/data/nav_datasets.py: dataset specs, paths, and validation helpers.
- Keep datasets, checkpoints, generated logs, and private credentials out of git.
- The repository assumes Habitat scene assets are prepared locally under
data/scene_datasets/. sharedata/is a convenient local/shared data root; use--data-rootif your layout differs.
