MapGPT

The official implementation of MapGPT. [Paper] [Project]

MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation.

Jiaqi Chen, Bingqian Lin, Ran Xu, Zhenhua Chai, Xiaodan Liang, Kwan-Yee K. Wong.

Annual Meeting of the Association for Computational Linguistics (ACL 2024).

If you have any questions, please contact me by email: jqchen(at)cs.hku.hk

Setup

Install the Matterport3D simulator: follow the instructions here. We use the latest version rather than v0.1.

Install requirements:

conda create -n MapGPT python=3.10
conda activate MapGPT
pip install -r requirements.txt

Prepare data:

Follow DUET to set up the annotations for testing on the val-unseen split.
We sample a subset of 72 scenes and 216 cases for quick, cost-effective testing. Download the corresponding MapGPT_72_scenes_processed.json and place it in the datasets/R2R/annotations directory.
The observation images must be collected from the simulator in advance. You can use your own saved images or the RGB_Observations.zip we have processed.
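
Before running inference, it can help to confirm that the subset annotation file is where the code expects it. The following is a minimal sketch, not part of the repository: the helper name summarize_annotations is hypothetical, and it assumes the file is a JSON list of dicts whose scene ID is stored under the key "scan" (the usual R2R convention).

```python
import json
from collections import Counter
from pathlib import Path


def summarize_annotations(path):
    """Return (number of cases, number of distinct scenes) in an annotation file.

    Assumption: the file is a JSON list of dicts, and each dict stores its
    scene ID under the key "scan". Adjust the key if your file differs.
    """
    with open(path, "r", encoding="utf-8") as f:
        items = json.load(f)
    scenes = Counter(item["scan"] for item in items)
    return len(items), len(scenes)


if __name__ == "__main__":
    # Location described in the "Prepare data" step above.
    ann = Path("datasets/R2R/annotations/MapGPT_72_scenes_processed.json")
    if ann.exists():
        n_cases, n_scenes = summarize_annotations(ann)
        print(f"{n_cases} cases across {n_scenes} scenes")
    else:
        print(f"Annotation file not found: {ann}")
```

If the assumptions about the file layout hold, the counts should match the 216 cases and 72 scenes stated above.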

GPT key: please set your API key here.
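
If you prefer not to write the key into a source file, one common alternative is to read it from an environment variable. This is a sketch under assumptions, not how the repository is wired: the helper load_api_key is hypothetical, and the variable name OPENAI_API_KEY is a convention rather than something this code base is confirmed to read.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Read an API key from the environment and fail early if it is missing.

    Assumption: env_var is a name you choose; the repository itself may
    expect the key to be set elsewhere.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key
```

You would then pass the returned string to wherever the key is configured, after exporting the variable in the shell that launches the script.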

Inference

In addition to the GPT-4v results reported in the paper, we also include an implementation for the newer GPT-4o, which is faster and cheaper.

Run the following script, in which --llm is set to gpt-4o-2024-05-13 and --response_format is set to json.

bash scripts/gpt4o.sh

The two implementations compare as follows on the sampled subset. GPT-4o achieves a better (lower) NE but slightly worse OSR, SR, and SPL.

LLMs	NE	OSR	SR	SPL
GPT-4v	5.62	57.9	47.7	38.1
GPT-4o	5.11	56.9	46.3	37.8
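
For readers unfamiliar with the column names, the sketch below shows how these four metrics are commonly computed in vision-and-language navigation: NE is the mean navigation error in meters, OSR the oracle success rate, SR the success rate, and SPL the success rate weighted by path length. It is an illustration, not the evaluation code in this repository. The function name, the per-episode field names, and the 3 m success threshold (the usual R2R convention) are assumptions.

```python
def navigation_metrics(episodes, success_threshold=3.0):
    """Compute NE (m) and OSR, SR, SPL (%) over a list of episodes.

    Each episode is a dict with hypothetical field names:
      final_dist   - distance from the stopping point to the goal (m)
      min_dist     - smallest distance to the goal along the trajectory (m)
      path_len     - length of the path the agent actually took (m)
      shortest_len - shortest-path length from start to goal (m)
    """
    n = len(episodes)
    ne_sum = osr_sum = sr_sum = spl_sum = 0.0
    for e in episodes:
        success = e["final_dist"] < success_threshold
        oracle_success = e["min_dist"] < success_threshold
        ne_sum += e["final_dist"]
        osr_sum += float(oracle_success)
        sr_sum += float(success)
        # SPL: success weighted by shortest length over the longer of the two paths.
        denom = max(e["path_len"], e["shortest_len"])
        spl_sum += e["shortest_len"] / denom if success and denom > 0 else 0.0
    return {
        "NE": ne_sum / n,
        "OSR": 100.0 * osr_sum / n,
        "SR": 100.0 * sr_sum / n,
        "SPL": 100.0 * spl_sum / n,
    }
```

Lower is better for NE; higher is better for the other three, which is why the two rows above trade off in opposite directions.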

Modify the following part of gpt4o.sh to set the path to your observation images, the split you want to test, and so on.

--root_dir ${DATA_ROOT}
--img_root /path/to/images
--split MapGPT_72_scenes_processed
--end 10  # the number of cases to be tested
--output_dir ${outdir}
--max_action_len 15
--save_pred
--stop_after 3
--llm gpt-4o-2024-05-13
--response_format json
--max_tokens 1000

Citation

@inproceedings{chen2024mapgpt,
  title={MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation},
  author={Chen, Jiaqi and Lin, Bingqian and Xu, Ran and Chai, Zhenhua and Liang, Xiaodan and Wong, Kwan-Yee~K.},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics},
  year={2024}
}
