
Target-Driven Structured Transformer Planner for Vision-Language Navigation

This is the official implementation of our MM'22 (ACM International Conference on Multimedia) paper, accepted as an oral presentation: Target-Driven Structured Transformer Planner for Vision-Language Navigation.

Citation

@inproceedings{zhao2022target,
  title={Target-Driven Structured Transformer Planner for Vision-Language Navigation},
  author={Zhao, Yusheng and Chen, Jinyu and Gao, Chen and Wang, Wenguan and Yang, Lirong and Ren, Haibing and Xia, Huaxia and Liu, Si},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={4194--4203},
  year={2022}
}

Installation

Please refer to the HAMT repository for installation instructions.
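
For orientation, the outline below is a minimal sketch of a typical setup, assuming the code follows HAMT's environment (a conda environment plus the Matterport3D simulator). The environment name, Python version, and the presence of a requirements file are assumptions; defer to the HAMT instructions for the exact steps.

# Sketch only; names and versions below are illustrative, not the repo's exact setup.
conda create -n tdstp python=3.8
conda activate tdstp
pip install -r requirements.txt    # if the repo provides one; otherwise follow HAMT
# The Matterport3D simulator must also be built separately (see the HAMT repo).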

Training & Inference

To train on R2R:

cd finetune_src
bash ./scripts/run_r2r.sh

To train on REVERIE:

cd finetune_src
bash ./scripts/run_reverie.sh

To test on R2R:

cd finetune_src
bash ./scripts/test_r2r.sh

To test on REVERIE:

cd finetune_src
bash ./scripts/test_reverie.sh

Note that

  • Some file paths in the code may need slight adaptation to your local environment; see the sketch after this list for the kind of change involved.
  • To reproduce the RGS and RGSPL results on REVERIE reported in the paper, you need to train a separate ViLBERT on the REVERIE training split and perform the referring step at the end of navigation. Since the referring module is not a contribution of this paper and is easy to tune, we do not plan to release that part of the code.
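
As an illustration of the path adaptation mentioned above, the lines below are a hypothetical excerpt from one of the run scripts. The variable names, entry-point path, and flags are assumptions for illustration only; check the actual script in finetune_src/scripts for the paths and arguments it expects before editing.

# Illustrative excerpt; names and flags are placeholders, not the script's actual contents.
DATA_ROOT=/path/to/your/R2R/data        # annotations and precomputed features
OUTPUT_DIR=/path/to/your/output/dir     # checkpoints and logs
python r2r/main.py --root_dir ${DATA_ROOT} --output_dir ${OUTPUT_DIR}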

Acknowledgement

This code is built on HAMT. We appreciate the authors' great contribution to the community.
