This is the PyTorch code for the arXiv paper *Kefa: A Knowledge Enhanced and Fine-grained Aligned Speaker for Navigation Instruction Generation*. Paper Link.

The code is built on the Matterport3D Simulator; please follow R2R-EnvDrop to set up the environment. Besides the default environment, KEFA needs these additional libraries:
- lmdb
- scipy
- tslearn
Run the following command to install them:

```
pip3 install lmdb scipy tslearn
```
The ResNet image features of the R2R dataset should be placed as follows:

```
${PROJECT_ROOT}/
|-- img_features
|   |-- ResNet-152-imagenet.tsv
```
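As a sanity check on the feature file, the sketch below decodes the base64-packed view features used by R2R-style TSVs (the 36-view, 2048-dim float32 layout is assumed from R2R-EnvDrop; column names may differ). It round-trips synthetic data rather than reading the real, multi-gigabyte TSV:

```python
import base64
import numpy as np

# Assumed R2R feature layout (from R2R-EnvDrop): each viewpoint row
# stores 36 view features of dimension 2048, base64-encoded as a
# contiguous float32 buffer.
VIEWS, FEAT_DIM = 36, 2048

def encode_features(feats: np.ndarray) -> str:
    """Pack a (36, 2048) float32 array the way the TSV field stores it."""
    return base64.b64encode(feats.astype(np.float32).tobytes()).decode("ascii")

def decode_features(field: str) -> np.ndarray:
    """Recover the (36, 2048) float32 array from a base64 TSV field."""
    buf = base64.b64decode(field)
    return np.frombuffer(buf, dtype=np.float32).reshape(VIEWS, FEAT_DIM)

# Round-trip on synthetic data, since the real TSV is large.
fake = np.random.rand(VIEWS, FEAT_DIM).astype(np.float32)
restored = decode_features(encode_features(fake))
assert np.allclose(fake, restored)
```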
The paraphrase data for the METEOR metric can be downloaded from here. Put it into the following directory:

```
${PROJECT_ROOT}/
|-- r2r_src
|   |-- eval_utils
|       |-- meteor
|           |-- data
|               |-- paraphrase-en.gz
```
The processed detection data can be downloaded from here. The feature file should be placed at:

```
${PROJECT_ROOT}/
|-- r2r_src
|   |-- detect_feat_genome_by_view.pkl
```
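A quick way to confirm the pickle is in place and loadable is shown below. The per-view dictionary layout used here is purely illustrative, not the file's documented schema; the sketch round-trips a synthetic stand-in instead of the real file:

```python
import pickle
from pathlib import Path

def load_detect_feats(path: Path):
    """Load a pickled detection-feature file from disk."""
    with path.open("rb") as f:
        return pickle.load(f)

# In the repo this would be: load_detect_feats(
#     Path("r2r_src/detect_feat_genome_by_view.pkl"))
# Here we demonstrate on a small synthetic dict (layout assumed).
fake = {"scan_viewpoint": {"view_0": ["chair", "table"]}}
tmp = Path("detect_feat_demo.pkl")
tmp.write_bytes(pickle.dumps(fake))
feats = load_detect_feats(tmp)
tmp.unlink()
assert feats == fake
```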
Run the following command to start training:

```
bash ./run/speaker_kefa.bash [GPU_id]
```
Our code is based on the following repository. We thank the authors for releasing their code.