Code and Data for Paper "Improving Cross-Modal Alignment in Vision Language Navigation via Syntactic Information" (NAACL 2021)

Environment Installation

Download the Room-to-Room (R2R) navigation data:

bash ./tasks/R2R/data/download.sh

Download the Room-Across-Room (RxR) navigation data and save it under /tasks:

gsutil -m cp -R gs://rxr-data .

Download image features for environments:

mkdir img_features
wget https://www.dropbox.com/s/o57kxh2mn5rkx4o/ResNet-152-imagenet.zip -P img_features/
cd img_features
unzip ResNet-152-imagenet.zip
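As an optional sanity check before unzipping, the archive's contents can be listed with Python's standard zipfile module. This is a minimal sketch and not part of the repository: the helper name list_archive is hypothetical, and the path in the usage comment assumes the archive was saved to img_features/ by the wget command above.

```python
import zipfile


def list_archive(zip_path):
    """Return (member_name, size_in_bytes) for every entry in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Example usage (assumed path, matching the wget command above):
# for name, size in list_archive("img_features/ResNet-152-imagenet.zip"):
#     print(name, size)
```

A zero-byte or missing member usually points to an interrupted download, in which case re-running wget is the simplest fix.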

Install the Python requirements:

pip install -r python_requirements.txt

Install the Matterport3D simulator:

git submodule update --init --recursive
sudo apt-get install libjsoncpp-dev libepoxy-dev libglm-dev libosmesa6 libosmesa6-dev libglew-dev
mkdir build && cd build
cmake -DOSMESA_RENDERING=ON ..
make -j8

Code

Parsing

python r2r_src/parsing.py
python r2r_src/parsing_hite.py

parsing.py parses all English instructions in R2R and RxR. parsing_hite.py parses all Hindi and Telugu instructions in RxR.

Agent

bash run/agent_r2r.bash 0
bash run/agent_rxr.bash 0

The argument 0 is the GPU id. Each script trains the agent and saves snapshots under snap/agent/.
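A common way for a launcher to pin a training process to one GPU is the CUDA_VISIBLE_DEVICES environment variable. Whether the run/*.bash scripts use exactly this mechanism is an assumption; the sketch below only illustrates the idea, and select_gpu is a hypothetical helper, not a function from this repository.

```python
import os


def select_gpu(gpu_id):
    """Restrict CUDA to a single device via CUDA_VISIBLE_DEVICES.

    This must run before any CUDA library initializes the driver,
    otherwise the setting has no effect on the current process.
    """
    gpu_index = int(gpu_id)
    if gpu_index < 0:
        raise ValueError("GPU id must be non-negative")
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Example usage: make only GPU 0 visible to this process.
# select_gpu("0")
```

With this setting, the chosen physical GPU appears to the process as device 0, so code that refers to the default CUDA device needs no further changes.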

agent_r2r.bash runs the agent on the R2R dataset, and agent_rxr.bash runs the agent on the RxR dataset.

When training and testing on the RxR dataset, use the --language parameter to pick a single language (en, hi, or te).
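To illustrate how a flag like --language can be restricted to the three RxR languages, here is a minimal argparse sketch. It is not the repository's actual argument parser: build_parser and RXR_LANGUAGES are hypothetical names, and the default of en is an assumption made only for this example.

```python
import argparse

# The three language codes named above: English, Hindi, Telugu.
RXR_LANGUAGES = ("en", "hi", "te")


def build_parser():
    """Build a parser that accepts only the three RxR language codes."""
    parser = argparse.ArgumentParser(description="Illustrative RxR language option")
    parser.add_argument(
        "--language",
        choices=RXR_LANGUAGES,
        default="en",  # assumed default, for illustration only
        help="single RxR language to train and test on",
    )
    return parser


args = build_parser().parse_args(["--language", "hi"])
print(args.language)  # prints "hi"
```

Using choices makes argparse reject any other value with a usage error, so a typo in the language code fails before training starts.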

Baseline

bash run/agent_baseline_r2r.bash 0
bash run/agent_baseline_rxr.bash 0

Run these scripts to replicate the baseline.

Similarly, when training and testing on the RxR dataset, choose the language by setting the --language parameter to en, hi, or te.
