PyTorch code for BMVC 2019 paper: Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters

Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters

This is the PyTorch implementation for our paper:

Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters
Federico Landi, Lorenzo Baraldi, Massimiliano Corsini, Rita Cucchiara
British Machine Vision Conference (BMVC), 2019
Oral Presentation

Visit the main website for more details.

Reference

If you use our code for your research, please cite our paper (BMVC 2019 oral):

Bibtex:

@inproceedings{landi2019embodied,
      title={Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters},
      author={Landi, Federico and Baraldi, Lorenzo and Corsini, Massimiliano and Cucchiara, Rita},
      booktitle={Proceedings of the British Machine Vision Conference},
      year={2019}
    }

Installation

Clone Repo

Clone the repository:

# Make sure to clone with --recursive
git clone --recursive https://github.com/fdlandi/DynamicConv-agent.git
cd DynamicConv-agent

If you didn't clone with the --recursive flag, you'll need to initialize the submodules (pybind11 and speaksee) manually from the top-level directory:

git submodule update --init --recursive
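
A quick way to confirm that the submodules were actually fetched is to check that their directories are not empty. The sketch below is only an illustration: the helper name `missing_submodules` is hypothetical, and it assumes the submodules live in the top-level `pybind11` and `speaksee` directories, as in the repository listing.

```python
import os

# Submodule directory names as they appear at the top level of the repository.
SUBMODULES = ("pybind11", "speaksee")


def missing_submodules(repo_root, names=SUBMODULES):
    """Return the submodule directories that are absent or empty."""
    missing = []
    for name in names:
        path = os.path.join(repo_root, name)
        # An uninitialized submodule is usually an existing but empty directory.
        if not os.path.isdir(path) or not os.listdir(path):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Run from the top-level DynamicConv-agent directory.
    for name in missing_submodules("."):
        print(f"{name} looks empty; run: git submodule update --init --recursive")
```

If the script prints nothing, both directories contain files and the build should be able to find them.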

Python setup

Python 3.6 is required to run our code. You can install the other modules via:

cd speaksee
pip install -e .
cd ..
pip install -r requirements.txt

Building with Docker

Please follow the instructions in the Matterport3DSimulator repository to install the simulator via Docker.

Building without Docker

The simulator can be built outside of a Docker container using the cmake build commands described below. However, this is not the recommended approach, as all dependencies will need to be installed locally and may conflict with existing libraries. The main requirements are:

  • Ubuntu >= 14.04
  • Nvidia-driver with CUDA installed
  • C++ compiler with C++11 support
  • CMake >= 3.10
  • OpenCV >= 2.4 including 3.x
  • OpenGL
  • GLM
  • Numpy

Optional dependencies (depending on the cmake rendering options):

  • OSMesa for OSMesa backend support
  • epoxy for EGL backend support

Build and Test

Build the simulator and run the unit tests:

cd DynamicConv-agent
mkdir build && cd build
cmake -DEGL_RENDERING=ON ..
make
cd ../
./build/tests ~Timing

If you use a conda environment for your experiments, you should specify the python path in the cmake options:

cmake -DEGL_RENDERING=ON -DPYTHON_EXECUTABLE:FILEPATH='path_to_your_python_bin' ..
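
If you are unsure which path to pass, one option is to ask the interpreter of the active environment for its own location. The following is a minimal sketch, not part of the repository: the helper name `cmake_configure_command` is hypothetical, and it only assembles the two cmake options shown above.

```python
import sys


def cmake_configure_command(python_bin=None, egl_rendering=True, source_dir=".."):
    """Build the cmake configure command as an argument list."""
    if python_bin is None:
        # Interpreter of the currently active (e.g. conda) environment.
        python_bin = sys.executable
    rendering = "ON" if egl_rendering else "OFF"
    return [
        "cmake",
        f"-DEGL_RENDERING={rendering}",
        f"-DPYTHON_EXECUTABLE:FILEPATH={python_bin}",
        source_dir,
    ]


if __name__ == "__main__":
    # Print the command so it can be pasted into a shell inside build/.
    print(" ".join(cmake_configure_command()))
```

Run it with the environment activated; if the printed path contains spaces, quote it before pasting it into the shell.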

Precomputing ResNet Image Features

To skip feature generation, download our precomputed tsv files and extract them into the img_features directory.
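
To sanity-check a tsv file after extracting it, something like the sketch below may help. It rests on an assumption that this README does not state: that the files use the Matterport3DSimulator feature layout, with six tab-separated columns and base64-encoded float32 values in the last one. The function name `load_features` and the key format are illustrative; verify the column names against your files before relying on this.

```python
import array
import base64
import csv

# ASSUMPTION: column layout of the Matterport3DSimulator feature files.
# Check it against the files you actually downloaded.
TSV_FIELDNAMES = ["scanId", "viewpointId", "image_w", "image_h", "vfov", "features"]


def load_features(tsv_path):
    """Map 'scanId_viewpointId' to a flat array of float32 feature values."""
    # Feature fields are far larger than the csv module's default field limit.
    csv.field_size_limit(2**31 - 1)
    features = {}
    with open(tsv_path, "rt", newline="") as tsv_file:
        reader = csv.DictReader(tsv_file, delimiter="\t", fieldnames=TSV_FIELDNAMES)
        for row in reader:
            values = array.array("f")  # 4-byte floats in native byte order
            values.frombytes(base64.b64decode(row["features"]))
            features[f"{row['scanId']}_{row['viewpointId']}"] = values
    return features
```

The sketch uses only the standard library and returns flat arrays; reshaping them into per-view feature matrices depends on how the features were extracted.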

Training and Testing

You can train our agent by running:

python tasks/R2R/main.py

The number of dynamic filters can be set with the --num_heads parameter:

python tasks/R2R/main.py --num_heads=4
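
To compare several filter counts, the command above can be wrapped in a small launcher. This is a hedged sketch rather than a script shipped with the repository: `train_command` and `run_sweep` are hypothetical helpers, the listed values are illustrative, and the check that `num_heads` is positive is an assumption about what the training script accepts.

```python
import subprocess


def train_command(num_heads, python_bin="python"):
    """Build the training command for a given number of dynamic filters."""
    # ASSUMPTION: the training script expects a positive integer here.
    if num_heads < 1:
        raise ValueError("num_heads must be a positive integer")
    return [python_bin, "tasks/R2R/main.py", f"--num_heads={num_heads}"]


def run_sweep(head_counts, dry_run=True):
    """Run one training job per value, or only print the commands if dry_run."""
    commands = [train_command(n) for n in head_counts]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Jobs run one after another; a failing job stops the sweep.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_sweep([1, 2, 4, 8])  # illustrative values, printed but not executed
```

Pass `dry_run=False` from the top-level directory to actually launch the jobs.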

Reproducibility Note

Results in our paper were obtained with version v0.1 of the Matterport3DSimulator. If you build a different version of the simulator, your results may differ from those reported in the paper. Using different GPUs for training, as well as different random seeds, may also affect results.

We provide the weights obtained with our training. To reproduce results from the paper, run:

python tasks/R2R/main.py --name=normal_data --num_heads=4 --eval_only

or:

python tasks/R2R/main.py --name=data_augmentation --num_heads=4 --eval_only

License

The Matterport3D dataset, and data derived from it, is released under the Matterport3D Terms of Use. Our code is released under the MIT license.
