
Pushing it out of the Way: Interactive Visual Navigation

By Kuo-Hao Zeng, Luca Weihs, Ali Farhadi, and Roozbeh Mottaghi

Paper | Video | BibTeX

In this paper, we study the problem of interactive navigation where agents learn to change the environment to navigate more efficiently to their goals. To this end, we introduce the Neural Interaction Engine (NIE) to explicitly predict the change in the environment caused by the agent's actions. By modeling the changes while planning, we find that agents exhibit significant improvements in their navigational capabilities.
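To make the idea concrete, here is a toy sketch of the NIE concept, NOT the paper's learned architecture: predict how a candidate action would change an object's state, then let the planner score actions by their predicted outcomes. All names, dynamics, and thresholds below are hypothetical placeholders.

```python
import numpy as np

ACTIONS = ("MoveAhead", "PushAhead")

def predict_object_state(obj_pos, agent_pos, action):
    """Stand-in for the learned transition model: a push displaces a
    nearby object along the agent's heading; other actions leave it put."""
    if action == "PushAhead" and np.linalg.norm(obj_pos - agent_pos) < 1.0:
        return obj_pos + np.array([0.5, 0.0])  # hypothetical push distance
    return obj_pos

def plan_one_step(agent_pos, obj_pos, goal_pos):
    """Choose the action whose predicted object position best clears the
    straight-line path from agent to goal (midpoint used as a crude proxy)."""
    midpoint = (agent_pos + goal_pos) / 2.0

    def clearance(action):
        predicted = predict_object_state(obj_pos, agent_pos, action)
        return np.linalg.norm(predicted - midpoint)

    return max(ACTIONS, key=clearance)
```

In this toy setup, an object sitting on the path makes the planner prefer the push (its predicted outcome moves the obstacle aside), while a distant object leaves the prediction unchanged, so plain navigation wins by default.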


  1. Requirements

We implement this codebase on Ubuntu 18.04.3 LTS and have also tried it on Ubuntu 16.

    In addition, this codebase needs to be executed on GPU(s).

  2. Clone this repository

    git clone
  3. Install xorg if the machine does not have it

    Note: This codebase must be executed on a GPU, so we need an X server for GPU rendering.

    # Need sudo permission to install xserver
    sudo apt-get install xorg

    Then, reconfigure the X server for the GPU:

    sudo python
  4. Using conda, create an environment

    Note: The Python version needs to be 3.6 or above, since Python 2.x may have issues with some required packages.

    conda env create --file environment.yml --name thor-ivn


We consider two downstream tasks in the physics-enabled, visually rich AI2-THOR environment:

(1) reaching a target while the path to the target is blocked

(2) moving an object to a target location by pushing it.

We use Kitchens, Living Rooms, Bedrooms, and Bathrooms for our experiments (120 scenes in total).
To collect the datasets, we use 20 categories of objects, including
alarm clock, apple, armchair, box, bread, chair, desk, dining table, dog bed, laptop,
garbage can, lettuce, microwave, pillow, pot, side table, sofa, stool, television, tomato.

Overall, we collect 10K training instances, 2.5K validation instances, and 2.5K testing instances for each task.

The data for ObjPlace and ObsNav are available in the dataset folder.

For more information about how to control an agent to interact with the environment in AI2-iTHOR, please visit this webpage.
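As a hedged sketch of what such interaction looks like from Python: the action names below (MoveAhead, DirectionalPush) exist in AI2-THOR, but exact parameters can differ by version, and the scene choice and push magnitude here are assumptions, not this project's evaluation code.

```python
def push_action(object_id, magnitude=0.5, angle=0.0):
    """Build the payload for a directional push on a visible object.
    Parameter names follow the AI2-THOR step API; values are examples."""
    return {
        "action": "DirectionalPush",
        "objectId": object_id,
        "moveMagnitude": magnitude,
        "pushAngle": angle,
    }

if __name__ == "__main__":
    # Requires the ai2thor package and a working X server (see setup above).
    from ai2thor.controller import Controller

    controller = Controller(scene="FloorPlan1")  # example scene
    event = controller.step(action="MoveAhead")
    # Push the first moveable object the agent can currently see.
    visible = [o for o in event.metadata["objects"]
               if o["visible"] and o.get("moveable")]
    if visible:
        controller.step(**push_action(visible[0]["objectId"]))
    controller.stop()
```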

Pretrained Models

We currently provide the following pretrained models:

ObsNav Model
  PPO: Link
  NIE: Link

ObjPlace Model
  PPO: Link
  NIE: Link

MaskRCNN: Link

These models can be downloaded from the above links and should be placed in the pretrained_model_ckpts directory. You can then, for example, run inference for the NIE model on ObsNav using AllenAct by running:

export CURRENT_TIME=$(date '+%Y-%m-%d_%H-%M-%S') # This is just to record when you ran this inference
allenact -s 23456 configs/ithor_ObsNav/ -c pretrained_model_ckpts/{THE_DOWNLOADED_MODEL}.pt -t $CURRENT_TIME

Note: Make sure to download the pretrained MaskRCNN to pretrained_model_ckpts/maskRCNN/model.pth. In addition, you have to turn on using_mask_rcnn in configs/ithor_ObsNav/ to use the pretrained MaskRCNN model during the evaluation stage when testing the NIE model.

Train a new model from scratch with AllenAct

We use the AllenAct framework for training the baseline models and our NIE models; this framework is installed automatically along with the requirements for this project.

Before running training or inference, you'll first have to add the Interactive_Visual_Navigation directory to your PYTHONPATH (so that Python and AllenAct know where to look for various modules). To do this, you can run the following:

cd YOUR/PATH/TO/Interactive_Visual_Navigation
export PYTHONPATH=$PYTHONPATH:$PWD

Let's say you want to train an NIE model on the ObjPlace task. This can be done by running the command:

allenact -s 23456 -o out -b . configs/ithor_ObjPlace/


If you find this project useful in your research, please consider citing our paper:

@inproceedings{zeng2021pushing,
  author = {Zeng, Kuo-Hao and Farhadi, Ali and Weihs, Luca and Mottaghi, Roozbeh},
  title = {Pushing it out of the Way: Interactive Visual Navigation},
  booktitle = {CVPR},
  year = {2021}
}

