semantic_nav

Visual Representations for Semantic Target Driven Navigation

Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, James Davidson

arXiv 2018

[Qualitative navigation examples for targets: Fridge, Television, Microwave, Couch]

ArXiv: https://arxiv.org/abs/1805.06066

1. Installation

Requirements

Python Packages

networkx
gin-config
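
If these packages are not already installed, they can be obtained with pip, for example:

# Install the required Python packages
pip install networkx gin-config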

Download semantic_nav

git clone --depth 1 https://github.com/tensorflow/models.git
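
The code for this project is distributed inside the tensorflow/models repository. Assuming it lives under research/cognitive_planning (the exact directory may differ in your checkout), the commands below would be run from that directory:

# Enter the project directory inside the cloned repository
# (directory name is an assumption; adjust to wherever the semantic_nav code lives)
cd models/research/cognitive_planning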

2. Datasets

Download ActiveVisionDataset

We used the Active Vision Dataset (AVD), which can be downloaded from here. To make our code faster and reduce its memory footprint, we created the AVD Minimal dataset. AVD Minimal consists of low-resolution images from the original AVD dataset. In addition, we added annotations for target views, predicted object detections from an object detector pre-trained on the MS-COCO dataset, and predicted semantic segmentations from a model pre-trained on the NYU-v2 dataset. AVD Minimal can be downloaded from here. Set $AVD_DIR to the path of the downloaded AVD Minimal.
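
For example, assuming the archive was extracted to /data/AVD_Minimal (an illustrative path):

# Point $AVD_DIR at the extracted AVD Minimal directory (path is illustrative)
export AVD_DIR=/data/AVD_Minimal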

ActiveVisionDataset Demo

If you wish to navigate the environment and see what AVD looks like, you can use the following command:

python viz_active_vision_dataset_main.py \
  --mode=human \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params='ActiveVisionDatasetEnv.dataset_root=$AVD_DIR'

3. Training

Currently, the released version only supports training and inference using real data from the Active Vision Dataset.
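
Both training and evaluation write to a checkpoint directory referenced as $CHECKPOINT_DIR below; for example (illustrative path):

# Directory where checkpoints and evaluation results are stored (path is illustrative)
export CHECKPOINT_DIR=/tmp/semantic_nav_checkpoints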

Run training

Use the following command for training:

# Train
python train_supervised_active_vision.py \
  --mode='train' \
  --logdir=$CHECKPOINT_DIR \
  --modality_types='det' \
  --batch_size=8 \
  --train_iters=200000 \
  --lstm_cell_size=2048 \
  --policy_fc_size=2048 \
  --sequence_length=20 \
  --max_eval_episode_length=100 \
  --test_iters=194 \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params='ActiveVisionDatasetEnv.dataset_root=$AVD_DIR' \
  --logtostderr

Run Evaluation

Use the following command to unroll the policy on the eval environments. The inference code periodically checks the checkpoint folder for new checkpoints and uses each one to unroll the policy on the eval environments. After each evaluation, it creates a folder $CHECKPOINT_DIR/evals/$ITER, where $ITER is the iteration number at which the checkpoint was stored.

# Eval
python train_supervised_active_vision.py \
  --mode='eval' \
  --logdir=$CHECKPOINT_DIR \
  --modality_types='det' \
  --batch_size=8 \
  --train_iters=200000 \
  --lstm_cell_size=2048 \
  --policy_fc_size=2048 \
  --sequence_length=20 \
  --max_eval_episode_length=100 \
  --test_iters=194 \
  --gin_config=envs/configs/active_vision_config.gin \
  --gin_params='ActiveVisionDatasetEnv.dataset_root=$AVD_DIR' \
  --logtostderr

At any point, you can run the following command to compute statistics, such as success rate, over all the evaluations so far. It also generates GIF images of rollouts from the best policy.
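
The command writes the GIFs to $OUTPUT_GIFS_FOLDER; for example (illustrative path):

# Directory where the rollout GIFs are written (path is illustrative)
export OUTPUT_GIFS_FOLDER=/tmp/semantic_nav_gifs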

# Visualize and Compute Stats
python viz_active_vision_dataset_main.py \
   --mode=eval \
   --eval_folder=$CHECKPOINT_DIR/evals/ \
   --output_folder=$OUTPUT_GIFS_FOLDER \
   --gin_config=envs/configs/active_vision_config.gin \
   --gin_params='ActiveVisionDatasetEnv.dataset_root=$AVD_DIR'

Reference

If you find our work useful in your research, please consider citing our paper:

@inproceedings{MousavianArxiv18,
  author = {A. Mousavian and A. Toshev and M. Fiser and J. Kosecka and J. Davidson},
  title = {Visual Representations for Semantic Target Driven Navigation},
  booktitle = {arXiv:1805.06066},
  year = {2018},
}
