Generative Multisensory Network

PyTorch implementation of the Generative Multisensory Network (GMN) from our paper:

Jae Hyun Lim, Pedro O. Pinheiro, Negar Rostamzadeh, Christopher Pal, Sungjin Ahn, Neural Multisensory Scene Inference (2019)

Introduction

Please check out our project website!

Getting Started

Requirements

python>=3.6
pytorch==0.4.x
tensorflow (for tensorboardX)
tensorboardX
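
A minimal environment setup might look like the following. This is only a sketch under the assumption that a plain pip install is acceptable; pick the torch 0.4.x build that matches your platform and CUDA version.

    # indicative install commands for the requirements above (versions are assumptions)
    pip install torch==0.4.1
    pip install tensorflow        # backend used by tensorboardX
    pip install tensorboardX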

Dataset

The experiments use data from MESE (the Multisensory Embodied 3D-Scene Environment).
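
The training scripts select data by dataset name (e.g. haptix-shepard_metzler_5_parts), and the repository reserves a data folder for it. The layout below is only an assumption about where the downloaded MESE files are expected to live; adjust it to match the actual archive contents.

    data/
    └── haptix-shepard_metzler_5_parts/    # hypothetical location matching the --dataset name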

Structure

  • data: data folder
  • datasets: dataloader definitions
  • models: model definitions
  • utils: miscellaneous functions
  • cache: temporary files
  • eval: Python code for evaluation / visualization
  • scripts: scripts for experiments
    ├── eval: evaluation / visualization scripts
    ├── train: training scripts
    └── train_missing_modalities: scripts for training with missing modalities
        ├── m5
        ├── m8
        └── m14
  • main_multimodal.py: main function to train model

Experiments

Train

  • For example, you can train an APoE model on vision and haptic data (# of modalities = 2) as follows:
    python main_multimodal.py \
        --dataset haptix-shepard_metzler_5_parts \
        --model conv-apoe-multimodal-cgqn-v4 \
        --train-batch-size 12 --eval-batch-size 4 \
        --lr 0.0001 \
        --clip 0.25 \
        --add-opposite \
        --epochs 10 \
        --log-interval 100 \
        --exp-num 1 \
        --cache experiments/haptix-m2
    For more information, please see the example scripts in the scripts/train folder.
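
Training progress is logged via tensorboardX. Assuming the event files are written under the directory passed to --cache (an assumption; check where your run actually writes its logs), they can be monitored with TensorBoard:

    # point TensorBoard at the experiment directory used in the command above
    tensorboard --logdir experiments/haptix-m2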

Classification (using learned model)

  • An example script to run classification with a learned model on held-out data can be written as follows.
    The example below runs 10-way classification on the additional Shepard-Metzler objects with 4 or 6 parts.
    python eval/clsinf_multimodal_m2.py \
        --dataset haptix-shepard_metzler_46_parts \
        --model conv-apoe-multimodal-cgqn-v4 \
        --train-batch-size 10 --eval-batch-size 10 \
        --vis-interval 1 \
        --num-z-samples 50 \
        --mod-step 1 \
        --mask-step 1 \
        --cache clsinf.m2.s50/rgb/46_parts \
        --path <path-to-your-model>
    For more information, please see the example scripts in the scripts/eval folder.

Train with missing modalities

  • If you would like to train an APoE model with missing modalities (the example below corresponds to the m14 setting), run the following script:
    python main_multimodal.py \
        --dataset haptix-shepard_metzler_5_parts-48-ul-lr-rgb-half-intrapol1114 \
        --model conv-apoe-multimodal-cgqn-v4 \
        --train-batch-size 9 --eval-batch-size 4 \
        --lr 0.0001 \
        --clip 0.25 \
        --add-opposite \
        --epochs 10 \
        --log-interval 100 \
        --cache experiments/haptix-m14-intrapol1114
    For more information, please see the example scripts in the scripts/train_missing_modalities folder.

Contact

For questions and comments, feel free to contact Jae Hyun Lim.

License

MIT License

Reference

@article{jaehyun2019gmn,
  title     = {Neural Multisensory Scene Inference},
  author    = {Jae Hyun Lim and
               Pedro O. Pinheiro and
               Negar Rostamzadeh and
               Christopher J. Pal and
               Sungjin Ahn},
  journal   = {arXiv preprint arXiv:1910.02344},
  year      = {2019},
}