[ECCV 2022] Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation
Updated Jul 18, 2022 - C++
Code for 'Chasing Ghosts: Instruction Following as Bayesian State Tracking', published at NeurIPS 2019
Contrastive-VisionVAE-Follower is a model for the multimodal task of Vision-and-Language Navigation (VLN).