Cascaded Mutual Modulation for Visual Reasoning
Source code for: Yiqun Yao, Jiaming Xu, and Bo Xu. Cascaded Mutual Modulation for Visual Reasoning. EMNLP 2018. arXiv
This code is a fork of the code for "FiLM: Visual Reasoning with a General Conditioning Layer", available here.
We implement a new model, CMM, on top of PG+EE and FiLM. Unlike the original code, our model runs on multiple GPUs.
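For readers unfamiliar with FiLM, the operation that CMM builds on is a per-channel scale and shift of a feature map. The snippet below is a minimal sketch of that operation in plain Python, not code from this repository: the function name `film`, the nested-list layout, and the toy values are assumptions made for illustration. In a trained model, `gamma` and `beta` would be predicted by a conditioning network rather than fixed by hand.

```python
# Illustrative FiLM-style modulation (hypothetical helper, not part of this repo).
def film(features, gamma, beta):
    """Scale and shift each feature channel: gamma[c] * x + beta[c]."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("one gamma and one beta are required per channel")
    return [[g * x + b for x in channel]
            for channel, g, b in zip(features, gamma, beta)]


# Two channels with three spatial positions each (toy values).
features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
gamma = [2.0, 0.5]   # per-channel scale
beta = [1.0, -1.0]   # per-channel shift

modulated = film(features, gamma, beta)
print(modulated)  # [[3.0, 5.0, 7.0], [1.0, 1.5, 2.0]]
```

The same scale-and-shift can be applied in either direction, with one modality supplying `gamma` and `beta` for the other's features.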
Setup and Training
Setup instructions for the CMM model are nearly the same as for PG+EE and FiLM.
First, follow the virtual environment setup instructions.
Second, follow the CLEVR data preprocessing instructions.
The script below has the hyperparameters and settings to reproduce the CMM CLEVR results:
The following scripts run trained models on CLEVR: