This repository hosts the official code for the paper "Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning" (ICLR 2024).
Drawing on neurological models, the ITHP model employs the information bottleneck method to form compact and informative latent states, forging connections across modalities. Its hierarchical architecture incrementally distills information, offering a novel multimodal learning approach.
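For readers unfamiliar with the information bottleneck, the standard objective (a textbook formulation; the paper's own notation and hierarchical extension may differ) trades off compressing the input $X$ into a latent state $B$ against preserving information about a target $Y$:

```latex
\min_{p(b \mid x)} \; I(X; B) - \beta \, I(B; Y)
```

Intuitively, each level of the hierarchy keeps its latent state small (low $I(X;B)$) while retaining what the next modality needs (high $I(B;Y)$), with $\beta$ controlling the trade-off.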
- Clone the repository and install dependencies:

  ```bash
  git clone https://github.com/joshuaxiao98/ITHP.git
  pip install -r requirements.txt
  ```
- Download the datasets to `./datasets` by running `download_datasets.sh`. For details, see here.
- Train the model on the `MOSI` or `MOSEI` dataset using the `--dataset` flag:

  ```bash
  python train.py --dataset mosi   # For MOSI (default)
  python train.py --dataset mosei  # For MOSEI
  ```
- Customize `train.py` to modify variables, loss functions, or outputs.
- Reduce `max_seq_length` from the default `50` for memory efficiency.
- Adjust `train_batch_size` to fit memory constraints.
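The effect of lowering `max_seq_length` can be sketched as follows. Note that `truncate_or_pad` is a hypothetical helper written for illustration, not a function from this repository; in `train.py` the behavior is controlled by the `max_seq_length` setting itself.

```python
def truncate_or_pad(seq, max_seq_length=50, pad=0):
    """Clip a token/feature sequence to max_seq_length, padding if shorter.

    Illustrative only: shorter sequences mean smaller tensors per batch,
    which is why reducing max_seq_length saves memory.
    """
    seq = list(seq)[:max_seq_length]
    return seq + [pad] * (max_seq_length - len(seq))

# A sequence of length 5 padded up to length 8:
print(truncate_or_pad(range(5), max_seq_length=8))
# A sequence of length 10 truncated down to length 4:
print(truncate_or_pad(range(10), max_seq_length=4))
```

Halving `max_seq_length` roughly halves the per-batch activation memory for the sequence inputs, so it is usually the first knob to turn before shrinking `train_batch_size`.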
Please cite the following paper if this model assists your research:
```bibtex
@inproceedings{xiao2024neuroinspired,
  title={Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning},
  author={Xiongye Xiao and Gengshuo Liu and Gaurav Gupta and Defu Cao and Shixuan Li and Yaxing Li and Tianqing Fang and Mingxi Cheng and Paul Bogdan},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=Z9AZsU1Tju}
}
```