ICML2024-ReconBoost: Boosting Can Achieve Modality Reconcilement

ReconBoost

This is the official code for the paper "ReconBoost: Boosting Can Achieve Modality Reconcilement", accepted at the International Conference on Machine Learning (ICML 2024). The paper is available here.

Paper Title: ReconBoost: Boosting Can Achieve Modality Reconcilement.

Authors: Cong Hua, Qianqian Xu*, Shilong Bao, Zhiyong Yang, Qingming Huang*

[Figure: ReconBoost pipeline]

Installation

Clone this repository

git clone https://github.com/huacong/ReconBoost.git

Install the required libraries

pip install -r requirements.txt

Dataset

Six benchmarks are used for evaluation in our paper. CREMA-D, AVE, and ModelNet40 (front-rear views) are two-modality datasets; MOSI, MOSEI, and SIMS are tri-modality datasets. The statistics of all datasets used in the experiments are given in the table below.

[Table image: statistics of the datasets]

Training

For training, we provide the hyper-parameter settings, run commands, and checkpoints for each dataset. To stabilize training, we load the pre-trained uni-modal models by specifying the --use_pretrain flag.

Use TensorBoard to monitor the training process.

tensorboard --logdir ./ --port 6007 --bind_all

The trained models are available here.

CREMA-D dataset

python train.py \
--dataset CREMAD \
--dataset_path /data/huacong/CREMA/data \
--n_class 6 \
--batch_size 64 \
--boost_rate 1.0 \
--n_worker 8 \
--epochs_per_stage 4 \
--correct_epoch 4 \
--use_lr True \
--m_lr 0.01 \
--e_lr 0.01 \
--weight1 5.0 \
--weight2 1.0 \
--use_pretrain \
--m1ckpt modality1_ckpt_pth \
--m2ckpt modality2_ckpt_pth
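
When the same command is launched repeatedly with different settings, it can help to assemble the argument list in Python instead of editing a long shell line. The following is a minimal sketch under that assumption: `build_train_command` is a hypothetical helper that is not part of this repository, and it only reuses the flag names and values shown in the CREMA-D command above. The checkpoint paths remain placeholders.

```python
import shlex


def build_train_command(hparams, script="train.py"):
    """Turn a dict of hyper-parameters into an argv list (hypothetical helper).

    True becomes a bare switch, False/None drops the flag,
    and any other value is emitted as "--name value".
    """
    argv = ["python", script]
    for name, value in hparams.items():
        if value is True:
            argv.append(f"--{name}")
        elif value is False or value is None:
            continue
        else:
            argv.extend([f"--{name}", str(value)])
    return argv


# Values copied from the CREMA-D command; checkpoint paths are placeholders.
cremad = {
    "dataset": "CREMAD",
    "dataset_path": "/data/huacong/CREMA/data",
    "n_class": 6,
    "batch_size": 64,
    "boost_rate": 1.0,
    "n_worker": 8,
    "epochs_per_stage": 4,
    "correct_epoch": 4,
    "use_lr": "True",      # passed as literal text, exactly as in the command
    "m_lr": 0.01,
    "e_lr": 0.01,
    "weight1": 5.0,
    "weight2": 1.0,
    "use_pretrain": True,  # bare switch, no value
    "m1ckpt": "modality1_ckpt_pth",
    "m2ckpt": "modality2_ckpt_pth",
}

argv = build_train_command(cremad)
print(shlex.join(argv))  # a single shell-safe line for logging or copy-paste
# To launch from Python: subprocess.run(argv, check=True)
```

Passing the list to `subprocess.run` avoids shell line continuations entirely, because each argument is already a separate token.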

AVE dataset

python train.py \
--dataset AVE \
--dataset_path /data/huacong/AVE_Dataset \
--n_class 28 \
--batch_size 64 \
--boost_rate 1.0 \
--n_worker 8 \
--epochs_per_stage 4 \
--correct_epoch 4 \
--use_lr True \
--m_lr 0.01 \
--e_lr 0.01 \
--weight1 4.0 \
--weight2 1.0 \
--use_pretrain \
--m1ckpt modality1_ckpt_pth \
--m2ckpt modality2_ckpt_pth

ModelNet40 dataset

python train.py \
--dataset MView40 \
--dataset_path /data/huacong/ModelNet40 \
--n_class 40 \
--batch_size 48 \
--boost_rate 1.0 \
--n_worker 8 \
--epochs_per_stage 4 \
--correct_epoch 4 \
--weight1 4.0 \
--weight2 1.0 \
--use_pretrain \
--m1ckpt modality1_ckpt_pth

MOSEI

python train_MSA.py \
--dataset MSA \
--dataset_name mosei \
--featurePath /data/huacong/MSA/MOSEI/Processed/unaligned_50.pkl \
--seq_lens [50, 1, 1] \
--feature_dims [768, 74, 35]

MOSI

python train_MSA.py \
--dataset MSA \
--dataset_name mosi \
--featurePath /data/huacong/MSA/MOSI/Processed/unaligned_50.pkl \
--seq_lens [50, 1, 1] \
--feature_dims [768, 5, 20]

SIMS

python train_MSA.py \
--dataset MSA \
--dataset_name sims \
--featurePath /data/huacong/MSA/SIMS/unaligned_39.pkl \
--seq_lens [50, 1, 1] \
--feature_dims [768, 33, 709]
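
The list-valued flags in the MSA commands (`--seq_lens`, `--feature_dims`) contain spaces, so a shell splits an unquoted value into several tokens. How train_MSA.py parses these flags is not shown in this README, so the sketch below makes no claim about which form the script expects. It only illustrates the tokenization difference using Python's `shlex`, which follows POSIX shell splitting rules, and shows one possible way to turn a single quoted token back into a list.

```python
import ast
import shlex

# The same flag written without and with quotes around the list value.
unquoted = "python train_MSA.py --dataset MSA --seq_lens [50, 1, 1]"
quoted = 'python train_MSA.py --dataset MSA --seq_lens "[50, 1, 1]"'

unquoted_tokens = shlex.split(unquoted)
quoted_tokens = shlex.split(quoted)

# Without quotes the value arrives as three separate arguments.
print(unquoted_tokens[-3:])
# With quotes it arrives as one argument.
print(quoted_tokens[-1])

# One way to recover a Python list from the single quoted token
# (an illustration, not necessarily what train_MSA.py does).
seq_lens = ast.literal_eval(quoted_tokens[-1])
print(seq_lens)
```

If a run fails with an argument-parsing error on these flags, comparing the two forms is a reasonable first check.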

Evaluation

Overall Evaluation

CREMA-D dataset

python eval.py --dataset CREMAD --dataset_path /data/huacong/CREMA/data --n_class 6 --batch_size 64 --n_worker 8 \
--ensemble_ckpt_path cache/ckpt/CREMAD/best_ensemble_net_XX.path \
--uni_ckpt_path cache/ckpt/CREMAD/uni_encoder_XX.pth

AVE

python eval.py --dataset AVE --dataset_path /data/huacong/AVE_Dataset --n_class 28 --batch_size 64 --n_worker 8 --ensemble_ckpt_path cache/ckpt/AVE/best_ensemble_net_XX.path \
--uni_ckpt_path cache/ckpt/AVE/uni_encoder_XX.pth

Uni-modal Linear-probe Evaluation

Evaluate the audio encoder on the CREMA-D dataset.

python uni_eval.py --dataset CREMAD --dataset_path /data/huacong/CREMA/data --modality audio --n_class 6 --batch_size 64 --max_epochs 100 --emb 512 --uni_ckpt_path cache/ckpt/CREMAD/uni_encoder_XX.pth

Evaluate the visual encoder on the AVE dataset.

python uni_eval.py --dataset AVE --dataset_path /data/huacong/AVE_Dataset --modality visual --n_class 28 --batch_size 64 --max_epochs 100 --emb 512 --uni_ckpt_path cache/ckpt/AVE/uni_encoder_XX.pth
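
For readers unfamiliar with the term, linear probing generally means freezing an encoder and training only a linear classifier on its output embeddings, so that the accuracy reflects the quality of the learned representation. The sketch below illustrates that idea in pure Python on toy data. It is not the code in uni_eval.py: the 2-D embeddings, the labels, and the helpers `train_linear_probe` and `predict` are all hypothetical stand-ins, and a real run would use the encoder's actual embeddings.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_linear_probe(embeddings, labels, n_class, lr=0.5, epochs=200):
    """Fit a softmax classifier on fixed embeddings with full-batch gradient descent."""
    dim = len(embeddings[0])
    n = len(embeddings)
    W = [[0.0] * dim for _ in range(n_class)]
    b = [0.0] * n_class
    for _ in range(epochs):
        gW = [[0.0] * dim for _ in range(n_class)]
        gb = [0.0] * n_class
        for x, y in zip(embeddings, labels):
            logits = [sum(w * xi for w, xi in zip(W[c], x)) + b[c] for c in range(n_class)]
            probs = softmax(logits)
            for c in range(n_class):
                # Gradient of the cross-entropy loss with respect to logit c.
                err = probs[c] - (1.0 if c == y else 0.0)
                gb[c] += err
                for j in range(dim):
                    gW[c][j] += err * x[j]
        for c in range(n_class):
            b[c] -= lr * gb[c] / n
            for j in range(dim):
                W[c][j] -= lr * gW[c][j] / n
    return W, b


def predict(W, b, x):
    """Return the index of the class with the largest logit."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bc for row, bc in zip(W, b)]
    return max(range(len(logits)), key=logits.__getitem__)


# Toy stand-in for frozen encoder outputs: six 2-D embeddings, two classes.
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.0], [0.1, 1.0], [0.2, 0.9], [0.0, 0.8]]
labels = [0, 0, 0, 1, 1, 1]

W, b = train_linear_probe(embeddings, labels, n_class=2)
accuracy = sum(predict(W, b, x) == y for x, y in zip(embeddings, labels)) / len(labels)
print(f"linear-probe accuracy on toy embeddings: {accuracy:.2f}")
```

Only `W` and `b` are updated here; the embeddings never change, which is what distinguishes a linear probe from fine-tuning the encoder.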

Latent Embedding Visualization

Latent embeddings of the competing methods are available here.

To visualize the high-dimensional embeddings, run the following command.

python tsne_embedding.py

Citation

If you find this repository useful in your research, please cite the following paper:

@inproceedings{hua2024reconboost,
title={ReconBoost: Boosting Can Achieve Modality Reconcilement}, 
author={Cong Hua and Qianqian Xu and Shilong Bao and Zhiyong Yang and Qingming Huang},
booktitle={The Forty-first International Conference on Machine Learning},
year={2024}
}

Contact us

If you have questions or suggestions, please email us at huacong23z@ict.ac.cn. We will reply within 1-2 business days. Thanks for your interest in our work!
