
An official implementation of "Decoupled Multimodal Distilling for Emotion Recognition" in PyTorch. (CVPR 2023 highlight)

Highlight paper (10% of accepted papers, 2.5% of submissions)

We propose a decoupled multimodal distillation (DMD) approach that facilitates flexible and adaptive crossmodal knowledge distillation. The key ingredients include:

  • The representation of each modality is decoupled into two parts, i.e., a modality-irrelevant space and a modality-exclusive space.
  • We utilize a graph distillation unit (GD-Unit) for each decoupled part, so that knowledge distillation can be performed on each part in a specialized and effective manner.
  • A GD-Unit consists of a dynamic graph where each vertex represents a modality and each edge indicates a dynamic knowledge distillation.

In general, the proposed GD paradigm provides a flexible knowledge transfer scheme in which the distillation weights are learned automatically, thus enabling diverse crossmodal knowledge transfer patterns.
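To make the GD paradigm concrete, below is a minimal PyTorch sketch of one possible GD-Unit, assuming equal-dimensional per-modality features and logits. The names (GDUnit, edge_mlp) and the KL-based distillation loss are illustrative assumptions, not the code in this repository; in DMD, one such unit would operate on the modality-irrelevant features (HomoGD) and another on the modality-exclusive features (HeteroGD).

# Minimal sketch of a GD-Unit (assumption: equal-dimensional per-modality features).
# Class/layer names and the KL-based loss are illustrative, not this repository's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GDUnit(nn.Module):
    """Graph distillation over modalities: each vertex is a modality and each
    directed edge carries a dynamically predicted distillation weight."""

    def __init__(self, feat_dim: int, num_modalities: int = 3):
        super().__init__()
        self.num_modalities = num_modalities
        # Predicts one edge weight per ordered (source, target) modality pair
        # from the concatenated source/target features.
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim),
            nn.ReLU(),
            nn.Linear(feat_dim, 1),
        )

    def forward(self, feats, logits):
        # feats:  list of [batch, feat_dim] tensors, one per modality
        # logits: list of [batch, num_classes] tensors, one per modality
        distill_loss = feats[0].new_zeros(())
        for s in range(self.num_modalities):        # source (teacher) vertex
            for t in range(self.num_modalities):    # target (student) vertex
                if s == t:
                    continue
                pair = torch.cat([feats[s], feats[t]], dim=-1)
                w = torch.sigmoid(self.edge_mlp(pair)).mean()  # learned edge weight
                # Edge-weighted KL distillation from the source's predictions
                # to the target's predictions.
                kd = F.kl_div(
                    F.log_softmax(logits[t], dim=-1),
                    F.softmax(logits[s].detach(), dim=-1),
                    reduction="batchmean",
                )
                distill_loss = distill_loss + w * kd
        return distill_loss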

The motivation.

Motivation and main idea:

  • (a) illustrates the significant emotion recognition discrepancies across single modalities, adapted from MulT.
  • (b) shows the conventional cross-modal distillation.
  • (c) shows our proposed DMD.

The Framework.

The framework of DMD. Please refer to Paper Link for details.

The learned graph edges.

Illustration of the graph edges in HomoGD and HeteroGD. In (a), $L \to A$ and $L \to V$ dominate because the homogeneous language features contribute most while the other modalities perform poorly. In (b), $L \to A$, $L \to V$, and $V \to A$ dominate. $V \to A$ emerges because the visual modality enhances its feature discriminability via the multimodal transformer mechanism in HeteroGD.

Usage

Prerequisites

  • Python 3.8
  • PyTorch 1.9.0
  • CUDA 11.4

Datasets

Data files (containing the processed MOSI and MOSEI datasets) can be downloaded from here. Put the downloaded datasets into the ./dataset directory. Please note that the meta information and the raw data are not available due to the privacy of YouTube content creators. For more details, please follow the official websites of these datasets.
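If you want to check what the downloaded files contain before training, the sketch below simply lists and inspects any pickle files under ./dataset. The assumed structure (dictionaries with train/valid/test splits holding feature arrays and labels) is common for processed MOSI/MOSEI releases but is not guaranteed by this repository, so treat it as an assumption.

# Illustrative only: peek at whatever landed in ./dataset.
# The pickled structure is an assumption about processed MOSI/MOSEI files.
import glob
import pickle

for path in glob.glob("./dataset/**/*.pkl", recursive=True):
    print(path)
    with open(path, "rb") as f:
        data = pickle.load(f)
    if isinstance(data, dict):
        for split, content in data.items():
            keys = list(content.keys()) if isinstance(content, dict) else type(content)
            print(" ", split, keys)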

Run the Codes

  • Training

First, set the necessary parameters in ./config/config.json. Then, select the training dataset in train.py. Train the model as follows:

python train.py

By default, the trained model will be saved in the ./pt directory. You can change this in train.py.
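If you are unsure which parameters ./config/config.json exposes, a quick way to inspect it before editing is sketched below; it only prints whatever the file defines and assumes nothing about a particular schema.

# Illustrative only: list the entries defined in ./config/config.json.
import json

with open("./config/config.json", "r") as f:
    cfg = json.load(f)

print("top-level entries:", list(cfg) if isinstance(cfg, (dict, list)) else cfg)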

  • Testing

Test the trained model as follows:

python test.py

Please set the path of the trained model in run.py (line 174). We also provide some pretrained models for testing (Google Drive).
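For reference, here is a minimal sketch of loading a saved checkpoint before evaluation, assuming the model was saved via torch.save(model.state_dict(), ...); the file name is a placeholder, and the actual path and loading logic live in the scripts above.

# Illustrative only: load a saved checkpoint (path and save format are assumptions).
import torch

ckpt_path = "./pt/dmd_mosi.pth"  # hypothetical file name; use your own checkpoint
state_dict = torch.load(ckpt_path, map_location="cpu")
print("tensors in checkpoint:", len(state_dict))
# model.load_state_dict(state_dict)  # build the model as in train.py first
# model.eval()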

Citation

If you find the code helpful in your research or work, please cite the following paper.

@InProceedings{Li_2023_CVPR,
    author    = {Li, Yong and Wang, Yuanzhi and Cui, Zhen},
    title     = {Decoupled Multimodal Distilling for Emotion Recognition},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {6631-6640}
}
