The implementation of the EMNLP 2023 paper "M2DF: Multi-grained Multi-curriculum Denoising Framework for Multimodal Aspect-based Sentiment Analysis"
+ python 3.7.13
+ torch 1.12.0+cu113
+ numpy 1.21.6
+ transformers==3.4.0
+ fastnlp
+ h5py
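One way to confirm that an environment matches the pinned versions listed above is to compare them programmatically. The sketch below is only an illustration and is not part of this repository: the names `REQUIRED`, `find_mismatches`, and `installed_versions` are hypothetical, and only the packages with an explicit version in the list are checked (fastnlp and h5py have no version given here).

```python
import importlib

# Hypothetical helper, not part of the M2DF repository.
# Pinned versions copied from the dependency list above.
REQUIRED = {
    "torch": "1.12.0+cu113",
    "numpy": "1.21.6",
    "transformers": "3.4.0",
}


def find_mismatches(required, installed):
    """Return {package: (wanted, found)} for every package whose installed
    version differs from the required one; `found` is None if it is missing."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Import each package and read its `__version__` attribute.
    Packages that cannot be imported are left out of the result."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue
        versions[name] = getattr(module, "__version__", None)
    return versions
```

Calling `find_mismatches(REQUIRED, installed_versions(REQUIRED))` in the target environment returns an empty dict when all three pinned packages match, and otherwise maps each differing package to the expected and found versions.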
Because the processed image features are very large, you can download them via the Google Drive link. Note that the data paths must be consistent with the file tree below.
├── /src/
│ ├── /data/
│ │ │ ├── /jsons/
│ │ │ │ ├── twitter15_info.json
│ │ │ │ ├── twitter17_info.json
│ │ │ │ ├── amended_similarity_by_region2015.json    # fine-grained similarity
│ │ │ │ ├── amended_similarity_by_region2017.json
│ │ │ │ ├── amended_similarity_by_whole2015.json    # coarse-grained similarity
│ │ │ │ ├── amended_similarity_by_whole2017.json
│ ├── /twitter2015/
│ ├── /twitter2017/
│ ├── /twitter2015_box_att_NER/
│ ├── /twitter2017_box_att_NER/
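Since the data paths must match the tree, it can help to check the layout before training. The following is a minimal sketch and not part of this repository: the function name `missing_paths` is hypothetical, and the relative paths in `EXPECTED` are one reading of the tree above (JSON files under `src/data/jsons/`, image folders directly under `src/`), so adjust them if your copy is organized differently.

```python
import os

# Assumed layout, read off the file tree above; adjust if yours differs.
EXPECTED = [
    "src/data/jsons/twitter15_info.json",
    "src/data/jsons/twitter17_info.json",
    "src/data/jsons/amended_similarity_by_region2015.json",
    "src/data/jsons/amended_similarity_by_region2017.json",
    "src/data/jsons/amended_similarity_by_whole2015.json",
    "src/data/jsons/amended_similarity_by_whole2017.json",
    "src/twitter2015",
    "src/twitter2017",
    "src/twitter2015_box_att_NER",
    "src/twitter2017_box_att_NER",
]


def missing_paths(root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `root`."""
    return [
        rel
        for rel in expected
        if not os.path.exists(os.path.join(root, *rel.split("/")))
    ]
```

Running `missing_paths(".")` from the repository root lists any expected file or folder that is not yet in place; an empty list means every path in `EXPECTED` was found.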
- Train and Test on twitter2015
sh 15_pretrain_full.sh
- Train and Test on twitter2017
sh 17_pretrain_full.sh
Training logs for twitter2017 (trained on a GeForce GTX 1080 Ti) and twitter2015 (trained on a GeForce RTX 3090 Ti) are placed in the \log\ directory.
For the JMASA Task
Model | TWITTER-15 Pre | TWITTER-15 Rec | TWITTER-15 F1 | TWITTER-17 Pre | TWITTER-17 Rec | TWITTER-17 F1 |
UMT-collapse | 60.4 | 61.6 | 61.0 | 60.0 | 61.7 | 60.8 |
UMT-collapse + M2DF | 61.1±0.40 | 63.4±0.57 | 62.2±0.10 | 60.9±0.28 | 62.0±0.52 | 61.4±0.13 |
OSCGA-collapse | 63.1 | 63.7 | 63.2 | 63.5 | 63.5 | 63.5 |
OSCGA-collapse + M2DF | 64.4±0.37 | 64.6±0.45 | 64.5±0.13 | 64.1±0.11 | 63.9±0.16 | 64.0±0.12 |
RpBERT | 49.3 | 46.9 | 48.0 | 57.0 | 55.4 | 56.2 |
RpBERT + M2DF | 49.3±0.20 | 49.0±0.25 | 49.2±0.15 | 56.9±0.34 | 56.5±0.38 | 56.7±0.22 |
RDS | 60.8 | 61.7 | 61.2 | 61.8 | 62.9 | 62.3 |
RDS + M2DF | 61.2±0.12 | 63.0±0.35 | 62.1±0.15 | 62.4±0.16 | 63.6±0.12 | 63.0±0.08 |
JML | 64.8 | 63.6 | 64.0 | 65.6 | 66.1 | 65.9 |
JML + M2DF | 64.9±0.36 | 65.3±0.16 | 65.1±0.25 | 67.7±0.30 | 67.0±0.08 | 67.3±0.16 |
VLP-MABSA | 64.1 | 68.6 | 66.3 | 65.8 | 67.9 | 66.9 |
VLP-MABSA + M2DF | 67.0±0.20 | 68.3±0.26 | 67.6±0.18 | 67.9±0.10 | 68.8±0.37 | 68.3±0.18 |
For the MATE Task
Model | TWITTER-15 Pre | TWITTER-15 Rec | TWITTER-15 F1 | TWITTER-17 Pre | TWITTER-17 Rec | TWITTER-17 F1 |
UMT | 77.8 | 81.7 | 79.7 | 86.7 | 86.8 | 86.7 |
UMT + M2DF | 79.1±0.14 | 81.5±0.33 | 80.3±0.12 | 87.4±0.18 | 87.5±0.22 | 87.5±0.15 |
OSCGA | 81.7 | 82.1 | 81.9 | 90.2 | 90.7 | 90.4 |
OSCGA + M2DF | 82.0±0.10 | 82.8±0.31 | 82.4±0.13 | 90.3±0.15 | 91.5±0.17 | 90.9±0.07 |
JML | 82.9 | 81.2 | 82.0 | 90.2 | 90.9 | 90.5 |
JML + M2DF | 84.0±0.26 | 82.3±0.12 | 83.1±0.14 | 91.1±0.11 | 90.9±0.18 | 91.0±0.12 |
VLP-MABSA | 82.2 | 88.2 | 85.1 | 89.9 | 92.5 | 91.3 |
VLP-MABSA + M2DF | 85.2±0.24 | 87.4±0.20 | 86.3±0.15 | 91.5±0.25 | 93.2±0.23 | 92.4±0.14 |
For the MASC Task
Model | TWITTER-15 Acc | TWITTER-15 F1 | TWITTER-17 Acc | TWITTER-17 F1 |
TomBERT | 77.2 | 71.8 | 70.5 | 68.0 |
TomBERT + M2DF | 77.9±0.11 | 73.2±0.11 | 71.0±0.14 | 68.7±0.20 |
CapTrBERT | 78.0 | 73.2 | 72.3 | 70.2 |
CapTrBERT + M2DF | 78.4±0.12 | 74.0±0.08 | 73.0±0.08 | 71.3±0.07 |
FITE | 78.5 | 73.9 | 70.9 | 68.7 |
FITE + M2DF | 78.9±0.07 | 74.2±0.08 | 71.5±0.11 | 69.4±0.12 |
ITM | 78.3 | 74.2 | 72.6 | 72.0 |
ITM + M2DF | 78.9±0.05 | 75.0±0.07 | 73.2±0.10 | 73.0±0.08 |
JML | 78.1 | - | 72.7 | - |
JML + M2DF | 78.8±0.15 | - | 74.0±0.12 | - |
VLP-MABSA | 77.2 | 72.9 | 73.2 | 71.4 |
VLP-MABSA + M2DF | 78.9±0.15 | 74.8±0.24 | 74.3±0.15 | 73.0±0.16 |
If you find this helpful, please cite our paper.
@misc{zhao2023m2df,
title={M2DF: Multi-grained Multi-curriculum Denoising Framework for Multimodal Aspect-based Sentiment Analysis},
author={Fei Zhao and Chunhui Li and Zhen Wu and Yawen Ouyang and Jianbing Zhang and Xinyu Dai},
year={2023},
eprint={2310.14605},
archivePrefix={arXiv},
primaryClass={cs.CL}
}