forked from thuiar/MMSA

CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotations of Modality (ACL2020)

Python 3.6

MMSA

PyTorch implementation of the code for CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotations of Modality (ACL 2020)

Paper


CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotations of Modality

Please cite our paper if you find our work useful for your research:

@inproceedings{yu2020ch,
  title={CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality},
  author={Yu, Wenmeng and Xu, Hua and Meng, Fanyang and Zhu, Yilin and Ma, Yixiao and Wu, Jiele and Zou, Jiyun and Yang, Kaicheng},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  pages={3718--3727},
  year={2020}
}

Dataset

Annotations

  • You can download CH-SIMS from the following links.

md5: 6a92dccd83373b48ac83257bddab2538

  1. Baidu Yun Disk (code: ozo2)
  2. Google Drive

Supported Models

In this framework, we support the following methods:

| Type | Model Name | From |
|---|---|---|
| Single-Task | EF_LSTM | MultimodalDNN |
| Single-Task | LF_DNN | - |
| Single-Task | TFN | TensorFusionNetwork |
| Single-Task | LMF | Low-rank-Multimodal-Fusion |
| Single-Task | MFN | Memory-Fusion-Network |
| Single-Task | MulT (without CTC) | Multimodal-Transformer |
| Multi-Task | MLF_DNN | - |
| Multi-Task | MTFN | - |
| Multi-Task | MLMF | - |

Usage


Run the Code

  • Clone this repo and install requirements.
git clone https://github.com/thuiar/MMSA  
cd MMSA
pip install -r requirements.txt

Data Preprocessing

To extract features from the raw videos yourself, follow the steps below. Alternatively, you can use the feature data we provide directly.

  • Fetch audio and aligned faces (see data/DataPre.py)
  1. Install the ffmpeg toolkit
sudo apt update
sudo apt install ffmpeg
  2. Run data/DataPre.py
python data/DataPre.py --data_dir [path_to_CH-SIMS]
  • Get features (see data/getFeature.py)
  1. Download BERT-Base, Chinese from Google-Bert.
  2. Convert the TensorFlow checkpoint to PyTorch using transformers-cli
  3. Install the OpenFace toolkit
  4. Run data/getFeature.py
python data/getFeature.py --data_dir [path_to_CH-SIMS] --openface2Path [path_to_FeatureExtraction] --pretrainedBertPath [path_to_pretrained_bert_directory]
  5. You can then find the preprocessed features in path/to/CH-SIMS/Processed/features/data.npz
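
To sanity-check the generated data.npz before training, you can list the arrays it contains. This is a minimal sketch assuming NumPy is installed; the array names stored in the file are not documented here, so the helper (`summarize_npz`, a hypothetical name) only reports whatever it finds instead of assuming any particular keys.

```python
import numpy as np


def summarize_npz(path):
    """Map each array name in an .npz archive to its (shape, dtype) pair."""
    summary = {}
    # np.load returns a lazy NpzFile; arrays are read only when indexed.
    with np.load(path) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


# Example use (the path follows the preprocessing step above):
#   for name, (shape, dtype) in summarize_npz(
#           "path/to/CH-SIMS/Processed/features/data.npz").items():
#       print(name, shape, dtype)
```

If the archive happens to store Python objects rather than plain arrays, `np.load` will refuse to read them unless `allow_pickle=True` is passed; only do that for files you trust.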

Run

python run.py --modelName *** --datasetName sims --tasks MTAV
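
The `***` above stands for a model name. As a rough sketch, the commands for several models could be generated like this. The names are copied from the table of supported models; whether run.py expects exactly this spelling and casing is an assumption, so check the argument parser in run.py before relying on it. The `--tasks MTAV` value is taken unchanged from the command above.

```python
import shlex

# Model names as listed in the supported-models table. The exact strings
# accepted by --modelName are an assumption; verify them against run.py.
SINGLE_TASK = ["EF_LSTM", "LF_DNN", "TFN", "LMF", "MFN", "MulT"]
MULTI_TASK = ["MLF_DNN", "MTFN", "MLMF"]


def build_command(model_name, dataset="sims", tasks="MTAV"):
    """Return the argument list for one run.py invocation."""
    return [
        "python", "run.py",
        "--modelName", model_name,
        "--datasetName", dataset,
        "--tasks", tasks,
    ]


def format_command(argv):
    """Render an argument list as a shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be reviewed.
    for name in SINGLE_TASK + MULTI_TASK:
        print(format_command(build_command(name)))
```

Printing rather than executing keeps the sketch free of side effects; passing the list returned by `build_command` to `subprocess.run` would launch the run.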
