
Assist Non-native Viewers: Multimodal Cross-Lingual Summarization for How2 Videos


Accepted at EMNLP 2022.

Data Preparation

The reorganized How2-MCLS text data can be downloaded from here [Baidu Netdisk, Passcode: 6cd9]. The video features, derived from the original How2 dataset, are available here [Baidu Netdisk, Passcode: eqqj]. The original How2 dataset for multimodal summarization is provided at https://github.com/srvk/how2-dataset.

Preprocessing

Some demo data is provided in the "data/demo_data" folder. To use the full How2-MCLS dataset, replace the demo data with it, keeping the same format as the "data/demo_data" folder. Then run the following command to preprocess the data.

python preprocess.py  # Modify the data storage path configuration first.
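Before preprocessing the full dataset, it can help to confirm that it mirrors the demo layout. The following is a minimal sketch, not part of this repository: it assumes the full dataset uses the same relative file names as "data/demo_data", and the helper names `relative_files` and `missing_from_full` are hypothetical.

```python
from pathlib import Path


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def missing_from_full(demo_dir, full_dir):
    """List files that exist in the demo layout but not in the full dataset.

    Assumption: the full dataset reuses the demo folder's relative file names.
    """
    return sorted(relative_files(demo_dir) - relative_files(full_dir))
```

For example, `missing_from_full("data/demo_data", "/path/to/full_data")` returns an empty list when every demo file has a counterpart in the full dataset folder; the second path is a placeholder.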

Training and Prediction

Run the following scripts to train the proposed models (VDF, VDF-TS-E, and VDF-TS-V) and generate predictions.

VDF

./run_scripts/VDF.sh

VDF-TS-E

./run_scripts/VDF-TS-E.sh

VDF-TS-V

./run_scripts/VDF-TS-V.sh

Alternatively, we provide a trained first-stage model [Baidu Netdisk, Passcode: rcqo] that you can use directly to skip the first-stage training in the triple-stage training framework.
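The three run scripts listed above can also be launched from Python, for instance when queuing several runs. This is a minimal sketch under stated assumptions: it is run from the repository root, `bash` is available, and the names `RUN_SCRIPTS`, `script_for`, and `run_model` are hypothetical helpers rather than part of this repository.

```python
import subprocess

# Script paths as listed in this README, keyed by model name.
RUN_SCRIPTS = {
    "VDF": "./run_scripts/VDF.sh",
    "VDF-TS-E": "./run_scripts/VDF-TS-E.sh",
    "VDF-TS-V": "./run_scripts/VDF-TS-V.sh",
}


def script_for(model):
    """Return the run script path for a model name, or raise ValueError."""
    try:
        return RUN_SCRIPTS[model]
    except KeyError:
        raise ValueError(f"unknown model {model!r}; expected one of {sorted(RUN_SCRIPTS)}") from None


def run_model(model):
    """Run the script for one model; raises CalledProcessError if it fails."""
    subprocess.run(["bash", script_for(model)], check=True)
```

Calling `run_model("VDF")` is then equivalent to running `./run_scripts/VDF.sh` by hand, with a Python exception raised if the script exits with a non-zero status.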

Evaluation

The nmtpytorch library is used to evaluate the models. It provides the BLEU (1, 2, 3, 4), ROUGE-L, METEOR, and CIDEr metrics.

Alternatively, the nlg-eval evaluation library produces the same scores as nmtpytorch.

In addition, the ROUGE evaluation library is used to calculate the ROUGE (1, 2, L) scores.
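To make the ROUGE-L metric concrete, here is a minimal sketch of a sentence-level ROUGE-L F-score based on the longest common subsequence (LCS). It is an illustration only, not the implementation used by any of the libraries above: it assumes whitespace tokenization and an equal weighting of precision and recall, and the libraries may differ on both points, so its numbers need not match theirs.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, hypothesis):
    """Sentence-level ROUGE-L F1 over whitespace tokens (illustrative only)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref or not hyp:
        return 0.0
    lcs = lcs_length(ref, hyp)
    if lcs == 0:
        return 0.0
    precision = lcs / len(hyp)  # share of hypothesis tokens in the LCS
    recall = lcs / len(ref)     # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

For the reference "the cat sat on the mat" and the hypothesis "the cat lay on the mat", the LCS has 5 tokens out of 6 on each side, so precision, recall, and the F-score are all 5/6.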

Acknowledgement

The code is based on MFN, nmtpytorch, fairseq, machine-translation, and Transformers. We are very grateful to their authors.
