Accepted at EMNLP 2022.
The reorganized How2-MCLS text data can be downloaded here [Baidu Netdisk, Passcode: 6cd9], together with the video features [Baidu Netdisk, Passcode: eqqj] derived from the original How2 dataset. The original How2 dataset for multimodal summarization is provided at https://github.com/srvk/how2-dataset.
Some demo data is placed in the "data/demo_data" folder; you can replace it with the full How2-MCLS dataset, following the same format as "data/demo_data". Then run the following command to preprocess the data.
python preprocess.py  # Modify the data storage path configuration first.
Run the following scripts to execute the training and prediction procedures of the proposed models: VDF, VDF-TS-E, and VDF-TS-V.
VDF
./run_scripts/VDF.sh
VDF-TS-E
./run_scripts/VDF-TS-E.sh
VDF-TS-V
./run_scripts/VDF-TS-V.sh
Alternatively, we provide a trained first-stage model [Baidu Netdisk, Passcode: rcqo] that you can use directly to skip the first stage of the triple-stage training framework.
The nmtpytorch library is used to evaluate models; it provides the BLEU (1, 2, 3, 4), ROUGE-L, METEOR, and CIDEr evaluation metrics.
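For readers unfamiliar with the metric, a minimal pure-Python sketch of sentence-level BLEU-4 (modified n-gram precision with a brevity penalty and simple floor smoothing) is shown below. This is only an illustration of what the metric measures, not the evaluation code nmtpytorch runs:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Illustrative smoothed sentence-level BLEU for one hypothesis/reference pair."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # clipped n-gram matches
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)      # floor-smooth zero counts
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

Real toolkits (nmtpytorch, sacrebleu) additionally handle corpus-level aggregation, tokenization, and multiple references.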
As an alternative, the nlg-eval library yields the same evaluation scores as nmtpytorch.
In addition, the ROUGE evaluation library is used to compute the ROUGE-1, ROUGE-2, and ROUGE-L scores.
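ROUGE-L is defined from the longest common subsequence (LCS) between hypothesis and reference. As a hedged sketch of the definition only (the actual scores are produced by the ROUGE library, and the beta value here is an assumption), a minimal implementation:

```python
def lcs_len(a, b):
    """Longest common subsequence length via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l(hypothesis, reference, beta=1.2):
    """F-measure over LCS precision/recall (beta weights recall; 1.2 is a common choice)."""
    hyp, ref = hypothesis.split(), reference.split()
    lcs = lcs_len(hyp, ref)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(hyp), lcs / len(ref)
    return (1 + beta ** 2) * prec * rec / (rec + beta ** 2 * prec)
```

ROUGE-1 and ROUGE-2 follow the same precision/recall pattern but over unigram and bigram overlap instead of the LCS.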
Our code builds on MFN, nmtpytorch, fairseq, machine-translation, and Transformers; we are very grateful to their authors.