We investigate video-aided grammar induction, which learns a constituency parser from both unlabeled text and its corresponding video. We explore rich video features (e.g., action, object, scene, audio, face, OCR, and speech), taking the recent Compound PCFG as our baseline. We further propose a Multi-Modal Compound PCFG (MMC-PCFG) model to effectively aggregate these rich features from different modalities.
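For intuition only, here is a minimal, self-contained PyTorch sketch of the kind of multi-modal aggregation described above: features from each modality are projected into a shared space and combined with learned attention weights. All module names and dimensions below are illustrative assumptions, not the repository's actual implementation.

```python
import torch
import torch.nn as nn

class MultiModalAggregator(nn.Module):
    """Illustrative sketch: project per-modality video features into a shared
    space and combine them with learned attention weights."""
    def __init__(self, modality_dims, hidden_dim):
        super().__init__()
        # one projection per modality (e.g. action, object, scene, audio, ...)
        self.projections = nn.ModuleDict(
            {name: nn.Linear(dim, hidden_dim) for name, dim in modality_dims.items()}
        )
        self.attention = nn.Linear(hidden_dim, 1)

    def forward(self, features):
        # features: dict mapping modality name -> (batch, dim) tensor
        projected = torch.stack(
            [self.projections[name](feat) for name, feat in features.items()], dim=1
        )  # (batch, num_modalities, hidden_dim)
        weights = torch.softmax(self.attention(projected), dim=1)  # attention over modalities
        return (weights * projected).sum(dim=1)  # (batch, hidden_dim)

# Example with made-up feature dimensions:
agg = MultiModalAggregator({"action": 512, "object": 2048, "audio": 128}, hidden_dim=256)
video_vec = agg({"action": torch.randn(2, 512),
                 "object": torch.randn(2, 2048),
                 "audio": torch.randn(2, 128)})
print(video_vec.shape)  # torch.Size([2, 256])
```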
- 🍻 Our follow-up work is accepted by EMNLP 2022. [PDF] [Code]
- 🏆 Our paper won the Best Long Paper Award at NAACL 2021.
- ☀️ Our paper is accepted by NAACL 2021.
- PyTorch 1.5.0
- Python 3.7
- easydict
- terminaltables
- tqdm
- numpy
- Torch-Struct (a forked version)
- benepar (optional)
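As a quick sanity check after installing the dependencies above, the following snippet (not part of the repository) verifies that the main packages are importable and prints the installed versions:

```python
# Quick environment check (illustrative, not part of the repository).
import torch
import easydict
import terminaltables
import tqdm
import numpy

print("torch", torch.__version__)   # expected: 1.5.0
print("numpy", numpy.__version__)
```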
Please download the data from dropbox or box, and save it to the `data` folder.
Preprocessing details are described here.
Run the following commands for training:
sh scripts/didemo/train.sh
sh scripts/youcook2/train.sh
sh scripts/msrvtt/train.sh
For VC-PCFG and MMC-PCFG, each run takes approximately 2 hours on DiDeMo, 1 hour on YouCook2, and 10 hours on MSRVTT with a single GTX 1080 Ti or RTX 2080 Ti.
Our trained models are provided in dropbox or box. Please download them to the `final_checkpoints` folder.
Then, run the following commands for evaluation:
sh scripts/didemo/test.sh
sh scripts/youcook2/test.sh
sh scripts/msrvtt/test.sh
Download the data from the CVPR 2020: Video Pentathlon challenge and save it to the `data` folder.
After that, preprocess sentences with the following scripts:
python tools/preprocess_captions.py
python tools/compute_gold_trees.py
python tools/generate_vocabularies.py
This repo was developed based on vpcfg and detr.
If any part of our paper or code is helpful to your work, please cite it as:
@InProceedings{zhang2021video,
  author    = {Zhang, Songyang and Song, Linfeng and Jin, Lifeng and Xu, Kun and Yu, Dong and Luo, Jiebo},
  title     = {Video-aided Unsupervised Grammar Induction},
  booktitle = {NAACL},
  year      = {2021}
}