Video-Mined Task Graphs for Keystep Recognition in Instructional Videos

Official code for Video-Mined Task Graphs for Keystep Recognition in Instructional Videos, NeurIPS 2023.

Introduction

This paper proposes learning a task graph from videos and using it to regularize keystep predictions. The proposed method outperforms prior work on zero-shot keystep recognition on the CrossTask and COIN datasets. We also use the task graph to pseudo-label a large-scale instructional video dataset (HowTo100M); representation learning on the resulting labels improves downstream task performance.
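To make the idea of regularizing keystep predictions with a task graph concrete, here is a minimal sketch. It is not the code in this repository: it uses plain Viterbi decoding as a stand-in technique, and the function name `decode_with_task_graph`, the transition matrix, and the per-clip scores are all made-up illustrations. The point is only that a prior over keystep order can override a noisy per-clip prediction.

```python
import numpy as np

def decode_with_task_graph(log_scores, log_transitions):
    """Viterbi decoding (illustrative stand-in, not the paper's algorithm).

    log_scores:      (T, K) log-score of each of K keysteps for each of T clips.
    log_transitions: (K, K) log-prior that keystep j follows keystep i.
    Returns the highest-scoring keystep sequence as a list of T ints.
    """
    T, K = log_scores.shape
    best = np.empty((T, K))
    back = np.zeros((T, K), dtype=int)
    best[0] = log_scores[0]
    for t in range(1, T):
        # cand[i, j]: best sequence ending in keystep i at t-1, then moving to j
        cand = best[t - 1][:, None] + log_transitions
        back[t] = cand.argmax(axis=0)
        best[t] = cand.max(axis=0) + log_scores[t]
    # Trace the best path backwards from the final clip
    path = [int(best[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical task graph over 3 keysteps that favors the order 0 -> 1 -> 2
transitions = np.log(np.array([
    [0.50, 0.45, 0.05],
    [0.05, 0.50, 0.45],
    [0.05, 0.05, 0.90],
]))

# Hypothetical per-clip scores; the third clip is noisy and prefers keystep 0
scores = np.log(np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.1, 0.4],
    [0.1, 0.1, 0.8],
]))

independent = scores.argmax(axis=1).tolist()            # [0, 1, 0, 2]
regularized = decode_with_task_graph(scores, transitions)  # [0, 1, 2, 2]
print(independent, regularized)
```

In this toy example the per-clip argmax jumps back to keystep 0 on the third clip, while the graph-regularized sequence stays consistent with the assumed keystep order.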

Usage

Fill in the paths

Please replace all the paths in the code with the locations of the sentence and feature files, which can be downloaded from here. Alternate links:

Zero-shot keystep recognition

Navigate to the zero-shot directory and run the individual scripts with:

python text.py coin # text modality evaluation for the COIN dataset
python text.py crosstask # text modality evaluation for the CrossTask dataset
python video.py coin # video modality evaluation for the COIN dataset
python video.py crosstask # video modality evaluation for the CrossTask dataset

Running these scripts should reproduce the numbers reported in Table 1.
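If you prefer to launch all four evaluations in one go, a small wrapper along these lines may help. This is a convenience sketch and not part of the repository: the helper names `build_commands` and `run_all` are hypothetical, and it assumes the scripts live in the zero-shot directory and take the dataset name as their only argument, as in the commands above.

```python
import subprocess
import sys
from itertools import product

MODALITY_SCRIPTS = ("text.py", "video.py")  # script names taken from the commands above
DATASETS = ("coin", "crosstask")            # dataset arguments taken from the commands above

def build_commands(python=sys.executable):
    """Return one command per (script, dataset) pair, in the order listed above."""
    return [[python, script, dataset]
            for script, dataset in product(MODALITY_SCRIPTS, DATASETS)]

def run_all(cwd="zero-shot"):
    """Run every evaluation inside `cwd`; stop at the first failing script."""
    for cmd in build_commands():
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_all()` from the repository root runs the four evaluations sequentially. Fill in the data paths first, since the scripts will fail without the sentence and feature files.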

Representation learning

We use Video Distant Supervision to train the representation learning model, replacing the labels it provides with our task graph labels. We use the HowTo100M ASR narrations provided by this paper. The labels can be downloaded from here.

Reporting issues

Feel free to open an issue in case of questions, or email me.

Acknowledgement

The instructional video representation learning is based on the Distant Supervision repository. We thank the authors and maintainers of this codebase.

License

This codebase is licensed under the CC-BY-NC license.
