Official code repository for "Video-Mined Task Graphs for Keystep Recognition in Instructional Videos" arXiv, 2023

facebookresearch/TaskGraph

Video-Mined Task Graphs for Keystep Recognition in Instructional Videos

Official code for "Video-Mined Task Graphs for Keystep Recognition in Instructional Videos", NeurIPS 2023.

Project page | arXiv

Introduction

This paper proposes learning a task graph from videos and using it to regularize keystep predictions. The proposed method outperforms prior work on zero-shot keystep recognition on the CrossTask and COIN datasets. We also use the task graph to pseudo-label a large-scale instructional video dataset (HowTo100M); representation learning with the resulting labels improves downstream task performance.
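To make the idea of regularizing keystep predictions with a graph more concrete, here is a minimal sketch. It is not the code in this repository: it assumes the task graph can be summarized as a keystep-to-keystep transition probability matrix and uses standard Viterbi decoding to pick the most likely keystep sequence. The function name `decode_with_task_graph`, the toy `transitions` matrix, and the toy `scores` are all hypothetical.

```python
import numpy as np


def decode_with_task_graph(scores, transitions, eps=1e-8):
    """Viterbi decoding of a keystep sequence (illustrative assumption only).

    scores:      (T, K) per-segment keystep probabilities from some classifier.
    transitions: (K, K) matrix, transitions[i, j] = P(next keystep j | keystep i).
    Returns the highest-scoring keystep index for each of the T segments.
    """
    scores = np.asarray(scores, dtype=float)
    transitions = np.asarray(transitions, dtype=float)
    num_segments, _ = scores.shape
    log_s = np.log(scores + eps)
    log_a = np.log(transitions + eps)

    best = log_s[0].copy()                      # best log-score of paths ending in each keystep
    back = np.zeros(scores.shape, dtype=int)    # backpointers for path recovery
    for t in range(1, num_segments):
        cand = best[:, None] + log_a            # cand[i, j]: come from i, move to j
        back[t] = cand.argmax(axis=0)
        best = cand.max(axis=0) + log_s[t]

    path = [int(best.argmax())]
    for t in range(num_segments - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


# Toy chain graph 0 -> 1 -> 2 with self-loops (hypothetical values).
transitions = [[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5],
               [0.0, 0.0, 1.0]]
# Segment 2 is noisy: its per-segment argmax is keystep 2, which breaks the order.
scores = [[0.80, 0.10, 0.10],
          [0.20, 0.35, 0.45],
          [0.10, 0.80, 0.10],
          [0.10, 0.10, 0.80]]

independent = np.argmax(scores, axis=1).tolist()         # [0, 2, 1, 2]
regularized = decode_with_task_graph(scores, transitions)  # [0, 1, 1, 2]
```

In this toy case the graph prior overrides the noisy second prediction, because moving from keystep 2 back to keystep 1 has (near) zero probability under the assumed graph.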

Usage

Fill in the paths

Replace all paths in the code with the locations of the sentence and feature files, which can be downloaded from here. Alternate links:

Zero-shot keystep recognition

Navigate to the zero-shot directory and run the individual scripts:

python text.py coin # text modality evaluation for the COIN dataset
python text.py crosstask # text modality evaluation for the CrossTask dataset
python video.py coin # video modality evaluation for the COIN dataset
python video.py crosstask # video modality evaluation for the CrossTask dataset

Running these scripts should reproduce the numbers reported in Table 1 of the paper.
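If you prefer to launch all four evaluations in one go, a small wrapper along these lines may help. It is only a sketch: the script names `text.py` and `video.py` and the dataset arguments `coin` and `crosstask` come from the commands above, while `build_commands`, `run_all`, and the `dry_run` flag are hypothetical helpers that are not part of this repository. It assumes it is run from the directory containing the two scripts.

```python
import subprocess
import sys

SCRIPTS = ("text.py", "video.py")      # evaluation scripts named in this README
DATASETS = ("coin", "crosstask")       # dataset arguments named in this README


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per script/dataset pair."""
    return [[python, script, dataset] for script in SCRIPTS for dataset in DATASETS]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first evaluation that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

With `dry_run=True` the wrapper only prints the commands, which is a cheap way to confirm the working directory and interpreter before starting the actual evaluations.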

Representation learning

We train the representation learning model with Video Distant Supervision, replacing the labels it provides with our task graph labels. We use the HowTo100M ASR narrations provided by this paper. The labels can be downloaded from here.

Reporting issues

Feel free to open an issue in case of questions, or email me.

Acknowledgement

The instructional video representation learning code is based on the Distant Supervision repository. We thank the authors and maintainers of that codebase.

License

This codebase is licensed under the CC-BY-NC license.
