
Cross-task weakly supervised learning from instructional videos


This is an implementation of the paper "Cross-task weakly supervised learning from instructional videos" by D. Zhukov, J.-B. Alayrac, R. G. Cinbis, D. Fouhey, I. Laptev and J. Sivic [arXiv]

Please consider citing the paper if you use our code or data:

@INPROCEEDINGS{Zhukov2019,
  author = {Zhukov, Dimitri and Alayrac, Jean-Baptiste and Cinbis, Ramazan Gokberk and Fouhey, David and Laptev, Ivan and Sivic, Josef},
  title = {Cross-task weakly supervised learning from instructional videos},
  booktitle = {CVPR},
  year = {2019},
}

CrossTask dataset

The CrossTask dataset contains instructional videos collected for 83 different tasks. For each task we provide an ordered list of steps with manual descriptions. The dataset is divided into two parts: 18 primary tasks and 65 related tasks. Videos for the primary tasks were collected manually and come with annotations of temporal step boundaries. Videos for the related tasks were collected automatically and have no annotations.

Tasks, video URLs and annotations are provided here. See readme.txt for details.

Features are available here (30 GB). Features for each video are provided as a NumPy array with one 3200-dimensional feature vector per second. Each feature vector is a concatenation of RGB I3D features (columns 0-1023), ResNet-152 features (columns 1024-3071) and audio VGG features (columns 3072-3199).
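The column layout above can be sketched as a small helper that splits one video's feature array into its three modalities. This is only an illustration: the names `MODALITY_COLUMNS` and `split_modalities` are hypothetical and not part of the released code, and the array here is synthetic. In practice the array would be loaded from one of the downloaded feature files.

```python
import numpy as np

# Column ranges as described in the text (slice end is exclusive).
# The dictionary keys are illustrative names, not identifiers from the repository.
MODALITY_COLUMNS = {
    "i3d_rgb": slice(0, 1024),       # RGB I3D features, columns 0-1023
    "resnet152": slice(1024, 3072),  # ResNet-152 features, columns 1024-3071
    "audio_vgg": slice(3072, 3200),  # audio VGG features, columns 3072-3199
}


def split_modalities(features):
    """Split a (seconds, 3200) feature array into per-modality views."""
    features = np.asarray(features)
    if features.ndim != 2 or features.shape[1] != 3200:
        raise ValueError("expected an array of shape (seconds, 3200)")
    return {name: features[:, cols] for name, cols in MODALITY_COLUMNS.items()}


# Synthetic stand-in for a 5-second video (one row per second).
demo = np.zeros((5, 3200), dtype=np.float32)
parts = split_modalities(demo)
for name, part in parts.items():
    print(name, part.shape)
```

The three slices have widths 1024, 2048 and 128, which add up to the 3200 dimensions stated above. Basic slicing returns views, so no feature data is copied.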

Temporal constraints extracted from narration are available here.

Update 30/06/2019: added videos_val.csv with the validation set from the paper; removed extra lines from the constraints.


The provided code can be used to train and evaluate the component model proposed in the paper on the CrossTask dataset. It was tested with Python 3.7, PyTorch 1.0, NumPy 1.16 and Cython 0.29.
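One way to compare a local environment against the tested versions is sketched below. It is not part of the released code. It assumes the packages were installed under their usual PyPI distribution names (`torch`, `numpy`, `Cython`), and it uses `importlib.metadata`, which needs Python 3.8 or newer, so it would not run on the Python 3.7 setup mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions the text reports testing with, keyed by assumed PyPI distribution name.
TESTED_VERSIONS = {"torch": "1.0", "numpy": "1.16", "Cython": "0.29"}


def installed_versions(packages):
    """Return {package: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print the tested version next to whatever is installed locally.
for name, installed in installed_versions(TESTED_VERSIONS).items():
    print(f"{name}: tested with {TESTED_VERSIONS[name]}, installed: {installed}")
```

A mismatch does not necessarily mean the code will fail; the listed versions are only the ones the code was tested with.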

  1. Clone the repository
git clone
cd CrossTask
  2. Download and unpack the dataset
unzip '*.zip'
  3. Compile Cython code
python setup.py build_ext --inplace
  4. Run training