
Cross-task weakly supervised learning from instructional videos

About

This is an implementation of the paper "Cross-task weakly supervised learning from instructional videos" by D. Zhukov, J.-B. Alayrac, R. G. Cinbis, D. Fouhey, I. Laptev and J. Sivic [arXiv]

Please consider citing the paper if you use our code or data:

@INPROCEEDINGS{Zhukov2019,
    author      = {Zhukov, Dimitri and Alayrac, Jean-Baptiste and Cinbis, Ramazan Gokberk and Fouhey, David and Laptev, Ivan and Sivic, Josef},
    title       = {Cross-task weakly supervised learning from instructional videos},
    booktitle   = {CVPR},
    year        = {2019},
}

CrossTask dataset

The CrossTask dataset contains instructional videos collected for 83 different tasks. For each task, we provide an ordered list of steps with manual descriptions. The dataset is divided into two parts: 18 primary tasks and 65 related tasks. Videos for the primary tasks were collected manually and are annotated with temporal step boundaries. Videos for the related tasks were collected automatically and have no annotations.

Tasks, video URLs and annotations are provided here. See readme.txt for details.

Features are available here (30 GB). Features for each video are provided as a NumPy array with one 3200-dimensional feature vector per second of video. Each feature vector is a concatenation of RGB I3D features (columns 0-1023), ResNet-152 features (columns 1024-3071) and audio VGG features (columns 3072-3199).
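The column layout described above can be illustrated with a short Python sketch that splits one video's feature array into its three modalities. The function name `split_features` and the dictionary keys are hypothetical and not part of the released code; the example uses a synthetic array rather than a real feature file, since the file naming inside the archive is not described here.

```python
import numpy as np

# Column ranges taken from the feature description (slice end is exclusive).
I3D_COLS = slice(0, 1024)       # RGB I3D features, columns 0-1023
RESNET_COLS = slice(1024, 3072)  # ResNet-152 features, columns 1024-3071
AUDIO_COLS = slice(3072, 3200)   # audio VGG features, columns 3072-3199


def split_features(features):
    """Split a (seconds, 3200) feature array into per-modality views.

    Hypothetical helper for illustration; not part of the released code.
    """
    features = np.asarray(features)
    if features.ndim != 2 or features.shape[1] != 3200:
        raise ValueError(
            "expected an array of shape (seconds, 3200), got {}".format(features.shape)
        )
    return {
        "i3d": features[:, I3D_COLS],
        "resnet": features[:, RESNET_COLS],
        "audio": features[:, AUDIO_COLS],
    }


if __name__ == "__main__":
    # Synthetic stand-in for the features of a 90-second video.
    demo = np.zeros((90, 3200), dtype=np.float32)
    for name, block in split_features(demo).items():
        print(name, block.shape)
```

For a real video, the same function would be applied to the array loaded from the corresponding feature file, for example with `np.load`, assuming the file stores a plain array in the layout described above.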

Temporal constraints extracted from narration are available here.

Update 30/06/2019: added videos_val.csv with validation set from the paper, removed extra lines from the constraints.

Code

The provided code can be used to train and evaluate the component model proposed in the paper on the CrossTask dataset. It was tested with Python 3.7, PyTorch 1.0, NumPy 1.16 and Cython 0.29.
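Before running the code, it may help to compare the local environment with the tested versions listed above. The following is a minimal sketch, not part of the released code: the helper names `installed_version` and `environment_report` are hypothetical, and it assumes each package exposes a `__version__` attribute on import. Other versions may well work; the sketch only reports what is installed.

```python
import importlib
import sys

# Versions the README reports testing with; other versions may also work.
TESTED_VERSIONS = {"torch": "1.0", "numpy": "1.16", "Cython": "0.29"}
TESTED_PYTHON = "3.7"


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def environment_report():
    """Map 'python' and each tested package name to its installed version (or None)."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in TESTED_VERSIONS:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for name, found in environment_report().items():
        tested = TESTED_VERSIONS.get(name, TESTED_PYTHON)
        print("{}: installed {}, tested with {}".format(name, found or "not found", tested))
```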

  1. Clone the repository
git clone https://github.com/DmZhukov/CrossTask.git
cd CrossTask
  2. Download and unpack the dataset
wget https://www.di.ens.fr/~dzhukov/crosstask/crosstask_release.zip
wget https://www.di.ens.fr/~dzhukov/crosstask/crosstask_features.zip
wget https://www.di.ens.fr/~dzhukov/crosstask/crosstask_constraints.zip
unzip '*.zip'
  3. Compile Cython code
python setup.py build_ext --inplace
  4. Run training
python train.py