Speech Transcripts for the V3C1 Dataset


This repository contains transcripts for the videos of V3C1, the first shard of the Vimeo Creative Commons Collection. The transcripts were generated with the public Google Cloud Speech-to-Text API, configured for English. The results are encoded as JSON maps in which each key is the number of a segment within the video and each value contains the words spoken during that segment. No file is present for videos without any detected English speech. All data is provided without any guarantee of correctness. If you use the data provided in this repository, please cite the following publication:


  title={Deep Learning-Based Concept Detection in vitrivr},
  author={Rossetto, Luca and Parian, Mahnaz Amiri and Gasser, Ralph and Giangreco, Ivan and Heller, Silvan and Schuldt, Heiko},
  booktitle={International Conference on Multimedia Modeling},
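
The JSON maps described above can be read with a few lines of Python. The following is a minimal sketch and not part of the repository: the function name `load_transcript` and the file name in the usage note are hypothetical, and the exact value type (a single string or a list of words) is an assumption, so the code accepts both.

```python
import json
from pathlib import Path


def load_transcript(path):
    """Return {segment_number: text} for one transcript file, or None if it is absent."""
    path = Path(path)
    if not path.exists():
        # A missing file means no English speech was detected for that video.
        return None
    with path.open(encoding="utf-8") as handle:
        raw = json.load(handle)
    segments = {}
    for key, value in raw.items():
        # Assumption: a value is either one string or a list of words.
        text = value if isinstance(value, str) else " ".join(map(str, value))
        # JSON object keys are always strings, so convert them to integers.
        segments[int(key)] = text
    return segments
```

Calling `load_transcript("00001.json")` (a hypothetical file name) would return a dictionary keyed by integer segment number, or `None` when the video has no transcript file.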