This repository is currently being refactored, and therefore files may change.
Automatic Speech-to-Text (AST)
A sequence-to-sequence model for training speech-to-text systems.
We preprocessed the English translations released by:
Improved Speech-to-Text Translation with the Fisher and Callhome Spanish–English Speech Translation Corpus, Matt Post, Gaurav Kumar, Adam Lopez, Damianos Karakos, Chris Callison-Burch and Sanjeev Khudanpur, IWSLT 2013
and make them available here.
Fisher Spanish speech data is available from LDC (LDC2010S01).
We use Chainer as our deep learning framework.
- create a conda environment with Python 3:
conda create --name ast python=3
- activate new environment:
source activate ast
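- install CuPy. The cupy-cuda91 wheel targets CUDA 9.1; choose the wheel that matches your installed CUDA version: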
pip install cupy-cuda91
- install Chainer:
pip install chainer
- check that Chainer detects GPU support. Launch Python:
$ python
Python 3.7.1 [GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import chainer
>>> chainer.backends.cuda.available
True
>>> chainer.backends.cuda.cudnn_enabled
True
>>>
- install NLTK, which is used to extract stop word lists for target languages and to compute evaluation metrics such as BLEU score:
conda install nltk
- install tqdm for progress bar support:
conda install tqdm
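
Once the steps above are done, the whole environment can be checked in one pass instead of package by package. The script below is a minimal sketch and not part of this repository: the helper names check_packages and gpu_status are hypothetical, and the only Chainer attributes it reads are the two flags shown in the interactive session above. It assumes the import names cupy, chainer, nltk and tqdm.

```python
import importlib

# Import names of the packages installed in the steps above.
# (Assumption: the pip/conda packages expose these module names.)
REQUIRED = ["cupy", "chainer", "nltk", "tqdm"]


def check_packages(names=REQUIRED):
    """Return {name: version string, or None if the import fails}."""
    status = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing package or a broken CUDA setup both end up here.
            status[name] = None
        else:
            status[name] = getattr(module, "__version__", "unknown")
    return status


def gpu_status():
    """Report the two Chainer flags checked interactively above."""
    try:
        import chainer
    except Exception:
        # Chainer is missing or failed to import: report no GPU support.
        return {"available": False, "cudnn_enabled": False}
    return {
        "available": bool(chainer.backends.cuda.available),
        "cudnn_enabled": bool(chainer.backends.cuda.cudnn_enabled),
    }


if __name__ == "__main__":
    for name, version in check_packages().items():
        print(f"{name:8s} {version if version else 'MISSING'}")
    for flag, value in gpu_status().items():
        print(f"{flag:14s} {value}")
```

Run it inside the activated ast environment. A MISSING entry points at the install step to repeat, and a False flag means Chainer will fall back to the CPU.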