Code to train Automatic Speech-to-Text (AST) models


This repository is currently being refactored, and therefore files may change.

Automatic Speech-to-Text (AST)

A sequence-to-sequence model for training speech-to-text systems.

Reference: Pre-training on high-resource speech recognition improves low-resource speech-to-text translation, Sameer Bansal, Herman Kamper, Karen Livescu, Adam Lopez, Sharon Goldwater

Fisher data

We preprocessed the English translations released by:

Improved Speech-to-Text Translation with the Fisher and Callhome Spanish–English Speech Translation Corpus, Matt Post, Gaurav Kumar, Adam Lopez, Damianos Karakos, Chris Callison-Burch and Sanjeev Khudanpur, IWSLT 2013

and make them available here.

The Fisher Spanish speech data is available from LDC (LDC2010S01).


We use Chainer as our deep learning framework. The steps below set up the environment.


  1. Create a conda environment with Python 3:

conda create --name ast python=3

  2. Activate the new environment:

source activate ast

  3. Install CuPy. The wheel name must match the installed CUDA version; cupy-cuda91 targets CUDA 9.1:

pip install cupy-cuda91

  4. Install Chainer:

pip install chainer

  5. Check that Chainer detects GPU support. Launch Python and query the two flags below; both should evaluate to True when CUDA and cuDNN are detected:

$ python
Python 3.7.1
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import chainer
>>> chainer.backends.cuda.available
>>> chainer.backends.cuda.cudnn_enabled

  6. Install NLTK, which is used to extract stop word lists for target languages and to compute evaluation metrics such as BLEU:

conda install nltk

  7. Install tqdm for progress bar support:

conda install tqdm
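
The installation steps above can be verified in one pass with a small script. The following is a minimal sketch, not part of this repository: the function names check_environment and gpu_flags are hypothetical, and the package list simply mirrors the packages installed above. It relies only on the Python standard library plus the two Chainer flags already shown in the interactive check.

```python
import importlib.util

# Packages installed in the steps above (the list is an assumption
# made for this sketch, not something the repository defines).
REQUIRED = ("cupy", "chainer", "nltk", "tqdm")


def check_environment(packages=REQUIRED):
    """Map each top-level package name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None
            for name in packages}


def gpu_flags():
    """Return Chainer's CUDA and cuDNN flags, or None if Chainer is unusable."""
    if importlib.util.find_spec("chainer") is None:
        return None
    try:
        import chainer
    except Exception:
        # A broken or incompatible install is treated like a missing one.
        return None
    return {
        "cuda": bool(chainer.backends.cuda.available),
        "cudnn": bool(chainer.backends.cuda.cudnn_enabled),
    }


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
    print("GPU flags:", gpu_flags())
```

Run inside the activated ast environment, the script should list every package as found; a GPU flag reported as False usually means the CuPy wheel does not match the local CUDA installation.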
