A set of scripts for preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
Read about setting up Docker to run these scripts.
For more information about data requirements, see the data guide.
This library uses the task tool to run the more complex processes automatically. Once you've set up Kaldi Helpers, or if you are using the Docker image, where everything works out of the box, you can run the various pipeline tasks we've developed. You can read about these tasks here.