This repository demonstrates the procedure and utilities used to automatically process large amounts of speech data into a corpus for training speech processing models, for example for automatic speech recognition.
The examples used here are based on the corpus of Croatian parliamentary speech, available at: http://hdl.handle.net/11356/1494
- Nikola Ljubešić nikola.ljubesic@ijs.si
- Danijel Koržinek danijel@pja.edu.pl
- Peter Rupnik peter.rupnik@ijs.si
- Ivo-Pavao Jazbec ipjazbec@gmail.com
The contents of this repository are described in the paper:
TODO
All the details are described in the tutorial notebook.