Code for bootstrapping ASR datasets from parliamentary recordings and transcripts

clarinsi/parlaspeech

ParlaSpeech data preparation procedure

This repository demonstrates the procedure and utilities used to automatically process large amounts of speech data into a corpus suitable for training speech processing models, for example for automatic speech recognition (ASR).

The examples used here are based on the corpus of Croatian parliamentary speech, available at: http://hdl.handle.net/11356/1494

Authors

Citation

The contents of this repository are described in the paper:

TODO

Description

All the details are described in the tutorial notebook.
