armspeech - autoregressive probabilistic modelling for speech synthesis
This software provides a framework and example experiments for investigation into probabilistic modelling of speech for statistical speech synthesis. There is a particular focus on autoregressive models.
It grew out of experiments with autoregressive acoustic models for the author's PhD thesis, with the goal of allowing rapid prototyping of different models. As such it has been designed with productivity and flexibility in mind rather than runtime speed. It is very much a work in progress.
armspeech is hosted on GitHub. To obtain the latest source code using git:
git clone git://github.com/MattShannon/armspeech.git
Many of the formats used in armspeech are similar to those used in HTS. In particular armspeech expects HTS-style speech parameter and label files, for example as produced by the HTS demo. The default method for generating audio from the generated speech parameters is to use the STRAIGHT vocoder. By default the experiments use the CMU ARCTIC corpus, speaker slt.
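As a rough illustration of what an HTS-style label file looks like to a program, the sketch below reads an alignment into a list of segments. It is not part of armspeech. It assumes the usual HTK-style line layout of `start end label`, with times as integers in units of 100 ns; the function names and the example label are hypothetical.

```python
def parse_label_line(line):
    """Split one alignment line into (start, end, label).

    Assumes the HTK-style layout 'start end label ...', with start and end
    given as integers in units of 100 ns. Any extra fields are ignored.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected 'start end label', got %r" % line)
    return int(fields[0]), int(fields[1]), fields[2]


def read_alignment(path):
    """Read every non-blank line of a label file as a (start, end, label) tuple."""
    with open(path) as f:
        return [parse_label_line(line) for line in f if line.strip()]


def duration_seconds(start, end):
    """Convert a (start, end) pair in 100 ns units to a duration in seconds."""
    return (end - start) * 1e-7
```

For example, `parse_label_line("0 2050000 x^x-pau+ao=th")` returns the start time, the end time and the (here shortened, made-up) full-context label as a tuple.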
armspeech has the following dependencies:
- CMU ARCTIC corpus, processed into HTS-style speech parameter and label files (for example, by the HTS demo)
- if you want to generate audio, STRAIGHT vocoder (which requires MATLAB)
- if you want to generate audio, an appropriate HTS demo-style
- the codedep package for code-level dependency tracking
- python (>= 2.6) with recent numpy, scipy and matplotlib
- if using the HTS demo to generate the required files above (recommended), you should use the STRAIGHT version of the English speaker dependent training demo (which requires HTS, which in turn requires HTK). HTS 2.1 (for HTK 3.4) was used for testing.
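The python-side dependencies listed above can be checked before running anything. The following is a minimal sketch rather than a script shipped with armspeech; the function name is hypothetical, and it assumes the codedep package is importable under the module name `codedep`. It does not check the corpus, STRAIGHT, MATLAB, HTS or HTK.

```python
import sys

# Minimum python version given in the dependency list.
MIN_PYTHON = (2, 6)
# Python packages named in the dependency list.
# Assumption: codedep is importable as the module 'codedep'.
REQUIRED_MODULES = ["numpy", "scipy", "matplotlib", "codedep"]


def check_python_deps(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("python %d.%d or later is required" % min_python)
    for name in modules:
        try:
            # __import__ is used so the check also runs on old interpreters.
            __import__(name)
        except ImportError:
            problems.append("cannot import %s" % name)
    return problems
```

Calling `check_python_deps()` and printing each returned string gives a quick list of what is still missing.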
To set up this directory:
- add paths to an appropriate data directory and label directory in expt_hts_demo/experiment.py (by editing the strings starting '## TBA'). The data directory should contain .bap files. The label directory should contain .lab files, each of which is an alignment with full-context labels. Either phone-level or state-level alignments may be used (but note that some of the example experiments require state-level alignments).
- set mgcOrder (two places) and subLabels (one place) in expt_hts_demo/experiment.py (where the corpus objects are created) to have values appropriate for your corpus.
- if you want to generate audio, add an appropriate scripts/Config.pm file (e.g. copied from the HTS demo)
- if necessary make
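Before editing the paths into expt_hts_demo/experiment.py, it may help to confirm that the two directories look as described in the set-up steps. The sketch below is an illustration only, with hypothetical function names. It checks just the two file extensions named in the steps (.bap for the data directory, .lab for the label directory) and does not validate file contents.

```python
import os


def count_files_by_extension(directory, extensions):
    """Count the files in `directory` whose names end with each extension."""
    names = os.listdir(directory)
    return dict(
        (ext, sum(1 for name in names if name.endswith(ext)))
        for ext in extensions
    )


def check_corpus_dirs(data_dir, label_dir):
    """Return a list of problems found with the data and label directories."""
    problems = []
    # Each directory is paired with the extension the set-up steps mention for it.
    for directory, ext in [(data_dir, ".bap"), (label_dir, ".lab")]:
        if not os.path.isdir(directory):
            problems.append("%s is not a directory" % directory)
        elif count_files_by_extension(directory, [ext])[ext] == 0:
            problems.append("%s contains no %s files" % (directory, ext))
    return problems
```

An empty list from `check_corpus_dirs(data_dir, label_dir)` means both directories exist and each contains at least one file with the expected extension.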
You can then run example experiments using:
expt_hts_demo uses the armspeech python package as a library, but the latter is not intended to be a fully-fledged package suitable for separate
This may change as the code matures.
Please see the file License for details of the license and warranty for armspeech.
Parts of the code in this directory are based on the following software packages:
- GPML toolbox v3.0
- HTS demo (STRAIGHT version of the English speaker dependent training demo for HTS 2.1)
Please use the issue tracker to submit bug reports.
The author of armspeech is Matt Shannon.