Python wrapper for the Cambridge Hidden Markov Model Toolkit

A Python wrapper for the Hidden Markov Model ToolKit

HTK is a venerable open-source modelling tool, which helped generations of linguists make state-of-the-art models of speech. Once upon a time, anyway; you have no reason not to use NLTK or hmmlearn these days.

If, like me, you're forced to use it by some artificial constraint, you'll find that it is batch-only, requires hundreds of intermediate files for most processes, often takes 10 ordered command arguments, has appalling C99 error messages, crashes if it finds or does not find newlines in specific places, and has extremely dense docs. This wrapper makes using it a bit less painful.

The wrapper doesn't really reflect HTK's generality: it only builds speaker models from wavs. My use case took raw speech files from a pair of interlocutors plus Labb-Cat annotations for their conversation, built a model for each speaker, and then reported each speaker's overall 'accommodation' to their interlocutor over time.


  1. Install HTK and Python 3.
  2. Get speech data, annotate it.
  3. Point Configs.root at your files.
  4. Run like so: python statesPerHmm=3 vectorType=LPC iterations=10
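
For anyone wondering how `key=value` arguments like the ones in step 4 can be read, here is a minimal sketch. It is not the wrapper's actual code: `parse_settings` and `DEFAULTS` are hypothetical names, and the default values are assumptions borrowed from the example command.

```python
# Hypothetical defaults, copied from the example command in step 4.
DEFAULTS = {"statesPerHmm": 3, "vectorType": "LPC", "iterations": 10}


def parse_settings(argv, defaults):
    """Turn ['key=value', ...] into a dict, coercing each value to the default's type."""
    settings = dict(defaults)
    for token in argv:
        key, sep, value = token.partition("=")
        if not sep or key not in settings:
            raise ValueError(f"unrecognised argument: {token!r}")
        # int defaults are parsed as int, str defaults stay str.
        settings[key] = type(settings[key])(value)
    return settings


# In a real script the list would come from sys.argv[1:].
print(parse_settings(["statesPerHmm=5", "vectorType=MFCC"], DEFAULTS))
```

Rejecting unknown keys up front means a typo fails before any HTK job is launched, rather than somewhere inside a long batch run.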