_HMMs.jl

Hidden Markov Models and Autoregressive HMMs in Julia — RoLi lab.

Fit and decode discrete-observation HMMs and ARHMMs, with optional GMM-based discretization of continuous features (e.g. for behavioral state inference).


Part 1 — Package: implementation and use

1. What this package does

The package implements two discrete-observation Hidden Markov Models:

  • HMM — classic model: emission depends only on the current hidden state.
  • ARHMM — autoregressive variant: emission depends on the current hidden state and the previous observation, so the next symbol is allowed to depend on the last one (useful for behavior, brain activity, sequences, etc.).

You get parameter estimation (Baum–Welch), state decoding (Viterbi), log-likelihood and AIC, and a helper to turn continuous features into discrete symbols via a GMM (so you can plug in behavioral or other time series and run the HMM on a discrete sequence).

Graphical models:

[Figure: graphical models — HMM (emission depends only on state) vs. ARHMM (emission depends on state and previous observation)]

HMM vs ARHMM in detail:
  • HMM: $P(Y_t \mid S_t)$ — the observation at time $t$ depends only on the hidden state $S_t$.
  • ARHMM: $P(Y_t \mid S_t, Y_{t-1})$ — the observation at $t$ depends on the state $S_t$ and the previous observation $Y_{t-1}$, so the model can capture “what comes next” given “what just happened,” which is often useful for behavioral or symbolic sequences.
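To make the dependence concrete, here is a minimal standalone sketch (not the package's internals) of sampling from an ARHMM: each symbol is drawn from an emission table indexed by both the current hidden state and the previous symbol. The names `sample_arhmm` and `draw` are illustrative.

```julia
using Random

# Draw an index 1..length(p) from a discrete distribution p (clamped for float safety)
draw(p, rng) = min(searchsortedfirst(cumsum(p), rand(rng)), length(p))

function sample_arhmm(Pi, A, e, T; rng = Random.default_rng())
    K, N, _ = size(e)                 # e[i, y, l] = P(Y_t = y | S_t = i, Y_{t-1} = l)
    S = zeros(Int, T); Y = zeros(Int, T)
    S[1] = draw(Pi, rng)
    Y[1] = draw(e[S[1], :, 1], rng)   # no previous symbol at t = 1; condition on label 1 here
    for t in 2:T
        S[t] = draw(A[S[t-1], :], rng)          # hidden chain is the same as a plain HMM
        Y[t] = draw(e[S[t], :, Y[t-1]], rng)    # emission depends on state AND Y_{t-1}
    end
    return S, Y
end
```

With an emission table concentrated near $y = \ell$, the sampled symbol sequence tends to repeat its last value — exactly the “what comes next given what just happened” structure described above.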

2. Implementation: HMM and ARHMM

Both models share the same hidden chain: $K$ states, transition matrix $A$, initial distribution $\Pi$. They differ only in the emission and in how it is stored and updated. The dependency structure is shown in the graphical models above (HMM: $S_t \to Y_t$ only; ARHMM: $S_t \to Y_t$ and $Y_{t-1} \to Y_t$).

| Item | HMM | ARHMM |
|---|---|---|
| Emission | $\phi_{i,y} = P(Y_t = y \mid S_t = i)$ | $e_{i,y,\ell} = P(Y_t = y \mid S_t = i,\, Y_{t-1} = \ell)$ |
| Stored in struct | `ϕ` (matrix, $K \times N$) | `e` (3D array, $K \times N \times N$) |
| Baum–Welch / Viterbi | uses $\phi$ only | uses $e$ with current and previous observation |

Observation symbols are 1-based integers $1,\ldots,N$ (e.g. after GMM_labels, which maps GMM components to 1-based labels).

Implementation details (struct, algorithms):
  • The `HMM` struct holds: $K, N, T$; $\Pi$, $A$; the emission (`ϕ` for HMM, `e` for ARHMM); the sequences `X` (states) and `Y` (observations); and all Baum–Welch / Viterbi arrays (`α`, `β`, `c`, `α_hat`, `β_hat`, `γ`, `ξ`, `T1`, `T2`). The field `model` is the string `"HMM"` or `"ARHMM"` and selects which emission and which forward/backward/update formulas are used.
  • Baum–Welch is implemented in four steps: forward pass (with scaling `c`), backward pass, computation of $\gamma$ and $\xi$, then update of $\Pi$, $A$, and the emission. The ARHMM update uses counts over $(Y_t, Y_{t-1})$ per state; the HMM update uses counts over $Y_t$ per state.
  • Viterbi uses the usual recursion in log space; for ARHMM the emission term is $\log e_{j,Y_t,Y_{t-1}}$, for HMM it is $\log \phi_{j,Y_t}$.
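The scaled forward pass (the first Baum–Welch step for the plain-HMM case) can be sketched in a few lines. This is a standalone illustration with its own names (`forward_scaled`, local `alpha_hat`, `c`); the package stores these arrays inside the struct instead. With scaling, the log-likelihood falls out as the sum of the logs of the scaling factors.

```julia
# Scaled forward pass for a plain discrete-observation HMM.
# Pi: K initial probs, A: K×K transitions, phi: K×N emissions, Y: 1-based symbols.
function forward_scaled(Pi, A, phi, Y)
    K = length(Pi); T = length(Y)
    alpha_hat = zeros(K, T); c = zeros(T)
    alpha = Pi .* phi[:, Y[1]]                    # unnormalized α_1
    c[1] = sum(alpha); alpha_hat[:, 1] = alpha ./ c[1]
    for t in 2:T
        # α_t(j) = (Σ_i α̂_{t-1}(i) A[i,j]) ⋅ φ[j, Y_t], then renormalize
        alpha = (A' * alpha_hat[:, t-1]) .* phi[:, Y[t]]
        c[t] = sum(alpha)
        alpha_hat[:, t] = alpha ./ c[t]
    end
    loglik = sum(log.(c))                         # log P(Y_1..T)
    return alpha_hat, c, loglik
end
```

For a tiny sequence the result can be checked against brute-force enumeration of all $K^T$ state paths, which is a useful sanity test when implementing the scaling.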

3. Functionality: what the functions do

High-level usage: construct an HMM, set hmm.Y to your observation sequence, fit with hmm_fit, then decode with viterbi. Optionally use GMM_labels to build a discrete sequence from continuous features.

| Function / step | What it does |
|---|---|
| `HMM(K, N, T, model)` | Builds an HMM with $K$ states, $N$ symbols, and sequence length $T$. `model` is `"HMM"` or `"ARHMM"`. Initializes $\Pi$, $A$, and the emission (randomly) and allocates all internal arrays. |
| `hmm.Y` | You assign the integer observation sequence (1-based, length $T$). |
| `hmm_fit(hmm, n_iter)` | Runs Baum–Welch for `n_iter` iterations (forward → backward → $\gamma,\xi$ → update), then computes the log-likelihood and AIC and stores them in `hmm`. |
| `viterbi(hmm)` | Finds the most likely state sequence given `hmm.Y` and the current parameters; writes the path into `hmm.X`. |
| `GMM_labels(n_components, Y_continuous)` | Fits a Gaussian mixture to the continuous matrix `Y_continuous` (rows = time, columns = features) and returns 1-based discrete labels (one per time point) to use as `hmm.Y`. |
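The decoding step can also be sketched standalone. The function below is an illustrative log-space Viterbi with the ARHMM emission term $\log e_{j,Y_t,Y_{t-1}}$ (the package's `viterbi` operates on the struct instead; `viterbi_arhmm` and its argument names are assumptions for this sketch, including the convention of conditioning on label 1 at $t = 1$).

```julia
# Log-space Viterbi for an ARHMM.
# Pi: K initial probs, A: K×K transitions, e: K×N×N emissions e[i, y, l], Y: symbols.
function viterbi_arhmm(Pi, A, e, Y)
    K = length(Pi); T = length(Y)
    T1 = fill(-Inf, K, T)          # best log-score ending in state j at time t
    T2 = zeros(Int, K, T)          # backpointers
    T1[:, 1] = log.(Pi) .+ log.(e[:, Y[1], 1])   # t = 1: no previous symbol, use label 1
    for t in 2:T, j in 1:K
        scores = T1[:, t-1] .+ log.(A[:, j])
        T2[j, t] = argmax(scores)
        T1[j, t] = scores[T2[j, t]] + log(e[j, Y[t], Y[t-1]])  # ARHMM emission term
    end
    X = zeros(Int, T)
    X[T] = argmax(T1[:, T])
    for t in T-1:-1:1              # backtrack the most likely path
        X[t] = T2[X[t+1], t+1]
    end
    return X
end
```

For the plain HMM the only change is replacing the emission term with $\log \phi_{j,Y_t}$, exactly as the table in section 2 indicates.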

4. Installation

From the repo root (or after cloning the repo):

```
cd /path/to/_HMMs.jl
```

In Julia:

```julia
using Pkg
Pkg.activate(".")
Pkg.instantiate()
using _HMMs
```

The package has one dependency: PyCall (used by GMM_labels to call scikit-learn’s GaussianMixture). If you don’t use GMM_labels, the rest of the code is pure Julia.

Dev install and troubleshooting:
  • Development install from another project:
    Pkg.develop(path="/path/to/_HMMs.jl") then using _HMMs.

  • PyCall / scikit-learn:
    GMM_labels needs a Python installation with scikit-learn. If PyCall fails, run Pkg.build("PyCall") or point PyCall to the Conda environment where scikit-learn is installed.

  • 1-based observations:
    All observation indices and state indices in the package are 1-based (the Julia convention). GMM_labels returns labels in $1,\ldots,N$ on purpose.


Part 2 — Example usage of ARHMM for inferring behavioral motifs in freely swimming larval zebrafish

Freely behaving larval zebrafish spontaneously transition through multiple behavioral states. These states are organized at different temporal scales: sleep and wake cycles at longer time scales; napping or sleep periods nested within days; exploitation vs. exploration at intermediate scales; and, at faster scales, movement vs. rest or swim bout vs. no swim bout.

Each state is associated with characteristic behaviors. The more ongoing behavioral signatures we record, the better we can tell states apart.

We therefore recorded freely swimming larval zebrafish at high spatiotemporal resolution in a featureless arena (30 mm × 30 mm). We extracted movement-related parameters such as speed/dispersal (how much the animal moves in a time window), tail/heading, and eye movement (left and right eye angles).

The figure below illustrates the setup and the kind of signals we use: trajectory and arena (panel A), a schematic of the fish and eye-angle measurement (B), dispersal over time (C), and left/right eye angles (D).

[Figure: zebrafish behavior — trajectory and arena, dispersal over time, left/right eye angles]

We assumed that a hidden process was responsible for the distinct signatures in these observed behavioral parameters. To infer that process, we discretized the continuous signals (via a Gaussian mixture) and fitted a 5-state ARHMM to three derived observables: dispersal, heading velocity (smoothed), and saccade rate—where saccade rate is obtained by scoring conjugate saccades in the eye-angle traces.
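As a toy illustration of the discretization step: the package itself calls scikit-learn’s GaussianMixture through PyCall, but the essential output — mapping each time point’s continuous feature vector to a 1-based label — can be mimicked with a simple nearest-centroid rule. The function `nearest_centroid_labels` below is a hedged stand-in, not the package’s GMM_labels.

```julia
# Assign each row of Yc (T × d, rows = time points) to the nearest of n centroids
# (n × d), returning 1-based labels suitable for use as hmm.Y.
function nearest_centroid_labels(Yc, centroids)
    T = size(Yc, 1); n = size(centroids, 1)
    labels = zeros(Int, T)
    for t in 1:T
        # squared Euclidean distance to each centroid
        d2 = [sum(abs2, Yc[t, :] .- centroids[m, :]) for m in 1:n]
        labels[t] = argmin(d2)
    end
    return labels
end
```

A real GMM additionally weights distances by component covariances and mixing proportions, which matters when the behavioral feature clusters have very different shapes or sizes.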

The ARHMM assigns each time point to one of five hidden states. The figure below shows the same input channels (dispersal, saccade rate, eye angles) and the inferred state sequence (bottom panel) over time.

The model recovers extended bouts of distinct activity and smooth transitions between them, indicating that the ARHMM is doing a good job of parsing behavior into interpretable, state-like motifs.

[Figure: ARHMM-inferred states from zebrafish behavioral signals — input channels and inferred state sequence over time]

This pipeline—from raw tracking and eye/behavioral signals to discrete observables (GMM) and then to state inference (ARHMM)—is implemented in the accompanying notebook:

notebooks/Autoregressive_HMM_sleep_substate.ipynb
