# Hidden Markov Model - Generalized Linear Model (HMM GLM) Tutorial Notebook #

This notebook shows how to use HMM-GLMs with StateSpaceDynamics.jl. HMM-GLMs extend HMMs so that each observation depends not only on the latent state but also on an input. They can be viewed as switching regression models, which is the name used throughout StateSpaceDynamics.jl.

## 0: Environment Setup

In [1]:
using Pkg
using LinearAlgebra
using Plots
using Distributions
using Random
using StateSpaceDynamics
using StatsBase

const SSD = StateSpaceDynamics

StateSpaceDynamics

## 1: Introduction to HMM-GLMs ##

For a detailed explanation of the core Hidden Markov Model (HMM) concepts, including the hidden state evolution and transition matrix, please refer to the notebook `hmm.ipynb`. The primary difference between a standard HMM and an HMM-GLM (Hidden Markov Model with Generalized Linear Model) lies in the emission model.

### Emission Model in HMM-GLMs

In an HMM-GLM, the emissions (observations) are not drawn from a fixed per-state distribution (e.g., Gaussian or categorical). Instead, they follow a generalized linear model (GLM): the mean of the observed data $x_t$ is a function of a linear combination of input features, with coefficients that depend on the hidden state $z_t$.

Specifically:
$x_t = g^{-1}(X_t \beta_{z_t}) + \epsilon$

$X_t$: Input feature vector at time $t$.

$\beta_{z_t}$: Regression coefficients for the hidden state $z_t$.

$g^{-1}(\cdot)$: Inverse link function appropriate for the distribution of $x_t$ (e.g., logistic for binary outcomes, identity for continuous outcomes).

$\epsilon$: Noise term, often assumed to follow a Gaussian distribution.
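
For the Gaussian regression emissions used later in this notebook, the definitions above can be written as a conditional distribution. This is a sketch under two assumptions: the link is the identity, and an intercept is handled by prepending a constant 1 to the input. The symbols $\tilde{X}_t$ (augmented input) and $\Sigma_k$ (per-state noise covariance) are notation introduced here for illustration.

$$
x_t \mid z_t = k \;\sim\; \mathcal{N}\!\left(\tilde{X}_t \beta_k,\; \Sigma_k\right), \qquad \tilde{X}_t = \begin{bmatrix} 1 & X_t \end{bmatrix}
$$

With two input features and two outputs, $\tilde{X}_t$ is $1 \times 3$, $\beta_k$ is $3 \times 2$, and $\Sigma_k$ is $2 \times 2$.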

The HMM-GLM can be conceptualized as a switching regression model, where the coefficients of the regression change discretely over time.
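
To make the switching-regression view concrete, here is a minimal sketch of the generative process written in plain Julia, without StateSpaceDynamics.jl. The helpers `sample_categorical` and `simulate_switching_regression` are hypothetical names introduced for illustration, and the transition matrix `A` and initial distribution `p_init` are assumed values. The sketch assumes an identity link, Gaussian noise, and an intercept prepended to each input row.

```julia
using LinearAlgebra
using Random

# Hypothetical helper: draw an index k with probability p[k].
function sample_categorical(rng, p)
    u, c = rand(rng), 0.0
    for (k, pk) in enumerate(p)
        c += pk
        u <= c && return k
    end
    return length(p)  # guard against floating-point round-off
end

# Hypothetical helper, not a StateSpaceDynamics.jl function.
# A: K×K transition matrix (rows sum to 1), p_init: initial state distribution,
# βs[k]: (1 + input_dim)×output_dim coefficients, Σs[k]: output covariance,
# X: T×input_dim input matrix (one row per time step).
function simulate_switching_regression(rng, A, p_init, βs, Σs, X)
    T = size(X, 1)
    output_dim = size(βs[1], 2)
    chol = [cholesky(Symmetric(Σ)).L for Σ in Σs]
    z = Vector{Int}(undef, T)
    Y = Matrix{Float64}(undef, T, output_dim)
    for t in 1:T
        # Initial state comes from p_init, later states from the row of A
        # that belongs to the previous state.
        p = t == 1 ? p_init : A[z[t - 1], :]
        z[t] = sample_categorical(rng, p)
        x_aug = vcat(1.0, X[t, :])        # prepend the intercept term
        μ = βs[z[t]]' * x_aug             # identity link: the mean is linear
        Y[t, :] = μ + chol[z[t]] * randn(rng, output_dim)
    end
    return z, Y
end

rng = MersenneTwister(1234)
A = [0.9 0.1; 0.2 0.8]     # assumed transition matrix, for illustration only
p_init = [0.5, 0.5]        # assumed initial distribution
βs = [[1.0 0.8; -0.5 0.3; -0.2 0.4], [0.0 -5.0; -5.0 0.0; 4.0 -2.0]]
Σs = [[0.1 0.01; 0.01 0.2], [0.01 0.01; 0.01 0.05]]
X = rand(rng, 100, 2)
z, Y = simulate_switching_regression(rng, A, p_init, βs, Σs, X)
```

Here `z` holds the hidden state at each time step and `Y` holds the corresponding observations, so each row of `Y` is generated by whichever regression is active at that step.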

### Example

In the following example we will demonstrate how to create, sample from, and fit a Switching Gaussian Regression model using StateSpaceDynamics.jl.

In [2]:
# Common dimensions for both models
input_dim = 2
output_dim = 2

# Parameters for the first Gaussian regression model

# NOTE: these β matrices might need to be transposed

β1 = [1.0 0.8; -0.5 0.3; -0.2 0.4]  # Coefficients (3x2 matrix: 1 intercept + 2 features for 2 outputs)
Σ1 = [0.1 0.01; 0.01 0.2]  # Covariance matrix (2x2 matrix, since output_dim is 2)
include_intercept1 = true
λ1 = 0.01

# Instantiate the first Gaussian regression
regression1 = SSD.GaussianRegressionEmission(GaussianRegression(input_dim, output_dim, β1, Σ1, include_intercept1, λ1))


# Parameters for the second Gaussian regression model
β2 = [0.0 -5.0; -5.0 0.0; 4.0 -2.0]  # Coefficients (3x2 matrix: 1 intercept + 2 features for 2 outputs)
Σ2 = [0.01 0.01; 0.01 0.05]  # Covariance matrix (2x2 matrix, since output_dim is 2)
include_intercept2 = true
λ2 = 0.01

# Instantiate the second Gaussian regression
regression2 = SSD.GaussianRegressionEmission(GaussianRegression(input_dim, output_dim, β2, Σ2, include_intercept2, λ2))


# Initialize the Switching Gaussian Regression Model
# Use the dimensions defined above; the default emissions are replaced below
model = SSD.SwitchingGaussianRegression(K=2, input_dim=input_dim, output_dim=output_dim)

# Assign the prebuilt regression models to each state
model.B[1] = regression1
model.B[2] = regression2

n=1000


1000

In [6]:
n = 100
Φ = rand(n, input_dim)  # random inputs: one row per time step, one column per feature
true_labels, data = SSD.sample(model, Φ, n=n)

([2, 2, 1, 1, 1, 1, 1, 1, 1, 2  …  1, 1, 1, 1, 1, 1, 2, 2, 2, 2], [1.878923042514483 -6.300106983540043; -1.4047047707021907 -5.235923612075178; … ; 2.2143165284037787 -6.872844875433888; -2.512262998131283 -5.595872913649199])
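
Because the sampled labels follow a Markov chain, one quick sanity check is to count how often each state is followed by each other state. The sketch below uses a hypothetical helper, `empirical_transitions`, which is not part of StateSpaceDynamics.jl, and applies it to the first ten labels printed above.

```julia
# Hypothetical helper: count transitions k -> j in a label sequence and
# normalize each row to obtain an empirical transition matrix.
function empirical_transitions(labels, K)
    counts = zeros(Float64, K, K)
    for t in 2:length(labels)
        counts[labels[t - 1], labels[t]] += 1
    end
    rowsums = sum(counts, dims=2)
    return counts ./ max.(rowsums, 1)  # rows with no transitions stay all-zero
end

labels = [2, 2, 1, 1, 1, 1, 1, 1, 1, 2]  # first ten labels from the sample above
P = empirical_transitions(labels, 2)
```

In the notebook itself, `empirical_transitions(true_labels, 2)` would give the estimate for the full sequence. With only 100 samples the estimate is noisy, so it should be read as a rough check.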

In [7]:
data

100×2 Matrix{Float64}:
  1.87892   -6.30011
 -1.4047    -5.23592
  0.341696   0.772185
  0.639606   1.68005
  0.117925   0.911961
  0.658826   0.315069
  1.60341    1.69105
  1.09186    1.54226
  0.760772   0.480961
 -0.633124  -5.75163
  ⋮         
  1.20512    1.17742
  0.144288   0.169662
  0.257391   1.47979
  0.926707   0.640887
  1.34053    0.644708
 -0.475953  -6.58688
 -3.25821   -5.33169
  2.21432   -6.87284
 -2.51226   -5.59587
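
If the state labels were known, each state's coefficients could be estimated by ordinary least squares on the rows assigned to that state. The self-contained sketch below illustrates this on synthetic data. The label pattern, noise level, and variable names are assumptions made for illustration, and this is not how StateSpaceDynamics.jl fits the model, since in practice the labels are hidden and have to be inferred.

```julia
using Random

rng = MersenneTwister(42)
T = 200
X = rand(rng, T, 2)
X_aug = hcat(ones(T), X)  # intercept column plus 2 features
βs = [[1.0 0.8; -0.5 0.3; -0.2 0.4], [0.0 -5.0; -5.0 0.0; 4.0 -2.0]]

# Hypothetical known labels: alternating blocks of 10 time steps
labels = [isodd(cld(t, 10)) ? 1 : 2 for t in 1:T]

# Generate outputs from the regression that is active at each time step
Y = similar(X_aug, T, 2)
for t in 1:T
    Y[t, :] = βs[labels[t]]' * X_aug[t, :] + 0.01 .* randn(rng, 2)
end

# Ordinary least squares restricted to the rows assigned to each state
β_hat = [X_aug[labels .== k, :] \ Y[labels .== k, :] for k in 1:2]
```

Each `β_hat[k]` is a 3×2 matrix that should be close to `βs[k]` when the noise is small, which shows that the hard part of fitting a switching regression is recovering the state assignments.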