# Markov Regime-Switching Models

This notebook demonstrates the Markov switching module in `regimes`, which provides
probabilistic regime detection for time series. Unlike structural break tests that identify
discrete break points, Markov switching models estimate the *probability* of being in each
regime at every point in time.

**Topics covered:**
- MarkovRegression: switching intercept/mean models
- MarkovAR: autoregressive models with regime-switching coefficients
- MarkovADL: autoregressive distributed lag models with switching
- Model integration: `from_model()` and `.markov_switching()` convenience methods
- Visualization: four plotting methods on fitted results, plus `plot_ic()` for regime selection
- Restricted transitions and non-recurring regimes
- Regime number selection (IC-based and sequential LRT)
- Sequential restriction testing
- End-to-end workflow

## Setup

In [None]:
import sys
from pathlib import Path

sys.path.insert(0, str(Path('..') / 'src'))

In [None]:
import warnings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import regimes as rg

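# Silence all UserWarning and RuntimeWarning messages for cleaner output. This also hides
# statsmodels ConvergenceWarning (a UserWarning subclass), so remove it when debugging a fit.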
warnings.filterwarnings('ignore', category=UserWarning)
warnings.filterwarnings('ignore', category=RuntimeWarning)

print(f'regimes version: {rg.__version__}')

## Generate Regime-Switching Data

We create three datasets with clear regime structure to demonstrate different model types.

### Mean-Switching Data (for MarkovRegression)

Two regimes with different means: a 'low' regime (mu_0 = 0) and a 'high' regime (mu_1 = 4).

In [None]:
rng = np.random.default_rng(42)
n = 300

# Two well-separated regimes
y_mean = np.concatenate([
    rng.normal(0.0, 1.0, 100),   # Regime 0: low mean
    rng.normal(4.0, 1.0, 100),   # Regime 1: high mean
    rng.normal(0.0, 1.0, 100),   # Back to regime 0
])

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(y_mean)
for bp in [100, 200]:
    ax.axvline(bp, color='red', linestyle='--', alpha=0.5)
ax.set_xlabel('Observation')
ax.set_ylabel('y')
ax.set_title('Mean-Switching Data (true breaks at 100, 200)')
plt.tight_layout()
plt.show()
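
The series above is built by concatenating blocks with fixed break dates, which keeps the true regimes known exactly. For comparison, here is a minimal sketch of drawing the regime path from an actual two-state Markov chain. It uses only the standard library and is independent of `regimes`; the function name `simulate_markov_means` and the staying probabilities are illustrative assumptions.

```python
import random

def simulate_markov_means(n, p_stay, means, sd=1.0, seed=0):
    """Draw a two-state Markov chain and mean-switching Gaussian observations."""
    rng_local = random.Random(seed)
    state = 0  # start in regime 0
    states, y = [], []
    for _ in range(n):
        states.append(state)
        y.append(rng_local.gauss(means[state], sd))
        # Stay with probability p_stay[state], otherwise switch to the other regime
        if rng_local.random() > p_stay[state]:
            state = 1 - state
    return states, y

states_sim, y_sim = simulate_markov_means(300, p_stay=(0.98, 0.98), means=(0.0, 4.0))
n_switches = sum(a != b for a, b in zip(states_sim, states_sim[1:]))
print(f'{n_switches} regime switches in {len(y_sim)} observations')
```

With staying probabilities of 0.98, regimes last 50 observations on average, so break dates vary from seed to seed instead of falling at 100 and 200.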

### AR Data with Switching Coefficients (for MarkovAR)

An AR(1) process in which only the intercept shifts between regimes; the AR coefficient (`phi = 0.6`) is the same in both.

In [None]:
n_ar = 300
phi = 0.6  # stable AR coefficient across regimes

y_ar = np.zeros(n_ar)
for t in range(1, n_ar):
    c = 0.0 if t < 100 or t >= 200 else 3.0
    y_ar[t] = c + phi * y_ar[t - 1] + rng.standard_normal()

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(y_ar)
for bp in [100, 200]:
    ax.axvline(bp, color='red', linestyle='--', alpha=0.5)
ax.set_xlabel('Observation')
ax.set_ylabel('y')
ax.set_title('AR(1) with Switching Intercept')
plt.tight_layout()
plt.show()
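
The level the series settles at in each regime follows from the stationary AR(1) mean, c / (1 - phi). A short check using the intercepts and AR coefficient from the simulation above (the variable names here are illustrative):

```python
ar_coef = 0.6                              # same value as phi in the simulation
intercepts = {'low': 0.0, 'high': 3.0}     # regime intercepts used above

# Long-run mean of a stationary AR(1): E[y] = c / (1 - phi)
long_run_means = {name: c / (1.0 - ar_coef) for name, c in intercepts.items()}
for name, mean in long_run_means.items():
    print(f'{name} regime: long-run mean = {mean:.2f}')
```

The intercept shift of 3 therefore moves the level of the series by more than 3, because the AR term amplifies it.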

### Regression Data with Switching Coefficients (for MarkovADL / regression with exog)

A regression where both intercept and slope change between regimes.

In [None]:
n_reg = 300
x_reg = rng.standard_normal(n_reg)

y_reg = np.zeros(n_reg)
y_reg[:150] = 1.0 + 0.5 * x_reg[:150] + rng.normal(0, 0.5, 150)
y_reg[150:] = 3.0 + 1.5 * x_reg[150:] + rng.normal(0, 0.5, 150)

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(y_reg)
ax.axvline(150, color='red', linestyle='--', alpha=0.5)
ax.set_xlabel('Observation')
ax.set_ylabel('y')
ax.set_title('Regression with Switching Intercept and Slope')
plt.tight_layout()
plt.show()

## MarkovRegression

The simplest Markov switching model: a regression where the intercept (and optionally
the variance) can switch between regimes.

### Basic Fit

In [None]:
model = rg.MarkovRegression(y_mean, k_regimes=2)
results = model.fit(search_reps=5)
print(results.summary())

### Regime Parameters

Each regime has its own set of estimated parameters:

In [None]:
for regime in range(results.k_regimes):
    print(f'Regime {regime}: {results.regime_params[regime]}')

print(f'\nTransition matrix:\n{results.regime_transition}')
print(f'Expected durations: {results.expected_durations}')
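
Expected durations are tied to the diagonal of the transition matrix: a regime that persists with probability p_ii lasts 1 / (1 - p_ii) periods on average. A minimal sketch with a hypothetical matrix (the numbers are not estimates from this notebook, and `expected_durations_from` is an illustrative helper, not part of `regimes`):

```python
def expected_durations_from(P):
    """Expected regime durations 1 / (1 - p_ii) from the diagonal of P."""
    out = []
    for i in range(len(P)):
        stay = P[i][i]
        # An absorbing regime (p_ii = 1) has infinite expected duration
        out.append(float('inf') if stay >= 1.0 else 1.0 / (1.0 - stay))
    return out

# Hypothetical transition matrix with P[i][j] = P(S_t = i | S_{t-1} = j)
P_example = [
    [0.95, 0.10],
    [0.05, 0.90],
]
durations = expected_durations_from(P_example)
print(durations)
```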

### Most Likely Regime and Regime Periods

In [None]:
print(f'Most likely regime (first 20): {results.most_likely_regime[:20]}')

print('\nRegime periods:')
for regime, start, end in results.regime_periods():
    print(f'  Regime {regime}: observations {start}--{end} ({end - start} obs)')
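
Regime periods are just the contiguous runs in the most-likely-regime sequence. A minimal sketch of that grouping, assuming half-open `(regime, start, end)` tuples where `end` is exclusive; whether `regime_periods()` in `regimes` uses an inclusive or exclusive end is not stated in this notebook, and `periods_from_path` is an illustrative name.

```python
from itertools import groupby

def periods_from_path(path):
    """Collapse a regime sequence into (regime, start, end) runs, end exclusive."""
    periods, start = [], 0
    for regime, run in groupby(path):
        length = sum(1 for _ in run)
        periods.append((regime, start, start + length))
        start += length
    return periods

example_path = [0, 0, 0, 1, 1, 0]
example_periods = periods_from_path(example_path)
print(example_periods)
```

With an exclusive end, `end - start` is the number of observations in the run; with an inclusive end it would undercount by one.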

### Information Criteria and Confidence Intervals

In [None]:
print(f'AIC: {results.aic:.1f}')
print(f'BIC: {results.bic:.1f}')
print(f'HQIC: {results.hqic:.1f}')
print(f'Log-likelihood: {results.llf:.1f}')

print('\nConfidence intervals:')
ci = results.conf_int()
print(ci)
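
The three criteria differ only in how they penalize the parameter count. A sketch of the textbook formulas, assuming `regimes` follows the usual definitions; the log-likelihood, parameter count, and sample size below are placeholder inputs, not results from this notebook.

```python
import math

def information_criteria(llf, n_params, n_obs):
    """AIC, BIC, and HQIC from a log-likelihood (smaller is better)."""
    aic = -2.0 * llf + 2.0 * n_params
    bic = -2.0 * llf + n_params * math.log(n_obs)
    hqic = -2.0 * llf + 2.0 * n_params * math.log(math.log(n_obs))
    return {'aic': aic, 'bic': bic, 'hqic': hqic}

# Placeholder inputs chosen only to show the relative size of the penalties
ic_example = information_criteria(llf=-100.0, n_params=3, n_obs=100)
print(ic_example)
```

For samples of this size the penalties are ordered AIC < HQIC < BIC, so BIC tends to favor the most parsimonious model.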

### Results as DataFrame

In [None]:
results.to_dataframe()

### Switching Variance

Allow the error variance to differ between regimes:

In [None]:
model_sv = rg.MarkovRegression(y_mean, k_regimes=2, switching_variance=True)
results_sv = model_sv.fit(search_reps=5)

print('With switching variance:')
for regime in range(results_sv.k_regimes):
    print(f'  Regime {regime}: {results_sv.regime_params[regime]}')

print(f'\nAIC (constant var): {results.aic:.1f}')
print(f'AIC (switching var): {results_sv.aic:.1f}')

## MarkovAR

Markov switching autoregressive models, where the intercept, AR coefficients, and/or
variance can switch between regimes.

In [None]:
ar_model = rg.MarkovAR(y_ar, k_regimes=2, order=1)
ar_results = ar_model.fit(search_reps=5)
print(ar_results.summary())

### AR Parameters by Regime

In [None]:
for regime in range(ar_results.k_regimes):
    print(f'Regime {regime}:')
    print(f'  Parameters: {ar_results.regime_params[regime]}')
    print(f'  AR coefficients: {ar_results.ar_params[regime]}')

print(f'\nExpected durations: {ar_results.expected_durations}')

### Non-Switching AR Coefficients

If the AR dynamics are stable but only the intercept switches, set `switching_ar=False`:

In [None]:
ar_model_fixed = rg.MarkovAR(y_ar, k_regimes=2, order=1, switching_ar=False)
ar_results_fixed = ar_model_fixed.fit(search_reps=5)

print('With switching_ar=False:')
for regime in range(ar_results_fixed.k_regimes):
    print(f'  Regime {regime}: {ar_results_fixed.regime_params[regime]}')

print(f'\nAIC (switching AR): {ar_results.aic:.1f}')
print(f'AIC (fixed AR):     {ar_results_fixed.aic:.1f}')

## MarkovADL

Autoregressive distributed lag models with Markov switching. These combine AR dynamics
with exogenous regressors, and both can switch between regimes.

In [None]:
# Fit an ADL model to the regression data. The simulated y_reg has no AR
# dynamics of its own; the AR term is added by the model.
adl_model = rg.MarkovADL(
    y_reg, exog=x_reg.reshape(-1, 1),
    k_regimes=2, ar_order=1, exog_lags=1,
)
adl_results = adl_model.fit(search_reps=5)
print(adl_results.summary())
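
An ADL model regresses y_t on its own lags plus current and lagged values of the regressor. A minimal sketch of the implied design matrix, assuming `ar_order=1` adds y_{t-1} and `exog_lags=1` adds both x_t and x_{t-1}. How `MarkovADL` actually interprets `exog_lags` (for example, whether the contemporaneous x_t is included) is an assumption here, and `adl_design` is an illustrative helper.

```python
def adl_design(y, x, ar_order=1, exog_lags=1):
    """Build targets and regressor rows [1, y lags, x_t, x lags] for an ADL model."""
    start = max(ar_order, exog_lags)  # first t with all lags available
    targets, rows = [], []
    for t in range(start, len(y)):
        row = [1.0]                                              # intercept
        row += [y[t - lag] for lag in range(1, ar_order + 1)]    # y_{t-1}, ...
        row += [x[t - lag] for lag in range(0, exog_lags + 1)]   # x_t, x_{t-1}, ...
        rows.append(row)
        targets.append(y[t])
    return targets, rows

targets_demo, rows_demo = adl_design([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
print(rows_demo[0])
```

Lagging costs `max(ar_order, exog_lags)` observations at the start of the sample, which is why fitted ADL results are usually shorter than the input series.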

### Regime Parameters for ADL

In [None]:
for regime in range(adl_results.k_regimes):
    print(f'Regime {regime}: {adl_results.regime_params[regime]}')

## Model Integration

Markov switching models integrate with `regimes` OLS, AR, and ADL models through
two mechanisms:

1. **`from_model()`** class methods for explicit construction
2. **`.markov_switching()`** convenience method on model objects

### From OLS Model

In [None]:
# Create an OLS model
ols = rg.OLS(y_mean, has_constant=True)

# Convert to Markov switching via from_model()
ms_from_ols = rg.MarkovRegression.from_model(ols, k_regimes=2)
results_from_ols = ms_from_ols.fit(search_reps=5)

print(f'Regimes detected: {results_from_ols.k_regimes}')
for r in range(results_from_ols.k_regimes):
    print(f'  Regime {r}: {results_from_ols.regime_params[r]}')

### Convenience Method

In [None]:
# One-liner: create OLS and get Markov switching results
ms_results = rg.OLS(y_mean, has_constant=True).markov_switching(k_regimes=2)

print(f'Type: {type(ms_results).__name__}')
print(f'Log-likelihood: {ms_results.llf:.1f}')

### From AR Model

In [None]:
# AR model -> MarkovAR
ar = rg.AR(y_ar, lags=1)
ms_ar = rg.MarkovAR.from_model(ar, k_regimes=2)
ms_ar_results = ms_ar.fit(search_reps=5)

print(f'MarkovAR from AR model: {ms_ar_results.k_regimes} regimes')
print(f'AR order: {ms_ar_results.order}')

## Visualization

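The Markov module includes five specialized plotting functions. Four are methods on the
fitted results object and are shown below; the fifth, `plot_ic()`, belongs to the
regime-selection results and appears in the Regime Number Selection section.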

### Smoothed Probabilities

P(S_t = j | Y_1, ..., Y_T) for each regime -- the probability of being in each regime
at each point in time, using the full sample information.

In [None]:
fig, axes = results.plot_smoothed_probabilities()
plt.show()
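
Smoothed probabilities are conventionally computed with a forward filtering pass followed by a backward smoothing pass. A minimal pure-Python sketch of the Hamilton filter and Kim smoother for a two-regime mean-switching model with known parameters. It illustrates the recursions only and is not the `regimes` implementation; the parameter values and function names are assumptions, and `P[i][j]` means P(S_t = i | S_{t-1} = j).

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def hamilton_filter(y, means, sd, P, init):
    """Return predicted P(S_t | y_1..y_{t-1}) and filtered P(S_t | y_1..y_t)."""
    k = len(means)
    predicted, filtered = [], []
    prev = list(init)
    for obs in y:
        # Prediction step: push last period's filtered probabilities through P
        pred = [sum(P[i][j] * prev[j] for j in range(k)) for i in range(k)]
        # Update step: weight by each regime's likelihood and renormalize
        joint = [pred[i] * normal_pdf(obs, means[i], sd) for i in range(k)]
        total = sum(joint)
        prev = [v / total for v in joint]
        predicted.append(pred)
        filtered.append(prev)
    return predicted, filtered

def kim_smoother(predicted, filtered, P):
    """Backward pass giving P(S_t | y_1..y_T) for every t."""
    k = len(P)
    smoothed = [None] * len(filtered)
    smoothed[-1] = filtered[-1]
    for t in range(len(filtered) - 2, -1, -1):
        ratio = [smoothed[t + 1][i] / predicted[t + 1][i] for i in range(k)]
        smoothed[t] = [
            filtered[t][j] * sum(P[i][j] * ratio[i] for i in range(k))
            for j in range(k)
        ]
    return smoothed

demo_rng = random.Random(0)
y_demo = ([demo_rng.gauss(0.0, 1.0) for _ in range(50)]
          + [demo_rng.gauss(4.0, 1.0) for _ in range(50)]
          + [demo_rng.gauss(0.0, 1.0) for _ in range(50)])
P_demo = [[0.98, 0.02], [0.02, 0.98]]
predicted_demo, filtered_demo = hamilton_filter(
    y_demo, means=[0.0, 4.0], sd=1.0, P=P_demo, init=[0.5, 0.5]
)
smoothed_demo = kim_smoother(predicted_demo, filtered_demo, P_demo)
print(f'P(regime 1) at t=75: {smoothed_demo[75][1]:.3f}')
```

Filtered probabilities use data only up to time t, while smoothed probabilities also use later observations, which is why smoothed paths switch more sharply around a break.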

### Regime Shading

The time series with regime-colored background shading, where alpha (opacity)
is proportional to regime probability.

In [None]:
fig, ax = results.plot_regime_shading(y=y_mean)
plt.show()

### Transition Matrix

Heatmap of the estimated transition matrix P(S_t = i | S_{t-1} = j).

In [None]:
fig, ax = results.plot_transition_matrix()
plt.show()
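
Under the convention P(S_t = i | S_{t-1} = j), each column of the matrix sums to one, and the long-run share of time spent in each regime is the vector pi that satisfies pi = P pi. A minimal sketch that checks the column sums and finds pi by repeated multiplication; the matrix is hypothetical and `stationary_distribution` is an illustrative helper, not a `regimes` function.

```python
def stationary_distribution(P, n_iter=500):
    """Long-run regime probabilities for a column-stochastic transition matrix."""
    k = len(P)
    for j in range(k):
        col_sum = sum(P[i][j] for i in range(k))
        if abs(col_sum - 1.0) > 1e-9:
            raise ValueError(f'column {j} sums to {col_sum}, expected 1')
    pi = [1.0 / k] * k  # start from a uniform guess
    for _ in range(n_iter):
        pi = [sum(P[i][j] * pi[j] for j in range(k)) for i in range(k)]
    return pi

# Hypothetical matrix: P[i][j] = P(S_t = i | S_{t-1} = j)
P_cols = [
    [0.9, 0.2],
    [0.1, 0.8],
]
pi_demo = stationary_distribution(P_cols)
print(pi_demo)
```

Passing a row-stochastic matrix by mistake raises an error here unless its columns also happen to sum to one, which is a cheap way to catch a convention mix-up.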

### Parameter Time Series

Regime-specific parameters plotted over time, either as step functions that follow the most likely regime or as probability-weighted averages.

In [None]:
# Step-function: shows parameter value for the most likely regime
fig, axes = results.plot_parameter_time_series()
plt.show()

In [None]:
# Probability-weighted: smooth blend based on regime probabilities
fig, axes = results.plot_parameter_time_series(weighted=True)
plt.show()
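
The two views differ only in how regime probabilities are turned into a single parameter value at each date. A minimal sketch of both calculations, using hypothetical probabilities and regime means; `parameter_paths` is an illustrative helper, and the exact computation inside `plot_parameter_time_series()` is assumed rather than confirmed.

```python
def parameter_paths(probs, regime_values):
    """Return (step, weighted) parameter paths from per-period regime probabilities."""
    step, weighted = [], []
    for p_t in probs:
        # Step path: value of the single most likely regime
        most_likely = max(range(len(p_t)), key=p_t.__getitem__)
        step.append(regime_values[most_likely])
        # Weighted path: probability-weighted average across regimes
        weighted.append(sum(p * v for p, v in zip(p_t, regime_values)))
    return step, weighted

probs_example = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]  # hypothetical probabilities
step_path, weighted_path = parameter_paths(probs_example, regime_values=[0.0, 4.0])
print(step_path)
print(weighted_path)
```

The weighted path moves gradually when the probabilities are uncertain, while the step path jumps as soon as one regime becomes the most likely.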

### AR Model Visualization

The same plots work for MarkovAR results:

In [None]:
fig, axes = ar_results.plot_smoothed_probabilities(title='MarkovAR: Smoothed Probabilities')
plt.show()

fig, ax = ar_results.plot_regime_shading(y=y_ar, title='MarkovAR: Regime Shading')
plt.show()

## Restricted Transitions

Markov switching models allow arbitrary restrictions on the transition matrix.
The most common use case is **non-recurring regimes** (structural breaks),
where transitions can only go forward through regimes.

### Non-Recurring Regimes (Structural Breaks)

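A non-recurring model restricts the transition matrix so the process can never return
to a previous regime. This is equivalent to a structural break model estimated via
maximum likelihood. Because `y_mean` returns to its low-mean state after observation 200,
a two-regime non-recurring model can capture only one of its two breaks.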

In [None]:
from regimes.markov.restricted import RestrictedMarkovRegression

# Non-recurring 2-regime model (one structural break)
nr_model = RestrictedMarkovRegression.non_recurring(y_mean, k_regimes=2)
nr_results = nr_model.fit(search_reps=5)

print(f'Restricted transitions: {nr_results.restricted_transitions}')
print(f'Transition matrix:\n{nr_results.regime_transition}')

fig, ax = nr_results.plot_regime_shading(y=y_mean, title='Non-Recurring Regimes')
plt.show()
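
A non-recurring structure allows only two moves from regime j: stay in j or advance to j + 1. Every other transition is fixed at zero. A minimal sketch that lists those zero entries for K regimes, keyed as `(i, j)` under the convention P(S_t = i | S_{t-1} = j) from the Transition Matrix section. The key layout and the helper name are assumptions; what `non_recurring()` builds internally is not shown in this notebook.

```python
def non_recurring_zero_entries(k_regimes):
    """Transitions (destination i, origin j) fixed at zero in a forward-only chain."""
    zeros = {}
    for i in range(k_regimes):        # destination regime
        for j in range(k_regimes):    # origin regime
            # Allowed moves: stay (i == j) or advance by one (i == j + 1)
            if i != j and i != j + 1:
                zeros[(i, j)] = 0.0
    return zeros

print(non_recurring_zero_entries(2))
print(sorted(non_recurring_zero_entries(3)))
```

For two regimes this leaves a single zero restriction, on the move from regime 1 back to regime 0, so the last regime is absorbing.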

### Custom Restrictions

You can set any transition probability to a fixed value:

In [None]:
# Restrict P(S_t = 0 | S_{t-1} = 1) = 0: once in regime 1, the process cannot return to regime 0
restricted_model = RestrictedMarkovRegression(
    y_mean, k_regimes=2, restrictions={(0, 1): 0.0}
)
restricted_results = restricted_model.fit(search_reps=5)

print(f'Transition matrix (restricted):\n{restricted_results.regime_transition}')
print(f'\nRestricted entries: {restricted_results.restricted_transitions}')

### Compare Restricted vs Unrestricted Likelihood

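Because the restricted model is nested in the unrestricted one, its maximized log-likelihood cannot exceed the unrestricted value, apart from optimizer error such as a fit that stops at a local maximum: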

In [None]:
# Unrestricted model (ordering=None for fair comparison)
unrestricted = rg.MarkovRegression(y_mean, k_regimes=2, ordering=None)
u_results = unrestricted.fit(search_reps=5)

print(f'Unrestricted log-likelihood: {u_results.llf:.2f}')
print(f'Restricted log-likelihood:   {restricted_results.llf:.2f}')
print(f'LR statistic: {2 * (u_results.llf - restricted_results.llf):.2f}')
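
Because the restriction fixes a transition probability at the boundary of its parameter space, the LR statistic does not follow a standard chi-squared distribution. For a single boundary restriction under regularity conditions, the classical reference distribution is a 50:50 mixture of a point mass at zero and a chi-squared(1), often called chi-bar-squared. A minimal sketch of that p-value, assuming the mixture applies; whether those conditions hold for this model, and how `regimes` computes its own p-values, are not confirmed by this notebook.

```python
import math

def chi2_1_sf(x):
    """Survival function of a chi-squared(1) variable: P(X > x)."""
    return math.erfc(math.sqrt(x / 2.0))

def boundary_lr_pvalue(lr):
    """p-value under a 0.5 * chi2(0) + 0.5 * chi2(1) mixture."""
    lr = max(lr, 0.0)  # small negative values can arise from optimizer error
    if lr == 0.0:
        return 1.0
    return 0.5 * chi2_1_sf(lr)

lr_demo = 3.841458820694124  # 95% quantile of chi-squared(1)
print(f'standard chi2(1) p-value: {chi2_1_sf(lr_demo):.4f}')
print(f'mixture p-value:          {boundary_lr_pvalue(lr_demo):.4f}')
```

The mixture p-value is half the standard one, so using the plain chi-squared(1) reference would make the test conservative.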

### Restricted Transition Matrix Plot

The transition matrix plot highlights restricted (zero) entries:

In [None]:
fig, ax = restricted_results.plot_transition_matrix(
    title='Restricted Transition Matrix'
)
plt.show()

### Summary with Restricted Transitions

The summary clearly shows which transitions are restricted:

In [None]:
print(restricted_results.summary())

## Regime Number Selection

How many regimes does the data support? The `RegimeNumberSelection` class compares
models with different K using information criteria or sequential likelihood ratio tests.

### IC-Based Selection

In [None]:
from regimes.markov.selection import RegimeNumberSelection

sel = RegimeNumberSelection(y_mean, k_max=4, method='bic')
sel_results = sel.fit(verbose=True)

print(f'\nSelected K = {sel_results.selected_k} (by BIC)')

### Information Criteria Table

In [None]:
sel_results.ic_table
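
The penalty in each criterion grows quickly with K because the transition matrix alone contributes K(K - 1) free probabilities. A minimal sketch of the parameter count for a mean-switching model, assuming one mean per regime, a common or regime-specific variance, and free off-diagonal transitions. The count used by `regimes` may differ (for example, if initial probabilities are estimated), and `n_free_params` is an illustrative helper.

```python
import math

def n_free_params(k, switching_variance=False):
    """Free parameters in a K-regime mean-switching model."""
    means = k
    variances = k if switching_variance else 1
    transitions = k * (k - 1)  # each column of P has k - 1 free entries
    return means + variances + transitions

n_obs_demo = 300
for k in range(1, 5):
    p = n_free_params(k)
    print(f'K={k}: {p} parameters, BIC penalty = {p * math.log(n_obs_demo):.1f}')
```

An extra regime must therefore raise the log-likelihood substantially before BIC prefers it.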

### IC Plot

In [None]:
fig, ax = sel_results.plot_ic()
plt.show()

### Sequential LRT Selection

Test K vs K+1 sequentially until the improvement is no longer significant:

In [None]:
sel_seq = RegimeNumberSelection(y_mean, k_max=4, method='sequential')
sel_seq_results = sel_seq.fit(verbose=True)

print(f'\nSelected K = {sel_seq_results.selected_k} (by sequential LRT)')
print(sel_seq_results.summary())

## Sequential Restriction Testing

These tests address specific questions about the transition structure:

- **NonRecurringRegimeTest**: Is the regime structure non-recurring (structural break)?
- **SequentialRestrictionTest**: general-to-specific (GETS) testing of individual transition probabilities

### Non-Recurring Regime Test

In [None]:
from regimes.markov.sequential_restriction import (
    NonRecurringRegimeTest,
    SequentialRestrictionTest,
)

# Test whether the regime structure is non-recurring
nr_test = NonRecurringRegimeTest(y_mean, k_regimes=2, method='chi_bar_squared')
nr_test_results = nr_test.fit()

print(nr_test_results.summary())

### Sequential Restriction Test

This GETS-style procedure tests each off-diagonal transition probability
and restricts the insignificant ones to zero:

In [None]:
seq_test = SequentialRestrictionTest(
    y_mean, k_regimes=2,
    significance=0.05,
    multiple_testing='holm',
)
seq_results = seq_test.fit(verbose=True)

print(seq_results.summary())
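
The `'holm'` option refers to the Holm step-down correction, which controls the family-wise error rate when several hypotheses are tested together. A minimal sketch of the procedure on hypothetical p-values; how `SequentialRestrictionTest` embeds it in its search loop is not shown in this notebook, and `holm_reject` is an illustrative helper.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down: return a reject flag for each hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest p-value first
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Compare the (rank + 1)-th smallest p-value with alpha / (m - rank)
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

p_demo = [0.01, 0.04, 0.03]  # hypothetical p-values, one per tested transition
print(holm_reject(p_demo))
```

Here only the first hypothesis is rejected: 0.01 clears the threshold 0.05 / 3, but 0.03 does not clear 0.05 / 2, so the procedure stops.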

## Complete Workflow

A typical analysis combines regime number selection, restriction testing, and
final model estimation into one workflow.

In [None]:
# Step 1: Generate data
rng_wf = np.random.default_rng(123)
y_wf = np.concatenate([
    rng_wf.normal(0.0, 1.0, 150),
    rng_wf.normal(5.0, 1.0, 150),
])

print('Step 1: Data generated (300 obs, true break at 150)')

In [None]:
# Step 2: Select number of regimes
sel = RegimeNumberSelection(y_wf, k_max=4, method='bic')
sel_results = sel.fit()

k_selected = sel_results.selected_k
print(f'Step 2: BIC selects K = {k_selected}')
print(sel_results.ic_table.to_string(index=False))

In [None]:
# Step 3: Test for non-recurring structure
nr_test = NonRecurringRegimeTest(y_wf, k_regimes=k_selected)
nr_results = nr_test.fit()

print('Step 3: Non-recurring test')
print(f'  LR statistic: {nr_results.lr_statistic:.3f}')
print(f'  p-value: {nr_results.p_value:.4f}')
print(f'  Rejected: {nr_results.rejected}')
# Rejecting the non-recurring null means the data favor recurring (Markov) regimes
if nr_results.rejected:
    interpretation = 'Regime IS recurring (Markov)'
else:
    interpretation = 'Non-recurring (structural break) not rejected'
print(f'  Interpretation: {interpretation}')

In [None]:
# Step 4: Fit the final model
if nr_results.rejected:
    # Recurring: use unrestricted Markov switching
    final_model = rg.MarkovRegression(y_wf, k_regimes=k_selected)
    model_type = 'Unrestricted Markov'
else:
    # Non-recurring: use restricted (structural break) model
    final_model = RestrictedMarkovRegression.non_recurring(
        y_wf, k_regimes=k_selected
    )
    model_type = 'Non-recurring (structural break)'

final_results = final_model.fit(search_reps=5)
print(f'Step 4: Fitted {model_type} model')
print(f'  Log-likelihood: {final_results.llf:.1f}')
for r in range(final_results.k_regimes):
    print(f'  Regime {r}: {final_results.regime_params[r]}')

In [None]:
# Step 5: Visualize the results
fig, axes = plt.subplots(3, 1, figsize=(10, 10))

# Panel 1: Regime shading
final_results.plot_regime_shading(y=y_wf, ax=axes[0])
axes[0].set_title(f'Final Model: {model_type}')

# Panel 2: Smoothed probabilities
# np.asarray keeps the [:, j] indexing below valid if a DataFrame is returned
probs = np.asarray(final_results.smoothed_marginal_probabilities)
for j in range(final_results.k_regimes):
    axes[1].fill_between(
        range(len(probs)), probs[:, j],
        alpha=0.5, label=f'Regime {j}'
    )
axes[1].set_ylabel('Probability')
axes[1].set_title('Smoothed Probabilities')
axes[1].legend()

# Panel 3: IC comparison
sel_results.plot_ic(ax=axes[2])

plt.tight_layout()
plt.show()

In [None]:
# Summary
print(final_results.summary())

## Summary

The `regimes` Markov switching module provides a complete toolkit for regime analysis:

| Component | Class / Function |
|-----------|------------------|
| Mean-switching regression | `MarkovRegression` |
| AR with switching | `MarkovAR` |
| ADL with switching | `MarkovADL` |
| Restricted transitions | `RestrictedMarkovRegression`, `RestrictedMarkovAR` |
| Regime number selection | `RegimeNumberSelection` |
| Non-recurring test | `NonRecurringRegimeTest` |
| Sequential restriction | `SequentialRestrictionTest` |
| Smoothed probabilities plot | `plot_smoothed_probabilities()` |
| Regime shading plot | `plot_regime_shading()` |
| Transition matrix plot | `plot_transition_matrix()` |
| Parameter time series plot | `plot_parameter_time_series()` |
| IC comparison plot | `plot_ic()` |