### Case Study

Many strategies can be highly profitable depending on whether the market is trending upward, trending downward or oscillating. But how can you tell which market you are in, and do so without subjective bias?

Hidden Markov Models (HMMs) let us fit any number of hidden states we choose, given inputs such as returns and volatility (or any other inputs), to help ascertain which state, or regime, the market is in. This type of analysis is known as regime identification.
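To make the idea of a hidden regime concrete, here is a minimal simulation sketch. The transition matrix, mean returns and volatilities are invented for illustration and are not taken from any fitted model: a hidden state follows a Markov chain, and each day's return is drawn from the Gaussian that belongs to the current state. Fitting an HMM is the reverse problem, recovering the states from the observations alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-regime model: rows are the current state, columns the next state
transition = np.array([[0.95, 0.05],   # calm regime tends to persist
                       [0.10, 0.90]])  # turbulent regime tends to persist
mean_return = np.array([0.001, -0.002])  # assumed mean daily return per regime
volatility = np.array([0.005, 0.02])     # assumed daily return std per regime

n_days = 500
states = np.empty(n_days, dtype=int)
states[0] = 0
for t in range(1, n_days):
    # The next state depends only on the current state (Markov property)
    states[t] = rng.choice(2, p=transition[states[t - 1]])

# Observed returns are drawn from the Gaussian belonging to the hidden state
returns = rng.normal(mean_return[states], volatility[states])

print(states[:20])
print(returns[:5].round(4))
```

In real data only `returns` would be visible; `states` is what the model has to infer.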

### Imports

In [1]:

import pandas as pd
import numpy as np

from pyhhmm.gaussian import GaussianHMM


import matplotlib.pyplot as plt
import yfinance

### Data Management

In [2]:
# Data Extraction
# Alternative tickers: "OLECTRA", "LT", "CONCOR", "ELGIEQUIP", "IOC", "BEL", "TATAELXSI", "^NSEI"
stock_name = "IOC.NS"
start_date = "2000-03-06"  # matches the first rows shown in the outputs below

# Download daily, adjusted OHLCV data from start_date up to the latest available date
data = yfinance.download(tickers=stock_name, start=start_date,
                         interval="1d", group_by="ticker", auto_adjust=True)

# Keep only the price and volume columns used below
data = data[["Open", "High", "Low", "Close", "Volume"]]


[*********************100%***********************]  1 of 1 completed


In [18]:
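# Add Returns and Range
df = data.copy()
df["Returns"] = (df["Close"] / df["Close"].shift(1)) - 1  # close-to-close daily return
df["Range"] = (df["High"] / df["Low"]) - 1  # intraday high/low spread, a volatility proxy
df.dropna(inplace=True)  # the first row has no previous close
df.head()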

                Open      High       Low     Close  Volume   Returns     Range
Date
2000-03-07  3.403754  3.403754  3.174288  3.305274  689184 -0.031381  0.072289
2000-03-08  3.308142  3.336825  3.155165  3.218268  368766 -0.026323  0.057576
2000-03-09  3.155164  3.440085  3.121700  3.325352  666198  0.033274  0.101991
2000-03-10  3.269898  3.346387  3.231654  3.282328  456822 -0.012938  0.035503
2000-03-13  3.250776  3.289021  3.099711  3.178112  441126 -0.031751  0.061073
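As a quick check of the two feature formulas, the sketch below applies them to three made-up rows (not market data). It also shows that the `Returns` expression is equivalent to pandas' built-in `pct_change`.

```python
import pandas as pd

# Toy prices (made-up values, not market data)
toy = pd.DataFrame(
    {"High": [10.5, 11.0, 10.8], "Low": [10.0, 10.4, 10.2], "Close": [10.2, 10.9, 10.3]},
    index=pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-04"]),
)

# Same formulas as in the cell above
toy["Returns"] = toy["Close"] / toy["Close"].shift(1) - 1
toy["Range"] = toy["High"] / toy["Low"] - 1

# pct_change computes the same close-to-close return
returns_alt = toy["Close"].pct_change()

print(toy)
```

The first `Returns` value is NaN because there is no previous close, which is why the notebook calls `dropna` afterwards.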


In [22]:
# Structure Data
# Replace infinite values (e.g. from a zero price in a ratio) with NaN, then drop incomplete rows
df = df.replace([np.inf, -np.inf], np.nan)
df = df.dropna()

# Features for the HMM: daily return and intraday range
X_train = df[["Returns", "Range"]]
X_train.tail()

             Returns     Range
Date
2023-06-30  0.021824  0.014341
2023-07-03  0.043264  0.052574
2023-07-04 -0.006299  0.021728
2023-07-05  0.008980  0.024625
2023-07-06  0.032461  0.029167


                 Open       High        Low      Close    Volume   Returns     Range
Date
2000-03-07   3.403754   3.403754   3.174288   3.305274    689184 -0.031381  0.072289
2000-03-08   3.308142   3.336825   3.155165   3.218268    368766 -0.026323  0.057576
2000-03-09   3.155164   3.440085   3.121700   3.325352    666198  0.033274  0.101991
2000-03-10   3.269898   3.346387   3.231654   3.282328    456822 -0.012938  0.035503
2000-03-13   3.250776   3.289021   3.099711   3.178112    441126 -0.031751  0.061073
...               ...        ...        ...        ...       ...       ...       ...
2023-06-30  90.949997  91.949997  90.650002  91.300003  12595671  0.021824  0.014341
2023-07-03  91.599998  96.099998  91.300003  95.250000  32972732  0.043264  0.052574
2023-07-04  96.000000  96.400002  94.349998  94.650002  11918401 -0.006299  0.021728
2023-07-05  95.150002  95.699997  93.400002  95.500000  14283796  0.008980  0.024625
2023-07-06  96.000000  98.800003  96.000000  98.599998  32778334  0.032461  0.029167

### HMM Learning

In [23]:
# Train Model
model = GaussianHMM(n_states=4, covariance_type='full', n_emissions=2)
model.train([np.array(X_train.values)])

  super()._check_params_vs_input(X, default_n_init=10)
  covars_new = (self.covars_prior + cv_num) / (


LinAlgError: Array must not contain infs or NaNs
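The truncated traceback above does not show which array contained the non-finite values. A reasonable first step is to rule out the inputs. The helper below is a hypothetical diagnostic (`feature_diagnostics` is not part of pyhhmm), shown on made-up rows standing in for `X_train`. If the inputs turn out to be clean, one possibility (an assumption, not confirmed by the output above) is that the values arise during fitting, for example in the covariance update flagged by the warning.

```python
import numpy as np
import pandas as pd

def feature_diagnostics(X: pd.DataFrame) -> pd.DataFrame:
    """Summarise properties of a feature matrix that can upset Gaussian fitting."""
    values = X.to_numpy(dtype=float)
    # Mask non-finite entries so the spread is computed on usable values only
    finite = np.where(np.isfinite(values), values, np.nan)
    return pd.DataFrame(
        {
            "non_finite": (~np.isfinite(values)).sum(axis=0),  # NaN or +/-inf entries
            "zeros": (values == 0).sum(axis=0),                # e.g. days where High == Low
            "std": np.nanstd(finite, axis=0),                  # spread of each feature
        },
        index=X.columns,
    )

# Made-up example rows standing in for X_train
example = pd.DataFrame(
    {"Returns": [0.01, -0.02, 0.0, np.inf], "Range": [0.03, 0.0, 0.0, 0.05]}
)
report = feature_diagnostics(example)
print(report)
```

On the real data this would be called as `feature_diagnostics(X_train)`; a non-zero `non_finite` count would point at the inputs rather than the fitting procedure.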

In [None]:
# Check Results
hidden_states = model.predict([X_train.values])[0]
print(hidden_states[:40])
len(hidden_states)
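Once a state sequence is available, two simple summaries help judge whether the regimes are meaningful: how many days fall in each state, and how often the sequence moves from one state to another. The sketch below uses a short made-up sequence in place of `hidden_states`.

```python
import numpy as np

# Made-up state sequence standing in for hidden_states
states = np.array([0, 0, 1, 1, 1, 0, 2, 2, 0, 0])
n_states = 3

# Number of days spent in each state
counts = np.bincount(states, minlength=n_states)

# Count transitions from state i (row) to state j (column) on consecutive days
transitions = np.zeros((n_states, n_states), dtype=int)
np.add.at(transitions, (states[:-1], states[1:]), 1)

print(counts)
print(transitions)
```

Large diagonal entries in `transitions` indicate persistent regimes; a state with very few days may be fitting outliers rather than a regime.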

In [None]:
# Regime state means for each feature
model.means

In [None]:
# Regime state covars for each feature
model.covars
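State indices carry no meaning on their own, so the means are what give each regime an interpretation. The sketch below assumes the means form an array of shape (n_states, n_features) with columns ordered as `[Returns, Range]`; the numbers are invented for illustration and would be replaced by the fitted values.

```python
import numpy as np

# Hypothetical state means, shape (n_states, n_features), columns [Returns, Range].
# These numbers are invented for illustration; use the fitted model's means instead.
means = np.array([
    [0.0010, 0.015],
    [-0.0030, 0.045],
    [0.0005, 0.025],
    [0.0040, 0.070],
])

# Order states from the calmest (smallest mean Range) to the most volatile
by_volatility = np.argsort(means[:, 1])

for rank, state in enumerate(by_volatility):
    mean_ret, mean_rng = means[state]
    direction = "positive" if mean_ret > 0 else "negative"
    print(f"rank {rank}: state {state}, mean range {mean_rng:.3f}, {direction} mean return")
```

Ranking by mean `Range` gives a stable way to assign plot colours, since the state numbering can change between training runs.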

### Data Visualization

In [None]:
# Structure the prices for plotting
prices = df["Close"].values.astype(float)
print("Correct number of rows: ", len(prices) == len(hidden_states))

# One series per state: the price where that state is active, NaN elsewhere,
# so each state is drawn as its own coloured line segments
states = np.asarray(hidden_states)
labels_0 = np.where(states == 0, prices, np.nan)
labels_1 = np.where(states == 1, prices, np.nan)
labels_2 = np.where(states == 2, prices, np.nan)
labels_3 = np.where(states == 3, prices, np.nan)

In [None]:
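# Plot Chart
# Each colour marks the closing price while one hidden state is active
fig = plt.figure(figsize=(18, 8))
plt.plot(labels_0, color="green", label="State 0")
plt.plot(labels_1, color="red", label="State 1")
plt.plot(labels_2, color="orange", label="State 2")
plt.plot(labels_3, color="black", label="State 3")
plt.legend()
plt.show()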

### Conclusion

Although there is still work to be done in the following notebook, we can clearly see from the chart above that the Hidden Markov Model has been able to identify market regimes based on returns and volatility behaviour.

### Useful Resources

HMM Colab Version with Backtest: https://colab.research.google.com/drive/12qzR8SrhfhQDBImKYQqUKdj6n60E9jNp?usp=sharing