# Markov Chain Model Tutorial

In this tutorial, we build a Markov Chain model from an observed sequence of states. We implement the model in Python, fit it to the data, and use it to predict future states by sampling from the estimated transition probabilities.
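
For reference, the model can be stated compactly. The notation is introduced here for illustration only and is not used elsewhere in the tutorial: $X_t$ is the state at step $t$, $p_{ij}$ is the probability of moving from state $i$ to state $j$, and $n_{ij}$ is the number of observed transitions from $i$ to $j$. A first-order Markov chain assumes that the next state depends only on the current one, and the count-based estimate on the second line is what the `fit` method in Step 2 computes:

```latex
P(X_{t+1} = j \mid X_t = i, X_{t-1}, \dots, X_0) = P(X_{t+1} = j \mid X_t = i) = p_{ij}

\hat{p}_{ij} = \frac{n_{ij}}{\sum_{k} n_{ik}}
```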

## Step-by-Step Implementation

1. Import necessary libraries.
2. Define the `MarkovChain` class.
3. Fit the model with data.
4. Predict future states.
5. Retrieve and visualize the transition matrix.

Let's start!

## Step 1: Import Necessary Libraries


In [5]:

import numpy as np
import pandas as pd

## Step 2: Define the `MarkovChain` Class

We will define a class `MarkovChain` that stores the transition matrix, the list of states, and a mapping from each state to its row/column index in the matrix.


In [6]:
class MarkovChain:
    def __init__(self):
        self.transition_matrix = None
        self.states = None
        self.state_index = None
    
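    def fit(self, data):
        """Estimate transition probabilities from one observed sequence of states."""
        # set() ordering is arbitrary, so the row/column order may differ between runs
        self.states = list(set(data))
        # Map each state to its row/column index in the transition matrix
        self.state_index = {state: idx for idx, state in enumerate(self.states)}
        
        num_states = len(self.states)
        self.transition_matrix = np.zeros((num_states, num_states))

        # Count every transition between consecutive observations
        for (state_from, state_to) in zip(data[:-1], data[1:]):
            i = self.state_index[state_from]
            j = self.state_index[state_to]
            self.transition_matrix[i, j] += 1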

        # Normalize each row so it sums to 1; rows with no observed transitions stay all zeros
        row_sums = self.transition_matrix.sum(axis=1)
        for i in range(num_states):
            if row_sums[i] > 0:
                self.transition_matrix[i] /= row_sums[i]

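    def predict(self, current_state, num_steps):
        """Sample `num_steps` successive states, starting from `current_state`."""
        if self.transition_matrix is None:
            raise ValueError("Model has not been fitted yet. Call fit() with training data.")

        predicted_states = []
        current_index = self.state_index.get(current_state)
        if current_index is None:
            raise ValueError("Current state not in the list of known states.")

        for _ in range(num_steps):
            next_state_dist = self.transition_matrix[current_index]
            if next_state_dist.sum() == 0:
                # An all-zero row cannot be sampled from (np.random.choice needs p to sum to 1)
                raise ValueError(f"No outgoing transitions were observed from state {self.states[current_index]!r}.")
            # Sample the next state from the current state's row of probabilities
            next_index = np.random.choice(len(self.states), p=next_state_dist)
            current_index = next_index
            predicted_states.append(self.states[next_index])
        
        return predicted_states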

    def get_transition_matrix(self):
        return pd.DataFrame(self.transition_matrix, index=self.states, columns=self.states)
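
One edge case is worth knowing about: a state that appears only as the final observation has no outgoing transitions, so its row stays all zeros and cannot be sampled from. The sketch below illustrates this with a standard-library-only re-implementation of the counting step. The function `estimate_transitions` and the toy sequence are hypothetical, introduced here for illustration, and are not part of the class above.

```python
from collections import Counter


def estimate_transitions(sequence):
    """Return {state: {next_state: probability}} estimated from one sequence."""
    # Count each consecutive (from, to) pair
    counts = Counter(zip(sequence[:-1], sequence[1:]))
    states = sorted(set(sequence))  # sorted for a reproducible order
    matrix = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        # A state with no observed outgoing transitions keeps an all-zero row
        matrix[s] = {
            t: (counts[(s, t)] / row_total if row_total else 0.0) for t in states
        }
    return matrix


# Hypothetical toy sequence: 'C' appears only as the final observation
toy = ["A", "B", "A", "C"]
matrix = estimate_transitions(toy)
dead_ends = [s for s, row in matrix.items() if sum(row.values()) == 0]

print(matrix["A"])  # {'A': 0.0, 'B': 0.5, 'C': 0.5}
print(dead_ends)    # ['C']
```

If such dead-end states matter for your data, one option is to treat them as a signal to stop sampling; how to handle them is a modeling choice rather than something the tutorial prescribes.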


## Step 3: Fit the Model with Data

We will use an example dataset to fit our Markov Chain model.


In [7]:
# Example dataset: a single observed sequence of states
example_data = ['A', 'A', 'B', 'A', 'C', 'C', 'B', 'B', 'A', 'B', 'C', 'A']

mc = MarkovChain()
mc.fit(example_data)

print("Transition Matrix:")
print(mc.get_transition_matrix())


Transition Matrix:
          C         A         B
C  0.333333  0.333333  0.333333
A  0.250000  0.250000  0.500000
B  0.250000  0.500000  0.250000
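
## Step 4: Predict Future States

With the fitted model, a call such as `mc.predict('A', 5)` returns a list of five sampled states; the result differs from run to run because each step is drawn at random from the current state's row. The sketch below reproduces that sampling loop so that it runs on its own. It is only an illustration: the probabilities are copied by hand from the matrix printed above, and `sample_path`, the standard-library `random` generator, and the seed are assumptions made for this example rather than part of the `MarkovChain` class.

```python
import random

# Transition probabilities copied from the matrix printed above
transitions = {
    "A": {"A": 0.25, "B": 0.50, "C": 0.25},
    "B": {"A": 0.50, "B": 0.25, "C": 0.25},
    "C": {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3},
}


def sample_path(start, num_steps, rng):
    """Sample num_steps successive states that follow `start`."""
    path = []
    current = start
    for _ in range(num_steps):
        row = transitions[current]
        # Draw the next state with the probabilities in the current state's row
        current = rng.choices(list(row), weights=list(row.values()))[0]
        path.append(current)
    return path


rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatable output
path = sample_path("A", 5, rng)
print(path)  # a list of 5 states; the exact values depend on the seed
```

Like `predict`, the returned list contains only the sampled successors and does not include the starting state.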


## Step 5: Retrieve and Visualize the Transition Matrix

Finally, we will retrieve the transition matrix as a DataFrame to inspect the state transitions. Each row gives the probabilities of moving from that row's state to each column's state.


In [8]:
# Retrieve the transition matrix as a DataFrame for better readability
transition_matrix_df = mc.get_transition_matrix()
transition_matrix_df

|   | C        | A        | B        |
|---|----------|----------|----------|
| C | 0.333333 | 0.333333 | 0.333333 |
| A | 0.25     | 0.25     | 0.5      |
| B | 0.25     | 0.5      | 0.25     |
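
A quick way to read this output, sketched with only the standard library and the values copied by hand from above, is to check that every row sums to 1 and to list each state's most likely successor. The helper `most_likely_next` is a hypothetical name introduced for this example.

```python
# Transition probabilities copied from the output above (rows: from, columns: to)
transitions = {
    "C": {"C": 1 / 3, "A": 1 / 3, "B": 1 / 3},
    "A": {"C": 0.25, "A": 0.25, "B": 0.50},
    "B": {"C": 0.25, "A": 0.50, "B": 0.25},
}


def most_likely_next(row):
    """Return every state tied for the highest probability in a row."""
    best = max(row.values())
    return sorted(state for state, p in row.items() if p == best)


summary = {state: most_likely_next(row) for state, row in transitions.items()}
for state, row in transitions.items():
    # Each row is a probability distribution, so it should sum to 1
    assert abs(sum(row.values()) - 1.0) < 1e-9
    print(state, "->", summary[state])
# C -> ['A', 'B', 'C']
# A -> ['B']
# B -> ['A']
```

From `C`, all three successors are equally likely, while `A` and `B` most often move to each other.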


And that's it! You've built a Markov Chain model and fitted it to data, and its `predict` method can now be used to sample future states. You can extend the model, for example with higher-order chains or smoothing for unseen transitions, and apply it to more complex datasets. Happy modeling!
