## Introduction

A **Markov Chain** is a mathematical system that transitions from one state to another within a state space. It is a stochastic process, meaning that its evolution is partly random. In a Markov Chain, the next state depends only on the current state, not on the sequence of states that preceded it. This property is called the **Markov Property**.

Formally, a Markov Chain is a discrete-time stochastic dynamical system whose dynamics are defined as follows:

$$
s_{k+1} \mid s_k \sim P(s_{k+1} \mid s_k), \quad k = 0, 1, 2, \dots
$$

Where:

- \( s_k \) is the state at time \( k \).
- The state space is the set of all possible states.
- \( P(s_{k+1} \mid s_k) \) is the probability of transitioning to the next state \( s_{k+1} \) given the current state \( s_k \).

The defining property of a Markov Chain is that the distribution of the next state depends only on the current state and not on the previous states:

$$
P(s_{k+1} \mid s_k, s_{k-1}, s_{k-2}, \dots, s_0) = P(s_{k+1} \mid s_k)
$$

In simple terms, once we know the current state, the past history of states does not affect the future state.
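
One consequence worth spelling out is how the probability of a whole trajectory factorizes. The short derivation below is a sketch: it applies the chain rule of probability and then the Markov Property to each factor. The initial distribution \( P(s_0) \) is an assumption introduced here for illustration; it is not defined above.

$$
P(s_0, s_1, \dots, s_n) = P(s_0) \prod_{k=0}^{n-1} P(s_{k+1} \mid s_k, \dots, s_0) = P(s_0) \prod_{k=0}^{n-1} P(s_{k+1} \mid s_k)
$$

For instance, with the transition probabilities of the three-state example in this document (0.7 for S1 to S1, 0.2 for S1 to S2, 0.2 for S2 to S3), the path S1, S1, S2, S3 starting from \( s_0 = \text{S1} \) has probability:

$$
P(s_1 = \text{S1}, s_2 = \text{S2}, s_3 = \text{S3} \mid s_0 = \text{S1}) = 0.7 \times 0.2 \times 0.2 = 0.028
$$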


### Components of a Markov Chain
1. States: These represent the possible situations the system can be in. The example used here has three states:
   - Operation (S1)
   - Obstacle (S2)
   - End of Operation (S3)
2. Transition Probabilities: These represent the likelihood of moving from one state to another. For example, the probability of moving from "Operation" (S1) to "Obstacle" (S2) is 0.2. These probabilities can be collected in a transition (probability) matrix.
3. Markov Property: The next state depends only on the current state, not on how we got there.

### Probability Matrix

The transition probabilities between the states are usually summarized in a transition matrix. The entry in row \( i \) and column \( j \) is the probability of moving from state \( i \) to state \( j \), so every row sums to 1.
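
As a small illustration of how such a matrix can be used, the sketch below checks that each row is a valid probability distribution and pushes a starting distribution forward by one and two steps. The matrix values are the ones used in this document's example; the helper names `is_row_stochastic` and `step_distribution` are hypothetical and chosen only for this sketch.

```python
# Transition matrix from this example; row i holds P(next = j | current = i).
# State order: S1 (Operation), S2 (Obstacle), S3 (End of Operation).
P = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.0, 0.0, 1.0],
]


def is_row_stochastic(matrix, tol=1e-9):
    """Return True if all entries are non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


def step_distribution(dist, matrix):
    """Propagate a distribution over states one step forward (dist times matrix)."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


start = [1.0, 0.0, 0.0]  # start in S1 with certainty
after_one = step_distribution(start, P)
after_two = step_distribution(after_one, P)

print("Rows are valid distributions:", is_row_stochastic(P))
print("After one step: ", [round(p, 4) for p in after_one])
print("After two steps:", [round(p, 4) for p in after_two])
```

Starting from S1, the one-step distribution is simply the first row of the matrix, and after two steps the probabilities of being in S1, S2 and S3 work out to roughly 0.59, 0.20 and 0.21.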


```python
import random


class MarkovChain:
    def __init__(self):
        # S1 = Operation, S2 = Obstacle, S3 = End of Operation
        self.states = ['S1', 'S2', 'S3']
        # Row i holds the probabilities of moving from state i to each state.
        self.transition_matrix = [
            [0.7, 0.2, 0.1],
            [0.5, 0.3, 0.2],
            [0.0, 0.0, 1.0]
        ]

    def next_state(self, current_state):
        """Sample the index of the next state given the current state's index."""
        probabilities = self.transition_matrix[current_state]
        return random.choices(range(len(self.states)), weights=probabilities)[0]

    def run_chain(self, start_state, steps):
        """Simulate up to `steps` transitions and return the visited state names."""
        current_state = start_state
        episode = [self.states[current_state]]

        for _ in range(steps):
            current_state = self.next_state(current_state)
            episode.append(self.states[current_state])
            if self.states[current_state] == 'S3':  # stop in the terminal state
                break

        return episode


markov_chain = MarkovChain()
start_state = 0  # index of S1
episode = markov_chain.run_chain(start_state, steps=10)

print("Generated Episode:", episode)
```


Example output (the chain is random, so the episode varies from run to run): `Generated Episode: ['S1', 'S3']`


## Explanation of the Code

- States: We define three states in the `MarkovChain` class: S1 (Operation), S2 (Obstacle), and S3 (End of Operation).
- Transition Matrix: This 2D matrix defines the probabilities of moving from one state to another.
    - From S1 (Operation), the probabilities are:
        - 70% chance of staying in S1 (Operation),
        - 20% chance of moving to S2 (Obstacle),
        - 10% chance of moving to S3 (End of Operation).
    - From S2 (Obstacle), the probabilities are:
        - 50% chance of moving to S1 (Operation),
        - 30% chance of staying in S2 (Obstacle),
        - 20% chance of moving to S3 (End of Operation).
    - S3 is a terminal state, so the probability of staying in S3 is 100%.
- `next_state()`: This function takes the index of the current state and returns the index of the next state, sampled according to the corresponding row of the transition matrix.
- `run_chain()`: This function simulates an episode of the Markov Chain. It starts from an initial state and takes at most `steps` transitions, stopping early if the terminal state S3 is reached.
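
One way to sanity-check the sampling step described above is to draw many next states from a fixed current state and compare the observed frequencies with the matching row of the transition matrix. The sketch below is self-contained and reuses the matrix from this example; the names `sample_next` and `empirical_row`, the sample size and the fixed seed are assumptions made for illustration, not part of the original class.

```python
import random
from collections import Counter

STATES = ["S1", "S2", "S3"]
P = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.0, 0.0, 1.0],
]


def sample_next(current, rng):
    """Draw the index of the next state from row `current` of P."""
    return rng.choices(range(len(STATES)), weights=P[current])[0]


def empirical_row(current, n_samples, seed=0):
    """Estimate the transition probabilities out of `current` by sampling."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    counts = Counter(sample_next(current, rng) for _ in range(n_samples))
    return [counts[j] / n_samples for j in range(len(STATES))]


freqs = empirical_row(current=0, n_samples=100_000)
for name, freq, prob in zip(STATES, freqs, P[0]):
    print(f"S1 -> {name}: empirical {freq:.3f}, matrix {prob:.3f}")
```

With a large number of samples, the empirical frequencies should land close to 0.7, 0.2 and 0.1; a clear mismatch would suggest a bug in how the matrix row is passed to the sampler.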
