### **Q1. Model the game of Snakes and Ladders (single-player game) as a Markov Process. Write out its state space and structure of transition probabilities.**

State space $S = \{k : k=1,2,\cdots,100\}$, where $k$ indicates the position of the player.
$k=100$ is the terminal state, and $N = S \setminus \{100\}$ denotes the set of non-terminal states.

Here we define two dictionaries, `ladder` and `snake`, whose (key, value) pairs give the start and end squares of each ladder or snake.

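The transition probabilities $P: N \times S \rightarrow [0,1]$ are defined below, with $n = 100$ the terminal square. A roll that would overshoot $n$ moves the player to $n$, so the probability of reaching $n$ collects every such roll.

$P(k,k') =  \begin{cases}
   1  & k \in dic.keys(),\ k' = dic[k],\ dic = snake \ or \ ladder\\
   1/6   & k \notin dic.keys(),\ 1 \leq k'-k \leq 6,\ k' < n\\
   (7-(n-k))/6   & k \notin dic.keys(),\ n-k \leq 6,\ k' = n\\
   0     & else
\end{cases}$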
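
The piecewise definition can be checked with a small standalone sketch. The helper name `transition_row` is hypothetical and is not part of the `rl` package. It assumes that a roll overshooting the last square stops on it, which is how the Q2 code handles the last few squares. The example board reuses the snakes and ladder from the Q2 cell.

```python
from typing import Dict


def transition_row(k: int, n: int, snake: Dict[int, int],
                   ladder: Dict[int, int]) -> Dict[int, float]:
    """Return {k': P(k, k')} for a non-terminal square k (hypothetical helper)."""
    if k in snake:
        return {snake[k]: 1.0}   # head of a snake: slide down
    if k in ladder:
        return {ladder[k]: 1.0}  # foot of a ladder: climb up
    row: Dict[int, float] = {}
    for roll in range(1, 7):
        kp = min(k + roll, n)    # assumption: overshooting rolls stop on n
        row[kp] = row.get(kp, 0.0) + 1 / 6
    return row


snake = {35: 10, 74: 3}
ladder = {1: 26}
rows = {k: transition_row(k, 100, snake, ladder) for k in range(1, 100)}

# Every non-terminal row should be a probability distribution.
for k, row in rows.items():
    assert abs(sum(row.values()) - 1.0) < 1e-12, k
```

From square 97, for instance, the rolls 3 through 6 all end on square 100, so that entry carries probability $4/6$, in agreement with $(7-(n-k))/6$ for $n-k=3$.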



### **Q2. Create a transition map**

In [4]:
from rl import markov_process, distribution
from typing import (Dict, Iterable, Generic, Sequence, Tuple,
                    Mapping, Optional, TypeVar)
from dataclasses import dataclass
from rl.distribution import Categorical

In [9]:
@dataclass(frozen=True)
class SL_State:
    position: int

@dataclass(frozen=True)
class SL_Rules:
    stopstate: int
    snake: Dict[int,int]
    ladder: Dict[int,int]

SL_MapType = Dict[int, Iterable[Tuple[int, int]]]
SL_StateMap = Dict[int, SL_State]



class SL_MDP(markov_process.FiniteMarkovProcess[SL_State]):
    # Despite the name, this models a Markov process (no actions or rewards).

    def __init__(self, sl_rule: SL_Rules):
        self.sl_rule = sl_rule
        super().__init__(self.get_transition_map())

    def get_transition_map(self) -> markov_process.Transition[SL_State]:
        # The terminal state maps to None, hence the Optional value type.
        d: Dict[SL_State, Optional[Categorical[SL_State]]] = {}
        n = self.sl_rule.stopstate
        for k in range(1, n):
            if k in self.sl_rule.snake:
                # Head of a snake: slide down with probability 1.
                d[SL_State(k)] = Categorical({SL_State(self.sl_rule.snake[k]): 1.0})
            elif k in self.sl_rule.ladder:
                # Foot of a ladder: climb up with probability 1.
                d[SL_State(k)] = Categorical({SL_State(self.sl_rule.ladder[k]): 1.0})
            elif n - k >= 6:
                # All six rolls land on or before the last square.
                d[SL_State(k)] = Categorical({SL_State(k + i): 1 / 6 for i in range(1, 7)})
            else:
                # Rolls of n - k or more all end on n, so n gets (7 - (n - k)) / 6.
                probs = {SL_State(kp): 1 / 6 for kp in range(k + 1, n)}
                probs[SL_State(n)] = (7 - (n - k)) / 6
                d[SL_State(k)] = Categorical(probs)
        d[SL_State(n)] = None  # terminal state
        return d
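
Before passing a transition map to a library class, it can help to sanity-check it. The following is a minimal sketch that uses plain dictionaries instead of `Categorical`. The function name `validate_transition_map` and the three-square toy map are illustrative assumptions and not part of the `rl` package.

```python
from typing import Dict, Hashable, Mapping, Optional


def validate_transition_map(
    tmap: Mapping[Hashable, Optional[Dict[Hashable, float]]],
    tol: float = 1e-9,
) -> None:
    """Raise ValueError if a non-terminal row is not a valid distribution."""
    for state, row in tmap.items():
        if row is None:
            continue  # terminal state: nothing to check
        if any(p < 0 for p in row.values()):
            raise ValueError(f"row {state!r} has a negative probability")
        total = sum(row.values())
        if abs(total - 1.0) > tol:
            raise ValueError(f"row {state!r} sums to {total}, not 1")
        unknown = [s for s in row if s not in tmap]
        if unknown:
            raise ValueError(f"row {state!r} points to unknown states {unknown}")


# Toy three-square board driven by a fair coin; square 3 is terminal.
toy = {1: {2: 0.5, 3: 0.5}, 2: {3: 1.0}, 3: None}
validate_transition_map(toy)  # passes silently
```

A check of this kind makes it easier to catch an off-by-one error in the end-of-board case, where the probability of the last square has to absorb every overshooting roll.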




In [12]:
sl_rules = SL_Rules(stopstate=100, snake={35: 10, 74: 3}, ladder={1: 26})
mdp = SL_MDP(sl_rules)
mdp

From State SL_State(position=1):
  To State SL_State(position=26) with Probability 1.000
From State SL_State(position=2):
  To State SL_State(position=3) with Probability 0.167
  To State SL_State(position=4) with Probability 0.167
  To State SL_State(position=5) with Probability 0.167
  To State SL_State(position=6) with Probability 0.167
  To State SL_State(position=7) with Probability 0.167
  To State SL_State(position=8) with Probability 0.167
From State SL_State(position=3):
  To State SL_State(position=4) with Probability 0.167
  To State SL_State(position=5) with Probability 0.167
  To State SL_State(position=6) with Probability 0.167
  To State SL_State(position=7) with Probability 0.167
  To State SL_State(position=8) with Probability 0.167
  To State SL_State(position=9) with Probability 0.167
From State SL_State(position=4):
  To State SL_State(position=5) with Probability 0.167
  To State SL_State(position=6) with Probability 0.167
  To State SL_State(position=7) with Proba