## Frog Puzzle

Problem Statement:
1. From its current position, the frog is equally likely to land on any of the remaining steps ahead of it;
2. The number of steps N is user-defined. Because we expect a finite number of steps, N is capped at a maximum of 100 and a minimum of 1.
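
As a rough illustration of assumption 1, the jump rule can be simulated directly without the `rl` library. This is only a sketch: the helper name `simulate_jumps`, the seed, and the trial count are illustrative choices, not part of the original notebook.

```python
import random


def simulate_jumps(n_steps: int, rng: random.Random) -> int:
    """Count the jumps one frog needs to get from position 0 to position n_steps."""
    position = 0
    jumps = 0
    while position < n_steps:
        # Land uniformly at random on one of the remaining positions
        # position + 1, ..., n_steps (randint is inclusive on both ends).
        position = rng.randint(position + 1, n_steps)
        jumps += 1
    return jumps


rng = random.Random(0)  # fixed seed (assumption) so the run is repeatable
n_trials = 100_000      # illustrative sample size
mean_jumps = sum(simulate_jumps(10, rng) for _ in range(n_trials)) / n_trials
print(f"Estimated average number of jumps for N = 10: {mean_jumps:.3f}")
```

With this many trials the estimate should land close to the exact expectation, which makes it a convenient sanity check for the Markov reward process model.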

In [8]:
# Imports: use the finite Markov reward process class from the rl directory
from rl.markov_process import FiniteMarkovRewardProcess, RewardTransition
from rl.distribution import Categorical
from typing import Dict, Tuple
import numpy as np
import itertools
import matplotlib.pyplot as plt

In [9]:
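class FrogFMRP(FiniteMarkovRewardProcess[int]):
    """Frog puzzle as a finite Markov reward process; state i is the frog's current step."""

    def __init__(self, N: int):
        # Clamp the number of steps to the supported range [1, 100].
        self.N: int = max(min(N, 100), 1)
        super().__init__(self.get_transition_reward_map())

    def get_transition_reward_map(self) -> RewardTransition[int]:
        # From step i the frog lands on each of the steps i+1, ..., N with
        # equal probability 1/(N - i). Every jump earns a reward of 1, so the
        # value function with gamma = 1 is the expected number of jumps.
        trans_map: Dict[int, Categorical[Tuple[int, int]]] = {
            i: Categorical({(j, 1): 1.0 / (self.N - i)
                            for j in range(i + 1, self.N + 1)})
            for i in range(self.N)
        }

        # Step N is terminal: it has no outgoing transitions, encoded here as None.
        trans_map[self.N] = None

        return trans_map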

In [10]:
# Test with the website's example:
frogJump = FrogFMRP(N = 10)

# Entry 0 of the value function vector corresponds to the starting state 0.
print("The average number of jumps to reach the other side is: %.3f."%(
    frogJump.get_value_function_vec(gamma = 1)[0]))

The average number of jumps to reach the other side is: 2.929.
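
The printed value can be cross-checked independently of the `rl` library. If E(k) denotes the expected number of jumps when k steps remain, the uniform jump rule gives E(0) = 0 and E(k) = 1 + (E(0) + ... + E(k-1)) / k, whose solution is the harmonic number 1 + 1/2 + ... + 1/k. The sketch below evaluates that recursion with exact fractions; the function name `expected_jumps` is an illustrative choice, not part of the original notebook.

```python
from fractions import Fraction


def expected_jumps(n_steps: int) -> Fraction:
    """Exact expected number of jumps for a frog with n_steps steps to cover."""
    # expected[k] holds E(k), the expected number of jumps with k steps remaining.
    expected = [Fraction(0)]
    for k in range(1, n_steps + 1):
        # One jump is always taken; afterwards the number of remaining steps
        # is uniform on 0, ..., k-1, which are exactly the entries stored so far.
        expected.append(1 + sum(expected) / k)
    return expected[n_steps]


exact = expected_jumps(10)
harmonic = sum(Fraction(1, k) for k in range(1, 11))
print(exact, float(exact), exact == harmonic)
```

For N = 10 the recursion yields 7381/2520, which rounds to 2.929 and agrees with the value function computed above.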
