In this notebook we implement two dynamic programming algorithms: policy iteration and value iteration.
These algorithms apply when the Markov decision process (MDP) of the environment, i.e. its transition probabilities and rewards, is known (model-based algorithms).
The environment used in this notebook is FrozenLake8x8 from the OpenAI Gym environments.
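
Before moving to FrozenLake, here is a minimal sketch of both algorithms on a hypothetical 3-state chain MDP, to show what "having the MDP" means in practice. The table `P`, the helper names (`q_value`, `greedy_action`) and the parameters `gamma` and `theta` are assumptions made for illustration, not the notebook's final implementation. The layout `P[s][a] = [(prob, next_state, reward, done), ...]` is assumed to resemble the transition table that Gym's FrozenLake exposes; check this against your Gym version.

```python
# Hypothetical toy MDP: states 0, 1, 2 in a chain; state 2 is the terminal goal.
# Actions: 0 = left, 1 = right. Entering state 2 yields reward 1.
# P[s][a] is a list of (probability, next_state, reward, done) tuples.
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2] * (not done)) for p, s2, r, done in P[s][a])


def greedy_action(P, V, s, gamma):
    """Action with the highest one-step lookahead value (first one on ties)."""
    return max(P[s], key=lambda a: q_value(P, V, s, a, gamma))


def value_iteration(P, gamma=0.9, theta=1e-8):
    """Repeat Bellman optimality backups until the values stop changing."""
    V = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    return V, [greedy_action(P, V, s, gamma) for s in P]


def policy_iteration(P, gamma=0.9, theta=1e-8):
    """Alternate policy evaluation and greedy improvement until stable."""
    policy = [0] * len(P)
    V = [0.0] * len(P)
    while True:
        # Policy evaluation: iterate the Bellman expectation backup.
        while True:
            delta = 0.0
            for s in P:
                v = q_value(P, V, s, policy[s], gamma)
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < theta:
                break
        # Policy improvement: act greedily with respect to the evaluated V.
        new_policy = [greedy_action(P, V, s, gamma) for s in P]
        if new_policy == policy:
            return V, policy
        policy = new_policy


V_vi, pi_vi = value_iteration(P)
V_pi, pi_pi = policy_iteration(P)
print(V_vi, pi_vi)
print(V_pi, pi_pi)
```

On this toy problem both algorithms should agree: moving right is optimal in states 0 and 1, and the terminal state has value 0. The same functions should work on any environment whose dynamics are available in this table form.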