Running Value Iteration on the Taxi environment
I borrowed heavily from allanbreyes: github.com/allanbreyes/gym-solutions/blob/master/analysis/mdp.py. The goal was to learn a lot about value iteration, and I've achieved that. My code is simpler than Allan's, so it's worth a look if you are still learning. Otherwise, check out his original, which covers policy iteration and multiple OpenAI Gym environments.
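For readers new to the idea, here is a minimal sketch of value iteration. It is not the code from this repo or from Allan's. It assumes a transition table in the same shape that Gym's toy-text environments such as Taxi expose as `env.unwrapped.P`, that is `P[state][action] = [(prob, next_state, reward, done), ...]`. The three-state table, the function names, and the choice to stop bootstrapping on `done` transitions are all assumptions made for illustration; the rewards (-1 per step, +20 at the goal) only loosely imitate Taxi.

```python
def q_values(P, V, s, n_actions, gamma):
    """Expected return of each action in state s under the current estimate V."""
    return [
        # Assumption: a transition flagged done contributes no future value.
        sum(p * (r + (0.0 if done else gamma * V[s2])) for p, s2, r, done in P[s][a])
        for a in range(n_actions)
    ]


def value_iteration(P, n_states, n_actions, gamma=0.9, theta=1e-8):
    """Sweep the Bellman optimality backup in place until values stop changing."""
    V = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(n_states):
            best = max(q_values(P, V, s, n_actions, gamma))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Greedy policy: pick the first action with the highest value in each state.
    policy = []
    for s in range(n_states):
        q = q_values(P, V, s, n_actions, gamma)
        policy.append(max(range(n_actions), key=q.__getitem__))
    return V, policy


# Hypothetical toy MDP: a chain 0 -> 1 -> 2, where state 2 is terminal.
# Action 0 stays put, action 1 moves right.
P = {
    0: {0: [(1.0, 0, -1, False)], 1: [(1.0, 1, -1, False)]},
    1: {0: [(1.0, 1, -1, False)], 1: [(1.0, 2, 20, True)]},
    2: {0: [(1.0, 2, 0, True)], 1: [(1.0, 2, 0, True)]},
}

V, policy = value_iteration(P, n_states=3, n_actions=2)
print(V, policy)
```

With the real environment, the same function should work by passing the environment's own table and the sizes of its state and action spaces in place of the toy `P`, though that depends on the installed Gym version exposing `P` in this format.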