In these notebooks we solve a non-slippery version of the FrozenLake environment.
This is a very simple task, used primarily as a unit test when implementing new components for the coax package.
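To make "non-slippery" concrete: every action moves the agent exactly one tile in the chosen direction, with no random sliding. The following is a minimal sketch of those deterministic dynamics, assuming the standard 4x4 map and the usual Gym action encoding (0 = left, 1 = down, 2 = right, 3 = up). The names `MAP`, `MOVES`, `step` and `rollout` are hypothetical helpers written for illustration. They are not part of coax or Gym, and the notebooks use the actual environment rather than this stand-in.

```python
# Illustrative stand-in for a non-slippery FrozenLake (assumed 4x4 layout).
MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]  # S=start, F=frozen, H=hole, G=goal
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}  # left, down, right, up


def step(state, action):
    """Apply one deterministic move; return (next_state, reward, done)."""
    size = len(MAP)
    row, col = divmod(state, size)
    d_row, d_col = MOVES[action]
    # A move that would leave the grid keeps the agent on the boundary tile.
    row = min(max(row + d_row, 0), size - 1)
    col = min(max(col + d_col, 0), size - 1)
    tile = MAP[row][col]
    # Reward is 1 only on reaching the goal; holes and the goal end the episode.
    return row * size + col, float(tile == "G"), tile in "HG"


def rollout(actions, state=0):
    """Run a fixed action sequence from `state`, stopping when the episode ends."""
    total, done = 0.0, False
    for action in actions:
        state, reward, done = step(state, action)
        total += reward
        if done:
            break
    return state, total, done


if __name__ == "__main__":
    # down, down, right, right, down, right avoids every hole on this map.
    print(rollout([1, 1, 2, 2, 1, 2]))
```

Because the transitions are deterministic, the same action sequence always gives the same return, which is what makes the task convenient as a quick correctness check for a new agent.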
SARSA <sarsa>
Expected SARSA <expected_sarsa>
Q-Learning <qlearning>
Double Q-Learning <double_qlearning>

REINFORCE <reinforce>
A2C <a2c>
PPO <ppo>
DDPG <ddpg>
TD3 <td3>

Stochastic SARSA <stochastic_sarsa>
Stochastic Expected-SARSA <stochastic_expected_sarsa>
Stochastic Q-Learning <stochastic_qlearning>
Stochastic Double Q-Learning <stochastic_double_qlearning>