A collection of canonical reinforcement learning algorithms implemented in JAX (Python).
TODO:
- Planning by dynamic programming
- On-policy Monte-Carlo
- Off-policy Monte-Carlo
- Q-Learning
- Sarsa
- Expected Sarsa
- Off-policy n-step Sarsa
- On-policy n-step Sarsa
- n-step Expected Sarsa
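
As an illustration of the style these implementations might take, here is a minimal sketch of a single tabular Q-learning update in JAX. It is not this repository's code: the function name `q_learning_update`, the table shape, and the default `alpha` and `gamma` values are assumptions chosen for the example.

```python
import jax
import jax.numpy as jnp


@jax.jit
def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: Q(s, a) += alpha * (target - Q(s, a)).

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    # Bootstrap from the greedy value of the next state; use 0 if the episode ended.
    next_value = jnp.where(done, 0.0, jnp.max(q_table[next_state]))
    td_target = reward + gamma * next_value
    td_error = td_target - q_table[state, action]
    # JAX arrays are immutable, so .at[...].add returns an updated copy.
    return q_table.at[state, action].add(alpha * td_error)


# Hypothetical toy problem: 3 states, 2 actions, all values initialised to zero.
q = jnp.zeros((3, 2))
q = q_learning_update(q, state=0, action=1, reward=1.0, next_state=2, done=False)
```

Because the update returns a new array instead of mutating in place, it composes with `jax.jit` and could be driven by `jax.lax.scan` over a batch of transitions. Sarsa differs only in the bootstrap term, which would use the value of the action actually taken in the next state instead of the maximum.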