A collection of reinforcement learning (RL) methods I have implemented in JAX/Flax, Flux, and PyTorch, with particular effort put into readability and reproducibility.
### JAX/Flax

Requirements:

- Python >= 3.8
- JAX

```bash
$ git clone https://github.com/BeeGass/Agents.git
$ cd Agents/agents-jax
$ python main.py
```
### PyTorch

Requirements:

- PyTorch >= 1.10

```bash
$ cd Agents/agents-pytorch
$ python main.py
```
### Flux

Requirements:

- TODO
- TODO

```bash
$ cd Agents/agents-flux
$ # TBA
```
### Config File Template

TBA
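In the meantime, here is a hypothetical sketch of the kind of fields a config might carry. Every key below is an assumption rather than the repo's actual schema, and it is shown as a Python dict even though the template itself may end up as a YAML or TOML file:

```python
# Hypothetical config sketch -- every key here is an assumption, not the repo's schema.
config = {
    "model": "dqn",                # which agent to build
    "env": "CartPole-v1",          # gym-style environment id
    "seed": 42,                    # RNG seed, for reproducibility
    "total_steps": 100_000,        # environment steps to train for
    "batch_size": 64,
    "lr": 3e-4,                    # optimizer learning rate
    "gamma": 0.99,                 # discount factor
    "wandb": {"enabled": True, "project": "agents"},
}
```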
### Weights & Biases Integration

TBA
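Until this section is filled in, here is a minimal sketch of how Weights & Biases logging typically hooks into a training loop; the project name, config values, and logged metrics are placeholders, not this repo's conventions:

```python
import wandb

# Placeholder sketch -- project name, config, and metrics are illustrative.
run = wandb.init(project="agents", config={"lr": 3e-4, "gamma": 0.99})

for step in range(1_000):
    loss = 0.0  # stand-in for one real training/update step
    wandb.log({"loss": loss}, step=step)

run.finish()
```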
### Tabular Methods

| Model | NumPy/Vanilla | JAX/Flax | Flux | Config | Paper |
|---|---|---|---|---|---|
| Policy Evaluation | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Policy Improvement | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Policy Iteration | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Value Iteration | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| On-policy first-visit Monte Carlo prediction | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| On-policy first-visit Monte Carlo control | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Sarsa (on-policy TD control) | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
| Q-learning (off-policy TD control) | ☑ | ☐ | ☐ | ☐ | DS595-RL-Projects |
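For a concrete feel of the tabular methods above, here is a minimal NumPy sketch of the Q-learning update. The gym-style environment interface (`reset()` returning a state index, `step(a)` returning `(state, reward, done)`) is an assumption for illustration, not the repo's code:

```python
import numpy as np

# Minimal tabular Q-learning sketch (illustrative, not this repo's implementation).
def q_learning(env, n_states, n_actions, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy behavior policy.
            a = np.random.randint(n_actions) if np.random.rand() < eps else int(Q[s].argmax())
            s2, r, done = env.step(a)
            # Off-policy TD target: bootstrap from the greedy action in s2.
            target = r + gamma * (0.0 if done else Q[s2].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
    return Q
```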
### Deep Q-Learning Methods

| Model | PyTorch | JAX/Flax | Flux | Config | Paper |
|---|---|---|---|---|---|
| DQN | ☐ | ☐ | ☐ | ☐ | Link |
| DDPG | ☐ | ☐ | ☐ | ☐ | Link |
| DRQN | ☐ | ☐ | ☐ | ☐ | Link |
| Dueling-DQN | ☐ | ☐ | ☐ | ☐ | Link |
| Double-DQN | ☐ | ☐ | ☐ | ☐ | Link |
| PER | ☐ | ☐ | ☐ | ☐ | Link |
| Rainbow | ☐ | ☐ | ☐ | ☐ | Link |
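None of these are checked off yet; as a rough reference point, here is a hedged PyTorch sketch of the core DQN loss. The `q_net`/`target_net` modules and the batch layout are assumptions, not this repo's API:

```python
import torch
import torch.nn.functional as F

# Sketch of the core DQN loss (illustrative only).
# q_net and target_net are any nn.Module mapping states -> per-action Q-values.
def dqn_loss(q_net, target_net, batch, gamma=0.99):
    s, a, r, s2, done = batch  # states, int64 actions, rewards, next states, float done flags
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)  # Q(s, a) for the taken actions
    with torch.no_grad():
        # A frozen target network stabilizes the bootstrapped target (Mnih et al., 2015).
        q_next = target_net(s2).max(dim=1).values
        target = r + gamma * (1.0 - done) * q_next
    return F.smooth_l1_loss(q, target)
```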
### Policy Gradient / Actor-Critic Methods

| Model | PyTorch | JAX/Flax | Flux | Config | Paper |
|---|---|---|---|---|---|
| PPO | ☐ | ☐ | ☐ | ☐ | Link |
| TRPO | ☐ | ☐ | ☐ | ☐ | Link |
| SAC | ☐ | ☐ | ☐ | ☐ | Link |
| A2C | ☐ | ☐ | ☐ | ☐ | Link |
| A3C | ☐ | ☐ | ☐ | ☐ | Link |
| TD3 | ☐ | ☐ | ☐ | ☐ | Link |
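In the same spirit, here is a hedged sketch of PPO's clipped surrogate objective, the centerpiece of the PPO row above; the tensor names (per-action log-probabilities and advantage estimates) are assumptions for illustration:

```python
import torch

# Sketch of PPO's clipped surrogate loss (illustrative only).
def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    ratio = torch.exp(logp_new - logp_old)           # pi_new(a|s) / pi_old(a|s)
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Take the pessimistic bound so updates that move the policy too far gain nothing.
    return -torch.min(ratio * advantages, clipped * advantages).mean()
```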
### Model-Based Methods

| Model | PyTorch | JAX/Flax | Flux | Config | Paper |
|---|---|---|---|---|---|
| World Models | ☐ | ☐ | ☐ | ☐ | Link |
| Dream to Control | ☐ | ☐ | ☐ | ☐ | Link |
| Dream to Control v2 | ☐ | ☐ | ☐ | ☐ | Link |
### Citation

If you use this work, please cite it as:

```bibtex
@software{Gass_Agents_2021,
  author  = {Gass, B.A.},
  doi     = {10.5281/zenodo.1234},
  month   = {12},
  title   = {{Agents}},
  url     = {https://github.com/BeeGass/Agents},
  version = {1.0.0},
  year    = {2021}
}
```