Baseline implementation of recurrent PPO using truncated BPTT
deep-learning
deep-reinforcement-learning
pytorch
recurrent-neural-networks
lstm
gru
policy-gradient
recurrence
recurrent
pomdp
actor-critic
truncated
proximal-policy-optimization
ppo
on-policy
bptt
Updated Apr 28, 2024 - Jupyter Notebook