The current PGAME replay buffer uses jax.lax.dynamic_update_slice to add new transitions to the buffer.
However, this does not behave like a circular buffer: if a batch contains more transitions than there are free slots left in the buffer, the write does not wrap around. Because dynamic_update_slice clamps the start index so that the update fits inside the array, the batch overwrites the most recent transitions instead of the oldest ones.
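To make the overflow behaviour concrete, here is a minimal sketch on a toy 1-D buffer. The array contents, the capacity of 5, and the names `buffer`, `position`, and `batch` are illustrative assumptions, not the actual PGAME data layout.

```python
import jax
import jax.numpy as jnp

# Toy buffer (illustrative, not the PGAME layout): capacity 5,
# transitions 1..4 already stored, next write position is index 4.
buffer = jnp.array([1, 2, 3, 4, 0])
position = 4

# A batch of 3 new transitions, but only 1 free slot remains.
batch = jnp.array([5, 6, 7])

# dynamic_update_slice clamps the start index to capacity - batch_size = 2,
# so the batch lands on indices 2..4 instead of wrapping around.
result = jax.lax.dynamic_update_slice(buffer, batch, (position,))

print(result)  # [1 2 5 6 7]
```

The oldest transitions (1 and 2) survive while the most recent ones (3 and 4) are overwritten. A circular buffer would instead be expected to hold `[6, 7, 3, 4, 5]` after this write.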
Indeed, the current replay buffer, inspired by Brax's implementation, does not handle overflow correctly.
We recently looked at Brax's replay buffer implementation: they spotted this issue as well and have since fixed it. I suggest adopting their new approach. What do you think?
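For discussion, here is one possible way to get wrap-around behaviour, sketched with modular indices and an indexed update. This is an assumption about how a fix could look, not a reproduction of Brax's implementation; the function name `circular_insert` and the 1-D toy data are hypothetical.

```python
import jax
import jax.numpy as jnp


def circular_insert(data, position, batch):
    """Write `batch` into `data` starting at `position`, wrapping around.

    Hypothetical helper. Assumes batch.shape[0] <= data.shape[0];
    a larger batch would write some slots more than once.
    """
    capacity = data.shape[0]
    batch_size = batch.shape[0]
    # Target slots, wrapped modulo the capacity so the oldest entries
    # are the ones that get overwritten.
    indices = (position + jnp.arange(batch_size)) % capacity
    new_data = data.at[indices].set(batch)
    new_position = (position + batch_size) % capacity
    return new_data, new_position


# Same toy setup as in the issue description: capacity 5, write position 4.
data = jnp.array([1, 2, 3, 4, 0])
batch = jnp.array([5, 6, 7])

new_data, new_position = jax.jit(circular_insert)(data, 4, batch)
print(new_data)      # [6 7 3 4 5]
print(new_position)  # 2
```

Shapes are static here, so the function can be jitted while `position` stays a traced value. The indexed update is a scatter rather than a contiguous slice write, which may have a different performance profile from dynamic_update_slice.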