Sorry Ilya, I found a bug in my previous implementation. I called jax.device_put() when sampling from the buffer, which wastes time. After fixing this bug, the throughput is now close to that of this implementation.
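For anyone hitting the same thing, here is a minimal sketch of the fixed sampling path. It assumes a NumPy-backed replay buffer and a jitted update step; the names `buffer`, `sample_batch`, and `update` are hypothetical and not taken from the repository. The idea is that a function wrapped in `jax.jit` accepts NumPy arrays and moves them to the device when called, so an extra `jax.device_put()` inside the sampler is not needed.

```python
import jax
import jax.numpy as jnp
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical NumPy-backed replay buffer (not the repository's Dataset class).
buffer = {
    "observations": rng.normal(size=(1000, 4)).astype(np.float32),
    "rewards": rng.normal(size=(1000,)).astype(np.float32),
}


def sample_batch(batch_size):
    """Sample a batch and return plain NumPy arrays (no jax.device_put here)."""
    idx = rng.integers(0, 1000, size=batch_size)
    return {key: value[idx] for key, value in buffer.items()}


@jax.jit
def update(batch):
    # Stand-in for a real training step; it only reduces the batch to a scalar.
    return jnp.mean(batch["observations"]) + jnp.mean(batch["rewards"])


batch = sample_batch(256)
loss = update(batch)  # the jitted call transfers the NumPy inputs itself
```

Whether this matches the speedup I saw will depend on the buffer layout and hardware, so treat it as an illustration rather than a benchmark.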
implicit_q_learning/policy.py
Line 66 in 09d7002
Hi Ilya,
Many thanks for the nice work. I have a question about the `sample_actions()` function: why do we need `_sample_actions()`? Isn't it redundant? Maybe we can simply:
Further, I tried to reimplement IQL with `TrainState`. I found that using `TrainState` is slower than this implementation (~100-200 fps).