A question about the sample_actions() #5

Closed
fuyw opened this issue Mar 10, 2022 · 3 comments

fuyw commented Mar 10, 2022

@functools.partial(jax.jit, static_argnames=('actor_def', 'distribution'))

Hi Ilya,

Many thanks for the nice work. I have a question about the sample_actions() function: why do we need _sample_actions()? Isn't it redundant?

Maybe we can simply:

import functools
import jax

@functools.partial(jax.jit, static_argnames=('actor_def',))
def sample_actions(rng, actor_def, actor_params, observations, temperature):
    # Build the action distribution, then sample from it with a fresh subkey.
    dist = actor_def.apply({'params': actor_params}, observations, temperature)
    rng, key = jax.random.split(rng)
    return rng, dist.sample(seed=key)

Further, I tried to reimplement IQL with TrainState. I found that using TrainState is slower than this implementation (~100-200 fps).

fuyw changed the title from "A question about the ``" to "A question about the sample_actions()" on Mar 10, 2022
ikostrikov (Owner) commented

@fuyw I think there is some bug on Windows otherwise: ikostrikov/jaxrl#18

That's cool! Is it this implementation? I will take a look.
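
For readers following the question, the pattern being asked about (a jitted private function plus a thin, non-jitted public wrapper) can be sketched roughly as below. This is only an illustration under stated assumptions: `toy_actor` and the `actor_fn` argument are hypothetical stand-ins for `actor_def.apply`, and a plain Gaussian replaces the real policy distribution. The thread does not say more about the Windows issue than the link above.

```python
import functools

import jax
import jax.numpy as jnp


def toy_actor(params, observations, temperature):
    # Hypothetical stand-in for actor_def.apply: returns the mean and
    # standard deviation of a Gaussian policy.
    mean = observations @ params['w']
    std = temperature * jnp.ones_like(mean)
    return mean, std


@functools.partial(jax.jit, static_argnames=('actor_fn',))
def _sample_actions(rng, actor_fn, params, observations, temperature=1.0):
    # Jitted inner function: actor_fn is static, everything else is traced.
    mean, std = actor_fn(params, observations, temperature)
    rng, key = jax.random.split(rng)
    actions = mean + std * jax.random.normal(key, mean.shape)
    return rng, actions


def sample_actions(rng, actor_fn, params, observations, temperature=1.0):
    # Plain Python wrapper: it only forwards to the jitted function.
    return _sample_actions(rng, actor_fn, params, observations, temperature)
```

Functionally the wrapper adds nothing, which is why it looks redundant; per the reply above, it appears to be there only as a workaround.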

fuyw (Author) commented Mar 10, 2022

Thanks for the reply. Yes, it is. I just refactored the code following the official Flax examples.

For simplicity, I replaced tfd with distrax; this makes no difference in my experiments.

fuyw (Author) commented Mar 10, 2022

Sorry Ilya, I found a bug in my previous implementation. I called jax.device_put() when sampling from the buffer, which wastes time. After fixing this bug, the throughput is now close to that of this implementation.
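
A rough sketch of the two variants, for anyone hitting the same slowdown: a jitted function accepts NumPy arrays directly, so an explicit `jax.device_put()` on every sampled batch is an extra step. `ToyBuffer` and `update_step` are hypothetical names, and the sketch makes no claim about how large the difference is on a given setup.

```python
import jax
import jax.numpy as jnp
import numpy as np


class ToyBuffer:
    """Hypothetical replay buffer that keeps observations in host memory."""

    def __init__(self, observations):
        self.observations = np.asarray(observations, dtype=np.float32)
        self.rng = np.random.default_rng(0)

    def sample(self, batch_size):
        # Returns a plain NumPy array; no device transfer happens here.
        idx = self.rng.integers(0, len(self.observations), size=batch_size)
        return self.observations[idx]


@jax.jit
def update_step(batch):
    # Stand-in for a real update: any jitted function of the batch.
    return jnp.mean(batch ** 2)


buffer = ToyBuffer(np.arange(40).reshape(10, 4))
batch = buffer.sample(batch_size=8)

loss_implicit = update_step(batch)                  # NumPy batch passed directly
loss_explicit = update_step(jax.device_put(batch))  # explicit transfer first
```

Both calls compute the same value; only the way the batch reaches the device differs.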

fuyw closed this as completed on Mar 10, 2022