Main difference from DrQ? #1

Closed
miriaford opened this issue May 1, 2020 · 3 comments

Comments

@miriaford

Thanks for sharing the code!

I wonder what's the main algorithmic difference between DrQ-SAC and RAD-SAC? You only mentioned DrQ in passing in the paper, but didn't elaborate. Thanks!

@MishaLaskin
Owner

RAD and DrQ are concurrent (published 2 days apart). Main difference:

In addition to data augmentation, DrQ modifies the underlying SAC algorithm by averaging the Q-functions (both Q and target Q) over multiple augmentations. RAD does not modify the underlying algorithm at all: it achieves the same results with data augmentation alone and can plug and play with any RL algorithm (we also show that it works with PPO, with SOTA test-time generalization on ProcGen).

RAD also extensively ablates a variety of data augmentations and provides insight into why random crop works well.
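
For readers unfamiliar with the idea, here is a minimal, dependency-free sketch of random crop used as a plug-and-play augmentation: observations are cropped before they reach the RL update, and the update itself is left untouched. It is not taken from the RAD code. The function names, the nested-list image format, and the sizes are assumptions chosen for illustration.

```python
import random

def random_crop(image, out_size, rng):
    """Return a random out_size x out_size window of a 2-D image (list of rows)."""
    height, width = len(image), len(image[0])
    if out_size > height or out_size > width:
        raise ValueError("out_size must not exceed the image size")
    # random.Random.randint uses inclusive bounds on both ends.
    top = rng.randint(0, height - out_size)
    left = rng.randint(0, width - out_size)
    return [row[left:left + out_size] for row in image[top:top + out_size]]

def augment_batch(batch, out_size, rng):
    """Crop every image independently; nothing else in the pipeline changes."""
    return [random_crop(image, out_size, rng) for image in batch]

# Hypothetical batch: four 10x10 images whose pixel value encodes (row, column).
rng = random.Random(0)
batch = [[[r * 10 + c for c in range(10)] for r in range(10)] for _ in range(4)]
cropped = augment_batch(batch, 8, rng)
```

Because the augmentation only touches the sampled observations, the same helper could sit in front of a SAC or a PPO update without changes to either.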

@denisyarats

denisyarats commented May 1, 2020

One of the authors of DrQ here.

  1. In DrQ, we demonstrate that both data augmentation and Q-function regularization are helpful. In Figure 1 we show that just with data augmentation you can achieve SOTA:
    [image: Figure 1]

  2. We then show (in Figure 2) that our additional Q-function regularization provides a further boost:
    [image: Figure 2]

  3. Finally, our results are still better across the board (and we run 10 seeds, not 3 :)). Here is the table:

[image: results table]

Data augmentation alone is not enough for SOTA performance on some harder tasks.
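
As a rough illustration of what Q-function regularization over augmentations can look like, the sketch below averages the target value over `k` augmented views of the next observation. It is a simplification under stated assumptions, not the DrQ implementation: it omits the SAC entropy term, the minimum over twin critics, and the matching average over the Q estimate itself. All names (`averaged_target`, `q_target`, `augment`) are hypothetical.

```python
import random

def averaged_target(q_target, augment, next_obs, reward, discount, k, rng):
    """Bootstrap target whose next-state value is the mean over k augmented views."""
    if k < 1:
        raise ValueError("k must be at least 1")
    values = [q_target(augment(next_obs, rng)) for _ in range(k)]
    return reward + discount * sum(values) / k

# Toy stand-ins (assumptions): the "critic" sums the observation and the
# "augmentation" adds a small random shift to every entry.
def toy_q(obs):
    return sum(obs)

def toy_shift(obs, rng):
    delta = rng.uniform(-0.1, 0.1)
    return [x + delta for x in obs]

def identity(obs, rng):
    return obs

rng = random.Random(0)
smoothed = averaged_target(toy_q, toy_shift, [1.0, 2.0], 1.0, 0.5, 8, rng)
plain = averaged_target(toy_q, identity, [1.0, 2.0], 1.0, 0.5, 1, rng)
```

With `k = 1` and an identity augmentation, the function reduces to the ordinary one-step target, which is the sense in which this is an addition on top of plain data augmentation.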

@TaoHuang13

@denisyarats When comparing DrQ with RAD, do you convert the 'step' in RAD eval.log to 'environment step'?
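
For context, the conversion being asked about can be sketched as below. It assumes each logged step is a single agent decision that the environment repeats `action_repeat` times. Whether the 'step' field in RAD's eval.log follows that convention is the open question here, so check it against the repository before relying on this. The helper name and the sample numbers are hypothetical.

```python
def to_environment_steps(logged_step, action_repeat):
    """Convert a logged agent step into an environment step count.

    Assumption: every logged step is one agent decision that the
    environment repeats `action_repeat` times.
    """
    if action_repeat < 1:
        raise ValueError("action_repeat must be a positive integer")
    return logged_step * action_repeat

# Hypothetical (step, episode_reward) pairs, not real results.
curve = [(10_000, 120.0), (20_000, 310.0)]
converted = [(to_environment_steps(step, 4), reward) for step, reward in curve]
```

Two learning curves are only comparable once both are plotted against the same kind of step.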
