Refactor #51

Closed
3 tasks done
Trinkle23897 opened this issue May 20, 2020 · 2 comments
Assignees
Labels
enhancement Feature that is not a new algorithm or an algorithm enhancement

Comments

@Trinkle23897
Collaborator

Trinkle23897 commented May 20, 2020

  • DQN/DDPG/TD3/SAC with n-step return, in process_fn
  • PER interface
  • Batch over batch: do not copy
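For readers unfamiliar with the first task, the following is a minimal sketch of an n-step return computed over one contiguous trajectory. It is not Tianshou's implementation: the function name `n_step_return`, its arguments, and the array layout are hypothetical and chosen only for illustration. `bootstrap_values[t]` is assumed to hold the value estimate of the state reached after step `t` (for DQN, for example, the target network's maximum Q-value).

```python
import numpy as np


def n_step_return(rewards, dones, bootstrap_values, gamma=0.99, n_step=3):
    """Return n-step targets for every step of one contiguous trajectory.

    Hypothetical helper for illustration only (not a Tianshou API).
    rewards[t], dones[t]: reward and terminal flag of step t.
    bootstrap_values[t]: value estimate of the state reached after step t.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=bool)
    bootstrap_values = np.asarray(bootstrap_values, dtype=float)
    horizon = len(rewards)
    returns = np.zeros(horizon)
    for t in range(horizon):
        g, discount, last = 0.0, 1.0, t
        # Accumulate up to n_step discounted rewards, stopping at episode end.
        for k in range(t, min(t + n_step, horizon)):
            g += discount * rewards[k]
            discount *= gamma
            last = k
            if dones[k]:
                break
        # Bootstrap from the value estimate unless the episode terminated.
        if not dones[last]:
            g += discount * bootstrap_values[last]
        returns[t] = g
    return returns
```

Note that the inner loop reads the transitions that follow step `t`, so whichever component computes the target needs the neighbouring entries of the trajectory and not only the sampled transition.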
Trinkle23897 added the enhancement label May 20, 2020
Trinkle23897 self-assigned this May 20, 2020
Trinkle23897 added a commit that referenced this issue May 27, 2020
@mtaohuang

Maybe the n-step return could be moved into the collector's sample method, controlled by an n-step parameter?
That way the policy doesn't need access to the buffer or to indices in the buffer.
I think this is the right way to do it. Can you think of any drawbacks?

@Trinkle23897
Collaborator Author

No, I think it is a part of the algorithm, like GAE.
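To make the GAE comparison concrete, here is a minimal sketch of generalized advantage estimation over one trajectory. It is an illustration under assumptions, not Tianshou's code: the function name `gae_advantages` and its arguments are hypothetical. Like an n-step return, it is a post-processing step over collected transitions whose hyperparameters (`gamma`, `gae_lambda`) belong to the learning algorithm.

```python
import numpy as np


def gae_advantages(rewards, values, next_values, dones, gamma=0.99, gae_lambda=0.95):
    """Generalized advantage estimates for one contiguous trajectory.

    Hypothetical helper for illustration only (not a Tianshou API).
    values[t] and next_values[t] are the critic's estimates for the state
    before and after step t; dones[t] marks the end of an episode.
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    next_values = np.asarray(next_values, dtype=float)
    not_done = 1.0 - np.asarray(dones, dtype=float)
    # One-step TD errors; the bootstrap term is dropped at terminal steps.
    deltas = rewards + gamma * next_values * not_done - values
    advantages = np.zeros_like(deltas)
    running = 0.0
    # Backward recursion: A_t = delta_t + gamma * lambda * A_{t+1}.
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * gae_lambda * not_done[t] * running
        advantages[t] = running
    return advantages
```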

BFAnas pushed a commit to BFAnas/tianshou that referenced this issue May 5, 2024

2 participants