Refactor #51

Trinkle23897 · 2020-05-20T09:00:17Z

DQN/DDPG/TD3/SAC with n-step return, in process_fn
PER interface
Batch over batch: do not copy


mtaohuang · 2020-06-07T06:37:56Z

Maybe the n-step return could be moved into the collector's sample method, with an n-step parameter?
That way, the policy wouldn't need access to the buffer or to indices in the buffer.
I think this should be the way to do it. Can you think of any drawbacks?

Trinkle23897 · 2020-06-07T08:02:14Z

No, I think it is a part of the algorithm, like GAE.
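
For context on what is being discussed, here is a minimal sketch of an n-step return computed over a time-ordered slice of transitions. It is not the `compute_nstep_returns` code from the commits referenced in this issue; the function name `nstep_return`, its arguments, and the convention that `target_q[t]` holds the bootstrap value for the state reached after step `t` are all assumptions made for illustration.

```python
import numpy as np


def nstep_return(rew, done, target_q, gamma=0.99, n_step=3):
    """Illustrative n-step return (hypothetical helper, not the library's API).

    rew, done: length-T arrays in time order.
    target_q[t]: assumed bootstrap value for the state reached after step t,
                 e.g. max_a Q_target(s_{t+1}, a) in DQN.
    """
    rew = np.asarray(rew, dtype=float)
    done = np.asarray(done, dtype=bool)
    target_q = np.asarray(target_q, dtype=float)
    T = len(rew)
    returns = np.zeros(T)
    for t in range(T):
        g, discount = 0.0, 1.0
        last, terminated = t, False
        # Accumulate up to n_step discounted rewards, stopping at episode end
        # or at the end of the available data.
        for k in range(t, min(t + n_step, T)):
            g += discount * rew[k]
            discount *= gamma
            last = k
            if done[k]:
                terminated = True
                break
        # Bootstrap from the value estimate only if the episode did not end.
        if not terminated:
            g += discount * target_q[last]
        returns[t] = g
    return returns


example = nstep_return(
    rew=[1, 1, 1, 1],
    done=[False, False, False, True],
    target_q=[10, 10, 10, 10],
    gamma=0.5,
    n_step=2,
)
```

Because the target depends on the bootstrap value, which in turn comes from the policy's own target network, the computation needs both consecutive buffer entries and the policy. That is the tension between doing it at sampling time and doing it inside `process_fn`.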

Trinkle23897 added the enhancement label (Feature that is not a new algorithm or an algorithm enhancement) May 20, 2020

Trinkle23897 self-assigned this May 20, 2020

Trinkle23897 added a commit that referenced this issue May 27, 2020

item3 of #51

de556fd

Trinkle23897 added a commit that referenced this issue Jun 2, 2020

compute_nstep_returns (item 2 of #51)

ff81a18

Trinkle23897 closed this as completed in dc451df Jun 3, 2020

BFAnas pushed a commit to BFAnas/tianshou that referenced this issue May 5, 2024

item3 of thu-ml#51

9eaaa5c

BFAnas pushed a commit to BFAnas/tianshou that referenced this issue May 5, 2024

compute_nstep_returns (item 2 of thu-ml#51)

0c883c7

BFAnas pushed a commit to BFAnas/tianshou that referenced this issue May 5, 2024

nstep all (fix thu-ml#51)

28a1e82
