Bug fix for SAC-discrete. #60

Merged: 3 commits, Sep 18, 2020


toshikwa (Contributor) commented Sep 17, 2020

Hi, thank you for your great work :)

I fixed some bugs related to #54 and #56.
I tested the changes with CartPole.py and training converged more stably.

Changes

  • Fix min_qf_next_target so that it properly computes the expectation over the policy.
  • Fix policy_loss in the same way.
  • Fix max_probability_action so that it takes the argmax for each sample rather than over the entire batch. (This doesn't actually affect the algorithm, though.)
  • Fix the device used by Replay_Memory so that SAC-Discrete can be trained on CPU.
  • Fix errors caused by PyTorch updates (torch>=1.4.0 works now!!).
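
To illustrate the first three items, here is a minimal sketch of what "expectation over the policy" and "argmax over each sample" can look like for a discrete-action SAC. This is not the code from this PR: the function names, argument names, and tensor shapes (batch, n_actions) are assumptions made for the example, and the exact way the repo wires in target networks and the temperature may differ.

```python
import torch


def soft_state_value(action_probs, log_action_probs, q1, q2, alpha):
    """Per-sample expectation over the policy (hypothetical helper).

    Computes sum_a pi(a|s) * (min(Q1, Q2)(s, a) - alpha * log pi(a|s)).
    All inputs are assumed to have shape (batch, n_actions).
    """
    min_q = torch.min(q1, q2)  # element-wise minimum of the two critics
    weighted = action_probs * (min_q - alpha * log_action_probs)
    # Sum over the action dimension only, keeping one value per sample.
    return weighted.sum(dim=1, keepdim=True)


def expected_policy_loss(action_probs, log_action_probs, q1, q2, alpha):
    """Policy loss as an expectation over actions, averaged over the batch."""
    min_q = torch.min(q1, q2)
    per_sample = (action_probs * (alpha * log_action_probs - min_q)).sum(dim=1)
    return per_sample.mean()


def per_sample_argmax(action_probs):
    """Greedy action for each sample: argmax along dim=1, not over the flattened batch."""
    return torch.argmax(action_probs, dim=1)
```

In this sketch the value passed to the Bellman target would be `soft_state_value` evaluated on next-state quantities from the target critics. Calling `torch.argmax(action_probs)` without `dim` would instead return a single index into the flattened batch, which is the kind of mistake the third item describes.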

Thanks :)

p-christ (Owner) commented

thanks a lot

@p-christ p-christ merged commit 8a0dd27 into p-christ:master Sep 18, 2020
toshikwa (Contributor, Author) commented

Hi @p-christ

Thank you for merging.
Could you close the relevant issues (#54, #56)?

Also, I noticed that this repo has become somewhat outdated and has a few bugs.
May I fix them and clean up the code?

Thanks :)

p-christ (Owner) commented Sep 20, 2020 via email
