@ervteng ervteng commented Nov 18, 2020

Proposed change(s)

In certain environments (particularly those with large action spaces), discrete actions can occasionally become NaN. This is especially evident in the Match3 environment.

This PR adds a small epsilon to any division or log whose argument may come very close to zero, particularly in MultiCategoricalDistribution and CategoricalDistInstance.
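
For readers unfamiliar with the failure mode, here is a minimal sketch in plain Python (not the PR's actual torch code) of how a zero probability turns into NaN and how an epsilon avoids it. The `EPSILON` value and the helper names `normalize`, `safe_log_probs`, and `entropy` are assumptions made for illustration.

```python
import math

EPSILON = 1e-7  # assumed value for illustration; the PR text does not state the constant


def normalize(raw_probs, eps=0.0):
    """Divide each entry by the sum; eps keeps the denominator away from zero."""
    total = sum(raw_probs) + eps
    return [p / total for p in raw_probs]


def safe_log_probs(raw_probs, eps=EPSILON):
    """Take log(p + eps) so that a zero probability gives a finite log value."""
    return [math.log(p + eps) for p in normalize(raw_probs, eps)]


def entropy(probs, log_probs):
    """Entropy as -sum(p * log p), computed from precomputed log values."""
    return -sum(p * lp for p, lp in zip(probs, log_probs))


raw = [0.0, 3.0, 1.0]
probs = normalize(raw)

# Naive path: tensor libraries return -inf for log(0). Plain math.log(0.0)
# raises instead, so -inf is substituted by hand to mimic that behavior.
naive_log_probs = [math.log(p) if p > 0 else float("-inf") for p in probs]
naive_entropy = entropy(probs, naive_log_probs)  # 0.0 * -inf is NaN

# Epsilon path: every log value is finite, so the entropy is finite too.
safe_entropy = entropy(probs, safe_log_probs(raw))

# Degenerate input: an all-zero vector no longer divides by zero.
all_zero_log_probs = safe_log_probs([0.0, 0.0, 0.0])
```

The cost of the epsilon is a small bias in the log values, which is usually negligible next to the probabilities that matter.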

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

@ervteng ervteng requested a review from chriselion November 18, 2020 00:13
normalized_probs = raw_probs / torch.sum(raw_probs, dim=-1).unsqueeze(-1)
normalized_logits = torch.log(normalized_probs + EPSILON)
return normalized_logits
# Zero out masked logits, then subtract a large value

Can you link to the paper/blog post that this was adapted from?
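
As a rough illustration of what the "zero out masked logits, then subtract a large value" comment describes, the sketch below shows one plausible reading in plain Python. It is not taken from the PR's diff: the constant `LARGE`, the helpers `mask_logits` and `softmax`, and the convention that the mask holds 1.0 for allowed actions and 0.0 for blocked ones are all assumptions.

```python
import math

LARGE = 1e8  # assumed magnitude; large enough that exp() underflows to 0.0


def mask_logits(logits, allow_mask):
    """Zero out blocked logits, then push them down by a large finite value.

    allow_mask holds 1.0 for allowed actions and 0.0 for blocked ones.
    """
    return [l * m - LARGE * (1.0 - m) for l, m in zip(logits, allow_mask)]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, -1.0, 0.5]
allow_mask = [1.0, 0.0, 1.0]  # the second action is blocked

masked = mask_logits(logits, allow_mask)
probs = softmax(masked)
```

Because the penalty is finite, no infinities enter the computation, yet the blocked action still ends up with probability zero after the softmax.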

@ervteng ervteng merged commit 993f822 into master Nov 18, 2020
@delete-merged-branch delete-merged-branch bot deleted the develop-fix-nan-merge branch November 18, 2020 22:30
ervteng pushed a commit that referenced this pull request Nov 18, 2020
…ing Match3 (#4664)

* match3 settings

* Add epsilon to log

* Add another epsilon

* Revert match3 configs

* NaN-free masking method

* Add comment for paper

* Add comment for paper

Co-authored-by: Chris Elion <chris.elion@unity3d.com>
@ervteng ervteng mentioned this pull request Nov 18, 2020
10 tasks
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 19, 2021