use int64 steps, check for NaN actions #4607

chriselion · 2020-10-26T23:36:21Z

Proposed change(s)

Use int64 for TensorFlow policy steps, to avoid int32 overflow after 2**31 steps.
Also use an int64 tensor for torch policy steps. The int32 overflow problem doesn't exist in torch, since it uses float tensors by default. However, if the global step is set to a large number and get_current_step() is then called, the step count is converted to int, and that conversion introduces numerical error, so the result is not the value that was set.
Check for NaN actions before they get sent back to Unity. Currently we only check the observations, but if we send NaN actions, we'll very likely get NaN observations back and "blame" the observations.
After this change, old TensorFlow models become incompatible and cannot be loaded by the new code. Torch is not affected.
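
For illustration, here is a minimal standard-library sketch of the two failure modes described above. It does not use the TensorFlow or torch code paths from this PR. The helper names `wrap_int32` and `round_trip_float32` are hypothetical, and the float case assumes the step count is held in a 32-bit float.

```python
import ctypes
import struct


def wrap_int32(value: int) -> int:
    """Interpret an integer as a signed 32-bit value, wrapping on overflow."""
    return ctypes.c_int32(value).value


def round_trip_float32(value: int) -> int:
    """Store an integer in a 32-bit float, then convert it back to int."""
    (as_float,) = struct.unpack("f", struct.pack("f", float(value)))
    return int(as_float)


max_int32 = 2**31 - 1

# An int32 step counter incremented past its maximum wraps to a negative value.
wrapped_step = wrap_int32(max_int32 + 1)

# A float32 cannot represent every integer above 2**24, so a large step count
# comes back changed after the float -> int conversion.
small_error = round_trip_float32(2**24 + 1)
large_error = round_trip_float32(max_int32)

print(wrapped_step, small_error, large_error)
```

An int64 counter avoids both problems for any realistic number of training steps.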

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

Types of change(s)

Checklist

Added tests that prove my fix is effective or that my feature works
Updated the changelog (if applicable)
Updated the documentation (if applicable)
Updated the migration guide (if applicable)

Other comments

still needs tests and handle torch

chriselion · 2020-10-26T23:39:11Z

ml-agents/mlagents/trainers/policy/tf_policy.py

+        action = run_out.get("action")
+        # Fast NaN check on the action
+        # See https://stackoverflow.com/questions/6736590/fast-check-for-nan-in-numpy for background.
+        d = np.sum(action)


There are a few ways to do this; we were using np.dot for observations at one point:

ml-agents/ml-agents-envs/mlagents/envs/brain.py

Line 245 in 81acfaf

d = np.dot(np_obs, np_obs)

but that morphed to np.mean during some refactoring:
https://github.com/Unity-Technologies/ml-agents/pull/3022/files#diff-614d11115c4d74319295ea34b44a3b7d8ab40b43865c1c3e351a8f6c9602c673R92

np.sum is one of the recommended approaches in the stackoverflow thread.
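
As a rough sketch of the trade-off (the function names here are hypothetical and not part of ml-agents): reducing to a single scalar and testing that scalar avoids building a boolean mask the size of the input. One caveat of the reduction approach is that an array containing both +inf and -inf also sums to NaN, so it can flag arrays that contain no NaN element.

```python
import numpy as np


def has_nan_by_sum(values: np.ndarray) -> bool:
    """Reduce to one scalar first, then test that scalar for NaN."""
    # Suppress the "invalid value" warning raised when inf and -inf are added.
    with np.errstate(invalid="ignore"):
        total = np.sum(values)
    return bool(np.isnan(total))


def has_nan_elementwise(values: np.ndarray) -> bool:
    """Reference check: builds a boolean mask as large as the input."""
    return bool(np.isnan(values).any())


clean = np.array([[0.1, -0.5], [0.3, 0.9]], dtype=np.float32)
bad = clean.copy()
bad[1, 0] = np.nan

# Mixed infinities: no NaN element, but the sum is NaN.
mixed_inf = np.array([np.inf, -np.inf], dtype=np.float32)

print(has_nan_by_sum(clean), has_nan_by_sum(bad), has_nan_by_sum(mixed_inf))
print(has_nan_elementwise(mixed_inf))
```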

chriselion · 2020-10-26T23:40:32Z

ml-agents/mlagents/trainers/policy/tf_policy.py

+        d = np.sum(action)
+        has_nan = np.isnan(d)
+        if has_nan:
+            raise RuntimeError("NaN action detected.")


Just raising for now; there's not much the user can do about it. We could use np.nan_to_num but that would probably trigger every step and give bad results.
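
To make the two options concrete, here is a small sketch with a hypothetical `check_action` helper that mirrors the check in the diff, next to what `np.nan_to_num` would do with the same input. The example action values are made up.

```python
import numpy as np


def check_action(action: np.ndarray) -> np.ndarray:
    """Return the action unchanged, or raise if it contains NaN."""
    if np.isnan(np.sum(action)):
        raise RuntimeError("NaN action detected.")
    return action


bad_action = np.array([0.2, np.nan, -0.7], dtype=np.float32)

# Alternative mentioned above: replace NaN with 0.0 and keep going.
# The training run continues, but the agent acts on a made-up value.
patched = np.nan_to_num(bad_action)

# Behavior chosen in this PR: fail loudly at the point where the NaN appears.
try:
    check_action(bad_action)
    message = ""
except RuntimeError as err:
    message = str(err)

print(patched, message)
```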


vincentpierre

Do not forget to cherry-pick this onto release_10 if you want these changes in the next release.


use int64 steps, check for NaN actions

95a51cd

still needs tests and handle torch

chriselion commented Oct 26, 2020


Chris Elion and others added 2 commits October 26, 2020 17:37

fix unit test

3d0607b

Check int overflow/ nan action for torch and add tests (#4646)

bcc0ba0

* check nan action for torch * step overflow test * use int tensor for global step in torch

chriselion requested review from dongruoping and vincentpierre November 13, 2020 19:04

Ruo-Ping Dong added 2 commits November 13, 2020 11:17

Merge branch 'master' into MLA-1503-int-overflow-nan

86725f3

Update changelog

0c9d4b8

vincentpierre approved these changes Nov 13, 2020


dongruoping merged commit 1ee83d8 into master Nov 13, 2020

delete-merged-branch bot deleted the MLA-1503-int-overflow-nan branch November 13, 2020 20:47

dongruoping pushed a commit that referenced this pull request Nov 13, 2020

use int64 steps, check for NaN actions (#4607)

56a4163

* use int64 steps * check for NaN actions Co-authored-by: Ruo-Ping Dong <ruoping.dong@unity3d.com>

dongruoping mentioned this pull request Nov 13, 2020

Cherry-pick int64 fix to release 10 #4654

Merged

10 tasks

vincentpierre pushed a commit that referenced this pull request Nov 16, 2020

use int64 steps, check for NaN actions (#4607) (#4654)

2753bc5

* use int64 steps * check for NaN actions Co-authored-by: Ruo-Ping Dong <ruoping.dong@unity3d.com> Co-authored-by: Chris Elion <chris.elion@unity3d.com>

github-actions bot locked as resolved and limited conversation to collaborators Nov 14, 2021
