[bug-fix] Move POCA critic to default device #5124
Conversation
vincentpierre
Did you manage to train on GPU after this fix?
It's still failing (Yamato job here: https://yamato.cds.internal.unity3d.com/jobs/497-ml-agents/tree/develop-poca-gpu/.yamato%252Fpytest-gpu.yml%2523pytest_gpu/5821068/job); need to debug.
One thing I noticed in networks.py, line 271: since we're using the default tensor type, we should avoid calling
Do you think we should enable GPU tests in all Python PRs? I think we've seen this fail in nightly CI a couple of times, and it would be nice to catch these failures earlier.
@dongruoping yep, just caught that; the fix works 👍
I agree. Spoke with @surfnerd about this; I've added it to the Release Checklist. I guess we have to weigh the cost of spinning up a GPU machine against catching these earlier. Not sure what the tradeoff is.
Can you also fix the
* Move critic to default device
* Make sure to clone onto default device
* Add some debug stuff
* Some more debug
* Fix issue
* Fix bool tensor too
Proposed change(s)
Move POCA critic to default device. Was failing Yamato GPU test without this. Needs to be cherry-picked into R15.
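For readers unfamiliar with this class of bug, here is a minimal sketch in plain PyTorch of what "moving the critic to the default device" amounts to. It is not the code from this PR: `pick_default_device` and `TinyCritic` are hypothetical names invented for illustration, and the real POCA critic is considerably more involved. The idea is only that the module's parameters and every tensor fed to it (including boolean masks) must live on the same device, otherwise the forward pass fails on a GPU machine.

```python
import torch
from torch import nn


def pick_default_device() -> torch.device:
    # Hypothetical helper: use the GPU when one is available, else the CPU.
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


class TinyCritic(nn.Module):
    # Hypothetical stand-in for a value network; not the ML-Agents critic.
    def __init__(self, obs_size: int, hidden: int = 16) -> None:
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_size, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.body(obs)


device = pick_default_device()

# Move the critic's parameters onto the chosen device.
critic = TinyCritic(obs_size=4).to(device)

# Create inputs, including bool tensors, directly on the same device.
obs = torch.ones(2, 4, device=device)
mask = torch.zeros(2, dtype=torch.bool, device=device)

values = critic(obs)
```

On a CPU-only machine this runs unchanged because everything ends up on the CPU; a mismatch only shows up when a GPU is present, which is why such failures tend to appear in GPU CI jobs rather than locally.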
Types of change(s)
Checklist
Other comments