Return atomic #41028

ngimel · 2020-07-06T19:18:11Z

Per title. Returning the old value is not currently used in the PyTorch codebase, but it is a legitimate use case: we have extensions that want to do this and are forced to roll their own atomic implementations for non-standard types. Whether an atomic op returns the old value should not affect performance, since the compiler can generate the appropriate code depending on whether the return value is used: https://godbolt.org/z/DBU_UW.
Atomic operations for non-standard integer types (1-, 2-, and 8-byte widths) are left as is, with a void return.
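
For readers unfamiliar with the pattern, the following is a minimal host-side C++ sketch of a compare-and-swap loop that adds to a double and returns the old value. It is an analogy only, not the code from this PR: it uses std::atomic instead of the CUDA atomicCAS intrinsic, and the names atomic_add_double, to_bits, and from_bits are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers (not PyTorch API): reinterpret a double as raw bits and back.
inline std::uint64_t to_bits(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof(bits));
    return bits;
}

inline double from_bits(std::uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof(d));
    return d;
}

// Hypothetical sketch: atomically adds `val` to the double stored bit-for-bit
// in `*address` and returns the value that was stored before the add.
inline double atomic_add_double(std::atomic<std::uint64_t>* address, double val) {
    assert(address != nullptr);
    std::uint64_t old_bits = address->load(std::memory_order_relaxed);
    double old_val;
    std::uint64_t new_bits;
    do {
        old_val = from_bits(old_bits);
        new_bits = to_bits(old_val + val);
        // On failure, compare_exchange_weak refreshes old_bits with the
        // current contents, so the next iteration retries with fresh data.
    } while (!address->compare_exchange_weak(old_bits, new_bits,
                                             std::memory_order_relaxed));
    return old_val;
}
```

A caller that needs the previous value reads the return value; a caller that does not can simply ignore it, which is the situation the godbolt link above examines for nvcc.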

facebook-github-bot

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2020-07-08T20:12:49Z

@ngimel merged this pull request in 58d7d91.

jeffdaily · 2020-07-08T21:03:56Z

@ngimel would you support a PR to have a non-returning interface in addition to your change here? Since atomic return values are not currently used in the PyTorch codebase, it would be self-documenting to use something like atomicAddNoReturn. Specifically for ROCm, we can optimize better when the user opts in like this. Forcing atomics to return a value prevents our optimization today.

Which extensions are you referring to that utilize atomic return values?

ngimel · 2020-07-08T21:10:32Z

Some internal custom ops (mostly optimizers) want the old value, and they are compiled for double and half because that's how the dispatch macros work, so they can't use the built-in atomicAdd and are forced to copy-paste code from THCAtomic, adding a return value. Returning the old value is a legitimate request, so having no mechanism for it is not ideal.
Sure, we can have atomicAddNoReturn. Out of curiosity, what's preventing ROCm from doing the same thing nvcc does: compiling to the fast instruction if the return value is ignored, and to the slow instruction if it's used? I posted a godbolt link in the PR description illustrating the nvcc behavior.
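
To make the two interfaces discussed above concrete, here is a small host-side C++ sketch of a returning add next to an explicitly non-returning one. It is only an illustration under assumptions: it uses std::atomic with integral types rather than GPU atomics, and the names atomic_add, atomic_add_no_return, and usage_example are hypothetical, not the functions added in this PR or in #60607.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical returning variant: adds `val` and hands back the old value.
template <typename T>
inline T atomic_add(std::atomic<T>& target, T val) {
    return target.fetch_add(val, std::memory_order_relaxed);
}

// Hypothetical non-returning variant: the void signature documents at the
// call site that the old value is never needed.
template <typename T>
inline void atomic_add_no_return(std::atomic<T>& target, T val) {
    (void)target.fetch_add(val, std::memory_order_relaxed);
}

// Usage example: the first call consumes the old value, the second does not.
inline void usage_example() {
    std::atomic<int> counter(5);
    const int old_value = atomic_add(counter, 3);
    assert(old_value == 5);
    atomic_add_no_return(counter, 2);
    assert(counter.load() == 10);
}
```

The design point is that the non-returning name makes the caller's intent explicit, so a backend does not have to infer from usage whether the old value is required.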

Summary: Enables an important performance optimization for ROCm, in light of the discussion in #41028.
CC jithunnair-amd sunway513
Pull Request resolved: #60607
Reviewed By: jbschlosser
Differential Revision: D29409894
Pulled By: ngimel
fbshipit-source-id: effca258a0f37eaefa35674a7fd19459ca7dc95b

Natalia Gimelshein added 2 commits July 6, 2020 12:08

allow atomic ops to return old value

5a6eb59

fix return types

5d97f98

ngimel requested review from ezyang and mcarilli July 6, 2020 19:18

ezyang approved these changes Jul 7, 2020

facebook-github-bot reviewed Jul 8, 2020

facebook-github-bot closed this in 58d7d91 Jul 8, 2020

facebook-github-bot added the merged label Jul 8, 2020

mruberry added the Merged label Oct 28, 2020

jeffdaily mentioned this pull request Jun 23, 2021

use explicitly non-returning GPU atomics #60607

Closed
