[inductor] Fix nan-handling of max and min reductions #99881
Conversation
Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/99881. As of commit 22e315e: ✅ No failures (Dr. CI).
This adds helpers that replace triton's `minimum`, `maximum`, `min` and `max` with the correct NaN propagation. I also removed `ops.int_minimum` in favor of `ops.minimum` because we can just omit the nan-checks by checking the dtype. [ghstack-poisoned]
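To illustrate why the helpers are needed, here is a pure-Python model (an illustrative sketch, not the actual inductor/triton code): a plain comparison-based minimum silently drops a NaN on the left-hand side, because any comparison with NaN is False, while the NaN-propagating variant also selects `a` when it is NaN (using `a != a`).

```python
import math

def naive_min(a, b):
    # Any comparison with NaN is False, so a NaN in `a` is dropped.
    return a if a < b else b

def nan_propagating_min(a, b):
    # Also select `a` when it is NaN: NaN compares unequal to itself.
    return a if (a < b) or (a != a) else b
```

Either argument being NaN makes the result NaN: if `a` is NaN the `a != a` check fires, and if `b` is NaN both checks are False and `b` is returned.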
```python
@triton.jit
def is_floating(x):
    # Addition to promote scalars to tensor
    x += tl.zeros((1,), tl.int1)
```
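A toy Python model of the promotion trick above (an assumed illustration, not the actual Triton code): a bare Python scalar has no inspectable dtype inside a Triton kernel, so adding a zero-valued tensor broadcasts it to a tensor whose `dtype` the helper can then check.

```python
class FakeTensor:
    """Stand-in for a triton tensor, carrying only a value and a dtype tag."""
    def __init__(self, value, dtype):
        self.value, self.dtype = value, dtype

def promote(x):
    # Models `x += tl.zeros((1,), tl.int1)`: the value is unchanged,
    # but a scalar becomes a tensor with an inferable dtype.
    if isinstance(x, FakeTensor):
        return x
    dtype = "fp32" if isinstance(x, float) else "int32"
    return FakeTensor(x, dtype)

def is_floating(x):
    return promote(x).dtype.startswith("fp")
```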
does this change generated code/increase overhead? `minimum`/`maximum` fns are used pretty often (including in register-sensitive contexts, e.g. when fusing relu to matmul), so we should avoid increasing register pressure.
If you mean this exact line then it will get DCE'd. Or do you mean the NaN checks?
Yeah this line, if it's dce'd it's great
Just to substantiate this a bit more, here is the triton IR generated for `triton_helper.min`:
```
tt.func private @"min__fp32S1_4S__1cconstexpr[1]"(%arg0: tensor<1x4xf32>) -> tensor<1xf32> {
  %0 = "tt.reduce"(%arg0) ({
  ^bb0(%arg1: f32, %arg2: f32):
    %1 = tt.call @minimum__fp32_fp32__(%arg1, %arg2) : (f32, f32) -> f32
    tt.reduce.return %1 : f32
  }) {axis = 1 : i32} : (tensor<1x4xf32>) -> tensor<1xf32>
  tt.return %0 : tensor<1xf32>
}
tt.func private @minimum__fp32_fp32__(%arg0: f32, %arg1: f32) -> f32 {
  %0 = arith.cmpf olt, %arg0, %arg1 : f32
  %1 = tt.call @is_floating__fp32__(%arg0) : (f32) -> i1
  %2 = scf.if %1 -> (i1) {
    %4 = arith.cmpf une, %arg0, %arg0 : f32
    %5 = arith.ori %0, %4 : i1
    scf.yield %5 : i1
  } else {
    scf.yield %0 : i1
  }
  %3 = arith.select %2, %arg0, %arg1 : f32
  tt.return %3 : f32
}
tt.func private @is_floating__fp32__(%arg0: f32) -> i1 {
  %0 = tt.call @"zeros____0cconstexpr[(constexpr[1],)]_1cconstexpr[int1]"() : () -> tensor<1xi1>
  %1 = tt.splat %arg0 : (f32) -> tensor<1xf32>
  %2 = arith.uitofp %0 : tensor<1xi1> to tensor<1xf32>
  %3 = arith.addf %1, %2 : tensor<1xf32>
  %true = arith.constant true
  tt.return %true : i1
}
```
Admittedly, it's pretty ugly. However, it is massively simplified after just the inlining pass, which is the very first pass in triton's optimizer.
```
%11 = "tt.reduce"(%10) ({
^bb0(%arg5: f32, %arg6: f32):
  %16 = arith.cmpf olt, %arg5, %arg6 : f32
  %17 = arith.cmpf une, %arg5, %arg5 : f32
  %18 = arith.ori %16, %17 : i1
  %19 = arith.select %18, %arg5, %arg6 : f32
  tt.reduce.return %19 : f32
}) {axis = 1 : i32} : (tensor<1x4xf32>) -> tensor<1xf32>
```
You can see it removed the branch on `is_floating` and all of the associated code.
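The inlined combiner above can be modeled with a scalar Python sketch (assumed, for illustration): it is `select(a < b or a != a, a, b)`, folded pairwise across the reduced axis.

```python
import math
from functools import reduce

def combine(a, b):
    # select(a < b or a != a, a, b): NaN in `a` fires the a != a check,
    # NaN in `b` falls through both comparisons and is returned as-is.
    return a if (a < b) or (a != a) else b

def row_min(xs):
    # Pairwise fold, modeling tt.reduce over one row.
    return reduce(combine, xs)
```

With this combiner, a single NaN anywhere in the row makes the result NaN, which is the NaN-propagating reduction semantics this PR targets.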
Revert "[inductor] Stop using `x + tl.zeros(...)` in generated triton (#100163)". This reverts commit 5b98910.
Revert "[inductor] Fix argmin/max with duplicate values (#99920)". This reverts commit 659dcc5.
Revert "[inductor] Fix nan-handling of max and min reductions (#99881)". This reverts commit f9c3fcd.
Stack from ghstack (oldest at bottom):
- [inductor] Stop using `x + tl.zeros(...)` in generated triton #100163
- [inductor] Fix nan-handling of max and min reductions #99881 (this PR)

This adds helpers that replace triton's `minimum`, `maximum`, `min` and `max` with the correct NaN propagation. I also removed `ops.int_minimum` in favor of `ops.minimum` because we can just omit the nan-checks by checking the dtype.

cc @soumith @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @desertfire