inductor: improve the index range check for index_expr vec check #102263
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/102263
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit c980fcf. This comment was automatically generated by Dr. CI and updates every 15 minutes.
ghstack-source-id: 251417a3fc16e87771ec92085bd6c896244ab7eb Pull Request resolved: #102263
I don't think this fix is correct. The issue it is hitting comes from the fact that @peterbell10's optimisation now provides us with better index variables at compile time (which is good), and this uncovers a latent bug in the C++ code generation. In particular, the following code
pytorch/torch/_inductor/codegen/cpp.py, lines 1992 to 2009 in 4882cd0:

```python
opt_ctx: OptimizationContext = node_ctx.get_opt_ctx()
assert opt_ctx
max_expr = expr.replace(
    ir.ModularIndexing, mod_indexing_rep
).replace(ir.FloorDiv, indexing_div_rep)
min_expr = max_expr
for idx in range(len(self.ranges)):
    max_expr = sympy.maximum(
        max_expr,
        self.itervars[idx],
        sympy.Interval(0, self.ranges[idx]),
    )
    min_expr = sympy.minimum(
        min_expr,
        self.itervars[idx],
        sympy.Interval(0, self.ranges[idx]),
    )
i32_iinfo = numpy.iinfo(numpy.int32)
```
is not correct.

Here you are trying to bound the value of a multivariate function on a cube (a product of intervals). In this example, the intervals are `[0, 7] x [0, 7]` and the function is `i0**2 - 2*i0*i1 + i1**2` (that's the lowering of `(i0 - i1)**2`). The code there tries to find the maximum over this region by calling `sympy.maximum`. This fails because `sympy.maximum` only works for univariate functions.
This optimisation is performed on the Triton path in
https://github.com/pytorch/pytorch/blob/main/torch/_inductor/optimize_indexing.py
via an eager algorithm. A proper fix here would lift this logic into common.py and use it in the C++ codegen as well. This should be fairly simple, as the code there works at the IR level.
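The eager, interval-arithmetic style of bounding that the Triton path uses (and which, unlike `sympy.maximum`, handles multivariate expressions on a box) can be illustrated with a standalone sketch. This is an illustrative toy, not inductor's actual `ValueRanges` implementation; the `ValueRange` class and its operators here are hypothetical:

```python
# Toy sketch of interval (value-range) arithmetic: bound a multivariate
# expression over a box by propagating [lower, upper] intervals through
# each operation, instead of symbolically maximising with sympy.
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueRange:
    lower: float
    upper: float

    def __add__(self, other):
        # Sum of intervals: add the endpoints.
        return ValueRange(self.lower + other.lower, self.upper + other.upper)

    def __mul__(self, other):
        # Product of intervals: min/max over all endpoint products.
        products = [a * b
                    for a in (self.lower, self.upper)
                    for b in (other.lower, other.upper)]
        return ValueRange(min(products), max(products))


# Bound i0**2 - 2*i0*i1 + i1**2 over the box [0, 7] x [0, 7]:
i0 = ValueRange(0, 7)
i1 = ValueRange(0, 7)
minus_two = ValueRange(-2, -2)
bound = i0 * i0 + minus_two * i0 * i1 + i1 * i1
print(bound)  # ValueRange(lower=-98, upper=98)
```

The result is conservative (the true range of `(i0 - i1)**2` on this box is `[0, 49]`, which the computed `[-98, 98]` contains), but a sound over-approximation is all that is needed to decide whether an index fits in int32.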
I have #100549 open that touches that code as well (but doesn't change the API). I'll try to have it merged today so that we don't step on each other.
```
@@ -376,6 +376,22 @@ def fn(a):
        a = torch.randn(1, 3)
        self.common(fn, (a,))

    def test_index_propagation_issue_102065(self):
```
A slightly more concise repro is the following:

```python
import torch

@torch.compile
def fn(x):
    x = torch.arange(x.numel())
    return (x.unsqueeze(0) - x.unsqueeze(1))**2

fn(torch.randn(8))
```
Discussed with @XiaobingSuper offline. We can probably extend optimize_indexing.py to reduce the precision of the dtype of `index_expr` too.
torch/_inductor/optimize_indexing.py (outdated):

```python
if len(free_symbols) == 0:
    return ValueRanges(expr, expr)

def replace_symbols_for_deriv(expr, ignore_mod=False):
```
nit: `ignore_mod` is never used.
removed.
torch/_inductor/codegen/cpp.py (outdated):

```python
    for k, v in zip(self.itervars, self.ranges)
    if k in free_symbols
}
if not vars_ranges:
```
When would this happen? No free symbols?
Yes, there is a test case where the expr is `s0`, which is not in `vars_ranges`.
Got it. I guess it would be `ks0` in the dynamic-shape case? Not necessarily in this PR, but we may consider guarding the range to be within the int32 range to get more optimizations.
```python
    and expr <= i32_iinfo.max
    and expr >= i32_iinfo.min
)
expr_ranges = get_expr_range(expr, vars_ranges)
```
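The int32 guard above boils down to comparing an expression's computed bounds against `numpy.iinfo(numpy.int32)`. A minimal sketch of that check, with a hypothetical `fits_int32` helper that is not part of the PR:

```python
# Sketch: an index expression can be emitted with an int32 dtype only if
# its whole value range [lo, hi] stays within the int32 limits.
import numpy

i32_iinfo = numpy.iinfo(numpy.int32)


def fits_int32(lo, hi):
    # Hypothetical helper: True iff [lo, hi] is contained in
    # [int32 min, int32 max].
    return i32_iinfo.min <= lo and hi <= i32_iinfo.max


print(fits_int32(0, 49))     # True: fits comfortably in int32
print(fits_int32(0, 2**31))  # False: exceeds int32 max (2**31 - 1)
```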
Does `get_expr_range` assume that all free symbols in `expr` exist in `vars_ranges`? Could `expr` contain symbols that are not part of the itervars, like `tmp` vars from indirect indexing?
No, there is no such assumption that all free symbols in `expr` exist in `vars_ranges`.
```python
    for k, v in zip(self.itervars, self.ranges)
    if k in free_symbols
}
if not vars_ranges or len(vars_ranges) != len(free_symbols):
```
Is this equivalent to `any(x.startswith("tmp") for x in free_symbols)`? In other words, are we asking here whether the expression has indirect indexing, since we don't have bounds at hand for those symbols?
If that's the case, using this other, more explicit condition and leaving a comment would help readers understand the logic here.
I don't think it is equivalent to `any(x.startswith("tmp") for x in free_symbols)`. There is a case where the expr is just a kernel input (`s0`), which is not related to `itervars` and is not buffer indexing.
Right, then the same question applies with `startswith("tmp") or startswith("s")`. It'd be nice to know more clearly which cases we know how to treat and which we don't.
Yes, I agree that knowing more clearly which cases can occur would be better. Let me add this as a TODO.
Approving to unblock, but I think a better check here would be to make sure that `free_symbols` is a subset of `self.itervars`, i.e., that we have all the information needed to compute the upper bound on the index explicitly.
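The suggested subset check could look something like the following sketch. This is an illustration of the idea only, not the actual PR code; the `all_symbols_known` helper and the concrete symbol names are hypothetical:

```python
# Sketch: instead of comparing dict lengths, check directly that every
# free symbol of the index expression is one of the loop itervars, so we
# are guaranteed to have a range for each symbol we need to bound.
import sympy

i0, i1, s0 = sympy.symbols("i0 i1 s0")
itervars = [i0, i1]  # loop induction variables with known ranges


def all_symbols_known(expr, itervars):
    # True iff expr's free symbols are a subset of the itervars.
    return expr.free_symbols <= set(itervars)


print(all_symbols_known((i0 - i1) ** 2, itervars))  # True: only itervars
print(all_symbols_known(s0 + i0, itervars))         # False: s0 is a kernel input
```

This phrasing also handles both problem cases discussed above (`tmp` symbols from indirect indexing and kernel inputs like `s0`) with a single explicit condition.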
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…orch#102263) Fix pytorch#102065. Pull Request resolved: pytorch#102263 Approved by: https://github.com/lezcano, https://github.com/peterbell10, https://github.com/jgong5
Stack from ghstack (oldest at bottom):
Fix #102065.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10