[inductor] Use shape env bounds in inductor bounds.py (#106175) #106568
Conversation
Summary: If constrained ranges are available, use them in bounds.py before value range analysis (to enable Int64 -> Int32 optimization). Test Plan: New unit test in test_torchinductor.py to mark a tensor as dynamic, then constrain with constrain_as_size (as outlined in #106175) [ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/106568
Note: Links to docs will display an error until the docs builds have been completed. ✅ As of commit 19d1452: 3 unrelated failures, UNSTABLE - the following jobs failed but were likely due to flakiness present on trunk and have been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
awesome 🚀 I have a couple comments, but looks good!
torch/_inductor/bounds.py
Outdated
tracing_context = torch._guards.TracingContext.get()
if tracing_context:
    subs = [
        (k, v.upper)
We should have separate substitutions for the upper and lower bounds.
@eellison - thanks; I meant to ask you about the lower bound handling. I might not understand exactly what we're doing here. If we have a range that's a symbolic expression of 4 * s0, for example, but we also have a constrained var_range available like s0: ValueRanges(lower=2, upper=100) ... I guess I assumed that only the upper bound was relevant for the int64 -> int32 optimization and we could use 4 * 100 = 400 for the upper bound. How is the lower bound relevant exactly? Do you just mean that we can also further constrain the lower bound on this line (and not assume 0)? ValueRanges(0, s - 1)
If x has bounds [a, b], then -x has bounds [-b, -a], so you need to track both.
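The point about tracking both endpoints can be sketched with a toy interval helper (a standalone illustration with made-up names, not the actual ValueRanges class):

```python
def neg_bounds(lo, hi):
    # If x is in [lo, hi], then -x is in [-hi, -lo]: the input's upper
    # bound becomes the output's lower bound, so substituting only the
    # upper bound would lose information.
    return (-hi, -lo)

def scale_bounds(c, lo, hi):
    # Multiplying by a constant flips the endpoints when c < 0, so the
    # result's lower bound depends on the input's *upper* bound.
    return (c * lo, c * hi) if c >= 0 else (c * hi, c * lo)
```

For s0 in [2, 100], -s0 is in [-100, -2] and -4 * s0 is in [-400, -8]; both results need the upper bound of s0 to compute their lower bound, which is why one substitution per endpoint is not enough.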
test/inductor/test_torchinductor.py
Outdated
return a[y.to(torch.int64)]

def fn2(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    from torch._export.constraints import constrain_as_size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this can be a top-level import.
The same should be done in bound_sympy, as that will also affect the bounds we use to elide guards. Even more, if var_ranges can be an expression, then we can probably just implement this in bound_sympy and delegate the calculation of the bounds in replacement_vars to bound_sympy.
torch/_inductor/bounds.py
Outdated
]
self.replacement_vals = {}
for k, v in loop_body.var_ranges.items():
    s = v.subs(subs)
Is var_ranges just a symint, or can it be an expression? If it can be an expression, you should use bound_sympy to compute the actual bounds given the subs. If it's not, we can simply look up v in subs and get its range if it has free_symbols.
@lezcano (Pardon my being a noob here.) loop_body.var_ranges can contain expressions, e.g., I see 4 * s0. Given that, are you simply suggesting to use the bound_sympy helper to evaluate the bounds instead of the subs method I've used here? Playing around with that helper function, it seems like it indeed probably does what I want. But it sounds like you anticipate some modifications are needed in that helper first? If so, I don't understand that part.
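As a sanity check of the arithmetic in the 4 * s0 example, here is a minimal standalone sketch of bounding a linear expression from per-symbol ranges (a toy version of the idea only; the real logic lives in bound_sympy, and bound_linear is a made-up name):

```python
def bound_linear(coeffs, const, ranges):
    """Bound sum(c * s for s, c in coeffs.items()) + const, given
    per-symbol ranges {s: (lo, hi)}, via interval arithmetic."""
    lo = hi = const
    for sym, c in coeffs.items():
        s_lo, s_hi = ranges[sym]
        if c >= 0:
            lo += c * s_lo
            hi += c * s_hi
        else:  # a negative coefficient swaps the endpoints
            lo += c * s_hi
            hi += c * s_lo
    return (lo, hi)

# 4 * s0 with s0: ValueRanges(lower=2, upper=100) gives [8, 400].
```

With such a bound in hand, the int64 -> int32 decision reduces to checking the resulting interval against the int32 range.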
Have a look at how bound_sympy is implemented. Before calling into the bulk of the logic, it tries to do something very similar to what you're doing here. Just move the logic you implemented here to bound_sympy and you should be able to avoid repeating it :)
Thanks. I might be naive, but it looks like bound_sympy might "just work". I'll put up a new rev to show you what I mean.
torch/_inductor/bounds.py
Outdated
context = torch._guards.TracingContext.get()
ranges = context.fake_mode.shape_env.var_to_range if context else {}
We want to move this to bound_sympy so that other users of bound_sympy also get these better bounds for free.
@lezcano I don't understand this code base (obviously), so I just want to double-check: are you sure that's always appropriate? Purely from an API perspective, it seems a little strange to silently add new ValueRanges beyond what were explicitly provided by the caller. Do all usages of bound_sympy want these ranges from the tracing context included?
The tracing context should only be set if it's valid to use. If it is set, we can use it.
This has some compilation-time issues, but I think those may already exist, so maybe it's worth addressing them in a follow-up. Will defer to @lezcano.
torch/utils/_sympy/value_ranges.py
Outdated
# If there's a tracing context, augment available constrained ranges.
context = torch._guards.TracingContext.get()
if context:
This will be O(n^2) if we do it for every symbol. Although it looks like bound_sympy already has O(n^2) behavior, e.g. pytorch/torch/utils/_sympy/value_ranges.py, lines 559 to 562 in 3d0c767:

# Give some bounds to the free variables via their SymPy assumptions
# TODO A better way of doing this would be to assign them a range upon creation, as
# size variables can come with a lower bound of 2, as we specialise on 0 and 1
ranges = deepcopy(ranges)

and the calculation of bounds happens recursively without caching the results (as far as I can tell):
https://github.com/pytorch/pytorch/blob/main/torch/utils/_sympy/interp.py#L89-L93
I don't know what you mean by "calculating the bounds happens recursively". As far as I can tell, I don't think it does.
Something that can be done better is avoiding the deepcopy and simply merging the dictionaries as {**x, **y}. This would avoid a few unnecessary copies. Same for the unbounded_vars bit, where we can avoid creating a new dictionary if we created it here.
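The merge suggestion in plain Python (a generic illustration, not the PR's actual code):

```python
from copy import deepcopy

base = {"s0": (2, 100)}
extra = {"s1": (0, 10)}

# deepcopy walks and duplicates every value before the merge:
slow = deepcopy(base)
slow.update(extra)

# {**x, **y} builds a single new dict and shares the existing
# (immutable) range tuples, skipping the per-value copies:
fast = {**base, **extra}

assert slow == fast
```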
Recursive case:

args = [sympy_interp(analysis, env, arg) for arg in expr.args]  # type: ignore[arg-type]
handler_name = handlers()[expr.func]
handler = getattr(analysis, handler_name)

We're recursively computing the bounds of symbols without storing them in the environment.
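The caching concern can be illustrated with a toy recursive evaluator over a nested expression tree (hypothetical code, not the actual sympy_interp): without storing results back into an environment or cache, shared subtrees get re-evaluated on every visit.

```python
def eval_bounds(expr, env, cache=None):
    """expr is either a symbol name (str) or a tuple (op, *args).
    Bounds for each subtree are memoized in `cache`; dropping the
    cache makes shared subtrees cost O(n) each, O(n^2) overall."""
    if cache is None:
        cache = {}
    if isinstance(expr, str):
        return env[expr]        # leaf: look up the symbol's range
    if expr in cache:
        return cache[expr]      # already computed for this subtree
    op, *args = expr
    bounds = [eval_bounds(a, env, cache) for a in args]
    if op == "add":
        result = (sum(b[0] for b in bounds), sum(b[1] for b in bounds))
    elif op == "neg":
        ((lo, hi),) = bounds
        result = (-hi, -lo)
    else:
        raise NotImplementedError(op)
    cache[expr] = result
    return result
```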
But that doesn't call bound_sympy
Just two small comments, otherwise it LGTM
@@ -4520,6 +4520,14 @@ def fn(*args, **kwargs):
register_inplace(aten.__ixor__, aten.__xor__)


@register_lowering(aten.sym_constrain_range)
def sym_constrain_range(a, min, max):
This is not used?
@lezcano, I believe this part is needed. Otherwise we get:
...
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
MissingOperatorWithoutDecomp: missing lowering
target: aten.sym_constrain_range.default
Nice addition! Thank you!
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov