Complete revamp of float/promotion sympy handling #126905

Closed
wants to merge 29 commits

Conversation


@ezyang ezyang commented May 22, 2024

Stack from ghstack (oldest at bottom):

At a high level, the idea behind this PR is:

  • Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication, etc., in sympy; instead, we always promote to the appropriate operator. (However, equality is currently not handled correctly.)
  • Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers.
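
The strict-typing invariant can be pictured with a small standalone sketch (a hypothetical illustration only, not the actual torch.utils._sympy.value_ranges.ValueRanges implementation):

```python
import sympy
from dataclasses import dataclass

# Hypothetical sketch of the invariant: both bounds must agree on int-ness; a range
# mixing an integer endpoint with a float endpoint is rejected up front.
@dataclass(frozen=True)
class StrictRange:
    lower: sympy.Expr
    upper: sympy.Expr

    def __post_init__(self):
        assert self.lower.is_integer == self.upper.is_integer, (
            f"mixed int/float bounds: {self.lower!r}, {self.upper!r}"
        )

StrictRange(sympy.Integer(0), sympy.Integer(8))    # ok: int bounds
StrictRange(sympy.Float(0.0), sympy.Float(8.0))    # ok: float bounds
# StrictRange(sympy.Integer(0), sympy.Float(8.0))  # would violate the invariant
```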

The story begins in torch/utils/_sympy/functions.py. Here, I make some changes to how we represent certain operations in sympy expressions:

  • FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc (see the sketch after this list). Additionally, we remove the "divide out addition by gcd" optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing).
  • ModularIndexing, LShift, RShift now assert they are given integer inputs.
  • Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating-point operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver.
  • TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows us to eventually generate accurate code for Python-semantics IntTrueDiv, which is written in a special way to preserve precision when the inputs are >= 2**53, beyond what you would get by first coercing the integers to floats and then doing true division.
  • Trunc is split into TruncToFloat and TruncToInt.
  • Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result.
  • RoundDecimal is updated to consistently return a float.
  • Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing)
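
To make the operator split concrete, here is a minimal sketch of how expressions might be built under the new scheme. It assumes the functions are importable from torch.utils._sympy.functions as described above; exact argument validation may differ:

```python
import sympy
# Assumed import path, per the file discussed above.
from torch.utils._sympy.functions import FloorDiv, FloatTrueDiv, TruncToFloat, ToFloat

i = sympy.Symbol("i", integer=True)
f = sympy.Symbol("f", real=True)

int_floordiv = FloorDiv(i, 2)                        # FloorDiv is now integer-only
float_floordiv = TruncToFloat(FloatTrueDiv(f, 2.0))  # float "floor division": truediv, then trunc (per the bullet above)
promoted_sum = ToFloat(i) + f                        # explicit coercion; no implicit int/float mixing
```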

In torch/__init__.py, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations. Also, we modify torch.sym_min and torch.sym_max to have promotion semantics (if one argument is a float, the return result is always a float), making them inconsistent with builtins.min/max but making it possible to do type analysis without runtime information.
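
The sym_min/sym_max promotion rule can be illustrated in plain Python (promoted_max is a hypothetical stand-in for illustration, not the actual torch.sym_max implementation):

```python
# If either argument is a float, the result is a float -- unlike builtins.max, whose
# result type is simply the type of whichever argument wins.
def promoted_max(a, b):
    result = a if a >= b else b
    return float(result) if isinstance(a, float) or isinstance(b, float) else result

print(type(promoted_max(3, 2.0)))  # <class 'float'> -- predictable from input types alone
print(type(max(3, 2.0)))           # <class 'int'>   -- depends on which value is larger
```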

We also need to introduce some new op handlers in torch/_inductor/ops_handler.py:

  • to_int for truncation to int64, directly corresponding to TruncToInt; this can be implemented by a trunc followed by a dtype conversion, but with a dedicated handler it is more convenient for roundtripping in Sympy
  • int_truediv for Python-style integer true division, which has higher precision than casting to floats and then running truediv
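
A quick plain-Python illustration of why int_truediv deserves a dedicated, higher-precision path:

```python
# Casting to float first can lose the answer even when the true quotient is exactly
# representable as a double.
a, b = 2**53 + 1, 3

print(a / b)                # 3002399751580331.0 -- Python's int true division is correctly rounded
print(float(a) / float(b))  # 3002399751580330.5 -- `a` was already rounded by the float cast
```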

These changes have consequences. First, we need to make some administrative changes:

  • Actually wire up these Sympy functions from SymInt/SymFloat in torch/fx/experimental/sym_node.py, including the new promotion rules (promote2)
  • Add support for new Sympy functions in torch/utils/_sympy/interp.py, torch/utils/_sympy/reference.py
    • In particular, in torch.utils._sympy.reference, we have a strong preference NOT to do nontrivial compute; instead, everything in the ops handler should map to a single sympy function
    • TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency to fix the tests here
  • Add printer support for the Sympy functions in torch/_inductor/codegen/common.py, torch/_inductor/codegen/cpp_utils.py, torch/_inductor/codegen/triton.py. int_truediv and mixed-precision equality are currently not implemented soundly, so we will lose precision in codegen for large values. TODO: The additions here are not exhaustive yet
  • Update ValueRanges logic to use new sympy functions in torch/utils/_sympy/value_ranges.py. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions.
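
The "use the sympy function itself" preference for value-range analysis looks roughly like this (a hedged sketch of the idea, not the actual value_ranges.py code):

```python
import sympy

# For a monotonically increasing operation, push the same sympy function through both
# endpoints of the range instead of re-deriving the math by hand.
def monotone_map(fn, lower, upper):
    return fn(lower), fn(upper)

lo, hi = monotone_map(sympy.floor, sympy.Rational(7, 3), sympy.Rational(22, 5))
print(lo, hi)  # 2 4
```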

In torch/fx/experimental/symbolic_shapes.py we need to make some symbolic reasoning adjustments:

  • Avoid generating rational subexpressions by removing the simplification of x // y into floor(x / y). That simplification triggers a further simplification rule, (x + y) / c --> x / c + y / c, which is bad because x / c is now a rational number (see the small sympy example after this list)
  • _assert_bound_is_rational is gone; we no longer generate rational bounds
  • Don't intersect non-int value ranges with the int_range
  • Support more sympy Functions for guard SYMPY_INTERP
  • Assert the type of value range is consistent with the variable type
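
A small sympy illustration of the rational-subexpression problem from the first bullet (not the exact code path in symbolic_shapes.py):

```python
import sympy

x = sympy.Symbol("x", integer=True)
print(sympy.expand((x + 5) / 2))  # x/2 + 5/2 -- the 5/2 is a sympy.Rational, which an
                                  # integer-typed ValueRange can no longer represent
```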

The new asserts uncovered necessary bug fixes:

  • torch/_inductor/codegen/cpp.py, torch/_inductor/select_algorithm.py, torch/_inductor/sizevars.py - Ensure Wild/Symbol objects manually allocated in Inductor are marked is_integer so they are accepted when building expressions
  • torch/_inductor/utils.py - make sure you actually pass in sympy.Expr to these functions
  • torch/_inductor/ir.py - make_contiguous_strides_for takes int/SymInt, not sympy.Expr!
  • torch/export/dynamic_shapes.py - don't use infinity to represent int ranges, instead use sys.maxsize - 1

Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at test/test_proxy_tensor.py

Reland notes. This requires this internal fbcode diff https://www.internalfb.com/phabricator/paste/view/P1403322587 but I cannot prepare the diff codev due to https://fb.workplace.com/groups/osssupport/posts/26343544518600814/

It also requires this Executorch PR pytorch/executorch#3911 but the ET PR can be landed prior to this landing.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

cc @gchanan @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @albanD @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

[ghstack-poisoned]

pytorch-bot bot commented May 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126905

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (1 Unrelated Failure)

As of commit 3861dde with merge base f681e36 (image):

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the ciflow/inductor, module: cpu, module: inductor, and release notes: fx labels May 22, 2024
ezyang added a commit that referenced this pull request May 22, 2024
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 837ba28a2bf0f6b6273bb036836a4b60cf327652
Pull Request resolved: #126905
[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 22, 2024
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: a258465552bbd870de16d09c790828a63dca8a4b
Pull Request resolved: #126905
[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 23, 2024
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 59f5e496f871ea3a8bba5de59647910a53c1bd8b
Pull Request resolved: #126905
@ezyang

ezyang commented May 23, 2024

Judging from early testing results, the removal of FloorDiv(x, y) --> floor(x / y) needs to be compensated for; I need to find some alternate reasoning to cover it.

@ezyang ezyang requested review from lezcano and removed request for gmagogsfm, zhxchen17, avikchaudhuri and tugsbayasgalan May 23, 2024 15:20
ezyang added a commit that referenced this pull request May 23, 2024
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: fcec215ed5360354e905b0644148676901ed4c29
Pull Request resolved: #126905
- if isinstance(n.meta["val"], torch.SymInt):
+ if isinstance(
+     n.meta["val"], (torch.SymInt, torch.SymFloat, torch.SymBool)
+ ):
Contributor Author
This one is odd: I got a completely incorrect lowering when operator.truediv was sent to lowerings. This change is clearly correct, but it's probably also worth figuring out why we have incorrect lowerings in the lowering dict.

Collaborator
tracking issue

@@ -4653,7 +4658,7 @@ def trivial_solve(lhs, rhs):
  # Propagate the value ranges. It doesn't really
  # matter if we use truediv or floordiv, because we
  # have established divisibility.
- self._update_var_to_range(i1, SymPyValueRangeAnalysis.truediv(
+ self._update_var_to_range(i1, SymPyValueRangeAnalysis.floordiv(
Contributor Author
I'm not sure this actually does anything lol

Collaborator
?
Don't we have tests that test these things?

Contributor Author
I changed it from truediv to floordiv because there was a test failing at the time, but then I changed some more stuff, so I don't remember if this actually matters now. floordiv is nice, though, because it produces an int, so the typing is more accurate this way.

ezyang added a commit to ezyang/executorch that referenced this pull request Jun 8, 2024
Summary:
Pull Request resolved: pytorch#3911

Original PR pytorch/pytorch#126905

Below line forces Sandcastle to run only specified contbuilds.
build_only[github-export-checks,executorch,pytorch_benchmark,pytorch_quantization,pytorch_distributed,pytorch_distributed_gpu,pytorch_dynamo_inductor,pytorch_functorch,pytorch_fx2trt,pytorch_diff_train_tests_ads,glow_fb_pytorch_tests,training_platform,training_platform_compatibility,training_toolkit_applications,training_toolkit_examples,training_toolkit_model_optimization,dper3_pytorch,xplat_caffe2,pytorch_dev,android-pytorch-instrumentation-tests,smart__pytorch__github_first_try_merge,frl-target-determinator,f6-buck,training_platform_for_github,sigmoid_cpu,sigmoid_gpu,aiplatform_modelprocessing_for_github,accelerators_workloads_models_slimdsnn,ae_aotinductor_benchmark_test,aps_,apf,aps_deterministic_ne_tests,dper_lib_silvertorch,torchrec,torchrec_fb,deeplearning_aot_inductor,aiplatform_modelstore]
#skipfbcodelongtail

Differential Revision: D58294450
[ghstack-poisoned]
ezyang added a commit that referenced this pull request Jun 9, 2024
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 3c0bd424790ef16ebcef4ea058edec475346f5ea
Pull Request resolved: #126905
@ezyang

ezyang commented Jun 9, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


ezyang added a commit to ezyang/executorch that referenced this pull request Jun 9, 2024
# done.
rat_b_bound = self.bound_sympy(r[1])
b_bound = ValueRanges(CeilToInt(rat_b_bound.lower), FloorToInt(rat_b_bound.upper))
self._update_var_to_range(b, b_bound)
@ezyang ezyang Jun 9, 2024
These are the new changes from the internal revert. It turned out that the old logic (which discarded solutions here that were not integral) impeded replacements entirely when you had a nontrivial value range on the symbol being substituted. So I re-allow rational intermediate values and round them at the very end (since fractional solutions cannot be real entries in the value range). This also works because I deleted all of the is_integer asserts in our custom functions.

More discussion at #128053
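
The rounding direction, in miniature (plain Python, just to show why ceil-on-lower / floor-on-upper soundly shrinks a rational bound to an integer range):

```python
import math

# A rational bound like [7/3, 22/5] still covers every integer solution once we take
# the ceiling of the lower endpoint and the floor of the upper endpoint at the end.
lower, upper = 7 / 3, 22 / 5
print(math.ceil(lower), math.floor(upper))  # 3 4
```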

@ezyang ezyang added the module: bc-breaking (Related to a BC-breaking change) label Jun 9, 2024
@pytorch-bot pytorch-bot bot added the topic: bc breaking (topic category) label Jun 9, 2024
@ezyang

ezyang commented Jun 9, 2024

This PR is BC-breaking in the following way: if you had an exported program that did floating-point size compute (e.g., true division on a shape), we may now emit the torch.sym_float operator, whereas previously we relied on implicit promotion in the truediv operator. This change in export format is why I needed to adjust the Executorch repository.

@shazqadeer
Contributor

FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc.

The above sentence is from the PR summary.

Wondering if the new treatment of float floor division is accurate from the point of view of Python semantics?

@ezyang

ezyang commented Jun 10, 2024

Wondering if the new treatment of float floor division is accurate from the point of view of Python semantics?

In fact I did check CPython sources for how floordiv on floats was implemented and it is done precisely this way.

facebook-github-bot pushed a commit to pytorch/executorch that referenced this pull request Jun 10, 2024

X-link: pytorch/pytorch#126905
Approved by: https://github.com/xadupre, https://github.com/lezcano

bypass-github-export-checks

Reviewed By: atalman

Differential Revision: D58333817

Pulled By: ezyang

fbshipit-source-id: 7b6c6f8184db7ca4ac55fc938ac97183f6969ce4
pytorchmergebot pushed a commit that referenced this pull request Jun 10, 2024
In a previous life, we used sympy.oo to represent the lower/upper bounds of integer ranges. Later, we changed this to be sys.maxsize - 1 for a few reasons: (1) sometimes we do tests on a value being exactly sys.maxsize, and we wanted to avoid a data dependent guard in this case, (2) sympy.oo corresponds to floating point infinity, so you get incorrect types for value ranges with oo, and (3) you can do slightly better reasoning if you assume that input sizes fall within representable 64-bit integer range.

After working in the sys.maxsize regime for a bit, I've concluded that this was actually a bad idea. Specifically, the problem is that you end up with sys.maxsize in your upper bound, and then whenever you do any sort of size-increasing computation like size * 2, you end up with 2 * sys.maxsize, and you end up doing a ton of arbitrary precision int computation that is totally unnecessary. A symbolic bound is better.
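
A quick plain-Python illustration of the blowup described in the previous paragraph:

```python
import sys

ub = sys.maxsize - 1  # the old finite stand-in for "unbounded"
print(ub * 2)         # 18446744073709551612 -- already past int64, and every further
print(ub * 2 + 128)   # size computation keeps dragging these exact big ints along
```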

But especially after #126905, we can't go back to using sympy.oo, because that advertises that it's not an integer, and now your ValueRanges is typed incorrectly. So what do we do? We define a new numeric constant `int_oo`, which is like `sympy.oo` but it advertises `is_integer`. **test/test_sympy_utils.py** describes some basic properties of the number, and **torch/utils/_sympy/numbers.py** has the actual implementation.

The rest of the changes of the PR are working out the implications of this change. I'll give more commentary as inline comments.

Fixes #127396

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: #127693
Approved by: https://github.com/lezcano
ghstack dependencies: #126905
pytorchmergebot pushed a commit that referenced this pull request Jun 13, 2024
TharinduRusira pushed a commit to TharinduRusira/pytorch that referenced this pull request Jun 14, 2024
TharinduRusira pushed a commit to TharinduRusira/pytorch that referenced this pull request Jun 14, 2024
This reverts commit f0dd11d.

Reverted pytorch#128043 on behalf of https://github.com/atalman due to: Sorry, reverting because it conflicts with pytorch#126905, which needs to be reverted.
TharinduRusira pushed a commit to TharinduRusira/pytorch that referenced this pull request Jun 14, 2024
This reverts commit 901226a.

Reverted pytorch#127661 on behalf of https://github.com/atalman due to: Sorry, reverting because it conflicts with pytorch#126905, which needs to be reverted; will be relanding it.
TharinduRusira pushed a commit to TharinduRusira/pytorch that referenced this pull request Jun 14, 2024
Revert "Complete revamp of float/promotion sympy handling (#126905)"

This reverts commit 2f7cfec.

Reverted pytorch#126905 on behalf of https://github.com/atalman due to: Sorry, need to revert - failing internally.
TharinduRusira pushed a commit to TharinduRusira/pytorch that referenced this pull request Jun 14, 2024
TharinduRusira pushed a commit to TharinduRusira/pytorch that referenced this pull request Jun 14, 2024
TharinduRusira pushed a commit to TharinduRusira/pytorch that referenced this pull request Jun 14, 2024
ignaciobartol pushed a commit to ignaciobartol/pytorch that referenced this pull request Jun 14, 2024
Labels
ciflow/inductor, ciflow/trunk (Trigger trunk jobs on your pull request), Merged, module: bc-breaking (Related to a BC-breaking change), module: cpu (CPU specific problem (e.g., perf, algorithm)), module: dynamo, module: inductor, release notes: fx (release notes category), Reverted, skip-pr-sanity-checks, suppress-bc-linter (Suppresses the failures of API backward-compatibility linter (Lint/bc_linter)), topic: bc breaking (topic category)