Fix guarding issues w/ numpy #106431
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/106431. Note: Links to docs will display an error until the docs builds have been completed.

As of commit f38030f: ⏳ 2 Pending, 3 Unrelated Failures.

BROKEN TRUNK - The following jobs failed but were also present on the merge base 3db2550. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
```diff
 # lose optimization opportunities this way. Devs, if your benchmark model is failing
 # this way, you should figure out why instead of suppressing it.
-suppress_errors = os.environ.get("TORCHDYNAMO_SUPPRESS_ERRORS", "1") == "1"
+suppress_errors = False
```
@lezcano this is intentional, this flag makes it hard to debug.
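If someone does want the old suppression behavior locally, here is a minimal usage note; it assumes this snippet feeds the setting exposed as `torch._dynamo.config.suppress_errors`, which is an assumption about where this line lives:

```python
import torch._dynamo

# Opt back in to swallowing backend errors at runtime; the default here is
# being hardcoded to False so failures surface during debugging.
torch._dynamo.config.suppress_errors = True
```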
```diff
         self._produce_guard_code(guard, [shape_guard], shape_env=True)

-    def TENSOR_MATCH(self, guard: Guard):
+    def TENSOR_MATCH(self, guard: Guard, value=None):
```
If `.get()` invokes a function, as the source added in this PR (`NumpyTensorSource`) does, we do not have id stability between the example values coming from `get()` and the traced values at fake-ification time. That stability is required because of how we build the sizes/strides for tensor guards.
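For illustration, a minimal self-contained sketch (the class name is a stand-in, not the real `NumpyTensorSource`) of why a source whose `get()` builds a fresh object on every call breaks id stability:

```python
import numpy as np
import torch

class NumpyTensorSourceSketch:
    """Stand-in for a source whose .get() converts an ndarray to a tensor."""
    def __init__(self, ndarray):
        self.ndarray = ndarray

    def get(self):
        # Each call builds a fresh tensor wrapper around the same buffer,
        # so id() of the result differs between calls.
        return torch.as_tensor(self.ndarray)

src = NumpyTensorSourceSketch(np.ones(3))
a, b = src.get(), src.get()
assert a is not b  # no id stability: anything keyed on id(example) would not match
```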
I think I really need a comment here explaining what the value arg does. Still not sure from the comment either, need to read more.
okay, will add a comment.
Note that, with this PR in its current state, a function that has only NumPy inputs and NumPy outputs is still not being compiled. A small modification of the example in the OP shows this.
OTOH, such functions seem to have correct guards, and their shapes are correctly traced with symints when dynamic shapes kick in.
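For concreteness, a hedged sketch of the kind of function being described (NumPy in, NumPy out, no explicit tensors); this is not the OP's example, just an illustration, and per the note above such a frame may simply fall back to eager rather than being compiled:

```python
import numpy as np
import torch

def numpy_only(x: np.ndarray) -> np.ndarray:
    # No torch.Tensor appears in this frame; per the note above, dynamo
    # may skip compiling it even though tracing NumPy ops is supported.
    return np.sin(x) ** 2 + 1.0

compiled = torch.compile(numpy_only)
out = compiled(np.linspace(0.0, 1.0, 8))  # runs, but may not actually be compiled
```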
Force-pushed from 1a8a61f to 49c975a
@lezcano I undid your mega merge; this PR is back to being based on the head of your branch. If you need work there, push it to that branch and rebase this branch; please don't push to this branch :P
```diff
     def add_graph_output(self, value):
-        graph_outputs_key = id(value.proxy)
+        graph_outputs_key = id(value.as_proxy())
```
does this... do anything lol
It's the better convention, but no, it should be the same.
```diff
-    def TENSOR_MATCH(self, guard: Guard):
+    def TENSOR_MATCH(self, guard: Guard, value=None):
         if guard.is_nn_module():
             self.ID_MATCH(guard)
```
numpy ndarray on nn module 🤔 hilarious failure mode
This will not fail, but will overspecialize. HOWEVER - we throw out nn module guards, so this will just be okay for now.
torch/_dynamo/variables/tensor.py (outdated)
```diff
         **kwargs,
     ):
         super().__init__(proxy, **kwargs)
+        self.proxy = proxy
```
What's going on here? Are there two proxies floating around now?
We probably don't need this line; `super().__init__` already does that.
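A tiny illustration of the point (class names are placeholders, not the actual hierarchy): when the base `__init__` already stores the proxy, re-assigning it in the subclass is redundant.

```python
class BaseVariableSketch:
    def __init__(self, proxy, **kwargs):
        # The base class already records the proxy...
        self.proxy = proxy

class NumpyNdarrayVariableSketch(BaseVariableSketch):
    def __init__(self, proxy, **kwargs):
        super().__init__(proxy, **kwargs)
        # ...so re-assigning it here would be a no-op and can be dropped:
        # self.proxy = proxy
```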
```diff
         )
-        options = {"source": source}
+        options = {"source": source, "guards": tensor_vt.guards}
         numpy_ndarray_variable = wrap_fx_proxy_cls(
```
It appears these are the substantive changes
torch/_dynamo/variables/builder.py (outdated)
```diff
             example_value=value,
-            guards=self.make_guards(GuardBuilder.TENSOR_MATCH),
+            guards=self.make_guards(
+                functools.partial(GuardBuilder.TENSOR_MATCH, value=value)
```
So are you passing value for EVERYBODY, not just ndarray?
Yes, this actually saves us an eval, and should always be sound.
Sounds good, will get this over.
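To make the "saves us an eval" point concrete, here is a conceptual sketch (not the real `GuardBuilder` API) of how binding `value` into the guard function via `functools.partial` lets the builder use the example object directly instead of re-evaluating the source expression to recover it:

```python
import functools

class GuardBuilderSketch:
    def get(self, name):
        # Stand-in for evaluating the guarded source expression; this is the
        # extra eval that the bound example value lets us skip.
        raise NotImplementedError("would eval the source expression for " + name)

    def TENSOR_MATCH(self, name, value=None):
        example = value if value is not None else self.get(name)
        return f"guard on sizes/strides/dtype of {name} (example: {example!r})"

builder = GuardBuilderSketch()
# The builder receives a guard function with the example value already bound:
guard_fn = functools.partial(GuardBuilderSketch.TENSOR_MATCH, value=[[1.0, 2.0]])
print(guard_fn(builder, "L['x']"))  # uses the bound value; no eval needed
```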
Force-pushed from 3f693fe to dcf6212
Force-pushed from dcf6212 to 3f693fe
```python
if isinstance(value, np.ndarray):
    return torch.as_tensor(value)
```
Better to remove this one: if our logic is correct, an np.ndarray should never get to this point. We should probably even `assert not isinstance(value, np.ndarray)`.
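For reference, a small hedged sketch of what that assertion variant could look like; the helper name is hypothetical, the point is just making an unexpected ndarray loud instead of silently converting it:

```python
import numpy as np

def wrap_example_value(value):  # hypothetical helper, not the real call site
    # If the upstream wrapping logic is correct, raw ndarrays were already
    # handled earlier, so reaching here with one is a bug worth surfacing.
    assert not isinstance(value, np.ndarray), (
        "np.ndarray should never reach this point"
    )
    return value
```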
Let me double check, I am pretty certain we do hit this, and that is why I added it. Why would you expect us to not get here?
The issue in #106431 (review) is still present (i.e., if you remove the tensor arg to the …
Let me take a look :)
BTW this would be in another PR; it's not blocking this one. I have not looked yet, but my money is on something that looks for tensor compute in frames. We do this in a few specific places. Will report back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, another PR SGTM.
On the interaction with `torch_np`, this PR LGTM. Let's wait for Ed's review about the general strategy tho.
@pytorchbot merge
Merge failed. Reason: This PR needs a […] label. If not, please add the […] label. To add a label, you can comment to pytorchbot, for example […]. For more information, see […]. Details for Dev Infra team: raised by workflow job […].
```python
        numpy_to_tensor_wrapper(func),
        *proxy_args_kwargs(args, kwargs),
    )
    return NumpyNdarrayVariable.create(tx, proxy, **options)
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference before and after? Or is this just refactoring?
I merged it because @voznesenskym was going to.
RFC: pytorch/rfcs#54

The first commit is the contents of https://github.com/Quansight-Labs/numpy_pytorch_interop/. We have already been using this in core for the last few months as an external dependency; this PR pulls all of it into core. In the next commits, I do a number of things in this order:
- Fix a few small issues
- Make the tests that this PR adds pass
- Bend backwards until lintrunner passes
- Remove the optional dependency on `torch_np` and simply rely on the upstreamed code
- Fix a number of dynamo tests that were passing before (they were not testing anything, I think) and are not passing now

Missing from this PR (but not blocking):
- A flag that deactivates tracing NumPy functions and simply breaks. There used to be one, but it stopped working after the merge and I removed it. @lezcano to investigate.
- #106431 (comment). @voznesenskym to submit a fix after we merge.

All the tests in `tests/torch_np` take about 75s to run. This was joint work by @ev-br, @rgommers, @honno and me. I did not create this PR via ghstack (which would have been convenient) as this is a collaboration, and ghstack doesn't allow for shared contributions.

Pull Request resolved: #106211
Approved by: https://github.com/ezyang
cc @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @anijain2305