Conversation

@min-jean-cho (Collaborator) commented Jan 30, 2023

As found in #92709 (thanks to @ngimel and @jansel), torch.Tensor.fn currently points to a UserDefinedObjectVariable rather than a TorchVariable; the root cause is described in #92709 (review). To fix this, this PR builds a TorchVariable for torch.Tensor.fn that points to torch.ops.aten.fn.

As a result, calls to torch.Tensor.fn trigger a graph break, which fails with nopython=True:

import torch
import torch._dynamo as dynamo

#op = torch.ops.aten.abs_ # no graph break
op = torch.Tensor.abs_ # graph break 
args = torch.empty(10)

def foo(args):
    return op(args)

opt_foo = dynamo.optimize("inductor", nopython=True)(foo)
y_ = opt_foo(args)
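
A minimal sketch of the idea behind the fix (illustrative only -- the helper name tensor_method_to_aten is hypothetical, and the real change lives in dynamo's variable builder): resolve a torch.Tensor method descriptor to the torch.ops.aten overload packet of the same name, so it can be handled as an allowed torch function.

import torch

def tensor_method_to_aten(method):
    # Method descriptors such as torch.Tensor.abs_ expose their name via
    # __name__; look up the aten op of the same name.
    return getattr(torch.ops.aten, method.__name__)

op = tensor_method_to_aten(torch.Tensor.abs_)
x = torch.tensor([-1.0, 2.0, -3.0])
op(x)  # in-place abs through the aten op; returns x
print(x)  # tensor([1., 2., 3.])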

cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

@pytorch-bot (bot) commented Jan 30, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/93243

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 96e2e3a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@dagitses dagitses requested a review from jansel January 31, 2023 20:48
@dagitses dagitses added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Jan 31, 2023
@min-jean-cho (Collaborator, Author)

Hi @jansel, please take a look and see whether it makes sense to support torch.Tensor.{fn} as torch.ops.aten.{fn}. Thanks!

@min-jean-cho min-jean-cho requested review from jgong5 and ngimel February 1, 2023 23:04
"to_sparse": {f32, f64},
"uniform": {f16, f32, f64},
# AssertionError: Tensor-likes are not close!
"uniform": {f16},
Contributor

Interesting that these started working. I'd expect randomness not to work with OpInfo testing.

@min-jean-cho (Collaborator, Author) Feb 2, 2023

I'm not sure why the eager and inductor randomness match -- it isn't clear to me whether the default generator is global or configured separately, with the backend setting the seed from the default generator.

Here's a simple reproducer:

import torch
import torch._dynamo as dynamo

op = torch.Tensor.bernoulli_
args = torch.empty(10000)

def foo(args):
    #torch.manual_seed(42)
    return op(args)

opt_foo = dynamo.optimize("inductor")(foo)

y = foo(args)
y_ = opt_foo(args)
print(y)
print(y_)
print(torch.allclose(y, y_))
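
One way to probe this (a sketch using only public APIs, not part of the PR): seed the global default generator before each call and compare. If both the eager and compiled paths draw from the seeded global stream, the outputs should match; if the backend uses its own RNG (e.g. Triton's), they may diverge.

import torch
import torch._dynamo as dynamo

def bern(x):
    # bernoulli_ defaults to p=0.5, so the uninitialized contents of
    # torch.empty don't matter.
    return torch.Tensor.bernoulli_(x)

opt_bern = dynamo.optimize("inductor")(bern)

torch.manual_seed(42)
y = bern(torch.empty(10000))

torch.manual_seed(42)
y_ = opt_bern(torch.empty(10000))

print(torch.allclose(y, y_))  # True if both consumed the same seeded stream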

Contributor

Is there a graph break here? Try changing it to dynamo.optimize("inductor", nopython=True).

@min-jean-cho (Collaborator, Author) Feb 10, 2023

No, there's no graph break with this PR. I can reproduce the randomness match with nopython=True as well. I've generally noticed they match for CPU fp32 and fp64, even though the warning is raised: [WARNING] using triton random, expect difference from eager

guards=make_guards(GuardBuilder.FUNCTION_MATCH),
)
elif (
not is_allowed(value)
@jansel (Contributor) Feb 2, 2023

Can we just make is_allowed() return True for these?

Should be able to modify:

def _allowed_function_ids():

with something like:

for name in dir(torch.Tensor):
    method = getattr(torch.Tensor, name)
    if isinstance(method, types.MethodDescriptorType):
        torch_object_ids[id(method)] = f"torch.Tensor.{name}"

@min-jean-cho (Collaborator, Author)

Hmm, the torch.Tensor.{name} attributes are method descriptors, not modules or functions, so they don't have a __module__ attribute. This becomes a problem when generating the FX graph and finding the module of the method:

def _find_module_of_method(orig_method: Callable[..., Any]) -> str:
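
A quick illustration of the issue (a standalone sketch, not from the PR's diff): method descriptors report a name but no module, while ordinary functions do.

import types
import torch

m = torch.Tensor.abs_
print(isinstance(m, types.MethodDescriptorType))  # True
print(m.__name__)                                 # abs_
print(getattr(m, "__module__", None))             # None -- nothing for FX to print

def f():
    pass

print(f.__module__)  # __main__ -- ordinary functions carry their module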

Contributor

Could we change FX to handle that? I suspect it just involves adding a case to the code printer.
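
A hypothetical sketch of what such a case might look like (the helper below mirrors FX's _find_module_of_method in spirit only; it is not the PR's actual diff): when __module__ is missing, fall back to recognizing torch.Tensor method descriptors by name.

import types
import torch

def find_module_of_method(orig_method):
    # Hypothetical fallback for callables without __module__, such as
    # torch.Tensor method descriptors.
    module = getattr(orig_method, "__module__", None)
    if module is not None:
        return module
    if isinstance(orig_method, types.MethodDescriptorType) and \
            getattr(torch.Tensor, orig_method.__name__, None) is orig_method:
        return "torch.Tensor"
    raise RuntimeError(f"cannot find module of {orig_method}")

def my_fn():
    pass

print(find_module_of_method(torch.Tensor.abs_))  # torch.Tensor
print(find_module_of_method(my_fn))              # __main__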

@min-jean-cho (Collaborator, Author)

Thanks @jansel, your suspicion was right. Please take another look.

@min-jean-cho min-jean-cho requested a review from jansel February 2, 2023 22:39
min-jean-cho and others added 4 commits February 2, 2023 19:58
Co-authored-by: Jason Ansel <jansel@jansel.net>
Co-authored-by: Jason Ansel <jansel@jansel.net>
@min-jean-cho min-jean-cho requested a review from jansel February 3, 2023 04:10
@min-jean-cho (Collaborator, Author)

Thanks @jansel for the help with root-causing and the review.

# warn_only=False correctly raises RuntimeError: put_ does not have a deterministic implementation
# warn_only=True logs warning from the FallbackKernel: torch.ops.aten.put_.default, instead of as UserWarning:
# [W Context.cpp:%(lineno)] Warning: put_ does not have a deterministic implementation
@skipIfTorchInductor("warning is logged from the FallbackKernel: torch.ops.aten.put_.default when warn_only=True")
@min-jean-cho (Collaborator, Author)

Hi @jansel, I'm skipping the following test. With the graph break eliminated, the warning for warn_only=True is now correctly raised from the fallback kernel.
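
For context, a minimal sketch of the toggle the test exercises (standard PyTorch API, independent of this PR): with warn_only=True, ops that lack a deterministic implementation on the device in use warn instead of raising.

import torch

# warn_only=False: such ops raise RuntimeError when they run.
# warn_only=True: they emit a warning instead (e.g. put_ in the case above).
torch.use_deterministic_algorithms(True, warn_only=True)
print(torch.are_deterministic_algorithms_enabled())           # True
print(torch.is_deterministic_algorithms_warn_only_enabled())  # True
torch.use_deterministic_algorithms(False)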

@min-jean-cho (Collaborator, Author)

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk label (trigger trunk jobs on your pull request) Feb 7, 2023
@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator) commented Feb 7, 2023

Merge failed

Reason: 1 job has failed: trunk / win-vs2019-cuda11.6-py3 / test (functorch, 1, 1, windows.g5.4xlarge.nvidia.gpu)

Details for Dev Infra team. Raised by workflow job.

@pytorchmergebot (Collaborator)

The merge job was canceled. If you believe this is a mistake, then you can re-trigger it through pytorch-bot.

@min-jean-cho (Collaborator, Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@PaliC (Contributor) commented Feb 7, 2023

Edit: for Proof of Concept on #94324

Merge failed

Reason: 1 job has failed: trunk / win-vs2019-cuda11.6-py3 / test (functorch, 1, 1, windows.g5.4xlarge.nvidia.gpu)

Details for Dev Infra team

Rule superuser

Raised by workflow job

PaliC added a commit that referenced this pull request Feb 7, 2023
… dropdown"

We create a new type of error, `MergeError`, which effectively does what `handle_exception` was doing before when creating an error, but lets you attach extra internal context.

We use this new error type to move the broken rule during merges into the Dev Infra dropdown.

Nit: also a minor fix so that `MandatoryChecksMissingError` now actually uses the broken rule.

Addresses: pytorch/test-infra#1081

The result should be something in the vein of #93243 (comment) (albeit with a different error message/summary).

[ghstack-poisoned]
Labels

ciflow/inductor, ciflow/trunk, Merged, module: dynamo, module: inductor, open source, triaged
