Guard on data-dependent expression from torch._debug_has_internal_overlap(self) in copy_ #122477

Open

ezyang opened this issue Mar 22, 2024 · 0 comments
Labels: module: dynamic shapes · oncall: pt2 · small · triaged

ezyang (Contributor) commented Mar 22, 2024

🐛 Describe the bug

Internal xref: https://fb.workplace.com/groups/6829516587176185/posts/6998611496933359 with reproducer (not minimized)
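
For context on what the failing check computes: torch._debug_has_internal_overlap returns the C++ MemOverlap enum as an int (0 = No, 1 = Yes, 2 = TooHard). A quick eager-mode illustration (a plain sketch, not taken from the reproducer):

```python
import torch

# A contiguous tensor is non-overlapping and dense: MemOverlap::No.
t = torch.empty(2, 5)
print(torch._debug_has_internal_overlap(t))  # 0

# An expanded (stride-0) tensor maps many logical elements to one
# memory location: MemOverlap::Yes, the case meta_copy_ rejects.
e = torch.empty(1).expand(2, 5)
print(torch._debug_has_internal_overlap(e))  # 1
```

Under fake tensors with symbolic shapes, the "non-overlapping and dense" predicate becomes the symbolic IsNonOverlappingAndDenseIndicator expression, which is what the guard below fails to evaluate when the layout contains unbacked symbols.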

Relevant stack:

File /data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/_subclasses/fake_impls.py:150, in dispatch_to_op_implementations_dict(fake_mode, func, *args, **kwargs)
    148 @register_op_impl(op_implementations_dict.__contains__)
    149 def dispatch_to_op_implementations_dict(fake_mode, func, *args, **kwargs):
--> 150     return op_implementations_dict[func](fake_mode, func, *args, **kwargs)
File /data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/_subclasses/fake_impls.py:479, in multi_device_op_default(fake_mode, func, *args, **kwargs)
    474 @register_op_impl(aten._unsafe_index_put.default)
    475 @register_op_impl(aten.copy.default)
    476 @register_op_impl(aten.copy_.default)
    477 @register_op_impl(aten.slice_scatter.default)
    478 def multi_device_op_default(fake_mode, func, *args, **kwargs):
--> 479     return run_and_return_new_tensor_of_input_device(fake_mode, func, args, kwargs)
File /data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/_subclasses/fake_impls.py:389, in run_and_return_new_tensor_of_input_device(fake_mode, func, args, kwargs)
    387 out_device = new_kwargs["input"].device
    388 with in_kernel_invocation_manager(fake_mode):
--> 389     out = func(*args, **kwargs)
    390     if not is_noncontiguous_supported(out_device):
    391         out = out.new_empty(out.shape)
File /data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/_ops.py:600, in OpOverload.__call__(self_, *args, **kwargs)
    597 def __call__(self_, *args, **kwargs):  # noqa: B902
    598     # use `self_` to avoid naming collide with aten ops arguments that
    599     # are named "self". This way, all the aten ops can be called by kwargs.
--> 600     return self_._op(*args, **kwargs)
File /data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/_meta_registrations.py:368, in meta_copy_(self, src, non_blocking)
    361 @register_meta(aten.copy_.default)
    362 def meta_copy_(self, src, non_blocking=False):
    363     # This code simulates the original decomp from inductor,
    364     # which runs most of the meta checks that we care about.
    365     # In theory, we should make this more robust by carefully
    366     # auditing our C++ copy_() kernel and copying the checks here.
--> 368     if torch._debug_has_internal_overlap(self) == 1:  # 1 == MemOverlap::Yes
    369         raise RuntimeError(
    370             "more than one element of the written-to tensor refers to a single memory location"
    371         )
    373     if isinstance(src, Tensor):
File /data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/fx/experimental/sym_node.py:374, in SymNode.guard_bool(self, file, line)
    371 def guard_bool(self, file, line):
    372     # TODO: use the file/line for some useful diagnostic on why a
    373     # guard occurred
--> 374     r = self.shape_env.evaluate_expr(self.expr, self.hint, fx_node=self.fx_node)
    375     try:
    376         return bool(r)
File /data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/fx/experimental/recording.py:265, in record_shapeenv_event.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    263 self.events.append(event)
    264 try:
--> 265     return event.run(self)
    266 except Exception:
    267     self.events.pop()
File /data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/fx/experimental/recording.py:160, in ShapeEnvEvent.run(self, shape_env)
    157     replacearg(index=3, key="fx_node", fn=maybe_convert_node)
    159 # Actually call the method with the converted arguments.
--> 160 return self.f(*args, **kwargs)
File /data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/fx/experimental/symbolic_shapes.py:4138, in ShapeEnv.evaluate_expr(self, orig_expr, hint, fx_node, expect_rational, size_oblivious, forcing_spec)
   4131         if not size_oblivious:
   4132             size_oblivious_result = self._maybe_evaluate_static(
   4133                 expr,
   4134                 expect_rational=expect_rational,
   4135                 size_oblivious=True
   4136             )
-> 4138         raise self._make_data_dependent_error(
   4139             expr.xreplace(self.var_to_val),
   4140             expr,
   4141             size_oblivious_result=size_oblivious_result
   4142         )
   4143     expr = new_expr
   4145 concrete_val = compute_concrete_val()
GuardOnDataDependentSymNode: Could not guard on data-dependent expression Eq(IsNonOverlappingAndDenseIndicator(u23 + u25 + u27 + u29, 2, 5, 1), 1) (unhinted: Eq(IsNonOverlappingAndDenseIndicator(u23 + u25 + u27 + u29, 2, 5, 1), 1)).  (Size-like symbols: u23, u25, u29, u27)

Potential framework code culprit (scroll up for full backtrace):
  File "/data/users/tkaruturi/fbsource/buck-out/v2/gen/fbcode/a5cb042decdaac95/bento/kernels/__bento_kernel_modai_sdk__/bento_kernel_modai_sdk#link-tree/torch/_meta_registrations.py", line 368, in meta_copy_
    if torch._debug_has_internal_overlap(self) == 1:  # 1 == MemOverlap::Yes

For more information, run with TORCH_LOGS="dynamic"
For extended logs when we create symbols, also add TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="u23,u25,u29,u27"
If you suspect the guard was triggered from C++, add TORCHDYNAMO_EXTENDED_DEBUG_CPP=1
For more debugging help, see https://docs.google.com/document/d/1HSuTTVvYH1pTew89Rtpeu84Ht3nQEFTYhAX3Ypa_xJs/edit?usp=sharing
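
The guard mechanism itself is easy to trigger in isolation. A minimal sketch (assuming capture_dynamic_output_shape_ops so dynamo can trace nonzero, whose output size becomes an unbacked SymInt):

```python
import torch
import torch._dynamo

# Allow dynamo to trace ops with data-dependent output shapes, so
# nonzero() yields an unbacked SymInt instead of graph-breaking.
torch._dynamo.config.capture_dynamic_output_shape_ops = True

@torch.compile(fullgraph=True)
def g(x):
    u = x.nonzero()        # u.size(0) is an unbacked SymInt (e.g. u0)
    if u.size(0) == 1:     # branching guards on Eq(u0, 1): data-dependent
        return u + 1
    return u

g(torch.randn(4))  # raises a GuardOnDataDependentSymNode-style error
```

In this issue the branch is not in user code: it is the == 1 comparison inside meta_copy_, reached through the non-overlapping-and-dense computation on a tensor whose layout involves u23, u25, u27, u29.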

In this particular case, we should just not do the overlap check.
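
One possible shape of the patch (a sketch only, not the landed fix; it assumes free_unbacked_symbols from torch.fx.experimental.symbolic_shapes, which accepts a tensor and collects unbacked symbols from its sizes, strides, and storage offset, and the helper name _safe_overlap_check is hypothetical):

```python
import torch
from torch.fx.experimental.symbolic_shapes import free_unbacked_symbols

def _safe_overlap_check(t: torch.Tensor) -> None:
    # Hypothetical helper: skip the check entirely when the layout
    # contains unbacked symbols; evaluating IsNonOverlappingAndDenseIndicator
    # there would guard on a data-dependent expression, which is exactly
    # the failure above.
    if free_unbacked_symbols(t):
        return
    if torch._debug_has_internal_overlap(t) == 1:  # 1 == MemOverlap::Yes
        raise RuntimeError(
            "more than one element of the written-to tensor refers to "
            "a single memory location"
        )
```

Since the check is purely a debug assertion mirrored from the eager copy_ kernel, skipping it for unbacked layouts loses no soundness at compile time; eager execution still raises on real overlap.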

Versions

main

cc @msaroufim @bdhirsh @anijain2305 @zou3519 @chauhang
