vkovinicTT

This PR fixes two related issues in `PopulateXlaOpMetadata` that prevented proper use of `CustomOpNameMetaData`:

### Fix 1: Support custom op names without stack frame locations

This enables the user to change the name of an op using `CustomOpNameMetaData` without necessarily having to add stack frame locations. Example usage (if we pass 0 for the `max_stack_depth` field, the custom prefix is simply prepended to the current op name):

    torch_xla._XLAC._set_xla_custom_op_name_prefix(tensor, your_custom_name_prefix, 0)

Additionally, this change guards the `AddStackFrameLocations()` function against an invalid `max_stack_depth`. If a value <= 0 were passed, a segmentation fault would result from improper iterator dereferencing (which could have happened before this change).

The `AddStackFrameLocations()` function in `stack_frame_index_builder.cpp` uses reverse iterators and assumes at least one iteration occurs:

  auto frame_it = frame_info.rbegin();
  for (; frame_it != frame_info.rend() && depth < max_stack_depth; ++frame_it) {
    // Loop never executes when max_stack_depth == 0
  }
  --frame_it;  // ← Segfault: iterator is still at rbegin(), decrement goes past-the-end
  metadata_to_populate.set_source_file(frame_it->file);  // ← Dereference invalid iterator
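
The fix can be sketched as a small self-contained program. The `Frame` struct and the free function `add_stack_frame_locations` below are hypothetical stand-ins for the real code in `stack_frame_index_builder.cpp`; the function returns `frame_it->file` after the loop, mirroring the `set_source_file` call:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the frame records used by the real builder.
struct Frame {
  std::string file;
  int line;
};

// Sketch of AddStackFrameLocations() with the new guard. Returns the file
// of the frame left at frame_it after the loop, or "" when nothing is added.
std::string add_stack_frame_locations(const std::vector<Frame>& frame_info,
                                      int max_stack_depth) {
  // Guard: with max_stack_depth <= 0 (or no frames) the loop below never
  // runs, so the later --frame_it and dereference would be undefined behavior.
  if (max_stack_depth <= 0 || frame_info.empty()) {
    return "";
  }
  int depth = 0;
  auto frame_it = frame_info.rbegin();
  for (; frame_it != frame_info.rend() && depth < max_stack_depth; ++frame_it) {
    ++depth;  // the real code records frame_it->file / frame_it->line here
  }
  --frame_it;  // safe: the loop executed at least once
  return frame_it->file;
}
```

With the guard in place, `max_stack_depth == 0` becomes the "custom prefix only" path instead of undefined behavior.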

### Fix 2: Prevent scope from overwriting custom metadata

Problem:

Even when users set custom metadata via `_set_xla_custom_op_name_prefix()`, the `nmeta.scope` field unconditionally overwrote the custom `op_name_prefix`. This affected operations like `add` and `mul`, which have `scope` set (e.g., `aten::add.3`), resulting in the loss of user-provided semantic location information.

Changes:

  • Modified the condition from `if (!nmeta.scope.empty())` to `else if (!nmeta.scope.empty())`
  • This ensures custom metadata takes precedence: custom metadata is used if available; otherwise `scope` is used as a fallback

Precedence hierarchy (now correctly implemented):

  1. Custom user metadata (via `SetUserMetadata` APIs) - highest priority
  2. Scope-based naming (auto-generated by torch-xla) - fallback
  3. Bare `op_type` - default
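
The resulting precedence can be sketched as a tiny decision function (hypothetical names; the real check lives in `PopulateXlaOpMetadata`):

```cpp
#include <string>

// Hypothetical sketch of the op-name precedence after this change.
std::string choose_op_name(const std::string& custom_prefix,
                           const std::string& scope,
                           const std::string& op_type) {
  if (!custom_prefix.empty()) {
    return custom_prefix + "/" + op_type;  // 1. custom user metadata wins
  } else if (!scope.empty()) {
    return scope + "/" + op_type;  // 2. auto-generated scope as fallback
  }
  return op_type;  // 3. bare op type by default
}
```

The `else if` is the whole fix: before this change, a non-empty `scope` branch ran even when a custom prefix was present, clobbering it.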

@qihqi enabled auto-merge (squash) October 13, 2025 16:38
@vkovinicTT (Author)

Thanks for the fast review 🚀

@vkovinicTT (Author)

I noticed the tests have been canceled a few times - is that part of the normal process?

vkovinicTT added a commit to tenstorrent/tt-xla that referenced this pull request Oct 20, 2025
[Ticket](#1011)

### Problem

`HLO` operations in compiled graphs lack semantic context (module
hierarchy, source file/line), making debugging and profiling difficult.
PyTorch's FX graph captures this metadata, but it's lost during export
and execution since the generated `forward()` code is a flat sequence of
operations.

### Solution

Inject FX metadata into lazy `IR nodes` at runtime using
`TorchDispatchMode` with a **counter-based** mapping approach:

1. **Compile-time**: Extract metadata from FX nodes after all passes
complete, building ordered list of semantic locations (format:
`ModuleClass[instance]/func_name(file.py:line)/`)
2. **Runtime**: Intercept operations during lazy graph construction via
`MetadataDispatchMode`, attaching metadata to XLA tensors using
**torch-xla**'s `_set_xla_custom_op_name_prefix` API

**The counter-based mapping** works because FX enforces topological node
ordering, code generation preserves this order, and `TorchDispatchMode`
intercepts operations in execution order—maintaining a 1:1
correspondence between FX nodes and dispatched operations.
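
Stripped of the torch machinery, the counter-based mapping reduces to consuming an ordered list of locations, one entry per intercepted dispatch. A hedged sketch with hypothetical names (not the real `MetadataDispatchMode` implementation):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Compile time: locations extracted from FX nodes, in topological order
// (format: ModuleClass[instance]/func_name(file.py:line)/).
// Runtime: each intercepted dispatch consumes the next entry, relying on
// codegen preserving node order so the correspondence stays 1:1.
struct MetadataMapper {
  std::vector<std::string> locations;
  std::size_t counter = 0;

  // Returns the location for the current operation, or "" once the two
  // orders have drifted apart (the misalignment case noted below).
  std::string on_dispatch() {
    if (counter >= locations.size()) return "";
    return locations[counter++];
  }
};
```

Each returned location would then be attached to the operation's output tensor via `_set_xla_custom_op_name_prefix`.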

### Key Changes

- `utils.py`: Added `extract_nodes_info()` for metadata extraction and
`MetadataDispatchMode` for runtime interception
- `backend.py`: Integrated metadata extraction in
`torch_pass_pipeline()` after `recompile()`, and metadata injection in
`XLAExecutor`
  - Controlled via `XLA_HLO_DEBUG=1` environment variable

### Important Notes
- In order for this to work, I also had to make changes in the
`pytorch-xla` repo. I've already merged the PR in our fork, and
[here](pytorch/xla#9676) is the open PR in the
`pytorch/xla` repo (which has been approved 🚀).
- Any future FX passes **MUST** be added before
`compiled_graph.recompile()` and `extract_nodes_info()`. Extracting
metadata before passes complete causes misalignment between FX node
order and runtime execution order.

### Result


[Here](https://gist.github.com/vkovinicTT/212b50c0e4382d54494a28b436daf0ee)
is an example model, and
[here](https://gist.github.com/vkovinicTT/efe3e5b51e4f08abab8013ee2c340c70)
are the locs that we get in TTIR with this change.
