Skip to content

Rework call site identification#6298

Merged
rostan-t merged 3 commits into
NVIDIA:mainfrom
rostan-t:ndd-callsite-resolver
Apr 17, 2026
Merged

Rework call site identification#6298
rostan-t merged 3 commits into
NVIDIA:mainfrom
rostan-t:ndd-callsite-resolver

Conversation

@rostan-t

@rostan-t rostan-t commented Apr 14, 2026

Copy link
Copy Markdown
Collaborator

Category:

Refactoring (Redesign of existing code that doesn't affect functionality)

Description:

Call site identification for stack trace doctoring in dynamic mode is currently performed with hardcoded values to skip dynamic mode's internal plumbing. Although tested, this is brittle and can break easily when the code changes.

This PR removes the hardcoded values and replaces it by marking internal frames "transparent". Identification is then performed by walking up the stack trace, skipping transparent frames. Even if this is done for all calls to DALI functions, the overhead should be small as there are typically not more than 6 frames to walk up.

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
    • test_exceptions.py: *
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Comment thread dali/python/nvidia/dali/experimental/dynamic/_batch2tensor.py Dismissed
Comment thread dali/python/nvidia/dali/experimental/dynamic/_batch2tensor.py Dismissed
Comment thread dali/python/nvidia/dali/experimental/dynamic/_tensor.py Fixed
@rostan-t rostan-t force-pushed the ndd-callsite-resolver branch from 653ab9a to 3b45192 Compare April 14, 2026 16:43
@greptile-apps

greptile-apps Bot commented Apr 14, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR replaces the hardcoded frame-skip depth used for call-site identification in dynamic mode with a cleaner mark_transparent / resolve_callsite_frame mechanism. Internal plumbing wrappers (call, fn_call, batch_to_tensor, tensor, as_tensor, and their makefun/NVTXRange wrappers) are all marked transparent so stack-walking correctly lands on the user's frame. A follow-up commit removes the now-dead capture_stack helper and a third commit adds a guard to skip frame capture in synchronous eval modes where no async scheduling occurs.

Confidence Score: 5/5

Safe to merge; the refactoring is correct and all remaining findings are P2 style suggestions.

The core transparent-frame logic is sound: mark_transparent correctly follows wrapped chains, resolve_callsite_frame reliably walks past all internal wrappers, and the Invocation assertion in schedule() is safe because _call_stack is always populated when EvalMode is eager. The only P2 finding is a minor performance inconsistency in _batch2tensor.py (missing the same EvalMode guard that _op_builder.py already applies).

_batch2tensor.py — minor EvalMode guard omission; no correctness impact.

Important Files Changed

Filename Overview
dali/python/nvidia/dali/experimental/dynamic/_callsite.py New module introducing mark_transparent / resolve_callsite_frame / capture_stack_from_frame utilities; cleanly follows wrapped chains.
dali/python/nvidia/dali/experimental/dynamic/_exceptions.py Dead capture_stack removed per earlier review; no remaining callers, clean change.
dali/python/nvidia/dali/experimental/dynamic/_invocation.py Now accepts caller_frame directly; captures call stack only in deferred/eager modes (value ≤ eager) where it is actually needed; assertion in schedule() is safe.
dali/python/nvidia/dali/experimental/dynamic/_op_builder.py Correctly applies mark_transparent to generated call/fn_call wrappers and their makefun-wrapped counterparts; EvalMode guard present before resolve_callsite_frame() call.
dali/python/nvidia/dali/experimental/dynamic/_batch2tensor.py BatchToTensor.call and batch_to_tensor marked transparent; resolve_callsite_frame() is called unconditionally unlike _op_builder.py which guards it behind an EvalMode check.
dali/python/nvidia/dali/experimental/dynamic/_tensor.py tensor and as_tensor decorated with @_callsite.mark_transparent so they are skipped during call-site resolution; straightforward addition.

Sequence Diagram

sequenceDiagram
    participant U as User code
    participant T as tensor() [transparent]
    participant B2T as batch_to_tensor() [transparent]
    participant BT as BatchToTensor.__call__() [transparent]
    participant R as resolve_callsite_frame()
    participant I as Invocation.__init__

    U->>T: tensor(batch)
    T->>B2T: batch_to_tensor(batch, ...)
    B2T->>BT: _batch_to_tensor_instance(batch, ...)
    BT->>R: sys._getframe(1) → BT frame
    Note over R: walk f_back, skipping transparent frames
    R-->>BT: returns User code frame
    BT->>I: Invocation(..., caller_frame=user_frame)
    I-->>BT: _call_stack captured (eager/deferred) or None (sync)
Loading

Fix All in Claude Code

Reviews (5): Last reviewed commit: "Skip caller frame capture in synchronous..." | Re-trigger Greptile

Comment thread dali/python/nvidia/dali/experimental/dynamic/_invocation.py
Comment thread dali/python/nvidia/dali/experimental/dynamic/_callsite.py
@rostan-t

Copy link
Copy Markdown
Collaborator Author

!build

@dali-automaton

Copy link
Copy Markdown
Collaborator

CI MESSAGE: [48516693]: BUILD STARTED

Comment on lines +16 to +18
from . import _device, _invocation, _op_builder
from ._batch import Batch, Tensor, as_batch, batch
from ._callsite import mark_transparent, resolve_callsite_frame

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please refrain from reorganizing imports, especially with other PRs in flight.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a file that's very likely to be changed by other PRs. Other PRs currently in flight don't touch it.

@mzient mzient assigned mzient and unassigned szalpal Apr 15, 2026
Comment thread dali/python/nvidia/dali/experimental/dynamic/_op_builder.py Outdated
@rostan-t rostan-t force-pushed the ndd-callsite-resolver branch from 63286e6 to f02d3e2 Compare April 15, 2026 17:56
@rostan-t

Copy link
Copy Markdown
Collaborator Author

!builld

@rostan-t

Copy link
Copy Markdown
Collaborator Author

!build

@dali-automaton

Copy link
Copy Markdown
Collaborator

CI MESSAGE: [48675105]: BUILD STARTED

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
@rostan-t rostan-t force-pushed the ndd-callsite-resolver branch from f02d3e2 to 0edcbea Compare April 16, 2026 08:27
@rostan-t

Copy link
Copy Markdown
Collaborator Author

!build

@dali-automaton

Copy link
Copy Markdown
Collaborator

CI MESSAGE: [48676204]: BUILD STARTED

@dali-automaton

Copy link
Copy Markdown
Collaborator

CI MESSAGE: [48676204]: BUILD PASSED

@rostan-t rostan-t merged commit f98e32a into NVIDIA:main Apr 17, 2026
7 checks passed
@rostan-t rostan-t deleted the ndd-callsite-resolver branch April 17, 2026 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants