Fix plugin EP profiling timestamp skew on macOS #27994

Merged
tianleiwu merged 2 commits into main from tlwu/20260406/plugin_ep_profiling_macos_fix
Apr 10, 2026

@tianleiwu
Contributor

Description

This fixes a flaky failure in the plugin EP profiling tests on macOS, where reconstructed plugin event timestamps could land a few microseconds outside the correlated ORT parent event interval.

The current example plugin profiler reconstructs EP-relative timestamps by combining ORT's profiling-start offset with elapsed time from the EP clock. That reconstruction is close but not exact across clocks, and on macOS the skew was enough to fail the strict containment checks in KernelPluginEp_SessionProfiling with cases like ep_start < parent_start by a small margin.
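A minimal sketch of this kind of reconstruction may help show where the skew comes from. The function name and parameters below are hypothetical and are not taken from ep_profiling.cc; the sketch only assumes that an ORT-side profiling-start offset in microseconds is added to the elapsed time measured on a separate EP-side clock.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not the actual ep_profiling.cc code).
// Reconstructs an ORT-relative timestamp in microseconds for an EP event:
//   ort_profiling_start_offset_us : offset reported by ORT when profiling started
//   ep_profiling_start            : EP clock reading taken when profiling started
//   ep_event_time                 : EP clock reading taken for the event itself
inline int64_t ReconstructOrtRelativeUs(
    int64_t ort_profiling_start_offset_us,
    std::chrono::steady_clock::time_point ep_profiling_start,
    std::chrono::steady_clock::time_point ep_event_time) {
  assert(ep_event_time >= ep_profiling_start);
  // Elapsed time on the EP clock, truncated to whole microseconds.
  const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
      ep_event_time - ep_profiling_start);
  // The two terms come from different clock readings, so the sum only
  // approximates what the ORT clock would have reported for this event.
  return ort_profiling_start_offset_us + elapsed.count();
}
```

Because the ORT offset and the EP start reading are not captured at exactly the same instant, the result can differ from the ORT clock by a few microseconds. If ORT records the parent start as 1251 and the reconstruction yields 1250, the event shows ep_start < parent_start even though the logical ordering is correct.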

Instead of weakening the test, this change keeps the strict contract and fixes the profiler output so child EP events are always emitted within the correlated ORT parent event interval.

Key Changes

| File | Change |
| --- | --- |
| onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.h | Stores the correlated ORT parent event start timestamp and duration on each collected EP event, and adds the helper signature updates needed to propagate that metadata. |
| onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.cc | Captures parent event timing from Ort::ConstProfilingEvent, attaches it to EP events during StopEventImpl, and clamps the reconstructed EP start/end interval to the parent ORT interval before emitting the final profiling event. |
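The metadata propagation described above could look roughly like the following sketch. All type, field, and function names here are hypothetical stand-ins; the real structures in ep_profiling.h and the accessors on Ort::ConstProfilingEvent are not shown in this description, so the sketch models the parent timing as a plain struct.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the timing read from the correlated ORT parent event.
struct ParentTiming {
  int64_t start_us = 0;
  int64_t duration_us = 0;
};

// Hypothetical collected EP event that carries its parent's timing alongside
// its own reconstructed interval.
struct CollectedEpEvent {
  std::string name;
  int64_t ep_start_us = 0;     // reconstructed from the EP clock
  int64_t ep_duration_us = 0;  // measured on the EP clock
  ParentTiming parent;         // correlated ORT parent interval
};

// Attaches the parent timing to every EP event recorded while that parent
// event was open (assumed here to happen when the parent event stops).
inline void AttachParentTiming(std::vector<CollectedEpEvent>& pending,
                               const ParentTiming& parent) {
  assert(parent.duration_us >= 0);
  for (auto& event : pending) {
    event.parent = parent;
  }
}
```

Storing the parent interval on each event means the final emission step needs no second lookup to find the bounds it should respect.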

Why This Change Is Needed

  • The plugin EP profiling tests intentionally require strict nesting: EP child events must stay within the ORT parent event interval.
  • The existing implementation reconstructs EP timestamps from two different clocks, which can drift by a few microseconds depending on platform timing behavior.
  • macOS exposed that drift often enough to make KernelPluginEp_SessionProfiling flaky even though the logical event ordering was correct.
  • Clamping the emitted child interval to the already-correlated parent interval preserves the expected semantics and removes the platform-specific skew from the final profiling output.
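The clamping step in the last bullet can be sketched as below. This is an illustration under stated assumptions, not the code in ep_profiling.cc: the `Interval` type and `ClampToParent` name are hypothetical, timestamps are assumed to be microseconds, and the parent bounds are treated as inclusive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical half of an event record: a start/end pair in microseconds.
struct Interval {
  int64_t start_us;
  int64_t end_us;
};

// Clamps a reconstructed child interval into [parent_start, parent_end].
// Requires C++17 for std::clamp.
inline Interval ClampToParent(Interval child, int64_t parent_start_us,
                              int64_t parent_duration_us) {
  assert(parent_duration_us >= 0);
  const int64_t parent_end_us = parent_start_us + parent_duration_us;
  // Pull the start inside the parent interval first.
  const int64_t start_us =
      std::clamp(child.start_us, parent_start_us, parent_end_us);
  // Then keep the end inside the parent and never before the clamped start.
  const int64_t end_us = std::clamp(child.end_us, start_us, parent_end_us);
  return Interval{start_us, end_us};
}
```

With a parent interval of [1000, 1050], a child reconstructed as [998, 1040] is emitted as [1000, 1040], and a child that already lies inside the parent is left unchanged.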

Testing

  • ninja -C build/cuda/Debug onnxruntime_autoep_test
  • cd build/cuda/Debug && ./onnxruntime_autoep_test --gtest_filter=OrtEpLibrary.KernelPluginEp_SessionProfiling
  • cd build/cuda/Debug && ./onnxruntime_autoep_test --gtest_filter=OrtEpLibrary.KernelPluginEp_RunProfiling

Notes For Reviewers

  • This is intentionally scoped to the example plugin EP profiling path used by the AutoEP tests.
  • The change avoids relaxing any assertions in test_execution.cc; it fixes the emitted profiling data instead.
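For reference, the nesting contract that the unchanged assertions enforce can be expressed as a small predicate. The exact checks in test_execution.cc are not reproduced in this description, so the function below is a hypothetical sketch that assumes inclusive bounds and microsecond timestamps.

```cpp
#include <cstdint>

// Hypothetical predicate: true when the child interval lies entirely within
// the parent interval (bounds inclusive).
constexpr bool IsContainedInParent(int64_t child_start_us,
                                   int64_t child_duration_us,
                                   int64_t parent_start_us,
                                   int64_t parent_duration_us) {
  return child_start_us >= parent_start_us &&
         child_start_us + child_duration_us <=
             parent_start_us + parent_duration_us;
}

// A child starting 2 us before its parent violates the contract.
static_assert(!IsContainedInParent(998, 40, 1000, 50),
              "child starts before parent");
// A child fully inside the parent satisfies it.
static_assert(IsContainedInParent(1000, 40, 1000, 50),
              "child is nested in parent");
```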

Contributor

Copilot AI left a comment

Pull request overview

Fixes flaky AutoEP plugin-EP profiling tests on macOS by ensuring emitted EP child event timestamps are always strictly contained within their correlated ORT parent event interval (despite small cross-clock reconstruction skew).

Changes:

  • Propagates correlated ORT parent event start timestamp and duration into each collected EP event.
  • Updates the ORT event boundary helper (PopOrtEvent) to carry parent timing metadata.
  • Clamps reconstructed EP event start/end timestamps to the correlated ORT parent interval before emitting profiling events.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.h | Extends stored EP event metadata with correlated ORT parent timing and updates helper function signatures. |
| onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.cc | Captures ORT parent timing in StopEventImpl and clamps reconstructed EP event intervals within the ORT parent bounds in EndProfilingImpl. |

Contributor

@adrianlizarraga adrianlizarraga left a comment

Thank you!

@tianleiwu tianleiwu enabled auto-merge (squash) April 10, 2026 08:24
@tianleiwu tianleiwu merged commit bbb0cd0 into main Apr 10, 2026
97 of 98 checks passed
@tianleiwu tianleiwu deleted the tlwu/20260406/plugin_ep_profiling_macos_fix branch April 10, 2026 09:43