[Profiler] Provide a method to profile Triton XPU Kernel's accuracy execution time. #1066

chengjunlu · 2024-05-08T08:26:20Z

There is currently no standalone profiler tool for Triton XPU.

So far we have used:

The legacy Torch profiler with the IPEX extension. (IPEX is going to remove this.)
The new Torch profiler with Kineto extended by IPEX. (This depends on Kineto and Torch.)
A synchronization wait on the host to measure performance. (This is inaccurate because the measurement includes host overhead.)
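
To illustrate the third point, here is a minimal sketch of why a host-side synchronization wait overstates kernel time. It uses only the Python standard library; `FakeDevice`, `launch`, `synchronize`, and the sleep durations are hypothetical stand-ins for a real GPU queue and are not part of Triton, IPEX, or SYCL.

```python
import time


def host_timed(launch, synchronize):
    """Wall-clock seconds around a launch followed by a host-side wait."""
    start = time.perf_counter()
    launch()        # host-side submission (includes launch overhead)
    synchronize()   # block the host until the device is done
    return time.perf_counter() - start


class FakeDevice:
    """Hypothetical device model: sleeps stand in for real work."""

    def __init__(self, kernel_s, overhead_s):
        self.kernel_s = kernel_s        # simulated kernel duration
        self.overhead_s = overhead_s    # simulated host submission overhead
        self.device_elapsed_s = 0.0     # what device-side timestamps would report

    def launch(self):
        # Simulated host overhead before the kernel starts running.
        time.sleep(self.overhead_s)

    def synchronize(self):
        # Simulated kernel execution, timed the way device timestamps would.
        t0 = time.perf_counter()
        time.sleep(self.kernel_s)
        self.device_elapsed_s = time.perf_counter() - t0


if __name__ == "__main__":
    dev = FakeDevice(kernel_s=0.010, overhead_s=0.002)
    host_s = host_timed(dev.launch, dev.synchronize)
    print(f"host-side measurement: {host_s * 1e3:.3f} ms")
    print(f"device-side duration:  {dev.device_elapsed_s * 1e3:.3f} ms")
```

In this toy model the host-side number is always larger than the device-side one, because it also counts the submission overhead. Timestamps recorded on the device itself avoid that inflation.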

Triton has a new component for profiling the performance of Triton kernels. It is worth supporting it for Triton XPU.

tdeng5 · 2024-05-16T01:14:30Z

Collecting accurate Triton performance data is the highest priority for the upcoming Triton demo on Jun 25.

etiotto · 2024-05-17T15:50:19Z

I have added post-review comments to the PR that closed this issue, see #1136 (comment).

I am concerned that the benchmarks compute timing differently from the way Triton's do_bench does.

vlad-penkin added this to the 0.3 [Triton] Language and Runtime milestone May 8, 2024

vlad-penkin added the enhancement (New feature or request) and performance labels May 8, 2024

tdeng5 changed the title ~~[Profiler] Support Triton XPU kernel profiling thru the Proton~~ [Profiler] Provide an accuracy method to profile Triton XPU Kernel's execution time. May 16, 2024

tdeng5 changed the title ~~[Profiler] Provide an accuracy method to profile Triton XPU Kernel's execution time.~~ [Profiler] Provide a method to profile Triton XPU Kernel's accuracy execution time. May 16, 2024

chengjunlu linked a pull request May 16, 2024 that will close this issue

To profile the Intel GPU kernels with the SYCL event profiling time stamp instead of barrier time diff. #1136

Merged

chengjunlu closed this as completed in #1136 May 17, 2024

chengjunlu mentioned this issue May 17, 2024

[Profiler] To support Triton profiler Proton to profile the Intel GPU kernel performance #1145

Open

etiotto reopened this May 17, 2024

etiotto assigned chengjunlu May 17, 2024

vlad-penkin modified the milestones: 0.3 [Triton] Language and Runtime, 4.0 [Performance] Core Jun 12, 2024
