[NFC][SYCL] Avoid unnecessary `std::unique_ptr<HostKernel<...>>` instantiations #17640

aelovikov-intel · 2025-03-25T16:54:36Z

The change improves "host" compilation times for cases with multiple kernels.
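To illustrate the idea in the title, here is a minimal sketch. It is not the actual DPC++ code, and it assumes the host kernel is stored through a base-class pointer. `HostKernelBase`, `HostKernel`, `Handler`, `set_kernel_heavy`, `set_kernel_light`, and `demo` are hypothetical stand-ins. The only point is that `std::make_unique<HostKernel<K>>` forces a fresh `std::unique_ptr<HostKernel<K>>` instantiation for every kernel type `K`, whereas resetting a `std::unique_ptr<HostKernelBase>` from a raw `new` expression does not.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins; these are NOT the real DPC++ classes.
struct HostKernelBase {
  virtual ~HostKernelBase() = default;
  virtual void run() = 0;
};

template <typename KernelType>
struct HostKernel final : HostKernelBase {
  explicit HostKernel(KernelType k) : kernel(std::move(k)) {}
  void run() override { kernel(); }
  KernelType kernel;
};

struct Handler {
  std::unique_ptr<HostKernelBase> host_kernel;

  // Instantiates std::make_unique<HostKernel<K>> and
  // std::unique_ptr<HostKernel<K>> (plus the converting move assignment)
  // once per kernel type K.
  template <typename K> void set_kernel_heavy(K k) {
    host_kernel = std::make_unique<HostKernel<K>>(std::move(k));
  }

  // Touches only std::unique_ptr<HostKernelBase>, which is instantiated
  // once no matter how many kernel types exist.
  template <typename K> void set_kernel_light(K k) {
    host_kernel.reset(new HostKernel<K>(std::move(k)));
  }
};

// Both variants behave identically at run time; only the set of
// templates the compiler has to instantiate differs.
int demo() {
  int result = 0;
  Handler h;
  h.set_kernel_heavy([&result] { result += 1; });
  assert(h.host_kernel != nullptr);
  h.host_kernel->run();
  h.set_kernel_light([&result] { result += 41; });
  assert(h.host_kernel != nullptr);
  h.host_kernel->run();
  return result;
}
```

With hundreds of distinct kernel types in one translation unit, the per-type `std::unique_ptr` machinery in the first variant is repeated for each of them, which is the kind of cost the title refers to.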

uditagarwal97 · 2025-03-25T16:59:13Z

> The change improves "host" compilation times for cases with multiple kernels.

Out of curiosity, do you happen to have performance improvement numbers? If so, please share them.

aelovikov-intel · 2025-03-25T21:31:14Z

> > The change improves "host" compilation times for cases with multiple kernels.
>
> Out of curiosity, do you happen to have performance improvement numbers? If so, please share them.

14.4s -> 8.6s for `$ time clang++ -fsycl a.cpp -o /dev/null` on something like:

  #include <sycl/sycl.hpp>

  int main() {
    sycl::queue q;
    int *p = nullptr; // the program is only compiled (-o /dev/null), never run
    // 2 x 200 compile-time iterations; each one defines a distinct kernel type.
    sycl::detail::loop<2>([&](auto outer_idx) {
      sycl::detail::loop<200>([&](auto idx) {
        auto krn = [=]() {
          *p = 42;
        };
        auto s = [&](sycl::handler &cgh) {
          // sycl::detail::CheckDeviceCopyable<decltype(krn)>();
          static_assert(std::is_invocable_r_v<void, decltype(krn)>);
          static_assert(!std::is_invocable_r_v<void, decltype(krn), sycl::handler>);
          static_assert(!std::is_invocable_r_v<void, decltype(krn), sycl::kernel_handler>);
          // krn();
          cgh.single_task(krn);
        };
        (void)sycl::detail::type_erased_cgfo_ty{s};
        static_assert(std::is_invocable_r_v<void, decltype(s), sycl::handler &>);
        q.submit(s);
      });
    });
  }

bader · 2025-03-25T22:34:18Z

> 14.4s -> 8.6s for `$ time clang++ -fsycl a.cpp -o /dev/null` on something like

@aelovikov-intel, are you using a debug build of the compiler? All these numbers seem too high.

aelovikov-intel · 2025-03-25T22:36:22Z

> > 14.4s -> 8.6s for `$ time clang++ -fsycl a.cpp -o /dev/null` on something like
>
> @aelovikov-intel, are you using a debug build of the compiler? All these numbers seem too high.

It's 400 kernels, see compile-time loops.
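A rough sketch of why the compile-time loops yield 400 kernels: each iteration calls a generic lambda with a different `std::integral_constant`, so the lambda body is instantiated once per index and every closure defined inside it is a new type. `static_loop`, `static_loop_impl`, and `count_distinct_kernel_types` are hypothetical names standing in for the internal `sycl::detail::loop`, whose real implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <type_traits>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical helper: calls f once per index, passing the index as a
// distinct std::integral_constant type.
template <std::size_t... Is, typename F>
void static_loop_impl(std::index_sequence<Is...>, F &&f) {
  (f(std::integral_constant<std::size_t, Is>{}), ...);
}

template <std::size_t N, typename F> void static_loop(F &&f) {
  static_loop_impl(std::make_index_sequence<N>{}, f);
}

// Counts how many distinct closure types the nested loops produce.
std::size_t count_distinct_kernel_types() {
  std::set<std::type_index> seen;
  static_loop<2>([&](auto outer) {
    static_loop<200>([&](auto inner) {
      // A new closure type for every (outer, inner) instantiation.
      auto krn = [=] { return outer.value * 200 + inner.value; };
      bool inserted = seen.insert(std::type_index(typeid(krn))).second;
      assert(inserted); // every instantiation contributes a new type
      (void)inserted;
    });
  });
  return seen.size();
}
```

In the benchmark above, each such closure is passed to `single_task`, so every per-kernel template in the headers is instantiated 400 times in a single translation unit.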

bader · 2025-03-26T15:14:25Z

> > > 14.4s -> 8.6s for `$ time clang++ -fsycl a.cpp -o /dev/null` on something like
> >
> > @aelovikov-intel, are you using a debug build of the compiler? All these numbers seem too high.
>
> It's 400 kernels, see compile-time loops.

@aelovikov-intel, would you mind checking whether this change improves the compile time of SYCL-CTS, please? SYCL-CTS compilation time with the DPC++ compiler on the GitHub runner exceeds the limit. I wonder if this change helps to fix this problem.

aelovikov-intel · 2025-03-26T17:44:43Z

> > > > 14.4s -> 8.6s for `$ time clang++ -fsycl a.cpp -o /dev/null` on something like
> > >
> > > @aelovikov-intel, are you using a debug build of the compiler? All these numbers seem too high.
> >
> > It's 400 kernels, see compile-time loops.
>
> @aelovikov-intel, would you mind checking whether this change improves the compile time of SYCL-CTS, please? SYCL-CTS compilation time with the DPC++ compiler on the GitHub runner exceeds the limit. I wonder if this change helps to fix this problem.

Surprisingly, it might. 13m -> 10.5m on SPR+PVC, where a few tests are bottlenecks. On a less powerful system it might be even more impactful.

That said, I'm not sure how stable/reproducible the gain is.

bader · 2025-03-26T17:49:25Z

> > > > > 14.4s -> 8.6s for `$ time clang++ -fsycl a.cpp -o /dev/null` on something like
> > > >
> > > > @aelovikov-intel, are you using a debug build of the compiler? All these numbers seem too high.
> > >
> > > It's 400 kernels, see compile-time loops.
> >
> > @aelovikov-intel, would you mind checking whether this change improves the compile time of SYCL-CTS, please? SYCL-CTS compilation time with the DPC++ compiler on the GitHub runner exceeds the limit. I wonder if this change helps to fix this problem.
>
> Surprisingly, it might. 13m -> 10.5m on SPR+PVC, where a few tests are bottlenecks. On a less powerful system it might be even more impactful.
>
> That said, I'm not sure how stable/reproducible the gain is.

Thanks! I expect tests checking math built-ins and vector operations to benefit from this change. These tests auto-generate a lot of kernels.

Commit a2baaa8: [NFC][SYCL] Avoid unnecessary `std::unique_ptr<HostKernel<...>>` instantiations

The change improves "host" compilation times for cases with multiple kernels.

aelovikov-intel requested a review from a team as a code owner March 25, 2025 16:54

aelovikov-intel requested a review from uditagarwal97 March 25, 2025 16:54

aelovikov-intel temporarily deployed to WindowsCILock March 25, 2025 16:54 — with GitHub Actions Inactive

aelovikov-intel mentioned this pull request Mar 25, 2025

[SYCL] Refactor sycl::detail::HostKernel #17621

Closed

uditagarwal97 approved these changes Mar 25, 2025

aelovikov-intel temporarily deployed to WindowsCILock March 25, 2025 17:36 — with GitHub Actions Inactive

aelovikov-intel mentioned this pull request Mar 25, 2025

[SYCL] Avoid unnecessary kernel copies #17584

Merged

aelovikov-intel merged commit 42990a6 into intel:sycl Mar 26, 2025
44 of 46 checks passed

aelovikov-intel deleted the host-kernel-compile-time branch March 26, 2025 17:40
