Improve get_kernel_bundle performance #5496

Merged


romanovvlad
Contributor

@romanovvlad romanovvlad commented Feb 7, 2022

This patch improves the performance of the get_kernel_bundle overload that
takes a list of kernel_ids.

There are three main improvements:

  1. To speed up the search for the binary images matching the given
    kernel_ids, a dedicated map is prebuilt during initial image registration.
  2. To avoid constructing the vector of kernel_ids for a device image on
    every call, a dedicated map is prebuilt during initial image registration.
  3. The vector of kernel_ids for a device image is now shared to avoid
    unnecessary copies.
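
The three points above can be illustrated with a minimal sketch. It is not the code from this patch: `KernelId`, `BinaryImage`, `ImageRegistry` and all member names are hypothetical stand-ins, and the real runtime types in program_manager.cpp and device_image_impl.hpp differ. The sketch only assumes that images are registered once and queried many times.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for illustration only, not the SYCL runtime types.
using KernelId = std::string;
struct BinaryImage {
  std::vector<KernelId> kernel_names;
};

class ImageRegistry {
public:
  using SharedIds = std::shared_ptr<const std::vector<KernelId>>;

  // Runs once per image at registration time and prebuilds both maps.
  void register_image(const BinaryImage *img) {
    assert(img != nullptr);
    auto ids = std::make_shared<const std::vector<KernelId>>(img->kernel_names);
    for (const KernelId &id : *ids)
      kernel_to_image_.emplace(id, img);
    image_to_ids_[img] = std::move(ids);
  }

  // Point 1: hash lookups per kernel id instead of scanning every image.
  std::vector<const BinaryImage *>
  images_for(const std::vector<KernelId> &ids) const {
    std::vector<const BinaryImage *> result;
    for (const KernelId &id : ids) {
      auto range = kernel_to_image_.equal_range(id);
      for (auto it = range.first; it != range.second; ++it)
        if (std::find(result.begin(), result.end(), it->second) == result.end())
          result.push_back(it->second);
    }
    return result;
  }

  // Points 2 and 3: the id vector was built at registration and is handed
  // out through a shared_ptr, so callers share one vector instead of copying.
  SharedIds kernel_ids_of(const BinaryImage *img) const {
    auto it = image_to_ids_.find(img);
    if (it == image_to_ids_.end())
      return nullptr;
    return it->second;
  }

private:
  std::unordered_multimap<KernelId, const BinaryImage *> kernel_to_image_;
  std::unordered_map<const BinaryImage *, SharedIds> image_to_ids_;
};
```

In this sketch the cost of building the maps is paid once per registered image, while each query is reduced to hash lookups and a shared_ptr copy.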

@romanovvlad
Contributor Author

romanovvlad commented Feb 7, 2022

@steffenlarsen FYI

@steffenlarsen steffenlarsen self-requested a review February 7, 2022 10:02
@romanovvlad
Contributor Author

Addressing the kernel_bundle_api failure in intel/llvm-test-suite#818

@romanovvlad
Contributor Author

/summary:run

sycl/source/detail/program_manager/program_manager.cpp (outdated, resolved)
sycl/source/backend.cpp (outdated, resolved)
sycl/source/detail/device_image_impl.hpp (outdated, resolved)
sycl/source/detail/device_image_impl.hpp (outdated, resolved)
@romanovvlad romanovvlad marked this pull request as ready for review February 17, 2022 13:14
@romanovvlad romanovvlad requested a review from a team as a code owner February 17, 2022 13:14
@romanovvlad
Contributor Author

/summary:run

Contributor

@steffenlarsen steffenlarsen left a comment

LGTM!

@romanovvlad romanovvlad merged commit 4817b3f into intel:sycl Feb 18, 2022
smaslov-intel pushed a commit to smaslov-intel/llvm that referenced this pull request Feb 19, 2022
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Feb 23, 2022
* upstream/sycl: (2757 commits)
  [SYCL][Doc] Fixing incorrect merge of community Readme.md with our version (intel#5636)
  [SYCL] Change USM pooling parameters. (intel#5457)
  [CI] Fix cache location on Windows (intel#5603)
  [SYCL][NFC] Fix a warning about uninitialized struct members (intel#5610)
  [Buildbot] Update Windows GPU version to 101.1340 (intel#5620)
  Fix SPIRV -> OCL barrier call argument attributes
  Move SPV_INTEL_memory_access_aliasing tokens from spirv_internal
  [SYCL][ESIMD] Add support for named barrier APIs (intel#5583)
  [SYCL][L0] Remove ZeModule when program build failed (intel#5541)
  [SYCL] Silence "unknown attribute" warning for `device_indirectly_callable` (intel#5591)
  [SYCL][DOC] Introductory material for extensions (intel#5605)
  [SYCL][DOC] Change extension names to lower case (intel#5607)
  [SYCL] Improve get_kernel_bundle performance (intel#5496)
  [SYCL] Do not build device code for sub-devices (intel#5240)
  [sycl-post-link] Fix a crash during spec-constant properties generation (intel#5538)
  [SYCL][DOC] Move SPIR-V and OpenCL extensions (intel#5578)
  [SYCL][ESIMD][EMU] Update memory intrinsics for ESIMD_EMU plugin (intel#4748)
  [CI] Allow stale issue bot to analyze more issues (intel#5602)
  [SYCL][L0] Honor property::queue::enable_profiling (intel#5543)
  [OpenMP] Properly save strings when doing LTO
  ...