-
Notifications
You must be signed in to change notification settings - Fork 794
[SYCL] Fix processing SYCL_DEVICE_FILTER #2826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Please do not review. The fix is going to be reworked to avoid code duplication. |
|
@smaslov-intel, the patch is ready for review. |
|
/Summary:run |
|
/summary:run |
|
the reported failure is known flaky test (SYCL :: Regression/complex_global_object.cpp) it can be ignored. |
romanovvlad
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but would like @bso-intel and @smaslov-intel to approve as well.
|
Hi Vladimir, FYI, I already took a quick look your old code, but did not have a chance to look your latest updates. |
sycl/source/detail/platform_impl.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sycl-ls uses platform::get_devices() to get the list of devices and print them in the order of indices of the vector PiDevices.
Please note that sycl-ls tool should be run without setting SYCL_DEVICE_FILTER.
That means sycl-ls will list all possible devices in the system regardless of the env var SYCL_DEVICE_FILTER.
Now, when this function is running with SYCL_DEVICE_FILTER set, we already disabled loading unrelated plugins.
For example, sycl-ls will list three GPU devices (OCL, CUDA, L0), one CPU, one ACC, and one HOST in this order.
And the user set SYCL_DEVICE_FILTER=level_zero:3 because sycl-ls listed L0 GPU as the index number 3.
SYCL RT will not even load open-cl and cuda plugins, so only L0 GPU device and one HOST are included in "PiDevices".
There is no index '3' in 'PiDevices' vector.
How do you handle this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The warning was added to sycl-ls to as user unset the variable to get proper indexes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems I did not make myself clear.
What I wanted to point out is that the index is different when SYCL_DEVICE_FILTER is set and not set.
sycl-ls will report index without setting SYCL_DEVICE_FILTER because we will emit a warning if it is set.
user program will run with SYCL_DEVICE_FILTER is set. Remember once SYCL_DEVICE_FILTER is set, not all devices are loaded.
That means the same device will be given a different index between sycl-ls and the user program.
The end result is that the device_selector with a specific device number is not what the user has seen from sycl-ls.
Let me know if this does not make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The devices are enumerated within platform. See example for 8 FPGA emulator devices and all other below:
$ CL_CONFIG_CPU_EMULATE_DEVICES=8 bin/sycl-ls --verbose
Platforms: 5
Platform [#1]:
Version : OpenCL 1.2 Intel(R) FPGA SDK for OpenCL(TM), Version 20.3
Name : Intel(R) FPGA Emulation Platform for OpenCL(TM)
Vendor : Intel(R) Corporation
Devices : 8
Device [#1]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Device [#2]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Device [#3]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Device [#4]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Device [#5]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Device [#6]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Device [#7]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Device [#8]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Platform [#2]:
Version : OpenCL 2.1 LINUX
Name : Intel(R) OpenCL
Vendor : Intel(R) Corporation
Devices : 1
Device [#1]:
Type : CPU
Version : 2.1
Name : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Platform [#3]:
Version : OpenCL 3.0
Name : Intel(R) OpenCL HD Graphics
Vendor : Intel(R) Corporation
Devices : 1
Device [#1]:
Type : GPU
Version : 3.0
Name : Intel(R) Graphics Gen9 [0x3e92]
Vendor : Intel(R) Corporation
Driver : 20.42.18209
Platform [#4]:
Version : 1.0
Name : Intel(R) Level-Zero
Vendor : Intel(R) Corporation
Devices : 1
Device [#1]:
Type : GPU
Version : 1.0
Name : Intel(R) Graphics Gen9
Vendor : Intel(R) Corporation
Driver : 1.0.18209
Platform [#5]:
Version : 1.2
Name : SYCL host platform
Vendor :
Devices : 1
Device [#1]:
Type : HOST
Version : 1.2
Name : SYCL host device
Vendor :
Driver : 1.2
default_selector() : GPU : Intel(R) Level-Zero 1.0 [1.0.18209]
host_selector() : HOST: SYCL host platform 1.2 [1.2]
accelerator_selector() : ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]
cpu_selector() : CPU : Intel(R) OpenCL 2.1 [2020.11.11.0.04_160000]
gpu_selector() : GPU : Intel(R) Level-Zero 1.0 [1.0.18209]
custom_selector(gpu) : GPU : Intel(R) Level-Zero 1.0 [1.0.18209]
custom_selector(cpu) : CPU : Intel(R) OpenCL 2.1 [2020.11.11.0.04_160000]
When SYCL_DEVICE_FILTER is set sycl-ls reports only devices which match filter and enumerate them sequentially. The warning is added to notify user:
$ CL_CONFIG_CPU_EMULATE_DEVICES=8 SYCL_DEVICE_FILTER=1,4,5 bin/sycl-ls --verbose
Warning: SYCL_DEVICE_FILTER environment variable is set. Unset it to get full list of devices and correct device indexes.
Platforms: 2
Platform [#1]:
Version : OpenCL 1.2 Intel(R) FPGA SDK for OpenCL(TM), Version 20.3
Name : Intel(R) FPGA Emulation Platform for OpenCL(TM)
Vendor : Intel(R) Corporation
Devices : 3
Device [#1]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Device [#2]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Device [#3]:
Type : ACC
Version : 1.2
Name : Intel(R) FPGA Emulation Device
Vendor : Intel(R) Corporation
Driver : 2020.11.11.0.04_160000
Platform [#2]:
Version : 1.2
Name : SYCL host platform
Vendor :
Devices : 1
Device [#1]:
Type : HOST
Version : 1.2
Name : SYCL host device
Vendor :
Driver : 1.2
default_selector() : ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]
host_selector() : HOST: SYCL host platform 1.2 [1.2]
accelerator_selector() : ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]
cpu_selector() : No device of requested type available. -1 (CL_DEVI...
gpu_selector() : No device of requested type available. -1 (CL_DEVI...
custom_selector(gpu) : No device of requested type available. -1 (CL_DEVI...
custom_selector(cpu) : No device of requested type available. -1 (CL_DEVI...
custom_selector(acc) : ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]
Also sycl-ls enumerates devices staring from 1 and SYCL_DEVICE_FILTER uses zero-based enumeration. but this should be addressed separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I still did not make myself clear.
Let me try with an example.
Users should not add --verbose to get the desired device index.
Rather, we should ask the user to run without --verbose.
This way, they can get only the devices listed across all platforms. Here is expected output of sycl-ls with index of the device given at the beginning of each line.
- ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM)
- GPU : Intel(R) OpenCL HD Graphics 3.0 [20.44.18297]
- CPU : Intel(R) OpenCL 2.1 [2020.11.11.0.04_160000]
- GPU : Intel(R) Level-Zero 1.0 [1.0.18297]
- HOST: SYCL host platform 1.2 [1.2]
Now, let's assume user set SYCL_DEVICE_FILTER=level_zero:4 to set L0 GPU device.
There is no index "4" in PiDevices vector when the user program runs under SYCL_DEVICE_FILTER.
So, there is no device you can select that matches the SYCL_DEVICE_FILTER.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, Our documentation will be updated to reflect the change in sycl-ls.
The device number starts with 1, but not 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The device ID is counted within platform. No cross-platform enumeration is supported. this is how it worked before my change. If this behavior need to be changed please create feature request and make sure that all agreed on the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please look at my comments below about the device number.
sycl/source/detail/platform_impl.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems I did not make myself clear.
What I wanted to point out is that the index is different when SYCL_DEVICE_FILTER is set and not set.
sycl-ls will report index without setting SYCL_DEVICE_FILTER because we will emit a warning if it is set.
user program will run with SYCL_DEVICE_FILTER is set. Remember once SYCL_DEVICE_FILTER is set, not all devices are loaded.
That means the same device will be given a different index between sycl-ls and the user program.
The end result is that the device_selector with a specific device number is not what the user has seen from sycl-ls.
Let me know if this does not make sense.
|
/summary:run |
|
@bso-intel , @romanovvlad, @smaslov-intel could you please have a look? I addressed all provided comments. |
sycl/tools/sycl-ls/sycl-ls.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In non-verbose mode, please append the device number for each device so that the user can use that number in SYCL_DEVICE_FILTER. At line 139.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is new feature request not connected to the bug I am fixing. I suggest to create issue and have broad discussion to make sure that all are on the same page before do major changes in interfaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly recommend putting the device number when listing the devices in this PR since your code changes in this PR already assumes this change in sycl-ls.
That's much clearer for users what device number to use for SYCL_DEVICE_FILTER.
I request @romanovvlad to comment on this.
If he is okay to delay to another PR, I am okay too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romanovvlad, friendly ping on question from @bso-intel
sycl/source/device_selector.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we did not make it clear the definition of device number.
Device number is a global number in front of the device name at each line printed by sycl-ls without --verbose.
The device number is not a local number within each platform.
Here is an example output from sycl-ls tool.
- ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]
- GPU : Intel(R) OpenCL HD Graphics 3.0 [20.44.18297]
- CPU : Intel(R) OpenCL 2.1 [2020.11.11.0.04_160000]
- GPU : Intel(R) Level-Zero 1.0 [1.0.18297]
- HOST: SYCL host platform 1.2 [1.2]
Please note that SYCL_DEVICE_NUMBER=1 would be vague if you treat '1' as the local index within each platform.
@romanovvlad @smaslov-intel , what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is misunderstanding: SYCL_DEVICE_FILTER always operated with device IDs within platform.
That was before my change and it stays the same after. If this behavior need to be changed it should be done as separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that's not ture. My original implementation calculated the device index correctly across platforms.
device device_selector::select_device() const {
vector_class<device> devices = device::get_devices();
...
for (const auto &dev : devices) {
...
// If SYCL_DEVICE_FILTER is set, give a bonus point for the device
// whose index matches with desired device number.
int index = &dev - &devices[0];
if (isForcedDevice(dev, index)) {
dev_score += 1000;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That conflicts with documentation: "Device_num is an integer that indexes the enumeration of devices from the sycl::platform::get_device() call, where the first device in that enumeration has index zero. " (see https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md).
And finally there are bugs in that implementation because:
- device_num field only affects device_selector() other selectors ignore it.
- custom_selectors/user_selectros will ignore SYCL_DEVICE_FILTER.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm..
I wrote that document based on my original implementation.
The original implementation calls get_devices() and find the index correctly.
I don't understand what conflicts/bug you are referring to.
All device selectors are affected by SYCL_DEVICE_FILTER.
Let me repeat, SYCL_DEVICE_FILTER is dual-purposed, and it already filter out irrelevant plugins and not even load it, which means no related platforms and devices are included in the enumeration returned from get_devices().
For example, SYCL_DEVICE_FILTER=level_zero, no opencl platforms or devices are available because opencl plugin is not even loaded.
custom selectors cannot ignore SYCL_DEVICE_FILTER for this reason, too.
bso-intel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Vladimir,
If we cannot resolve the discussion, let's have another meeting to discuss the pending issues and reach an agreement. Let me know.
52287fb to
7b87f47
Compare
This was hitting a "not implemented UNREACHABLE". Like the other cooperative matrix operations, map this construct to a SPIR-V friendly IR function call. Let `transDbgInfo` skip over `OpConstantComposite` because we're mapping `OpConstantComposite` to an LLVM `Instruction` without having a corresponding `SPIRVInstruction`. Original commit: KhronosGroup/SPIRV-LLVM-Translator@04b546550077c4f
When environment variable is set only one plugin is selected even if
several ones match criteria.
platform::get_platforms() and device::get_devices() ignored the
environment variable:
SYCL_DEVICE_FILTER;
function to obtain list of devices;
there is not need to do additional check in them;
SYCL_DEVICE_FILTER;