Skip to content

Conversation

@vladimirlaz
Copy link
Contributor

@vladimirlaz vladimirlaz commented Nov 27, 2020

When environment variable is set only one plugin is selected even if
several ones match criteria.

platform::get_platforms() and device::get_devices() ignored the
environment variable:

  • platform::get_platform() was updated to select only devices matching
    SYCL_DEVICE_FILTER;
  • platform::get_platform() is also fixed as it uses platform::get_devices()
    function to obtain list of devices;
  • device selectors also use functions above to get available devices and
    there is not need to do additional check in them;
  • for default_selector add extra score for devices selected by
    SYCL_DEVICE_FILTER;
  • add warning to sycl-ls when SYCL_DEVICE_FILTER is set.

@vladimirlaz vladimirlaz requested review from a team and smaslov-intel as code owners November 27, 2020 08:56
@vladimirlaz vladimirlaz changed the title [SYCL] Fix processing SYCL_DEVICE_FILTER [WIP][SYCL] Fix processing SYCL_DEVICE_FILTER Nov 27, 2020
@vladimirlaz
Copy link
Contributor Author

Please do not review. The fix is going to be reworked to avoid code duplication.

@vladimirlaz vladimirlaz changed the title [WIP][SYCL] Fix processing SYCL_DEVICE_FILTER [SYCL] Fix processing SYCL_DEVICE_FILTER Nov 30, 2020
@vladimirlaz
Copy link
Contributor Author

@smaslov-intel, the patch is ready for review.

@vladimirlaz
Copy link
Contributor Author

/Summary:run

@vladimirlaz
Copy link
Contributor Author

/summary:run

@vladimirlaz
Copy link
Contributor Author

the reported failure is known flaky test (SYCL :: Regression/complex_global_object.cpp) it can be ignored.

romanovvlad
romanovvlad previously approved these changes Dec 3, 2020
Copy link
Contributor

@romanovvlad romanovvlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but would like @bso-intel and @smaslov-intel to approve as well.

@bso-intel
Copy link
Contributor

Hi Vladimir,

FYI, I already took a quick look your old code, but did not have a chance to look your latest updates.
I will review this PR by midnight today.
I am sorry. I have other urgent thing to take care today.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sycl-ls uses platform::get_devices() to get the list of devices and print them in the order of indices of the vector PiDevices.
Please note that sycl-ls tool should be run without setting SYCL_DEVICE_FILTER.
That means sycl-ls will list all possible devices in the system regardless of the env var SYCL_DEVICE_FILTER.
Now, when this function is running with SYCL_DEVICE_FILTER set, we already disabled loading unrelated plugins.
For example, sycl-ls will list three GPU devices (OCL, CUDA, L0), one CPU, one ACC, and one HOST in this order.
And the user set SYCL_DEVICE_FILTER=level_zero:3 because sycl-ls listed L0 GPU as the index number 3.
SYCL RT will not even load open-cl and cuda plugins, so only L0 GPU device and one HOST are included in "PiDevices".
There is no index '3' in 'PiDevices' vector.
How do you handle this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warning was added to sycl-ls to as user unset the variable to get proper indexes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems I did not make myself clear.
What I wanted to point out is that the index is different when SYCL_DEVICE_FILTER is set and not set.
sycl-ls will report index without setting SYCL_DEVICE_FILTER because we will emit a warning if it is set.
user program will run with SYCL_DEVICE_FILTER is set. Remember once SYCL_DEVICE_FILTER is set, not all devices are loaded.
That means the same device will be given a different index between sycl-ls and the user program.
The end result is that the device_selector with a specific device number is not what the user has seen from sycl-ls.
Let me know if this does not make sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The devices are enumerated within platform. See example for 8 FPGA emulator devices and all other below:

$ CL_CONFIG_CPU_EMULATE_DEVICES=8 bin/sycl-ls  --verbose
Platforms: 5
Platform [#1]:
    Version  : OpenCL 1.2 Intel(R) FPGA SDK for OpenCL(TM), Version 20.3
    Name     : Intel(R) FPGA Emulation Platform for OpenCL(TM)
    Vendor   : Intel(R) Corporation
    Devices  : 8
        Device [#1]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
        Device [#2]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
        Device [#3]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
        Device [#4]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
        Device [#5]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
        Device [#6]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
        Device [#7]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
        Device [#8]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
Platform [#2]:
    Version  : OpenCL 2.1 LINUX
    Name     : Intel(R) OpenCL
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#1]:
        Type       : CPU
        Version    : 2.1
        Name       : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
Platform [#3]:
    Version  : OpenCL 3.0
    Name     : Intel(R) OpenCL HD Graphics
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#1]:
        Type       : GPU
        Version    : 3.0
        Name       : Intel(R) Graphics Gen9 [0x3e92]
        Vendor     : Intel(R) Corporation
        Driver     : 20.42.18209
Platform [#4]:
    Version  : 1.0
    Name     : Intel(R) Level-Zero
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#1]:
        Type       : GPU
        Version    : 1.0
        Name       : Intel(R) Graphics Gen9
        Vendor     : Intel(R) Corporation
        Driver     : 1.0.18209
Platform [#5]:
    Version  : 1.2
    Name     : SYCL host platform
    Vendor   :
    Devices  : 1
        Device [#1]:
        Type       : HOST
        Version    : 1.2
        Name       : SYCL host device
        Vendor     :
        Driver     : 1.2
default_selector()      : GPU : Intel(R) Level-Zero 1.0 [1.0.18209]
host_selector()         : HOST: SYCL host platform 1.2 [1.2]
accelerator_selector()  : ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]
cpu_selector()          : CPU : Intel(R) OpenCL 2.1 [2020.11.11.0.04_160000]
gpu_selector()          : GPU : Intel(R) Level-Zero 1.0 [1.0.18209]
custom_selector(gpu)    : GPU : Intel(R) Level-Zero 1.0 [1.0.18209]
custom_selector(cpu)    : CPU : Intel(R) OpenCL 2.1 [2020.11.11.0.04_160000]

When SYCL_DEVICE_FILTER is set sycl-ls reports only devices which match filter and enumerate them sequentially. The warning is added to notify user:

$ CL_CONFIG_CPU_EMULATE_DEVICES=8 SYCL_DEVICE_FILTER=1,4,5 bin/sycl-ls  --verbose
Warning: SYCL_DEVICE_FILTER environment variable is set. Unset it to get full list of devices and correct device indexes.
Platforms: 2
Platform [#1]:
    Version  : OpenCL 1.2 Intel(R) FPGA SDK for OpenCL(TM), Version 20.3
    Name     : Intel(R) FPGA Emulation Platform for OpenCL(TM)
    Vendor   : Intel(R) Corporation
    Devices  : 3
        Device [#1]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
        Device [#2]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
        Device [#3]:
        Type       : ACC
        Version    : 1.2
        Name       : Intel(R) FPGA Emulation Device
        Vendor     : Intel(R) Corporation
        Driver     : 2020.11.11.0.04_160000
Platform [#2]:
    Version  : 1.2
    Name     : SYCL host platform
    Vendor   :
    Devices  : 1
        Device [#1]:
        Type       : HOST
        Version    : 1.2
        Name       : SYCL host device
        Vendor     :
        Driver     : 1.2
default_selector()      : ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]
host_selector()         : HOST: SYCL host platform 1.2 [1.2]
accelerator_selector()  : ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]
cpu_selector()          : No device of requested type available. -1 (CL_DEVI...
gpu_selector()          : No device of requested type available. -1 (CL_DEVI...
custom_selector(gpu)    : No device of requested type available. -1 (CL_DEVI...
custom_selector(cpu)    : No device of requested type available. -1 (CL_DEVI...
custom_selector(acc)    : ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]

Also sycl-ls enumerates devices staring from 1 and SYCL_DEVICE_FILTER uses zero-based enumeration. but this should be addressed separately.

Copy link
Contributor

@bso-intel bso-intel Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I still did not make myself clear.
Let me try with an example.
Users should not add --verbose to get the desired device index.
Rather, we should ask the user to run without --verbose.
This way, they can get only the devices listed across all platforms. Here is expected output of sycl-ls with index of the device given at the beginning of each line.

  1. ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM)
  2. GPU : Intel(R) OpenCL HD Graphics 3.0 [20.44.18297]
  3. CPU : Intel(R) OpenCL 2.1 [2020.11.11.0.04_160000]
  4. GPU : Intel(R) Level-Zero 1.0 [1.0.18297]
  5. HOST: SYCL host platform 1.2 [1.2]

Now, let's assume user set SYCL_DEVICE_FILTER=level_zero:4 to set L0 GPU device.

There is no index "4" in PiDevices vector when the user program runs under SYCL_DEVICE_FILTER.
So, there is no device you can select that matches the SYCL_DEVICE_FILTER.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, Our documentation will be updated to reflect the change in sycl-ls.
The device number starts with 1, but not 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The device ID is counted within platform. No cross-platform enumeration is supported. this is how it worked before my change. If this behavior need to be changed please create feature request and make sure that all agreed on the change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please look at my comments below about the device number.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems I did not make myself clear.
What I wanted to point out is that the index is different when SYCL_DEVICE_FILTER is set and not set.
sycl-ls will report index without setting SYCL_DEVICE_FILTER because we will emit a warning if it is set.
user program will run with SYCL_DEVICE_FILTER is set. Remember once SYCL_DEVICE_FILTER is set, not all devices are loaded.
That means the same device will be given a different index between sycl-ls and the user program.
The end result is that the device_selector with a specific device number is not what the user has seen from sycl-ls.
Let me know if this does not make sense.

@vladimirlaz
Copy link
Contributor Author

/summary:run

@vladimirlaz
Copy link
Contributor Author

@bso-intel , @romanovvlad, @smaslov-intel could you please have a look? I addressed all provided comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In non-verbose mode, please append the device number for each device so that the user can use that number in SYCL_DEVICE_FILTER. At line 139.

Copy link
Contributor Author

@vladimirlaz vladimirlaz Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is new feature request not connected to the bug I am fixing. I suggest to create issue and have broad discussion to make sure that all are on the same page before do major changes in interfaces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly recommend putting the device number when listing the devices in this PR since your code changes in this PR already assumes this change in sycl-ls.
That's much clearer for users what device number to use for SYCL_DEVICE_FILTER.
I request @romanovvlad to comment on this.
If he is okay to delay to another PR, I am okay too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@romanovvlad, friendly ping on question from @bso-intel

Copy link
Contributor

@bso-intel bso-intel Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we did not make it clear the definition of device number.
Device number is a global number in front of the device name at each line printed by sycl-ls without --verbose.
The device number is not a local number within each platform.
Here is an example output from sycl-ls tool.

  1. ACC : Intel(R) FPGA Emulation Platform for OpenCL(TM) 1.2 [2020.11.11.0.04_160000]
  2. GPU : Intel(R) OpenCL HD Graphics 3.0 [20.44.18297]
  3. CPU : Intel(R) OpenCL 2.1 [2020.11.11.0.04_160000]
  4. GPU : Intel(R) Level-Zero 1.0 [1.0.18297]
  5. HOST: SYCL host platform 1.2 [1.2]

Please note that SYCL_DEVICE_NUMBER=1 would be vague if you treat '1' as the local index within each platform.
@romanovvlad @smaslov-intel , what do you think?

Copy link
Contributor Author

@vladimirlaz vladimirlaz Dec 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is misunderstanding: SYCL_DEVICE_FILTER always operated with device IDs within platform.
That was before my change and it stays the same after. If this behavior need to be changed it should be done as separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's not ture. My original implementation calculated the device index correctly across platforms.

device device_selector::select_device() const {
  vector_class<device> devices = device::get_devices();
  ...
  for (const auto &dev : devices) {
   ...
    // If SYCL_DEVICE_FILTER is set, give a bonus point for the device
    // whose index matches with desired device number.
    int index = &dev - &devices[0];
    if (isForcedDevice(dev, index)) {
      dev_score += 1000;
    }

Copy link
Contributor Author

@vladimirlaz vladimirlaz Dec 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That conflicts with documentation: "Device_num is an integer that indexes the enumeration of devices from the sycl::platform::get_device() call, where the first device in that enumeration has index zero. " (see https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md).
And finally there are bugs in that implementation because:

  • device_num field only affects device_selector() other selectors ignore it.
  • custom_selectors/user_selectros will ignore SYCL_DEVICE_FILTER.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm..
I wrote that document based on my original implementation.
The original implementation calls get_devices() and find the index correctly.
I don't understand what conflicts/bug you are referring to.
All device selectors are affected by SYCL_DEVICE_FILTER.
Let me repeat, SYCL_DEVICE_FILTER is dual-purposed, and it already filter out irrelevant plugins and not even load it, which means no related platforms and devices are included in the enumeration returned from get_devices().
For example, SYCL_DEVICE_FILTER=level_zero, no opencl platforms or devices are available because opencl plugin is not even loaded.
custom selectors cannot ignore SYCL_DEVICE_FILTER for this reason, too.

@vladimirlaz vladimirlaz requested a review from bso-intel December 9, 2020 09:14
Copy link
Contributor

@bso-intel bso-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Vladimir,

If we cannot resolve the discussion, let's have another meeting to discuss the pending issues and reach an agreement. Let me know.

@vladimirlaz vladimirlaz force-pushed the sycl branch 3 times, most recently from 52287fb to 7b87f47 Compare January 11, 2021 07:24
@vladimirlaz vladimirlaz marked this pull request as draft January 15, 2021 05:37
jsji pushed a commit that referenced this pull request Nov 22, 2024
This was hitting a "not implemented UNREACHABLE".
Like the other cooperative matrix operations, map this construct to a
SPIR-V friendly IR function call.

Let `transDbgInfo` skip over `OpConstantComposite` because we're
mapping `OpConstantComposite` to an LLVM `Instruction` without having
a corresponding `SPIRVInstruction`.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@04b546550077c4f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants