[SYCL2020] Migrate information descriptors to their respective namespaces #987

nmnobre · 2023-03-27T23:51:55Z

Hi Aksel,

I've tried to move all the information descriptors from their old enum classes to their new namespaces.
The motivation for this is that info::device::max_work_item_sizes is now a template and I couldn't figure out a clean way of maintaining the old way of doing things while implementing this template...
I've not removed or added any functionality. In particular, old features like the program class and its information descriptors remain, just with the new syntax. I believe this should be completely transparent for the user; it's just the slightly awkward situation of implementing some parts of the old standard in the new standard's way...

Closes #774.

Cheers,
-Nuno

illuhad

Wow, Nuno, you are on fire :D

illuhad · 2023-03-28T00:33:30Z

include/hipSYCL/sycl/device.hpp

+      rt::device_uint_property::max_group_size0));
+  std::size_t size1 = static_cast<std::size_t>(get_rt_device()->get_property(
+      rt::device_uint_property::max_group_size1));
+  return id<2>{size0, size1};


Here we need to be extra careful. The main motivation for switching to structs instead of the old enums in SYCL 2020 was exactly this query: for some backends, SYCL implementations need to flip the dimensions.
E.g. on CUDA, the "vectorized" index is always x (i.e. 0), while the SYCL memory layout wants the highest index to be the fast one.
So, on such backends, SYCL implementations flip indices such that from the perspective of the hardware, x is the fast index, while from the application point of view it remains the highest index.

I believe these runtime queries directly return the backend's result for the respective dimension (can you double-check?). Therefore, to make everything match, we would need to check here whether a flip is required (this might require a new runtime property, needs_index_flip or similar), and if so, return {size1, size0}.
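
To illustrate the flip described above, here is a minimal sketch, not the actual hipSYCL code. `backend_limits_2d`, `to_sycl_order` and the `needs_flip` flag are hypothetical names introduced only for this example, and the limits in the note below are made-up values.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for what a backend reports: per-dimension maximum
// group sizes in the backend's own order.
struct backend_limits_2d {
  std::size_t size0; // backend dimension 0 (x on CUDA, the fast index)
  std::size_t size1; // backend dimension 1 (y on CUDA)
};

// Translate backend-ordered limits into SYCL order. When the backend flips
// dimensions, backend dimension 0 corresponds to the highest SYCL index, so
// the two entries are swapped; otherwise they are passed through unchanged.
inline std::array<std::size_t, 2> to_sycl_order(backend_limits_2d limits,
                                                bool needs_flip) {
  if (needs_flip)
    return {limits.size1, limits.size0};
  return {limits.size0, limits.size1};
}
```

With made-up backend limits of 1024 (dimension 0) and 64 (dimension 1), `to_sycl_order({1024, 64}, true)` yields `{64, 1024}`, so the application sees the larger limit on its highest index, which is the fast one from SYCL's point of view.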

Yeah, I think you suggested that in #832, I'll look into that...

Hi Aksel, I've taken a shortcut by just flipping the return value of the query irrespective of the runtime backend.
Here's my reasoning:

CUDA: the flip is needed, thus covered.

HIP: since they've followed the CUDA implementation so closely, I'd assume the flip also applies. In any case, I've confirmed that on both the MI100 and MI200 series, the limits are the same for the three dimensions (1024), so it's rather irrelevant.

OpenMP: the limits are currently being hardcoded by us and they are the same for all three dimensions (1024).

Level Zero: not sure here, never tried this backend myself if I'm honest, but all examples I found online seem to indicate the limits are also the same for the three dimensions. See here and also here.

Thoughts?

It's probably true that currently we might be able to get by with just flipping in any case. HIP, CUDA and Level Zero do the flip, and OpenMP does not (but exposes similar limits in all dimensions as you point out, although it's questionable whether this is actually correct).

My concern is that this introduces an assumption that might no longer hold in the future, e.g. if new backends are introduced, and might be surprising then. To me, the better long-term strategy seems to be to ask those components that actually know the details (such as the runtime backends) instead of making assumptions in the SYCL headers.

Is there a problem introducing new runtime queries?

An alternative to asking whether things should flip could be to directly add a runtime query for the 1, 2, and 3-dimensional maximums. That might be even more flexible.

No, there is no problem, it's just I guess I started wondering if it was worth the trouble. :-)
Please have a look now, do I just leave the properties hardcoded or is there a better way of querying the backends?

It looks good to me. What do you mean by hardcoded properties for querying the backends?

So, you knew that...

HIP, CUDA and Level Zero do the flip, and OpenMP does not

But I didn't. So I was wondering if there is a property of the respective runtimes I could query that would expose the need for flipping instead of just returning the boolean literals explicitly like we're doing now.
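
A minimal sketch of the "ask the backend" approach discussed here, assuming a simplified runtime interface. Everything in the `sketch` namespace (the reduced `device_uint_property` enum, `hardware_context`, the two example backends and their limits) is hypothetical and only loosely modelled on the names visible in this PR's diff; it is not the actual runtime code.

```cpp
#include <array>
#include <cstddef>

namespace sketch {

// Reduced, hypothetical version of a runtime property enum.
enum class device_uint_property {
  max_group_size0,
  max_group_size1,
  needs_dimension_flip
};

// Each backend answers property queries for itself, so the headers do not
// have to assume anything about a particular backend.
class hardware_context {
public:
  virtual ~hardware_context() = default;
  virtual std::size_t get_property(device_uint_property prop) const = 0;
};

// Example backend that flips dimensions (made-up limits).
class flipping_backend final : public hardware_context {
public:
  std::size_t get_property(device_uint_property prop) const override {
    switch (prop) {
    case device_uint_property::max_group_size0: return 1024;
    case device_uint_property::max_group_size1: return 512;
    case device_uint_property::needs_dimension_flip: return 1;
    }
    return 0;
  }
};

// Example backend that does not flip dimensions (made-up limits).
class non_flipping_backend final : public hardware_context {
public:
  std::size_t get_property(device_uint_property prop) const override {
    switch (prop) {
    case device_uint_property::max_group_size0: return 256;
    case device_uint_property::max_group_size1: return 128;
    case device_uint_property::needs_dimension_flip: return 0;
    }
    return 0;
  }
};

// Header-side query: read the raw limits, then swap them only if the
// backend says a flip is needed.
inline std::array<std::size_t, 2>
max_work_item_sizes_2d(const hardware_context &hw) {
  const std::size_t size0 =
      hw.get_property(device_uint_property::max_group_size0);
  const std::size_t size1 =
      hw.get_property(device_uint_property::max_group_size1);
  const bool flip =
      hw.get_property(device_uint_property::needs_dimension_flip) != 0;
  if (flip)
    return {size1, size0};
  return {size0, size1};
}

} // namespace sketch
```

In this sketch each backend still returns a literal for the flip property, but the knowledge lives with the backend rather than in the SYCL headers, so a new backend only has to answer the query correctly for itself.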

illuhad · 2023-03-28T00:33:50Z

include/hipSYCL/sycl/device.hpp

+  return id<2>{size0, size1};
+}
+
+HIPSYCL_SPECIALIZE_GET_INFO(device, max_work_item_sizes<3>)


illuhad · 2023-04-05T06:36:06Z

include/hipSYCL/runtime/hardware.hpp

@@ -60,6 +60,7 @@ enum class device_uint_property {
  max_global_size0,
  max_global_size1,
  max_global_size2,
+  max_group_sizes_need_flip,


Is this specific to max_group_sizes? It also flips the work item indices passed into the kernel, global ranges etc. So maybe just flips_dimensions or needs_dimension_flip or something like that?

True, I've opted for needs_dimension_flip.
So I expect there are other places, e.g. kernel launches, that need tweaking. Can you point me to those?

Not really. The relevant code would be the kernel launchers (inside glue), but those are already backend-specific and know the right behavior for their backend.
The one exception is the SSCP kernel launcher, but that one currently only supports backends that do the flip, and so it just flips. Even then, I'm not sure it would be possible to have the SSCP compilation flow flip only in some cases, since the assignment of builtin queries to indices would need to change, and that is handled inside the SYCL headers, so outside the control of the SSCP kernel launcher.

Okay, so do you still think it's worth having this property at all?
I guess it makes sense if you can think of a backend out there that doesn't do the flip and whose limits aren't the same for every dimension.

Otherwise, I'm happy if you are with this PR. :-)

illuhad · 2023-04-25T22:30:45Z

include/hipSYCL/sycl/info/param_traits.hpp

 struct param_traits {};

 #define HIPSYCL_PARAM_TRAIT_RETURN_VALUE(T, param, ret_value) \
  template<> \
-  struct param_traits<T, param> \
+  struct param_traits<param> \


I've just looked at the spec again to check what it says on param_traits - the only mention I find is this:

Information query descriptors have been changed to structures under namespaces named accordingly. param_traits has been removed and the return type of an information query is now contained in the descriptor. [...]

from here: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:what-changed-between

However, I can't find any mention of how exactly the return type should be exposed in the struct. Arguably, that one quote might not even be normative since it just describes the changes as a changelog.

Are you aware of any place where it says more?
I'm not sure it makes sense to do the required refactoring if the spec does not even clearly say anything. What do you think?

Right... I didn't think this was standardised. Well, looking at what our friends in blue did:

https://github.com/intel/llvm/blob/f81499ecad464bd03d7f50e95004eeed1e85dbc9/sycl/include/sycl/info/info_desc.hpp#L26-L29

We should probably do something similar then... I'm thinking of removing param_traits.hpp and creating two macros in info.hpp, one for the plain case and the other for the templated case, just like their __SYCL_PARAM_TRAITS_SPEC and __SYCL_PARAM_TRAITS_TEMPLATE_SPEC. Help me choose some names:
HIPSYCL_INFO_DESCRIPTOR(param, ret_value) and HIPSYCL_INFO_DESCRIPTOR_TEMPLATE(...)?

Well, I'd say that arguably it's not standardized. That one sentence is only a changelog. So, we could adopt the point of view that this is not required. However, it is possible that, if this is discussed inside the SYCL WG, it turns out to be just an oversight, and we have to implement the type inside the descriptor classes.

If you want to make the transition, I agree with the approach from DPC++.

Do we actually need two macros?

#define HIPSYCL_DECLARE_INFO_DESCRIPTOR(param, ret_type) \
  struct param { using return_type = ret_type; };

template<int Dim>
HIPSYCL_DECLARE_INFO_DESCRIPTOR(max_group_sizes, range<Dim>)

?

True. Solid.

I'd just rather argue this is a definition, and so HIPSYCL_DEFINE_INFO_DESCRIPTOR().

You're totally right. Define it is :-)
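
A self-contained sketch of the single-macro idea agreed on above, to show that one macro can serve both the plain and the templated case. The `SKETCH_` macro name, the stand-in `range` type and the namespaces are assumptions made for this example, not the code that was merged.

```cpp
#include <cstddef>
#include <type_traits>

// Stand-in for sycl::range so the example compiles without a SYCL header.
template <int Dim> struct range {
  std::size_t values[Dim];
};

// One macro for every descriptor: it defines a struct whose nested
// return_type alias carries the type of the query result.
#define SKETCH_DEFINE_INFO_DESCRIPTOR(param, ret_type)                         \
  struct param {                                                               \
    using return_type = ret_type;                                              \
  };

namespace info {
namespace device {

// Plain descriptor.
SKETCH_DEFINE_INFO_DESCRIPTOR(max_work_item_dimensions, unsigned int)

// Templated descriptor: the template header is written in front of the
// macro invocation, so no second macro is needed.
template <int Dimensions = 3>
SKETCH_DEFINE_INFO_DESCRIPTOR(max_work_item_sizes, range<Dimensions>)

} // namespace device
} // namespace info

static_assert(
    std::is_same<info::device::max_work_item_dimensions::return_type,
                 unsigned int>::value,
    "plain descriptor exposes its return type");
static_assert(
    std::is_same<info::device::max_work_item_sizes<2>::return_type,
                 range<2>>::value,
    "templated descriptor exposes a dimension-dependent return type");
```

One caveat of this style is that a return type containing a top-level comma would be split into several macro arguments, so such types would need an alias first.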

Are you aware of any place where it says more? I'm not sure it makes sense to do the required refactoring if the spec does not even clearly say anything. What do you think?

I found this for get_info() from https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#table.members.device:

"Queries this SYCL device for information requested by the template parameter Param. The type alias Param::return_type must be defined in accordance with the info parameters in Table 25 to facilitate returning the type associated with the Param parameter."

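The requirement quoted above can be pictured with a small sketch in which `get_info` takes its return type from `Param::return_type`. The `sketch` namespace, the two descriptors and the values returned are hypothetical; a real implementation would forward to the runtime instead of returning constants.

```cpp
#include <string>

namespace sketch {

namespace info {
namespace device {
// Descriptors carry their own return type, as the spec text requires.
struct name {
  using return_type = std::string;
};
struct max_compute_units {
  using return_type = unsigned int;
};
} // namespace device
} // namespace info

class device {
public:
  // The return type is looked up in the descriptor, so no separate
  // param_traits mapping is needed.
  template <class Param> typename Param::return_type get_info() const;
};

// One explicit specialization per supported descriptor (placeholder values).
template <>
inline std::string device::get_info<info::device::name>() const {
  return "sketch device";
}

template <>
inline unsigned int
device::get_info<info::device::max_compute_units>() const {
  return 4;
}

} // namespace sketch
```

A call such as `sketch::device{}.get_info<sketch::info::device::name>()` then returns a `std::string` without the caller naming the type anywhere.
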
In hindsight, I'm super happy we decided to do the refactoring.
Same functionality (or just slightly more) in 60% of the lines of code. 🎉

Yep, looks much better :) Thanks for all this work :)

illuhad · 2023-05-03T13:41:36Z

include/hipSYCL/sycl/info/device.hpp

+  HIPSYCL_DEFINE_INFO_DESCRIPTOR(max_work_item_dimensions, detail::u_int);
+
+  template<int Dimensions = 3>
+  HIPSYCL_DEFINE_INFO_DESCRIPTOR(max_work_item_sizes, id<Dimensions>);


It should be range<Dim>, not id. Looks like this is a preexisting bug, so we could also fix this in a separate PR if you prefer. See https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#_device_information_descriptors

illuhad

Thanks Nuno!

illuhad reviewed Mar 28, 2023

nmnobre force-pushed the templatez branch 2 times, most recently from 4b8096f to f8f24d5 Compare April 3, 2023 15:30

illuhad reviewed Apr 5, 2023

nmnobre force-pushed the templatez branch from f8f24d5 to f72ad63 Compare April 5, 2023 11:25

nmnobre requested a review from illuhad April 6, 2023 09:05

illuhad reviewed Apr 25, 2023

nmnobre force-pushed the templatez branch 2 times, most recently from edf6a72 to cc0e979 Compare April 27, 2023 16:09

nmnobre requested a review from illuhad April 27, 2023 19:02

nmnobre force-pushed the templatez branch 3 times, most recently from 8db9d4b to a3151fa Compare May 2, 2023 21:32

illuhad reviewed May 3, 2023

nmnobre added 10 commits May 3, 2023 14:52

Migrate context information descriptors interface

add6417

Migrate device information descriptors interface

b71aba5

Migrate event information descriptors interface

b14f779

Migrate kernel information descriptors interface

462fcf8

Migrate platform information descriptors interface

aa9a95f

Migrate program information descriptors interface

c6b428a

Migrate queue information descriptors interface

b2785f4

Migrate generic information descriptors helper templates

baca3ca

Flip max_work_item_sizes queries for select backends

f53d8ae

Use the new macro to define the device-specific kernel info descriptors

25678f8

nmnobre force-pushed the templatez branch 2 times, most recently from 18af41e to 8b1e73c Compare May 3, 2023 14:00

Fix return type for max_work_item_sizes

3d23d09

nmnobre force-pushed the templatez branch from 8b1e73c to 3d23d09 Compare May 3, 2023 14:07

illuhad approved these changes May 3, 2023

illuhad merged commit d2bd9fc into AdaptiveCpp:develop May 3, 2023
17 checks passed

fknorr mentioned this pull request Aug 3, 2023

Remove old workarounds; Fix build with current hipSYCL celerity/celerity-runtime#200

Merged
