Clarify the builtin functions (again) #440

Merged
merged 10 commits into SYCL-2020/master from gmlueck/gentype-funcs-v2
Sep 7, 2023

gmlueck
Contributor

@gmlueck gmlueck commented Jul 21, 2023

After gaining some implementation experience, we decided to change the
clarification of the builtin functions. Our previous attempt in #428
added many overloads, and we discovered that this increased the time it
takes to compile the <sycl/sycl.hpp> header. After further reflection,
we decided that a different mix of overloads vs. templates was more
consistent with C++ and it also results in better compile times.

In general, we follow C++ whenever there is a precedent. For example,
the math functions like fmax and isinf have separate overloads for
scalar inputs because C++ defines these same functions in that way.
However, we use template parameters for the variants that take marray
or vec because there is no C++ precedent here. Other functions
like clamp have a C++ precedent where scalar arguments are template
parameters, so SYCL follows suit here also.
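
The split described above can be sketched in plain C++, without a SYCL implementation. Everything in the `toy` namespace is a hypothetical stand-in (including the simplified `marray` and the NaN-ignoring `fmax`), not the specification's wording or any vendor's implementation:

```cpp
#include <cstddef>
#include <type_traits>

namespace toy {

// Hypothetical stand-in for sycl::marray (NOT the real class).
template <typename T, std::size_t N>
struct marray {
  using value_type = T;
  T data[N];
  static constexpr std::size_t size() { return N; }
};

// Scalar inputs: one overload per type, mirroring how C++ declares std::fmax.
// NaN handling is omitted for brevity.
constexpr float fmax(float x, float y) { return x < y ? y : x; }
constexpr double fmax(double x, double y) { return x < y ? y : x; }

// Trait that recognises the stand-in marray.
template <typename T> struct is_marray : std::false_type {};
template <typename T, std::size_t N>
struct is_marray<marray<T, N>> : std::true_type {};

// Non-scalar inputs: a single constrained template, applied element-wise.
template <typename NonScalar,
          typename = typename std::enable_if<is_marray<NonScalar>::value>::type>
constexpr NonScalar fmax(NonScalar x, NonScalar y) {
  for (std::size_t i = 0; i < NonScalar::size(); ++i)
    x.data[i] = fmax(x.data[i], y.data[i]);  // resolves to a scalar overload
  return x;
}

}  // namespace toy
```

With this shape, a call such as `toy::fmax(1.0f, 2.0f)` picks the `float` overload, while a call with two `toy::marray<float, 2>` arguments picks the template.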

Of course, there are many cases where there is no C++ precedent. In
these cases, SYCL tries to be self-consistent. For example, the
relational functions like isequal have separate overloads for scalar
inputs to be consistent with isinf and the other math functions.

We also felt that the "generic type names" were adding confusion to
the spec. Therefore, this commit removes all these generic pseudo-type
definitions and shows the exact synopsis for each builtin function.
This was hard to do with the old table format, so the tables defining
the builtin functions have been completely rewritten to match the
style used by the newer builtin functions (e.g. the group functions).
This style leaves more horizontal space for the synopsis, which makes
it easier to write the synopsis without awkward line breaks. The
"Constraints" section of this style also makes it easier to describe
the constraints for the various template parameters.

Some other changes of note:

  • This PR also adds the member type alias value_type to vec, making
    it more consistent with marray. This makes it much easier to define
    the templated versions of the builtin functions because we can rely on
    this type alias for both vec and marray.

  • The Common Functions were inconsistently specified for the half
    type. We had defined some half overloads, but we missed others.
    This commit adds half support consistently to these functions.

  • The introductory paragraphs have been modified to remove general
    statements about the types that are supported by the functions.
    These statements were sometimes in conflict with the actual function
    definitions, and it seemed better to avoid redundancy. Each function
    clearly specifies the allowed input types, so we do not need to say
    this again in the introduction.

  • The "native precision" and "half precision" math functions have been
    split out into their own sub-sections. It was easy to miss these
    functions before because they came after a long table of regular math
    functions. It's now much easier to see that they exist.

  • All uses of "latexmath" have been removed from the descriptions of
    the builtin functions and replaced with Asciidoc formatting. This
    results in a more consistent style, and it also fixed several
    problems where the latexmath version was not rendered correctly in
    the final PDF and/or HTML documents.
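
As a rough illustration of the `value_type` item in the list above, the sketch below shows how one template can serve both container types once each exposes the alias. The `toy::vec` and `toy::marray` classes and the `add_scalar` helper are hypothetical stand-ins invented for this example, not SYCL definitions:

```cpp
#include <cstddef>

namespace toy {

// Hypothetical stand-ins for sycl::vec and sycl::marray (NOT the real classes).
template <typename T, std::size_t N>
struct vec {
  using value_type = T;  // the alias this PR adds to vec
  T data[N];
  static constexpr std::size_t size() { return N; }
};

template <typename T, std::size_t N>
struct marray {
  using value_type = T;  // marray already provides this alias
  T data[N];
  static constexpr std::size_t size() { return N; }
};

// Hypothetical helper (not a SYCL builtin): adds a scalar to every element.
// The scalar parameter is spelled through the alias, so the same template
// works for vec and marray alike.
template <typename NonScalar>
constexpr NonScalar add_scalar(NonScalar x, typename NonScalar::value_type y) {
  for (std::size_t i = 0; i < NonScalar::size(); ++i) x.data[i] += y;
  return x;
}

}  // namespace toy
```

Because `typename NonScalar::value_type` is a non-deduced context, `NonScalar` is deduced from the first argument only, and the scalar argument may be implicitly converted (for example, passing `1` where `float` is expected).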

Note that this PR changes the section numbering for many of the
sections that define builtin functions. This is because the
section defining the "generic" pseudo-types no longer exists and
because there are new sections for the native-precision and half-
precision math functions.

Closes internal issue 278 (again).
Closes #321

Add a new type alias `value_type` to `vec`, making it more consistent
with `marray`.  This makes it much easier to define the templated
versions of the builtin functions because we can rely on this type
alias for both `vec` and `marray`.

Remove the old tables defining the builtin functions.  We no longer
need the section defining the generic pseudo-type names, so remove it
too.

Note that this changes the section numbering for all the builtin
functions!

Since we removed the generic pseudo-types, we can also remove the
rouge formatting support for them.
steffenlarsen added a commit to steffenlarsen/llvm that referenced this pull request Jul 25, 2023
Currently the host-side implementation of sycl::nextafter with
sycl::half uses the float variant of std::nextafter. However, due to the
conversion between half and float, the result may be unexpected.
Likewise, KhronosGroup/SYCL-Docs#440 removes the
reference to single-precision floating point results.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
steffenlarsen added a commit with the same message to intel/llvm that referenced this pull request Jul 26, 2023
Restore some wording in `nextafter` to be the same as it was before
this PR.  That wording is incorrect, but I want to break this fix out
into a separate PR to highlight the change.
@nliber
Collaborator

nliber commented Jul 28, 2023

We don't define the Words of Power (Constraints, Mandates, etc.). We should be using the definitions at [structure.specifications#3]. That way there is no question on it being enforced via SFINAE/requires (Constraints) vs. static_assert (Mandates).

Slide 6: As I mentioned, I prefer getting rid of the obvious typename in the function declaration. I think it's optional as of C++20 (but I'm not a core wording expert).

As I learned in LWG on Wednesday, element_type is used when modifiers like const can be there, while value_type is used in other situations. (IDK how consistent the C++ standard is with that, though.)

Slide 10: The difference between templated functions and overloads is that overloads automatically allow implicit conversions. If we had

struct MyInt {
    operator int() const { /* ... */ }
    /* ... */
};

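A MyInt argument wouldn't work with a templated function, because template argument deduction does not consider implicit conversions. (I'm just pointing this out. I'm not sure which way we should go.)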
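
The difference can be demonstrated with a small, self-contained sketch. The two `abs` functions below are hypothetical illustrations living in made-up namespaces, and the body given to `MyInt` is an assumption added only so the example compiles:

```cpp
#include <type_traits>
#include <utility>

// The user-defined type from the comment above, with an illustrative body.
struct MyInt {
  int value;
  constexpr operator int() const { return value; }
};

namespace with_overload {
// An ordinary overload: the argument may be implicitly converted to int.
constexpr int abs(int x) { return x < 0 ? -x : x; }
}  // namespace with_overload

namespace with_template {
// A template constrained to integral types: T is deduced exactly from the
// argument, so MyInt is deduced as MyInt and then rejected by the constraint.
template <typename T,
          typename = typename std::enable_if<std::is_integral<T>::value>::type>
constexpr T abs(T x) { return x < 0 ? static_cast<T>(-x) : x; }
}  // namespace with_template

// Detects whether with_template::abs can be called with an argument of type T.
template <typename T, typename = void>
struct template_abs_accepts : std::false_type {};
template <typename T>
struct template_abs_accepts<
    T, decltype(void(with_template::abs(std::declval<T>())))> : std::true_type {};
```

Here `with_overload::abs(MyInt{-3})` compiles and yields 3, whereas `template_abs_accepts<MyInt>::value` is `false`: the templated form has no viable candidate for `MyInt`.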

(I'm out of time for today...)

Address review feedback.  Remove uses of `typename` that are obvious to
human readers.
@gmlueck
Contributor Author

gmlueck commented Aug 2, 2023

Thanks @nliber for the review:

We don't define the Words of Power (Constraints, Mandates, etc.). We should be using the definitions at [structure.specifications#3]. That way there is no question on it being enforced via SFINAE/requires (Constraints) vs. static_assert (Mandates).

I agree that it would be good to define these terms, but we should do it in a separate PR. The SYCL spec used these structural words even before my PR, so this is not something new that I added here. I opened internal issue 664 to capture this, and I'm happy to draft something once the WG agrees to a direction.

Slide 6: As I mentioned, I prefer getting rid of the obvious typename in the function declaration. I think it's optional as of C++20 (but I'm not a core wording expert).

I did this in 817fe7d.

As I learned in LWG on Wednesday, element_type is used when modifiers like const can be there, while value_type is used in other situations. (IDK how consistent the C++ standard is with that, though.)

Given that definition, I think value_type is the correct name for both vec and marray because neither adds any qualifiers to the template parameter. This is in line with the change I made in this PR (b328345).

@gmlueck
Contributor Author

gmlueck commented Aug 2, 2023

Slide 10: The difference between templated functions and overloads is that overloads automatically allow implicit conversions.
[...]
(I'm just pointing this out. I'm not sure which way we should go.)

This is definitely an interesting case, which is why I wanted to highlight it in my slides.

In some sense, this is a tradeoff between consistency with C++ vs. consistency with the other SYCL integer functions. I think there are a few reasonable options:

  1. Choose consistency with SYCL, which is what I have in this PR now. An outcome of this is that a user-defined type that is convertible to int, long, or long long cannot be passed to sycl::abs, which is different from C++. However, it is consistent with the other SYCL integer functions.

  2. Choose consistency with C++ and only define the three scalar overloads, plus a templated one for non-scalars:

    int abs(int x);
    long abs(long x);
    long long abs(long long x);
    
    // Available only when NonScalar is marray, vec, or __swizzled_vec__
    template<typename NonScalar>
    /*return-type*/ abs(NonScalar x);
    

    An outcome of this is that unsigned scalar types cannot be passed to sycl::abs. This is consistent with C++, but it's an API break compared to previous SYCL versions.

  3. Choose a hybrid approach where we define the same scalar overloads as C++ and also define a templated function for the remaining scalar types (and for the non-scalar types):

    int abs(int x);
    long abs(long x);
    long long abs(long long x);
    
    // Available only when GenInt is any integral type, marray, vec, or __swizzled_vec__
    template<typename GenInt>
    /*return-type*/ abs(GenInt x);
    

    The compiler will choose one of the first three overloads when the input is int, long, or long long. It chooses the template for any other integer type. It also chooses one of the first three overloads for a user-defined type that is convertible to int, long, or long long.

    This allows user-defined types that are convertible to int, long, or long long, but it still has some differences compared to C++. For example, sycl::abs(static_cast<short>(1)) returns short whereas C++ std::abs returns int (because the short value is promoted to int).

I have a weak preference for 1 over 3, and I have a stronger dislike of 2 because of the API break relative to previous SYCL releases.

Whatever we choose, I think we should do the same thing for abs_diff in order to keep these two functions consistent with each other.
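
The overload-resolution behavior claimed for option 3 can be checked with a plain C++ sketch. The `option3` namespace is hypothetical, the template is restricted to scalar integral types for brevity (the proposal also covers marray, vec, and swizzles), and `MyInt` is the user-defined type from the earlier review comment with an assumed body:

```cpp
#include <cstdlib>
#include <type_traits>

// User-defined type convertible to int (illustrative body).
struct MyInt {
  int value;
  constexpr operator int() const { return value; }
};

namespace option3 {

// The same three scalar overloads that C++ declares for std::abs.
constexpr int abs(int x) { return x < 0 ? -x : x; }
constexpr long abs(long x) { return x < 0 ? -x : x; }
constexpr long long abs(long long x) { return x < 0 ? -x : x; }

// Stand-in for the "GenInt" template, limited here to scalar integral types.
template <typename GenInt,
          typename = typename std::enable_if<std::is_integral<GenInt>::value>::type>
constexpr GenInt abs(GenInt x) { return x < 0 ? static_cast<GenInt>(-x) : x; }

}  // namespace option3

// short: the template is an exact match, so the result stays short,
// whereas std::abs promotes the argument and returns int.
static_assert(
    std::is_same<decltype(option3::abs(static_cast<short>(-1))), short>::value,
    "option 3 keeps short");
static_assert(
    std::is_same<decltype(std::abs(static_cast<short>(-1))), int>::value,
    "std::abs promotes to int");

// MyInt: the template is not viable, and the int overload wins because it
// needs no further conversion after operator int().
static_assert(std::is_same<decltype(option3::abs(MyInt{-7})), int>::value,
              "MyInt selects the int overload");
```

For an `int`, `long`, or `long long` argument, both a non-template overload and the template match exactly, and the non-template overload is preferred.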

@keryell
Member

keryell commented Aug 10, 2023

I have a preference for solution 3, since I imagine that an ML engineer writing generic SYCL code for a ReLU and using it on a std::int8_t might not want a 32-bit output after such hard work.

@gmlueck
Contributor Author

gmlueck commented Aug 22, 2023

I have a preference for solution 3, since I imagine that an ML engineer writing generic SYCL code for a ReLU and using it on a std::int8_t might not want a 32-bit output after such hard work.

Both option 1 and option 3 behave this way. Do you have a reason to prefer option 3 over option 1? Option 1 is what I have in this PR now. I have a weak preference for this option mainly because it retains consistency with all the other SYCL integer functions. As a reminder, option 1 defines sycl::abs like this:

template<typename GenInt>
/*return-type*/ abs(GenInt x)

Constraints: Available only if GenInt is a generic integer type as defined above.

Returns: When the input is a scalar, returns |x|. Otherwise, returns |x[i]| for each element of x. The behavior is undefined if the result cannot be represented by the return type.

The return type is GenInt unless GenInt is the __swizzled_vec__ type, in which case the return type is the corresponding vec.

Collaborator

@AerialMantis AerialMantis left a comment

First of all, thank you for putting this together, this is a lot of effort and I think it will really improve the clarity of the specification. Apologies it took me so long to review this, I've been reviewing it bit by bit over the past couple of weeks.

I think overall this looks great, I much prefer the horizontal wording format.

My main suggestion would be, if possible, to avoid defining the return type of the builtins as /*return-type*/ and instead use a type trait. I think this makes it clearer what type is returned and avoids having to describe it in the wording. One issue with this is inferring the type information of a __swizzled_vec__ argument, but I think with a few type traits this could also work.
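
One way such a trait could look is sketched below. All names are hypothetical: `builtin_return_t` is invented for this example, and the stand-in `swizzled_vec` is given template parameters purely as an assumption so the trait can be written, whereas the spec's `__swizzled_vec__` is exposition-only:

```cpp
#include <cstddef>
#include <type_traits>

namespace toy {

// Hypothetical stand-ins (NOT the real SYCL types).
template <typename T, std::size_t N>
struct vec {
  using value_type = T;
  T data[N];
  static constexpr std::size_t size() { return N; }
};

// Assumption: the swizzle is modelled as a template with known T and N.
template <typename T, std::size_t N>
struct swizzled_vec {
  using value_type = T;
  T data[N];
  static constexpr std::size_t size() { return N; }
};

// Hypothetical trait naming the return type of a one-argument builtin:
// the argument type itself, except that a swizzle maps to the matching vec.
template <typename T>
struct builtin_return { using type = T; };
template <typename T, std::size_t N>
struct builtin_return<swizzled_vec<T, N>> { using type = vec<T, N>; };
template <typename T>
using builtin_return_t = typename builtin_return<T>::type;

// Element-wise abs for the non-scalar stand-ins only, with the return type
// spelled through the trait instead of a /*return-type*/ comment.
template <typename NonScalar>
constexpr builtin_return_t<NonScalar> abs(NonScalar x) {
  builtin_return_t<NonScalar> r{};
  for (std::size_t i = 0; i < NonScalar::size(); ++i)
    r.data[i] = x.data[i] < 0 ? -x.data[i] : x.data[i];
  return r;
}

}  // namespace toy
```

With these definitions, `toy::abs` applied to a `toy::swizzled_vec<int, 2>` has the declared return type `toy::vec<int, 2>`, which is the mapping the prose in the PR describes in words.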

I also noticed that there is some wording that is repeated a lot for common function patterns but I'm wary of suggesting a simplification as I feel it would sacrifice readability.

On the question of overloads vs templates, I think I would lean towards option 1, as option 2 removes functionality and option 3 has an inconsistency which I worry would be confusing. Though I think there's a fourth option: have scalar overloads for signed and unsigned integer types, and keep the template function for marray/vec. This way user-defined conversions work, it would align with C++ for the functions that it defines and follow the same principle for the ones it doesn't, all scalar overloads would be consistent, and this would also be consistent with the floating point builtins.

@gmlueck
Contributor Author

gmlueck commented Aug 23, 2023

Hi @AerialMantis,

Thanks for reviewing this PR. I'd like to dig into this comment of yours because I'm not sure what you are proposing:

On the question of overloads vs templates, I think I would lean towards option 1, as option 2 removes functionality and option 3 has an inconsistency which I worry would be confusing. Though I think there's a fourth option: have scalar overloads for signed and unsigned integer types, and keep the template function for marray/vec. This way user-defined conversions work, it would align with C++ for the functions that it defines and follow the same principle for the ones it doesn't, all scalar overloads would be consistent, and this would also be consistent with the floating point builtins.

I assume you are proposing a 4th option for the definition of abs and abs_diff (slide 10 of my presentation), correct? Are you suggesting the following synopsis?

int abs(int x);
long abs(long x);
long long abs(long long x);
unsigned int abs(unsigned int x);
unsigned long abs(unsigned long x);
unsigned long long abs(unsigned long long x);

// Available only when GenInt is any integral type, marray, vec, or __swizzled_vec__
template<typename GenInt>
/*return-type*/ abs(GenInt x);

(I.e. the same as option 3, but adding overloads for unsigned int, unsigned long, and unsigned long long.)

Or, are you suggesting adding overloads for all scalar types like:

char abs(char x);
signed char abs(signed char x);
short abs(short x);
int abs(int x);
long abs(long x);
long long abs(long long x);
unsigned char abs(unsigned char x);
unsigned short abs(unsigned short x);
unsigned int abs(unsigned int x);
unsigned long abs(unsigned long x);
unsigned long long abs(unsigned long long x);

// Available only when NonScalar is marray, vec, or __swizzled_vec__
template<typename NonScalar>
/*return-type*/ abs(NonScalar x);

Either way, doesn't this have the same inconsistency that you point out for option 3? The other "integer" functions in SYCL do not have all these overloads. For example, clamp is defined like:

template<typename GenInt1, typename GenInt2, typename GenInt3>
/*return-type*/ clamp(GenInt1 x, GenInt2 minval, GenInt3 maxval)

template<typename NonScalar>
/*return-type*/ clamp(NonScalar x, NonScalar::value_type minval, NonScalar::value_type maxval)

I considered adding overloads for all 11 scalar integer types for all the "integer" functions in SYCL, however this would make SYCL inconsistent with C++. C++ defines std::clamp purely as a template (similar to what I show above for SYCL), with no separate overloads for scalar types. I thought it was important to retain consistency with C++ when possible, which is why I decided not to add these 11 scalar integer overloads to all the "integer" functions.
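
A simplified sketch of this template-only shape follows. It is an illustration under stated assumptions: `toy::vec` is a hypothetical stand-in for `sycl::vec`, and the scalar form uses a single template parameter in the manner of `std::clamp` rather than the three-parameter synopsis quoted above:

```cpp
#include <cstddef>

namespace toy {

// Hypothetical stand-in for sycl::vec (NOT the real class).
template <typename T, std::size_t N>
struct vec {
  using value_type = T;
  T data[N];
  static constexpr std::size_t size() { return N; }
};

// Scalar form, shaped like std::clamp: a template, with no per-type overloads.
template <typename T>
constexpr T clamp(T x, T minval, T maxval) {
  return x < minval ? minval : (maxval < x ? maxval : x);
}

// Non-scalar form: the bounds are scalars of the element type.
// For a scalar argument this candidate drops out, because a scalar type
// has no nested value_type.
template <typename NonScalar>
constexpr NonScalar clamp(NonScalar x, typename NonScalar::value_type minval,
                          typename NonScalar::value_type maxval) {
  for (std::size_t i = 0; i < NonScalar::size(); ++i)
    x.data[i] = clamp(x.data[i], minval, maxval);  // scalar form
  return x;
}

}  // namespace toy
```

As with `std::clamp`, the scalar form requires all three arguments to deduce to the same type, so a call that mixes `int` and `long` arguments does not compile without an explicit cast.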

Address code review comment.  Remove this introductory text describing
the return value for the relational functions.  This is redundant now
because the function descriptions clearly describe the return value.

Address code review comment.  The `__swizzled_vec__` type does not have
any specified template parameters, so do not use it in a context that
requires template parameters.
@gmlueck
Contributor Author

gmlueck commented Aug 23, 2023

I had an AR to research precedents for the /*return-type*/ syntax used in this PR. I looked at both cppreference and the C++ specification. It looks like cppreference does use comments like this to represent things that cannot be written in C++ syntax. For example, see the definition of std::ignore in cppreference:

inline constexpr /*unspecified*/ ignore;

Here, cppreference uses the comment /*unspecified*/ in much the same way that I use /*return-type*/ in this PR.

The C++ specification, however, uses italic font. For example, see the definition of std::ignore in the C++ specification:

inline constexpr unspecified ignore;

Of the two, I actually prefer the comment style because I find that it stands out better visually. The italic font is more subtle, so it's easy to miss if you read quickly.

There is also a practical reason to prefer the comment style. The Asciidoctor toolchain makes it difficult to use italic font in source code blocks. I can get it to mostly work for the HTML render (with some minor degradation of the syntax highlighting). However, I cannot get it to work with the PDF render. If this were the only option, we could look into ways to work around or fix the PDF render bug. However, it seems easier to use the comment style, and I prefer the way that style looks anyway.

@nliber
Collaborator

nliber commented Aug 23, 2023

Searching through the standard for an example, we have in [memory.syn] the following declaration:

template<class Ptr>
constexpr auto to_address(const Ptr& p) noexcept;

and in [pointer.conversion], the verbiage:

Returns: pointer_traits<Ptr>::to_address(p) if that expression is well-formed,
otherwise to_address(p.operator->()).

We also could just use auto...

Member

@keryell keryell left a comment

I still need some time to dive into the discussions.

@@ -180,76 +180,6 @@ class Sycl < Cpp
wait_and_throw
)

# Generic types used in SYCL pseudo code descriptions like Gen,
# SGen, GenVec...
sycl_generic_types = %w(
Member

I suggest keeping this but empty, as this infrastructure could be used in the future instead of diving again into which Rouge feature we could use for a new kind of keywords in SYCL Next.
For example if we introduce some concept in SYCL Next.

Contributor Author

Good idea. I restored the rouge support in 904fb54.

Member

Just curious: where has this commit gone?

Contributor Author

Oh, thanks for noticing this! I forgot to push that commit to the branch corresponding to this PR. It should be there now.

The branches are a little complicated with all these stacked PRs.

@keryell
Member

keryell commented Aug 24, 2023

We also could just use auto...

If we were not stuck in C++17...

Correct a mistake in this PR.  The `any` and `all` functions (for `vec`
inputs) should return 1 for true, not -1.  This is consistent with the
SYCL spec before this PR, and it is consistent with OpenCL.
@keryell
Member

keryell commented Aug 24, 2023

We also could just use auto...

If we were not stuck in C++17...

@nliber rereading this I am unsure I understand which auto you are talking about.
Is this C++11 auto in

template<class Ptr>
constexpr auto to_address(const Ptr& p) noexcept;

or C++20 auto as a parameter type in

constexpr auto to_address(const auto& p) noexcept;

?
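
For readers comparing the two spellings, here is a minimal sketch using a hypothetical function `twice` (not part of SYCL or the standard library). The `auto` keyword in its type-deduction role dates from C++11, while a deduced return type on an ordinary function, as used below, became available in C++14:

```cpp
#include <type_traits>

// Deduced return type: the declaration says only "auto", and the compiler
// works out the type from the return statement.
template <class T>
constexpr auto twice(const T& x) { return x + x; }

// The C++20 abbreviated form declares an equivalent function template without
// the template header. It is left as a comment so the sketch builds as C++17:
//   constexpr auto twice(const auto& x) { return x + x; }

// The declaration does not state the return type; for a short argument the
// deduced type is int, because x + x undergoes integral promotion.
static_assert(std::is_same<decltype(twice(static_cast<short>(1))), int>::value,
              "return type is deduced, not declared");
```

This shows why a synopsis written with `auto` still needs accompanying prose or a "Returns" clause: the declaration alone does not tell the reader which type comes back.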

@keryell
Member

keryell commented Aug 24, 2023

At the end, I wonder whether a 5th option with just

template <typename Anything>
auto abs(Anything x);

with an implementation specialized for every type discussed here in the SYCL spec, plus any other type that is implicitly convertible to the types already handled in the SYCL spec, would not just work™.
This would solve the __swizzled_vec__ case as a side effect.
In the end, the current C++ overload mess is due to compatibility with C and computers from 1972.
I guess we are more interested in high efficiency for computers from 2023 and later?
So, use std::abs for compatibility, sycl::abs for efficiency.

@gmlueck
Contributor Author

gmlueck commented Aug 24, 2023

At the end, I wonder whether a 5th option with just

If this is where we want to go eventually, we should choose option 1 (i.e. the option already in this PR). Remember, this is a bug fix to SYCL 2020, so it's not the time to expand the scope of these functions. If we choose option 1 now, we can simply relax the constraints later, which gets us to your option 5.

@gmlueck
Contributor Author

gmlueck commented Aug 24, 2023

@nliber rereading this I am unsure I understand which auto you are talking about.
Is this C++11 auto in

I assume @nliber meant the C++11 auto when used as the return type of a function. I'm not opposed to using this syntax, but I wonder if it gives the impression that an implementation must define these functions as auto? I don't think that is our intent. An implementation could define them to return some type-trait, for example, as @AerialMantis suggested.

To be honest, my preference is still to use the /*return-type*/ notation, but I think this is a minor part of the overall PR, so I'm happy to use auto if others like that.

Rather than deleting the rouge support for the "gentype" pseudo-types
completely, retain the support but leave it as an empty placeholder
that we can use in the future.  For example, we might use this to add
syntax coloring for concept names (if we add any).
Member

@keryell keryell left a comment

Massive!

gmlueck added a commit to gmlueck/SYCL-Docs that referenced this pull request Sep 7, 2023
[This comment](KhronosGroup#440 (comment))
in KhronosGroup#440 noted that the number of elements in a `__swizzled_vec__` is
not clearly defined.  Clarify this and also clarify that the member
functions operate on the result of the swizzle operation.
@tomdeakin tomdeakin merged commit 96cd795 into SYCL-2020/master Sep 7, 2023
3 checks passed
mdtoguchi pushed a commit with the same message to mdtoguchi/llvm that referenced this pull request Oct 18, 2023
@gmlueck gmlueck deleted the gmlueck/gentype-funcs-v2 branch January 3, 2024 19:10
Successfully merging this pull request may close these issues.

Description of common functions
5 participants