Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify "gentype" function signatures #428

Merged
merged 7 commits into from
Jun 29, 2023

Conversation

gmlueck
Copy link
Contributor

@gmlueck gmlueck commented Jun 9, 2023

This commit clarifies the expected signatures of the functions defined
in section 4.17.5 "Math functions" - 4.17.9 "Relational functions".

  • Previously, it was not clear how much templating (if any) is present
    in these functions. We now clarify that these functions are mostly
    exposed as overloads, using template parameters only for the
    marray extents and for the multi_ptr parameters.

  • Previously, it was not clear how the "generic type names" should be
    interpreted when a function signature uses several generic type
    names like this:

    genfloat fmax(genfloat x, sgenfloat y)
    

    Did we intend fmax to have overloads for all possible combinations
    of types from genfloat and sgenfloat, or are only certain
    combinations expected? Cases like this have been clarified by
    rewriting the spec to avoid multiple generic type names in the same
    signature. For example:

    genfloatf fmax(genfloatf x, float y)
    genfloatd fmax(genfloatd x, double y)
    genfloath fmax(genfloath x, half y)
    

    To resolve cases that weren't obvious, I aligned with the OpenCL
    behavior, which makes sense because all of these functions were
    originally derived from OpenCL.

  • There are still two cases when a function could be defined with
    two different generic type names. These are resolved by adding
    the names unsignedtype and elementtype and specifying exactly
    how these generic types interact with the other generic types.

  • The upsample, mad24, and mul24 functions presented an
    interesting problem. These were defined in terms of the
    {i,u}genintegerNbit generic types, which lead to ambiguities in
    cases like this:

    igeninteger32bit upsample(igeninteger16bit hi, ugeninteger16bit lo)
    

    The igeninteger32bit type was defined to be all signed integer
    types that are 32 bits wide. It was therefore unclear whether
    this function should return int or long for an implementation
    where both types are 32-bits. This was resolved by defining these
    three functions exclusively in terms of the fixed-width integer
    types. In the example above, the return type is now int32_t
    (or a vec or marray of int32_t). This makes sense because all
    three of these functions assume input values which have a specific
    number of bits.

  • The overloads involving marray of integer types were also
    simplified. To give an example, we previously defined overloads for
    both mshort{n} and marray<short,{N}>. This may seem purely
    redundant, but mshort{n} is defined as marray<int16_t,{N}>. If
    we assume that each of the fixed-width integer types aliases to one
    of the fundamental integer types, there is no need to define
    overloads for both. We therefore define overloads only for the
    fundamental integer types. This will only make a difference for a
    bizarre implementation where a fixed-width integer type is a distinct
    type, different from all the fundamental types. We do not think that
    SYCL needs to define special overloads for this case, which is
    consistent with the C++ standard (which also does not define special
    library function overloads for the fixed-width integer types).

    Note that all the overloads involving vec of integer types are
    defined exclusively for the fixed-width integer types (and not the
    fundamental types). This was the case even before this commit. This
    makes sense if we think the main use for vec is for compatibility
    with OpenCL, where the sizes of the integer types are all fixed.

  • The structure of the table of "Generic type names" was also improved
    for clarity. Previously, there was a deep hierarchy of generic type
    names, where many generic types were defined in terms of other
    generic types. A reader needed to traverse down through the
    hierarchy in order to understand exactly which concrete types were
    included in each generic type. The table no longer has any
    hierarchy, so each generic type now lists the full set of concrete
    types that it represents, which makes it much easier to read.

  • The following statement was removed from section 4.17.1 "Description
    of generic type names" as part of my rewrite of the introduction of
    that section:

    All of the OpenCL built-in types are available in the namespace
    sycl.

    I think this statement is erroneous because we did not intend to
    export all of these types into the sycl namespace. In addition,
    this would be a very strange place for us to say that.

Closes internal issue 278.
Closes internal issue 590.

This commit clarifies the expected signatures of the functions defined
in section 4.17.5 "Math functions" - 4.17.9 "Relational functions".

* Previously, it was not clear how much templating (if any) is present
  in these functions.  We now clarify that these functions are mostly
  exposed as overloads, using template parameters only for the `vec`
  and `marray` extents and for the `multi_ptr` parameters.

* Previously, it was not clear how the "generic type names" should be
  interpreted when a function signature uses several generic type
  names like this:

  ```
  genfloat fmax(genfloat x, sgenfloat y)
  ```

  Did we intend `fmax` to have overloads for all possible combinations
  of types from `genfloat` and `sgenfloat`, or are only certain
  combinations expected?  Cases like this have been clarified by
  rewriting the spec to avoid multiple generic type names in the same
  signature.  For example:

  ```
  genfloatf fmax(genfloatf x, float y)
  genfloatd fmax(genfloatd x, double y)
  genfloath fmax(genfloath x, half y)
  ```

  To resolve cases that weren't obvious, I aligned with the OpenCL
  behavior, which makes sense because all of these functions were
  originally derived from OpenCL.

* There are still two cases when a function could be defined with
  two different generic type names.  These are resolved by adding
  the names `unsignedtype` and `elementtype` and specifying exactly
  how these generic types interact with the other generic types.

* The `upsample`, `mad24`, and `mul24` functions presented an
  interesting problem.  These were defined in terms of the
  `{i,u}genintegerNbit` generic types, which lead to ambiguities in
  cases like this:

  ```
  igeninteger32bit upsample(igeninteger16bit hi, ugeninteger16bit lo)
  ```

  The `igeninteger32bit` type was defined to be all signed integer
  types that are 32 bits wide.  It was therefore unclear whether
  this function should return `int` or `long` for an implementation
  where both types are 32-bits.  This was resolved by defining these
  three functions exclusively in terms of the fixed-width integer
  types.  In the example above, the return type is now `int32_t`
  (or a `vec` or `marray` of `int32_t`).  This makes sense because all
  three of these functions assume input values which have a specific
  number of bits.

* The overloads involving `marray` of integer types were also
  simplified.  To give an example, we previously defined overloads for
  both `mshort{n}` and `marray<short,{N}>`.  This may seem purely
  redundant, but `mshort{n}` is defined as `marray<int16_t,{N}>`.  If
  we assume that each of the fixed-width integer types aliases to one
  of the fundamental integer types, there is no need to define
  overloads for both.  We therefore define overloads only for the
  fundamental integer types.  This will only make a difference for a
  bizarre implementation where a fixed-width integer type is a distinct
  type, different from all the fundamental types.  We do not think that
  SYCL needs to define special overloads for this case, which is
  consistent with the C++ standard (which also does not define special
  library function overloads for the fixed-width integer types).

  Note that all the overloads involving `vec` of integer types are
  defined exclusively for the fixed-width integer types (and not the
  fundamental types).  This was the case even before this commit.  This
  makes sense if we think the main use for `vec` is for compatibility
  with OpenCL, where the sizes of the integer types are all fixed.

* The structure of the table of "Generic type names" was also improved
  for clarity.  Previously, there was a deep hierarchy of generic type
  names, where many generic types were defined in terms of other
  generic types.  A reader needed to traverse down through the
  hierarchy in order to understand exactly which concrete types were
  included in each generic type.  The table no longer has any
  hierarchy, so each generic type now lists the full set of concrete
  types that it represents, which makes it much easier to read.

* The following statement was removed from section 4.17.1 "Description
  of generic type names" as part of my rewrite of the introduction of
  that section:

  > All of the OpenCL built-in types are available in the namespace
  > `sycl`.

  I think this statement is erroneous because we did not intend to
  export all of these types into the `sycl` namespace.  In addition,
  this would be a very strange place for us to say that.

Closes internal issue 278.
Closes internal issue 590.
Comment on lines +19495 to +19498
* Unless specified otherwise, the template parameter [code]#Space# is
constrained to the values [code]#access::address_space::global_space#,
[code]#access::address_space::local_space#, or
[code]#access::address_space::private_space#.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this again this morning... Why aren't these functions available for generic pointers? Pointers are only ever used for return types, so it seems odd to limit the available address spaces here.

It would make sense to keep a note that Space cannot be constant_space, but we could remove that in SYCL-Next (since constant_space is deprecated).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this too, and I was thinking of creating an issue for this. The spec had the same limitation before my PR, and I didn't want to expand the scope of this PR to include new additions. In addition to adding generic_space, I think we should consider adding new overloads that take raw pointers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have addressed this in #432 which is rebased on this PR.

There is no need to constrain the functions taking `vec` and `marray`
to legal values for the number of elements.  The `vec` and `marray`
types themselves should have `static_assert` to guarantee that the
number of elements is legal.
----
float fmax(float x, float y)
double fmax(double x, double y)
template<int N> vec<float,N> fmax(vec<float,N> x, vec<float,N> y)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a benefit to keep the vec variants as templates? Since the set of valid values for {n} is so limited we could just have overloads for each one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is possible. I did it this way because I think this is what we agreed on when we discussed internal issue 278.

One advantage to implementing the vec versions as many overloads is that it allows implicit conversion from __swizzled_vec__. For example:

vec<float, 3> v{1.0f, 2.0f, 3.0f};
fmax(v.swizzle<2, 0, 1>(), v.swizzle<0, 1, 2>());

The calls to v.swizzle above return the unnamed type __swizzled_vec__ which is convertible to vec<float, 3>. If we implement fmax as a template, I believe the code above won't compile because template resolution doesn't consider conversions.

If we implement fmax as separate overloads for all of the {n} possibilities, then I think the code above does compile because overload resolution does consider conversions.

There is no similar issue with marray because there is no type like __swizzled_vec__ that is convertible to an marray. In any event, we cannot do multiple overloads for marray because there aren't a fixed number of {N} values.

@illuhad how does hipSYCL deal with the vec variants of these math functions? Is the {n} a template parameter or are there many different overloads for all the {n} possibilities?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An implementation could choose to use only generic code and emulate all the overload behavior by meta-programming.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this would be visible to applications, though, so this could cause portability problems. For example, if the spec says the synopsis is:

template<int N> fmax(vec<N> x, vec<N> y)

Then, application code might do something like this:

template<int N>
void foo() {
  sycl::vec<N> x;
  sycl::vec<N> y;
  fmax<N>(x, y);
}

This won't compile unless the implementation actually implements fmax with a template parameter of type int.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec should just say 1 line to rule them all:

template <typename X, typename Y> fmax(X x, Y y) -> return_t<X, Y>;

and just explain the conversion rule implemented under the hood and when it is valid, in a modern intensive generic way instead of in an old extensive overloaded way,
Having a special template for int N like

template<int N> vec<float,N> fmax(vec<float,N> x, vec<float,N> y)

looks like syntactic noise because it allows to write the cryptic

fmax<3>(1.f, 42.f) // returns a vfloat3 obviously o_o

instead of just

fmax(vfloat3 { 1.f }, 42.f) // returns a vfloat3 more naturally

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like syntactic noise because it allows to write the cryptic

fmax<3>(1.f, 42.f) // returns a vfloat3 obviously o_o

I think this is not a real problem for two reasons:

  • The vec and marray constructors that take a single scalar are marked explicit, so there is no implicit conversion from float to vec<float, 3> in this example.
  • There are fmax<3> overloads for both vec and marray, so the example you show would be ambiguous even if the constructor was not excplicit.

instead of just

fmax(vfloat3 { 1.f }, 42.f) // returns a vfloat3 more naturally

I agree that this is a more rationale way to write the code. Even if we implement fmax as overloads, the user can still write code like this.

I feel like we are re-litigating the decision made in internal issue 278. Do you feel this PR should be held up to rediscuss that issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a benefit to keep the vec variants as templates? Since the set of valid values for {n} is so limited we could just have overloads for each one.

I did this in a8bcf62.

@keryell
Copy link
Member

keryell commented Jun 15, 2023

SYCL WG meeting 2023/06/15:

Lot of heritage from OpenCL

Add a lot of more precise overloads to have implicit conversion

Discussion on whether the same behavior can be implemented with meta-programming?

How to get the right behavior across implementation?

For example in the spec with SFINAE-like behaviour which can be implemented differently

Discussion on the case of ambiguous conversion between id<1> and size_t for example

The main problem is that the gen types from OpenCL were not clearly defined.

How to declare all the problems? By the CTS, end-user feedback...

Discussion on swizzle and whether we would need to add overloads to the functions

Isn't it a long discussion for a sycl::vec we plan to deprecate?

Greg will update the PR to add overloads to function

Some implementers have this on their critical path

KornevNikita added a commit to KornevNikita/llvm that referenced this pull request Jun 21, 2023
It was decided to change all built-in functions from templates to
overloads. This patch changes non-marray part of "4.17.8. Geometric
functions".

Spec: KhronosGroup/SYCL-Docs#428

Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>
KornevNikita added a commit to KornevNikita/llvm that referenced this pull request Jun 22, 2023
We are going to change built-ins implementation from templates to
overloads according to KhronosGroup/SYCL-Docs#428. This patch is
intended to unblock the parallel transition of the different groups of
built-ins.

Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>
steffenlarsen added a commit to intel/llvm that referenced this pull request Jun 22, 2023
We are going to change built-ins implementation from templates to
overloads according to KhronosGroup/SYCL-Docs#428. This patch is
intended to unblock the parallel transition of the different groups of
built-ins.

Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>
Change the `vec` overloads so they are no longer templated.  Instead
there is a separate overload for each vector length.  This allows the
application to pass a "swizzle" object as a parameter which expects
`vec` because the swizzle object is implicitly convertible to `vec`.

I also changed the formatting of the table of generic types to use
"[source]" blocks instead of "[code]" inline format.  This looks better
and it also allowed me to reformat some of the cells that have many
`vec` types.  Without this reformat, these cells were too big to fit on
one page, which caused an error with the PDF formatter.
dm-vodopyanov pushed a commit to intel/llvm that referenced this pull request Jun 25, 2023
…#10014)

It was decided to change all built-in functions from templates to
overloads. This patch changes non-marray part of "4.17.8. Geometric
functions".

Marray part: #10048

Spec: KhronosGroup/SYCL-Docs#428

---------

Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>
dm-vodopyanov pushed a commit to intel/llvm that referenced this pull request Jun 26, 2023
…mplates (#10048)

It was decided to change all built-in functions from templates to
overloads. This patch changes marray part of "4.17.8. Geometric
functions".

Non-marray part: #10014

Spec: KhronosGroup/SYCL-Docs#428
KornevNikita added a commit to KornevNikita/llvm that referenced this pull request Jun 26, 2023
…plates

It was decided to change all built-in functions from templates to overloads. This patch changes functions with ptr arg of "4.17.5. Math functions":
genfloat fract(genfloat x, genfloatptr iptr)
genfloat frexp(genfloat x, genintptr exp)
genfloat lgamma_r(genfloat x, genintptr signp)
genfloat modf(genfloat x, genfloatptr iptr)
genfloat remquo(genfloat x, genfloat y, genintptr quo)
genfloat sincos(genfloat x, genfloatptr cosval)

Spec: KhronosGroup/SYCL-Docs#428
Copy link
Member

@keryell keryell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch, this was verbose...

vec<float,4> fmax(vec<float,4> x, vec<float,4> y)
vec<float,8> fmax(vec<float,8> x, vec<float,8> y)
vec<float,16> fmax(vec<float,16> x, vec<float,16> y)
template<size_t N> marray<float,N> fmax(marray<float,N> x, marray<float,N> y)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is a discrepancy compared to vec to solve with concepts in SYCL Next...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our thinking is that vec will probably go away at some point. Once that happens, all of these overloads on vec can be removed. In the meantime, most people wanted to do whatever was easiest for implementors w.r.t. vec, and the consensus from implementors was that adding these overloads is the easiest solution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is nit the vec but that the marray with this template is not user-friendly without deeper rework.

Copy link
Collaborator

@AerialMantis AerialMantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

adoc/chapters/programming_interface.adoc Outdated Show resolved Hide resolved
adoc/chapters/programming_interface.adoc Outdated Show resolved Hide resolved
adoc/chapters/programming_interface.adoc Outdated Show resolved Hide resolved
Comment on lines +19495 to +19498
* Unless specified otherwise, the template parameter [code]#Space# is
constrained to the values [code]#access::address_space::global_space#,
[code]#access::address_space::local_space#, or
[code]#access::address_space::private_space#.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have addressed this in #432 which is rebased on this PR.

dm-vodopyanov pushed a commit to intel/llvm that referenced this pull request Jun 29, 2023
It was decided to change all built-in functions from templates to
overloads. This patch changes non-marray part of "4.17.5. Math
functions".

Spec: KhronosGroup/SYCL-Docs#428

---------

Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>
@fraggamuffin
Copy link

LGTM

@keryell keryell merged commit 207fa39 into KhronosGroup:SYCL-2020/master Jun 29, 2023
1 of 2 checks passed
@gmlueck gmlueck deleted the gmlueck/gentype-funcs branch June 29, 2023 18:07
Chenyang-L pushed a commit to intel/llvm that referenced this pull request Jun 30, 2023
It was decided to change all built-in functions from templates to
overloads. This patch changes non-marray part of "4.17.5. Math
functions".

Spec: KhronosGroup/SYCL-Docs#428

---------

Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>
gmlueck added a commit to gmlueck/SYCL-Docs that referenced this pull request Jul 21, 2023
After gaining some implementation experience, we decided to change the
clarification of the builtin functions.  Our previous attempt in KhronosGroup#428
added many overloads, and we discovered that this increased the time it
takes to compile the <sycl/sycl.hpp> header.  After further reflection,
we decided that a different mix of overloads vs. templates was more
consistent with C++ and it also results in better compile times.

In general, we follow C++ whenever there is a precedent.  For example,
the math functions like `fmax` and `isinf` have separate overloads for
scalar inputs because C++ defines these same function in that way.
However, we use template parameters for the variants that take `marray`
or `vec` because there is no C++ precedent here.  Other functions
like `clamp` have a C++ precedent where scalar arguments are template
parameters, so SYCL follows suit here also.

Of course, there are many cases where there is no C++ precedent.  In
these cases, SYCL tries to be self consistent.  For example, the
relational functions like `isequal` have separate overloads for scalar
inputs to be consistent with `isinf` and the other math functions.

We also felt that the "generic type names" were adding confusion to
the spec.  Therefore, this commit remove all these generic pseudo-type
definitions and shows the exact synopsis for each builtin function.
This was hard to do with the old table format, so the tables defining
the builtin functions have been completely rewritten to match the
style used by the newer builtin functions (e.g. the group functions).
This style leaves more horizontal space for the synopsis, which makes
it easier to write the synopsis without awkward line breaks.  The
"Constraints" section of this style also makes it easier to describe
the constraints for the various template parameters.

Some other changes of note:

* The Common Functions were inconsistently specified for the `half`
  type.  We had define some `half` overloads, but we missed others.
  This commit adds `half` support consistently to these functions.

* The introductory paragraphs have been modified to remove general
  statements about the types that are supported by the functions.
  These statements were sometime in conflict with the actual function
  definitions, and it seemed better to avoid redundancy.  Each function
  clearly specifies the allowed input types, so we do not need to say
  this again in the introduction.

* The "native precision" and "half precision" math functions have been
  split out into their own sub-sections.  It was easy to miss these
  functions before because they came after a long table of regular math
  functions.  It's now much easier to see that they exist.

* All uses of "latexmath" have been removed from the descriptions of
  the builtin functions and replaced with Asciidoc formatting.  This
  results in a more consistent style, and it also fixed several
  problems where the latexmath version was not rendered correctly in
  the final PDF and/or HTML documents.

Closes internal issue 278 (again).
Closes internal issue 321.
gmlueck added a commit to gmlueck/SYCL-Docs that referenced this pull request Jul 21, 2023
After gaining some implementation experience, we decided to change the
clarification of the builtin functions.  Our previous attempt in KhronosGroup#428
added many overloads, and we discovered that this increased the time it
takes to compile the <sycl/sycl.hpp> header.  After further reflection,
we decided that a different mix of overloads vs. templates was more
consistent with C++ and it also results in better compile times.

In general, we follow C++ whenever there is a precedent.  For example,
the math functions like `fmax` and `isinf` have separate overloads for
scalar inputs because C++ defines these same function in that way.
However, we use template parameters for the variants that take `marray`
or `vec` because there is no C++ precedent here.  Other functions
like `clamp` have a C++ precedent where scalar arguments are template
parameters, so SYCL follows suit here also.

Of course, there are many cases where there is no C++ precedent.  In
these cases, SYCL tries to be self consistent.  For example, the
relational functions like `isequal` have separate overloads for scalar
inputs to be consistent with `isinf` and the other math functions.

We also felt that the "generic type names" were adding confusion to
the spec.  Therefore, this commit remove all these generic pseudo-type
definitions and shows the exact synopsis for each builtin function.
This was hard to do with the old table format, so the tables defining
the builtin functions have been completely rewritten to match the
style used by the newer builtin functions (e.g. the group functions).
This style leaves more horizontal space for the synopsis, which makes
it easier to write the synopsis without awkward line breaks.  The
"Constraints" section of this style also makes it easier to describe
the constraints for the various template parameters.

Some other changes of note:

* The Common Functions were inconsistently specified for the `half`
  type.  We had defined some `half` overloads, but we missed others.
  This commit adds `half` support consistently to these functions.

* The introductory paragraphs have been modified to remove general
  statements about the types that are supported by the functions.
  These statements were sometime in conflict with the actual function
  definitions, and it seemed better to avoid redundancy.  Each function
  clearly specifies the allowed input types, so we do not need to say
  this again in the introduction.

* The "native precision" and "half precision" math functions have been
  split out into their own sub-sections.  It was easy to miss these
  functions before because they came after a long table of regular math
  functions.  It's now much easier to see that they exist.

* All uses of "latexmath" have been removed from the descriptions of
  the builtin functions and replaced with Asciidoc formatting.  This
  results in a more consistent style, and it also fixed several
  problems where the latexmath version was not rendered correctly in
  the final PDF and/or HTML documents.

Closes internal issue 278 (again).
Closes KhronosGroup#321
keryell added a commit that referenced this pull request Sep 10, 2024

This commit clarifies the expected signatures of the functions defined
in section 4.17.5 "Math functions" - 4.17.9 "Relational functions".

    Previously, it was not clear how much templating (if any) is present
    in these functions. We now clarify that these functions are mostly
    exposed as overloads, using template parameters only for the
    marray extents and for the multi_ptr parameters.

    Previously, it was not clear how the "generic type names" should be
    interpreted when a function signature uses several generic type
    names like this:

    genfloat fmax(genfloat x, sgenfloat y)

    Did we intend fmax to have overloads for all possible combinations
    of types from genfloat and sgenfloat, or are only certain
    combinations expected? Cases like this have been clarified by
    rewriting the spec to avoid multiple generic type names in the same
    signature. For example:

    genfloatf fmax(genfloatf x, float y)
    genfloatd fmax(genfloatd x, double y)
    genfloath fmax(genfloath x, half y)

    To resolve cases that weren't obvious, I aligned with the OpenCL
    behavior, which makes sense because all of these functions were
    originally derived from OpenCL.

    There are still two cases when a function could be defined with
    two different generic type names. These are resolved by adding
    the names unsignedtype and elementtype and specifying exactly
    how these generic types interact with the other generic types.

    The upsample, mad24, and mul24 functions presented an
    interesting problem. These were defined in terms of the
    {i,u}genintegerNbit generic types, which lead to ambiguities in
    cases like this:

    igeninteger32bit upsample(igeninteger16bit hi, ugeninteger16bit lo)

    The igeninteger32bit type was defined to be all signed integer
    types that are 32 bits wide. It was therefore unclear whether
    this function should return int or long for an implementation
    where both types are 32-bits. This was resolved by defining these
    three functions exclusively in terms of the fixed-width integer
    types. In the example above, the return type is now int32_t
    (or a vec or marray of int32_t). This makes sense because all
    three of these functions assume input values which have a specific
    number of bits.

    The overloads involving marray of integer types were also
    simplified. To give an example, we previously defined overloads for
    both mshort{n} and marray<short,{N}>. This may seem purely
    redundant, but mshort{n} is defined as marray<int16_t,{N}>. If
    we assume that each of the fixed-width integer types aliases to one
    of the fundamental integer types, there is no need to define
    overloads for both. We therefore define overloads only for the
    fundamental integer types. This will only make a difference for a
    bizarre implementation where a fixed-width integer type is a distinct
    type, different from all the fundamental types. We do not think that
    SYCL needs to define special overloads for this case, which is
    consistent with the C++ standard (which also does not define special
    library function overloads for the fixed-width integer types).

    Note that all the overloads involving vec of integer types are
    defined exclusively for the fixed-width integer types (and not the
    fundamental types). This was the case even before this commit. This
    makes sense if we think the main use for vec is for compatibility
    with OpenCL, where the sizes of the integer types are all fixed.

    The structure of the table of "Generic type names" was also improved
    for clarity. Previously, there was a deep hierarchy of generic type
    names, where many generic types were defined in terms of other
    generic types. A reader needed to traverse down through the
    hierarchy in order to understand exactly which concrete types were
    included in each generic type. The table no longer has any
    hierarchy, so each generic type now lists the full set of concrete
    types that it represents, which makes it much easier to read.

    The following statement was removed from section 4.17.1 "Description
    of generic type names" as part of my rewrite of the introduction of
    that section:

        All of the OpenCL built-in types are available in the namespace
        sycl.

    I think this statement is erroneous because we did not intend to
    export all of these types into the sycl namespace. In addition,
    this would be a very strange place for us to say that.

Closes internal issue 278.
Closes internal issue 590.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants