Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use of extensions in the kernel language #82

Open
AnastasiaStulova opened this issue May 28, 2019 · 17 comments · May be fixed by #355
Open

Use of extensions in the kernel language #82

AnastasiaStulova opened this issue May 28, 2019 · 17 comments · May be fixed by #355
Labels
needs-cts-coverage The CTS needs to be extended OpenCL C Spec Issues related to the OpenCL C Language specification. themed-agenda Agenda for on-demand meetings dedicated to a specific topic

Comments

@AnastasiaStulova
Copy link
Contributor

AnastasiaStulova commented May 28, 2019

I would like to summarize the intended use of extension (after my discussion with @bashbaug) and if possible adjust the implementation.

A. The way to check that the extension is being supported is by predefined extension macro i.e.

#if defined(cl_khr_subgroups)
size_t i = get_sub_group_local_id();
#endif

B. Some extensions might require extra compiler support or alter traditional compilation phase. When different compilation functionality is required it can be enabled using pragma. I.e.

#pragma OPENCL EXTENSION cl_khr_fp64 : enable
cos(0.1); //use cos overload with double.
#pragma OPENCL EXTENSION cl_khr_fp64 : disable
cos(0.1); //use cos overload with float.

Observation: the use of get_sub_group_local_id doesn't require any compilation support and can be handled as regular include and therefore no extension pragma is needed to use the function. The same applies to cos. Only the use of double literal requires enabling pragma in the above examples.

This doesn't align with upstream implementation in Clang. The implementation requires extension pragma for some extensions that are not part of core support in corresponding spec version i.e. pragma will be required for subgroups up to OpenCL v2.0. The implementation is quite inconsistent though. For example some extensions from ARM and Intel require pragma but extensions from AMD don't seem to be requiring it.

Actions:
1. Gather the list of extensions that require pragma
2. Simplify compilation by not requiring pragma where it is not necessary.

Backwards compatibility: Old code will continue to compile. Pragma will be simply ignored by the new rules.

@AnastasiaStulova AnastasiaStulova added OpenCL C Spec Issues related to the OpenCL C Language specification. agenda Needs Working Group Discussion labels May 28, 2019
@bashbaug
Copy link
Contributor

This sounds reasonable to me - I think I hinted at something similar here #19 (comment).

Backwards compatibility: Old code will continue to compile.

Is there also a risk that new code will fail on older drivers. Presumably some compilers currently require enabling pragmas even if others do not, so perhaps we should have a single control to enable "simpler compilation"?

@AnastasiaStulova
Copy link
Contributor Author

Yes old drivers will continue requesting the extension to be enabled.

@AnastasiaStulova AnastasiaStulova removed the agenda Needs Working Group Discussion label Sep 3, 2019
@kpet
Copy link
Contributor

kpet commented May 22, 2020

As discussed in the 2020/05/22 tooling call, we probably want to resolve this for CL3.0.

@kpet kpet added this to To do in OpenCL 3.0 Final specifications via automation May 22, 2020
@AnastasiaStulova
Copy link
Contributor Author

I looked at Clang upstream and those are estensions that are actually used in compilation:

  • cl_khr_fp16 enables an extra fp type 'half'
  • cl_khr_fp64 enables double type and makes double literal to be in double precision.
  • cl_khr_gl_msaa_sharing enables extra builtin image types
  • cl_khr_3d_image_writes allows write_only 3d image type
  • cl_khr_subgroups for some extra diagnostic of DSE BIFs? (maybe we can remove this?)
  • cl_khr_int64_base_atomics, cl_khr_int64_extended_atomics add extra atomic types

@AnastasiaStulova AnastasiaStulova added the agenda Needs Working Group Discussion label Jun 9, 2020
@AnastasiaStulova AnastasiaStulova added the OpenCL 3 Must Fix Critical OpenCL 3.0 Issues label Jul 8, 2020
@bashbaug bashbaug moved this from To do to Needs WG discussion in OpenCL 3.0 Final specifications Jul 9, 2020
@bashbaug bashbaug moved this from Needs WG discussion to In progress in OpenCL 3.0 Final specifications Jul 16, 2020
@AnastasiaStulova
Copy link
Contributor Author

Summary:

  • OpenCL C didn't specify whether each extension should require pragma enable/disable directives. It has only required the extension macros:

Every extension which affects the OpenCL language semantics, syntax or adds built-in functions tothe language must create a preprocessor #define that matches the extension name string. This#define would be available in the language if and only if the extension is supported on a givenimplementation.

  • Pragma directives are normally used to alter the compiler behavior and therefore it is undesirable to require them for functionality that is based on existing/non-altered compiler behavior i.e. adding a new function or a set of functions.

A. Example of functionality from extensions that need altering the compiler behavior:
Floating point literal without any suffix are interpreted as either float or double if cl_khr_fp64 is enabled or disabled.
See full example: https://godbolt.org/z/YW6r8E

B. Example of functionality from extensions that doesn't need extra compiler support:
(i) cl_khr_mipmap_image_writes adds overloads of write_image function with an extra argument. This seems to use existing language functionality for declaring/defining functions. There seems to be no extra compiler support needed.
(ii) cl_khr_initialize_memory, cl_khr_icd, cl_khr_mipmap_image don't seem to be even related to the kernel language? Yet we have added them to clang and there seems to be pragma for those in the spec.

The suggestion to move forward:

  • Improve the specification around extensions to describe explicitly that pragma directives are only to be used with extensions that need altering the compiler standard behavior.
  • Add explicit statement about the pragma to be required in the extensions that need changes in the compiler(cl_khr_fp16, cl_khr_fp64, cl_khr_gl_msaa_sharing, cl_khr_3d_image_writes, cl_khr_subgroups, cl_khr_int64_base_atomics, cl_khr_int64_extended_atomics). To be honest this list already has extensions that shouldn't need pragma either i.e. we could just add overloads of atomics with 64 bits width types if it is supported by the target. I am not sure what is the value in the use of pragma?
  • For already published extensions that shouldn't have required the pragma (i.e. compiler modification) originally we can clarify that the pragma is not necessary but it's going to be accepted. To avoid modifying each of the extensions we can also explain that the default behavior (if not explicitly stated) is that pragma is optional. To be honest unknown pragmas are to be ignored by the compiler anyway. So not sure we need to explain much here. For example, clang just gives a warning for unrecognized pragmas but doesn't reject the code. Existing code with the use of pragma will continue to compile even if pragma support is removed from the compiler.
  • For new extensions, we should encourage to specify explicitly whether pragma is needed or not to avoid confusions in implementations and applications too.

AnastasiaStulova pushed a commit to AnastasiaStulova/OpenCL-Docs that referenced this issue Jul 20, 2020
This change adds clarification for the compiler directive
that enables and disables the extensions.

Summary:
- Add high-level description of OpenCL specific pragma
  directives into the core spec.
- Clarify the use of extension pragma in the extension spec.
- Clarify the need of extension pragma directive in cl_khr_fp64
  and cl_khr_fp16.
@bashbaug bashbaug linked a pull request Jul 21, 2020 that will close this issue
@bashbaug bashbaug moved this from In progress to Needs WG discussion in OpenCL 3.0 Final specifications Jul 23, 2020
@bashbaug bashbaug removed the OpenCL 3 Must Fix Critical OpenCL 3.0 Issues label Jul 28, 2020
@AnastasiaStulova
Copy link
Contributor Author

FYI I have started a discussion with the LLVM community to see if we can do the simplifications in the implementation for now:
http://lists.llvm.org/pipermail/cfe-dev/2020-September/066911.html

@AnastasiaStulova
Copy link
Contributor Author

FYI an attempt to remove extensions from upstream clang that don't have a kernel language changes:
https://reviews.llvm.org/D89372

@bashbaug bashbaug removed this from Needs WG discussion in OpenCL 3.0 Final specifications Nov 6, 2020
@bashbaug bashbaug added this to To do in OpenCL specification maintenance via automation Nov 6, 2020
@bashbaug bashbaug moved this from To do to Needs WG discussion in OpenCL specification maintenance Nov 6, 2020
@AnastasiaStulova
Copy link
Contributor Author

AnastasiaStulova commented Nov 10, 2020

I have made further investigation and I realized that the majority of extensions will have a section with the pragma in the same exact format regardless of whether they even add any change to OpenCL C. However, neither individual extensions nor extensions spec or language spec explain whether and how the pragma is to be used. Moreover the extensions spec and language spec doesn't contain the pragmas for any extension, so only individual extension specs do. That means that neither upstream implementation in Clang nor existing kernels using pragmas are or can be conformant at the moment. My guess is that the pragma in the specification is a result of copy-paste.

These are the only extensions that somehow refer to the potential use of pragmas: cl_khr_fp16, cl_khr_int64_base_atomics, cl_khr_int64_extended_atomics. Those are the wordings from the spec.

Unless the cl_khr_fp16 extension is supported and has been enabled.

The half scalar and vector types can only be used if the cl_khr_fp16 extension is supported and has been enabled. The double scalar and vector types can only be used if double precision is supported.

The atomic_long and atomic_ulong types are supported if the cl_khr_int64_base_atomics andcl_khr_int64_extended_atomics extensions are supported and have been enabled.

The atomic_double type is only supported if double precision is supported and the cl_khr_int64_base_atomics and cl_khr_int64_extended_atomics extensions are supported and have been enabled.

If the device address space is 64-bits, the data types atomic_intptr_t, atomic_uintptr_t,atomic_size_t and atomic_ptrdiff_t are supported if the cl_khr_int64_base_atomics andcl_khr_int64_extended_atomics extensions are supported and have been enabled.

The problem I have with those is that I don't understand what exactly should happen when the pragma is enabled and disabled. After several discussions with the developers, I realized that there are multiple interpretations, including the one implemented in clang that in my opinion doesn't help but only makes things complicated. In addition, the upstream implementation is inconsistent i.e. (1) pragma is required for some extensions but not the others and (2)
the same extension can behave differently depending on which of two default headers is used.

To sort out the inconsistencies and misinterpretations I suggest the following steps:

  1. Clarify the general use of extension pragma to avoid future mistakes and confusions: [OpenCL C 3.0] Clarify use of extension pragma (#82). #355
  2. Identify in what extensions the pragma is needed and add clarification to each of those i.e. what happens when pragma is disabled/enabled.
  3. Add clarifications to the existing extensions that don't need pragma but list it in the extension spec:
    (a) the pragma is only accepted for backward compatibility reasons but doesn't change any functionality. I.e. new kernels won't need to add it but the existing kernels will continue to compile without the warnings about the unknown pragmas. NOTE that this should exclude the API extensions where pragma could never have been used in the kernel side and therefore it should just be removed from the spec.
    (b) the prgma is not guaranteed to be accepted.
  4. Modify upstream implementation according to the clarified spec.

To clarify that (3a) is more helpful to the developer and allows to simplify tooling sufficiently. But there has to be consensus regarding this within the WG because some existing implementation might not have added the pragmas initially. (3b) is less helpful to the developers but provides sufficient feedom to simplify tooling and doesn't necessarily need for vendors to agree as its not affecting backwards compatibility in any way. Potentially it can be covered by clarification regarding the extension pragma in (1) and therefore no changes in individual extensions would be needed. The issue is that it is a huge issue for portability.

@jvesely
Copy link

jvesely commented Dec 11, 2020

Macros are insufficient for backwards compatibility. This should be obvious as applications cannot check for macros that didn't exist at the time the application was written.

Section 6.1.9 (OpenCLC 2.0) lists reserved identifiers, but there is no requirement for new extensions to only introduce identifiers from the reserved set.

Note that Section 9.1 (OpenCL 2.0 Extensions) does not require extension functions to follow a particular format or mandate use of previously reserved identifiers.

This can be easily fixed by adjusting both sections to explicitly prevent identifier conflicts.

However, the older spec (OpenCL 1.2) used pragmas for that purpose. For example section 9.3 Atomics says:

An application that wants to use any of these extensions will need to include the #pragma
OPENCL EXTENSION cl_khr_int64_base_atomics : enable or #pragma
OPENCL EXTENSION cl_khr_int64_extended_atomics : enable directive in
the OpenCL program source.

The wording is clear that including #pragma ....: enable is a requirement to have access to atom_* functions.

This is behaviour is easily achieved using the following steps in implementation.
1.) all extensions only add identifiers that begin with double underscore (e.g. __new_functionA) these identifiers are reserved in 6.1.9
2.) compiler adds defines for all identifiers introduced by the extension (e.g. #define new_functionA __new_functionA) on the same line as #pragma extension_E: enable
3.) compiler removes defines for all identifiers on the same line as #pragma extension_E: disable

This approach would allow legacy apps and extensions to potentially share identifiers.

There're enough references to backward compatibility throughout the specs (see for exmaple cl_khr_fp64), to suggest that legacy applications should be considered.

Either making the name exclusions explicit (be adjusting 6.1.9 and 9),
or clarifying the role of pragma for visibility of extension identifiers would provide clarity wrt backwards compatibility.

@AnastasiaStulova
Copy link
Contributor Author

Section 6.1.9 (OpenCLC 2.0) lists reserved identifiers, but there is no requirement for new extensions to only introduce identifiers from the reserved set.

@jvesely I agree that this is indeed a problem, and I have created an issue to follow up separately (#547).

However, the older spec (OpenCL 1.2) used pragmas for that purpose. For example section 9.3 Atomics says:

An application that wants to use any of these extensions will need to include the #pragma
OPENCL EXTENSION cl_khr_int64_base_atomics : enable or #pragma
OPENCL EXTENSION cl_khr_int64_extended_atomics : enable directive in
the OpenCL program source.

The wording is clear that including #pragma ....: enable is a requirement to have access to atom_* functions.

This is a very reasonable interpretation of the spec wording but it is unfortunately not the wording that spec has. Especially there is no wording about identifiers from the extensions being either available or unavailable with or without the pragma.

Regarding the implementation - I would say with the current language construct choice it wouldn't be easy to implement the logic that can load or unload functionality from the extensions. The issue is that the pragma can be added anywhere in the code which means we would require loading and unloading the extension identifiers dynamically. This goes fundamentally against the parsing flow in C/C++ derived implementations such us clang for example. We could of course look at restricting the pragma but I would rather use better matching feature in the language rather than fixing what has not been designed for that purpose. I would imagine that the use of a regular header includes or/and reserved identifiers such as double underscore prefixes could easily solve the issue you are highlighting. And I do hope this will be the direction for the near future.

@AnastasiaStulova
Copy link
Contributor Author

PR #355 currently aims to clarify that pragma has an implementation-defined behavior with the purpose to:

  • remove confusions and misinterpretation regarding this topic
  • allow implementations to improve without invalidating existing implementations and affecting backward compatibility (i.e. pragma can be accepted without any diagnostic but ignored)

However, a new approach that has been discussed with @bashbaug:

@jvesely
Copy link

jvesely commented Feb 11, 2021

Section 6.1.9 (OpenCLC 2.0) lists reserved identifiers, but there is no requirement for new extensions to only introduce identifiers from the reserved set.

@jvesely I agree that this is indeed a problem, and I have created an issue to follow up separately (#547).

However, the older spec (OpenCL 1.2) used pragmas for that purpose. For example section 9.3 Atomics says:
An application that wants to use any of these extensions will need to include the #pragma
OPENCL EXTENSION cl_khr_int64_base_atomics : enable or #pragma
OPENCL EXTENSION cl_khr_int64_extended_atomics : enable directive in
the OpenCL program source.
The wording is clear that including #pragma ....: enable is a requirement to have access to atom_* functions.

This is a very reasonable interpretation of the spec wording but it is unfortunately not the wording that spec has. Especially there is no wording about identifiers from the extensions being either available or unavailable with or without the pragma.

The wording is quoted from the specs so it literally IS what the spec has.
It'd be nicer if the spec was explicit about the visibility of extension identifiers, but the word "need" means using pragma is a requirement. i.e. it's not going to work without.
I guess you can argue that there are other failure modes than missing identifiers, but that's a bigger stretch.

Regarding the implementation - I would say with the current language construct choice it wouldn't be easy to implement the logic that can load or unload functionality from the extensions. The issue is that the pragma can be added anywhere in the code which means we would require loading and unloading the extension identifiers dynamically. This goes fundamentally against the parsing flow in C/C++ derived implementations such us clang for example. We could of course look at restricting the pragma but I would rather use better matching feature in the language rather than fixing what has not been designed for that purpose. I would imagine that the use of a regular header includes or/and reserved identifiers such as double underscore prefixes could easily solve the issue you are highlighting. And I do hope this will be the direction for the near future.

The current spec already requires the compiler to support dynamic switching of extensions on and off ("Directives that occur later override those seen earlier." Sec 9.1). There's no restriction on the placement of extension pragmas in kernel source code.

Both pragma and define/undef are preprocessor directives. define/undef already has to be supported and can be easily repurposed for hiding/exposing extension identifiers.

@AnastasiaStulova
Copy link
Contributor Author

It'd be nicer if the spec was explicit about the visibility of extension identifiers, but the word "need" means using pragma is a requirement. i.e. it's not going to work without.

What is pragam required for and what is not going to work without it? You can also assume that it is not going to execute correctly but the compilation succeeds. Or you can come up with other interpretations too.

Both pragma and define/undef are preprocessor directives. define/undef already has to be supported and can be easily repurposed for hiding/exposing extension identifiers.

I don't have the evidence that this can be done easy and I don't think it is very productive to make such claims without thorough investigations of the topic.

@bashbaug
Copy link
Contributor

I added a few more examples to the google doc spreadsheet linked above. So far, it looks like we will need an extension pragma for maximum portability with existing Clang-based implementations for at least:

  • cl_intel_device_side_avc_motion_estimation
  • cl_khr_3d_image_writes
  • cl_khr_fp16
  • cl_khr_fp64
  • cl_khr_gl_msaa_sharing
  • cl_khr_int64_base_atomics
  • cl_khr_int64_extended_atomics
  • cl_khr_subgroups

So far, I haven't found a need for an extension pragma for:

  • cl_intel_subgroups
  • cl_intel_subgroups_short
  • cl_khr_byte_addressable_store
  • cl_khr_depth_images
  • cl_khr_global_int32_base_atomics
  • cl_khr_global_int32_extended_atomics
  • cl_khr_local_int32_base_atomics
  • cl_khr_local_int32_extended_atomics
  • cl_khr_mipmap_image
  • cl_khr_mipmap_image_writes
  • cl_khr_srgb_image_writes

@bashbaug
Copy link
Contributor

I see now that my list in the comment above is the same list as @AnastasiaStulova's comment earlier in this issue - that seems like a good thing.

We discussed this issue in the special Feburary 25th teleconference:

We're trending towards defining three lists of extensions:

  1. Extensions that require a pragma. Kernels may fail (produce compilation errors) without them. This is the first list above.
  2. Extensions that do not require a pragma, but that must not produce any diagnostics if a pragma directive for the extension is present. This means that kernels with the extension pragma that are compiled with -Werror must continue to compile successfully. This is effectively the second list above.
  3. Extensions that do not require a pragma and that may produce a diagostic if a pragma directive for the extenion is present. This will be the recommended policy for extensions moving forward.

Since we don't know how all implementations behave WRT extension pragmas, we may need to move an extension from (2) to (1). We'll do our best to get this right, but if we need to make any changes we'll treat this as a spec bug.

If there are any other extensions that require a pragma for an implementation, please let us know ASAP!

@AnastasiaStulova
Copy link
Contributor Author

AnastasiaStulova commented Mar 11, 2021

Summary from the teleconference on Mar 11, 2021

  • The implementation in upstream clang for the legacy extensions doesn't seem to be neither conformant nor helpful. It is generally reasonable to change the implementation but preserving backward compatibility is desirable. From the last suggested approach, the only requirement on the implementation of the legacy extensions seems to be to accept the pragma. There seems to be nor desirable nor common behavior for the pragma at present.

  • For the new extensions some vendors might discover that the pragma is useful at some point after the extension is published. However, requiring the use of pragma incurs implementation overheads and burdens to the developers. The compiler would need to know all extensions that are supported and the developers would have to add them in their source code. The pragma behavior would also have to be defined and therefore the spec needs to change.

  • The semantic of pragma could be changed to allow implementing it adequately. However, there are other language features already available that could serve the same purpose i.e. header includes. This could be used already now for the new extensions.

  • Action: gather more feedback on whether the pragma can be deprecated and the new mechanisms can be instated.

leok7v added a commit to leok7v/OpenCL that referenced this issue May 20, 2023
…ement

//
Intel(R) UHD Graphics does not support ocl_fp64
manifesting it by saying
   "use of type 'double' requires cl_khr_fp64 extension to be enabled"
while NOT having 'cl_khr_fp64' among it's extensions
but double_fp_config == 0

Since OpenCL C 1.1 it is NOT required to enable cl_khr_fp64 extension
to use "double" type in .cl code.
However some Intel CPU/Integrated Graphics GPU silicon
does not implement double.
OpenCL is not clear on reporting the device float/double
capabilities and clang .cl compiler struggles with it.
Real problem is the cl_khr_* are [ab]used for both reporting
and enablement/disablement which is muddy.
For now both cl_khr_fp64 and cl_intel_accelerator are checked
until OpenCL achive clarity on #pragma extension
enabling/disabling and reporting
KhronosGroup/OpenCL-Docs#82
KhronosGroup/OpenCL-Docs#355
@bashbaug bashbaug added the needs-cts-coverage The CTS needs to be extended label Aug 1, 2023
@bashbaug
Copy link
Contributor

bashbaug commented Aug 1, 2023

Related CTS issue: KhronosGroup/OpenCL-CTS#886

@bashbaug bashbaug moved this from Needs WG discussion to To do in OpenCL specification maintenance Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-cts-coverage The CTS needs to be extended OpenCL C Spec Issues related to the OpenCL C Language specification. themed-agenda Agenda for on-demand meetings dedicated to a specific topic
Development

Successfully merging a pull request may close this issue.

4 participants