Use of extensions in the kernel language #82
This sounds reasonable to me - I think I hinted at something similar here #19 (comment).
Is there also a risk that new code will fail on older drivers? Presumably some compilers currently require enabling pragmas even if others do not, so perhaps we should have a single control to enable "simpler compilation"?
Yes, old drivers will continue requesting the extension to be enabled.
As discussed in the 2020/05/22 tooling call, we probably want to resolve this for CL3.0.
I looked at Clang upstream, and these are the extensions that are actually used in compilation:
Summary:
A. Examples of functionality from extensions that need to alter the compiler behavior:
B. Examples of functionality from extensions that don't need extra compiler support:
The suggestion to move forward:
This change adds clarification for the compiler directive that enables and disables the extensions.
Summary:
- Add high-level description of OpenCL specific pragma directives into the core spec.
- Clarify the use of extension pragma in the extension spec.
- Clarify the need of extension pragma directive in cl_khr_fp64 and cl_khr_fp16.
FYI I have started a discussion with the LLVM community to see if we can do the simplifications in the implementation for now:
FYI an attempt to remove extensions from upstream Clang that don't have any kernel language changes:
I have investigated further and realized that the majority of extensions have a section with the pragma in exactly the same format, regardless of whether they add any change to OpenCL C at all. However, neither the individual extensions nor the extensions spec nor the language spec explains whether and how the pragma is to be used. Moreover, the extensions spec and the language spec don't contain the pragmas for any extension; only the individual extension specs do. That means that neither the upstream implementation in Clang nor existing kernels using pragmas are or can be conformant at the moment. My guess is that the pragma in the specification is a result of copy-paste. These are the only extensions that somehow refer to the potential use of pragmas:
The problem I have with those is that I don't understand what exactly should happen when the pragma is enabled and disabled. After several discussions with the developers, I realized that there are multiple interpretations, including the one implemented in Clang, which in my opinion doesn't help and only makes things complicated. In addition, the upstream implementation is inconsistent, i.e. (1) the pragma is required for some extensions but not the others and (2)
To sort out the inconsistencies and misinterpretations, I suggest the following steps:
To clarify: (3a) is more helpful to the developer and allows tooling to be simplified sufficiently. But there has to be consensus regarding this within the WG, because some existing implementations might not have added the pragmas initially. (3b) is less helpful to the developers but provides sufficient freedom to simplify tooling, and it doesn't necessarily need vendors to agree, as it does not affect backwards compatibility in any way. Potentially it can be covered by the clarification regarding the extension pragma in (1), and therefore no changes in individual extensions would be needed.
The issue is that it is a huge issue for portability.
Macros are insufficient for backwards compatibility. This should be obvious, as applications cannot check for macros that didn't exist at the time the application was written. Section 6.1.9 (OpenCL C 2.0) lists reserved identifiers, but there is no requirement for new extensions to only introduce identifiers from the reserved set. Note that Section 9.1 (OpenCL 2.0 Extensions) does not require extension functions to follow a particular format or mandate use of previously reserved identifiers. This can be easily fixed by adjusting both sections to explicitly prevent identifier conflicts. However, the older spec (OpenCL 1.2) used pragmas for that purpose. For example, section 9.3 Atomics says:
The wording is clear that including
This behaviour is easily achieved using the following steps in implementation.
This approach would allow legacy apps and extensions to potentially share identifiers. There are enough references to backward compatibility throughout the specs (see for example cl_khr_fp64) to suggest that legacy applications should be considered. Either making the name exclusions explicit (by adjusting 6.1.9 and 9),
@jvesely I agree that this is indeed a problem, and I have created an issue to follow up separately (#547).
This is a very reasonable interpretation of the spec wording, but it is unfortunately not the wording that the spec has. In particular, there is no wording about identifiers from the extensions being either available or unavailable with or without the pragma. Regarding the implementation: I would say that with the current choice of language construct it wouldn't be easy to implement logic that can load or unload functionality from the extensions. The issue is that the pragma can be added anywhere in the code, which means we would need to load and unload the extension identifiers dynamically. This goes fundamentally against the parsing flow in C/C++ derived implementations such as Clang. We could of course look at restricting the pragma, but I would rather use a better-matching feature in the language than fix something that was not designed for that purpose. I would imagine that regular header includes and/or reserved identifiers such as double underscore prefixes could easily solve the issue you are highlighting. And I do hope this will be the direction for the near future.
PR #355 currently aims to clarify that pragma has an implementation-defined behavior with the purpose to:
However, a new approach has been discussed with @bashbaug:
The wording is quoted from the specs so it literally IS what the spec has.
The current spec already requires the compiler to support dynamic switching of extensions on and off ("Directives that occur later override those seen earlier." Sec 9.1). There's no restriction on the placement of extension pragmas in kernel source code. Both |
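The quoted rule ("Directives that occur later override those seen earlier") can be modelled as a small state tracker. This is only a sketch in plain C, not how Clang or any driver implements it: `directive_t` and `extension_enabled` are hypothetical names, and treating every extension as disabled until a directive says otherwise is an assumption made for the illustration.

```c
#include <string.h>

/* Hypothetical model of one "#pragma OPENCL EXTENSION <name> : <behavior>"
   directive, listed in the order it appears in the kernel source. */
typedef struct {
    const char *extension; /* e.g. "cl_khr_fp64" */
    int enable;            /* 1 for ": enable", 0 for ": disable" */
} directive_t;

/* Returns 1 if `name` is enabled after applying the first `count`
   directives in order, 0 otherwise. Assumption: extensions start disabled. */
int extension_enabled(const directive_t *dirs, size_t count, const char *name) {
    int enabled = 0;
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(dirs[i].extension, name) == 0) {
            enabled = dirs[i].enable; /* a later directive overrides an earlier one */
        }
    }
    return enabled;
}
```

Querying the state at different values of `count` corresponds to asking "is the extension enabled at this point in the source?", which is the dynamic switching the comment refers to.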
What is the pragma required for, and what is not going to work without it? You can also assume that it is not going to execute correctly but the compilation succeeds. Or you can come up with other interpretations too.
I don't have evidence that this can be done easily, and I don't think it is very productive to make such claims without a thorough investigation of the topic.
I added a few more examples to the google doc spreadsheet linked above. So far, it looks like we will need an extension pragma for maximum portability with existing Clang-based implementations for at least:
So far, I haven't found a need for an extension pragma for:
I see now that my list in the comment above is the same list as @AnastasiaStulova's comment earlier in this issue - that seems like a good thing. We discussed this issue in the special February 25th teleconference: We're trending towards defining three lists of extensions:
Since we don't know how all implementations behave WRT extension pragmas, we may need to move an extension from (2) to (1). We'll do our best to get this right, but if we need to make any changes we'll treat this as a spec bug. If there are any other extensions that require a pragma for an implementation, please let us know ASAP!
Summary from the teleconference on Mar 11, 2021
…ement // Intel(R) UHD Graphics does not support ocl_fp64, manifesting it by saying "use of type 'double' requires cl_khr_fp64 extension to be enabled" while NOT having 'cl_khr_fp64' among its extensions and reporting double_fp_config == 0. Since OpenCL C 1.1 it is NOT required to enable the cl_khr_fp64 extension to use the "double" type in .cl code. However, some Intel CPU/Integrated Graphics GPU silicon does not implement double. OpenCL is not clear on reporting the device float/double capabilities, and the clang .cl compiler struggles with it. The real problem is that the cl_khr_* names are [ab]used for both reporting and enablement/disablement, which is muddy. For now both cl_khr_fp64 and cl_intel_accelerator are checked until OpenCL achieves clarity on #pragma extension enabling/disabling and reporting KhronosGroup/OpenCL-Docs#82 KhronosGroup/OpenCL-Docs#355
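A host-side check along the lines described in that commit message could look roughly like the sketch below. It is not the code from the referenced commit. The helpers `has_extension` and `device_supports_fp64` are hypothetical, and they assume the caller has already queried the space-separated `CL_DEVICE_EXTENSIONS` string and the `CL_DEVICE_DOUBLE_FP_CONFIG` bitfield with `clGetDeviceInfo`. Requiring both signals to agree is a deliberately conservative choice for illustration.

```c
#include <string.h>

/* Returns 1 if `token` appears as a whole space-separated word in `list`. */
int has_extension(const char *list, const char *token) {
    size_t len = strlen(token);
    const char *p = list;
    if (len == 0) {
        return 0;
    }
    while ((p = strstr(p, token)) != NULL) {
        int starts_word = (p == list) || (p[-1] == ' ');
        int ends_word = (p[len] == '\0') || (p[len] == ' ');
        if (starts_word && ends_word) {
            return 1;
        }
        p += 1; /* keep scanning past a partial match */
    }
    return 0;
}

/* Hypothetical policy: treat double as usable only if the device reports
   cl_khr_fp64 AND a non-zero double_fp_config. */
int device_supports_fp64(const char *extensions, unsigned long double_fp_config) {
    return has_extension(extensions, "cl_khr_fp64") && double_fp_config != 0;
}
```

With the values reported in the commit message (no `cl_khr_fp64` in the extension string, `double_fp_config == 0`) this returns 0, so the application can fall back to `float` before handing any `double` code to the kernel compiler.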
Related CTS issue: KhronosGroup/OpenCL-CTS#886 |
I would like to summarize the intended use of extension (after my discussion with @bashbaug) and if possible adjust the implementation.
A. The way to check that an extension is supported is via the predefined extension macro, i.e.
B. Some extensions might require extra compiler support or alter the traditional compilation phases. When different compilation functionality is required, it can be enabled using the pragma, i.e.
Observation: the use of get_sub_group_local_id doesn't require any compiler support and can be handled like a regular include, so no extension pragma is needed to use the function. The same applies to cos. Only the use of a double literal requires the enabling pragma in the above examples.
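For readers who do not have the original snippets at hand, points A and B can be sketched as follows. This is an illustration under stated assumptions, not the example from the original comment: the kernel `halve`, the alias `real_t`, the `HALF` macro, and the host-C shim at the top are all hypothetical. The shim only exists so that the same text also builds with an ordinary C compiler, where `cl_khr_fp64` is not predefined and the `float` path is taken.

```c
/* Host-C shim (assumption for illustration only). An OpenCL C compiler
   predefines __OPENCL_VERSION__ and skips this part. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
static size_t get_global_id(unsigned int dim) { (void)dim; return 0; }
#endif

/* A. Check for support through the predefined extension macro. */
#ifdef cl_khr_fp64
/* B. Ask the compiler for the altered behavior: the double type and
      double literals. */
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
typedef double real_t;
#define HALF ((real_t)0.5)
#else
/* Fallback when the extension macro is absent: no pragma, float only. */
typedef float real_t;
#define HALF ((real_t)0.5f)
#endif

/* Multiplies each element by one half, in whichever precision was selected. */
__kernel void halve(__global real_t *data) {
    size_t i = get_global_id(0);
    data[i] = data[i] * HALF;
}
```

The macro test decides whether the feature can be used at all, while the pragma is confined to the one branch where compilation behavior actually has to change.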
This doesn't align with the upstream implementation in Clang. The implementation requires the extension pragma for some extensions that are not part of core support in the corresponding spec version, i.e. the pragma will be required for subgroups up to OpenCL v2.0. The implementation is quite inconsistent though. For example, some extensions from ARM and Intel require the pragma, but extensions from AMD don't seem to require it.
Actions:
1. Gather the list of extensions that require pragma
2. Simplify compilation by not requiring pragma where it is not necessary.
Backwards compatibility: Old code will continue to compile. Pragma will be simply ignored by the new rules.