This repository has been archived by the owner on Jan 26, 2024. It is now read-only.

Fix OpenCL headers 2021.04.29 compatibility #25

Open
wants to merge 102 commits into
base: develop

devurandom

OpenCL headers 2021.04.29 moved CL_COMMAND_GL_FENCE_SYNC_OBJECT_KHR into
a different header. (See issue referenced below for details.)

See-also: KhronosGroup/OpenCL-Headers#145
See-also: https://bugs.gentoo.org/790164
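
One defensive way to stay compatible with both header layouts is sketched below. It is not necessarily the change this pull request makes. It assumes that the token keeps the value `0x200D` assigned by the `cl_khr_gl_event` extension, and the helper `gl_fence_sync_command_type` is hypothetical, added only so the result can be inspected.

```cpp
// Include whichever OpenCL/GL interop headers are installed, if any.
#ifdef __has_include
#  if __has_include(<CL/cl_gl.h>)
#    include <CL/cl_gl.h>
#  endif
#  if __has_include(<CL/cl_gl_ext.h>)
#    include <CL/cl_gl_ext.h>
#  endif
#endif

// Fallback for header sets where the token is still not visible.
// Assumption: 0x200D is the value assigned by cl_khr_gl_event.
#ifndef CL_COMMAND_GL_FENCE_SYNC_OBJECT_KHR
#  define CL_COMMAND_GL_FENCE_SYNC_OBJECT_KHR 0x200D
#endif

// Hypothetical helper: exposes the token as a plain integer.
unsigned int gl_fence_sync_command_type() {
  return CL_COMMAND_GL_FENCE_SYNC_OBJECT_KHR;
}
```

Because the fallback is guarded by `#ifndef`, it is inert whenever the installed headers already provide the definition.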

vsytch and others added 30 commits August 15, 2021 23:44
This change refactors the current ROCclr cmake build to accommodate a
more modular approach. This allows easier support for multiple compilers
and/or multiple runtime backends.

Currently supported compilers:
    HSAIL - enabled by ROCCLR_ENABLE_HSAIL (defaults to OFF)
    LC    - enabled by ROCCLR_ENABLE_LC    (defaults to ON)

Currently supported runtimes:
    HSA - enabled by ROCCLR_ENABLE_HSA (defaults to ON)
    PAL - enabled by ROCCLR_ENABLE_PAL (defaults to OFF)

Any configuration is supported as long as at least one compiler and one
runtime are enabled.

Since ROCclr clients can configure it differently, the same ROCclr build
artifacts cannot be reused between different clients. To ensure this,
this patch assumes that ROCclr will be built as part of the client's
project.

Change-Id: Id4a5c43634296802b8ae87d1ad5984968391ccaf
With a HIP API callback, the runtime has to stall the queue until the
callback is done. ROCclr will introduce a SW blocking HSA signal,
which will be released after the callback is done.

Change-Id: I6411f3efab31b468e3b87ebb5c8d155e116b613d
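
The stall-until-callback-completes idea described in this commit can be illustrated with a minimal sketch. It uses a condition variable in place of an HSA signal, so it is not ROCclr's implementation. `BlockingSignal` and `run_queue_with_callback` are hypothetical names.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical software signal: waiters block until release() is called.
class BlockingSignal {
 public:
  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return released_; });
  }
  void release() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      released_ = true;
    }
    cv_.notify_all();
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool released_ = false;
};

// Simulates a queue that stalls on a callback and returns the event order.
std::vector<std::string> run_queue_with_callback() {
  std::vector<std::string> log;
  std::mutex log_mutex;
  BlockingSignal signal;

  // The callback runs on another thread and releases the signal when done.
  std::thread callback([&] {
    {
      std::lock_guard<std::mutex> lock(log_mutex);
      log.push_back("callback");
    }
    signal.release();
  });

  signal.wait();  // the queue stalls here until the callback has finished
  {
    std::lock_guard<std::mutex> lock(log_mutex);
    log.push_back("next command");
  }
  callback.join();
  return log;
}
```

Because the queue only proceeds after `release()`, the callback entry always precedes the next command in the returned log.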
Change-Id: I337b8d3b38a492b77b55602ab3a6bb3c05e693e0
Add an extension to memory advise to disable cache coherency for
better performance

Change-Id: I283703d81d9c36ddfa2c8fffa15eef60e2195056
Change-Id: I54ca6e8458cf6414c263df7a8bf61f7ce39a64df
All KMD/asic_reg/UGL headers are located under the drivers folder. No
need for the AMD_UGL_PATH variable as it essentially is
${AMD_DRIVERS_PATH}/ugl.

Change-Id: I070d737d50f2096493b3e75ef9b9e824cb19d048
Switch HSA_AMD_SVM_ATTRIB_READ_ONLY to
HSA_AMD_SVM_ATTRIB_READ_MOSTLY to match CUDA. The new attribute
was just exposed in ROCr/KFD.

Change-Id: I2ee522d33c347ba52a4e272d2cd7f67960490cf7
Change-Id: I132fa424cf9bec608e5c8429e93d20e78b76c6f0
For DD, send a NOP packet so that we leverage the handler to indicate
completion.

Change-Id: Ie57ea0124a8497d39cc49da1c4575c2cd86b9319
Revert back to using the Raven (gfx902) target ID for Raven 2 (gfx909).
This is due to the HSAIL compiler not supporting gfx909.

In theory there should be no issue with running Raven isa on Raven 2.

Change-Id: I425edebc99075799eda5522fad231b8fb3184873
If an AMD event contains a reference to a HW event, then the runtime
can check or wait for the HW event. The CPU status update will occur later,
after the HSA signal callback, but that does not matter for the result.

Change-Id: I591391a953bbdba6a25ac07e2cd98aeb17cd4596
Add a Purge function to MemObjMap

Change-Id: Iac51dfda9a7b7c45f2f4a0dc35f7a623121aba1a
…a host ptr

Change-Id: I530eb39104bbe727c3e38186f6db4e64285b3fc8
- Create an env var ROC_ACTIVE_WAIT_TIMEOUT to set active wait timeout
- Record profiling information if marker_ts_ property is valid.

Change-Id: If0d8aec8d9b0715027cf0f7c3dc8a4c722a6bae6
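
Reading a timeout such as ROC_ACTIVE_WAIT_TIMEOUT from the environment could look roughly like the sketch below. The commit message does not state the unit, the default, or how ROCclr parses the value, so the fallback parameter and the helper names `parse_timeout` and `active_wait_timeout` are assumptions for illustration only.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical parser: returns `fallback` when the text is missing,
// negative, not a plain decimal number, or out of range.
std::uint64_t parse_timeout(const char* text, std::uint64_t fallback) {
  if (text == nullptr || *text == '\0' || *text == '-') {
    return fallback;
  }
  char* end = nullptr;
  errno = 0;
  unsigned long long value = std::strtoull(text, &end, 10);
  if (errno != 0 || *end != '\0') {
    return fallback;
  }
  return static_cast<std::uint64_t>(value);
}

// Reads the variable named in the commit message; the unit is not
// specified there, so the caller decides how to interpret the number.
std::uint64_t active_wait_timeout(std::uint64_t fallback) {
  return parse_timeout(std::getenv("ROC_ACTIVE_WAIT_TIMEOUT"), fallback);
}
```

Keeping the parsing separate from the `std::getenv` call makes the validation easy to exercise without touching the process environment.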
Add Navi 24 support

Change-Id: I7343384cf6fb8c532321e57e202c196ef054f459
Change-Id: I9701fbab587e2ea31e58449e8c8b07341a7aa161
Add lock protection for signal processing
If signal is reused, then disable reference to it from HIP
Increase the pool signal size to 32

Change-Id: I7d529b35910f83ce577c9eca6d3386759611ccc0
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christophe Paquot <christophe.paquot@amd.com>
Change-Id: Iff0253a181bbfc1984304014a9e3b542b2556635
Reset hasPendingDispatch_ if we insert barrier for time marker.

Change-Id: Id038fd4e1c910c0a657978fee00630e49c372321
…, MI100 and MI200

Change-Id: I6f07036d8ee6e4c6b55196a13288f8107488d824
Sourabh Betigeri and others added 25 commits August 15, 2021 23:45
… particularly 'value' arg to unsigned 'value

Change-Id: I74b24b2dec911acd5e7a364ea8c050c2ecb1c3b8
- Device Reset should not purge the allocations that were not by the user
- Addresses QMCPack Test abort due to the removal of all the mem objects during reset

Change-Id: I7b7a123e72bcc985d7e51d17c2382bc618d3e041
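
A selective purge of the kind this commit describes might be sketched as follows. This is an illustration, not the ROCclr code: `MemObj`, its `user_allocated` flag, and `MemObjRegistry` are hypothetical stand-ins for the runtime's own types.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical record for one tracked allocation.
struct MemObj {
  std::size_t size;
  bool user_allocated;  // false for allocations made by the runtime itself
};

// Hypothetical registry keyed by device address.
class MemObjRegistry {
 public:
  void add(std::uintptr_t address, MemObj object) { objects_[address] = object; }

  // Removes only user allocations and returns how many were removed.
  std::size_t purge_user_allocations() {
    std::size_t removed = 0;
    for (auto it = objects_.begin(); it != objects_.end();) {
      if (it->second.user_allocated) {
        it = objects_.erase(it);  // erase returns the next valid iterator
        ++removed;
      } else {
        ++it;
      }
    }
    return removed;
  }

  std::size_t size() const { return objects_.size(); }

 private:
  std::map<std::uintptr_t, MemObj> objects_;
};
```

Runtime-owned entries survive the purge, which is the behavior the commit message says the device reset needs.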
Only add Roc path and don't use Pal path.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Change-Id: I7117e2dc3c3ad4c8d563e9bbdc721f70ddba51fd
Change-Id: I994f3e7c67ed29c4ee46229c8bcd1448fc7f59ec
Change-Id: Iba90a31f9c5d6d4f2b60b7ccf903325c03d4d245
The logic below allocates the host buffer whenever a subbuffer is created
from an SVM allocation. This is only needed for multi-device contexts.

HIP does not support multi-device contexts, hence this logic just ends
up performing unnecessary system allocations.

Change-Id: I8eae635f7c5289c52ef73434218c1658b788a456
Fixes error "All control paths should return a value".

Change-Id: I4718688b55b24862465e15ea0d64b32fa44b3299
Change-Id: I80be39ace9d93347f81ef8acd7858d43bc4a3f1e
…ncl.

Change-Id: I5c28e9c606dec1c956f3f48071d8a0271adfff22
This is a cherry-pick from the ASIC's branch.

Change-Id: Ic6e888f8fa96103d1e79432dd75e68faabd8cf6c
Change-Id: Ie3ad85a8335b1fc751812c09bb0cd30aad38dcae
Change-Id: Ib9e25a6beb97cc042bb3cc50338686a8dd09e21c
This module will be used to add any specific compiler options to ROCclr
and its clients.

Currently it only adds a workaround to remove the MSVC flag /GR, which
is added by default by CMake <3.20. This resolves the conflict with PAL
adding /GR-.

Change-Id: If83adb271bcec86812a6e9de940da3920fc75393
Note that this requires base driver CL#2340320+ to have SQ interrupt
functionality enabled by default.

Change-Id: I04b936819ebe1eb7cf5de1db4fafe83af3a1b5f6
…hsa_ext_image_descriptor_t

Change-Id: I0af0f09120f15a42349ec4de491df8aee7bfd46d
There is a possible race condition in which signal reuse can
access a destroyed Timestamp object, because the callback
was running asynchronously. Use a reference counter and a lock
to allow asynchronous timestamp updates.

Change-Id: I6224f7c62cb0a03a7466fcc512e5e5afb06736fa
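
The reference-counter-plus-lock pattern named in this commit can be sketched as below. The sketch swaps in `std::shared_ptr` for the runtime's own reference counting, and the `Timestamp` class here is a hypothetical stand-in, not the ROCclr type.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical timestamp object whose fields are guarded by a lock.
class Timestamp {
 public:
  void update(std::uint64_t start, std::uint64_t end) {
    std::lock_guard<std::mutex> guard(lock_);
    start_ = start;
    end_ = end;
  }
  std::uint64_t elapsed() const {
    std::lock_guard<std::mutex> guard(lock_);
    return end_ - start_;
  }

 private:
  mutable std::mutex lock_;
  std::uint64_t start_ = 0;
  std::uint64_t end_ = 0;
};

// The owner drops its reference while the callback is still pending.
// The callback's own reference keeps the object alive until it finishes.
std::uint64_t elapsed_after_owner_release() {
  auto owner = std::make_shared<Timestamp>();
  std::uint64_t elapsed = 0;

  std::thread callback([ref = owner, &elapsed] {
    ref->update(100, 250);  // safe: `ref` holds a counted reference
    elapsed = ref->elapsed();
  });

  owner.reset();    // models the signal being reused by the owner
  callback.join();  // the object is destroyed once the callback's copy is gone
  return elapsed;
}
```

Without the counted reference captured by the callback, `owner.reset()` could destroy the object before `update` runs, which is the use-after-destroy the commit describes.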
OpenCL headers 2021.04.29 moved `CL_COMMAND_GL_FENCE_SYNC_OBJECT_KHR` into
a different header.  (See issue referenced below for details.)

See-also: KhronosGroup/OpenCL-Headers#145
See-also: https://bugs.gentoo.org/790164
Author

devurandom commented Sep 6, 2021

To build hip 4.3.0 I also needed to apply the following changes:

rm amdocl/CL/cl{,_icd,_gl,_gl_ext,_platform}.h
sed -i 's/CL_EXT_SUFFIX/CL_API_SUFFIX/' \
  amdocl/CL/cl_icd_amd.h \
  amdocl/CL/cl_ext.h \
  rocclr/cl_lqdflash_amd.h

hip 4.3.0 needs some cl_amd_... symbols that it carries in amdocl/CL/cl_ext.h but that are not in https://github.com/KhronosGroup/OpenCL-Headers. AMD's OpenCL extension headers appear to have been moved somewhere else (I was unable to find where), so I would like some advice on how to proceed here.

devurandom added a commit to devurandom/gentoo-patches that referenced this pull request Sep 6, 2021
Contributor

vsytch commented Sep 7, 2021

The AMD OpenCL runtime unfortunately cannot work with the upstream OpenCL headers. There are a lot of issues currently blocking us from upgrading. Because of this, we cannot make any fixes to accommodate upstream changes right now.

Did you experience any build issues? The upstream OpenCL headers should not need to be involved in that.

Author

devurandom commented Sep 7, 2021

> The AMD OpenCL runtime unfortunately cannot work with the upstream OpenCL headers. There are a lot of issues currently blocking us from upgrading. Because of this, we cannot make any fixes to accommodate upstream changes right now.
>
> Did you experience any build issues? The upstream OpenCL headers should not need to be involved in that.

Yes, Gentoo users reported /usr/include/rocclr/platform/command.hpp:327:41: error: ‘CL_COMMAND_GL_FENCE_SYNC_OBJECT_KHR’ was not declared in this scope. That symbol comes from newer OpenCL headers.

See-also: https://bugs.gentoo.org/790164
Ebuild: https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-util/hip/hip-4.3.0.ebuild

@vsytch There seems to be some restructuring of OpenCL headers going on in the AMD repos. Could you please tell me where to find the latest version? Maybe it is easier to work with that...
