Fix double-release of memory objects #1277

Merged

Contributor

@jrprice jrprice commented Jun 18, 2021

A recent update to the object wrapper classes (#1268) changed the
behavior of assigning to a wrapper, whereby the wrapped object is now
released upon assignment. A couple of tests were manually calling
clReleaseMemObject and then assigning nullptr to the wrapper,
resulting in the wrapper calling clReleaseMemObject on an object that
had already been destroyed.
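
The failure mode can be reproduced without an OpenCL runtime. The sketch below is only an illustration: `FakeMem`, `fake_release`, and `MemWrapper` are hypothetical stand-ins for `cl_mem`, `clReleaseMemObject`, and the test suite's wrapper class, whose real implementation is not shown in this thread. It assumes that the way to avoid the problem is to let the assignment perform the only release.

```cpp
#include <cassert>

// Hypothetical stand-in for a cl_mem object, so the sketch runs without OpenCL.
struct FakeMem
{
    int refcount = 1;
    int releases_after_destroy = 0; // releases issued on an already-dead object
};

// Hypothetical stand-in for clReleaseMemObject.
inline void fake_release(FakeMem *m)
{
    assert(m != nullptr);
    if (m->refcount == 0)
        ++m->releases_after_destroy; // with a real object this is a double release
    else
        --m->refcount;
}

// Minimal wrapper with the behaviour described above:
// assigning to the wrapper releases the object it currently holds.
class MemWrapper {
public:
    explicit MemWrapper(FakeMem *m = nullptr): obj(m) {}
    ~MemWrapper() { reset(nullptr); }
    MemWrapper(const MemWrapper &) = delete;
    MemWrapper &operator=(const MemWrapper &) = delete;

    MemWrapper &operator=(FakeMem *m)
    {
        reset(m);
        return *this;
    }
    operator FakeMem *() const { return obj; }

private:
    void reset(FakeMem *m)
    {
        if (obj != nullptr) fake_release(obj);
        obj = m;
    }
    FakeMem *obj;
};

// The problematic pattern: manual release, then assign nullptr.
inline int buggy_pattern()
{
    FakeMem mem;
    MemWrapper w(&mem);
    fake_release(w); // manual release: refcount goes from 1 to 0
    w = nullptr;     // the wrapper releases the same object again
    return mem.releases_after_destroy;
}

// One way to avoid it: let the assignment do the only release.
inline int fixed_pattern()
{
    FakeMem mem;
    MemWrapper w(&mem);
    w = nullptr; // single release
    return mem.releases_after_destroy;
}
```

In this sketch `buggy_pattern()` records one release on a dead object and `fixed_pattern()` records none.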

@jrprice jrprice requested a review from mantognini June 18, 2021 18:56
mantognini previously approved these changes Jun 22, 2021
Contributor

@mantognini mantognini left a comment

Looks appropriate indeed, thanks for the follow up.

EwanC previously approved these changes Jul 5, 2021
Jeremy-Kemp previously approved these changes Jul 5, 2021
Contributor

@Jeremy-Kemp Jeremy-Kemp left a comment

LGTM!

I'm not sure how to force the Build macos-latest action to run to unblock merging?

@jrprice jrprice dismissed stale reviews from Jeremy-Kemp, EwanC, and mantognini via 726eaed July 5, 2021 14:18
@jrprice jrprice force-pushed the fix-mem-object-double-free branch from 8b490f2 to 726eaed Compare July 5, 2021 14:18
Contributor Author

jrprice commented Jul 5, 2021

> I'm not sure how to force the Build macos-latest action to run to unblock merging?

I rebased the PR since it's probably because the macOS config changed on the master branch.

Unfortunately this dismissed all the reviews :-/

@Jeremy-Kemp Jeremy-Kemp merged commit 4a03bb7 into KhronosGroup:master Jul 5, 2021
yanfeng3721 added a commit to yanfeng3721/OpenCL-CTS that referenced this pull request Apr 20, 2022
* Use macOS 10 in CI (KhronosGroup#1282)

macOS jobs frequently fail. Since macos-11.0 support is considered experimental,
move to macos-10, using macos-latest so we automatically move to 11 when
stable.

See https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners

Signed-off-by: Kevin Petit <kevin.petit@arm.com>

* Fix double-release of memory objects (KhronosGroup#1277)

A recent update to the object wrapper classes (KhronosGroup#1268) changed the
behavior of assigning to a wrapper, whereby the wrapped object is now
released upon assignment. A couple of tests were manually calling
clReleaseMemObject and then assigning `nullptr` to the wrapper,
resulting in the wrapper calling clReleaseMemObject on an object that
had already been destroyed.

Co-authored-by: Kévin Petit <kpet@free.fr>
Co-authored-by: James Price <jrprice@google.com>
yanfeng3721 added a commit to yanfeng3721/OpenCL-CTS that referenced this pull request Aug 29, 2022
* Use macOS 10 in CI (KhronosGroup#1282)

macOS jobs frequently fail. Since macos-11.0 support is considered experimental,
move to macos-10, using macos-latest so we automatically move to 11 when
stable.

See https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners

Signed-off-by: Kevin Petit <kevin.petit@arm.com>

* Fix double-release of memory objects (KhronosGroup#1277)

A recent update to the object wrapper classes (KhronosGroup#1268) changed the
behavior of assigning to a wrapper, whereby the wrapped object is now
released upon assignment. A couple of tests were manually calling
clReleaseMemObject and then assigning `nullptr` to the wrapper,
resulting in the wrapper calling clReleaseMemObject on an object that
had already been destroyed.

* Fix check for image support in test_basic sizeof (KhronosGroup#1269)

* add basic test for cl_khr_pci_bus_info (KhronosGroup#1227)

* add basic test for cl_khr_pci_bus_info

* correctly use TEST_SKIPPED_ITSELF

Co-authored-by: Kévin Petit <kpet@free.fr>

* fix related usage of TEST_SKIPPED_ITSELF

Co-authored-by: Kévin Petit <kpet@free.fr>

* Fix double release of object in test_api and test_gl (KhronosGroup#1287)

* Fix clang format only

* Fix double release of objects

* subgroups: Fix setting cl_halfs and progress check. (KhronosGroup#1278)

* subgroups: Fix setting cl_halfs and progress check.

cl_float testing uses set_value such that a generated cl_ulong of 1 is
stored as 1.0F in a logical sense. However, cl_half values aren't
intrinsic to C++ and generated cl_ulongs less than 1024 in particular
are interpreted bitwise as subnormals. The test fails on compute devices
lacking subnormal support. Perform the logical conversion to cl_half.
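
The difference between the bitwise and the logical interpretation can be sketched in plain C++. This is an illustration only: `half_bits` is a stand-in for `cl_half`, both function names are hypothetical, and the conversion handles only small integers that a half can represent exactly. The helper the tests actually use is not shown in this message.

```cpp
#include <cassert>
#include <cstdint>

using half_bits = std::uint16_t; // stand-in for cl_half (16-bit storage type)

// True if the bit pattern encodes a nonzero subnormal half:
// exponent field (bits 10-14) is zero and mantissa (bits 0-9) is nonzero.
inline bool is_subnormal_half(half_bits h)
{
    return (h & 0x7C00u) == 0 && (h & 0x03FFu) != 0;
}

// Logical conversion of a small integer to half bits. Values below 2048
// have at most 11 significant bits, so they are exactly representable.
inline half_bits half_from_small_uint(std::uint64_t n)
{
    assert(n < 2048);
    if (n == 0) return 0;
    int msb = 0;
    while ((n >> (msb + 1)) != 0) ++msb; // index of the highest set bit
    // Drop the implicit leading 1 and left-align the rest in 10 mantissa bits.
    std::uint64_t mantissa = (n << (10 - msb)) & 0x03FFu;
    std::uint64_t exponent = static_cast<std::uint64_t>(msb) + 15; // bias 15
    return static_cast<half_bits>((exponent << 10) | mantissa);
}
```

Stored bitwise, the integer 1 is the pattern 0x0001, which is a subnormal; converted logically it becomes 0x3C00, the normal encoding of 1.0. Every bit pattern from 1 to 1023 has a zero exponent field, which is why generated values below 1024 are affected.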

Fix independent forward progress check.

* subgroups_half: Address review comments

* subgroups_half: Formatting fixes required by check-format

* subgroups_half: Modified to query and use rounding mode supported by device

Co-authored-by: spauls <spauls@qti.qualcomm.com>

* Add tests for entrypoint cl_khr_suggested_local_work_size (KhronosGroup#1264)

* Add tests for entrypoint cl_khr_suggested_local_work_size

Tests added within test_conformance/workgroups. The tests cover several
shapes (num dimensions) and sizes of global work size, kernels using
local memory (dynamic and static) and present/non-present global work
offset.

Signed-off-by: Kallia Chronaki <kallia.chronaki@arm.com>

* Fix in comparison for error checking

Signed-off-by: Kallia Chronaki <kallia.chronaki@arm.com>

* 'test_wg_suggested_local_work_size' fixes

* Refactoring of 'test_wg_suggested_local_work_size'

Modifications to reduce code duplication and minimize build time

* remove testing for scalar vloada_half (KhronosGroup#1293)

* Temporarily disable the test_kernel_attributes test case (KhronosGroup#1297)

* Temporarily disable the test_kernel_attributes test case

Per OpenCL spec on CL_KERNEL_ATTRIBUTES, for kernels not created from OpenCL C
source and the clCreateProgramWithSource API call the string returned from this
query will be empty.
However, the test_kernel_attributes test reads from a bc binary and expects to get
kernel attributes, which is not consistent with the OpenCL spec.

* Fix clang format issue

* Fix double free in c11_atomics tests for SVM allocations (KhronosGroup#1286)

* Only Clang format changes

* Fix double free object for SVM allocations

* Fix double free - review fixes

* Fix kernel source for cl_khr_suggested_local_work_size (KhronosGroup#1300)

Use ASCII '-' instead of unicode '–' as subtraction operator.

Signed-off-by: Kévin Petit <kpet@free.fr>

* Remove unused definitions in CMakeLists.txt (KhronosGroup#1302)

Signed-off-by: Kévin Petit <kpet@free.fr>

* add tests for cl_khr_integer_dot_product (KhronosGroup#1276)

* cl_khr_integer_dot_product_tests

* remove emulated codepaths

* fix formatting

* address code review comments

* remove emulated codepaths again

* address one more review comment

* define NOMINMAX in the CMakefile to fix std::min and std::max on MSVC (KhronosGroup#1308)

* Report failures in simple_{read,write}_image_pitch tests (KhronosGroup#1309)

* Add cl_khr_integer_dot_product to known extensions in test compiler. (KhronosGroup#1316)

* suppress MSVC strdup warning (KhronosGroup#1314)

* Add missing include for gRandomSeed (KhronosGroup#1307)

* Limit workgroup size for atomics tests (KhronosGroup#1197)

* Limit workgroup size for atomics tests

This avoids extremely large local buffer sizes and slow runs.

* Always limit workgroup size

* Fix memory model issue in `atomic_flag`. (KhronosGroup#1283)

* Fix memory model issue in atomic_flag.

In atomic_flag sub-tests that modify local memory, compilers may re-order memory accesses between the local and global address spaces which can lead to incorrect test failures.

This commit ensures that both local and global memory operations are fenced to prevent this re-ordering from occurring.

Fixes KhronosGroup#134.

* Clang format changes.

* Added missing global acquire which is necessary for the corresponding global release.

Thanks to @jlewis-austin for spotting.

* Clang format changes.

* Match the condition for applying acquire/release fences.

* remove min max macros (KhronosGroup#1310)

* remove the MIN and MAX macros and use the std versions instead

* fix formatting

* fix Arm build

* remove additional MIN and MAX macros from compat.h
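
One common reason to prefer `std::min` and `std::max` over function-like macros is that a macro can evaluate its arguments more than once. The sketch below illustrates this under an assumption: `SKETCH_MIN` is a typical definition written for this example, not the definition that was removed from the test suite. On MSVC, `<windows.h>` also defines `min` and `max` macros unless `NOMINMAX` is defined, which is the reason for the related NOMINMAX change listed above.

```cpp
#include <algorithm>
#include <cassert>

// Illustrative function-like macro (assumed shape, not the suite's own).
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

// Counts how often the argument expression runs when passed to the macro.
inline int macro_evaluations()
{
    int calls = 0;
    auto next = [&calls]() { return ++calls; };
    int r = SKETCH_MIN(next(), 10); // next() appears twice in the expansion
    (void)r;
    return calls;
}

// Counts how often the argument expression runs when passed to std::min.
inline int function_evaluations()
{
    int calls = 0;
    auto next = [&calls]() { return ++calls; };
    int r = std::min(next(), 10); // arguments are evaluated exactly once
    (void)r;
    return calls;
}
```

With the macro the side effect runs twice; with `std::min` it runs once.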

* gles: Fix double frees. (KhronosGroup#1323)

* gles: Fix double frees.

Remove a few explicit frees in the redirect_buffers test which are
already handled by a wrapper.

* gles: Fix double frees

A recent update to the object wrapper classes (KhronosGroup#1268) changed the
behavior of assigning to a wrapper, whereby the wrapped object is now
released upon assignment. A couple of tests were manually calling
clReleaseMemObject and then assigning `nullptr` to the wrapper,
resulting in the wrapper calling clReleaseMemObject on an object that
had already been destroyed.

Co-authored-by: spauls <spauls@qti.qualcomm.com>

* api: Enable cl_khr_fp16 when using half types in kernel (KhronosGroup#1327)

* Update cl_khr_integer_dot_product tests for v2 (KhronosGroup#1317)

* Update cl_khr_integer_dot_product tests for v2

Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Signed-off-by: Marco Cattani <marco.cattani@arm.com>
Change-Id: I97dbd820f1f32f6b377e47d0bf638f36bb91930a

* only query acceleration properties with v2+

Change-Id: I3f13a0cba7f1f686365b10adf81690e089cd3d74

* Report unsupported extended subgroup tests as skipped rather than passed (KhronosGroup#1301)

* Report unsupported extended subgroup tests as skipped rather than passed

Also don't check the presence of extensions for each sub-test.

Signed-off-by: Kévin Petit <kpet@free.fr>

* address review comments

* Extended subgroups - use 128bit masks (KhronosGroup#1215)

* Extended subgroups - use 128bit masks

* Refactoring to avoid kernels code duplication

* unify kernel names as test_ prefix + subgroup function name
* use string literals to improve readability
* use kernel templates to limit code duplication
* WorkGroupParams allows defining a default kernel (a kernel template for multiple functions)
* WorkGroupParams allows defining a kernel for one specific subgroup function

Co-authored-by: Stuart Brady <stuart.brady@arm.com>

* Remove space character from extension name (KhronosGroup#1336)

* Add testing of sub_group_broadcast for (u)char and (u)short types (KhronosGroup#1347)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Remove excessive logging in subgroup tests (KhronosGroup#1343)

This also adds some missing data type logging to the
subgroup_functions_non_uniform_vote tests.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve error handling in subgroup tests (KhronosGroup#1352)

* MPGCOMP-14761 Improve error handling in subgroup tests

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Add missing newline

* Clean up logging in cl_khr_subgroup_ballot tests (KhronosGroup#1351)

The tests were logging scalar results as vectors padded with zeroes for
no apparent benefit.  Fix this.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix missing cl_khr_semaphore extensions in compiler tests (KhronosGroup#1357)

* Added missing extensions related to cl_khr_semaphore

Signed-off-by: Marco Cattani <marco.cattani@arm.com>

* Fix stack-use-after-scope crash in conversions (KhronosGroup#1358)

The way that program sources were being constructed involved capturing
pointers to strings that were allocated on the stack, and then trying
to use them outside of that scope. This change uses a stringstream
defined in the outer scope to build the program instead.
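
A minimal sketch of the approach described above, not the actual test code: `build_program_source` and its parameters are hypothetical, and the kernel text is only an example. The stream is declared in the outer scope and copies every fragment, so nothing refers to a string after its block has ended.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds kernel source for a hypothetical conversion kernel.
inline std::string build_program_source(const std::string &in_type,
                                        const std::string &out_type,
                                        bool needs_fp64)
{
    std::ostringstream source; // lives for the whole function
    if (needs_fp64)
    {
        // This string dies at the end of the block. Keeping line.c_str()
        // in an outer array of pointers would leave a dangling pointer.
        std::string line = "#pragma OPENCL EXTENSION cl_khr_fp64 : enable\n";
        source << line; // the stream copies the characters
    }
    source << "__kernel void test(__global " << in_type << " *src, __global "
           << out_type << " *dst)\n"
           << "{\n"
           << "    size_t i = get_global_id(0);\n"
           << "    dst[i] = convert_" << out_type << "(src[i]);\n"
           << "}\n";
    return source.str();
}
```

The returned `std::string` owns its contents, so a caller can pass `c_str()` to an API such as `clCreateProgramWithSource` for as long as that string is alive.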

* Use maximum subgroup size in sub_group_ballot tests (KhronosGroup#1344)

sub_group_ballot_bit_count() and sub_group_ballot_find_msb() mask
their input according to a subgroup size, which is assumed to be the
maximum subgroup size, and not the actual subgroup size excluding
non-existent work-items in the "remainder" subgroup.

Fix this as per the clarification made to the OpenCL C specification
in revision 3.0.9 for issue KhronosGroup/OpenCL-Docs#626 by pull request
KhronosGroup/OpenCL-Docs#689.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>
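
To illustrate the effect of the masking size, here is a host-side sketch of a reference bit count. It is an assumption-laden example: `ballot_t` stands in for the `uint4` ballot value, and the function names are hypothetical rather than taken from the test sources.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

using ballot_t = std::bitset<128>; // stand-in for the uint4 ballot value

// Keeps only bits [0, subgroup_size) of the ballot.
inline ballot_t mask_to_subgroup(const ballot_t &ballot,
                                 std::size_t subgroup_size)
{
    assert(subgroup_size <= ballot.size());
    ballot_t mask;
    for (std::size_t i = 0; i < subgroup_size; ++i) mask.set(i);
    return ballot & mask;
}

// Reference bit count after masking with the given subgroup size.
inline std::size_t reference_bit_count(const ballot_t &ballot,
                                       std::size_t subgroup_size)
{
    return mask_to_subgroup(ballot, subgroup_size).count();
}
```

For an all-ones input with a maximum subgroup size of 32 and a remainder subgroup of 20 work-items, masking with the maximum size counts 32 bits while masking with the actual size counts 20, so the two choices give different expected results.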

* Fix conversion data loss in test_api min_max_constant_args (KhronosGroup#1355)

* Subgroups tests - sub_group_non_uniform_scan_exclusive function fixes (KhronosGroup#1350)

* Fix - comparing results will never happen.

* No special action needed for one work item in the subgroup

* Remove unused inclusion of <cstdio> (KhronosGroup#1362)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Tidy up code to determine bit mask for ballot scans (KhronosGroup#1363)

It seems more intuitive to set only the bits that are required, rather
than to set one more bit than is required, only to clear it again.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>
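
The two styles can be compared in a small sketch. The original code is not quoted here, so this assumes the mask in question covers the work-items below a given id; `mask_t` and both function names are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

using mask_t = std::bitset<128>;

// Roundabout form: set bits [0, id], then clear bit id again.
inline mask_t exclusive_mask_roundabout(std::size_t id)
{
    assert(id < 128);
    mask_t m;
    for (std::size_t i = 0; i <= id; ++i) m.set(i);
    m.reset(id);
    return m;
}

// Direct form: set only bits [0, id).
inline mask_t exclusive_mask_direct(std::size_t id)
{
    assert(id < 128);
    mask_t m;
    for (std::size_t i = 0; i < id; ++i) m.set(i);
    return m;
}
```

Both functions produce the same mask; the direct form simply avoids the extra set-and-clear step.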

* Test api min max - fix printing cl_ulong data type (KhronosGroup#1212)

* test api - fix code formatting only

* Fix printing cl_ulong type to avoid overloading.

* Fix printing size_t data type

* Fix printing size_t data type - set unsigned

* Fix formatting for maxArgs (uint) and numberOfInts (size_t)
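
As an illustration of portable format specifiers for these types, the sketch below formats a 64-bit unsigned value and a `size_t`. It assumes `std::uint64_t` as a stand-in for `cl_ulong`; `format_limits` and its parameter names are hypothetical and the message text is made up.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit unsigned value with PRIu64 and a size_t with %zu.
inline std::string format_limits(std::uint64_t max_alloc,
                                 std::size_t number_of_ints)
{
    char buf[128];
    int n = std::snprintf(buf, sizeof(buf),
                          "max alloc: %" PRIu64 ", ints: %zu", max_alloc,
                          number_of_ints);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
    (void)n;
    return std::string(buf);
}
```

Using `PRIu64` and `%zu` avoids relying on `%lu` or `%llu` matching the platform's 64-bit and `size_t` types.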

* Fix build, glext should not be used with GLEW (KhronosGroup#1337)

* Fix build, glext should not be used with GLEW

* Remove additional define GL_GLEXT_PROTOTYPES

* Remove includes which are already defined in setup.h

* Add cl_khr_command_buffer to list of extensions (KhronosGroup#1365)

cl_khr_command_buffer is now public as a provisional khr extension
which implementations may report.

* Refactor logging of subgroup test start/pass messages (KhronosGroup#1361)

Note that this also corrects the start messages logged for the
sub_group_ballot_bit_count/find_msb/find_lsb tests.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Remove dead threading code (KhronosGroup#1339)

Remove unused code that hasn't been used for the last three years
and isn't included in makefiles.

Co-authored-by: oramirez <oramirez@qti.qualcomm.com>

* test_subgroups - Set safe input values for half type and mul, add operations (KhronosGroup#1346)

* Set safe input values for half type and mul, add operations

* Set safe values for all data types

* Typo fix

* Set constant seed for shuffle

* Change function name to more specific

* set_value takes an integer value, not a bit pattern

* Remove invalid negative_get_platform_info testcase (KhronosGroup#1374)

* Remove invalid negative_get_platform_info testcase

* Implementations are only required to do null checks
* Fixes KhronosGroup#1318

* Fix formatting

* Fix test_api get_command_queue_info (KhronosGroup#1324)

* Fix test_api get_command_queue_info

Decouple host and device out-of-order test enabling

* Rename property sets more generically

* Refactor to use std::vector to accumulate test permutations

* Fix memory leaks (KhronosGroup#1378)

* Fix memory leaks

Fixed memory leaks in: buffers, basic, and vectors

* Formatting fixes

Co-authored-by: oramirez <oramirez@qti.qualcomm.com>

* Refactor divergence mask handling in subgroup tests (KhronosGroup#1379)

This changes compilation of subgroup test kernels so that a separate
compilation is no longer performed for each divergence mask value.

The divergence mask is now passed as a kernel argument.

This also fixes all subgroup_functions_non_uniform_arithmetic testing
and the sub_group_elect and sub_group_any/all_equal subtests of the
subgroup_functions_non_uniform_vote test to use the correct order of
vector components for GPUs with a subgroup size greater than 64.

The conversion of divergence mask bitsets to uint4 vectors has been
corrected to match code comments in WorkGroupParams::load_masks()
in test_conformance/subgroups/subhelpers.h.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve testing of sub_group_ballot (KhronosGroup#1382)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve testing of kernel arg info in pipe_info test (KhronosGroup#1326)

The test now checks that CL_KERNEL_ARG_INFO_NOT_AVAILABLE is returned
when calling clGetKernelArgInfo() with offline compilation modes.

The correct function name is printed if clGetKernelArgInfo() fails
when using online compilation (and not "clSetKernelArgInfo()").

When using online compilation, if the actual arg type is not as
expected, the actual arg type is now logged, and the return value
is now TEST_FAIL (-1) as per other failures (and not 1).

All other test pass/fail values used in the test now use TEST_PASS
and TEST_FAIL instead of 0 and -1 literals.

An unnecessary cast of pipe_kernel_code has been removed.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Sync submission_details with conformance doc v26 (KhronosGroup#1389)

Add "Patches" field

Co-authored-by: Kévin Petit <kpet@free.fr>
Co-authored-by: James Price <jrprice@google.com>
Co-authored-by: BKoscielak <bartosz.koscielak@intel.com>
Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>
Co-authored-by: Grzegorz Wawiorko <grzegorz.wawiorko@intel.com>
Co-authored-by: Sreelakshmi Haridas Maruthur <sharidas@quicinc.com>
Co-authored-by: spauls <spauls@qti.qualcomm.com>
Co-authored-by: kalchr01 <83217667+kalchr01@users.noreply.github.com>
Co-authored-by: Feng Zou <feng.zou@intel.com>
Co-authored-by: Senran (Stephen) Zhang <senran.zhang@intel.com>
Co-authored-by: Jeremy Kemp <jeremy@jeremykemp.co.uk>
Co-authored-by: Stuart Brady <stuart.brady@arm.com>
Co-authored-by: marcat03 <94451804+marcat03@users.noreply.github.com>
Co-authored-by: Ewan Crawford <ewan@codeplay.com>
Co-authored-by: oramirez <oramirez@qti.qualcomm.com>
Co-authored-by: Jim Lewis <j.lewis1@samsung.com>
yanfeng3721 added a commit to yanfeng3721/OpenCL-CTS that referenced this pull request Oct 18, 2023
* Use macOS 10 in CI (KhronosGroup#1282)

macOS jobs frequently fail. Since macos-11.0 support is considered experimental,
move to macos-10, using macos-latest so we automatically move to 11 when
stable.

See https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners

Signed-off-by: Kevin Petit <kevin.petit@arm.com>

* Fix double-release of memory objects (KhronosGroup#1277)

A recent update to the object wrapper classes (KhronosGroup#1268) changed the
behavior of assigning to a wrapper, whereby the wrapped object is now
released upon assignment. A couple of tests were manually calling
clReleaseMemObject and then assigning `nullptr` to the wrapper,
resulting in the wrapper calling clReleaseMemObject on an object that
had already been destroyed.
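
The failure mode described above can be sketched with a small mock. `MockMem`, `mock_release`, and `Wrapper` are hypothetical stand-ins for a `cl_mem` object, `clReleaseMemObject`, and the CTS wrapper classes; they only assume the behavior stated in the message, namely that assigning to a wrapper releases the object it currently holds.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted OpenCL memory object.
struct MockMem
{
    int releases = 0; // how many times the object has been released
};

// Stand-in for clReleaseMemObject: counts calls instead of freeing anything.
inline void mock_release(MockMem *m)
{
    assert(m != nullptr);
    m->releases++;
}

// Simplified wrapper that releases the held object on assignment and on
// destruction.
class Wrapper {
public:
    explicit Wrapper(MockMem *m = nullptr): obj(m) {}
    ~Wrapper() { reset(nullptr); }
    Wrapper(const Wrapper &) = delete;
    Wrapper &operator=(const Wrapper &) = delete;

    Wrapper &operator=(MockMem *m)
    {
        reset(m);
        return *this;
    }

    operator MockMem *() const { return obj; }

private:
    void reset(MockMem *m)
    {
        if (obj != nullptr) mock_release(obj);
        obj = m;
    }

    MockMem *obj;
};

// Old test pattern: manual release followed by assigning nullptr.
inline int releases_with_manual_call()
{
    MockMem mem;
    Wrapper w(&mem);
    mock_release(w); // manual release
    w = nullptr; // the wrapper releases the same object a second time
    return mem.releases;
}

// Fixed pattern: let the wrapper perform the only release.
inline int releases_with_wrapper_only()
{
    MockMem mem;
    Wrapper w(&mem);
    w = nullptr;
    return mem.releases;
}
```

In this mock the old pattern records two releases for a single object, while the fixed pattern records one. With a real `cl_mem`, the second release would act on an object that had already been destroyed.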

* Fix check for image support in test_basic sizeof (KhronosGroup#1269)

* add basic test for cl_khr_pci_bus_info (KhronosGroup#1227)

* add basic test for cl_khr_pci_bus_info

* correctly use TEST_SKIPPED_ITSELF

Co-authored-by: Kévin Petit <kpet@free.fr>

* fix related usage of TEST_SKIPPED_ITSELF

Co-authored-by: Kévin Petit <kpet@free.fr>

* Fix double release of object in test_api and test_gl (KhronosGroup#1287)

* Fix clang format only

* Fix double release of objects

* subgroups: Fix setting cl_halfs and progress check. (KhronosGroup#1278)

* subgroups: Fix setting cl_halfs and progress check.

cl_float testing uses set_value such that a generated cl_ulong of 1 is
stored as 1.0F in a logical sense. However, cl_half values aren't
intrinsic to C++ and generated cl_ulongs less than 1024 in particular
are interpreted bitwise as subnormals. The test fails on compute devices
lacking subnormal support. Perform the logical conversion to cl_half.
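
To make the bitwise-versus-logical distinction concrete, here is a minimal sketch that works on raw IEEE 754 binary16 bit patterns. Both helper names are hypothetical, and the conversion only handles integers below 2048 (all exactly representable in half precision); it is not the conversion routine used by the tests.

```cpp
#include <cassert>
#include <cstdint>

// True if the binary16 bit pattern encodes a subnormal value:
// exponent field all zeros and a non-zero mantissa.
inline bool half_bits_are_subnormal(uint16_t bits)
{
    return (bits & 0x7C00u) == 0 && (bits & 0x03FFu) != 0;
}

// Logical conversion of a small integer (0..2047) to binary16 bits.
inline uint16_t half_bits_from_small_int(uint32_t v)
{
    assert(v < 2048);
    if (v == 0) return 0;

    int exp = 0;
    while ((v >> (exp + 1)) != 0) exp++; // index of the highest set bit

    // Shift the value so the leading one sits just above the 10-bit
    // mantissa field, then drop that implicit leading one.
    const uint32_t mantissa = (v << (10 - exp)) & 0x03FFu;
    return static_cast<uint16_t>(((exp + 15) << 10) | mantissa);
}
```

Stored bitwise, the integer 1 is the pattern 0x0001, which is a subnormal; converted logically, it becomes 0x3C00, which is 1.0 and a normal value.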

Fix independent forward progress check.

* subgroups_half: Address review comments

* subgroups_half: Formatting fixes required by check-format

* subgroups_half: Modified to query and use rounding mode supported by device

Co-authored-by: spauls <spauls@qti.qualcomm.com>

* Add tests for entrypoint cl_khr_suggested_local_work_size (KhronosGroup#1264)

* Add tests for entrypoint cl_khr_suggested_local_work_size

Tests added within test_conformance/workgroups. The tests cover several
shapes (num dimensions) and sizes of global work size, kernels using
local memory (dynamic and static) and present/non-present global work
offset.

Signed-off-by: Kallia Chronaki <kallia.chronaki@arm.com>

* Fix in comparison for error checking

Signed-off-by: Kallia Chronaki <kallia.chronaki@arm.com>

* 'test_wg_suggested_local_work_size' fixes

* Refactoring of 'test_wg_suggested_local_work_size'

Modifications to reduce code duplication and minimize build time

* remove testing for scalar vloada_half (KhronosGroup#1293)

* Temporarily disable the test_kernel_attributes test case (KhronosGroup#1297)

* Temporarily disable the test_kernel_attributes test case

Per OpenCL spec on CL_KERNEL_ATTRIBUTES, for kernels not created from OpenCL C
source and the clCreateProgramWithSource API call the string returned from this
query will be empty.
But the test_kernel_attributes test reads from a bc binary and expects to get
kernel attributes, which is not consistent with the OpenCL spec.

* Fix clang format issue

* Fix double free in c11_atomics tests for SVM allocations (KhronosGroup#1286)

* Only Clang format changes

* Fix double free object for SVM allocations

* Fix double free - review fixes

* Fix kernel source for cl_khr_suggested_local_work_size (KhronosGroup#1300)

Use ASCII '-' instead of unicode '–' as subtraction operator.

Signed-off-by: Kévin Petit <kpet@free.fr>

* Remove unused definitions in CMakeLists.txt (KhronosGroup#1302)

Signed-off-by: Kévin Petit <kpet@free.fr>

* add tests for cl_khr_integer_dot_product (KhronosGroup#1276)

* cl_khr_integer_dot_product_tests

* remove emulated codepaths

* fix formatting

* address code review comments

* remove emulated codepaths again

* address one more review comment

* define NOMINMAX in the CMakefile to fix std::min and std::max on MSVC (KhronosGroup#1308)

* Report failures in simple_{read,write}_image_pitch tests (KhronosGroup#1309)

* Add cl_khr_integer_dot_product to known extensions in test compiler. (KhronosGroup#1316)

* suppress MSVC strdup warning (KhronosGroup#1314)

* Add missing include for gRandomSeed (KhronosGroup#1307)

* Limit workgroup size for atomics tests (KhronosGroup#1197)

* Limit workgroup size for atomics tests

This avoids extremely large local buffer size and slow run

* Always limit workgroup size

* Fix memory model issue in `atomic_flag`. (KhronosGroup#1283)

* Fix memory model issue in atomic_flag.

In atomic_flag sub-tests that modify local memory, compilers may re-order memory accesses between the local and global address spaces which can lead to incorrect test failures.

This commit ensures that both local and global memory operations are fenced to prevent this re-ordering from occurring.

Fixes KhronosGroup#134.

* Clang format changes.

* Added missing global acquire which is necessary for the corresponding global release.

Thanks to @jlewis-austin for spotting.

* Clang format changes.

* Match the condition for applying acquire/release fences.

* remove min max macros (KhronosGroup#1310)

* remove the MIN and MAX macros and use the std versions instead

* fix formatting

* fix Arm build

* remove additional MIN and MAX macros from compat.h

* gles: Fix double frees. (KhronosGroup#1323)

* gles: Fix double frees.

Remove a few explicit frees in the redirect_buffers test which are
already handled by a wrapper.

* gles: Fix double frees

A recent update to the object wrapper classes (KhronosGroup#1268) changed the
behavior of assigning to a wrapper, whereby the wrapped object is now
released upon assignment. A couple of tests were manually calling
clReleaseMemObject and then assigning `nullptr` to the wrapper,
resulting in the wrapper calling clReleaseMemObject on an object that
had already been destroyed.

Co-authored-by: spauls <spauls@qti.qualcomm.com>

* api: Enable cl_khr_fp16 when using half types in kernel (KhronosGroup#1327)

* Update cl_khr_integer_dot_product tests for v2 (KhronosGroup#1317)

* Update cl_khr_integer_dot_product tests for v2

Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Signed-off-by: Marco Cattani <marco.cattani@arm.com>
Change-Id: I97dbd820f1f32f6b377e47d0bf638f36bb91930a

* only query acceleration properties with v2+

Change-Id: I3f13a0cba7f1f686365b10adf81690e089cd3d74

* Report unsupported extended subgroup tests as skipped rather than passed (KhronosGroup#1301)

* Report unsupported extended subgroup tests as skipped rather than passed

Also don't check the presence of extensions for each sub-test.

Signed-off-by: Kévin Petit <kpet@free.fr>

* address review comments

* Extended subgroups - use 128bit masks (KhronosGroup#1215)

* Extended subgroups - use 128bit masks

* Refactoring to avoid kernels code duplication

* unify kernel names as test_ prefix + subgroup function name
* use string literals that improve readability
* use kernel templates that limit code duplication
* WorkGroupParams allows defining a default kernel - a kernel template for multiple functions
* WorkGroupParams allows defining a kernel for one specific subgroup function

Co-authored-by: Stuart Brady <stuart.brady@arm.com>

* Remove space character from extension name (KhronosGroup#1336)

* Add testing of sub_group_broadcast for (u)char and (u)short types (KhronosGroup#1347)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Remove excessive logging in subgroup tests (KhronosGroup#1343)

This also adds some missing data type logging to the
subgroup_functions_non_uniform_vote tests.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve error handling in subgroup tests (KhronosGroup#1352)

* MPGCOMP-14761 Improve error handling in subgroup tests

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Add missing newline

* Clean up logging in cl_khr_subgroup_ballot tests (KhronosGroup#1351)

The tests were logging scalar results as vectors padded with zeroes for
no apparent benefit.  Fix this.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix missing cl_khr_semaphore extensions in compiler tests (KhronosGroup#1357)

* Added missing extensions related to cl_khr_semaphore

Signed-off-by: Marco Cattani <marco.cattani@arm.com>

* Fix stack-use-after-scope crash in conversions (KhronosGroup#1358)

The way that program sources were being constructed involved capturing
pointers to strings that were allocated on the stack, and then trying
to use them outside of that scope. This change uses a stringstream
defined in the outer scope to build the program instead.
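
The pattern described above might look roughly like the following sketch. The helper name `build_conversion_source`, its parameters, and the kernel text are assumptions made for illustration and are not copied from the conversions test.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper; names and kernel text are illustrative only.
inline std::string build_conversion_source(const std::string &in_type,
                                           const std::string &out_type,
                                           bool enable_fp64)
{
    assert(!in_type.empty() && !out_type.empty());

    // Declared in the outer scope, so it outlives every block below.
    std::ostringstream source;

    if (enable_fp64)
    {
        // This string is destroyed at the end of the block. Keeping
        // pragma.c_str() for later use would leave a dangling pointer;
        // streaming it copies the characters instead.
        const std::string pragma =
            "#pragma OPENCL EXTENSION cl_khr_fp64 : enable\n";
        source << pragma;
    }

    source << "__kernel void test(__global " << in_type << " *src, __global "
           << out_type << " *dst)\n"
           << "{\n"
           << "    size_t i = get_global_id(0);\n"
           << "    dst[i] = convert_" << out_type << "(src[i]);\n"
           << "}\n";

    return source.str();
}
```

Because the stream owns a copy of every fragment, no pointer to a block-local string survives past the block that created it.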

* Use maximum subgroup size in sub_group_ballot tests (KhronosGroup#1344)

sub_group_ballot_bit_count() and sub_group_ballot_find_msb() mask
their input according to a subgroup size, which is assumed to be the
maximum subgroup size, and not the actual subgroup size excluding
non-existent work-items in the "remainder" subgroup.

Fix this as per the clarification made to the OpenCL C specification
in revision 3.0.9 for issue KhronosGroup/OpenCL-Docs#626 by pull request
KhronosGroup/OpenCL-Docs#689.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix conversion data loss in test_api min_max_constant_args (KhronosGroup#1355)

* Subgroups tests - sub_group_non_uniform_scan_exclusive function fixes (KhronosGroup#1350)

* Fix - comparing results will never happen.

* No special action needed for one work item in the subgroup

* Remove unused inclusion of <cstdio> (KhronosGroup#1362)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Tidy up code to determine bit mask for ballot scans (KhronosGroup#1363)

It seems more intuitive to set only the bits that are required, rather
than to set one more bit than is required, only to clear it again.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Test api min max - fix printing cl_ulong data type (KhronosGroup#1212)

* test api - fix code formatting only

* Fix printing cl_ulong type to avoid overloading.

* Fix printing size_t data type

* Fix printing size_t data type - set unsigned

* Fix formatting for maxArgs (uint) and numberOfInts (size_t)

* Fix build, glext should not be used with GLEW (KhronosGroup#1337)

* Fix build, glext should not be used with GLEW

* Remove additional define GL_GLEXT_PROTOTYPES

* Remove includes which are already defined in setup.h

* Add cl_khr_command_buffer to list of extensions (KhronosGroup#1365)

cl_khr_command_buffer is now public as a provisional khr extension
which implementations may report.

* Refactor logging of subgroup test start/pass messages (KhronosGroup#1361)

Note that this also corrects the start messages logged for the
sub_group_ballot_bit_count/find_msb/find_lsb tests.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Remove dead threading code (KhronosGroup#1339)

Remove unused code that hasn't been used for the last three years
and isn't included in makefiles.

Co-authored-by: oramirez <oramirez@qti.qualcomm.com>

* test_subgroups - Set safe input values for half type and mul, add operations (KhronosGroup#1346)

* Set safe input values for half type and mul, add operations

* Set safe values for all data types

* Typo fix

* Set constant seed for shuffle

* Change function name to more specific

* set_value takes an integer value, not a bit pattern

* Remove invalid negative_get_platform_info testcase (KhronosGroup#1374)

* Remove invalid negative_get_platform_info testcase

* Implementations are only required to do null checks
* Fixes KhronosGroup#1318

* Fix formatting

* Fix test_api get_command_queue_info (KhronosGroup#1324)

* Fix test_api get_command_queue_info

Decouple host and device out-of-order test enabling

* Rename property sets more generically

* Refactor to use std::vector to accumulate test permutations

* Fix memory leaks (KhronosGroup#1378)

* Fix memory leaks

Fixed memory leaks in: buffers, basic, and vectors

* Formatting fixes

Co-authored-by: oramirez <oramirez@qti.qualcomm.com>

* Refactor divergence mask handling in subgroup tests (KhronosGroup#1379)

This changes compilation of subgroup test kernels so that a separate
compilation is no longer performed for each divergence mask value.

The divergence mask is now passed as a kernel argument.

This also fixes all subgroup_functions_non_uniform_arithmetic testing
and the sub_group_elect and sub_group_any/all_equal subtests of the
subgroup_functions_non_uniform_vote test to use the correct order of
vector components for GPUs with a subgroup size greater than 64.

The conversion of divergence mask bitsets to uint4 vectors has been
corrected to match code comments in WorkGroupParams::load_masks()
in test_conformance/subgroups/subhelpers.h.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve testing of sub_group_ballot (KhronosGroup#1382)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve testing of kernel arg info in pipe_info test (KhronosGroup#1326)

The test now checks that CL_KERNEL_ARG_INFO_NOT_AVAILABLE is returned
when calling clGetKernelArgInfo() with offline compilation modes.

The correct function name is printed if clGetKernelArgInfo() fails
when using online compilation (and not "clSetKernelArgInfo()").

When using online compilation, if the actual arg type is not as
expected, the actual arg type is now logged, and the return value
is now TEST_FAIL (-1) as per other failures (and not 1).

All other test pass/fail values used in the test now use TEST_PASS
and TEST_FAIL instead of 0 and -1 literals.

An unnecessary cast of pipe_kernel_code has been removed.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Sync submission_details with conformance doc v26 (KhronosGroup#1389)

Add "Patches" field

* Refactor kernel execution in subgroup tests (KhronosGroup#1391)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Update format script and drop Travis badge for branch rename (KhronosGroup#1393)

`master` is now `main`, so update `check-format.sh` accordingly.

Also completely drop the Travis badge as we now use GitHub actions.  There is
no replacement badge as the current action is pre-submission, not
post-submission.

* Added simple test for CL_DEVICE_PRINTF_BUFFER_SIZE. (KhronosGroup#1386)

* Added simple test for CL_DEVICE_PRINTF_BUFFER_SIZE.

* Clang format fix.

* Check for non-uniform work-group support (KhronosGroup#1383)

Only run sub-group tests with non-uniform work-groups on OpenCL 3.0 and
later if it is supported by the device.

* Fix build error for linux with clang-8 (KhronosGroup#1304)

-Wabsolute-value warning reported as error (long double truncated to
double)

* add a prefix to OpenCL extension names (KhronosGroup#1311)

* add a prefix to OpenCL extension names

* fix formatting

* conversions: Use volatile qualifier to prevent optimizations (KhronosGroup#1399)

Use volatile to prevent clang optimizations, fix int2float

* Add cluster size handling in subgroup test helpers (KhronosGroup#1394)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve cl_khr_subgroup_shuffle* test coverage (KhronosGroup#1402)

Test cases where the index/mask/delta is greater than or equal to the
maximum subgroup size.  These are cases that return undefined results
but are not undefined behavior.

The index/mask/delta values now include values less than twice the
subgroup size, and 0xffffffff.

Testing for sub_group_shuffle_xor() already allowed inputs that were
greater or equal to the subgroup size for the last subgroup in a
workgroup, but did not properly account for this in the verification
function, potentially resulting in out of bounds accesses.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* test_api_min_max.cpp: use size_t for get_global_id() value (KhronosGroup#1410)

In some rare cases where get_global_id() is larger than 2G, the 32bit int type would convert the value into a negative integer.
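
The wrap-around can be reproduced on the host with a minimal sketch. The function name `truncate_to_int32` is hypothetical; it models what happens when an ID that fits in 32 unsigned bits is stored in a 32-bit signed `int`, and it uses `memcpy` so the result does not depend on implementation-defined conversion rules.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the low 32 bits of a global ID as a signed 32-bit integer.
inline int32_t truncate_to_int32(uint64_t id)
{
    assert(id <= UINT32_MAX); // the sketch only models IDs that fit in 32 bits

    const uint32_t low = static_cast<uint32_t>(id);
    int32_t out;
    std::memcpy(&out, &low, sizeof(out)); // int32_t is two's complement
    return out;
}
```

An ID of 2147483648 (2G) comes back as a negative number, while 2147483647 is preserved. Keeping the value in a `size_t` avoids the problem.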

* Fix sub_group_ballot_find_msb/lsb tests (KhronosGroup#1411)

As per the OpenCL Extension Specification § 38.6 Ballots:

   If no bits representing predicate values from all work items in
   the subgroup are set in the bitfield value then the return value
   is undefined.

The case with no bits set is still worth testing, as it does not result
in undefined behavior, but only an undefined return value.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* refactor work group scan and reduction tests (KhronosGroup#1401)

* updated reduce test

* switched all reduce tests to new framework

* switch over scans to new framework

* remove old files

* minor fixes

* add type name to the kernel name

* fix Windows build and warnings

* address review comments

* Test all cluster sizes for cl_khr_subgroup_clustered_reduce (KhronosGroup#1408)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix incorrect use of image channel data type and filtering mode (KhronosGroup#1375)

* Fix clang 10 build errors (KhronosGroup#1387)

* Fix clang 10 build errors

Lossy casts due to inexact float representation of CL_INT_MAX

* Fix clang format

* Remove implicit-const-int-float-conversion flag
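
The inexact representation mentioned above can be checked with a short sketch. It assumes IEEE 754 binary32 `float` with round-to-nearest conversion; `kIntMax` is a local constant with the same value as CL_INT_MAX, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEEE 754 binary32 float");

constexpr int32_t kIntMax = 2147483647; // same value as CL_INT_MAX

// The value actually held after converting kIntMax to float,
// widened to double so it can be compared exactly.
inline double int_max_as_float()
{
    return static_cast<double>(static_cast<float>(kIntMax));
}

// Largest integer below 2^31 that a float represents exactly:
// floats in [2^30, 2^31) are spaced 128 apart.
inline int32_t largest_exact_float_below_int_max()
{
    const int32_t v = 2147483520; // 2^31 - 128
    assert(static_cast<double>(static_cast<float>(v))
           == static_cast<double>(v));
    return v;
}
```

Under these assumptions the conversion rounds up to 2147483648.0, one more than CL_INT_MAX, which is the kind of silent change that clang 10 reports.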

* test_basic/enqueue_map: Initialize all the data (KhronosGroup#1417)

Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Signed-off-by: Kévin Petit <kpet@free.fr>
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
Signed-off-by: Marco Cattani <marco.cattani@arm.com>
Co-authored-by: Kévin Petit <kpet@free.fr>
Co-authored-by: James Price <jrprice@google.com>
Co-authored-by: BKoscielak <bartosz.koscielak@intel.com>
Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>
Co-authored-by: Grzegorz Wawiorko <grzegorz.wawiorko@intel.com>
Co-authored-by: Sreelakshmi Haridas Maruthur <sharidas@quicinc.com>
Co-authored-by: spauls <spauls@qti.qualcomm.com>
Co-authored-by: kalchr01 <83217667+kalchr01@users.noreply.github.com>
Co-authored-by: Feng Zou <feng.zou@intel.com>
Co-authored-by: Senran (Stephen) Zhang <senran.zhang@intel.com>
Co-authored-by: Jeremy Kemp <jeremy@jeremykemp.co.uk>
Co-authored-by: Stuart Brady <stuart.brady@arm.com>
Co-authored-by: marcat03 <94451804+marcat03@users.noreply.github.com>
Co-authored-by: Ewan Crawford <ewan@codeplay.com>
Co-authored-by: oramirez <oramirez@qti.qualcomm.com>
Co-authored-by: Jim Lewis <j.lewis1@samsung.com>
Co-authored-by: Alastair Murray <alastair.murray@codeplay.com>
Co-authored-by: Jack Frankland <30410009+FranklandJack@users.noreply.github.com>
Co-authored-by: Jason Tang <jason.tang@amd.com>
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
yanfeng3721 added a commit to yanfeng3721/OpenCL-CTS that referenced this pull request Oct 18, 2023
* Use macOS 10 in CI (#1282)

macOS jobs frequently fail. Since macos-11.0 support is considered experimental,
move to macos-10, using macos-latest so we automatically move to 11 when
stable.

See https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners

Signed-off-by: Kevin Petit <kevin.petit@arm.com>

* Fix double-release of memory objects (#1277)

A recent update to the object wrapper classes (#1268) changed the
behavior of assigning to a wrapper, whereby the wrapped object is now
released upon assignment. A couple of tests were manually calling
clReleaseMemObject and then assigning `nullptr` to the wrapper,
resulting in the wrapper calling clReleaseMemObject on an object that
had already been destroyed.

* Fix check for image support in test_basic sizeof (#1269)

* add basic test for cl_khr_pci_bus_info (#1227)

* add basic test for cl_khr_pci_bus_info

* correctly use TEST_SKIPPED_ITSELF

Co-authored-by: Kévin Petit <kpet@free.fr>

* fix related usage of TEST_SKIPPED_ITSELF

Co-authored-by: Kévin Petit <kpet@free.fr>

* Fix double release of object in test_api and test_gl (#1287)

* Fix clang format only

* Fix double release of objects

* subgroups: Fix setting cl_halfs and progress check. (#1278)

* subgroups: Fix setting cl_halfs and progress check.

cl_float testing uses set_value such that a generated cl_ulong of 1 is
stored as 1.0F in a logical sense. However, cl_half values aren't
intrinsic to C++ and generated cl_ulongs less than 1024 in particular
are interpreted bitwise as subnormals. The test fails on compute devices
lacking subnormal support. Perform the logical conversion to cl_half.

Fix independent forward progress check.

* subgroups_half: Address review comments

* subgroups_half: Formatting fixes required by check-format

* subgroups_half: Modified to query and use rounding mode supported by device

Co-authored-by: spauls <spauls@qti.qualcomm.com>

* Add tests for entrypoint cl_khr_suggested_local_work_size (#1264)

* Add tests for entrypoint cl_khr_suggested_local_work_size

Tests added within test_conformance/workgroups. The tests cover several
shapes (num dimensions) and sizes of global work size, kernels using
local memory (dynamic and static) and present/non-present global work
offset.

Signed-off-by: Kallia Chronaki <kallia.chronaki@arm.com>

* Fix in comparison for error checking

Signed-off-by: Kallia Chronaki <kallia.chronaki@arm.com>

* 'test_wg_suggested_local_work_size' fixes

* Refactoring of 'test_wg_suggested_local_work_size'

Modifications to reduce code duplication and minimize build time

* remove testing for scalar vloada_half (#1293)

* Temporarily disable the test_kernel_attributes test case (#1297)

* Temporarily disable the test_kernel_attributes test case

Per OpenCL spec on CL_KERNEL_ATTRIBUTES, for kernels not created from OpenCL C
source and the clCreateProgramWithSource API call the string returned from this
query will be empty.
But the test_kernel_attributes test reads from a bc binary and expects to get
kernel attributes, which is not consistent with the OpenCL spec.

* Fix clang format issue

* Fix double free in c11_atomics tests for SVM allocations (#1286)

* Only Clang format changes

* Fix double free object for SVM allocations

* Fix double free - review fixes

* Fix kernel source for cl_khr_suggested_local_work_size (#1300)

Use ASCII '-' instead of unicode '–' as subtraction operator.

Signed-off-by: Kévin Petit <kpet@free.fr>

* Remove unused definitions in CMakeLists.txt (#1302)

Signed-off-by: Kévin Petit <kpet@free.fr>

* add tests for cl_khr_integer_dot_product (#1276)

* cl_khr_integer_dot_product_tests

* remove emulated codepaths

* fix formatting

* address code review comments

* remove emulated codepaths again

* address one more review comment

* define NOMINMAX in the CMakefile to fix std::min and std::max on MSVC (#1308)

* Report failures in simple_{read,write}_image_pitch tests (#1309)

* Add cl_khr_integer_dot_product to known extensions in test compiler. (#1316)

* suppress MSVC strdup warning (#1314)

* Add missing include for gRandomSeed (#1307)

* Limit workgroup size for atomics tests (#1197)

* Limit workgroup size for atomics tests

This avoids extremely large local buffer size and slow run

* Always limit workgroup size

* Fix memory model issue in `atomic_flag`. (#1283)

* Fix memory model issue in atomic_flag.

In atomic_flag sub-tests that modify local memory, compilers may re-order memory accesses between the local and global address spaces which can lead to incorrect test failures.

This commit ensures that both local and global memory operations are fenced to prevent this re-ordering from occurring.

Fixes #134.

* Clang format changes.

* Added missing global acquire which is necessary for the corresponding global release.

Thanks to @jlewis-austin for spotting.

* Clang format changes.

* Match the condition for applying acquire/release fences.

* remove min max macros (#1310)

* remove the MIN and MAX macros and use the std versions instead

* fix formatting

* fix Arm build

* remove additional MIN and MAX macros from compat.h

* gles: Fix double frees. (#1323)

* gles: Fix double frees.

Remove a few explicit frees in the redirect_buffers test which are
already handled by a wrapper.

* gles: Fix double frees

A recent update to the object wrapper classes (#1268) changed the
behavior of assigning to a wrapper, whereby the wrapped object is now
released upon assignment. A couple of tests were manually calling
clReleaseMemObject and then assigning `nullptr` to the wrapper,
resulting in the wrapper calling clReleaseMemObject on an object that
had already been destroyed.

Co-authored-by: spauls <spauls@qti.qualcomm.com>

* api: Enable cl_khr_fp16 when using half types in kernel (#1327)

* Update cl_khr_integer_dot_product tests for v2 (#1317)

* Update cl_khr_integer_dot_product tests for v2

Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Signed-off-by: Marco Cattani <marco.cattani@arm.com>
Change-Id: I97dbd820f1f32f6b377e47d0bf638f36bb91930a

* only query acceleration properties with v2+

Change-Id: I3f13a0cba7f1f686365b10adf81690e089cd3d74

* Report unsupported extended subgroup tests as skipped rather than passed (#1301)

* Report unsupported extended subgroup tests as skipped rather than passed

Also don't check the presence of extensions for each sub-test.

Signed-off-by: Kévin Petit <kpet@free.fr>

* address review comments

* Extended subgroups - use 128bit masks (#1215)

* Extended subgroups - use 128bit masks

* Refactoring to avoid kernels code duplication

* unify kernel names as test_ prefix + subgroup function name
* use string literals that improve readability
* use kernel templates that limit code duplication
* WorkGroupParams allows defining a default kernel - a kernel template for multiple functions
* WorkGroupParams allows defining a kernel for one specific subgroup function

Co-authored-by: Stuart Brady <stuart.brady@arm.com>

* Remove space character from extension name (#1336)

* Add testing of sub_group_broadcast for (u)char and (u)short types (#1347)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Remove excessive logging in subgroup tests (#1343)

This also adds some missing data type logging to the
subgroup_functions_non_uniform_vote tests.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve error handling in subgroup tests (#1352)

* MPGCOMP-14761 Improve error handling in subgroup tests

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Add missing newline

* Clean up logging in cl_khr_subgroup_ballot tests (#1351)

The tests were logging scalar results as vectors padded with zeroes for
no apparent benefit.  Fix this.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix missing cl_khr_semaphore extensions in compiler tests (#1357)

* Added missing extensions related to cl_khr_semaphore

Signed-off-by: Marco Cattani <marco.cattani@arm.com>

* Fix stack-use-after-scope crash in conversions (#1358)

The way that program sources were being constructed involved capturing
pointers to strings that were allocated on the stack, and then trying
to use them outside of that scope. This change uses a stringstream
defined in the outer scope to build the program instead.

* Use maximum subgroup size in sub_group_ballot tests (#1344)

sub_group_ballot_bit_count() and sub_group_ballot_find_msb() mask
their input according to a subgroup size, which is assumed to be the
maximum subgroup size, and not the actual subgroup size excluding
non-existent work-items in the "remainder" subgroup.

Fix this as per the clarification made to the OpenCL C specification
in revision 3.0.9 for issue KhronosGroup/OpenCL-Docs#626 by pull request
KhronosGroup/OpenCL-Docs#689.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix conversion data loss in test_api min_max_constant_args (#1355)

* Subgroups tests - sub_group_non_uniform_scan_exclusive function fixes (#1350)

* Fix - comparing results will never happen.

* No special action needed for one work item in the subgroup

* Remove unused inclusion of <cstdio> (#1362)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Tidy up code to determine bit mask for ballot scans (#1363)

It seems more intuitive to set only the bits that are required, rather
than to set one more bit than is required, only to clear it again.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Test api min max - fix printing cl_ulong data type (#1212)

* test api - fix code formatting only

* Fix printing cl_ulong type to avoid overloading.

* Fix printing size_t data type

* Fix printing size_t data type - set unsigned

* Fix formatting for maxArgs (uint) and numberOfInts (size_t)

* Fix build, glext should not be used with GLEW (#1337)

* Fix build, glext should not be used with GLEW

* Remove additional define GL_GLEXT_PROTOTYPES

* Remove includes which already defined in setup.h

* Add cl_khr_command_buffer to list of extensions (#1365)

cl_khr_command_buffer is now public as a provisional khr extension
which implementations may report.

* Refactor logging of subgroup test start/pass messages (#1361)

Note that this also corrects the start messages logged for the
sub_group_ballot_bit_count/find_msb/find_lsb tests.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Remove dead threading code (#1339)

Remove unused code that hasn't been used for the last three years
and isn't included in makefiles.

Co-authored-by: oramirez <oramirez@qti.qualcomm.com>

* test_subgroups - Set safe input values for half type and mul, add operations (#1346)

* Set safe input values for half type and mul, add operations

* Set safe values for all data types

* Typo fix

* Set constant seed for shuffle

* Change function name to more specific

* set_value takes an integer value, not a bit pattern

* Remove invalid negative_get_platform_info testcase (#1374)

* Remove invalid negative_get_platform_info testcase

* Implementations are only required to do null checks
* Fixes #1318

* Fix formatting

* Fix test_api get_command_queue_info (#1324)

* Fix test_api get_command_queue_info

Decouple host and device out-of-order test enabling

* Rename property sets more generically

* Refactor to use std::vector to accumulate test permutations

* Fix memory leaks (#1378)

* Fix memory leaks

Fixed memory leaks in: buffers, basic, and vectors

* Formatting fixes

Co-authored-by: oramirez <oramirez@qti.qualcomm.com>

* Refactor divergence mask handling in subgroup tests (#1379)

This changes compilation of subgroup test kernels so that a separate
compilation is no longer performed for each divergence mask value.

The divergence mask is now passed as a kernel argument.

This also fixes all subgroup_functions_non_uniform_arithmetic testing
and the sub_group_elect and sub_group_any/all_equal subtests of the
subgroup_functions_non_uniform_vote test to use the correct order of
vector components for GPUs with a subgroup size greater than 64.

The conversion of divergence mask bitsets to uint4 vectors has been
corrected to match code comments in WorkGroupParams::load_masks()
in test_conformance/subgroups/subhelpers.h.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve testing of sub_group_ballot (#1382)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve testing of kernel arg info in pipe_info test (#1326)

The test now checks that CL_KERNEL_ARG_INFO_NOT_AVAILABLE is returned
when calling clGetKernelArgInfo() with offline compilation modes.

The correct function name is printed if clGetKernelArgInfo() fails
when using online compilation (and not "clSetKernelArgInfo()").

When using online compilation, if the actual arg type is not as
expected, the actual arg type is now logged, and the return value
is now TEST_FAIL (-1) as per other failures (and not 1).

All other test pass/fail values used in the test now use TEST_PASS
and TEST_FAIL instead of 0 and -1 literals.

An unnecessary cast of pipe_kernel_code has been removed.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Sync submission_details with conformance doc v26 (#1389)

Add "Patches" field

* Refactor kernel execution in subgroup tests (#1391)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Update format script and drop Travis badge for branch rename (#1393)

`master` is now `main`, so update `check-format.sh` accordingly.

Also completely drop the Travis badge as we now use GitHub actions.  There is
no replacement badge as the current action is pre-submission, not
post-submission.

* Added simple test for CL_DEVICE_PRINTF_BUFFER_SIZE. (#1386)

* Added simple test for CL_DEVICE_PRINTF_BUFFER_SIZE.

* Clang format fix.

* Check for non-uniform work-group support (#1383)

Only run sub-group tests with non-uniform work-groups on OpenCL 3.0 and
later if it is supported by the device.

* Fix build error for linux with clang-8 (#1304)

-Wabsolute-value warning reported as error (long double truncated to
double)

* add a prefix to OpenCL extension names (#1311)

* add a prefix to OpenCL extension names

* fix formatting

* conversions: Use volatile qualifier to prevent optimizations (#1399)

Use volatile to prevent clang optimizations, fix int2float

* Add cluster size handling in subgroup test helpers (#1394)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Improve cl_khr_subgroup_shuffle* test coverage (#1402)

Test cases where the index/mask/delta is greater than or equal to the
maximum subgroup size.  These are cases that return undefined results
but are not undefined behavior.

The index/mask/delta values now include values less than twice the
subgroup size, and 0xffffffff.

Testing for sub_group_shuffle_xor() already allowed inputs that were
greater than or equal to the subgroup size for the last subgroup in a
workgroup, but did not properly account for this in the verification
function, potentially resulting in out-of-bounds accesses.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* test_api_min_max.cpp: use size_t for get_global_id() value (#1410)

In some rare cases where get_global_id() is larger than 2G, the 32bit int type would convert the value into a negative integer.
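
The wrap-around can be reproduced on the host with fixed-width types. This is a sketch only; the constant and function names are hypothetical, and the narrowing shown assumes a two's-complement target.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a global ID above 2^31; not taken from the test itself.
constexpr std::uint64_t kLargeGlobalId = 3000000000ULL;

// Reinterprets the low 32 bits as signed, which is what storing such an ID
// in a 32-bit int amounts to on two's-complement targets.
std::int32_t stored_as_int(std::uint64_t id)
{
    assert(id <= UINT32_MAX); // the sketch only models IDs that fit in 32 bits
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(id));
}

// A 64-bit unsigned type (size_t on 64-bit devices) keeps the value intact.
std::uint64_t stored_as_wide(std::uint64_t id) { return id; }
```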

* Fix sub_group_ballot_find_msb/lsb tests (#1411)

As per the OpenCL Extension Specification § 38.6 Ballots:

   If no bits representing predicate values from all work items in
   the subgroup are set in the bitfield value then the return value
   is undefined.

The case with no bits set is still worth testing, as it does not result
in undefined behavior, but only an undefined return value.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>
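
A host-side reference for these functions might treat the empty ballot as "any result is acceptable". The sketch below is an assumption about how such a check could look, not the code in the test; the names and the 128-bit ballot width (a uint4) are chosen for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kBallotBits = 128; // a uint4 ballot holds 4 x 32 bits

// Returns false when no bit is set, meaning the result is undefined and any
// value returned by the device has to be accepted.
bool reference_find_lsb(const std::bitset<kBallotBits> &ballot,
                        std::size_t &index)
{
    for (std::size_t bit = 0; bit < kBallotBits; ++bit)
        if (ballot.test(bit))
        {
            index = bit;
            return true;
        }
    return false;
}

bool reference_find_msb(const std::bitset<kBallotBits> &ballot,
                        std::size_t &index)
{
    for (std::size_t bit = kBallotBits; bit-- > 0;)
        if (ballot.test(bit))
        {
            index = bit;
            return true;
        }
    return false;
}

// Verification only compares values when the reference result is defined.
bool result_acceptable(bool defined, std::size_t expected, std::size_t actual)
{
    return !defined || expected == actual;
}
```

The kernel still runs for the empty ballot, so the case is exercised even though its return value is never compared.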

* refactor work group scan and reduction tests (#1401)

* updated reduce test

* switched all reduce tests to new framework

* switch over scans to new framework

* remove old files

* minor fixes

* add type name to the kernel name

* fix Windows build and warnings

* address review comments

* Test all cluster sizes for cl_khr_subgroup_clustered_reduce (#1408)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix incorrect use image channel data type and filtering mode (#1375)

* Fix clang 10 build errors (#1387)

* Fix clang 10 build errors

Lossy casts due to inexact float representation of CL_INT_MAX

* Fix clang format

* Remove implicit-const-int-float-conversion flag

* test_basic/enqueue_map: Initialize all the data (#1417)

* imageHelpers: add CL_UNORM_SHORT_{555, 565} in get_max_absolute_error (#1406)

* imageHelpers: add CL_UNORM_SHORT_{555, 565} in get_max_absolute_error

While working on a device that supports the CL_UNORM_SHORT_565 image
data type, I noticed that the maximum absolute error allowed was not
correct for that data type.

Also, because of normalization, some absolute error is always allowed,
regardless of the sampler's filter mode.

Ref #1140

* put back if statement on filter_mode

* Change memory order and scope for atomics that gate final results being stored. (#1377)

* Change memory order and scope for atomics that gate final results being stored.

memory_order_acq_rel with memory_scope_device is now used to guarantee that the correct memory consistency is observed before final results are stored.

Previously it was possible for kernels to be generated that all used relaxed memory ordering, which could lead to false-positive failures.

Fixes #1370

* Disable atomics tests with global, in-program atomics.

If the device does not support `memory_order_relaxed` or `memory_scope_device`, disable atomics tests that declare their atomics in-program with global memory.

There is now an implicit requirement to support `memory_order_relaxed` and `memory_scope_device` for these tests.

* Fix misplaced parentheses.

* Change memory scope for atomic fetch and load calls in kernel

Change the memory scope from memory_scope_work_group to
memory_scope_device so the ordering applies across all work items

Co-authored-by: Sreelakshmi Haridas <sharidas@quicinc.com>

* Update Github Actions CI and add Windows (#1413)

- Add one Windows build to Github Actions
- Remove Appveyor config
- Move a few build steps out of the script
- Use Ninja as the generator (makes for more readable logs)
- Add build cache (except on Windows where it seems to break)

Change-Id: Ida90ee1842af98aff86e5144ab7b9766480378c9
Signed-off-by: Kevin Petit <kevin.petit@arm.com>

* api/kernel_arg_info: Check for read_write image support before testing it (#1420)

Code taken from api/test_min_image_formats.cpp

* images: Stop checking gDeviceType != CL_DEVICE_TYPE_GPU (#1418)

* images: Stop checking gDeviceType != CL_DEVICE_TYPE_GPU

If the device type also advertises CL_DEVICE_TYPE_DEFAULT (which should
be valid), this causes it to be considered a CPU device and the tests
enforce different precision and rounding expectations.

* Fix clang-format

* Drop redundant NORM_OFFSET checks
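
The device type is a bitfield, which is why an equality test misfires when more than one bit is set. The sketch below only illustrates that point: the commit removed the checks rather than switching to a bit test, and the constants are local copies of the values in the OpenCL headers.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of the device-type bits, mirroring the OpenCL header values.
using DeviceType = std::uint64_t;
constexpr DeviceType kTypeDefault = 1u << 0; // CL_DEVICE_TYPE_DEFAULT
constexpr DeviceType kTypeCpu = 1u << 1; // CL_DEVICE_TYPE_CPU
constexpr DeviceType kTypeGpu = 1u << 2; // CL_DEVICE_TYPE_GPU

// The removed style of check: true for anything that is not exactly "GPU".
bool not_gpu_by_equality(DeviceType type) { return type != kTypeGpu; }

// A bit test sees the GPU bit regardless of other bits being set.
bool has_gpu_bit(DeviceType type) { return (type & kTypeGpu) != 0; }
```

A device reporting `GPU | DEFAULT` is classified as non-GPU by the equality test even though its GPU bit is set.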

* Enable mipmap extension pragmas (#1349)

* Enable mipmap pragmas where appropriate.

* clang-format changes.

* Add content to README (#1427)

Fill in the placeholder readme with some basic information on building and
running the project. Information on the conformance submission process and
contributing are also included.

Should help close a few issues referenced in
https://github.com/KhronosGroup/OpenCL-CTS/issues/1096

I don't think this is all the information we want, but it is a starting point
from which we can progress. For example, we could add the Android build
instructions from https://github.com/KhronosGroup/OpenCL-CTS/pull/1021

* Fixes incorrect slice pitch calculation in clCopyImage 1Darray (#1258)

The slice pitch/padding calculation assumed that the 'height' variable contained the pixel height of the image, which it doesn't for IMAGE1D_ARRAY.
Fixes #1257

* test_compiler_defines_for_extensions: fix overflow (#1430)

GCC 11.2.0 warns about a possible string overflow (when
num_not_supported_extensions+num_of_supported_extensions == 0)
since no space would be allocated for the terminating
null byte that string manipulation fns expect to find.

This unconditionally adds an extra byte to the allocation to silence
the warning and fix building with -Werror.
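
The allocation issue can be sketched as below. The `join_names` helper is hypothetical and only mirrors the shape of the problem: a C-style buffer sized from the accumulated string lengths, with one extra byte reserved unconditionally for the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: joins names into one heap buffer, C-style. The "+ 1"
// reserves room for the terminating null byte even when `names` is empty,
// which is the case the warning was about. The caller owns the buffer.
char *join_names(const std::vector<std::string> &names)
{
    std::size_t total = 0;
    for (const std::string &name : names)
        total += name.size() + 1; // name plus a separating space
    char *buffer = new char[total + 1];
    buffer[0] = '\0';
    for (const std::string &name : names)
    {
        std::strcat(buffer, name.c_str());
        std::strcat(buffer, " ");
    }
    assert(std::strlen(buffer) == total);
    return buffer;
}
```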

* Fix local memory out of bounds issue in atomic_fence (replaces PR #1285) (#1437)

* Fix local memory out of bounds in atomic_fence

In the error condition, the atomic_fence kernel can illegally access local memory addresses.

In this snippet, localValues is in the local address space and provided as a kernel argument. Its size is effectively get_local_size(0) * sizeof(int). The stores to localValues lead to OoB accesses.

  size_t myId = get_local_id(0);

  ...

  if(hisAtomicValue != hisValue)
  { // fail
    atomic_store(&destMemory[myId], myValue-1);
    hisId = (hisId+get_local_size(0)-1)%get_local_size(0);
    if(myValue+1 < 1)
      localValues[myId*1+myValue+1] = hisId;
    if(myValue+2 < 1)
      localValues[myId*1+myValue+2] = hisAtomicValue;
    if(myValue+3 < 1)
      localValues[myId*1+myValue+3] = hisValue;
  }

* Fix formatting

* Fix formatting again

* Formatting

* Added missing tests for integer_dot_product_input_4x8bit and integer_dot_product_input_4x8bit_packed on feature_macro compiler test. (#1432)

* Added integer_dot_product_input_4x8bit and integer_dot_product_input_4x8bit_packed tests to feature_macro_test

* clang formatting

* Now the test checks whether the array of optional features returned by clGetDeviceInfo contains the standard optional features we are testing.

* Update test_conformance/compiler/test_feature_macro.cpp

Added printing the missing standard feature if it is not found inside the optional features array returned by clGetDeviceInfo.

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

* Fix test_half async_work_group_copy arguments (#1298) (#1299)

Work-items in the last work-group call async_work_group_copy with
different argument values depending on 'adjust'. According to the spec,
this results in undefined values.

* Initial CTS for external semaphore and memory extensions (#1390)

* Initial CTS for external sharing extensions

Initial set of tests for below extensions
with Vulkan as producer
1. cl_khr_external_memory
2. cl_khr_external_memory_win32
3. cl_khr_external_memory_opaque_fd
4. cl_khr_external_semaphore
5. cl_khr_external_semaphore_win32
6. cl_khr_external_semaphore_opaque_fd

* Updates to external sharing CTS

Updates to external sharing CTS
1. Fix some build issues to remove unnecessary, non-existent files
2. Add new tests for platform and device queries.
3. Some added checks for VK Support.

* Update CTS build script for Vulkan Headers

Update CTS build to clone Vulkan Headers
repo and pass it to CTS build
in preparation for external memory
and semaphore tests

* Fix Vulkan header path

Fix Vulkan header include path.

* Add Vulkan loader dependency

Vulkan loader is required to build
test_vulkan of OpenCL-CTS.
Clone and build Vulkan loader as prerequisite
to OpenCL-CTS.

* Fix Vulkan loader path in test_vulkan

Remove arch/os suffix in Vulkan loader path
to match vulkan loader repo build.

* Fix warnings around getHandle API.

Return type of getHandle is defined
differently based on win or linux builds.
Use appropriate guards when using API
at other places.
While at it remove duplicate definition
of ARRAY_SIZE.

* Use ARRAY_SIZE in harness.

Use already defined ARRAY_SIZE macro
from test_harness.

* Fix build issues for test_vulkan

Fix build issues for test_vulkan
1. Add cl_ext.h in common files
2. Replace cl_mem_properties_khr with cl_mem_properties
3. Replace cl_external_mem_handle_type_khr with
cl_external_memory_handle_type_khr
4. Type-cast malloc as required.

* Fix code formatting.

Fix code formatting to
get CTS CI builds clean.

* Fix formatting fixes part-2

Another set of formatting fixes.

* Fix code formatting part-3

Some more code formatting fixes.

* Fix code formatting issues part-4

More code formatting fixes.

* Formatting fixes part-5

Some more formatting fixes

* Fix formatting part-6

More formatting fixes continued.

* Code formatting fixes part-7

Code formatting fixes for image

* Code formatting fixes part-8

Fixes for platform and device query tests.

* Code formatting fixes part-9

More formatting fixes for vulkan_wrapper

* Code formatting fixes part-10

More fixes to wrapper header

* Code formatting fixes part-11

Formatting fixes for api_list

* Code formatting fixes part-12

Formatting fixes for api_list_map.

* Code formatting changes part-13

Code formatting changes for utility.

* Code formatting fixes part-15

Formatting fixes for wrapper.

* Misc Code formatting fixes

Some more misc code formatting fixes.

* Fix build breaks due to code formatting

Fix build issues that arose from the
recent code formatting changes.

* Fix presubmit script after merge

Fix presubmit script after merge conflicts.

* Fix Vulkan loader build in presubmit script.

Use cmake ninja and appropriate toolchain
for Vulkan loader dependency to fix
linking issue on arm/aarch64.

* Use static array sizes

Use static array sizes to fix
windows builds.

* Some left-out formatting fixes.

Fix remaining formatting issues.

* Fix harness header path

Fix harness header path
While at it, remove Misc and test pragma.

* Add/Fix license information

Add Khronos License info for test_vulkan.
Replace Apple license with Khronos
as applicable.

* Fix headers for Mac OSX builds.

Use appropriate headers for
Mac OSX builds

* Fix Mac OSX builds.

Use appropriate headers for
Mac OSX builds.
Also, fix some build issues
due to type-casting.

* Fix new code formatting issues

Fix new code formatting issues
with recent MacOS fixes.

* Add back missing case statement

Add back missing case statement
that was accidentally removed.

* Disable USE_GAS for Vulkan Loader build.

Disable USE_GAS for Vulkan Loader build
to fix aarch64 build.

* Update Copyright Year.

Update Copyright Year to 2022
for external memory sharing tests.

* Android specific fixes

Android specific fixes to
external sharing tests.

* Add tests for cl_khr_subgroup_rotate (#1439)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix newline in sample_image_pixel_float_offset log (#1446)

* Fix math tests to allow ftz in relaxed mode. (#1371)

* Fix math tests to allow ftz in relaxed mode.

In a recent spec clarification, it was agreed that ftz is
a valid optimization with cl-fast-relaxed-math and that
cl-denorms-are-zero does not need to be passed explicitly
to enforce ftz behavior for implementations that already
support this.

GitHub Spec Issue OpenCL-Docs#579
GitHub Spec Issue OpenCL-Docs#597
GitHub CTS Issue OpenCL-CTS#1267

* Update cl_khr_extended_async_copies tests to the latest extension version (#1426)

* Update cl_khr_extended_async_copies tests to the latest version of the extension

Update the 2D and 3D extended async copies tests. Previously they were based on
an older provisional version of the extension.

Also update the variable names to only use 'stride' to refer to the actual
stride values. Previously the tests used 'stride' to refer to the end of one
line or plane and the start of the next. This is not the commonly understood
meaning.

* Address cl_khr_extended_async_copies PR feedback

* Remove unnecessary parentheses in kernel code
* Make variables `const` and rearrange so that we can reuse
  variables, rather than repeating expressions.
* Add in missing vector size of 3 for 2D tests

* Use C++ raw string literals for kernel code

Use C++11 raw string literals rather than C strings to define the
kernel code in the extended async-copy tests. This makes the
kernel code more readable.

Co-authored-by: Ewan Crawford <ewan@codeplay.com>
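
The two meanings of 'stride' mentioned in the cl_khr_extended_async_copies update above can be contrasted with a host-side reference copy. This is a sketch with hypothetical names, not the verification code from the tests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference 2D copy. Each stride is the distance, in elements, from the start
// of one line to the start of the next, so stride >= elementsPerLine.
void copy_2d(std::vector<int> &dst, std::size_t dstStride,
             const std::vector<int> &src, std::size_t srcStride,
             std::size_t elementsPerLine, std::size_t numLines)
{
    assert(srcStride >= elementsPerLine && dstStride >= elementsPerLine);
    for (std::size_t line = 0; line < numLines; ++line)
        for (std::size_t x = 0; x < elementsPerLine; ++x)
            dst.at(line * dstStride + x) = src.at(line * srcStride + x);
}

// The older usage named the gap between the end of one line and the start of
// the next; it is recoverable from the stride as defined above.
std::size_t gap_from_stride(std::size_t stride, std::size_t elementsPerLine)
{
    assert(stride >= elementsPerLine);
    return stride - elementsPerLine;
}
```

For example, a source with 2 elements per line and a stride of 4 has a gap of 2 padding elements after each line.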

* Fix function name in error messages (#1450)

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Use clProgramWrapper in math_brute_force (#1451)

Simplify code by avoiding manual resource management.

This allows removing clReleaseProgram from `MakeKernels` to reduce
behavioral differences between `MakeKernels` and `MakeKernel`.

Original patch by Marco Antognini.

Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Share BuildKernelInfo struct definition (#1453)

Move the main `BuildKernelInfo` definition into `common.h` to reduce
code duplication.

Some tests (e.g. `i_unary_double.cpp`) use a different struct; rename
those structs to `BuildKernelInfo2` for now to avoid ambiguity.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Tidy up subgroup log messages (#1454)

Add missing newlines and improve wording of messages.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Add missing external memory/sync extensions to list of known khr extensions (#1455)

Signed-off-by: Kévin Petit <kpet@free.fr>

* Fix misleading indentation and enable -Wmisleading-indentation (#1458)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix indentation of test_waitlists.cpp (#1459)

* fix indentation of test_waitlists.cpp

Followup of #1458

* run formatter

* Tidy up BuildKernelInfo (#1461)

Remove the `offset` field from both structures, because it was always
set to the global `gMinVectorSizeIndex`.

Improve documentation and rename some variables:
 - `i` becomes `vectorSize`;
 - `kernel_count` becomes `threadCount`.

Original patch by Marco Antognini.

Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Remove unused variables in subgroup tests (#1460)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix test_select verification failure reporting (#1462)

When verification of the computed result fails, the test would still
report as "passed".  This is because `s_test_fail` is only written to
and never read.

Fix the immediate issue by returning a failure value and incrementing
`gFailCount` if any error was detected.  The error handling can be
improved further, but I'm leaving that out of the scope of this fix.

Fixes https://github.com/KhronosGroup/OpenCL-CTS/issues/1445

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* [NFC] Fix missing `double_double.lo` initializer (#1466)

Fixes a missing-field-initializers warning.  The original intent was
most likely to initialize both fields (similar to other functions in
this file), but a `,` was missed.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* [NFC] Use Unix-style line endings (#1468)

Use the same line ending style across all source files.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* [NFC] Fix sign-compare warnings in math_brute_force (#1467)

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Use clCommandQueueWrapper in math_brute_force (#1463)

Simplify code by avoiding manual resource management.

This commit only modifies tests that use one queue per thread.  The
other unmodified tests are single-threaded and use the global
`gQueue`.

Original patch by Marco Antognini.

Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Fix test skipping in math_brute_force (#1475)

Commit 9666ca3c ("[NFC] Fix sign-compare warnings in math_brute_force
(#1467)", 2022-08-23) inadvertently changed the semantics of the if
condition.  The `i > gEndTestNumber` comparison was relying on
`gEndTestNumber` being promoted to unsigned.  When casting `i` to
`int32_t`, this promotion no longer happens and as a result any tests
given on the command line were being skipped.

Use an unsigned type for `gStartTestNumber` and `gEndTestNumber` to
eliminate the casts and any implicit conversions between signed and
unsigned types.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
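
The change in semantics can be shown in isolation. The sketch below makes the implicit conversion explicit; the function names are hypothetical, and the use of -1 as an "end" value is an assumption chosen because it makes the difference visible.

```cpp
#include <cassert>
#include <cstdint>

// Comparison as performed when one operand is unsigned: the signed operand is
// converted to unsigned first, so -1 becomes UINT32_MAX.
bool past_end_unsigned(std::uint32_t i, std::int32_t end)
{
    return i > static_cast<std::uint32_t>(end);
}

// Comparison after casting the index to a signed type: `end` is no longer
// converted, so every non-negative index is greater than -1.
bool past_end_signed(std::uint32_t i, std::int32_t end)
{
    return static_cast<std::int32_t>(i) > end;
}
```

With non-negative bounds both forms agree, which is why the regression only showed up for particular values.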

* support format CL_ABGR (#1474)

* support format CL_ABGR

add code to handle format CL_ABGR

* Update imageHelpers.h

* fix format

* Initial command-buffer extension tests (#1368)

* Initial command-buffer tests

Introduce some basic testing of the
[cl_khr_command_buffer](https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer)
extension. This is intended as a starting point from which we can iteratively build up tests
for the extension collaboratively.

* Move tests into derived classes

* Move tests from methods into derived classes implementing
  a `Run()` interface.
* Fix memory leak when command_buffer isn't freed when a test
  is skipped.
* Print correct error code for
  `CL_DEVICE_COMMAND_BUFFER_CAPABILITIES_KHR`
* Pass `nullptr` for queue parameter to command recording entry-points

* Define command-buffer type wrapper

Other OpenCL objects have a wrapper that reference-counts their use
and frees the wrapped object. The command-buffer object can't use
the generic type wrappers, which are templated on the appropriate
release/retain function, because its release/retain functions are
queried at runtime.

Instead, define our own command-buffer wrapper class where a base
object is passed on construction which contains function pointers
to the release/retain functions that can be used in the wrapper.

* Use create_single_kernel_helper_create_program

Use `create_single_kernel_helper_create_program` rather than
hardcoding `clCreateProgramWithSource` to allow for other types
of program input.

Also fix a bug where the wrong enum was used to pass properties on
command-buffer creation; it should be `CL_COMMAND_BUFFER_FLAGS_KHR`.

* Add out-of-order command-buffer test

Introduce a basic test for checking sync-point use
with out-of-order command-buffers.

This also includes better checking of required queue properties.
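
The command-buffer wrapper idea described above (a wrapper that receives its release/retain functions at runtime instead of as template arguments) can be sketched without any OpenCL dependency. Everything here is hypothetical: the handle type, the `EntryPoints` struct and the mock functions stand in for the real extension entry points.

```cpp
#include <cassert>

// Stand-ins for the real handle and for entry points that are only available
// as function pointers looked up at runtime.
using Handle = int *; // hypothetical handle; points at a mock reference count
using RetainFn = int (*)(Handle);
using ReleaseFn = int (*)(Handle);

struct EntryPoints
{
    RetainFn retain;
    ReleaseFn release;
};

// Mock implementations used only to make the sketch observable.
int mock_retain(Handle h) { ++*h; return 0; }
int mock_release(Handle h) { --*h; return 0; }

class HandleWrapper {
public:
    explicit HandleWrapper(const EntryPoints *fns): fns_(fns) { assert(fns); }
    HandleWrapper(const HandleWrapper &other)
        : fns_(other.fns_), handle_(other.handle_)
    {
        if (handle_) fns_->retain(handle_);
    }
    HandleWrapper &operator=(const HandleWrapper &) = delete;
    ~HandleWrapper() { reset(); }

    // Takes ownership of `h`, releasing any handle held before.
    HandleWrapper &operator=(Handle h)
    {
        reset();
        handle_ = h;
        return *this;
    }
    operator Handle() const { return handle_; }

private:
    void reset()
    {
        if (handle_) fns_->release(handle_);
        handle_ = nullptr;
    }
    const EntryPoints *fns_;
    Handle handle_ = nullptr;
};
```

Passing the function pointers at construction keeps the wrapper usable for any object whose retain/release functions cannot be named at compile time.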

* Use clMemWrapper in math_brute_force (#1476)

Simplify code by avoiding manual resource management.

Original patch by Marco Antognini.

Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Fix test kernel attributes when API functions fail (#1449)

test_error returns the err given as the first argument. As the
run_test function returns a bool, we end up returning true (meaning
pass) when an api function fails.
Instead return explicitly false (meaning fail).
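
The underlying conversion is easy to reproduce. The sketch below uses hypothetical function names and an arbitrary non-zero status code; it is not the test's code.

```cpp
#include <cassert>

constexpr int kSuccess = 0; // zero status means success
constexpr int kSomeError = -5; // arbitrary non-zero status, for illustration

// Bug pattern: a non-zero error code converts to `true`, i.e. "pass".
bool run_test_buggy(int status)
{
    if (status != kSuccess) return status; // implicit int -> bool
    return true;
}

// Fixed pattern: failures return `false` explicitly.
bool run_test_fixed(int status)
{
    if (status != kSuccess) return false;
    return true;
}
```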

* Use size_t instead of cl_int (#1414)

* Use size_t instead of cl_int

Memory is allocated for cl_int,
but mapped as size_t.
Use size_t instead of cl_int during
allocation and mapping for consistency.

* Remove test_half changes.

Remove test_half changes from other fix
that got included in this commit.

* Final formatting fix.

* Update known extensions in compiler define test (#1480)

Add
[cl_khr_command_buffer_mutable_dispatch](https://github.com/KhronosGroup/OpenCL-Docs/pull/819),
[cl_khr_subgroup_rotate](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_subgroup_rotate),
and [cl_khr_extended_async_copies](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_extended_async_copies)
to the list of known extensions used in
`test_compiler_defines_for_extensions`

* Minimum 2 non atomic variables per thread for the c11 atomic fence test for embedded profile devices. (#1452)

* Minimum 2 Non atomic variables per thread for an embedded profile device - https://github.com/KhronosGroup/OpenCL-CTS/issues/1274

* Formatting

* [NFC] Fix whitespace issues in run_conformance.py (#1491)

Fix whitespace issues and remove superfluous parens in the
run_conformance.py script.  This addresses 288 out of the 415
issues reported by pylint.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* [NFCI] Remove unused variables and enable -Wunused-variable (#1483)

Remove unused variables throughout the code base and enable the
`-Wunused-variable` warning flag globally to prevent new unused
variable issues being introduced in the future.

This is mostly a non-functional change, with one exception:

 - In `test_conformance/api/test_kernel_arg_info.cpp`, an error check
   of the clGetDeviceInfo return value was added.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* [NFC] Fix unused variable warning in Release builds (#1494)

The condition inside the assert is dropped in Release builds, so
`num_printed` becomes unused.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
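
A variable that is only read inside an `assert` becomes unused once `NDEBUG` is defined. One common way to keep such code warning-free is shown below; this is an assumption about a typical fix, not necessarily what the commit does, and `format_index` is a hypothetical function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

std::string format_index(int index)
{
    char buffer[32];
    int num_printed = std::snprintf(buffer, sizeof(buffer), "test_%d", index);
    // With NDEBUG defined the assert expands to nothing, leaving
    // `num_printed` otherwise unread.
    assert(num_printed > 0
           && static_cast<std::size_t>(num_printed) < sizeof(buffer));
    (void)num_printed; // keeps -Wunused-variable quiet in Release builds
    return buffer;
}
```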

* Minor cleanups for run_conformance.py (#1492)

Use the print function from `__future__` for Python 3 compatibility,
remove an unreachable statement, remove unused imports, and add
a missing sys.exit call when opening the log file fails.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Use correct size for memory allocation in SVM test (#1496)

Memory is allocated for cl_int, but mapped as size_t.

Use size_t instead of cl_int during allocation and mapping for consistency.
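
The mismatch comes down to element size. A small sketch, with hypothetical helper names and `std::int32_t` standing in for `cl_int`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte size of a buffer holding `count` elements of type T.
template <typename T> constexpr std::size_t buffer_bytes(std::size_t count)
{
    return count * sizeof(T);
}

// True when a buffer allocated for `count` 32-bit integers is too small to be
// accessed as `count` size_t values, as on typical 64-bit hosts.
bool allocation_too_small(std::size_t count)
{
    assert(count > 0);
    return buffer_bytes<std::int32_t>(count) < buffer_bytes<std::size_t>(count);
}
```

Sizing both the allocation and the mapping from the same element type removes the discrepancy regardless of the host's pointer width.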

* [NFC] Reformat code in events test (#1497)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* [NFC] Declare format tables as const (#1493)

Without const, these variables would be flagged up by
`-Wunused-variable`.

Drop `struct` from the declarations as that is not needed in C++.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* [NFC] Fix typo (enevt_type -> event_type) (#1498)

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* gles: Limit variable definition to the same scope as usage (#1495)

Fix unused-variable errors by limiting variable definition to the
case that would use it

* Tests for cl-ext-image-from-buffer and cl-ext-image-requirements-info (#1438)

* Add CTS tests for cl_ext_image_requirements_info

Change-Id: I20c1c77ff5ba88eb475801bafba30ef9caf82601

* Add CTS tests for cl_ext_image_from_buffer

Change-Id: Ic30429d77a1317d0fea7d9ecc6d603267fa6602f

* Fixes for image_from_buffer and image_requirements extension

* Use CL_MEM_READ_WRITE flag when creating images that support CL_MEM_KERNEL_READ_AND_WRITE (#1447)

* format fixes

Change-Id: I04d69720730440cb61e64fed2cb5065b2ff8bf90

Co-authored-by: Oualid Khelifi <oualid.khelifi@arm.com>
Co-authored-by: oramirez <oramirez@qti.qualcomm.com>
Co-authored-by: Sreelakshmi Haridas Maruthur <sharidas@quicinc.com>

* Include release builds in GitHub Actions (#1486)

The "Ninja" CMake generator does not support multiple configurations,
i.e. it does not support use of the '--config' option when running
'cmake --build'.  As such, the default configuration (i.e. Debug)
was getting used for all builds.

Use the CMAKE_BUILD_TYPE variable instead, so that we do release
builds, but change one build (ubuntu-20.04 aarch64) to use Debug
as its build type, to keep some build coverage for asserts, etc.

For Vulkan-Loader and OpenCL-ICD-Loader, we do release builds
unconditionally, as we assume there is no need in the CI workflow
to actually run the binaries that are built, and therefore no need
for any additional debug info.

Signed-off-by: Stuart Brady <stuart.brady@arm.com>

* Fix more warnings in math_brute_force (#1502)

* Fix "‘nadj’ may be used uninitialized in this function
   [-Werror=maybe-uninitialized]".

 * Fix "specified bound 4096 equals destination size
   [-Werror=stringop-truncation]".

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Improve MTdataHolder design and use it in math_brute_force (#1490)

Improve the design of the MTdataHolder wrapper:

 * Make it a class instead of a struct with a private member, to make
   it clearer that there is no direct access to the MTdata member.

 * Make the 1-arg constructor `explicit` to avoid unintended
   conversions.

 * Forbid copy construction/assignment as MTdataHolder is never
   initialised from an MTdataHolder object in the codebase.

 * Define move construction/assignment as per the "rule of five".

Use the MTdataHolder class throughout math_brute_force, to simplify
code by avoiding manual resource management.

Original patch by Marco Antognini.

Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
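
The design points listed for MTdataHolder can be sketched on a stand-in resource. The `Rng` type, `rng_init`/`rng_free` and `RngHolder` below are hypothetical and only mirror the shape of a C-style handle with an init/free pair; they are not the harness code.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical C-style resource with an init/free pair.
struct Rng
{
    std::uint32_t seed;
};
int g_live = 0; // number of live Rng objects, to make ownership observable
Rng *rng_init(std::uint32_t seed) { ++g_live; return new Rng{ seed }; }
void rng_free(Rng *rng) { --g_live; delete rng; }

class RngHolder {
public:
    RngHolder() = default;
    // `explicit` avoids unintended conversions from an integer seed.
    explicit RngHolder(std::uint32_t seed): m_rng(rng_init(seed)) {}

    // Copying is forbidden; ownership can only be moved.
    RngHolder(const RngHolder &) = delete;
    RngHolder &operator=(const RngHolder &) = delete;

    RngHolder(RngHolder &&other) noexcept: m_rng(other.m_rng)
    {
        other.m_rng = nullptr;
    }
    RngHolder &operator=(RngHolder &&other) noexcept
    {
        if (this != &other)
        {
            reset();
            m_rng = other.m_rng;
            other.m_rng = nullptr;
        }
        return *this;
    }
    ~RngHolder() { reset(); }

    // Access goes through the conversion; the member itself is private.
    operator Rng *() const
    {
        assert(m_rng);
        return m_rng;
    }

private:
    void reset()
    {
        if (m_rng) rng_free(m_rng);
        m_rng = nullptr;
    }
    Rng *m_rng = nullptr;
};
```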

* Fix memory oob problem in test half (#1489)

Allocate memory for argc arguments
instead of argc - 1.

* [NFC] Enable -Wall for math_brute_force (#1477)

math_brute_force compiles cleanly with `-Wall` currently, so avoid
regressing from that state.  Ideally we would enable `-Wall` in the
top-level CMakeLists.txt, but other tests do not compile cleanly with
`-Wall` yet.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Update extension list of test_compiler (#1507)

* Update extension list of test_compiler

Update extension list of test_compiler
with missing external memory and semaphore
extensions.

* cmake: Add set_gnulike_module_compile_flags (#1510)

Factor out a macro to set module-specific compilation flags for
GNU-like compilers.  This simplifies setting compilation flags per
test.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Remove __DATE__ and __TIME__ usage (#1506)

These macros make the build non-deterministic.

* [NFC] Fix typo in clang-format directive (#1512)

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Creating common functions for image/kernel_read_write read tests (#1141)

* Make InitFloatCoords suitable for all image types

Contributes #616

* Create common functions neutral for image types

Remove 3D specific code from common test_read_image so using
it for other image types is simpler in following patches

Contributes #616

* Removing unused code

Tidying commented out or unnecessary code

Contributes #616

Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>

* Restoring 'lod' variable name

Contributes #616

* Default cases to handle unsupported image types

Contributes #616

* Resolving build issues

Contributes #616

* Fix formatting

Contributes #616

* Using TEST_FAIL as an error code.

Contributes #616

* Add static keyword, improve error handling

Contributes #616

* Fix build errors with least disruption

Contributes #616

Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>

* SVM: Fix memory allocation size. (#1514)

* SVM: Fix memory allocation size.

9ad48998 generally made memory allocation and mapping consistent with a
size of size_t. Apply that fix to the final two allocations.

* check-format fixes

Co-authored-by: spauls <spauls@qti.qualcomm.com>

* [NFC] Avoid mixing signed and unsigned in subhelpers run (#1505)

Fix a `-Wsign-compare` warning in the `run()` function, which resulted
in many repeated warnings when compiling with `-Wall` due to the many
template instantiations.

Both `clGetKernelSubGroupInfo` queries return a `size_t`, so it is
unclear why the results of these queries were being cast to `int`.
The `dynsc` uses don't seem to work with negative values, so make the
field unsigned.
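
As an illustration of why mixing signed and unsigned operands is worth
avoiding, here is a small sketch. The helper names `mixed_less` and
`unsigned_less` are hypothetical and do not come from the subgroups tests.

```cpp
#include <cstddef>

// Mixed comparison: the int operand is converted to an unsigned type before
// comparing, so a negative value turns into a very large one. The cast spells
// out the conversion that -Wsign-compare warns about when it is implicit.
inline bool mixed_less(int lhs, std::size_t rhs)
{
    return static_cast<std::size_t>(lhs) < rhs;
}

// Same-type comparison: nothing is converted, so there is nothing to warn
// about and no surprising result.
inline bool unsigned_less(std::size_t lhs, std::size_t rhs)
{
    return lhs < rhs;
}
```

Keeping the query results as `size_t` end to end, as the commit does, avoids
the conversion altogether.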

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* [NFC] clang-format test_atomics (#1516)

Add some clang-format off/on comments to keep lists and kernel code
readable.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* [NFC] atomics: Remove set-but-unused "succeed" variables (#1517)

The "succeed" variables are never read and they don't seem to serve
any purpose that's not already provided by the "fail" variables.

In `add_index_bin_test` the "fail" variable is also set but unused,
but that may require an actual fix, so leaving that out of this
commit.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* math_brute_force: Fix -Wformat warnings (#1518)

* math_brute_force: Fix -Wformat warnings

The main sources of warnings were:

 * Printing of 64-bit types, which is now done using the `PRI*64`
   macros from <cinttypes> to ensure portability across 32 and 64-bit
   builds.

 * Printing of `size_t` types that lacked a `z` length modifier.

 * Printing of values with a `z` length modifier that weren't a
   `size_t` type.
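
A minimal sketch of the two portable patterns named above, assuming a
hypothetical helper `format_result` that is not part of math_brute_force:

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a 64-bit value and a size_t index without relying on the width of
// long or long long on the target platform.
inline std::string format_result(std::int64_t value, std::size_t index)
{
    char buf[64];
    // PRId64 expands to the right length modifier for int64_t;
    // %zu is the specifier for size_t.
    std::snprintf(buf, sizeof(buf), "value=%" PRId64 " index=%zu", value,
                  index);
    return std::string(buf);
}
```

The same format string then compiles without `-Wformat` warnings on both
32-bit and 64-bit builds.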

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* [NFC] math_brute_force: clang-format after -Wformat changes

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Add Python 3 support to run_conformance.py (#1470)

* Add missing type declaration (#1520)

Add a missing type declaration to OpenCL C code strings in 2D async copy
tests.

* pipes: Fix typos in skip messages (#1523)

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* atomics: Fix -Wformat warnings (#1519)

The main sources of warnings were:

 * Printing of `i` which is a `size_t` requiring the `%zu` specifier.

 * Printing of `cl_long` which is now done using the `PRId64` macro
   to ensure portability across 32 and 64-bit builds.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* External sharing new updates (#1482)

* Fix enqueue_flags test to use correct barrier type.

Currently, enqueue_flags test uses CLK_LOCAL_MEM_FENCE.
Use CLK_GLOBAL_MEM_FENCE instead as all threads across work-groups
need to wait here.

* Add check for support for Read-Write images

Read-Write images are required by OpenCL 2.x.
Read-Write image tests are already being skipped
for 1.x devices.
With OpenCL 3.0, where read-write images are optional,
the tests should be run or skipped
depending on the implementation support.

Add a check to decide if Read-Write images are
supported or required to be supported depending
on OpenCL version and decide if the tests should
be run or skipped.
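
The decision described above could look roughly like the sketch below.
`CLVersionSketch`, `RWImageDecision` and `decide_rw_image_tests` are
hypothetical names, and how the device reports support is assumed to have
been queried elsewhere and passed in as a flag.

```cpp
// Hypothetical version type; the real harness has its own representation.
struct CLVersionSketch
{
    int major_version;
    int minor_version;
};

enum class RWImageDecision
{
    Skip,
    Run
};

// Decides whether read-write image tests should run on a device.
inline RWImageDecision decide_rw_image_tests(CLVersionSketch version,
                                             bool device_reports_support)
{
    // 1.x: read-write images do not exist, so the tests are skipped.
    if (version.major_version < 2) return RWImageDecision::Skip;

    // 2.x: read-write images are required, so the tests always run.
    if (version.major_version == 2) return RWImageDecision::Run;

    // 3.0 and later: optional, so follow what the implementation reports.
    return device_reports_support ? RWImageDecision::Run
                                  : RWImageDecision::Skip;
}
```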

Fixes issue #894

* Fix formatting in case of Read-Write image checks.

Fix formatting in case of Read-write image checks.
Also, combine two ifs into one in case of
kernel_read_write tests.

* Fix some more formatting for RW-image checks

Remove unnecessary spaces at various places.
Also, fix lengthy lines.

* Fix malloc-size calculation in test imagedim

unsigned char size is silently assumed to be 1
in imagedim test of test_basic.
Pass sizeof(type) in malloc size calculation.
Also, change loop variable from signed to unsigned.
Add checks for null pointer for malloced memory.
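
A minimal sketch of the allocation pattern, assuming a hypothetical helper
`alloc_elements` that is not part of test_basic. The overflow guard is an
addition of the sketch, not something the commit describes.

```cpp
#include <cstddef>
#include <cstdlib>

// Allocates room for `count` elements of T. The request is sized with
// sizeof(T) rather than assuming a one-byte element. Returns nullptr if the
// size computation would overflow or if malloc fails, so callers must check.
template <typename T> T* alloc_elements(std::size_t count)
{
    // Guard against count * sizeof(T) wrapping around (sketch-only addition).
    if (count > static_cast<std::size_t>(-1) / sizeof(T)) return nullptr;
    return static_cast<T*>(std::malloc(count * sizeof(T)));
}
```

A caller would check the result for null before use and index the buffer
with an unsigned loop variable such as `std::size_t`.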

* Initial CTS for external sharing extensions

Initial set of tests for below extensions
with Vulkan as producer
1. cl_khr_external_memory
2. cl_khr_external_memory_win32
3. cl_khr_external_memory_opaque_fd
4. cl_khr_external_semaphore
5. cl_khr_external_semaphore_win32
6. cl_khr_external_semaphore_opaque_fd

* Updates to external sharing CTS

Updates to external sharing CTS
1. Fix some build issues to remove unnecessary, non-existent files
2. Add new tests for platform and device queries.
3. Some added checks for VK Support.

* Update CTS build script for Vulkan Headers

Update CTS build to clone Vulkan Headers
repo and pass it to CTS build
in preparation for external memory
and semaphore tests

* Fix Vulkan header path

Fix Vulkan header include path.

* Add Vulkan loader dependency

Vulkan loader is required to build
test_vulkan of OpenCL-CTS.
Clone and build Vulkan loader as prerequisite
to OpenCL-CTS.

* Fix Vulkan loader path in test_vulkan

Remove arch/os suffix in Vulkan loader path
to match vulkan loader repo build.

* Fix warnings around getHandle API.

Return type of getHandle is defined
differently based on win or linux builds.
Use appropriate guards when using API
at other places.
While at it remove duplicate definition
of ARRAY_SIZE.

* Use ARRAY_SIZE in harness.

Use already defined ARRAY_SIZE macro
from test_harness.

* Fix build issues for test_vulkan

Fix build issues for test_vulkan
1. Add cl_ext.h in common files
2. Replace cl_mem_properties_khr with cl_mem_properties
3. Replace cl_external_mem_handle_type_khr with
cl_external_memory_handle_type_khr
4. Type-cast malloc as required.

* Fix code formatting.

Fix code formatting to
get CTS CI builds clean.

* Fix formatting fixes part-2

Another set of formatting fixes.

* Fix code formatting part-3

Some more code formatting fixes.

* Fix code formatting issues part-4

More code formatting fixes.

* Formatting fixes part-5

Some more formatting fixes

* Fix formatting part-6

More formatting fixes continued.

* Code formatting fixes part-7

Code formatting fixes for image

* Code formatting fixes part-8

Fixes for platform and device query tests.

* Code formatting fixes part-9

More formatting fixes for vulkan_wrapper

* Code formatting fixes part-10

More fixes to wrapper header

* Code formatting fixes part-11

Formatting fixes for api_list

* Code formatting fixes part-12

Formatting fixes for api_list_map.

* Code formatting changes part-13

Code formatting changes for utility.

* Code formatting fixes part-15

Formatting fixes for wrapper.

* Misc Code formatting fixes

Some more misc code formatting fixes.

* Fix build breaks due to code formatting

Fix build issues that arose from recent
code formatting changes.

* Fix presubmit script after merge

Fix presubmit script after merge conflicts.

* Fix Vulkan loader build in presubmit script.

Use cmake ninja and appropriate toolchain
for Vulkan loader dependency to fix
linking issue on arm/aarch64.

* Use static array sizes

Use static array sizes to fix
windows builds.

* Some left-out formatting fixes.

Fix remaining formatting issues.

* Fix harness header path

Fix harness header path.
While at it, remove Misc and test pragma.

* Add/Fix license information

Add Khronos License info for test_vulkan.
Replace Apple license with Khronos
as applicable.

* Fix headers for Mac OSX builds.

Use appropriate headers for
Mac OSX builds

* Fix Mac OSX builds.

Use appropriate headers for
Mac OSX builds.
Also, fix some build issues
due to type-casting.

* Fix new code formatting issues

Fix new code formatting issues
with recent MacOS fixes.

* Add back missing case statement

Add back missing case statement
that was accidentally removed.

* Disable USE_GAS for Vulkan Loader build.

Disable USE_GAS for Vulkan Loader build
to fix aarch64 build.

* Fixes to OpenCL external sharing tests

Fix clReleaseSemaphore() API.
Fix copyright year.
Some other minor fixes.

* Improvements to OpenCL external sharing CTS

Use SPIR-V shaders instead of NV extension path
from GLSL to Vulkan shaders.
Fixes for lower end GPUs to use limited memory.
Update copyright year at some more places.

* Fix new code formatting issues.

Fix code formatting issues with
recent changes for external sharing
tests.

* More formatting fixes.

More formatting fixes for recent
updates to external sharing tests.

* Final code formatting fixes.

Minor formatting fixes to get
format checks clean.

* remove implicit conversion to pointer to fix 32-bit compile (#1488)

* remove implicit conversion to pointer to fix 32-bit compile

* fix formatting

* Cap CL_DEVICE_MAX_MEM_ALLOC_SIZE to SIZE_MAX (#1501)

* Cap CL_DEVICE_MAX_MEM_ALLOC_SIZE to SIZE_MAX

Cap CL_DEVICE_MAX_MEM_ALLOC_SIZE to SIZE_MAX
when CL_DEVICE_GLOBAL_MEM_SIZE is capped to SIZE_MAX.
test_allocation caps the value of GLOBAL_MEM_SIZE to SIZE_MAX
if it exceeds SIZE_MAX (whose value depends on platform bitness),
but doesn’t modify MAX_MEM_ALLOC_SIZE the same way.
As a result, MAX_MEM_ALLOC_SIZE becomes greater than GLOBAL_MEM_SIZE
and the test fails.

Set MAX_MEM_ALLOC_SIZE to GLOBAL_MEM_SIZE when it exceeds SIZE_MAX.
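
The capping can be sketched as follows. `MemLimits` and `cap_to_size_max`
are hypothetical names, and the `size_max` parameter exists only so that a
32-bit limit can be simulated on a 64-bit host; test_allocation itself is
not structured this way.

```cpp
#include <cstddef>
#include <cstdint>

// Device limits as reported by the two queries (both are 64-bit values).
struct MemLimits
{
    std::uint64_t global_mem_size;
    std::uint64_t max_alloc_size;
};

// Caps both limits to what the host can address. Capping only the global
// size would leave max_alloc_size larger than global_mem_size.
inline MemLimits cap_to_size_max(MemLimits reported,
                                 std::uint64_t size_max = SIZE_MAX)
{
    MemLimits capped = reported;
    if (capped.global_mem_size > size_max) capped.global_mem_size = size_max;
    if (capped.max_alloc_size > size_max) capped.max_alloc_size = size_max;
    return capped;
}
```

After capping, the maximum allocation size can no longer exceed the global
memory size as long as that held for the reported values.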

OpenCL-CTS #1022

* Factor out GetTernaryKernel (#1511)

Use a common function to create the kernel source code for testing
3-argument math builtins.  This reduces code duplication.  1-argument
and 2-argument math kernel construction will be factored out in future
work.

Change the kernels to use preprocessor defines for argument types and
undef values, to make the CTS code easier to read.

Co-authored-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Signed-off-by: Kévin Petit <kpet@free.fr>
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
Signed-off-by: Marco Cattani <marco.cattani@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
Co-authored-by: Kévin Petit <kpet@free.fr>
Co-authored-by: James Price <jrprice@google.com>
Co-authored-by: BKoscielak <bartosz.koscielak@intel.com>
Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>
Co-authored-by: Grzegorz Wawiorko <grzegorz.wawiorko@intel.com>
Co-authored-by: Sreelakshmi Haridas Maruthur <sharidas@quicinc.com>
Co-authored-by: spauls <spauls@qti.qualcomm.com>
Co-authored-by: kalchr01 <83217667+kalchr01@users.noreply.github.com>
Co-authored-by: Feng Zou <feng.zou@intel.com>
Co-authored-by: Senran (Stephen) Zhang <senran.zhang@intel.com>
Co-authored-by: Jeremy Kemp <jeremy@jeremykemp.co.uk>
Co-authored-by: Stuart Brady <stuart.brady@arm.com>
Co-authored-by: marcat03 <94451804+marcat03@users.noreply.github.com>
Co-authored-by: Ewan Crawford <ewan@codeplay.com>
Co-authored-by: oramirez <oramirez@qti.qualcomm.com>
Co-authored-by: Jim Lewis <j.lewis1@samsung.com>
Co-authored-by: Alastair Murray <alastair.murray@codeplay.com>
Co-authored-by: Jack Frankland <30410009+FranklandJack@users.noreply.github.com>
Co-authored-by: Jason Tang <jason.tang@amd.com>
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Co-authored-by: Romaric Jodin <89833130+rjodinchr@users.noreply.github.com>
Co-authored-by: Karol Herbst <karolherbst@gmail.com>
Co-authored-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Co-authored-by: paulfradgley <39525348+paulfradgley@users.noreply.github.com>
Co-authored-by: jansol <jhs@psonet.com>
Co-authored-by: Ahmed <36049290+AhmedAmraniAkdi@users.noreply.github.com>
Co-authored-by: Wenju He <wenju.he@intel.com>
Co-authored-by: Nikhil Joshi <nikhilj@nvidia.com>
Co-authored-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Co-authored-by: Callum Fare <callum@codeplay.com>
Co-authored-by: Matthias Diener <matthias.diener@gmail.com>
Co-authored-by: stoneforestwhu <stoneforestwhu@gmail.com>
Co-authored-by: niranjanjoshi121 <43807392+niranjanjoshi121@users.noreply.github.com>
Co-authored-by: Kévin Petit <kevin.petit@arm.com>
Co-authored-by: Oualid Khelifi <oualid.khelifi@arm.com>
Co-authored-by: Krzysztof Kosiński <tweenk.pl@gmail.com>
Co-authored-by: ellnor01 <51320439+ellnor01@users.noreply.github.com>
Co-authored-by: victzhan <111778801+victzhan@users.noreply.github.com>
Co-authored-by: Marco Antognini <marco.antognini@arm.com>