Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LIBCLC] Add support for more generic atomic operations #7391

Merged
merged 3 commits into from
Dec 14, 2022

Conversation

jchlanda
Copy link
Contributor

@jchlanda jchlanda commented Nov 15, 2022

The diffs are quite hard to follow, but in an essence this patch brings:

  • a new entry, implementing a generic address space for multiple __CLC_NVVM_ATOMIC_XYZ_IMPL, where XYZ stands for CAS, INDEC, LOAD, MAX, MIN, STORE and SUB,
  • fixes the name of mangled function that the IMPL uses,
  • the rest is just formatting to 80 chars.

This patch supersedes: #5849 but it requires the fixes to the remangler from: #7220

Fixes: #7658

@jchlanda jchlanda requested a review from a team as a code owner November 15, 2022 08:46
@jchlanda jchlanda marked this pull request as draft November 15, 2022 08:46
@jchlanda
Copy link
Contributor Author

Changing it to a draft, till #7220 is merged.

@jchlanda
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1446

@jchlanda jchlanda force-pushed the jakub/libclc_generic_atomics_2 branch from b50e560 to e1eeb3d Compare December 12, 2022 12:30
@jchlanda
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1446

@romanovvlad
Copy link
Contributor

@bader Could you please help with reviewing?

@jchlanda
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1446

libclc/ptx-nvidiacl/libspirv/atomic/atomic_cmpxchg.cl Outdated Show resolved Hide resolved
libclc/ptx-nvidiacl/libspirv/atomic/atomic_cmpxchg.cl Outdated Show resolved Hide resolved
jchlanda and others added 2 commits December 13, 2022 09:13
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
@jchlanda
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1446

1 similar comment
@jchlanda
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1446

@jchlanda
Copy link
Contributor Author

Build failure was caused by the discrepancy in the macros, as the use/def were added over 2 commits (d2eb42f and b35194f)

@jchlanda
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1446

1 similar comment
@jchlanda
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1446

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@steffenlarsen
Copy link
Contributor

Verification failures unrelated:

OpenCL:

Timed Out Tests (2):
#7740
SYCL :: Basic/code_location_e2e.cpp

#7741
SYCL :: HostInteropTask/host-task-failure.cpp


Failed Tests (2):
#7742
SYCL :: KernelAndProgram/multiple-kernel-linking.cpp

Also mentioned in #7745
SYCL :: XPTI/kernel/basic.cpp


Level Zero:

Timed Out Tests (2):
#7741
SYCL :: HostInteropTask/host-task-failure.cpp


Failed Tests (19):
#7745
SYCL :: Basic/group_async_copy.cpp

#7744
SYCL :: DeviceLib/imf_fp16_trivial_test.cpp
SYCL :: DeviceLib/imf_fp32_test.cpp
SYCL :: DeviceLib/imf_half_type_cast.cpp

#7743
SYCL :: Reduction/reduction_big_data.cpp
SYCL :: Reduction/reduction_nd_N_vars.cpp
SYCL :: Reduction/reduction_nd_conditional.cpp
SYCL :: Reduction/reduction_nd_dw.cpp
SYCL :: Reduction/reduction_nd_ext_half.cpp
SYCL :: Reduction/reduction_nd_lambda.cpp
SYCL :: Reduction/reduction_range_1d_dw.cpp
SYCL :: Reduction/reduction_range_1d_dw_64bit.cpp
SYCL :: Reduction/reduction_range_1d_rw.cpp
SYCL :: Reduction/reduction_range_2d_dw.cpp
SYCL :: Reduction/reduction_range_2d_rw.cpp
SYCL :: Reduction/reduction_range_3d_dw.cpp
SYCL :: Reduction/reduction_range_N_vars.cpp
SYCL :: Reduction/reduction_range_lambda.cpp
SYCL :: Reduction/reduction_usm.cpp

@steffenlarsen steffenlarsen merged commit e99ead8 into intel:sycl Dec 14, 2022
steffenlarsen pushed a commit to intel/llvm-test-suite that referenced this pull request Dec 14, 2022
@steffenlarsen
Copy link
Contributor

@jchlanda - It looks like some of the newly enabled test-suite tests are failing for CUDA still, specifically due to AtomicStore. See for example https://github.com/intel/llvm/actions/runs/3693963954/jobs/6254968423.

@jchlanda
Copy link
Contributor Author

Interesting, it does look like a generic pointer in those stores, will have a look.

@jchlanda
Copy link
Contributor Author

@jchlanda - It looks like some of the newly enabled test-suite tests are failing for CUDA still, specifically due to AtomicStore. See for example https://github.com/intel/llvm/actions/runs/3693963954/jobs/6254968423.

@steffenlarsen I think I know what's going on here. Seems like there are two problems:

  • hard coded mangled name in libclc is wrong, I've looked in disassembled file and can see instances of _Z19_spirv_AtomicStoreP ... Scope::MemorySemanticsMask ... that :: is most likely a type substitution gone wrong in the name.
  • looks like I've got my check-libclc wrong and it only runs remangler against clc-nvptx64--nvidiacl.bc, whereas we should be probably running it against everything that is generated, i.e.:
builtins.link.clc-nvptx64--nvidiacl.bc     
builtins.link.libspirv-nvptx64--.bc        
builtins.link.libspirv-nvptx64--nvidiacl.bc
builtins.opt.clc-nvptx64--nvidiacl.bc      
builtins.opt.libspirv-nvptx64--.bc
builtins.opt.libspirv-nvptx64--nvidiacl.bc 
clc-nvptx64--nvidiacl.bc                   
libspirv-nvptx64--.bc                      
libspirv-nvptx64--nvidiacl.bc              

which is why it wasn't picked up earlier, as clc-nvptx64--nvidiacl.bc does not contain all the functions.

I'm not sure what is the best course of action here, either reverting the patch, or XFAILING those 4 tests, as I'm unlikely to get the fix today.

@steffenlarsen
Copy link
Contributor

Thank you, @jchlanda ! Since they were just enabled, I think adding XFAIL back to those tests is the best course of action. Your patch seems to still work for a good chunk of them so we may as well keep this functionality, unless you can think of a reason why it may be disruptive?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ptxas: unresolved extern function __spirv_AtomicLoad(long long const*...)
4 participants