Replies: 2 comments 9 replies
-
I agree, all of the SSCP builtin interface should rely on the
Looking forward to your work! :)
-
On a related note: I have noticed that using 64-bit integers for indexes sometimes generates more arithmetic ops than native CUDA (threadIdx and co. return int32; they are converted to int64 to compute global_linear_id, then converted back to int32 in user code), whereas DPC++ has a special compiler flag for it. Not a big problem per se (it has never been a bottleneck for us 😄), but it might be something to consider in the scope of this discussion.
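To make the conversion chain concrete, here is a minimal host-side C++ sketch. It is not the hipSYCL or CUDA implementation: `launch_coords`, `global_id_64`, `global_id_32` and `user_index` are hypothetical names used only to show where the widening and narrowing happen.

```cpp
#include <cstdint>

// Hypothetical stand-in for the 32-bit values that threadIdx.x, blockIdx.x
// and blockDim.x provide in CUDA; these are not the real builtins.
struct launch_coords {
  std::uint32_t thread_idx;
  std::uint32_t block_idx;
  std::uint32_t block_dim;
};

// 64-bit path: the 32-bit inputs are widened first, so the multiply-add
// is carried out in 64-bit arithmetic.
inline std::uint64_t global_id_64(const launch_coords& c) {
  return static_cast<std::uint64_t>(c.block_idx) * c.block_dim + c.thread_idx;
}

// 32-bit path: no widening, the arithmetic stays in 32 bits. The result
// wraps around if the global range does not fit in 32 bits.
inline std::uint32_t global_id_32(const launch_coords& c) {
  return c.block_idx * c.block_dim + c.thread_idx;
}

// What user code often ends up doing: narrowing the 64-bit id back to int
// before using it as an array index.
inline int user_index(const launch_coords& c) {
  return static_cast<int>(global_id_64(c));
}
```

The 32-bit variant is only safe when the whole global range is known to fit in 32 bits, which is presumably the guarantee such a compiler flag asks the user to make.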
-
For functions like `__hipsycl_sscp_get_local_id_x` (https://github.com/OpenSYCL/OpenSYCL/blob/cd2c72f459dd449ca40a61b6b0c9671a06d7d1c7/include/hipSYCL/sycl/libkernel/sscp/builtins/core.hpp#L35): when the kernel is compiled to HCF, `size_t` has one width, e.g. `uint64`; but when `libkernel-sscp-*-core.bc` is compiled for some other GPU architecture (other than `nvptx64`, `amdgcn` or `spir64`; will upstream once done), `size_t` may be defined as `uint32`. Linking the two LLVM IRs then leads to expressions like this:
I don't know whether this is expected, but such a `bitcast` seems strange to me, so I suggest using `__hipsycl_uint64` instead of `size_t` here.