hipSYCL generates extra empty kernels #899

Closed
misos1 opened this issue Dec 21, 2022 · 1 comment
Labels
discussion General discussion about something

misos1 commented Dec 21, 2022

With this code, 5 kernels are generated for a single GPU target. It seems each kernel lambda generates one kernel with the actual content plus one extra empty kernel (and one additional empty kernel is generated regardless of how many kernels there are). There are even more when I use "named" kernels.

	sycl::queue q;
	q.single_task([]()
	{
		__hipsycl_if_target_hip(asm("s_nop 1"));
	}).wait();
	q.single_task([]()
	{
		__hipsycl_if_target_hip(asm("s_nop 2"));
	}).wait();

Only the two kernels whose names start with _Z16 contain actual code.

_Z16__hipsycl_kernelIZ4mainEUlvE_Evv.kd
_Z30__hipsycl_kernel_name_templateIZ4mainEUlvE0_Evv.kd
_Z30__hipsycl_kernel_name_templateIZ4mainEUlvE_Evv.kd
_Z16__hipsycl_kernelIZ4mainEUlvE0_Evv.kd
_Z30__hipsycl_kernel_name_templateI24__hipsycl_unnamed_kernelEvv.kd

misos1 added the discussion label Dec 21, 2022

illuhad (Collaborator) commented Dec 22, 2022

This is expected behavior.

There is no guarantee that one SYCL kernel will translate to one actual kernel. E.g. the implementation might always decide to multiversion kernels based on some argument properties.
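
To illustrate what multiversioning could mean here, the following is a minimal host-only sketch in plain C++, not hipSYCL's actual implementation. The names `scale_kernel` and `launch_scale`, and the choice of 16-byte pointer alignment as the "argument property", are all hypothetical. The point is only that one logical kernel in the source can end up as several compiled functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example: one logical kernel, compiled once per value of Aligned.
// Each instantiation is a separate function in the binary, even though the
// source contains a single kernel body.
template <bool Aligned>
void scale_kernel(float* data, std::size_t n, float factor) {
  // Both versions compute the same result; a real compiler could use the
  // Aligned flag to select different load/store instructions.
  for (std::size_t i = 0; i < n; ++i) data[i] *= factor;
}

// Picks a version at launch time based on a property of the argument.
// Returns true if the aligned version was launched, false otherwise.
bool launch_scale(float* data, std::size_t n, float factor) {
  assert(data != nullptr);
  const bool aligned = reinterpret_cast<std::uintptr_t>(data) % 16 == 0;
  if (aligned) {
    scale_kernel<true>(data, n, factor);
  } else {
    scale_kernel<false>(data, n, factor);
  }
  return aligned;
}
```

In this sketch a symbol listing would show two functions, `scale_kernel<true>` and `scale_kernel<false>`, for what the user wrote as a single kernel.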

The empty kernels that you are seeing are dummy kernels that we need in order to generate host-side visible kernel names. This is necessary due to the restrictions that clang has around its __builtin_get_device_side_mangled_name() builtin, which only works on __global__ functions. Since SYCL kernels are not __global__ during parsing and semantic analysis, we cannot use this builtin directly. So we have to generate dummy __global__ functions to which we can apply the builtin, and then borrow the generated name (hence __hipsycl_kernel_name_template).

These dummy kernels have no negative impact on the generated code.
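
To tell the dummy name-template kernels apart from the real ones, the symbol names from the listing in the original post can be demangled. The sketch below assumes a GCC or Clang toolchain that provides `abi::__cxa_demangle` in `<cxxabi.h>`. The helpers `strip_kd`, `demangle`, and `is_name_template` are hypothetical names introduced for illustration. The `.kd` suffix (the AMDGPU kernel descriptor symbol suffix) is stripped first because it is not part of the mangled C++ name.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Removes a trailing ".kd" from a symbol name, if present.
std::string strip_kd(std::string symbol) {
  const std::string suffix = ".kd";
  if (symbol.size() >= suffix.size() &&
      symbol.compare(symbol.size() - suffix.size(), suffix.size(), suffix) == 0) {
    symbol.erase(symbol.size() - suffix.size());
  }
  return symbol;
}

// Demangles an Itanium C++ ABI symbol name.
// Returns the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
  assert(mangled != nullptr);
  int status = 0;
  char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) return mangled;
  std::string result(out);
  std::free(out);  // __cxa_demangle allocates the buffer with malloc
  return result;
}

// True if the demangled symbol is an instantiation of the dummy
// __hipsycl_kernel_name_template function rather than a real kernel.
bool is_name_template(const std::string& demangled) {
  return demangled.find("__hipsycl_kernel_name_template<") != std::string::npos;
}
```

Calling `demangle(strip_kd(name).c_str())` on each entry of the listing should show the lambda type from `main` (or `__hipsycl_unnamed_kernel`) as the template argument, which makes the pairing between each real kernel and its name-only counterpart visible.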

illuhad closed this as completed Jan 10, 2023