DPCPP offload to accelerator consumes 100% CPU

I am unsure whether this is by design or an issue with the DPCPP runtime. However, every time I select an accelerator device, set up a queue on it, and submit a kernel, CPU utilization stays at 100% for the entire time the kernel executes on the device, i.e., one full core is occupied.

This wasn't the case when using C for Metal. I believe even Level Zero allows the CPU to avoid busy-waiting while the accelerator is executing a job. The same holds for CUDA, where CPU utilization is < 5% after offloading to the GPU.
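
To make the distinction concrete, here is a minimal host-only sketch of the two waiting strategies. It does not use SYCL, and it is not a claim about how the DPCPP runtime waits internally. The "kernel" is just a sleeping thread standing in for an accelerator, and the names `launch_fake_kernel`, `cpu_seconds_spin_wait`, and `cpu_seconds_blocking_wait` are hypothetical helpers made up for this illustration. It assumes a POSIX-like system where `std::clock` reports process CPU time.

```cpp
#include <chrono>
#include <ctime>
#include <future>
#include <thread>

// Stand-in for a kernel running on an accelerator (hypothetical helper):
// the "device" is busy for `ms` milliseconds of wall time, but the thread
// only sleeps, so it uses almost no host CPU while it "runs".
inline std::future<void> launch_fake_kernel(int ms) {
    return std::async(std::launch::async, [ms] {
        std::this_thread::sleep_for(std::chrono::milliseconds(ms));
    });
}

// Host CPU seconds consumed while busy-polling for completion.
// Assumes std::clock measures process CPU time (true on POSIX systems).
inline double cpu_seconds_spin_wait(int ms) {
    std::future<void> done = launch_fake_kernel(ms);
    const std::clock_t start = std::clock();
    // Poll with a zero timeout: the host thread never yields the core.
    while (done.wait_for(std::chrono::seconds(0)) != std::future_status::ready) {
    }
    const std::clock_t end = std::clock();
    done.get();
    return static_cast<double>(end - start) / CLOCKS_PER_SEC;
}

// Host CPU seconds consumed while blocked until completion.
inline double cpu_seconds_blocking_wait(int ms) {
    std::future<void> done = launch_fake_kernel(ms);
    const std::clock_t start = std::clock();
    // Blocking wait: the OS parks the host thread until the result is ready.
    done.wait();
    const std::clock_t end = std::clock();
    done.get();
    return static_cast<double>(end - start) / CLOCKS_PER_SEC;
}
```

Calling `cpu_seconds_spin_wait(200)` and `cpu_seconds_blocking_wait(200)` and printing both values should show the spin variant burning roughly the whole wall-clock duration in CPU time, while the blocking variant stays close to zero. The first is the behavior I am seeing; the second is what I would expect from an offload runtime.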

Is this something that is going to be addressed? This is critical to every single customer I work with.

DPCPP offload to accelerator consumes 100% CPU #404
