Closed
Labels
bug (Something isn't working), cuda (CUDA back-end), performance (Performance related issues)
Description
When using write and discard_write accessors with the CUDA backend, a host-to-device copy is made, even though the access is supposed to be write-only. A write-only access does not need such a copy, so it wastes resources.
queue.submit([&] (cl::sycl::handler& cgh) {
  // The input is only read; the output is write-only and its previous contents are discarded.
  auto input_acc = input.get_access<sycl::access::mode::read>(cgh);
  auto output_acc = output.get_access<sycl::access::mode::discard_write>(cgh);
  auto maxRange = sycl::nd_range<2>(sycl::range<2>{height, width / 4}, sycl::range<2>(1, 128));
  cgh.parallel_for<class test>(maxRange, [=](sycl::nd_item<2> item){
    // Each work-item overwrites its own element of the output.
    output_acc[item.get_global_id()] = 0;
  });
});

I have never written an issue like this before, so if you need additional information, just ask.
