Currently, when a PSTL algorithm is called with a gpu execution policy that contains a stream, we don't set the current device to the stream's device:
cuda::device_ref device{cuda::devices[0]};
cuda::stream stream{device};
const auto policy = cuda::execution::gpu.with(cuda::get_stream, stream);
cudaSetDevice(1);
cuda::std::find(policy, ...); // oops, launches work on device 1
Right now it's the user's responsibility to make sure the current device and the stream's device match.
However, this is inconsistent with what we do in cuda::launch, where we ignore the current device and always set it to the stream's device. I think we should fix this and guard all device-related operations in PSTL algorithms with __ensure_current_context, so that the example above behaves as expected:
cuda::device_ref device{cuda::devices[0]};
cuda::stream stream{device};
const auto policy = cuda::execution::gpu.with(cuda::get_stream, stream);
cudaSetDevice(1);
cuda::std::find(policy, ...); // ok, launches work on device 0
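For illustration, the guard pattern proposed above could look roughly like the following. This is only a sketch: the internals of __ensure_current_context are not shown here, so the class name ensure_current_device, the helper launch_on_stream_device, and the mock namespace are all hypothetical. The mock functions stand in for cudaGetDevice / cudaSetDevice so the snippet runs without a GPU.

```cpp
// Mock of the runtime's per-thread "current device".
// Stand-in for cudaGetDevice / cudaSetDevice (assumption: illustration only).
namespace mock {
thread_local int current_device = 0;
int get_device() { return current_device; }
void set_device(int id) { current_device = id; }
} // namespace mock

// Hypothetical RAII guard: makes the stream's device current for the
// lifetime of the guard and restores the previous device on scope exit.
class ensure_current_device {
public:
  explicit ensure_current_device(int target) : previous_(mock::get_device()) {
    if (previous_ != target) {
      mock::set_device(target);
    }
  }

  ~ensure_current_device() {
    if (mock::get_device() != previous_) {
      mock::set_device(previous_);
    }
  }

  ensure_current_device(const ensure_current_device&)            = delete;
  ensure_current_device& operator=(const ensure_current_device&) = delete;

private:
  int previous_; // device that was current before the guard was created
};

// Hypothetical algorithm entry point: every device-related operation runs
// under the guard. Returns the device the work would be launched on.
int launch_on_stream_device(int stream_device) {
  ensure_current_device guard{stream_device};
  return mock::get_device();
}
```

With a guard like this, the caller's current device no longer affects where the work lands, and it is left unchanged once the algorithm returns.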