For CUDA interoperability, as originally suggested in this Julia Discourse thread: https://discourse.julialang.org/t/how-to-use-copy-or-other-array-operations-inside-a-cuda-kernel/55762/8