[SYCL 2020][USM] Add USM memory management functions and usm_allocator #308

illuhad · 2020-09-16T18:43:13Z

This adds USM memory management functions, pointer query functions, and usm_allocator. USM tests will follow later as we will need more functionality (in particular memcpy) for proper testing.

Known caveats:

As far as I am aware, CUDA and HIP do not allow specifying alignments for allocations. The SYCL aligned memory allocation functions therefore currently ignore the alignment argument on CUDA and HIP. On CPU it should work as expected.
HIP/ROCm only has partial support for shared allocations (unified memory in CUDA terminology). My understanding is that it is currently implemented as mapped host memory in ROCm, which means that performance will probably be very bad as every memory access goes across PCIe. I suspect that this is due to hardware limitations, and I expect that this will change with future AMD hardware. After all, AMD has promised a "coherent" architecture for Frontier and El Capitan. hipSYCL correctly calls hipMallocManaged() for shared allocations, so when this is improved hipSYCL will be ready.
Because of the previous point, I believe that there currently is no way in ROCm to distinguish between "shared" and (pinned) "host" allocations. So, I expect that pointer queries for SYCL shared allocations would identify as SYCL host allocations. We use the HIP pointer property hipMemoryTypeUnified to check for shared allocations; the HIP documentation says that it is currently not yet used but I suspect that this would be the right one when this is properly implemented in ROCm.

aTom3333 · 2020-09-16T20:35:35Z

As far as I am aware, CUDA and HIP do not allow specifying alignments for allocations. The SYCL aligned memory allocation functions therefore currently ignore the alignment argument on CUDA and HIP. On CPU it should work as expected.

Can't you align the pointers yourself by allocating a bit more? Something like:

void *cuda_allocator::allocate(size_t min_alignment, size_t size_bytes)
{
  void *ptr, *originalPtr;
  auto err = cudaSetDevice(_dev);
  err = cudaMalloc(&originalPtr, size_bytes + min_alignment);
  ptr = (orignalPtr & ~(min_alignment-1)) + min_alignment;
  int offset = ptr - originalPtr;
  *(static_cast<unsigned char*>(ptr) - 1) = offset;

  if (err != cudaSuccess) {
    register_error(__hipsycl_here(),
                   error_info{"cuda_allocator: cudaMalloc() failed",
                              error_code{"CUDA", err},
                              error_type::memory_allocation_error});
    return nullptr;
  }

  return ptr;
}

void cuda_allocator::free(void *mem)
{
  int offset = *(static_cast<unsigned char*>(mem) - 1);
  void *originalPtr = mem - offset;
  auto err = cudaFree(originalPtr);
  if (err != cudaSuccess) {
    register_error(__hipsycl_here(),
                   error_info{"cuda_allocator: cudaFree() failed",
                              error_code{"CUDA", err},
                              error_type::memory_allocation_error});
  }
}

illuhad · 2020-09-16T21:19:57Z

That might work, but it seems very much like a hack to me. To be honest I'm not sure if it's worth it. CUDA and HIP, while not allowing custom alignments, already guarantee that all allocations will be suitably aligned for any type - I assume this means either something like double4 or even alignment on page boundaries.

EDIT: Also, this approach would in general require launching a kernel or cudaMemset to properly store the offset as pure device allocations (non-shared allocations) are not accessible from the host. And in free() we would need to read from GPU memory across PCIe. Alternatively we could have some map to associate all user pointers with their offset, but that also seems like a lot of effort..

aTom3333 · 2020-09-16T21:38:45Z

If they guarantee enough alignment I agree it is unnecessary. Even more so with the fact that kernel would need to be launched

[SYCL 2020] Add USM memory management functions

77015f1

illuhad force-pushed the feature/sycl-2020/add-usm-memory-mgmt-functions branch from 3228343 to 77015f1 Compare September 16, 2020 23:59

illuhad merged commit 814b296 into develop Sep 17, 2020

illuhad deleted the feature/sycl-2020/add-usm-memory-mgmt-functions branch September 17, 2020 17:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SYCL 2020][USM] Add USM memory management functions and usm_allocator #308

[SYCL 2020][USM] Add USM memory management functions and usm_allocator #308

illuhad commented Sep 16, 2020 •

edited

aTom3333 commented Sep 16, 2020

illuhad commented Sep 16, 2020 •

edited

aTom3333 commented Sep 16, 2020

[SYCL 2020][USM] Add USM memory management functions and usm_allocator #308

[SYCL 2020][USM] Add USM memory management functions and usm_allocator #308

Conversation

illuhad commented Sep 16, 2020 • edited

aTom3333 commented Sep 16, 2020

illuhad commented Sep 16, 2020 • edited

aTom3333 commented Sep 16, 2020

illuhad commented Sep 16, 2020 •

edited

illuhad commented Sep 16, 2020 •

edited