Device Allocators

The DeviceAllocator is designed for memory allocations on the GPU.

Creating a Device Allocator

To create a DeviceAllocator, users can call the umpire::make_device_allocator host function. This function takes an allocator, the total amount of memory the DeviceAllocator will have, and a unique name for the new DeviceAllocator object, as shown below. A maximum of 64 unique DeviceAllocators can be created at a time.

../../../examples/device-allocator.cpp

When the DeviceAllocator is created, the size parameter passed to the umpire::make_device_allocator function is the total memory, in bytes, available to that allocator. Each call to the allocate function on the GPU atomically increments a counter that offsets a pointer from the start of that memory. Consequently, the combined size of all allocations performed on the device with the DeviceAllocator may not exceed the size given when the allocator was created.
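The counter-and-offset scheme described above can be modeled on the host with standard C++. The sketch below is only an illustration of the idea, not Umpire's implementation: the class name BumpArena, its members, and the choice to return nullptr on exhaustion are all assumptions made for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical host-side model of a bump allocator.
// This is NOT Umpire's DeviceAllocator; names and behavior are illustrative.
class BumpArena {
public:
  BumpArena(char* start, std::size_t total_size)
      : m_start(start), m_total_size(total_size), m_offset(0)
  {
    assert(start != nullptr);
  }

  // Atomically advance the counter by `bytes` and return a pointer to the
  // reserved range. Returns nullptr when the request does not fit (an
  // assumption of this sketch). Note that the counter still advances on a
  // failed request, so later requests fail as well.
  void* allocate(std::size_t bytes)
  {
    const std::size_t old_offset = m_offset.fetch_add(bytes);
    if (old_offset > m_total_size || bytes > m_total_size - old_offset) {
      return nullptr;
    }
    return m_start + old_offset;
  }

  // Total number of bytes the arena was created with.
  std::size_t getTotalSize() const { return m_total_size; }

private:
  char* m_start;
  std::size_t m_total_size;
  std::atomic<std::size_t> m_offset;
};
```

Because every allocation only moves the counter forward, individual allocations cannot be returned to the arena, which is why the sum of all requests is bounded by the size chosen at creation.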

To see what the total memory, in bytes, available to the allocator is, simply call the DeviceAllocator::getTotalSize() function.

Retrieving a DeviceAllocator Object

After creating a DeviceAllocator, we can immediately start using it to allocate device memory. To do this, call the umpire::get_device_allocator host/device function, which returns the DeviceAllocator object corresponding to the given name (or ID). Umpire also provides a helper function, umpire::is_device_allocator, to query whether a given name (or ID) corresponds to an existing DeviceAllocator. Below is an example of using the name to obtain the DeviceAllocator object:

../../../examples/device-allocator.cpp

With the umpire::get_device_allocator function, there is no need to pass a DeviceAllocator object through function call stacks, which can become quite complex. Users can instead call this function to obtain the allocator inside whichever host or device function needs it.
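To illustrate the lookup-by-name pattern, here is a small host-side sketch of a fixed-capacity registry. It is not how Umpire stores its allocators: ArenaRegistry, ArenaHandle, and every member below are hypothetical names, and only the limit of 64 entries is taken from the text above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical record describing one registered arena.
struct ArenaHandle {
  const char* name;
  std::size_t total_size;
};

// Hypothetical fixed-capacity registry that hands out handles by name.
// This is NOT Umpire's implementation; it only models the usage pattern.
class ArenaRegistry {
public:
  static constexpr std::size_t max_arenas = 64;  // limit stated in the text

  // Register a new name. Returns its ID, or -1 if the registry is full or
  // the name is already taken (names must be unique).
  int add(const char* name, std::size_t total_size)
  {
    assert(name != nullptr);
    if (m_count == max_arenas || find(name) >= 0) {
      return -1;
    }
    m_entries[m_count] = ArenaHandle{name, total_size};
    return static_cast<int>(m_count++);
  }

  // Analogous in spirit to a query such as "does this name exist?".
  bool contains(const char* name) const { return find(name) >= 0; }

  // Analogous in spirit to fetching the object by name; nullptr if missing.
  const ArenaHandle* get(const char* name) const
  {
    const int id = find(name);
    return id >= 0 ? &m_entries[static_cast<std::size_t>(id)] : nullptr;
  }

private:
  int find(const char* name) const
  {
    for (std::size_t i = 0; i < m_count; ++i) {
      if (std::strcmp(m_entries[i].name, name) == 0) {
        return static_cast<int>(i);
      }
    }
    return -1;
  }

  std::array<ArenaHandle, max_arenas> m_entries{};
  std::size_t m_count = 0;
};
```

With a registry like this, a deeply nested function only needs to know the name, not carry the object through every caller.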

Note

When compiling without relocatable device code (RDC), the UMPIRE_SET_UP_DEVICE_ALLOCATORS() macro must be called in every translation unit that will use the umpire::get_device_allocator function.

Resetting Memory on the DeviceAllocator

The memory that has been used with the DeviceAllocator is only freed at the end of a program, when the ResourceManager is torn down. However, there is a way to overwrite old or outdated data. Users can call the DeviceAllocator::reset() method, which allows old data to be overwritten. Below is an example:

../../../examples/device-allocator.cpp

The above code snippet shows the reset() function being called from the host. A host-side call goes through the ResourceManager and Umpire's memset operation under the hood, so some synchronization is guaranteed. A reset() call made on the device carries no synchronization guarantee, so the user must take care not to reset memory that other GPU threads still need.
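The effect of a reset can be sketched with a host-side model in standard C++. This is an assumption-laden illustration rather than Umpire's code: ResettableArena and its members are hypothetical, zeroing the buffer stands in for the memset operation mentioned above, and the compare-exchange loop is a choice made here so that a failed request leaves the counter unchanged.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical host-side model of a resettable bump arena.
// This is NOT Umpire's DeviceAllocator; names and behavior are illustrative.
class ResettableArena {
public:
  ResettableArena(char* start, std::size_t total_size)
      : m_start(start), m_total_size(total_size), m_offset(0)
  {
    assert(start != nullptr);
  }

  // Reserve `bytes`, or return nullptr if they do not fit. The counter is
  // only updated when the request succeeds.
  void* allocate(std::size_t bytes)
  {
    std::size_t old_offset = m_offset.load();
    do {
      if (bytes > m_total_size - old_offset) {
        return nullptr;
      }
    } while (!m_offset.compare_exchange_weak(old_offset, old_offset + bytes));
    return m_start + old_offset;
  }

  // Zero the buffer and rewind the counter so old data can be overwritten.
  // The memory itself is not freed. Nothing here protects other threads
  // that still hold pointers into the arena.
  void reset()
  {
    std::memset(m_start, 0, m_total_size);
    m_offset.store(0);
  }

  // Bytes currently handed out by the arena.
  std::size_t getCurrentSize() const { return m_offset.load(); }

private:
  char* m_start;
  std::size_t m_total_size;
  std::atomic<std::size_t> m_offset;
};
```

After a reset, the next allocation returns the same address as the very first one, which is exactly why pointers obtained before the reset must no longer be in use.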

To see the current size of the DeviceAllocator (that is, the amount of memory, in bytes, currently in use), call the DeviceAllocator::getCurrentSize() function.

../../../examples/device-allocator.cpp