After implementing a 2D/3D point cloud partitioning algorithm, I noticed the need for some sort of binned array type class that benefits from the automatic memory management of an array.
Imagine a set of bins, each of a different size, that hold indices into an array of values associated with that bin. Performance-wise, it is useful to store all these indices in a single array and then have each bin point to its portion of the index array in some manner. The order of operations for this goes something like:
1. First, determine the bin size
2. Use a scan algorithm to determine where each bin points to in the index array
3. Finally, write to the index array
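The three steps above can be sketched on the host as a count, scan, and fill pass. This is only an illustration in plain C++: the `binned_indices` struct and `build_bins` function are hypothetical names, not part of Scalix, and a distributed version would run each pass as a kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: offsets has num_bins + 1 entries, and bin b owns
// indices[offsets[b]] up to (but not including) indices[offsets[b + 1]].
struct binned_indices {
    std::vector<std::size_t> offsets;
    std::vector<std::size_t> indices;
};

// bin_of[i] is the bin that data element i falls into.
inline binned_indices build_bins(const std::vector<std::size_t>& bin_of,
                                 std::size_t num_bins) {
    binned_indices out;
    out.offsets.assign(num_bins + 1, 0);

    // Step 1: count the elements of each bin, stored one slot to the right.
    for (std::size_t b : bin_of) {
        assert(b < num_bins);
        ++out.offsets[b + 1];
    }

    // Step 2: running sum of the shifted counts, so offsets[b] becomes the
    // start of bin b (an exclusive scan of the bin sizes).
    for (std::size_t b = 0; b < num_bins; ++b) {
        out.offsets[b + 1] += out.offsets[b];
    }

    // Step 3: write each data index into the next free slot of its bin.
    std::vector<std::size_t> cursor(out.offsets.begin(), out.offsets.end() - 1);
    out.indices.resize(bin_of.size());
    for (std::size_t i = 0; i < bin_of.size(); ++i) {
        out.indices[cursor[bin_of[i]]++] = i;
    }
    return out;
}
```

In a parallel version, the per-bin cursor in step 3 would need an atomic increment or a per-bin loop, which is why the choice of work split matters there.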
It is step 3 that poses a problem for our automatic load and data management algorithms. Do we split the work/data according to the bins or the indices? In my implementation, I decided to split the work according to the bins, setting an uneven split of data across devices for the index array using the `set_primary_devices` method. Because the load was split according to the shape of the bin array, all writes to the index array were local to the device, as desired. Further, as there was no way to tell the kernel launcher that the index array was a result array, I had to call `unset_read_mostly` and `set_read_mostly` manually.
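To illustrate why the index array ends up with an uneven split, here is a minimal sketch that derives index-array boundaries from a bin split. It assumes the bins are divided into contiguous, roughly equal ranges per device, which may not match how Scalix actually divides work; `index_split_from_bins` is a hypothetical helper and makes no Scalix calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// offsets has num_bins + 1 entries (start of each bin in the index array).
// Returns num_devices + 1 boundaries: device d owns index-array elements
// [split[d], split[d + 1]), i.e. exactly the indices of the bins it owns.
inline std::vector<std::size_t> index_split_from_bins(
    const std::vector<std::size_t>& offsets, std::size_t num_devices) {
    assert(!offsets.empty() && num_devices > 0);
    const std::size_t num_bins = offsets.size() - 1;

    std::vector<std::size_t> split(num_devices + 1);
    for (std::size_t d = 0; d <= num_devices; ++d) {
        // Assumption: bins are split evenly by count across devices.
        const std::size_t first_bin = d * num_bins / num_devices;
        split[d] = offsets[first_bin];
    }
    return split;
}
```

Because bin sizes differ, the resulting ranges are generally unequal in length even though each device owns about the same number of bins.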
As someone who is very familiar with the inner workings of Scalix, I didn't find this incredibly difficult. A basic user, however, might struggle to know how to set up the problem to get good scaling performance.
For this reason, I would like to implement a `binned_array` class that takes a type, the raw data to be binned, and an index generator that maps from a data index to a bin index. It will handle all the nitty-gritty details of actually setting up the `binned_array` in a distributed fashion. From there, it will provide read-only access to the binned data.
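As a rough idea of the interface, here is a host-only sketch of such a class. Everything in it is an assumption for discussion: the constructor signature, the `num_bins` parameter, and the accessor names are hypothetical, and it uses `std::vector` in place of a distributed array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
class binned_array {
public:
    // bin_index_of maps a data index to a bin index in [0, num_bins).
    // The data is referenced, not copied, so it must outlive this object.
    template <typename IndexGenerator>
    binned_array(const std::vector<T>& data, std::size_t num_bins,
                 IndexGenerator bin_index_of)
        : data_(&data), offsets_(num_bins + 1, 0), indices_(data.size()) {
        // Count the elements of each bin, shifted one slot to the right.
        for (std::size_t i = 0; i < data.size(); ++i) {
            const std::size_t b = bin_index_of(i);
            assert(b < num_bins);
            ++offsets_[b + 1];
        }
        // Scan the counts so offsets_[b] is the start of bin b.
        for (std::size_t b = 0; b < num_bins; ++b) {
            offsets_[b + 1] += offsets_[b];
        }
        // Fill the index array bin by bin.
        std::vector<std::size_t> cursor(offsets_.begin(), offsets_.end() - 1);
        for (std::size_t i = 0; i < data.size(); ++i) {
            indices_[cursor[bin_index_of(i)]++] = i;
        }
    }

    std::size_t num_bins() const { return offsets_.size() - 1; }
    std::size_t bin_size(std::size_t b) const {
        return offsets_[b + 1] - offsets_[b];
    }
    // Read-only access to the k-th value stored in bin b.
    const T& operator()(std::size_t b, std::size_t k) const {
        return (*data_)[indices_[offsets_[b] + k]];
    }

private:
    const std::vector<T>* data_;        // non-owning view of the raw data
    std::vector<std::size_t> offsets_;  // start of each bin, plus the total
    std::vector<std::size_t> indices_;  // data indices grouped by bin
};
```

A caller would pass something like a lambda that computes a grid cell from a point's coordinates as the index generator.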
We could also then provide some utilities like:
- reorganize the raw data so that its order matches the bin order, possibly also setting its device split info to match the bins
- get the device_split_info, allowing result data, mapped from binned data, to match the memory split of the bins/binned data, minimizing page faults and data migrations for reads from the binned data
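The first utility amounts to a gather through the index array. A minimal host-side sketch, assuming `indices` holds the data indices already grouped by bin (the helper name `reorder_by_bins` is hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a copy of data whose order matches the bin order, so that the
// values of each bin are contiguous in memory.
template <typename T>
std::vector<T> reorder_by_bins(const std::vector<T>& data,
                               const std::vector<std::size_t>& indices) {
    assert(indices.size() == data.size());
    std::vector<T> out;
    out.reserve(data.size());
    for (std::size_t i : indices) {
        out.push_back(data[i]);
    }
    return out;
}
```

After this reorder, the raw data can share the same split boundaries as the index array, since position k in both arrays belongs to the same bin.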