Bulk mapped operation

Hüseyin Tuğrul BÜYÜKIŞIK edited this page Feb 13, 2021 · 9 revisions

With

VirtualMultiArray<T>::mappedReadWriteAccess(
     size_t index,
     size_t range,
     std::function<void(T *)> func,
     bool pinBuffer=false,
     bool read=true, 
     bool write=true, 
     T * userPtr=nullptr);

method, the user can further optimize element access: data copies are minimized for lower latency, and any number of operations can run on the mapped region without extra copying. Note that the raw buffer is indexed exactly like the virtual array's get/set methods, with no performance penalty (it may even be faster than a plain buffer, since it is 4096-aligned).

#include "GraphicsCardSupplyDepot.h"
#include "VirtualMultiArray.h"
#include "PcieBandwidthBenchmarker.h"

// testing
#include <iostream>
#include <vector>

int main(int argC, char ** argV)
{
	const size_t n = 1050;
	const size_t pageSize=150;
	const std::vector<int> gpuMultipliers ={1,0,0,0,0};
	const int maxActivePagesPerGpu = 1;

	GraphicsCardSupplyDepot depot;
	VirtualMultiArray<int> arr(n,depot.requestGpus(),pageSize,maxActivePagesPerGpu,gpuMultipliers);

	bool read=true;
	bool write=true;
	bool pinned=false;

	// map elements [303, 804) and fill them in place;
	// buf is indexed with the same indices as the virtual array
	arr.mappedReadWriteAccess(303,501,[](int * buf){
		for(int i=303;i<303+501;i++)
			buf[i]=i;
	},pinned,read,write);

	// print the mapped region plus some elements on either side of it
	for(int i=250;i<850;i++)
		std::cout<<i<<":"<<arr.get(i)<<std::endl;
	return 0;
}
  • index: starting element of mapping region on virtual array

  • range: width of mapping region in elements

  • func: user-defined function that processes the data. It does not need to return anything; it receives a T * parameter that has the same indexing base as the virtual array.

  • The user is responsible for making sure that no other thread touches the mapped region through any other method while the mapping is active, to preserve data consistency.

  • read: enables fetching data from virtual array before executing user function

  • write: enables writing data to virtual array after executing user function

  • pinBuffer: pins the buffer with Linux's mlock before the user function runs and releases it with munlock afterwards

  • userPtr: if this is not nullptr, it is used for the data copies during mapping instead of a newly allocated temporary buffer. It must be valid for range elements (userPtr[0] through userPtr[range-1]). Before func is called, the pointer is offset negatively by index so that func can use the same indices as the virtual array.
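How read, write and userPtr interact can be modelled with a small stand-alone sketch. It is not the library's implementation: the "virtual array" here is a plain std::vector, and the names mappedAccessSketch and MappedView are hypothetical. Where the real method passes func a raw T * that has already been offset, the sketch uses a tiny view type so the index arithmetic is explicit. In this model a buffer that is not read first starts out value-initialized, which the real method may not guarantee.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the pointer given to func: indexing it with a
// virtual-array index i reaches buffer slot i - index.
template <typename T>
struct MappedView {
    T* base;            // start of the temporary or user-supplied buffer
    std::size_t index;  // first virtual-array index covered by the mapping
    T& operator[](std::size_t i) const { return base[i - index]; }
};

// Hypothetical model of a mapped access; "store" plays the virtual array.
template <typename T, typename Func>
void mappedAccessSketch(std::vector<T>& store, std::size_t index, std::size_t range,
                        Func func, bool read = true, bool write = true,
                        T* userPtr = nullptr)
{
    assert(index + range <= store.size());

    // Use the caller's buffer if one was given, else a temporary one.
    std::vector<T> temp;
    T* buf = userPtr;
    if (buf == nullptr) {
        temp.resize(range);
        buf = temp.data();
    }

    const auto first = store.begin() + static_cast<std::ptrdiff_t>(index);

    // read: fetch the region before the user function runs
    if (read)
        std::copy(first, first + static_cast<std::ptrdiff_t>(range), buf);

    // the user function sees virtual-array indices, not buffer offsets
    func(MappedView<T>{buf, index});

    // write: store the region back after the user function returns
    if (write)
        std::copy(buf, buf + range, first);
}
```

With a 10-element store, calling mappedAccessSketch(store, 3, 4, f) lets f touch v[3] through v[6] only. Passing write=false leaves the store unchanged whatever f does to the buffer.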

The number of page locks depends on the size and the start/end of the region, the same as for the bulk read/write operations. Because the user function runs on a raw buffer, however, any number of reads and writes can be done quickly and in fewer lines than with the bulk read/write operations, especially when both reading and writing, and especially when a pinned buffer is needed (pinning currently supports only Linux's mlock/munlock).
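To see how the start and size of a region affect the lock count, here is a rough estimate of how many pages a mapping touches. It assumes one lock per touched page, which the text above does not state explicitly, and pagesTouched is a hypothetical helper, not part of the library.

```cpp
#include <cstddef>

// Hypothetical helper: number of pages overlapped by elements
// [index, index + range) when every page holds pageSize elements.
std::size_t pagesTouched(std::size_t index, std::size_t range, std::size_t pageSize)
{
    if (range == 0 || pageSize == 0)
        return 0;
    const std::size_t firstPage = index / pageSize;
    const std::size_t lastPage  = (index + range - 1) / pageSize;
    return lastPage - firstPage + 1;
}
```

For the example on this page (index 303, range 501, pageSize 150), the region spans pages 2 through 5, so pagesTouched(303, 501, 150) gives 4. A region aligned to a page boundary, such as index 300 with range 150, touches a single page.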