The specific use case that should benefit from this is partial invalidation of large buffers. Ideally, the regions can be computed agnostically, e.g. by comparing the current bytes with the new bytes.
Tasks:
experiment with the regions
implement more fine-grained "dirty byte detection" (already in ByteCache, except scalar arrays)
allow for manual dirty setting?
make configurable?
figure out whether mapping host memory to the CPU buffer is a bottleneck as well, or whether mapping the entire memory can be kept
In the first implementation, the new buffer bytes are compared with the existing ones; a mask is created from the differences and converted into regions (offset & size).
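The compare, mask, and convert steps described above could look roughly like the following. This is a minimal pure-Python sketch, not the code from the repository; the function names `dirty_mask` and `mask_to_regions` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def dirty_mask(old: bytes, new: bytes) -> List[bool]:
    """Return one flag per byte: True where the new byte differs from the old one."""
    if len(old) != len(new):
        raise ValueError("buffers must have the same length")
    return [a != b for a, b in zip(old, new)]


def mask_to_regions(mask: List[bool]) -> List[Tuple[int, int]]:
    """Convert a dirty mask into (offset, size) pairs, one per run of dirty bytes."""
    regions = []
    start = None  # offset where the current dirty run began, if any
    for i, dirty in enumerate(mask):
        if dirty and start is None:
            start = i
        elif not dirty and start is not None:
            regions.append((start, i - start))
            start = None
    if start is not None:
        # The last run extends to the end of the buffer.
        regions.append((start, len(mask) - start))
    return regions


old = bytes([0, 0, 0, 0, 0, 0, 0, 0])
new = bytes([0, 1, 1, 0, 0, 0, 2, 2])
regions = mask_to_regions(dirty_mask(old, new))
print(regions)
```

Each tuple is an offset and a size in bytes, which is the shape a buffer copy region needs.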
In the associated test, floats are changed from 1 to 2. At the byte level this changes only 2 of the 4 bytes per float, resulting in a lot of small regions being copied.
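The byte-level effect can be checked with a small sketch using the standard `struct` module. It assumes 32-bit little-endian floats (the issue does not state the byte order), and the helper `differing_byte_indices` is a hypothetical name used only here.

```python
import struct
from typing import List


def differing_byte_indices(old: bytes, new: bytes) -> List[int]:
    """Return the positions at which two equally long byte strings differ."""
    return [i for i, (a, b) in enumerate(zip(old, new)) if a != b]


# Assumption: 32-bit little-endian floats ("<f").
one = struct.pack("<f", 1.0)  # bytes 00 00 80 3f
two = struct.pack("<f", 2.0)  # bytes 00 00 00 40
per_float = differing_byte_indices(one, two)

# A buffer of four floats, all changed from 1.0 to 2.0.
count = 4
old = struct.pack(f"<{count}f", *[1.0] * count)
new = struct.pack(f"<{count}f", *[2.0] * count)
dirty = differing_byte_indices(old, new)

print(per_float)
print(dirty)
```

Because the first two bytes of every float stay the same, the dirty bytes of neighbouring floats never touch. Under these assumptions, a buffer of N changed floats yields N separate two-byte regions instead of one region of 4 * N bytes.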
As soon as a cache is dirty and a related shader is run again, the entire buffer is flushed. This can become quite expensive for large buffers and for CPU-GPU communication. The Vulkan specification allows partial copies using buffer regions: https://www.khronos.org/registry/vulkan/specs/1.1-extensions/man/html/vkCmdCopyBuffer.html
Currently, only a single region (the entire buffer) is provided:
lava/lava/api/pipeline.py
Lines 201 to 202 in e82f6d3
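To illustrate the difference between one full-buffer region and several partial regions, here is a minimal sketch. It does not call Vulkan or lava. `BufferCopyRegion` is a hypothetical stand-in that mirrors the three fields of `VkBufferCopy` (`srcOffset`, `dstOffset`, `size`), and it assumes the source and destination buffers share the same layout.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BufferCopyRegion:
    """Hypothetical stand-in mirroring the fields of VkBufferCopy."""
    src_offset: int
    dst_offset: int
    size: int


def full_buffer_region(buffer_size: int) -> List[BufferCopyRegion]:
    """A single region covering the whole buffer, as described for the current behaviour."""
    return [BufferCopyRegion(0, 0, buffer_size)]


def partial_regions(dirty: List[Tuple[int, int]]) -> List[BufferCopyRegion]:
    """One region per dirty (offset, size) pair.

    Assumption: source and destination use the same layout, so both
    offsets are identical.
    """
    return [BufferCopyRegion(offset, offset, size) for offset, size in dirty]


def bytes_copied(regions: List[BufferCopyRegion]) -> int:
    """Total number of bytes the given regions would transfer."""
    return sum(region.size for region in regions)


buffer_size = 1024 * 1024  # 1 MiB buffer
full = full_buffer_region(buffer_size)
partial = partial_regions([(0, 256), (4096, 512)])

print(bytes_copied(full), bytes_copied(partial))
```

With only two small dirty ranges, the partial variant would transfer a small fraction of the buffer. Whether that is faster in practice depends on the number and size of the regions, which is what the tasks in this issue set out to measure.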
Tasks:
implement more fine-grained "dirty byte detection" (already in ByteCache, except scalar arrays)
lava/lava/buffer.py
Line 155 in e82f6d3