Feature Request: Support UMA archs (e.g. iOS) #238
Labels: enhancement, Metal, Vulkan
Note: This is brainstorming
We currently have:
BT_IMMUTABLE: must be uploaded on creation
BT_DEFAULT: must use upload() / staging buffers
BT_DYNAMIC_*: must use map()
iOS shared works like default in terms of speed, but wants access using map.
Intended use case
Possible solutions:
BT_DEFAULT can be mapped
BufferPacked::getRawGpuPtr()
Issue: Synchronization
All writes from CPU are race conditions if the GPU is using the buffer. That's why BT_DYNAMIC uses 3x the memory and why BT_DEFAULT uses StagingBuffers to upload. The easiest solution is to disallow writes from CPU, i.e. getRawGpuPtr returns a const pointer. If the user const casts the pointer and writes to it, it's their responsibility.
All writes from GPU are race conditions if the CPU wants to read that data. Writes from GPU:
BufferPacked::copyTo
Possible solution:
BufferPacked::getRawGpuPtr(): returns nullptr if it's not supported or there is a data hazard.
BufferPacked::isRawGpuAvailable() (name pending): if it returns true, it is safe to call getRawGpuPtr and it will return a valid ptr.
BufferPacked::rawGpuBufferAvailableFrame() (name pending) (BufferPacked::copyTo); MetalSharedBufferInterface has an extra uint32 to store the current frame, i.e. it stores VaoManager::getFrameCount. VaoManager::waitForSpecificFrameToFinish can be used to wait for the buffer.
BT_IMMUTABLE: Immutable buffers could add a boolean instead of a frame idx. Once the first upload has finished GPU side, it is forever accessible and rawGpuBufferAvailableFrame could always return VaoManager::getFrameCount - VaoManager::getDynamicBufferMultiplier.
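The frame-tracking idea in the proposal above can be sketched roughly as follows. This is a minimal mock and not OGRE-Next code: MockVaoManager, MockSharedBuffer, notifyGpuWrite, isFrameFinished and tryReadFirstByte are hypothetical names invented for illustration, and only getRawGpuPtr, isRawGpuAvailable, rawGpuBufferAvailableFrame, getFrameCount and waitForSpecificFrameToFinish mirror the names used in this issue. It assumes a frame counts as finished once the GPU has completed it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the frame bookkeeping VaoManager would provide.
struct MockVaoManager
{
    uint32_t frameCount = 0;           // frame currently being recorded by the CPU
    uint32_t framesFinishedByGpu = 0;  // frames with index < this value are done on the GPU

    uint32_t getFrameCount() const { return frameCount; }
    bool isFrameFinished( uint32_t frame ) const { return frame < framesFinishedByGpu; }

    // Mock only: pretend we blocked until the GPU completed 'frame'.
    void waitForSpecificFrameToFinish( uint32_t frame )
    {
        if( framesFinishedByGpu <= frame )
            framesFinishedByGpu = frame + 1u;
    }
};

// Hypothetical shared (UMA) buffer that remembers the frame of the last GPU write.
class MockSharedBuffer
{
    std::vector<uint8_t>  mData;
    const MockVaoManager *mVaoManager;
    uint32_t              mLastGpuWriteFrame = 0;  // the "extra uint32"
    bool                  mWrittenByGpu = false;

public:
    MockSharedBuffer( size_t bytes, const MockVaoManager *vaoManager ) :
        mData( bytes ), mVaoManager( vaoManager ) {}

    // Assumed to be called whenever a GPU write is scheduled (e.g. a copyTo into this buffer).
    void notifyGpuWrite()
    {
        mLastGpuWriteFrame = mVaoManager->getFrameCount();
        mWrittenByGpu = true;
    }

    // Frame that must finish before the CPU may read the buffer.
    uint32_t rawGpuBufferAvailableFrame() const { return mLastGpuWriteFrame; }

    bool isRawGpuAvailable() const
    {
        return !mWrittenByGpu || mVaoManager->isFrameFinished( mLastGpuWriteFrame );
    }

    // Const pointer: CPU writes are disallowed. Returns nullptr on a data hazard.
    const void *getRawGpuPtr() const
    {
        return isRawGpuAvailable() ? mData.data() : nullptr;
    }
};

// Usage: query the pointer each time it is needed instead of caching it.
inline bool tryReadFirstByte( const MockSharedBuffer &buf, uint8_t &outByte )
{
    const void *ptr = buf.getRawGpuPtr();
    if( !ptr )
        return false;  // GPU may still be writing; caller can wait or retry later
    std::memcpy( &outByte, ptr, 1u );
    return true;
}
```

A caller that cannot retry later would call waitForSpecificFrameToFinish( buf.rawGpuBufferAvailableFrame() ) first and then query the pointer again. The BT_IMMUTABLE variant would swap the frame index for a boolean set once the first upload has finished.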
Issue: Validation layer
Expect Metal to complain a bit that certain operations can or cannot be done when the buffer type is MTLStorageModeShared; these are often minor fixes.
What if the user holds on to the pointer returned by getRawGpuPtr?
User should call getRawGpuPtr to prevent data races. Otherwise all bets are off. It's not our responsibility anymore.
If the user calls VaoManager::cleanupEmptyPools, the pointer may no longer be valid.
Textures
Textures can't be accessed directly. They don't have linear tiling access. Supporting linear tiling is a mess and we recommend using TexBufferPacked instead.
Metal & Vulkan
Metal & Vulkan can benefit from the same features. As long as the GPU exposes a HOST_VISIBLE_BIT|HOST_COHERENT_BIT|DEVICE_LOCAL_BIT memory type (HOST_CACHED_BIT too?) then the feature can be implemented.
It can benefit Desktop GPUs too. With PCIe Resizable BAR / SAM, GPU drivers are exposing HOST_VISIBLE_BIT|HOST_COHERENT_BIT|DEVICE_LOCAL_BIT (not cached) memory which lives on the GPU and can be read from CPU (it gets pulled over the PCIe bus).
Though given that it is not cached, users should read each byte only once (or use memcpy to make a local copy). It should be treated the same way as write-combined memory.
When Resizable BAR is not available we should not use that memory: there's only 256 MB of it, the driver needs some of that too, and it's too precious.
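As a rough sketch of the memory-type check described above, the following selects a type that is device-local, host-visible and host-coherent, and skips it when it lives in a small (256 MB or less) heap. To stay self-contained it uses hypothetical MemoryType / MemoryHeap structs rather than the real Vulkan headers; the flag values match VkMemoryPropertyFlagBits. The heap-size threshold as a way of detecting a missing Resizable BAR is an assumption of this sketch, not something the drivers guarantee, and findSharedMemoryType / readBack are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Same numeric values as VkMemoryPropertyFlagBits in the Vulkan specification.
constexpr uint32_t DEVICE_LOCAL_BIT  = 0x1;
constexpr uint32_t HOST_VISIBLE_BIT  = 0x2;
constexpr uint32_t HOST_COHERENT_BIT = 0x4;

// Hypothetical mirrors of VkMemoryType / VkMemoryHeap, reduced to what the check needs.
struct MemoryType { uint32_t propertyFlags; uint32_t heapIndex; };
struct MemoryHeap { uint64_t size; };

// Assumption: a heap of 256 MB or less means Resizable BAR is off and the window is too precious.
constexpr uint64_t kSmallBarBytes = 256ull * 1024ull * 1024ull;

// Returns the index of a usable shared memory type, or -1 if none qualifies.
inline int findSharedMemoryType( const std::vector<MemoryType> &types,
                                 const std::vector<MemoryHeap> &heaps )
{
    const uint32_t wanted = DEVICE_LOCAL_BIT | HOST_VISIBLE_BIT | HOST_COHERENT_BIT;
    for( size_t i = 0; i < types.size(); ++i )
    {
        if( ( types[i].propertyFlags & wanted ) != wanted )
            continue;  // not CPU-readable GPU memory
        if( heaps[types[i].heapIndex].size <= kSmallBarBytes )
            continue;  // small BAR window: leave it to the driver
        return static_cast<int>( i );
    }
    return -1;
}

// Uncached memory should be read once: copy to a local buffer, then work on the copy.
inline void readBack( void *dst, const void *uncachedSrc, size_t bytes )
{
    std::memcpy( dst, uncachedSrc, bytes );
}
```

On a UMA device the matching type normally sits in the main heap, so the size check passes. On a discrete GPU without Resizable BAR the function returns -1 and the existing staging-buffer path would still be used.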