Improve FindMemoryTypeIndex by storing hashed creation parameters for buffers or images as key with memory type index as value #493

@IAmNotHanni

This is based on an idea from @adam-sawicki-a and @papazhang66 which came up in #419.

Intro

Currently, when FindMemoryTypeIndex is called, it has to do quite a bit of work to find the right memory type for the specified creation parameters:

VkResult VmaAllocator_T::FindMemoryTypeIndex(
    uint32_t memoryTypeBits,
    const VmaAllocationCreateInfo* pAllocationCreateInfo,
    VkFlags bufImgUsage,
    uint32_t* pMemoryTypeIndex) const
{
    memoryTypeBits &= GetGlobalMemoryTypeBits();

    if(pAllocationCreateInfo->memoryTypeBits != 0)
    {
        memoryTypeBits &= pAllocationCreateInfo->memoryTypeBits;
    }

    VkMemoryPropertyFlags requiredFlags = 0, preferredFlags = 0, notPreferredFlags = 0;
    if(!FindMemoryPreferences(
        IsIntegratedGpu(),
        *pAllocationCreateInfo,
        bufImgUsage,
        requiredFlags, preferredFlags, notPreferredFlags))
    {
        return VK_ERROR_FEATURE_NOT_PRESENT;
    }

    *pMemoryTypeIndex = UINT32_MAX;
    uint32_t minCost = UINT32_MAX;
    for(uint32_t memTypeIndex = 0, memTypeBit = 1;
        memTypeIndex < GetMemoryTypeCount();
        ++memTypeIndex, memTypeBit <<= 1)
    {
        // This memory type is acceptable according to memoryTypeBits bitmask.
        if((memTypeBit & memoryTypeBits) != 0)
        {
            const VkMemoryPropertyFlags currFlags =
                m_MemProps.memoryTypes[memTypeIndex].propertyFlags;
            // This memory type contains requiredFlags.
            if((requiredFlags & ~currFlags) == 0)
            {
                // Calculate cost as number of bits from preferredFlags not present in this memory type.
                uint32_t currCost = VMA_COUNT_BITS_SET(preferredFlags & ~currFlags) +
                    VMA_COUNT_BITS_SET(currFlags & notPreferredFlags);
                // Remember memory type with lowest cost.
                if(currCost < minCost)
                {
                    *pMemoryTypeIndex = memTypeIndex;
                    if(currCost == 0)
                    {
                        return VK_SUCCESS;
                    }
                    minCost = currCost;
                }
            }
        }
    }
    return (*pMemoryTypeIndex != UINT32_MAX) ? VK_SUCCESS : VK_ERROR_FEATURE_NOT_PRESENT;
}

Improvement

The idea would be to hash all parameters used for buffer or image creation and store them in a std::unordered_map, with the parameters as the key and a std::uint32_t memory type index as the value. You can supply your own hash function to std::unordered_map (as its third template parameter). Using such a cache with an underlying hash system is very common for other Vulkan objects such as pipelines or descriptor set layouts (for example, here is a nice tutorial which describes an abstraction for descriptor set layouts: https://vkguide.dev/docs/extra-chapter/abstracting_descriptors).
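To make the idea more concrete, here is a rough sketch of what such a cache could look like. Everything in it is an assumption for illustration: MemoryTypeKey, MemoryTypeKeyHash, MemoryTypeIndexCache, and GetOrCompute are hypothetical names that do not exist in VMA, and plain integers stand in for the Vulkan and VMA types so the snippet compiles on its own. Which fields actually need to go into the key would have to be checked against everything FindMemoryTypeIndex and FindMemoryPreferences read.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical key: the inputs assumed to influence the memory type choice.
// Plain integers are used instead of Vulkan/VMA types to keep the sketch standalone.
struct MemoryTypeKey {
    std::uint32_t memoryTypeBits;  // acceptable memory types (bitmask)
    std::uint32_t allocFlags;      // stand-in for VmaAllocationCreateInfo::flags
    std::uint32_t allocUsage;      // stand-in for VmaAllocationCreateInfo::usage
    std::uint32_t requiredFlags;   // stand-in for VmaAllocationCreateInfo::requiredFlags
    std::uint32_t preferredFlags;  // stand-in for VmaAllocationCreateInfo::preferredFlags
    std::uint32_t bufImgUsage;     // buffer or image usage flags

    bool operator==(const MemoryTypeKey& o) const {
        return memoryTypeBits == o.memoryTypeBits && allocFlags == o.allocFlags &&
               allocUsage == o.allocUsage && requiredFlags == o.requiredFlags &&
               preferredFlags == o.preferredFlags && bufImgUsage == o.bufImgUsage;
    }
};

// Custom hash, passed as the third template parameter of std::unordered_map.
struct MemoryTypeKeyHash {
    std::size_t operator()(const MemoryTypeKey& k) const noexcept {
        std::size_t seed = 0;
        // Mix each field into the seed (a common hash-combine pattern).
        auto combine = [&seed](std::uint32_t v) {
            seed ^= std::hash<std::uint32_t>{}(v) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
        };
        combine(k.memoryTypeBits);
        combine(k.allocFlags);
        combine(k.allocUsage);
        combine(k.requiredFlags);
        combine(k.preferredFlags);
        combine(k.bufImgUsage);
        return seed;
    }
};

// Hypothetical cache: maps creation parameters to a memory type index.
class MemoryTypeIndexCache {
public:
    // Returns the cached index, or calls `find` once and stores its result.
    template <typename FindFn>
    std::uint32_t GetOrCompute(const MemoryTypeKey& key, FindFn&& find) {
        const auto it = m_Cache.find(key);
        if (it != m_Cache.end()) {
            return it->second;
        }
        const std::uint32_t index = find(key);
        m_Cache.emplace(key, index);
        return index;
    }

    std::size_t Size() const { return m_Cache.size(); }

private:
    std::unordered_map<MemoryTypeKey, std::uint32_t, MemoryTypeKeyHash> m_Cache;
};
```

A caller would pass a callable that wraps the existing search, so the loop over memory types only runs on a cache miss. Note that std::unordered_map is not safe for concurrent writes, so a real implementation would presumably need some synchronization if allocations can happen from several threads.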

How much this improves performance is something that would have to be measured. On the one hand, the hashing will take a little time, but so does calling FindMemoryTypeIndex currently. In general, performance is something that must always be measured rather than estimated. This could easily be analyzed by writing a test for it when implementing this feature. There are also frameworks like Google Benchmark for more advanced benchmarking. It is likely that the exact performance improvement depends a lot on the actual parameters, the hardware, and other factors.
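As a starting point for such a measurement, a very small timing helper based on std::chrono could look like the sketch below. AverageNanosecondsPerCall is a hypothetical helper, not part of VMA, and it is only a crude loop timer: it does no warm-up or statistical analysis, which is where a framework like Google Benchmark would be the better tool.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: calls fn(i) `iterations` times and returns the
// average wall-clock time per call in nanoseconds.
// fn must return an integer; the results are accumulated so the compiler
// cannot drop the calls as unused.
template <typename Fn>
double AverageNanosecondsPerCall(Fn&& fn, std::uint32_t iterations) {
    if (iterations == 0) {
        return 0.0;  // nothing to measure
    }
    using Clock = std::chrono::steady_clock;
    std::uint64_t sink = 0;
    const auto start = Clock::now();
    for (std::uint32_t i = 0; i < iterations; ++i) {
        sink += static_cast<std::uint64_t>(fn(i));
    }
    const auto end = Clock::now();
    // Write the accumulated result to a volatile so the loop has an observable effect.
    volatile std::uint64_t keep = sink;
    (void)keep;
    const std::chrono::duration<double, std::nano> elapsed = end - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

The same helper could be run once with a callable that performs the current search and once with a callable that goes through a cache, using identical sets of creation parameters, to compare the two averages on a given machine.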

This brings me to another question:
@adam-sawicki-a Are there other parts in the code where you think such a cache with hashed values could improve performance?

best regards
Johannes

Labels: optimization (Improvement in performance or memory usage), wontfix (This will not be worked on)
