Hash collisions in SVG loader #2020

lmdsp · 2024-03-02T15:07:24Z

The picture loader in tvgLoader.cpp uses this 'hash' function:

uint64_t HASH_KEY(const char* data, uint64_t size)
{
    return (((uint64_t) data) << 32) | size;
}

static LoadModule* _findFromCache(const char* data, uint32_t size, const string& mimeType)
{
    // ...
    auto loader = _activeLoaders.head;

    auto key = HASH_KEY(data, size);

    while (loader) {
        if (loader->type == type && loader->hashkey == key) {

I'm not sure what the intended effect was, but this doesn't hash the image (svg or other) data at all.
It will only work for example if the user loads data from static const char* buffers.

We ran into this problem when attempting to load two svgs that only differ by their opacity value from an buffer external which turns out stayed had the same adress (but obviously different contents).
This is extremely confusing as their is no mention of this behavior in the Picture::load() documentation.

This can easily be fixed by using a real hash function that hashes the full data contents.
In our internal fork we replaced this with XXH3 (BSD) which is probably the fastest hash implementation out there (31Gb/s) and of great quality (low collision rate).

While I understand it might not be feasible to rely on this library for various reasons,
IMHO their should be away to customize the hash function ThorVG uses, either by passing a function pointer or through the use of a macro/define.

To entirely avoid any potential collisions, one could also do a memcmp of the data when hash keys and size are identical.

The text was updated successfully, but these errors were encountered:

hermet · 2024-03-03T06:52:22Z

@lmdsp thanks for reporting & suggestion. We will look into this soon.

hermet · 2024-03-03T06:54:41Z

In our internal fork we replaced this with XXH3 (BSD) which is probably the fastest hash implementation out there (31Gb/s) and of great quality (low collision rate).

ps. ThorVG welcomes contributions offering better solutions.

The previous loader cache mechanism encountered a problem when the user changed the content of the cached data. In such cases, a new request would not be processed because the renderer would use the previously cached content. So far, the TVG cache mechanism utilizes a pointer hash key for the fastest hashing mechanism available. One limitation is that it assumes the address is unique for the data. To resolve this, we modified the caching policy. Now, the renderer will not cache copied data; it will only cache the given data when it is deemed shareable. issue: #2020

hermet · 2024-03-05T08:40:24Z

After this change, ThorVG does not cache data when picture data is copied.
In fact, employing a complex caching mechanism here is not proper,
As in most cases, user would not expect the caching with the data copying.

#2026

The previous loader cache mechanism encountered a problem when the user changed the content of the cached data. In such cases, a new request would not be processed because the renderer would use the previously cached content. So far, the TVG cache mechanism utilizes a pointer hash key for the fastest hashing mechanism available. One limitation is that it assumes the address is unique for the data. To resolve this, we modified the caching policy. Now, the renderer will not cache copied data; it will only cache the given data when it is deemed shareable. issue: #2020

hermet · 2024-03-06T05:35:32Z

the fix will be applied in 0.12.7

The previous loader cache mechanism encountered a problem when the user changed the content of the cached data. In such cases, a new request would not be processed because the renderer would use the previously cached content. So far, the TVG cache mechanism utilizes a pointer hash key for the fastest hashing mechanism available. One limitation is that it assumes the address is unique for the data. To resolve this, we modified the caching policy. Now, the renderer will not cache copied data; it will only cache the given data when it is deemed shareable. issue: #2020

hermet added bug Something isn't working renderer Core rendering showstopper Regression bugs / Critical errors labels Mar 3, 2024

hermet self-assigned this Mar 3, 2024

hermet added this to the 0.13 milestone Mar 3, 2024

hermet mentioned this issue Mar 5, 2024

renderer/loader: revamping the caching mechanism. #2026

Merged

hermet closed this as completed Mar 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hash collisions in SVG loader #2020

Hash collisions in SVG loader #2020

lmdsp commented Mar 2, 2024

hermet commented Mar 3, 2024

hermet commented Mar 3, 2024

hermet commented Mar 5, 2024

hermet commented Mar 6, 2024

Hash collisions in SVG loader #2020

Hash collisions in SVG loader #2020

Comments

lmdsp commented Mar 2, 2024

hermet commented Mar 3, 2024

hermet commented Mar 3, 2024

hermet commented Mar 5, 2024

hermet commented Mar 6, 2024