Currently, to translate word_id -> word_str (done for each key in each selected row, potentially millions of times per select), we need to read-lock the global dictionary shard.
This incurs significant overhead just for the locks themselves.
And in cases where contention might be high (when per-repacker caching is inefficient, e.g. nginx URLs), it might also slow down the processing of new packets.
The lock is only needed because we use std::deque to find a word by offset, and the deque might be inserted into while we're reading, changing its internal structure.
The proposed idea is to rework the dictionary shard to use a single contiguous mmap()-ed memory region, enabling a fully lockless read-at-offset path (since the pointer to the mmap()-ed region never changes).
The word_t size is smaller than a modern x86 CPU's cache line, but thanks to the strong x86 cache-coherence model this can at worst incur a performance penalty, not compromise correctness.
As far as I understand it, anyway :)