gh-1852 sort memtable KV pairs on read #1853

Merged: 3 commits, Mar 11, 2022

Commits on Mar 10, 2022

  1. gh-1852 sort memtable KV pairs on read

    The memtable for Map is a binary tree, so its row keys are always
    sorted. However, since this is type 'Map', each "row key" holds a map.
    This map was unsorted in the past. In #1832 we introduced a change that
    made sure this map would always be sorted ON DISK, i.e. in the
    segments. It seemed natural to also keep it sorted in the memtable, as
    that meant we did not have to do any sorting when flushing. However,
    the performance tests on imports that make heavy use of the inverted
    index showed a large performance degradation after #1832. In a test I
    did locally, the import time went up by over 30%.
    
    This fix goes back to keeping the KV pairs unsorted and making each
    change an append-only operation. This means they now need to be sorted
    in just two places (as opposed to on every single insertion):
    
    1. On a read query. Those should be rare on a memtable, since memtables
       are mostly meant for writing. The (minimal) added overhead here is
       not a problem, since it was also there before #1832.
    2. When flushing. Flushing is an async operation and the small overhead
       of sorting each row's Map KVs is negligible.
    
    This new implementation has the same import speed as prior to #1832
    while keeping all the runtime benefits of having the KV pairs sorted on
    disk.
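
The approach described above (append on write, sort only on read and on flush) can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the names `MapPair`, `memtableMap`, `Append`, `Get`, `Flush` and `flushedRow` are hypothetical, a plain Go map stands in for the binary tree, and duplicate keys are not merged (a stable sort merely keeps their insertion order).

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// MapPair is a hypothetical stand-in for one KV pair inside a row's map.
type MapPair struct {
	Key   []byte
	Value []byte
}

// memtableMap is a simplified, hypothetical memtable. The real memtable is a
// binary tree keyed by row key; a Go map keeps this sketch short.
type memtableMap struct {
	rows map[string][]MapPair
}

func newMemtableMap() *memtableMap {
	return &memtableMap{rows: make(map[string][]MapPair)}
}

// Append stores a pair without sorting, so every write is append-only.
func (m *memtableMap) Append(rowKey string, pair MapPair) {
	m.rows[rowKey] = append(m.rows[rowKey], pair)
}

// sortedCopy returns the pairs ordered by key and leaves the input untouched.
func sortedCopy(pairs []MapPair) []MapPair {
	out := make([]MapPair, len(pairs))
	copy(out, pairs)
	sort.SliceStable(out, func(i, j int) bool {
		return bytes.Compare(out[i].Key, out[j].Key) < 0
	})
	return out
}

// Get sorts on read (place 1 in the list above).
func (m *memtableMap) Get(rowKey string) []MapPair {
	return sortedCopy(m.rows[rowKey])
}

// flushedRow is a hypothetical container for one row handed to a segment writer.
type flushedRow struct {
	RowKey string
	Pairs  []MapPair
}

// Flush sorts each row's pairs once (place 2 in the list above) and returns
// the rows ordered by row key.
func (m *memtableMap) Flush() []flushedRow {
	keys := make([]string, 0, len(m.rows))
	for k := range m.rows {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := make([]flushedRow, 0, len(keys))
	for _, k := range keys {
		out = append(out, flushedRow{RowKey: k, Pairs: sortedCopy(m.rows[k])})
	}
	return out
}

func main() {
	m := newMemtableMap()
	for _, k := range []string{"c", "a", "b"} {
		m.Append("row1", MapPair{Key: []byte(k), Value: []byte("v-" + k)})
	}
	for _, p := range m.Get("row1") {
		fmt.Printf("%s=%s\n", p.Key, p.Value) // prints a=v-a, b=v-b, c=v-c
	}
}
```

In this sketch the write path never pays for ordering. The sort cost is moved to `Get` and `Flush`, which matches the trade-off the commit message argues for.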
    
    closes #1852
    etiennedi committed Mar 10, 2022
    d3d4de5
  2. 303f700

Commits on Mar 11, 2022

  1. gh-1852 fix typos

    etiennedi committed Mar 11, 2022
    5b132f4