LSMKV: Object Count #1811

Closed

etiennedi opened this issue Jan 31, 2022 · 1 comment
Labels: autoclosed (Closed by the bot. We still want this, but it didn't quite make the latest prioritization round)

etiennedi (Member) commented Jan 31, 2022

Background

This is a requirement for #1798. Currently, the only way to count objects in an lsmkv.Bucket is to iterate over all of them. Such an iteration is currently implemented for { Aggregate { meta { count } } }, which is orders of magnitude too slow to be part of a BM25 query. Iteration is required because a simple count of keys does not contain enough information to determine the true count: a key could carry a tombstone, and we don't know whether a previous segment contained an object for that key. Similarly, a key could represent either a create or an update.

Proposed Algorithms

I could not find any information on whether there is a standard algorithm for an efficient count in LSM-based indices, so I came up with the following, which should be fairly efficient without adding too much overhead:

  • Most parts of this algorithm run just once at start-up; only the memtable part needs to run at runtime.
  • The algorithm assumes it runs over the segments in the order they were added (oldest to newest).
  • Starting at the root segment, each segment can reliably determine how many objects it adds itself.
  • On the root segment we can ignore tombstones as there is nothing they could remove.
  • For any non-root segment, we can encounter 4 situations (see the first sketch following this list):
    • Previously unseen key: This is guaranteed to increase the count by 1
    • Previously seen key: In all three modes (replace, set, map), a previously seen key means that the count is not changed
    • Previously seen key with tombstone: This is guaranteed to reduce the count by 1
    • Previously unseen key with tombstone: This does not alter the count.
  • To determine whether a key is seen or unseen, the algorithm can exploit the fact that it runs when segments are added: it can use the high-level API functions of all previously included segments to reliably determine whether a key already exists. This also handles situations such as add->delete->re-add correctly, since the high-level APIs already produce correct results for those.
  • Each segment therefore produces a net change number, which can be cached along with the other structures created at start-up time, such as bloom filters. (Most likely the count can be computed in the same loop as the bloom-filter construction to save additional disk lookups.)
  • As new segments get written to disk they can run through the same process when mounted
  • The topmost segment is the memtable. It could be extended with a simple thread-safe counter which uses the same technique as above: when a new object comes in, determine which of the 4 states applies.
    • Alternative: The method above requires a read-before-write pattern, which would most likely add a large overhead to every single write for a relatively small benefit. Alternatively, this information can be aggregated when .Count() is called (see the second sketch following this list). If such a runtime aggregation is still too slow, the result could be cached in the memtable, with a new write acting as the cache invalidation event.
  • Assumption: When performing a compaction, the net count modifier of the compacted segment is the sum of the net changes of the two input segments.
    • If that assumption is incorrect, we can also rerun the same "segment initialization" logic from above. However, we need to be careful to limit the lookups to the segments that preceded the compacted one, so we can't use the highest-level APIs here (to the outside, those would include future segments as well, which would produce an incorrect count).
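To make the four cases concrete, here is a minimal sketch in Go of the per-segment net-change computation. All names and types (node, segment, existsInPrevious, netCountChange) are simplified stand-ins invented for illustration, not the actual lsmkv API; a real segment stores sorted, serialized nodes and would consult bloom filters before searching, and keys within one segment are assumed unique here.

```go
package main

import "fmt"

// node is a simplified stand-in for one key in a segment (assumption;
// the real lsmkv segment layout differs).
type node struct {
	key       string
	tombstone bool
}

// segment is a simplified stand-in for an on-disk lsmkv segment.
type segment struct {
	nodes []node
}

// get reports whether this segment has a live (non-tombstoned) entry for
// key, and whether it knows the key at all.
func (s *segment) get(key string) (exists, found bool) {
	for _, n := range s.nodes {
		if n.key == key {
			return !n.tombstone, true
		}
	}
	return false, false
}

// existsInPrevious checks all previous segments, newest first, mimicking
// the high-level lookup described above: the newest segment that knows the
// key wins, so add->delete->re-add chains resolve correctly.
func existsInPrevious(previous []*segment, key string) bool {
	for i := len(previous) - 1; i >= 0; i-- {
		if exists, found := previous[i].get(key); found {
			return exists
		}
	}
	return false
}

// netCountChange computes how many objects this segment adds or removes,
// following the four cases from the list above.
func netCountChange(s *segment, previous []*segment) int {
	net := 0
	for _, n := range s.nodes {
		seen := existsInPrevious(previous, n.key)
		switch {
		case !seen && !n.tombstone:
			net++ // previously unseen key: one new object
		case seen && n.tombstone:
			net-- // previously seen key with tombstone: one object gone
			// seen && !tombstone: update, count unchanged
			// !seen && tombstone: tombstone for unknown key, count unchanged
		}
	}
	return net
}

func main() {
	root := &segment{nodes: []node{{key: "a"}, {key: "b"}}}
	next := &segment{nodes: []node{
		{key: "a"},                  // update: no change
		{key: "b", tombstone: true}, // delete: -1
		{key: "c"},                  // create: +1
	}}
	fmt.Println(netCountChange(root, nil))              // 2
	fmt.Println(netCountChange(next, []*segment{root})) // 0
}
```

Under the compaction assumption above, combining two segments would then just sum their cached net values (2 + 0 = 2 live objects here) with no re-scan needed.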
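And a sketch of the runtime-aggregation alternative for the memtable, under the same caveats: memtable, entries, and existsOnDisk are hypothetical simplifications, and a real implementation would look keys up through the bucket's existing segment APIs. The point is that the write path stays untouched; the classification cost is paid only when Count() is called.

```go
package main

import (
	"fmt"
	"sync"
)

// memtable is a simplified stand-in for the lsmkv memtable (assumption).
type memtable struct {
	sync.RWMutex
	entries map[string]bool // key -> tombstone flag (true = delete)
}

// Count classifies the memtable's keys against the disk segments on
// demand. diskCount is the cached sum of the disk segments' net changes;
// existsOnDisk is a hypothetical lookup against those segments. As noted
// above, the result could additionally be cached, with the next write
// acting as the invalidation event.
func (m *memtable) Count(diskCount int, existsOnDisk func(key string) bool) int {
	m.RLock()
	defer m.RUnlock()

	net := 0
	for key, tombstone := range m.entries {
		seen := existsOnDisk(key)
		switch {
		case !seen && !tombstone:
			net++ // new key: +1
		case seen && tombstone:
			net-- // deleted existing key: -1
		}
	}
	return diskCount + net
}

func main() {
	mt := &memtable{entries: map[string]bool{
		"a": false, // update of an existing key: no change
		"b": true,  // delete of an existing key: -1
		"d": false, // new key: +1
	}}
	onDisk := map[string]bool{"a": true, "b": true, "c": true}
	fmt.Println(mt.Count(3, func(key string) bool { return onDisk[key] })) // 3
}
```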

Notes

Once in place, this new algorithm can also be used to considerably speed up { Aggregate { meta { count } } }.

etiennedi mentioned this issue Jan 31, 2022
etiennedi added a commit that referenced this issue Feb 1, 2022
This primitive implementation does not handle updates and deletes yet
and must be adapted in a future commit
etiennedi added a commit that referenced this issue Feb 1, 2022
currently only works for unique additions, does not yet handle updates
or deletes
etiennedi added a commit that referenced this issue Feb 1, 2022
This does not yet check for disk segments, so this only works if all
disk segments consist of unique additions, but does work if the memtable
contains updates or deletes.
etiennedi added a commit that referenced this issue Feb 3, 2022
currently this process is skipped during compactions, so the values are
not reliable for compacted segments.

Has not yet been evaluated for performance.
etiennedi added a commit that referenced this issue Feb 4, 2022
stale bot commented Apr 2, 2022

Thank you for your contribution to Weaviate. This issue has not received any activity in a while and has therefore been marked as stale. Stale issues will eventually be autoclosed. This does not mean we are ruling out working on this issue; it most likely has not been prioritized highly enough in recent months.
If you believe that this issue should remain open, please leave a short reply. This lets us know that the issue is not abandoned and acts as a reminder for our team to consider prioritizing it again.
Please also consider whether you can contribute to a solution for this issue. If you are willing to contribute but don't know where to start, please leave a quick message and we'll try to help you.
Thank you, The Weaviate Team

stale bot added the autoclosed label Apr 2, 2022
stale bot closed this as completed Apr 16, 2022