Organize storage/access around a page #32
Comments
Also think about blocklist polling. As the blocklist grows, polling will get more expensive. There is almost certainly a maximum number of blocks beyond which things stop working.
Compaction already does this through an iterator interface. It could be added to the query path relatively easily by adding an additional Finder that accesses the index in a paged fashion. That may be all that's necessary, but running Tempo at very large scale would be needed to confirm.
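A rough sketch of what a paged lookup could look like, assuming the index is a sequence of non-empty pages whose records are sorted by ID. The `record` type, the `pageLoader` callback, and `findPaged` are hypothetical names for illustration and are not Tempo's actual Finder interface.

```go
package main

import (
	"bytes"
	"errors"
	"sort"
)

// record is a hypothetical index entry: an object ID and the offset of its data.
type record struct {
	ID     []byte
	Offset uint64
}

// pageLoader returns the records stored in one index page, sorted by ID.
// In a real backend this would issue a ranged read for that page.
type pageLoader func(page int) ([]record, error)

// findPaged binary searches over pages, loading only the pages it inspects,
// then searches inside the page whose ID range contains id.
// It returns nil, nil when id is not present.
func findPaged(id []byte, pageCount int, load pageLoader) (*record, error) {
	lo, hi := 0, pageCount-1
	for lo <= hi {
		mid := lo + (hi-lo)/2
		recs, err := load(mid)
		if err != nil {
			return nil, err
		}
		if len(recs) == 0 {
			return nil, errors.New("empty index page")
		}
		switch {
		case bytes.Compare(id, recs[0].ID) < 0:
			hi = mid - 1 // id sorts before this page
		case bytes.Compare(id, recs[len(recs)-1].ID) > 0:
			lo = mid + 1 // id sorts after this page
		default:
			// id falls inside this page's range; search within the page.
			i := sort.Search(len(recs), func(i int) bool {
				return bytes.Compare(recs[i].ID, id) >= 0
			})
			if i < len(recs) && bytes.Equal(recs[i].ID, id) {
				return &recs[i], nil
			}
			return nil, nil
		}
	}
	return nil, nil
}
```

With this shape a lookup touches about log2(pageCount) pages rather than the whole index, which is the property that matters as compaction produces larger blocks.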
This now only applies to the index. Bloom filter sizes were addressed in another PR.
The current backend design hits a number of limitations as the size of the index and bloom filters continues to grow due to compaction. Reorganize around the idea of page access to allow bloom/index/objects to grow larger as time goes on. Research and determine a good page size that is cache friendly and easy to work with.
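As a minimal sketch of page access against an HTTP object store, the snippet below maps a page number to a byte range and fetches only that page. The helper names `pageRange` and `readPage`, the object URL, and the page size are assumptions for illustration. A real backend would more likely go through its cloud SDK than raw HTTP.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// pageRange returns the Range header value covering one fixed-size page.
// HTTP byte ranges are inclusive on both ends.
func pageRange(page, pageSize int64) string {
	start := page * pageSize
	end := start + pageSize - 1
	return fmt.Sprintf("bytes=%d-%d", start, end)
}

// readPage fetches a single page of the object at url (a hypothetical
// object URL) using a ranged GET.
func readPage(client *http.Client, url string, page, pageSize int64) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", pageRange(page, pageSize))

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	// 206 means the server honored the range. A 200 would mean it ignored
	// the header and is sending the whole object.
	if resp.StatusCode != http.StatusPartialContent {
		return nil, fmt.Errorf("expected 206 Partial Content, got %s", resp.Status)
	}
	// The last page of an object may be shorter than pageSize.
	return io.ReadAll(io.LimitReader(resp.Body, pageSize))
}
```

For example, `pageRange(2, 4096)` yields `bytes=8192-12287`, so page size directly controls how many bytes each lookup pulls from the backend.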
One of the big concerns is partial updates. Partial reads are possible by using `GET` with the `Range` request header (the server answers with `Content-Range`), but it is uncertain whether partial updates are possible with the `PATCH` method. This will not matter to ingestion, as it will cut relatively small blocks, but compaction will struggle as blocks become larger and larger.
Bloom
https://news.ycombinator.com/item?id=22010586
https://github.com/nehbit/aether/blob/master/aether-core/aether/services/rollingbloom/rollingbloom.go
Index
Objects
https://en.wikipedia.org/wiki/External_memory_algorithm
https://en.wikipedia.org/wiki/Cache-oblivious_algorithm