state_machine: reduce memory usage by about 200 MiB #1429
Merged
This one is tricky! The big picture here is that we have a cache of objects, which is a normal cache with an arbitrary eviction policy.
However, we want to maintain an invariant: no object touched during a bar of events may be evicted during that bar.
To achieve that, we place a stash below the cache. The job of the stash is to catch every object that falls out of the cache within a single bar (between bars, the stash is reset).
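A minimal sketch of the cache-plus-stash arrangement described above, for illustration only. The class name, the method names, and the LRU eviction policy are assumptions made for this example and are not taken from the code in this PR; any eviction policy would do.

```python
from collections import OrderedDict


class CacheWithStash:
    """Hypothetical sketch: an LRU cache whose evictions land in a stash."""

    def __init__(self, cache_capacity, stash_capacity):
        self.cache_capacity = cache_capacity
        self.stash_capacity = stash_capacity
        self.cache = OrderedDict()  # key -> value, least recently used first
        self.stash = {}             # objects evicted during the current bar

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        # Not in the cache: it may have been evicted earlier in this bar.
        return self.stash.get(key)

    def insert(self, key, value):
        self.stash.pop(key, None)  # avoid keeping a stale copy in the stash
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.cache_capacity:
            evicted_key, evicted_value = self.cache.popitem(last=False)
            # The stash must be large enough to absorb every eviction in a bar.
            assert len(self.stash) < self.stash_capacity, "stash too small"
            self.stash[evicted_key] = evicted_value

    def end_bar(self):
        # Between bars the stash is reset; the invariant only spans one bar.
        self.stash.clear()
```

With `cache_capacity=2`, inserting three keys evicts the oldest one into the stash, where `get` still finds it until `end_bar` is called.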
What's the size of the stash that we need?
The conservative estimate is the number of queries to the cache. That is, inserts + lookups, and that is, using the old logic,
The insight of this commit is that a lookup and an insert for the same key are double-counted that way.
In other words, what we are interested in is not the total number of queries to the cache, but the number of distinct keys those queries touch.
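To make the double counting concrete, here is a small sketch comparing the two ways of bounding the stash size. The function names and the `(operation, key)` representation of a query are assumptions made for this example, not the sizing code from this PR.

```python
def stash_bound_by_queries(queries):
    """Conservative bound: every lookup and every insert counts once."""
    return len(queries)


def stash_bound_by_keys(queries):
    """Tighter bound: count each distinct key once, however often it is queried."""
    return len({key for _operation, key in queries})


# A bar that prefetches two objects and then updates the same two objects.
queries = [
    ("lookup", "a"),
    ("lookup", "b"),
    ("insert", "a"),
    ("insert", "b"),
]
```

For this bar, `stash_bound_by_queries(queries)` counts four queries while `stash_bound_by_keys(queries)` counts two keys, so the query-based bound reserves twice as many stash slots as can ever be needed.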
And for most operations, we are actually going to update exactly the keys we've prefetched.
The three exceptions are: