The first phase of disk usage eviction is to enumerate all layers across all tenants, so that the layers can then be globally ordered by LRU. This builds an O(n_layers) data structure, which can mean millions of entries. That has a memory cost, and also generates a lot of spurious atomic operations from cloning Arcs into EvictionCandidate, etc.
Basic approach: avoid using O(N_layers) memory
We can make this much more scalable by accepting an inexact ordering:
First calculate how much space we want to free
Then consider the first 10% of tenants: order their layers, and then delete layers until we have reclaimed 10% of the target
...and so on for the next 10% of tenants, etc.
That way we avoid holding the entire list of layers in memory at any one time. A rough sketch of this chunked pass is below.
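A minimal sketch of the chunked pass, using simplified stand-in types (Layer, Tenant, evict_in_chunks) rather than the real pageserver structures; only the 10% slicing and per-slice accounting come from the idea above, everything else is illustrative:

```rust
use std::time::SystemTime;

// Stand-in types; the real code would use the pageserver's tenant/layer structures.
struct Layer {
    last_access: SystemTime,
    size: u64,
}

struct Tenant {
    layers: Vec<Layer>,
}

/// Process tenants in ten slices: LRU-order only one slice's layers at a time
/// and ask that slice to free one tenth of the overall target, so the full
/// global candidate list is never materialized.
fn evict_in_chunks(tenants: &mut [Tenant], target_bytes: u64) {
    const CHUNKS: usize = 10;
    if tenants.is_empty() {
        return;
    }
    let per_chunk_target = target_bytes / CHUNKS as u64;
    let chunk_len = (tenants.len() + CHUNKS - 1) / CHUNKS;

    for chunk in tenants.chunks_mut(chunk_len) {
        // Collect (tenant index, layer index) pairs for this slice only.
        let mut candidates: Vec<(usize, usize)> = chunk
            .iter()
            .enumerate()
            .flat_map(|(t, tenant)| (0..tenant.layers.len()).map(move |l| (t, l)))
            .collect();
        // Oldest last access first, i.e. LRU order within the slice.
        candidates.sort_by_key(|&(t, l)| chunk[t].layers[l].last_access);

        let mut freed = 0u64;
        for (t, l) in candidates {
            if freed >= per_chunk_target {
                break;
            }
            freed += chunk[t].layers[l].size;
            // The real code would go through the eviction path here; in this
            // sketch we just zero the size to mark the layer as evicted.
            chunk[t].layers[l].size = 0;
        }
    }
}
```

The ordering is only approximate globally: a layer in an early slice can be evicted even though a later slice holds older layers, which is exactly the inexactness accepted above.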
Sampling approach: operate in constant-size memory
A more sophisticated approach would be to use statistical sampling of the layer age distribution:
Make a histogram of layer ages
Sample a modest number of layers from a modest number of tenants, e.g. 100 layers from each of 100 tenants.
To free 10% of the used space, take the 10th-percentile point of the histogram, counting from the oldest layers: that is our age threshold for evicting layers
Iterate through tenants & layers, evicting anything older than the age threshold. A sketch of the threshold computation follows.
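A minimal sketch of the threshold computation. It takes the quantile directly from a sorted sample rather than from an explicit histogram, and approximates "space" by layer count (assuming roughly uniform layer sizes); both simplifications are mine, not part of the proposal above:

```rust
use std::time::Duration;

/// Given sampled layer ages (time since last access), return the age cutoff
/// such that roughly `fraction` of the sampled layers are at least that old.
fn age_threshold(mut sampled_ages: Vec<Duration>, fraction: f64) -> Option<Duration> {
    if sampled_ages.is_empty() {
        return None;
    }
    // Sort oldest first; the cutoff is `fraction` of the way into that order.
    sampled_ages.sort_by(|a, b| b.cmp(a));
    let idx = ((sampled_ages.len() as f64 * fraction).ceil() as usize)
        .clamp(1, sampled_ages.len())
        - 1;
    Some(sampled_ages[idx])
}

fn main() {
    // In practice this would hold e.g. 100 sampled ages from each of 100
    // tenants; a toy ramp of ages is enough to show the computation.
    let sample: Vec<Duration> = (1..=100).map(|m| Duration::from_secs(m * 60)).collect();
    // To free ~10% of space, find the age that ~10% of sampled layers exceed,
    // then evict anything at least that old.
    let threshold = age_threshold(sample, 0.10).unwrap();
    println!("evict layers not accessed for {threshold:?} or longer");
}
```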
Unfair but fast approach
Avoid iterating through all tenants, by accepting that some tenants will "take one for the team" so that we don't have to touch every one of them.
For example, to touch only half the tenants:
Start with the sampling approach
If our target is 10%, then adjust it up to 20%
Pick a random 50% subset of the tenants, and apply eviction with the 20% threshold
This will work fine as long as eviction runs somewhat regularly, since each run picks a different random subset of tenants. A sketch of how the subset and scaled target could be chosen is below.
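A minimal sketch of picking the subset and scaling the target; TenantId, the use of the rand crate, and the shuffle-and-truncate selection are assumptions for illustration, not the actual implementation:

```rust
use rand::seq::SliceRandom;

// Illustrative stand-in for the real tenant identifier type.
struct TenantId(u64);

/// Pick a random `tenant_fraction` of tenants and return them together with
/// the scaled eviction target to apply to just that subset.
fn plan_unfair_eviction(
    mut tenants: Vec<TenantId>,
    target_fraction: f64, // e.g. 0.10: free 10% of used space overall
    tenant_fraction: f64, // e.g. 0.50: only touch half the tenants
) -> (Vec<TenantId>, f64) {
    let mut rng = rand::thread_rng();
    tenants.shuffle(&mut rng);
    let keep = ((tenants.len() as f64) * tenant_fraction).ceil() as usize;
    tenants.truncate(keep);
    // The visited tenants must free proportionally more so the global target
    // is still met: a 10% target across half the tenants becomes 20%.
    (tenants, target_fraction / tenant_fraction)
}
```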
For each attached timeline we do take the LayerMap rwlock, which is a tokio lock, so the iteration makes progress under tokio's coop facilities and yields every now and then. This is not true for secondaries; I'll add a yield per secondary tenant.
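A minimal sketch of what that yield could look like, assuming the secondary scan is a plain synchronous loop; SecondaryTenant and scan_layers are illustrative stand-ins, not the actual pageserver code:

```rust
struct SecondaryTenant;

fn scan_layers(_tenant: &SecondaryTenant) {
    // Walk the tenant's on-disk layer metadata (synchronous, no tokio lock).
}

async fn collect_secondary_candidates(secondary_tenants: &[SecondaryTenant]) {
    for tenant in secondary_tenants {
        scan_layers(tenant);
        // Unlike attached timelines, nothing here goes through a tokio lock,
        // so the coop budget never forces a yield; do it explicitly per
        // tenant so a long scan cannot starve other tasks on the executor.
        tokio::task::yield_now().await;
    }
}
```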