[QUESTION] LFU Morris Counter #7943

@minilek

The documentation page https://redis.io/topics/lru-cache states that LFU uses the Morris Counter, but it seemed to not be the case when I tried looking at the source code (line 315 of https://github.com/redis/redis/blob/unstable/src/evict.c). I am raising this issue since I'm not sure whether it's intentional or a bug.

Before I continue, I should give some background since not everyone reading this might be familiar. Think of the following data structure problem: you want to maintain a counter N subject to three operations: (1) init() sets N to 0, (2) increment() changes N to N+1, and (3) query() reports the value of N. This is solved easily with an ordinary counter, which uses O(log N) bits of memory. Morris figured out that if you only want a randomized algorithm that approximates N with good probability, you can do better: more like O(log log N) bits. The absolute most basic idea is: what if instead of storing N, we store a different counter X. init() sets X to 0. During increment(), we increment X randomly: with 50% probability we increment it, else we do nothing! Then E[X] = N/2, so we can estimate N by 2*X during a query(). The benefit is that since X is typically only half as big as N, we save a bit by storing X instead.
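To make the coin-flip idea concrete, here is a minimal Python sketch. The class name `HalfCounter` and the injectable `rng` argument are my own choices for illustration, not anything from Redis or the paper.

```python
import random


class HalfCounter:
    """Store X instead of N: each increment bumps X with probability 1/2."""

    def __init__(self, rng=None):
        # rng is injectable only so that runs can be made reproducible.
        self.rng = rng or random.Random()
        self.x = 0  # init(): X = 0

    def increment(self):
        # Flip a fair coin; only count this event on heads.
        if self.rng.random() < 0.5:
            self.x += 1

    def query(self):
        # E[X] = N/2, so 2*X is an unbiased estimate of N.
        return 2 * self.x


if __name__ == "__main__":
    counter = HalfCounter(random.Random(0))
    n = 100_000
    for _ in range(n):
        counter.increment()
    print("true N:", n, "stored X:", counter.x, "estimate:", counter.query())
```

X is about N/2, so this saves roughly one bit and nothing more.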

OK, the above idea is fine, but it's only saving at most 1 bit, and has large probability of large error, so it's not that great. How did Morris fix it? The first idea is to increment even less often. We increment X with probability 1/2^X. One can then show that the estimator N~ = (2^X) - 1 is an unbiased estimator of N, i.e. E[N~] = N. Now X is typically roughly log N, which means we can store it in O(log log N) bits. The variance isn't great though. This is fixed by making the following observation: if we increment with probability 1.0^X, then the memory sucks (log N bits), but the variance is great (it's zero; this is a deterministic counter). On the other hand, we know that if we increment with probability 0.5^X, then the memory is great (loglog N bits), but the variance sucks. A natural question, which Morris asked and answered, is what happens if we increment with probability 0.99^X? Or more generally, probability 1/(1+a)^X for some small a close to 0? It turns out that the memory is concentrated around O(loglog N + log(1/a)) bits, and the variance goes to 0 as a goes to 0. Specifically, by setting "a" appropriately small it's possible to show that the estimator N^ = (1/a)((1+a)^X - 1) satisfies Pr(|N^ - N| > eps*N) < delta, and the space usage is O(loglog N + log(1/eps) + loglog(1/delta)) bits (see https://arxiv.org/abs/2010.02116).
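As a rough illustration of the parametrized counter described above, here is a small Python sketch. The class name `MorrisCounter`, the default `a=1.0` (which gives the 1/2^X rule) and the injectable `rng` are illustrative assumptions of mine; this is not the code from the linked paper.

```python
import random


class MorrisCounter:
    """Increment X with probability 1/(1+a)^X; estimate N as ((1+a)^X - 1)/a."""

    def __init__(self, a=1.0, rng=None):
        if a <= 0:
            raise ValueError("a must be positive")
        self.a = a
        self.rng = rng or random.Random()
        self.x = 0  # init(): X = 0

    def increment(self):
        # The larger X already is, the less likely this event is counted.
        if self.rng.random() < (1.0 + self.a) ** (-self.x):
            self.x += 1

    def query(self):
        # N^ = (1/a) * ((1+a)^X - 1); with a = 1 this is 2^X - 1.
        return ((1.0 + self.a) ** self.x - 1.0) / self.a


if __name__ == "__main__":
    n = 50_000
    for a in (1.0, 0.1, 0.01):
        counter = MorrisCounter(a=a, rng=random.Random(0))
        for _ in range(n):
            counter.increment()
        print(f"a={a}: stored X={counter.x}, estimate={counter.query():.0f}, true N={n}")
```

Running it with a few values of `a` shows the trade-off from the paragraph above: smaller `a` makes the stored X larger (more bits) but pulls the estimate closer to N.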

Now back to Redis: Redis neither increments X w.p. 1/2, nor 1/2^X, nor 1/(1+a)^X. Rather, it seems it increments X with probability 1/(1 + a*X) (where 'a' is server.lfu_log_factor, and is set to 10 by default but can be changed). It only does such probabilistic increments once X > LFU_INIT_VAL (which is 5 by default). This is a good decision: even for the Morris Counter it can be shown that if you want good estimates for small N, you should use a deterministic counter at first and only switch to probabilistic updates as N grows. What concerns me though is the update probability being ~ 1/X instead of 1/(1+a)^X. There isn't any analysis I'm aware of which shows that this has good behavior, either analytically or empirically. Is there a reason this update rule was chosen? I also spent a few minutes trying to figure out what the estimator should be to get an unbiased estimate of N, but that wasn't clear to me either.
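For comparison, here is a Python sketch of the update rule as I read it above. It is a paraphrase of my description, not a copy of the C code in evict.c. The function names are hypothetical, the 255 cap assumes the counter is stored in 8 bits, and whether X is offset by LFU_INIT_VAL before being multiplied by 'a' should be checked against the source. The helper `expected_hits_to_reach` only sums the expected waiting times 1/p at each level; I am not claiming it is an unbiased estimator of N.

```python
import random

LFU_INIT_VAL = 5     # default mentioned above
LFU_LOG_FACTOR = 10  # default server.lfu_log_factor mentioned above
COUNTER_MAX = 255    # assumption: the counter saturates at 8 bits


def increment_probability(x, a=LFU_LOG_FACTOR, init_val=LFU_INIT_VAL):
    """Probability of bumping X under the rule described in this issue."""
    if x <= init_val:
        return 1.0  # deterministic phase while X <= LFU_INIT_VAL
    return 1.0 / (1.0 + a * x)


def lfu_style_increment(x, rng, a=LFU_LOG_FACTOR, init_val=LFU_INIT_VAL):
    """Return the new counter value after one access."""
    if x >= COUNTER_MAX:
        return x  # saturated
    if rng.random() < increment_probability(x, a, init_val):
        return x + 1
    return x


def expected_hits_to_reach(x, a=LFU_LOG_FACTOR, init_val=LFU_INIT_VAL):
    """Expected number of accesses for the counter to climb from 0 to x.

    Each level k takes a geometric number of accesses with mean 1/p(k),
    so the total is the sum of 1/p(k) for k < x.
    """
    return sum(1.0 / increment_probability(k, a, init_val) for k in range(x))


if __name__ == "__main__":
    rng = random.Random(0)
    x = 0
    n = 100_000
    for _ in range(n):
        x = lfu_style_increment(x, rng)
    print("true N:", n, "stored X:", x)
    print("expected accesses to reach that X:", expected_hits_to_reach(x))
```

Since 1/p(k) grows linearly in k, the sum grows roughly like a*X^2/2, so X grows like sqrt(N) rather than log N under this rule.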

P.S. I'm attaching here a plot of running the actual Morris Counter parametrized to use 8 bits of memory whp. I did the following 5000 times using code Huacheng and I wrote when we were working on our previously linked paper: pick a random number N between 50k and 100k (so about 16-17 bits) then increment() the Morris Counter N times. As you can see, even though we only used 8 bits to count this bigger number, the median error was about 10% and the error was never more than 50%. Note: a point at (x,y) in this plot means x% of the time the relative error was y% or less.

[Image: morris_8bit]
