
allow stats to be disabled by setting environment variable STATS=0 #96

Open
wants to merge 1 commit into from

3 participants

@richardkmiller

Allow stats collection to be disabled by setting STATS=0 in the environment.
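The commit itself isn't reproduced in this thread, so the following is only a minimal sketch of how such a switch could be wired into Resque_Stat. The enabled() helper and the exact STATS=0 check are assumptions on my part; the incrby() call through Resque::redis() mirrors php-resque's existing stat code.

```php
<?php
// Sketch only (not the actual diff): skip writing resque:stat:* counters
// when the STATS environment variable is set to 0.
class Resque_Stat
{
    // Hypothetical helper: stats stay on unless STATS=0 is exported.
    public static function enabled()
    {
        return getenv('STATS') !== '0';
    }

    public static function incr($stat, $by = 1)
    {
        if (!self::enabled()) {
            return true; // Report success without touching Redis.
        }
        return (bool)Resque::redis()->incrby('stat:' . $stat, $by);
    }
}
```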

@danhunsaker

What's the benefit in doing this? Can't you get the same effect by simply passing false (the default) as the fourth (and final) argument to Resque::enqueue()?
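For reference, that fourth parameter is the per-job status-tracking flag. A usage sketch (the queue name, job class, and arguments below are illustrative, and php-resque's classes are assumed to be autoloaded):

```php
<?php
$args = array('user_id' => 123);

// Fourth argument omitted or false (the default): no resque:job:<id>:status
// key is written for this job.
Resque::enqueue('default', 'My_Job', $args);

// Fourth argument true: a status key is created, and the returned token can
// be used to poll the job through Resque_Job_Status.
$token = Resque::enqueue('default', 'My_Job', $args, true);
$status = new Resque_Job_Status($token);
echo $status->get();
```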

@richardkmiller

My understanding is that the $trackStatus/$monitor feature of php-resque inserts Redis keys like "resque:job:123:status", and it's disabled by default.

Stats, on the other hand, look like "resque:stat:failed:..." or "resque:stat:processed:..." and are always on. We don't use stats and they were filling up our Redis database, so we disabled them.
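To make the distinction concrete, this is roughly where those stat keys come from, simplified from php-resque's worker and job code (the worker id shown is made up):

```php
<?php
// Global counters, bumped once per job processed or failed:
Resque_Stat::incr('processed');              // resque:stat:processed
Resque_Stat::incr('failed');                 // resque:stat:failed

// Per-worker counters, keyed by the worker id (hostname:pid:queues);
// the id below is illustrative:
$workerId = 'web01:12345:default';
Resque_Stat::incr('processed:' . $workerId); // resque:stat:processed:web01:12345:default
Resque_Stat::incr('failed:' . $workerId);    // resque:stat:failed:web01:12345:default

// Reading a counter back:
echo Resque_Stat::get('processed');
```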

@chrisboulton
Owner

The stats are just key/value counters. Were they really consuming so much memory that they became an issue?

@richardkmiller

The stats may be small, but we were trying to conserve as much memory as possible on an 8GB Redis instance that's handling millions of jobs. It ran out of memory a couple of times, so the stats were an easy thing to cut since we don't use them. It's also nice, when you manually run "redis-cli keys *" to diagnose an issue, to have far fewer keys to look through.

@danhunsaker

This is intriguing, because the number of stat keys you'd expect to see should never exceed 2w+2, where w is the number of worker processes running at once. When a worker shuts down, it removes its stat keys from the DB. When new workers start, they clean out any keys associated with dead workers. So unless you have a huge number of workers in play, you should never reach the point where pruning the stat keys makes any noticeable difference, especially since the number of jobs processed is completely independent of the number of workers in play, and thus of the number of keys added by leaving this feature enabled.
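A quick back-of-the-envelope check of that 2w+2 figure, assuming one processed and one failed counter per worker plus the two global counters (the worker counts below are made up):

```php
<?php
// 2w+2: a processed and a failed counter per worker, plus the two globals.
function statKeyCount($workers)
{
    return 2 * $workers + 2;
}

foreach (array(6, 100, 500) as $w) {
    printf("%d workers => %d stat keys\n", $w, statKeyCount($w));
}
// 6 workers => 14 stat keys
// 100 workers => 202 stat keys
// 500 workers => 1002 stat keys
```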

On that note, you never specified a time period in which those "millions of jobs" are processed. Two million jobs per day is around 23 per second, for example, and depending on job complexity, that could reasonably be handled by even fewer workers - say, 6 workers doing tasks that consume around a quarter second apiece. Longer-running jobs (say, 10 seconds each) would of course require more workers to handle the same volume of work in the same amount of time (about 230 would suffice).
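The same arithmetic as a quick sanity check, using the hypothetical volume and durations from the paragraph above:

```php
<?php
// Hypothetical load: 2 million jobs per day.
$jobsPerSecond = 2000000 / 86400;         // ~23.1 jobs per second

// Workers needed ~= arrival rate * average job duration.
echo ceil($jobsPerSecond * 0.25), "\n";   // ~6 workers at 0.25s per job
echo ceil($jobsPerSecond * 10), "\n";     // ~232 workers at 10s per job
```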

By the same token, if you instead meant that you have millions of workers running simultaneously, you absolutely need to switch to a Redis cluster, and immediately. A single instance, no matter how much RAM it has allocated, is going to serve as a far greater bottleneck than even process forking would. That would probably also alleviate the out-of-memory issues you've been seeing. Though I think I'd have to agree, at that point, that jettisoning the stat keys would free up a good deal of keyspace.

I suspect the reality is somewhere in between the extremes I painted above. I suspect you have dozens of workers, or maybe low hundreds. I suspect that the 2w+2 is large enough to be annoying in your use case, but submit that it isn't quite as high as it might seem. I also suspect that there are some other optimizations that could be made in your project - using a different database number for non-Resque keys, reducing the number and size of arguments passed into jobs, streamlining tasks so they execute more quickly with fewer resources, and so forth - which would more than compensate for the amount of memory used by your stat keys.

That said, I can see the utility of this particular quick fix in your use case. The thing is, environments that don't closely resemble your (admittedly high-volume) setup won't benefit enough to care, and I suspect they far outnumber the environments that do. So I guess we'll see what happens. This could be useful, but only in certain cases.

@richardkmiller

Our number of workers is in the high hundreds, so it was worth it to us to remove stats at the time. But we also upgraded the memory on our Redis server recently, so it's less of a concern now. If this PR is useful to anyone else, great; it was what we needed at the time. If not, I can understand that too.
