activeExpireCycle() tries to test just a few DBs per iteration so that it scales if there are many configured DBs in the Redis instance. However, this commit makes it a bit smarter when one or a few of those DBs are under expiration pressure and there are many keys to expire. What we do is remember whether the last iteration had to return because we ran out of time. In that case, in the next iteration we test all the configured DBs, so that we are sure to test the DB under pressure again. Before this commit, after a mass-expire in a given DB, the function tested just a few of the next DBs, possibly empty, a few per iteration, so it took a long time for the function to reach the DB under pressure again. This resulted in a lot of memory being used by keys that were already expired and never accessed by clients.
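The selection logic described above can be sketched roughly as follows. This is a simplified simulation, not the actual activeExpireCycle() code: the constant DBS_PER_CALL, the pressured_db variable and the function names are hypothetical, and the real per-DB expiration work is replaced by a check that pretends one DB always hits the time limit.

```c
#define DBS_PER_CALL 16              /* assumed "few DBs" budget per iteration */

static unsigned int current_db = 0;  /* next DB to test, kept across calls */
static int timelimit_exit = 0;       /* set when the last call ran out of time */
static int pressured_db = -1;        /* hypothetical: DB whose work hits the time limit */

/* How many DBs the next iteration will test. */
int dbs_to_test(int dbnum) {
    int dbs_per_call = DBS_PER_CALL;
    /* Test every DB if there are few of them, or if the previous
     * iteration had to stop because it ran out of time. */
    if (dbs_per_call > dbnum || timelimit_exit) dbs_per_call = dbnum;
    return dbs_per_call;
}

/* One simulated iteration. Returns the number of DBs actually tested. */
int expire_cycle(int dbnum) {
    int dbs_per_call = dbs_to_test(dbnum);
    int tested = 0;

    timelimit_exit = 0;
    for (int j = 0; j < dbs_per_call; j++) {
        int db = (int)(current_db % (unsigned int)dbnum);
        /* Advance now, so that after a timeout we restart from the next DB. */
        current_db++;
        tested++;
        /* Stand-in for the real expire loop running out of time in this DB. */
        if (db == pressured_db) {
            timelimit_exit = 1;
            break;
        }
    }
    return tested;
}
```

With 64 DBs and pressure on DB 3, the iteration after a timeout scans all 64 DBs and so reaches DB 3 again in a single call, instead of needing four calls of 16 DBs each.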
This small number of DBs is set to 16, so in the default configuration Redis should behave exactly as in the past. The difference is that when the user configures a very large number of DBs, we no longer do an O(N) operation that consumes a non-trivial amount of CPU per serverCron() iteration.
This is the first step to lower the CPU usage when many databases are configured. The other is to also process a limited number of DBs per call in the active expire cycle.
REDIS_HZ is the frequency our serverCron() function is called with. Calling this function more frequently results in less latency when the server is trying to handle very expensive background operations, like mass expires of a lot of keys at the same time. Redis 2.4 used to have an HZ of 10. This was good enough with almost every setup, but the incremental key expiration algorithm worked a bit better under *extreme* pressure when HZ was set to 100 for Redis 2.6. However, for most users a latency spike of 30 milliseconds when millions of keys are expiring at the same time is acceptable. On the other hand, a default HZ of 100 in Redis 2.6 was causing idle instances to use some CPU time compared to Redis 2.4. The CPU usage was in the order of 0.3% for an idle instance; however, this is a shame as more energy is consumed by the server, if not important resources. This commit introduces HZ as a runtime parameter that can be queried with INFO or CONFIG GET, and can be modified with CONFIG SET. At the same time the default frequency is set back to 10. In this way we default to a sane value of 10, but allow users to easily switch to values up to 500 for near real-time applications if needed and if they are willing to pay this small CPU usage penalty.
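To make the relation between HZ and the serverCron() period concrete, here is a minimal sketch. The default of 10 and the upper bound of 500 come from the text above; the lower bound of 1 and the helper names are assumptions made for illustration only.

```c
#define DEFAULT_HZ 10   /* default frequency restored by this commit */
#define MAX_HZ     500  /* upper bound mentioned in the text */
#define MIN_HZ     1    /* assumption: lowest accepted value */

/* Clamp a user-supplied HZ value into the accepted range. */
int clamp_hz(int hz) {
    if (hz < MIN_HZ) return MIN_HZ;
    if (hz > MAX_HZ) return MAX_HZ;
    return hz;
}

/* Milliseconds between two serverCron() calls for a given HZ. */
int cron_period_ms(int hz) {
    return 1000 / clamp_hz(hz);
}
```

With the default of 10 the cron function runs every 100 milliseconds, with 100 every 10 milliseconds, and with the maximum of 500 every 2 milliseconds.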
A new server.orig_commands table was added to the server structure. It contains a copy of the command table unaffected by rename-command statements in redis.conf. A new API, lookupCommandOrOriginal(), was added that checks both tables, the new one first and the original one later, so that rewriteClientCommandVector() and friends can look up commands by their new or original name in order to fix the client->cmd pointer when the argument vector is rewritten. This fixes the segfault of issue #986, but does not fix a wider range of problems resulting from renaming commands that actually operate on data and are registered into the AOF file or propagated to slaves... That is, command renaming should be handled with care.
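The two-table lookup can be sketched as below. This is only an illustration of the "new first, original later" order: the real tables are not plain arrays, and the struct layout, the helper lookup_in() and the function name lookup_command_or_original() are hypothetical stand-ins.

```c
#include <stddef.h>
#include <strings.h>  /* strcasecmp (POSIX) */

struct command { const char *description; };

/* One table entry: the name a command is reachable by, and the command. */
struct entry { const char *key; struct command *cmd; };

/* Case-insensitive lookup in a single table; NULL if not found. */
static struct command *lookup_in(const struct entry *table, size_t len,
                                 const char *name) {
    for (size_t i = 0; i < len; i++)
        if (strcasecmp(table[i].key, name) == 0) return table[i].cmd;
    return NULL;
}

/* Check the current (possibly renamed) table first, then fall back to the
 * table holding the original names. */
struct command *lookup_command_or_original(const struct entry *commands,
                                           size_t ncommands,
                                           const struct entry *orig_commands,
                                           size_t norig,
                                           const char *name) {
    struct command *cmd = lookup_in(commands, ncommands, name);
    if (cmd == NULL) cmd = lookup_in(orig_commands, norig, name);
    return cmd;
}
```

If a command was renamed in redis.conf, a lookup by its original name misses in the current table but still resolves through the original table, so the caller never ends up with a NULL command pointer for a command that exists.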
Usually this does not happen since we trim for " \t\r\n", but if there are other chars for which isspace() returns true, we may end up with an empty argv. Better to handle the condition explicitly.
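A small sketch of why the explicit check is needed, under the assumption that arguments are split on isspace() while trimming only removes " \t\r\n". The function names are hypothetical and the splitting is reduced to counting tokens.

```c
#include <ctype.h>
#include <string.h>

/* True if the line consists only of the characters we trim. */
int blank_after_trim(const char *line) {
    return strspn(line, " \t\r\n") == strlen(line);
}

/* Count the tokens obtained by splitting on any isspace() character. */
int count_args(const char *line) {
    int argc = 0;
    while (*line) {
        while (*line && isspace((unsigned char)*line)) line++;
        if (*line == '\0') break;
        argc++;
        while (*line && !isspace((unsigned char)*line)) line++;
    }
    return argc;
}

/* A line is processed only if it survives trimming AND yields arguments. */
int line_has_args(const char *line) {
    if (blank_after_trim(line)) return 0;
    return count_args(line) > 0;  /* explicit guard against an empty argv */
}
```

A line made only of vertical tab and form feed characters ("\v\f") is not removed by the trim, yet it splits into zero arguments, which is exactly the case the explicit check covers.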
This should improve things in two ways: 1) Prevent timeouts caused by the execution of long commands. 2) Improve detection of real connection errors. This is mostly effective only on Linux: the default keepalive settings are bogus, and on Linux we have OS-specific calls to set the keepalive interval to reasonable values.
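For reference, enabling keepalive on a TCP socket and tuning it with the Linux-specific options looks roughly like this. It is a sketch assuming a POSIX socket descriptor: the function name enable_keepalive(), the interval parameter and the way the probe timings are derived from it are illustrative choices, not values taken from this commit.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Turn on TCP keepalive for fd. On Linux, also override the system
 * defaults using `interval` (in seconds). Returns 0 on success, -1 on error. */
int enable_keepalive(int fd, int interval) {
    int val = 1;

    /* Portable part: ask the kernel to send keepalive probes at all. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, sizeof(val)) == -1)
        return -1;

#ifdef __linux__
    /* Idle time before the first probe is sent. */
    val = interval;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &val, sizeof(val)) == -1)
        return -1;

    /* Delay between probes (illustrative: a third of the interval). */
    val = interval / 3;
    if (val == 0) val = 1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &val, sizeof(val)) == -1)
        return -1;

    /* Unanswered probes before the connection is considered dead. */
    val = 3;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &val, sizeof(val)) == -1)
        return -1;
#endif

    return 0;
}
```

On systems other than Linux only SO_KEEPALIVE is set, so the kernel defaults for probe timing still apply there.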
Otherwise we end up with less reliable connections, because it is too easy for a single packet to get lost.