Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fix SCAN bug regression test, avoiding empty SREM call. config: handle special configuration "" for auth rdb: incremental fsync when redis saves rdb add rdb-save-incremental-fsync option in redis.conf Fix DEBUG LOADAOF so that redis-server will not crash unexpectedly and will not be inconsistent after we call debug loadaof. Before this commit, there were 2 problems: 1, When appendonly is set to no and there is not a appendonly file, redis-server will crash if we call DEBUG LOADAOF. 2, When appendonly is set to no and there is a appendonly file, redis-server will hold different data after loading appendonly file. Fix core dump when using 'command getkeys' with wrong arguments. Update geohash.c fix geohasEncode bug when step > 31 optimize reply list memory usage Fix config_set_numerical_field() integer overflow. fix exists command on slave Fix infinite loop in dbRandomKey(). Thanks to @kevinmcgehee for signaling the issue and reasoning about the consequences and potential fixes. Issue #5015. Sentinel: add an option to deny online script reconfiguration. The ability of "SENTINEL SET" to change the reconfiguration script at runtime is a problem even in the security model of Redis: any client inside the network may set any executable to be ran once a failover is triggered. This option adds protection for this problem: by default the two SENTINEL SET subcommands modifying scripts paths are denied. However the user is still able to rever that using the Sentinel configuration file in order to allow such a feature. XADD MAXLEN should return an error for values < 0. Streams: fix xreadGetKeys() for correctness. The old version could not handle the fact that "STREAMS" is a valid key name for streams. Now we really try to parse the command like the command implementation would do. Related to #5028 and 4857. Fix update_zmalloc_stat_alloc in zrealloc Update sort.c Redundant second if, and we may use "not" operation for bool, I suppose it's faster. Streams: Change XADD MAXLEN handling of values <= 0. Now a MAXLEN of 0 really does what it means: it will create a zero entries stream. This is useful in order to make sure that the behavior is identical to XTRIM, that must be able to reduce the stream to zero elements when MAXLEN is given. Also now MAXLEN with a count < 0 will return an error. Streams: fix xreadGetKeys() buffer overflow. The loop allocated a buffer for the right number of keys positions, then overflowed it going past the limit. Related to #4857 and cause of the memory violation seen in #5028. Refactor createObjectFromLongLong() to be suitable for value objects. Fix incrDecrCommand() to create shared objects when needed. See #5011. Revert fix #4976 just leaving the flush() part. Update README.md Remove AOF optimization to skip expired keys. Basically we cannot be sure that if the key is expired while writing the AOF, the main thread will surely find the key expired. There are possible race conditions like the moment at which the "now" is sampled, and the fact that time may jump backward. Think about the following: SET a 5 EXPIRE a 1 AOF rewrite starts after about 1 second. The child process finds the key expired, while in the main thread instead an INCR command is called against the key "a" immediately after a fork, and the scheduler was faster to give execution time to the main thread, so "a" is yet not expired. The main thread will generate an INCR a command to the AOF log that will be appended to the rewritten AOF file, but that INCR command will target a non existin "a" key, so a new non volatile key "a" will be created. Two observations: A) In theory by computing "now" before the fork, we should be sure that if a key is expired at that time, it will be expired later when the main thread will try to access to such key. However this does not take into account the fact that the computer time may jump backward. B) Technically we may still make the process safe by using a monotonic time source. However there were other similar related bugs, and in general the new "vision" is that Redis persistence files should represent the memory state without trying to be too smart: this makes the design more consistent, bugs less likely to arise from complex interactions, and in the end what is to fix is the Redis expire process to have less expired keys in RAM. Thanks to Oran Agra and Guy Benoish for writing me an email outlining this problem, after they conducted a Redis 5 code review. 64 bit RDB_OPCODE_RESIZEDB in rdb saving this complication in the code is from times were rdbSaveLen didn't support 64 bits. use safe macro (non empty) in memrev64ifbe to eliminate empty if warning Modify clusterRedirectClient() to handle ZPOP and XREAD. AOF: remove no longer used variable "now". fix redis-rdb-check to provide proper arguments to rdbLoadMillisecondTime due to incorrect forward declaration, it didn't provide all arguments. this lead to random value being read from the stack and return of incorrect time, which in this case doesn't matter since no one uses it. Modules: convert hash to hash table for big objects. Test RDB stream encoding saving/loading. add malloc_usable_size for libc malloc this reduces the extra 8 bytes we save before each pointer. but more importantly maybe, it makes the valgrind runs to be more similiar to our normal runs. note: the change in malloc_stats struct in server.h is to eliminate an name conflict. structs that are not typedefed are resolved from a separate name space. Enhance RESTORE with RDBv9 new features RESTORE now supports: 1. Setting LRU/LFU 2. Absolute-time TTL Other related changes: 1. RDB loading will not override LRU bits when RDB file does not contain the LRU opcode. 2. RDB loading will not set LRU/LFU bits if the server's maxmemory-policy does not match. Fix rdbLoadIntegerObject() to create shared objects when needed. fix streams memory estimation, missing raxSeek Update ZPOPMIN/ZPOPMAX command declaration Unlike the BZPOP variants, these functions take a single key. This fixes an erroneous CROSSSLOT error when passing a count to a cluster enabled server. Fix compiler warning in restoreCommand Sentinel command renaming: config rewriting. Sentinel: make SENTINEL SET able to handle different arities. Sentinel command renaming: fix CONFIG SET after refactoring. Sentinel command renaming: base machanism implemented. Sentinel command renaming: rename-command option parsing. Sentinel command renaming: implement SENTINEL SET. Sentinel command renaming: fix CONFIG SET event logging. Sentinel command renaming: use case sensitive hashing for the dict. Sentinel command renaming: document it into sentinel.conf. Sentinel: drop the renamed-command entry in a more natural way. Instead of telling the user to set the renamed command to "" to remove the renaming, to the obvious thing when a command is renamed to itself. So if I want to remove the renaming of PING, I just rename it to PING again. Fixed replication authentication with whitespace in password Addressed comments Sentinel: fix SENTINEL SET error reporting. Thanks to @shenlongxing for reporting the problem. Related to #5062. Remove black space. Fix type of argslen in sendSynchronousCommand(). Related to #5037. Sentinel: test command renaming feature. test suite infra improvements and fix * fail the test (exit code) in case of timeout. * add --wait-server to allow attaching a debugger * add --dont-clean to keep log files when tests are done add defrag hint support into jemalloc 5 add active defrag support for streams CLIENT ID implemented. Take clients in a ID -> Client handle dictionary. CLIENT UNBLOCK implemented. minor fix in creating a stream NACK for rdb and defrag tests Update help.h Update t_stream.c CLIENT UNBLOCK: support unblocking by error. Add unblock in CLIENT HELP. Make CLIENT HELP output nicer to the eyes. Added link to Google Group fix some compile warnings clients: show pubsub flag in client list clients: add type option for client list Add --no-auth-warning help message. Rax library updated (node callback). Avoid -Woverlength-strings compile warning. Using another fprintf call to output the rest help message. Don't output password warning message when --no-auth-warning is used. Fix code format issue. Fix trailing white space. Check if password is used on command line interface. Change CLIENT LIST TYPE help string. Making it more similar to KILL. cluster.tcl: Add master consecutively down test. fix server.repl_down_since resetting, so that slaves could failover automatically as expected. Update the comment move get clients max buffer calculate into info clients command Remove unnecessary return statements Signed-off-by: charpty <charpty@gmail.com> Adds MODULE HELP and implements addReplySubSyntaxError Globally applies addReplySubSyntaxError Capitalizes subscommands Applies addReplySubSyntaxError to stream commands fix empty string for sentinel rename-command fix tests/test_helper.tcl with --wait-server option. Issue #5063 added --wait-server option, but can not work. addReplySubSyntaxError() renamed to addReplySubcommandSyntaxError(). Set repl_down_since to zero on state change. PR #5081 fixes an "interesting" bug about Redis Cluster failover but in general about the updating of repl_down_since, that is used in order to count the time a slave was left disconnected from its master. While the fix provided resolves the specific issue, in general the validity of repl_down_since is limited to states that are different than the state CONNECTED, and the disconnected time is set when the state is DISCONNECTED. However from CONNECTED to other states, the state machine must always go to DISCONNECTED first. So it makes sense to set the field to zero (since it is meaningless in that context) when the state is set to CONNECTED. Clarify the pending_querybuf field of clients. Localtime function skeleton and file added. Localtime: basics initial calculations. Year missing. Localtime: compute year, month and day of the month. Localtime: fix timezone adjustment. Localtime: day of month is 1 based. Convert from 0 based "days". Localtime: add a test main() function to check the output. Fix indentation. Localtime: fix daylight time documentation and computation. Localtime: fix daylight saving adjustment. Use * not +. Localtime: fix comment about leap year. Localtime: clarify is_leap_year() working with comments. Cache timezone and daylight active flag for safer logging. With such information will be able to use a private localtime() implementation serverLog(), which does not use any locking and is both thread and fork() safe. Use nolocks_localtime() for safer logging. fix compile warning in addReplySubcommandSyntaxError CLIENT UNBLOCK: fix client unblock help message. fixing broken link in CONTRIBUTING http://help.github.com/send-pull-requests/ is no longer supported this change modifies the link to the working one https://help.github.com/articles/creating-a-pull-request/ redis-cli: cliConnect() flags CC_FORCE and CC_QUIET. We need CC_QUIET in order to fix #5096 by silently failing if needed. redis-cli: fix #5096 double error message. redis-cli: fix #4990 additional argument in help. Bugfix: PEL is incorrect when consumer is blocked using xreadgroup with NOACK option. Save NOACK option into client.blockingState structure. Streams: fix xreadgroup crash after xgroup SETID is sent. For issue #5111. Capitalizes subcommands & orders lexicographically limit the size of pending-querybuf in masterclient Improve style of PR #5084. fix whitespace in redis-cli.c Streams: fix unblocking logic into a consumer group. When a client blocks for a consumer group, we don't know the actual ID we want to be served: other clients blocked in the same consumer group may be served first, so the consumer group latest delivered ID changes. This was not handled correctly, all the clients in the consumer group were unblocked without data but the first. Streams: send an error to consumers blocked on non-existing group. To detect when the group (or the whole key) is destroyed to send an error to the consumers blocked in such group is a problem, so we leave the consumers listening, the sysadmin is free to create or destroy groups assuming she/he knows what to do. However a client may be blocked in a given consumer group, that is later destroyed. Then the stream receives new elements. In that case there is no sane behavior to serve the consumer... but to report an error about the group no longer existing. More about detecting this synchronously and why it is not done: 1. Normally we don't do that, we leave clients blocked for other data types such as lists. 2. When we free a stream object there is no longer information about what was the key it was associated with, so while destroying the consumer groups we miss the info to unblock the clients in that moment. 3. Objects can be reclaimed in other threads where it is no longer safe to do client operations. Streams: make blocking for > a truly special case. To simplify the semantics of blocking for a group, this commit changes the implementation to better match the description we provide of conusmer groups: blocking for > will make the consumer waiting for new elements in the group. However blocking for any other ID will always serve the local history of the consumer. However it must be noted that the > ID is actually an alias for the special ID ms/seq of UINT64_MAX,UINT64_MAX. Streams: fix new XREADGROUP sync logic. Streams: fix typo "consumer". fix repeat argument issue and reduce unnessary loop times for redis-cli. Active defrag fixes for 32bit builds (again) * overflow in jemalloc fragmentation hint to the defragger Simplify duplicated NACK #5112 fix. We don't really need to distinguish between the case the consumer is the same or is a different one. Streams: when re-delivering because of SETID, reset deliveries counter. We don't want to increment the deliveries here, because the sysadmin reset the consumer group so the desire is likely to restart processing, and having the PEL polluted with old information is not useful but probably confusing. Related to #5111. Add regression test for #5111. Delete unused role checking. When check rdb file, it is unnecessary to check role. Cluster Manager: auth support (-a argument). Redis-trib deprecated: it no longer works and it outputs a warning to the user. Cluster Manager: more checks on --cluster-weight option. Add test in slowlog.tcl Fix config set slowlog-log-slower-than and condition in createLatencyReport Test: XDEL basic test. update leap year comment Test: XDEL fuzz testing, stream creation. Test: add lshuffle in the Tcl utility functions set. Test: fix lshuffle by providing the "K" combinator. Test: XDEL fuzz testing. Remove and check stage. Accept write commands if persisting is disabled, event if we do have problems persisting on disk previously. Streams: using streamCompareID() instead of direct compare. Streams: add streamCompareID() declaration in stream.h. Streams: using streamCompareID() instead of direct compare in block.c. bugfix in sdsReqType creating 64bit sds headers on 32bit systems remove one ineffective loop in dictGetSomeKeys. Streams: show last id for streams and groups Update dict.c change coding style. Streams: free lp if all elements are deleted Modify XINFO field from last-id to last-generated-id. Related to #5129. Add a few comments to streamIteratorRemoveEntry(). Streams: correctly propagate xdel if needed slave buffers were wasteful and incorrectly counted causing eviction A) slave buffers didn't count internal fragmentation and sds unused space, this caused them to induce eviction although we didn't mean for it. B) slave buffers were consuming about twice the memory of what they actually needed. - this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time but networking.c not storing more than 16k (partially fixed recently in 237a387). - besides it wasn't able to store half of the new string into one buffer and the other half into the next (so the above mentioned fix helped mainly for small items). - lastly, the sds buffers had up to 30% internal fragmentation that was wasted, consumed but not used. C) inefficient performance due to starting from a small string and reallocing many times. what i changed: - creating dedicated buffers for reply list, counting their size with zmalloc_size - when creating a new reply node from, preallocate it to at least 16k. - when appending a new reply to the buffer, first fill all the unused space of the previous node before starting a new one. other changes: - expose mem_not_counted_for_evict info field for the benefit of the test suite - add a test to make sure slave buffers are counted correctly and that they don't cause eviction Hopefully improve commenting of #5126. Reading the PR gave me the opportunity to better specify what the code was doing in places where I was not immediately sure about what was going on. Moreover I documented the structure in server.h so that people reading the header file will immediately understand what the structure is useful for. dict.c: remove a few trailing spaces. fix rare replication stream corruption with disk-based replication The slave sends \n keepalive messages to the master while parsing the rdb, and later sends REPLCONF ACK once a second. rarely, the master recives both a linefeed char and a REPLCONF in the same read, \n*3\r\n$8\r\nREPLCONF\r\n... and it tries to trim two chars (\r\n) from the query buffer, trimming the '*' from *3\r\n$8\r\nREPLCONF\r\n... then the master tries to process a command starting with '3' and replies to the slave a bunch of -ERR and one +OK. although the slave silently ignores these (prints a log message), this corrupts the replication offset at the slave since the slave increases the replication offset, and the master did not. other than the fix in processInlineBuffer, i did several other improvments while hunting this very rare bug. - when redis replies with "unknown command" it includes a portion of the arguments, not just the command name. so it would be easier to understand what was recived, in my case, on the slave side, it was -ERR, but the "arguments" were the interesting part (containing info on the error). - about a year ago i added code in addReplyErrorLength to print the error to the log in case of a reply to master (since this string isn't actually trasmitted to the master), now changed that block to print a similar log message to indicate an error being sent from the master to the slave. note that the slave is marked as CLIENT_SLAVE only after PSYNC was received, so this will not cause any harm for REPLCONF, and will only indicate problems that are gonna corrupt the replication stream anyway. - two places were c->reply was emptied, and i wanted to reset sentlen this is a precaution (i did not actually see such a problem), since a non-zero sentlen will cause corruption to be transmitted on the socket. Streams: return an error message if using xreadgroup with '$' ID. Redis will always return an empty result when '$' ID is specified with xreadgroup command, it's meaningless. Streams: remove meaningless if condition. It's already checked if xreadgroup is set and groupname is NULL. Panic when we are sending an error to our master/slave. Related to #5135, see discussion there. Streams: better error when $ is given with XREADGROUP. make active defrag test more stable on slower machines, the active defrag test tended to fail. although the fragmentation ratio was below the treshold, the defragger was still in the middle of a scan cycle. this commit changes: - the defragger uses the current fragmentation state, rather than the cache one that is updated by server cron every 100ms. this actually fixes a bug of starting one excess scan cycle - the test lets the defragger use more CPU cycles, in hope that the defrag will be faster, but also give it more time before we give up. Refine comment in addReplyErrorLength() about replying to masters/slaves. See #5135 for some context. In addReplyErrorLength() only panic when replying to slave. See #5135 for more context. Make vars used only by INFO CLIENTS local to the block. Related to #4727. clientsCronTrackExpansiveClients() skeleton and ideas. clientsCronTrackExpansiveClients() actual implementation. Implement a function to retrieve the expansive clients mem usage. Rename var in clientsCronTrackExpansiveClients() for clarity. Change INFO CLIENTS sections to report pre-computed max/min client buffers. Fix wrong array index variable in getExpansiveClientsInfo(). Rename INFO CLIENT max buffers field names for correctness. They are actually delayed a few seconds, so let's call them "recent". Clarify that clientsCronTrackExpansiveClients() indexes may jump ahead. Top comment clientsCron(). Fix typo fix typo Update redis-trib.rb placeholder from unstable. Merge stuff from unstable.
- Loading branch information