-
Notifications
You must be signed in to change notification settings - Fork 23.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
redis bug report about redis crash (BGREWRITEAOF) #2897
Comments
Hello, it looks like wat3() returned with an error, causing the first conditional in
Since there is no rdb_child_pid active so it's set to -1. However with I wonder if @oranagra have any additional idea about this... |
My guess was that wait3() with WNOHANG could never return -1 and an error. However issue #2897 may possibly indicate that this could happen under non clear conditions. While we try to understand this better, better to handle a return value of -1 explicitly, otherwise in the case a BGREWRITE is in progress but wait3() returns -1, the effect is to match the first branch of the if/else block since server.rdb_child_pid is -1, and call backgroundSaveDoneHandler() without a good reason, that will, in turn, crash the Redis server with an assertion.
My guess was that wait3() with WNOHANG could never return -1 and an error. However issue #2897 may possibly indicate that this could happen under non clear conditions. While we try to understand this better, better to handle a return value of -1 explicitly, otherwise in the case a BGREWRITE is in progress but wait3() returns -1, the effect is to match the first branch of the if/else block since server.rdb_child_pid is -1, and call backgroundSaveDoneHandler() without a good reason, that will, in turn, crash the Redis server with an assertion.
My guess was that wait3() with WNOHANG could never return -1 and an error. However issue #2897 may possibly indicate that this could happen under non clear conditions. While we try to understand this better, better to handle a return value of -1 explicitly, otherwise in the case a BGREWRITE is in progress but wait3() returns -1, the effect is to match the first branch of the if/else block since server.rdb_child_pid is -1, and call backgroundSaveDoneHandler() without a good reason, that will, in turn, crash the Redis server with an assertion.
wow, this is a hard one to follow. I think it is reasonable to believe that wait3 can return an error even if it is being told not to wait. in case this is a temporary error, your fix will work, but in case this is a permanent failure, panic maybe be better. |
It happens periodicaly. Example success:
Example error:
do not stop Sorry for my English. |
Thanks, looks like @oranagra and I were right, wait3() returns an error and specifically ECHILD. From the Linux documentation:
But wait3() should alias to |
After further investigation, what I think is that in the user system, for some strange reason, SIGCHLD is ignored instead of not having an handler as it is by default. So a much simpler fix would be just to make sure SIGCHLD handler is set to SIG_DFL. However this happens only starting from the second time a child exit AFAIK, since the first AOF rewrite works as expected. Note that another option would be wait3() returning -1 because there are no longer children, however as you can see from the error logs, there is an AOF child active since |
@betonimig please could you try to edit the sources of Redis 3.0.5 (the one you are using) in order to make this change and check if it fixes the issue? Now if you edit the file
You should add a single line so that the final code looks like this:
Thanks. Also would be great to have a few details about the Linux system you are using. |
The following program shows the behavior: #include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <signal.h>
int main(int argc, char **argv) {
int ign;
if (argc == 1) {
printf("usage: %s [ign|dfl]\n",argv[0]);
exit(1);
} else if (!strncmp(argv[1],"ign")) {
ign = 1;
} else if (!strncmp(argv[1],"dfl")) {
ign = 0;
} else {
printf("Argument must be 'ign' or 'dfl'\n");
exit(1);
}
signal(SIGCHLD, ign ? SIG_IGN: SIG_DFL);
pid_t childpid = fork();
if (childpid == -1) {
perror("fork");
exit(1);
}
if (childpid > 0) {
int statloc;
while(1) {
pid_t res = wait3(&statloc,WNOHANG,NULL);
printf("wait3 returned %d\n", (int)res);
if (res == -1) {
perror("wait3");
}
if (res != 0) break;
sleep(1);
}
} else {
sleep(2);
}
return 0;
} Output:
|
A few more questions for @betonimig (thanks for helping). Please could you provide a Also I wonder if you had this problem in the past or is happening after an upgrade. The more context you could provide the better for us, it's a really strange problem that should not happen at all and we never saw this in the past. Thanks. |
Asking for help to Stack Overflow in the meantime: http://stackoverflow.com/questions/33994543/wait3-waitpid-alias-returns-1-with-errno-set-to-echild-when-it-should-not |
Linux 3.2.0-90-generic #128-Ubuntu SMP Fri Aug 14 21:43:58 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
27849 and 27850 are threads. I found no zombies in process list. I will try your code with setting default handler on SIGCHLD and will report result later. |
Thanks. Please could you send the full log of the Redis instance to me via
|
@betonimig before you retry, I'm creating a branch with a different change that could help us. Posting it ASAP. |
Thanks a lot for sending the Redis log @betonimig. I just published here on Github a new branch called |
Thanks a lot you for your work and help! |
Also by modifying the above PoC to dump sigaction status you will note some interesting facts: #include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <signal.h>
void getchldaction(const char *desc) {
struct sigaction sa;
sigaction(SIGCHLD, NULL, &sa);
printf("%s HANDLER:\n", desc);
printf(" handler = 0x%lx\n", sa.sa_handler);
printf(" action = 0x%lx\n", sa.sa_sigaction);
printf(" mask = 0x%lx\n", sa.sa_mask);
printf(" flags = 0x%lx\n", sa.sa_flags);
printf(" restorer= 0x%lx\n", sa.sa_restorer);
}
int main(int argc, char **argv) {
struct sigaction sa;
int hdl = 0;
if (argc == 1) {
printf("usage: %s [ign|dfl|cus]\n",argv[0]);
exit(1);
} else if (!strncmp(argv[1],"ign")) {
hdl = 1;
} else if (!strncmp(argv[1],"dfl")) {
hdl = 0;
} else if (!strncmp(argv[1],"cus")) {
hdl = 2;
} else {
printf("Argument must be 'ign' or 'dfl'\n");
exit(1);
}
getchldaction("OLD");
if (hdl == 2) {
sa.sa_handler = SIG_IGN;
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
if (sigaction(SIGCHLD, &sa, 0) == -1) {
perror(0);
exit(1);
}
}
else {
signal(SIGCHLD, hdl ? SIG_IGN: SIG_DFL);
}
getchldaction("NEW");
pid_t childpid = fork();
if (childpid == -1) {
perror("fork");
exit(1);
}
if (childpid > 0) {
int statloc;
while(1) {
pid_t res = wait3(&statloc,WNOHANG,NULL);
printf("wait3 returned %d\n", (int)res);
if (res == -1) {
perror("wait3");
}
if (res != 0) break;
sleep(1);
}
} else {
sleep(2);
}
return 0;
}
|
Thanks @giusc, also worth to note that not calling |
The problem persists:
|
@betonimig How do you start Redis? Is it via systemd/upstart/runit or something similar? Maybe the parent process (the one that starts redis) sets a signal mask that is inherited by redis (and that signal mask ignores SIGCHLD)? |
Another idea for debugging - log the parent PID of the AOF-rewrite child process just before it exits; if the parent PID is for some reason not redis' PID then the SIGCHLD goes elsewhere. Also the reason I asked about upstart is that if you use the |
I start redis via upstart.
I'll try to run directly without upstart |
Thanks @dubek and @betonimig, probably is upstart indeed. It does |
If you want to use upstart, I suggest to configure Redis to not daemonize (use |
@dubek good advices. But what about also setting up signal handling in Redis so that it overrides the change performed by the parent? Still upstart will expect fork and maybe will misbehave but should avoid this exact problem. Just waiting acknowledge from @betonimig that not using upstart fixes the problem in order to work to this fix. |
@antirez I think what happens here is that (Note: I haven't tested this theory. I'm also don't know whether @betonimig is running with Then again, there might be other things that you can do in Redis which are OK (maybe for other situations), like the |
@dubek interesting, so maybe upstart uses |
without upstart:
config:
|
Hmm, there goes my upstart theory... Another shot in the dark: @betonimig do you happen to run with SELinux enabled? If I understand correctly, SELinux can enforce very granular permissions, including whether a process is allowed to send/receive specific types of signals. If so, can you check in SELinux's |
Thanks @betonimig, I was ready to bet it was due to upstart! So are you now starting Redis just with a normal init file? The configuration is fine AFAIK btw, it's a more system-wide problem for sure, not just config. |
p.s. I'll update the special branch in order to print more debugging information and to set the signal handlers in a more precise way. |
@betonimig another thing, are you able to re-trigger this issue easily by issuing, manually, multiple times, BGREWRITEAOF? This way you could test if it happens again in a trivial way just starting a different Redis instance in the same machine, without affecting your operations at all, and calling two times BGREWRITEAOF spaced by a few seconds. |
Now I start with:
If I run |
@betonimig is there anything weird in |
I logged parent process PID. Two backups was successful and the third one failed. Parent PID is correct in all cases. See logs. SELinux is not enabled. Nothing weird in syslog. I'm going to reinstall system probably it is the shortes way to solve the problem.
|
One hypothesis that fits the symptoms is that another thread successfully calls wait3() in a race with serverCron(). Could sentinelCollectTerminatedScripts() or stopAppendOnly() (or a redundant serverCron()) be to blame here? Sorry, I haven't looked closely enough to see if those routines are called from within the server process. Another hypothesis, more desperate, is that some sort of external code we can't see — a lua EVAL or LD_PRELOAD'd library — either sets the server's SIGCHLD disposition to SIG_IGN or calls waitpid(-1, ...). (You could easily imagine an EVAL routine calling some popen() or system() equivalent which naively called wait().) It'd be easy to detect the former, but hard to catch the latter. If this is a continuing problem for @betonimig , I'd suggest a branch change that:
|
Interesting suggestion @pilcrow . I looked at the Lua code and it calls libc's So it needs to be an external Lua library (implemented natively in C) loaded into Redis to actually execute |
@dubek , sure, or the Lua script EVALuated is a complicated beast that uses lua-subprocess (which IIUC naively calls |
@pilcrow the threads only do basic system calls, they never call the code paths that the main thread is calling, so it's very unlikely it's a race due to different threads AFAIK. About the other hypothesis:
Since we never saw this in the past, with so many Redis instances configured with AOF, I believe it must be some specific thing with this instance, or a bug that materializes only with some very specific set of configurations, kernel, libc, combo. |
My guess was that wait3() with WNOHANG could never return -1 and an error. However issue redis#2897 may possibly indicate that this could happen under non clear conditions. While we try to understand this better, better to handle a return value of -1 explicitly, otherwise in the case a BGREWRITE is in progress but wait3() returns -1, the effect is to match the first branch of the if/else block since server.rdb_child_pid is -1, and call backgroundSaveDoneHandler() without a good reason, that will, in turn, crash the Redis server with an assertion.
The text was updated successfully, but these errors were encountered: