mux_h1 segfault #1943
Thanks, I'm on it. I'm testing a fix.
That's fast 😄 Well, if I get a trace before your patch I'll share it either way in case it's helpful, otherwise I'll test the fix :-)
Clearly not fast enough :) Well, the fix works, but I must be sure it is the right way to fix the bug. BTW, here is the patch:

```diff
diff --git a/src/mux_h1.c b/src/mux_h1.c
index 9eca08c337..a0d4db6dda 100644
--- a/src/mux_h1.c
+++ b/src/mux_h1.c
@@ -2890,6 +2890,8 @@ static int h1_send(struct h1c *h1c)
 		h1c->flags |= H1C_F_ERR_PENDING;
 		if (h1c->flags & H1C_F_EOS)
 			h1c->flags |= H1C_F_ERROR;
+		else if (!(h1c->wait_event.events & SUB_RETRY_RECV))
+			h1c->conn->xprt->subscribe(h1c->conn, h1c->conn->xprt_ctx, SUB_RETRY_RECV, &h1c->wait_event);
 		b_reset(&h1c->obuf);
 	}
@@ -3262,7 +3264,7 @@ struct task *h1_timeout_task(struct task *t, void *context, unsigned int state)
 		}
 	}
 
-	if (h1c->state == H1_CS_UPGRADING) {
+	if (!se_fl_test(h1c->h1s->sd, SE_FL_ORPHAN)) {
 		/* Don't release the H1 connection right now, we must destroy the
 		 * attached SC first. */
 		se_fl_set(h1c->h1s->sd, SE_FL_EOS | SE_FL_ERROR);
@@ -3846,6 +3848,8 @@ static int h1_snd_pipe(struct stconn *sc, struct pipe *pipe)
 		h1c->flags = (h1c->flags & ~H1C_F_WANT_SPLICE) | H1C_F_ERR_PENDING;
 		if (h1c->flags & H1C_F_EOS)
 			h1c->flags |= H1C_F_ERROR;
+		else if (!(h1c->wait_event.events & SUB_RETRY_RECV))
+			h1c->conn->xprt->subscribe(h1c->conn, h1c->conn->xprt_ctx, SUB_RETRY_RECV, &h1c->wait_event);
 		se_fl_set_error(h1s->sd);
 		TRACE_DEVEL("connection error", H1_EV_STRM_ERR|H1_EV_H1C_ERR|H1_EV_H1S_ERR, h1c->conn, h1s);
 	}
```
At least it certainly seems to match some of the last outputs of the earlier crashes (which I always forget to check):
Can totally understand that; it's not a big problem for me right now anyway so no worries 👍
copy that!
alright, if it helps you decide whether the fix is correct or not, I have a backtrace+trace combo of it:
backtrace
…ached When the H1 task times out, we must be careful not to release the H1 connection if there is still an H1 stream with a stream-connector attached. In this case, we must wait. There are some tests to prevent this from happening, but the last one only checks the UPGRADING state, while a CLOSING connection may also still have an SC attached. In the end, it is safer to test whether there is an H1 stream with an SC attached. This patch should partially fix issue #1943. However, it only prevents the segfault; there is another bug under the hood. BTW, this one is 2.7-specific. No backport is needed.
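To make the state-check change concrete, here is a minimal, self-contained C sketch contrasting the old test with the new one. The enum, flag value, and struct below are simplified stand-ins, not HAProxy's real definitions; only the shape of the check mirrors the patch to `h1_timeout_task()`:

```c
#include <stdbool.h>

/* Simplified stand-ins for HAProxy's types. */
enum h1_cs { H1_CS_IDLE, H1_CS_RUNNING, H1_CS_UPGRADING, H1_CS_CLOSING };

#define SE_FL_ORPHAN 0x0001  /* stream endpoint has no stream-connector */

struct sedesc { unsigned int flags; };

/* Old test: only defers the release in the UPGRADING state, so a
 * CLOSING connection with an SC still attached was released early. */
static bool must_defer_release_old(enum h1_cs state)
{
	return state == H1_CS_UPGRADING;
}

/* New test: defer the release whenever an SC is still attached,
 * whatever the connection state (UPGRADING, CLOSING, ...). */
static bool must_defer_release_new(const struct sedesc *sd)
{
	return !(sd->flags & SE_FL_ORPHAN);
}
```

Testing the endpoint's orphan flag instead of one specific connection state is what lets the fix cover both the UPGRADING and CLOSING cases at once.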
The recent refactoring of error handling in the H1 multiplexer introduced a bug on abort while the client waits for the server response. The bug only exists if the abortonclose option is not enabled. Indeed, in this case, when the end of the message is reached, the mux stops receiving data because these data are part of the next request. However, errors on the sending path are no longer fatal by themselves; an error must also be caught on the reading path to close the connection. So, in case of a client abort, the error is not reported to the upper layer and the H1 connection cannot be closed if some pending data are blocked (remember, an error was detected on the sending path, blocking outgoing data). To be sure to have a chance to detect the abort in this case, when an error is detected on the sending path, we force a subscription for reads. This patch, with the previous one, should fix issue #1943. It is 2.7-specific; no backport is needed.
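As a rough, compilable sketch of that "subscribe for reads on send error" control flow: the struct, flag values, and subscribe stub below are invented stand-ins (the real code goes through `h1c->wait_event` and `conn->xprt->subscribe()`); only the branch structure mirrors the patch:

```c
#include <stdio.h>

/* Invented stand-in flags and struct, not HAProxy's definitions. */
#define SUB_RETRY_RECV 0x01
#define F_ERR_PENDING  0x01
#define F_EOS          0x02
#define F_ERROR        0x04

struct fake_h1c {
	unsigned int flags;
	unsigned int subscribed;  /* events we are subscribed to */
};

/* Stand-in for conn->xprt->subscribe(conn, ctx, SUB_RETRY_RECV, ...). */
static void subscribe_recv(struct fake_h1c *c)
{
	c->subscribed |= SUB_RETRY_RECV;
}

static void on_send_error(struct fake_h1c *c)
{
	c->flags |= F_ERR_PENDING;
	if (c->flags & F_EOS)
		c->flags |= F_ERROR;   /* peer already closed: hard error now */
	else if (!(c->subscribed & SUB_RETRY_RECV))
		subscribe_recv(c);     /* keep reading so a client abort is seen */
}

int main(void)
{
	struct fake_h1c c = { 0, 0 };
	on_send_error(&c);             /* peer still there: we re-subscribe */
	printf("subscribed for reads: %s\n",
	       (c.subscribed & SUB_RETRY_RECV) ? "yes" : "no");
	return 0;
}
```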
I pushed 2 patches that should fix this issue.
I'll build and deploy that 👍
Related to issue haproxy/haproxy#1943
Fwiw no crashes since 👍
Thanks!
Detailed Description of the Problem
Discussed in #1903; @a-denoyelle said the issue had been seen before, so I'm filing this separately to avoid cluttering that other issue with this unrelated one.
Expected Behavior
n/a
Steps to Reproduce the Behavior
n/a
Do you have any idea what may have caused this?
No response
Do you have an idea how to solve the issue?
No response
What is your configuration?
Output of haproxy -vv
Last Outputs and Backtraces
From #1903 (comment)
host 1
host 2
Additional Information
Already working to get H1 traces after Amaury asked for these.