[core] release connections in CLOSE_WAIT & CON_STATE_READ_POST state #115
Please describe your issue with (much) more detail. Your proposed "solution" is not valid. Both proposed changes are wrong.
Hi gstrauss, this is another issue found on CentOS-8 with lighttpd-1.4.66.
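static void connection_check_timeout (connection * const con, const unix_time64_t cur_ts) {
    request_st * const r = &con->request; /* r is needed by the log call below */
    static unsigned int xxx = 0; /* caps the number of debug messages */
    xxx++;
    /* debug: report connections whose read side has been idle for over an hour */
    if (xxx < 10 && cur_ts - con->read_idle_ts > 3600) {
        log_error(r->conf.errh, __FILE__, __LINE__,
          "connection not closed : fd %d, state %d",
          con->fd, (int)r->state);
    }
}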
@@ -1071,7 +1071,7 @@ connection_revents_err (request_st * const r, connection * const con)
~(FDEVENT_STREAM_REQUEST_BUFMIN|FDEVENT_STREAM_REQUEST_POLLIN);
r->conf.stream_request_body |= FDEVENT_STREAM_REQUEST_POLLRDHUP;
con->is_readable = 1; /*(can read 0 for end-of-stream)*/
if (chunkqueue_is_empty(con->read_queue)) r->keep_alive = 0;
This is wrong. If there is data that has been read from the client socket, but not yet processed, lighttpd attempts to process that data before considering the read side of the connection to have been shutdown (SHUT_RD)
> This is wrong. If there is data that has been read from the client socket, but not yet processed, lighttpd attempts to process that data before considering the read side of the connection to have been shutdown (SHUT_RD)
Is r->keep_alive for "HTTP/1.1 keepalive" only? No other function checks this variable except connection_handle_response_end_state.
No. Wrong again. git log -G 'r->keep_alive'
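The first review comment in this thread can be illustrated with a small model. This is only a sketch under stated assumptions: `struct toy_conn` and `toy_on_rdhup` are hypothetical names, not lighttpd code, and the model keeps just the one decision under discussion, namely that keep-alive is disabled on RDHUP only when no already-read request data is still waiting to be processed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the connection/request state; only the
 * fields needed for this illustration are modeled. */
struct toy_conn {
    size_t read_queue_len; /* bytes read from the socket but not yet processed */
    bool is_readable;      /* a later read() may observe end-of-stream */
    bool keep_alive;       /* whether the connection may be reused */
};

/* Model of the RDHUP branch: mark the socket readable so that a later
 * read() can return 0 for end-of-stream, and disable keep-alive only
 * when no buffered request data remains to be processed. */
static void toy_on_rdhup(struct toy_conn *c) {
    c->is_readable = true;
    if (c->read_queue_len == 0)
        c->keep_alive = false;
}
```

With an empty queue the model drops keep-alive immediately; with buffered data it leaves keep-alive untouched so that the pending data can be handled first.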
@@ -1438,7 +1438,7 @@ static void connection_check_timeout (connection * const con, const unix_time64_
 if (changed)
 con->is_readable = 0;
 }
-else if (waitevents & FDEVENT_IN) {
+else if ((waitevents & FDEVENT_IN) || r->state == CON_STATE_READ_POST || r->state == CON_STATE_READ) {
This is wrong. If lighttpd is not waiting to read from the socket, there should not be a read timeout. If lighttpd is streaming the request body to the backend, then lighttpd might have connection in state CON_STATE_READ_POST.
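The rule stated in this comment can be sketched as a standalone predicate. The names `TOY_EVENT_IN` and `toy_read_timed_out` are hypothetical, and the flag value is an assumption chosen for the example; this is not the lighttpd implementation, only a model of "a read timeout applies only while the server is waiting to read".

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_EVENT_IN 0x01 /* hypothetical flag value standing in for FDEVENT_IN */

/* Returns true when the connection should be treated as read-timed-out.
 * If the server is not currently waiting for readable events, no read
 * timeout applies, regardless of how long the read side has been idle. */
static bool toy_read_timed_out(int waitevents, int64_t cur_ts,
                               int64_t read_idle_ts, int64_t max_read_idle) {
    if (!(waitevents & TOY_EVENT_IN))
        return false; /* not waiting to read: no read timeout */
    return cur_ts - read_idle_ts > max_read_idle;
}
```

Under this model, a connection that is streaming a request body to a backend and has read interest switched off is never expired by the read timer, which is why widening the condition to cover whole connection states changes behavior for connections that are legitimately not reading.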
CentOS 8 was end-of-lifed 2021-12-31. Is there a (good) reason why you are wasting so much of your time on old, dead, and buried OS distributions?
Consider enabling lighttpd.conf debugging to capture the requests that later end up in this state. Consider
If a connection is in CON_STATE_READ_POST, but lighttpd is not waiting to read on the socket, then something else has hung. You have not shared your lighttpd.conf, so I do not know if you are using Here's a strong hint: your problems are likely with your ancient software or poorly written backends, rather than issues in lighttpd.
Sorry for the misinformation; the OS is actually Rocky Linux 8.
It is an important service, and I cannot interrupt it.
mod_fastcgi and php-7.4 are used. At the time lighttpd had hundreds of CLOSE_WAIT sockets, no live Unix socket connection to "/tmp/php-fastcgi.sock" was established,
server.stream-request-body is not set.
That is gross sloppiness and wastes my time. Your suggested patches are wrong, so I am closing this PR. Please see the lighttpd forum post "How to get support". You seem to have missed direct hints such as "You have not shared your lighttpd.conf", and other config settings that I have posted above. Maybe look into using gdb to get a core dump of your running process so that you can further analyze the (connection *) of the stuck connections.
This might help your aborted uploads.
handle RDHUP as soon as RDHUP detected when collecting HTTP/1.1 chunked request body (and when not streaming request body to backend) x-ref: #115
Many CLOSE_WAIT connections are seen on a live system, and these connections are never closed.
Extra logging in the connection_check_timeout function shows that these connections are in the CON_STATE_READ_POST state.
It is likely that FDEVENT_IN is unset after the FDEVENT_RDHUP event is received, and no other code checks these connections afterwards.
This change treats this case as "connection read-timeout".
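As background for the description above: a TCP socket stays in CLOSE_WAIT from the moment the peer's FIN arrives until the local application calls close(). The following sketch shows the application-side signal, a read() that returns 0. It is an illustration under assumptions, not lighttpd code: it uses an AF_UNIX socketpair with shutdown(SHUT_WR) as a stand-in for a TCP peer sending FIN, and `toy_read_after_peer_shutdown` is a hypothetical helper name.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a connected socket pair, shuts down the write side of one end,
 * and returns what read() reports on the other end. A return value of 0
 * means end-of-stream; -2 signals a setup failure in this helper. */
static long toy_read_after_peer_shutdown(void) {
    int sv[2];
    char buf[16];
    long n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -2;
    if (shutdown(sv[1], SHUT_WR) != 0) { /* peer signals "no more data" */
        close(sv[0]);
        close(sv[1]);
        return -2;
    }
    n = (long)read(sv[0], buf, sizeof(buf)); /* observes end-of-stream */
    /* On TCP, the local socket would remain in CLOSE_WAIT until the
     * close() below; a server that never reaches it leaks the socket. */
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

The point of the sketch is that the kernel cannot leave CLOSE_WAIT on its own; some code path in the application has to notice end-of-stream (or a timeout) and close the descriptor.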