-
Notifications
You must be signed in to change notification settings - Fork 36.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
http: avoid fd exhaustion #12274
http: avoid fd exhaustion #12274
Conversation
Responses are usually added to the response queue by worker threads. As an optimization, and in order to close connections as quickly as possible, allow for immediate replies (bypassing the queue) when there is no need for a worker to service the request.
No functional change, just preparation for the next commits.
Rather than potentially accepting an unbounded number of connections, give them a ceiling of twice the size of the processing queue. Once the ceiling is reached, new connections will not be accepted until the count has fallen back under the limit. Note that file descriptors are removed from the limiter in the request's close callback.
If we're unable to service a request, close the connection rather than potentially keeping it alive longer. This helps to be slightly more fair to new connections.
This avoids assigning file descriptors faster than we can close them. It is the same as using an evconnlistener with the LEV_OPT_DEFERRED_ACCEPT flag, which is only available with libevent 2.1+.
Libevent 2.2 adds a callback allowing us to drop connections as they come in. Use it when possible.
@@ -288,14 +376,19 @@ static void http_request_cb(struct evhttp_request* req, void* arg) | |||
if (i != iend) { | |||
std::unique_ptr<HTTPWorkItem> item(new HTTPWorkItem(std::move(hreq), path, i->handler)); | |||
assert(workQueue); | |||
if (workQueue->Enqueue(item.get())) | |||
if (workQueue->Enqueue(item.get())) { | |||
// Disable reading to work around a libevent bug, fixed in 2.2.0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this block be wrapped in a LIBEVENT_VERSION_NUMBER macro guard so it only occurs in the affected version range?
Seems like it would be easier for future cleanups to see where the version macro is checked, say in the instance that a minimum version of libevent is defined and code like this could be safely removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just move-only from above. There's no need to guard it with an ifdef because there's no api difference, it's purely a runtime check.
|
||
const unsigned int m_limit; | ||
std::vector<evconnlistener*> m_listeners; | ||
std::set<evutil_socket_t> m_sockets; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that it probably matters in most configurations where the fd limit is ~1K, but making this an unordered_set would be slightly beneficial in the high fd limit cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
This needs to be tested by those experiencing the issue: @KanoczTomas @vii |
src/httpserver.cpp
Outdated
@@ -300,6 +300,22 @@ static void connection_close_cb(evhttp_connection* conn, void *arg) | |||
} | |||
} | |||
|
|||
#if LIBEVENT_VERSION_NUMBER >= 0x02020001 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome that you're using the callback - one way to unambiguously detect the presence of the new feature would be to add a new autoconf test maybe using AC_LINK_IFELSE for evhttp_set_newreqcb
Awesome! The defer accept tweak will defend against typical inadvertent denial of services Note that the original issue can still be activated depending on the file descriptor limit that is set, as the budgeting in the daemon does not account for this usage or the usage in the db code - see To test it, something like this slowhttptest program
or another tool which can simulate a slowloris (HTTP connection header trickling) attack |
Sorry, I'm removing the 0.16.0 milestone here. As it introduces many new concepts and work-arounds it is hard to review last-minute, the problem is hard to trigger in the first place, and it is not a regression since 0.15. Also according to @vii it doesn't solve the issue completely. I think getting the fd issue fixed properly should be a focus for 0.16.1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was just a superficial check, I can do another pass later in more detail. I'm curious if you can reproduce the fd exhaustion issue.
g_limiter = MakeUnique<ConnectionLimiter>(std::move(listeners), workQueueDepth * 2); | ||
evhttp_set_gencb(http, http_request_cb, g_limiter.get()); | ||
|
||
#if LIBEVENT_VERSION_NUMBER >= 0x02020001 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like to see this defined separately as:
#if LIBEVENT_VERSION_NUMBER >= 0x02020001
#define HAVE_EVHTTP_SET_NEWREQCB 1
#endif
That way the check here is more obvious, particularly since you do the check elsewhere. Of course, even better than this would be to use AC_CHECK_FUNC()
...
{ | ||
bool ret = false; | ||
#ifdef TCP_DEFER_ACCEPT | ||
static constexpr int set = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the first time I've seen static constexpr
used correctly in the Bitcoin code, congratulations.
@eklitzke I can still produce a crash with these changes on top of 2405ce1:
When last discussed with @theuni he was going to revamp these changes. |
@fanquake Interesting -- do you have a script to stress test it to induce that condition? Or does that happen during normal operation? |
The last travis run for this pull request was 175 days ago and is thus outdated. To trigger a fresh travis build, this pull request should be closed and re-opened. |
{ | ||
// Disable socket accepting if adding this connection puts us equal to the limit | ||
if (!Interrupted() && m_sockets.insert(fd).second && m_sockets.size() == m_limit) { | ||
LogPrint(BCLog::HTTP, "Suspending new connections"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All calls to LogPrintf() and LogPrint() should be terminated with \n
Needs rebase |
There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened. |
So is this the recommended fix for #11368 ? I just bumped into it and need to resolve it |
Addresses #11368. This adds a file descriptor limiter for the http server. It's possible for the open fd count to potentially exhaust the user's defined limit if not checked.
I'm having a hard time reproducing the issue now, I'm not sure what's changed in my environment. I was finally able to hit it using some pretty nasty tricks, but I'm less worried about this now for 0.16.
I do think it's worth considering taking de6cfce, 5f5099a, and (part of) 3b0ebb3 for 0.16, though, as they help significantly and are easy to review. I'm happy to break that out into a new PR if necessary.