-
Notifications
You must be signed in to change notification settings - Fork 36.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
init: Fixes for file descriptor accounting #16003
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
4b2a7d6
to
1a297d2
Compare
amount of descriptors requested [init] initialize more file descriptors based on number of bind interfaces, MAX_OUTBOUND_CONNECTIONS, and CConnman::Options:nMaxFeeler [init] add verbose output for FD limits, cleanup lines. [init] cleanup LogPrintF to print FDs (oops), use better variable semantics, include MAX_ADDNODE_CONNECTIONS in nUserMinConnections [init] addnode connections don't count towards maxconnections [init] correct soft limit for MAX_OUTBOUND_CONNECTIONS [init] connections are softcapped at 0 [init] fix up descriptor commits, remove unused variables, add verbosity [init] fix up strings slightly [init] small refactor for better verbosity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reference IRC discussion, IIUC: http://www.erisian.com.au/bitcoin-core-dev/log-2019-05-10.html#l-127
Tests that show the expected behavior and verify the issue/regression would be good here if possible.
Two requests to simplify review:
|
@Empact Thanks. I thought it would be crazy to open two PRs, because bug 1 fixes something for sure, and bug 2 isn't really reproducible. Any bad behavior from bug 2 could be confused for bug 1, and it would be hard to tell which one it is. Eg, things could go south if not enough FDs were allocated (maybe allocated is the wrong word) or not enough were asserted. And since file descriptor problems via threads are hard to reproduce, I thought it made sense to mash it all together. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking for more suggestions here. I'm not quite sure how to logically split this into multiple PRs, so I'll leave that for now.
This commit certainly isn't perfect, but is essentially an open improvement to enforce/allocate a slightly more accurate number of file descriptors. If the user has an absurdly low system setting, we lower the number of maximum inbound connections until it's no longer possible, instead of possibly passing with an unsafe amount.
One note: I've included mmaped things as well, because they require a temporary handle to obtain the mapping. Not sure if this was overkill but it seemed like a good idea.
I also have a feeling that the number of bind arguments isn't always equal to the number of interfaces bitcoind will listen on, so that's probably a bad assumption.
Any additions or ideas on how to fix this up is appreciated. Thanks :)
return InitError(strprintf(_("Not enough file descriptors available. %d available, %d required."), nFD, nFDMin)); | ||
|
||
// Calculate new -maxconnection count. Note: std::min<int>(...) to work around FreeBSD compilation issue described in #2695 | ||
nMaxConnections = std::min<int>(nMaxConnections, nFD - nFDMin); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we need <int>
here? Both arguments are int
now. But I didn't want to cover up this old fix.
init: nMaxConnections and (nFD-nFDMin) cannot be negative, don't check against 0 init: try to fix Travis note about deducing types init: try to fix travis again "deduced conflicting types" init: try a better way through the Travis problem init: revert old init warning for drahtbot net: name nit "NUM"->"MAX"_FEELER_CONNCTIONS init: name nit "NUM"->"MAX"_FEELER_CONNCTIONS init: fixup typo
Maybe we can just implement |
I figured I should supply some pseudocode to show what happens differently in the code and how it's related to both Bug 1 and 2. There's a bit of redundant code, and lots of constants from other places, so I tried to reduce the amount of references needed to actually understand what's going on. Suppose you are running Mac OS X and listening on the network interface. The ulimit is 256. Suppose you also have 1 bind interface Current code
Usable FDs: This PR
Usable FDs: |
Needs rebase |
Hi @tryphe - Wondering if you are going to continue with this PR or whether it should be closed. It looks like this has needed a rebase for some time. I'd also suggest you squash your commits in accordance with the CONTRIBUTING guidelines and add tests as suggested above to verify the issue. |
Will close this for now. However I think the changes are worth following up on, so will open a new good first issue. Will also extract one change that can just be merged now. |
e3047ed test: use p2p constants in denial of service tests (fanquake) 25d8264 p2p: add MAX_FEELER_CONNECTIONS constant (tryphe) Pull request description: Extracted from #16003. ACKs for top commit: naumenkogs: utACK e3047ed Tree-SHA512: 14fc15292be4db2e825a0331dd189a48713464f622a91c589122c1a7135bcfd37a61e64af1e76d32880ded09c24efd54d3c823467d6c35367a380e0be33bd35f
e3047ed test: use p2p constants in denial of service tests (fanquake) 25d8264 p2p: add MAX_FEELER_CONNECTIONS constant (tryphe) Pull request description: Extracted from bitcoin#16003. ACKs for top commit: naumenkogs: utACK e3047ed Tree-SHA512: 14fc15292be4db2e825a0331dd189a48713464f622a91c589122c1a7135bcfd37a61e64af1e76d32880ded09c24efd54d3c823467d6c35367a380e0be33bd35f
Thanks for following up @adamjonas @fanquake! I certainly don't think this is the most correct way to proceed, especially with the large scope of changes with new sockets and threads. But it's closer than what we had before, in terms of knowing the [required, requested, allocated] counts of FDs. I think it makes much more sense to modularize the summing of FDs and make it easier to obtain the counts with future changes. Maybe something like a Will ping #18911 so they can see this message. |
Bug 1: If the daemon does not have enough file descriptors (we'll call them FDs), it is not asserted.
The daemon only aborts if
FD count < MIN_CORE_FILEDESCRIPTORS(150)
.Bug 2: (Unsure how to reproduce undefined behavior) If the system ulimit setting is a low number, the daemon does not allocate the right amount of FDs during init via
RaiseFileDescriptorLimit
. But it's close.Steps to produce Bug 1: Run
ulimit -n 150
, then startbitcoind
in the same shell.Result: bitcoind runs normally and displays an erroneous warning:
Expected: bitcoind should not start, as it needs roughly 163
172(?) FDs.-
MIN_CORE_FILEDESCRIPTORS = 150
-
MAX_ADDNODE_CONNECTIONS = 8
(new in assertion)-MAX_OUTBOUND_CONNECTIONS = 8
-(Edit: these exist within the bounds of -maxconnections, if it's high enough)CConnman::Options:nMaxFeeler = 1
-Number of
-bind
interfaces, default = 1 (new in allocation)-Number of
-rpcthreads
, default = 4 (new in allocation and assertion)150 + 8 + 1 + 4 = 163 minimum
+125 maxconnections = 288 speculative maximum
Coverage with this patch +
ulimit -n 150
:daemon closes
Coverage with this patch +
ulimit -n 163
:Coverage with this patch +
ulimit -n 203
:Coverage with this patch + 1024 ulimit (matches most Linux systems)
The replaced (old) arithmetic smells strange if you consider the subtraction can subtract negative values to increase their value, and any unintended behavior will then be silenced by std::max(x,0). Now there should be less likelyhood of suppressed failure.
Possible fix for #14870
Edit: thanks to IRC chat for the tips!