init: Fixes for file descriptor accounting #16003

Closed
tryphe wants to merge 4 commits

Conversation

tryphe
Contributor

@tryphe tryphe commented May 10, 2019

Bug 1: If the daemon does not have enough file descriptors (FDs from here on), the shortfall is not caught at startup.
The daemon only aborts if the FD count is below MIN_CORE_FILEDESCRIPTORS (150).

Bug 2: (Unsure how to reproduce undefined behavior) If the system ulimit setting is a low number, the daemon does not request the right number of FDs during init via RaiseFileDescriptorLimit, although the number it requests is close.

Steps to produce Bug 1: Run ulimit -n 150, then start bitcoind in the same shell.

Result: bitcoind runs normally and displays an erroneous warning:

Warning: Reducing -maxconnections from 125 to -8, because of system limitations.
...
Using at most -8 automatic connections (150 file descriptors available)

Expected: bitcoind should not start, as it needs roughly 163(?) FDs.

- MIN_CORE_FILEDESCRIPTORS = 150
- MAX_ADDNODE_CONNECTIONS = 8 (new in assertion)
- MAX_OUTBOUND_CONNECTIONS = 8
- CConnman::Options::nMaxFeeler = 1 (Edit: the outbound and feeler connections exist within the bounds of -maxconnections, if it's high enough, so they are not added to the sum)
- Number of -bind interfaces, default = 1 (new in allocation)
- Number of -rpcthreads, default = 4 (new in allocation and assertion)

150 + 8 + 1 + 4 = 163 minimum
+125 maxconnections = 288 speculative maximum
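
The sum above can be sketched as a small helper. The constant values are the ones quoted in the list; the names `FdInputs`, `RequiredFds`, and `RequestedFds` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>

// Values quoted in the description above.
constexpr int MIN_CORE_FILEDESCRIPTORS = 150;
constexpr int MAX_ADDNODE_CONNECTIONS = 8;

// Hypothetical bundle of the user-configurable inputs and their defaults.
struct FdInputs {
    int bind_count = 1;        // number of -bind interfaces
    int rpc_threads = 4;       // -rpcthreads
    int max_connections = 125; // -maxconnections
};

// Minimum number of descriptors needed before any peer connection is accepted.
constexpr int RequiredFds(const FdInputs& in)
{
    return MIN_CORE_FILEDESCRIPTORS + MAX_ADDNODE_CONNECTIONS + in.bind_count + in.rpc_threads;
}

// Speculative maximum: the minimum plus every peer slot.
constexpr int RequestedFds(const FdInputs& in)
{
    return RequiredFds(in) + in.max_connections;
}

static_assert(RequiredFds(FdInputs{}) == 163, "150 + 8 + 1 + 4");
static_assert(RequestedFds(FdInputs{}) == 288, "163 + 125");
```

Changing `bind_count` or `rpc_threads` moves both totals, which is why the description marks those two inputs as new in the allocation.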

Coverage with this patch + ulimit -n 150:

Bitcoin Core version v0.18.99.0-50ccaa56f -dirty (release build)
Error: Not enough file descriptors available. 150 available, 163 required.

daemon closes

Coverage with this patch + ulimit -n 163:

Bitcoin Core version v0.18.99.0-50ccaa56f (release build)
There are 163 file descriptors available, 163 required, 163 reserved, and 288 requested.
Warning: Reducing -maxconnections from 125 to 0, because of file descriptor limitations.
...
Using at most 0 automatic connections (163 file descriptors available)

Coverage with this patch + ulimit -n 203:

Bitcoin Core version v0.18.99.0-50ccaa56f (release build)
There are 203 file descriptors available, 163 required, 203 reserved, and 288 requested.
Warning: Reducing -maxconnections from 125 to 40, because of file descriptor limitations.
...
Using at most 40 automatic connections (203 file descriptors available)

Coverage with this patch + ulimit -n 1024 (matches most Linux systems):

Bitcoin Core version v0.18.99.0-50ccaa56f (release build)
There are 1024 file descriptors available, 163 required, 288 reserved, and 288 requested.
...
Using at most 125 automatic connections (1024 file descriptors available)
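
The four runs above are consistent with a simple rule: refuse to start when fewer descriptors are available than required, otherwise cap the peer count at whatever is left over. A minimal sketch of that rule follows; `ConnLimit` and `LimitConnections` are hypothetical names, not identifiers from the patch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical result type; the patch itself reports the failure via InitError.
struct ConnLimit {
    bool ok;             // false when the daemon should refuse to start
    int max_connections; // peer slots that fit in the remaining descriptors
};

// available: descriptors the process may use (for example from `ulimit -n`)
// required:  minimum needed before any peer connects (163 with default settings)
// wanted:    the -maxconnections value (125 by default)
inline ConnLimit LimitConnections(int available, int required, int wanted)
{
    if (available < required) return {false, 0};
    return {true, std::min(wanted, available - required)};
}
```

With `required = 163` and `wanted = 125`, the inputs 150, 163, 203, and 1024 give a startup failure, 0, 40, and 125 connections respectively, matching the logged values.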

The replaced (old) arithmetic smells strange: the subtraction can end up subtracting negative values, which increases the result, and any unintended behavior is then silenced by std::max(x, 0). With this change there should be less likelihood of a suppressed failure.
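
To illustrate the concern, here is a minimal sketch of the old startup check and final subtraction, reconstructed from the description in this thread rather than copied from init.cpp. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

constexpr int MIN_CORE_FILEDESCRIPTORS = 150;
constexpr int MAX_ADDNODE_CONNECTIONS = 8;

// Old behavior as described in this thread: only the core minimum is checked.
inline bool OldStartupCheckPasses(int available)
{
    return available >= MIN_CORE_FILEDESCRIPTORS;
}

// The addnode slots are subtracted afterwards with no lower bound,
// so the result can go negative when `available` is close to the core minimum.
inline int OldMaxConnections(int available, int wanted)
{
    return std::min(available - MIN_CORE_FILEDESCRIPTORS - MAX_ADDNODE_CONNECTIONS, wanted);
}
```

With `ulimit -n 150` the check passes and the connection limit becomes -8, which is the value shown in the erroneous warning.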

Possible fix for #14870

Edit: thanks to IRC chat for the tips!

@tryphe tryphe changed the title from "[init] an incorrect amount of file descriptors is requested, and a different amount is also asserted" to "init: an incorrect amount of file descriptors is requested, and a different amount is also asserted" May 10, 2019
@DrahtBot DrahtBot added the P2P label May 10, 2019
@DrahtBot
Contributor

DrahtBot commented May 10, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #16362 (Add bilingual_str type by hebasto)
  • #15759 ([p2p] Add 2 outbound blocks-only connections by sdaftuar)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@tryphe tryphe force-pushed the fd-limits-3 branch 2 times, most recently from 4b2a7d6 to 1a297d2 Compare May 10, 2019 13:27
amount of
descriptors requested

[init] initialize more file descriptors based on number of bind
interfaces, MAX_OUTBOUND_CONNECTIONS, and
CConnman::Options:nMaxFeeler

[init] add verbose output for FD limits, cleanup lines.

[init] cleanup LogPrintF to print FDs (oops), use better variable
semantics, include MAX_ADDNODE_CONNECTIONS in nUserMinConnections

[init] addnode connections don't count towards maxconnections

[init] correct soft limit for MAX_OUTBOUND_CONNECTIONS

[init] connections are softcapped at 0

[init] fix up descriptor commits, remove unused variables, add verbosity

[init] fix up strings slightly

[init] small refactor for better verbosity
Member

@jonatack jonatack left a comment


Reference IRC discussion, IIUC: http://www.erisian.com.au/bitcoin-core-dev/log-2019-05-10.html#l-127

Tests that show the expected behavior and verify the issue/regression would be good here if possible.

src/net.h (outdated, resolved)
@Empact
Member

Empact commented May 15, 2019

Two requests to simplify review:

  • squash travis fixups into the appropriate commit
  • split fixes into 2 PRs, one addressing each

@tryphe
Contributor Author

tryphe commented May 17, 2019

@Empact Thanks. I thought it would be crazy to open two PRs, because the fix for bug 1 definitely fixes something, while bug 2 isn't really reproducible. Any bad behavior from bug 2 could be confused with bug 1, and it would be hard to tell which one it is. E.g., things could go south if not enough FDs were allocated (maybe allocated is the wrong word) or if not enough were asserted. And since file descriptor problems involving threads are hard to reproduce, I thought it made sense to mash it all together.

Contributor Author

@tryphe tryphe left a comment


Looking for more suggestions here. I'm not quite sure how to logically split this into multiple PRs, so I'll leave that for now.

This commit certainly isn't perfect, but it is an incremental improvement that enforces and requests a slightly more accurate number of file descriptors. If the user has an absurdly low system setting, we lower the maximum number of inbound connections until that is no longer possible, instead of possibly starting with an unsafe amount.

One note: I've included mmapped things as well, because they require a temporary handle to obtain the mapping. Not sure if this was overkill, but it seemed like a good idea.

I also have a feeling that the number of bind arguments isn't always equal to the number of interfaces bitcoind will listen on, so that's probably a bad assumption.

Any additions or ideas on how to fix this up are appreciated. Thanks :)

return InitError(strprintf(_("Not enough file descriptors available. %d available, %d required."), nFD, nFDMin));

// Calculate new -maxconnection count. Note: std::min<int>(...) to work around FreeBSD compilation issue described in #2695
nMaxConnections = std::min<int>(nMaxConnections, nFD - nFDMin);
Contributor Author


I wonder if we need <int> here? Both arguments are int now. But I didn't want to cover up this old fix.
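
For context on why the explicit template argument can matter: `std::min` deduces a single type `T` from both arguments, so mixed argument types fail to compile unless `T` is named explicitly. The sketch below uses a `long` parameter purely as an illustrative assumption; whether the FreeBSD issue in #2695 involved exactly this mismatch is not stated in this thread. `CapConnections` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: `fd_headroom` is a long only to create a type mismatch.
inline int CapConnections(int wanted, long fd_headroom)
{
    // std::min(wanted, fd_headroom) would not compile here: T would be deduced
    // as both int and long ("deduced conflicting types").
    // Naming T explicitly converts both arguments to int before comparing.
    return std::min<int>(wanted, fd_headroom);
}
```

When both arguments are already `int`, as in the patched line, the explicit `<int>` is redundant but harmless.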

init: nMaxConnections and (nFD-nFDMin) cannot be negative, don't check against 0
init: try to fix Travis note about deducing types
init: try to fix travis again "deduced conflicting types"
init: try a better way through the Travis problem
init: revert old init warning for drahtbot
net: name nit "NUM"->"MAX"_FEELER_CONNCTIONS
init: name nit "NUM"->"MAX"_FEELER_CONNCTIONS
init: fixup typo
src/init.cpp (resolved)
@tryphe
Contributor Author

tryphe commented Jul 18, 2019

Maybe we can just implement -extrafds=n (or maybe just fds for the total, with the minimum defined somewhere) instead of modifying this code every time we realize something new or existing needs a file handle. It doesn't really make sense to modify this code if we can just modify a default config value (except, of course, for dynamic values that can't be known in advance), right?

@tryphe
Contributor Author

tryphe commented Jul 19, 2019

I figured I should supply some pseudocode to show what happens differently in the code and how it's related to both Bug 1 and 2. There's a bit of redundant code, and lots of constants from other places, so I tried to reduce the amount of references needed to actually understand what's going on.

Suppose you are running Mac OS X and listening on the network interface. The ulimit is 256. Suppose you also have 1 bind interface nBind(1) and 1 RPC thread nRpc(1).

Current code

nFD = RaiseFileDescriptorLimit(nMaxConn(125) + nCore(150) + nAddnode(8)) = gets reduced to 256
nMaxConnections = min(125, nFD(256) - nBind(1) - nCore(150) - nAddnode(8)) = min(125, 97) = 97
if ( nFD < nCore(150) ) error("not enough fds")
nMaxConnections = min(nFD(256) - nCore(150) - nAddnode(8), nMaxConnections(97)) = min(98, 97) = 97

Usable FDs: nCore(150) + nMaxConnections(97) = 247 + nAddnode(8) + nRpc(1) + nBind(1) = 257
That is one more than the 256 available, even though we only have 1 bind interface and 1 RPC thread.

This PR

nFDMin = nCore(150) + nAddnode(8) + nBind(1) + nRpc(1) = 160
nFD = RaiseFileDescriptorLimit(nMaxConn(125) + nFDMin(160)) = gets reduced to 256
if ( nFD < nFDMin(160) ) error("not enough fds")
nMaxConnections = min(nMaxConn(125), 256-nFDMin(160)) = 96

Usable FDs: nCore(150) + nMaxConnections(96) = 246 + nAddnode(8) + nRpc(1) + nBind(1) = 256
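
The two pseudocode variants above can be checked with a short sketch. It follows the pseudocode step by step rather than the real init.cpp, and all function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

constexpr int CORE = 150;  // MIN_CORE_FILEDESCRIPTORS
constexpr int ADDNODE = 8; // MAX_ADDNODE_CONNECTIONS

// Total descriptors the daemon may end up using for a given peer limit.
constexpr int UsableFds(int max_conn, int bind, int rpc)
{
    return CORE + max_conn + ADDNODE + rpc + bind;
}

// Current code, per the pseudocode: RPC threads are never part of the budget.
inline int CurrentMaxConnections(int fd, int wanted, int bind)
{
    const int n = std::min(wanted, fd - bind - CORE - ADDNODE);
    return std::min(fd - CORE - ADDNODE, n);
}

// This PR, per the pseudocode: subtract the full minimum once.
inline int PatchedMaxConnections(int fd, int wanted, int bind, int rpc)
{
    const int fd_min = CORE + ADDNODE + bind + rpc;
    return std::min(wanted, fd - fd_min);
}
```

For the 256-descriptor case with 1 bind interface and 1 RPC thread, the current arithmetic yields 97 connections and 257 usable descriptors, while the patched arithmetic yields 96 connections and exactly 256.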

@DrahtBot
Contributor

Needs rebase

@laanwj laanwj changed the title from "init: an incorrect amount of file descriptors is requested, and a different amount is also asserted" to "init: Fixes for file descriptor accounting" Sep 30, 2019
@adamjonas
Member

Hi @tryphe - Wondering if you are going to continue with this PR or whether it should be closed. It looks like this has needed a rebase for some time. I'd also suggest you squash your commits in accordance with the CONTRIBUTING guidelines and add tests as suggested above to verify the issue.

@fanquake
Member

fanquake commented May 7, 2020

Will close this for now. However, I think the changes are worth following up on, so I will open a new "good first issue". I will also extract one change that can be merged now.

fanquake added a commit that referenced this pull request May 12, 2020
e3047ed test: use p2p constants in denial of service tests (fanquake)
25d8264 p2p: add MAX_FEELER_CONNECTIONS constant (tryphe)

Pull request description:

  Extracted from #16003.

ACKs for top commit:
  naumenkogs:
    utACK e3047ed

Tree-SHA512: 14fc15292be4db2e825a0331dd189a48713464f622a91c589122c1a7135bcfd37a61e64af1e76d32880ded09c24efd54d3c823467d6c35367a380e0be33bd35f
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 12, 2020
@tryphe
Contributor Author

tryphe commented May 18, 2020

Thanks for following up @adamjonas @fanquake!

I certainly don't think this is the most correct way to proceed, especially given the large scope of changes involving new sockets and threads. But it's closer than what we had before, in terms of knowing the [required, requested, allocated] counts of FDs. I think it makes much more sense to modularize the summing of FDs and make it easier to obtain the counts with future changes. Maybe something like a DescriptorMan that does all of the dirty work. Doing this would make adding more sockets or modifying connection min/max bounds much less prone to conflicts. It's apparent that we don't want to keep editing these general files (init.cpp and net.h) every time we need to change the descriptor count.
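
One possible shape for such a module is sketched below. It is only an illustration of the idea: `DescriptorBudget` and its methods are hypothetical and do not exist in Bitcoin Core.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch: each subsystem registers the descriptors it needs,
// so init code no longer has to know every constant.
class DescriptorBudget
{
public:
    // Record `count` descriptors that `owner` needs before startup can proceed.
    void Require(const std::string& owner, int count)
    {
        assert(count >= 0);
        m_required[owner] += count;
    }

    // Sum of all mandatory reservations.
    int Required() const
    {
        int total = 0;
        for (const auto& entry : m_required) total += entry.second;
        return total;
    }

    // Descriptors left for peer connections; negative means startup should fail.
    int Headroom(int available) const { return available - Required(); }

private:
    std::map<std::string, int> m_required;
};
```

Registering 150 for core, 8 for addnode, 1 for bind, and 4 for RPC gives the 163 minimum from the description, and a user-supplied value such as the -extrafds=n idea floated earlier in the thread could be registered the same way.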

Will ping #18911 so they can see this message.

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
7 participants