Small step towards demangling cs_main from CNodeState #10652

Open
wants to merge 4 commits into
from

Conversation

Projects

Blockers in High-priority for review

8 participants
Contributor

TheBlueMatt commented Jun 22, 2017

This is a rather long branch, split up into three PRs, with lots of cleanup and tweaks here and there that evntually gets us CNodeStates which have their own locks, locked before cs_main, which makes nodes not always require cs_main to do simple things like call Misbehaving(). The final result is at https://github.com/TheBlueMatt/bitcoin/tree/2017-06-cnodestateaccessors-3 (and next PR is https://github.com/TheBlueMatt/bitcoin/tree/2017-06-cnodestateaccessors-2).

This PR is primarily some small cleanups and the first few commits of #9447 rebased on #10179 and with small tweaks.

The result is that we take the lock for the CNodeState for the node we're processing at the top of ProcessMessages/SendMessages, and hold it until the end, passing around an accessor to the CNodeState as we go. This moves towards a few goals: (1) ability to run ProcessMessages() in paralell, (2) move towards a world where ProcessMessage() is a function in CNodeState, which are objects owned by CConnman.

This isn't a trivial thing to pull off because of lockorder, which introduces two issues:

  • There are a few places we call State() as a result of callbacks from CValidationInterface. Addressed by extending the CValidationInterface threading introduced in #10179 and used for wallet to a few more functions.
  • You can't take two State() accessors at once, because then you'd have undefined lockorder. Luckily with the CValidationInterface threading changes gets most of those, but also the first few commits from #9447 are needed (included here rebased and slightly tweaked) as well as two little tweaks to be able to apply DoS score after unlocking the CNodeState of the node we're processing on.

By the end DEBUG_LOCKORDER features are added to stricly enforce these potential issues (and I've been testing it a bunch in helgrind, etc).

Contributor

TheBlueMatt commented Jun 22, 2017

Ping @theuni.

src/net.cpp
}
+void CConnman::ForEachNode(std::function<void (CNode* pnode)> func)
@jonasschnelli

jonasschnelli Jun 28, 2017

Member

Wouldn't this discard the rvalue reference && (move schematic)?

@TheBlueMatt

TheBlueMatt Jun 28, 2017

Contributor

I mean it does, but I'm not too worried about requiring move for a callable type (places where we std::move a lambda into this will still get the benifit of calling std::function(Callable&&)). The real concern here, of course, is that it introduces a performance penalty, but its not huge (an extra function call and few if's on top of the callback function call) and we're already calling this with lambda functions, so I dont think we were ever too worried about the performance here.

@TheBlueMatt

TheBlueMatt Jun 28, 2017

Contributor

Moved this commit to #10697.

Contributor

TheBlueMatt commented Jun 28, 2017

Rebased on a rebased version of the updated #10179 and filled in more text for the " MarkBlockAsReceived on NewPoWValidBlock at receive." commit message which helps provide a bit more context for this PR:

"Note that this also helps with future per-CNodeState locks, as those
will require no two CNodeState locks to be held at the same time
(for lockorder reasons), and this prevents us from accessing
another peer's mmapBlocksInFlight entry during processing as we can
move the NewPoWValidBlock callback into the CValidationInterface
background thread mechanics."

- // (but if it does not build on our best tip, let the SendMessages loop relay it)
- if (!IsInitialBlockDownload() && chainActive.Tip() == pindex->pprev)
- GetMainSignals().NewPoWValidBlock(pindex, pblock);
+ GetMainSignals().NewPoWValidBlock(pindex, pblock, chainActive.Tip() == pindex->pprev);
@theuni

theuni Jul 6, 2017

Member

I think this wants fInitialDownload as a param so that it's always evaluated in its receiving context rather than having subscribers potentially coming to different conclusions.

@TheBlueMatt

TheBlueMatt Jul 7, 2017

Contributor

Hmm? I'm not sure I caught your point - we never get here if we already have the block on disk, and the new uses of it in net_processing need to know that we got here for any ProcessNewBlock call they make.

@theuni

theuni Jul 7, 2017

Member

My point was that IsInitialBlockDownload() may be false by the time it's checked in a callback on a separate thread (once that stuff comes in), though it was true in AcceptBlock. So it's potentially racy to subscribers.

@TheBlueMatt

TheBlueMatt Jul 7, 2017

Contributor

Ahh, ok. Yes, is not an issue in any of my work, I believe (worst-case you have an extra few fast-announcements when you first leave IBD), but I added a comment noting the issue.

+ // TODO: Only process if requested from this peer?
+ forceProcessing |= mmapBlocksInFlight.count(hash);
+ // Block is no longer in flight from this peer
+ MarkBlockAsNotInFlight(hash, pfrom->GetId());
@theuni

theuni Jul 6, 2017

Member

Probably makes sense for MarkBlockAsNotInFlight() to return whether it removed something for the peer? I think that would take care of the TODO as well.

@TheBlueMatt

TheBlueMatt Jul 7, 2017

Contributor

Given the TODO has a "?" on it (and I dunno if we want it or not), I'll leave this for a later PR?

Contributor

TheBlueMatt commented Jul 7, 2017

Rebased.

@morcos

morcos approved these changes Jul 10, 2017

Looks pretty good to me.
utACK 0010932

src/net_processing.cpp
+ range.first++;
+ if (itInFlight->second.first == nodeid) {
+ if (clearState)
+ ClearDownloadState(itInFlight);
@morcos

morcos Jul 10, 2017

Contributor

same line or braces

@TheBlueMatt

TheBlueMatt Jul 11, 2017

Contributor

Fixed.

@@ -2024,7 +2066,7 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
}
// If we're not close to tip yet, give up and let parallel block fetch work its magic
- if (!fAlreadyInFlight && !CanDirectFetch(chainparams.GetConsensus()))
+ if (!fInFlightFromSamePeer && !CanDirectFetch(chainparams.GetConsensus()))
@morcos

morcos Jul 10, 2017

Contributor

I'm a bit hesitant about this change. I can't really remember exactly why we have this CanDirectFetch logic in compact block processing. But this could make us unnecessarily avoid an optimistic compact block reconstruction if for some reason it took more than 200 mins to receive 2 blocks (right we do reconstructions if within 2 blocks of the announced block?)

@morcos

morcos Jul 10, 2017

Contributor

Actually now that I think about it, that concern exists prior to the changes in this PR.
Why do we have the CanDirectFetch guard for compact block processing?

@TheBlueMatt

TheBlueMatt Jul 11, 2017

Contributor

No, I think you'd have to go 200 minutes for one block for it to matter (if you go 200 minutes for two, and missed the first one, you wont compact-download that older one, but thats OK, you were probably offline or something anyway). Still, we dont have a much better way to say "ok, I really, really only want to do this if I'm caught up" (ie cause otherwise mempool will be worthless). We could drop it - its not the end of the world to take the extra RT to use the compact blocks and get a tiny bit of compression, but at that point you may very well be just as well off just doing a regular block fetch and not taking the extra RT.

morcos and others added some commits Dec 29, 2016

@morcos @TheBlueMatt morcos Turn mapBlocksInFlight into a multimap 10b7508
@morcos @TheBlueMatt morcos Only request full blocks from the peer we thought had the block in-fl…
…ight

This is a change in behavior so that if for some reason we request a block from a peer, we don't allow an unsolicited CMPCT_BLOCK announcement for that same block to cause a request for a full block from the uninvited peer (as some type of request is already outstanding from the original peer)
ca66c4a
@TheBlueMatt TheBlueMatt Call NewPoWValidBlock callbacks for all new blocks, not just !IBD
This pushes some "is this callback useful" logic down into
net_processing, which is useful for later changes as it allows for
more notifications to be used.
45a3f05
@TheBlueMatt TheBlueMatt MarkBlockAsReceived on NewPoWValidBlock at receive.
The received block could be malleated, so this is both simpler, and
supports parallel downloads.

Note that this also helps with future per-CNodeState locks, as those
will require no two CNodeState locks to be held at the same time
(for lockorder reasons), and this prevents us from accessing
another peer's mmapBlocksInFlight entry during processing as we can
move the NewPoWValidBlock callback into the CValidationInterface
background thread mechanics.
26b2db5
Contributor

TheBlueMatt commented Jul 11, 2017 edited

Rebased to remove the useless old commits, but no changes. Still would be nice to see thtis for 0.15, but not a huge deal if it misses.

laanwj added this to the 0.15.0 milestone Jul 11, 2017

Contributor

morcos commented Jul 11, 2017 edited

utACK ca66c4a

src/net_processing.cpp
+ range.first++;
+ if (itInFlight->second.first == nodeid) {
+ if (clearState)
+ ClearDownloadState(itInFlight);
@promag

promag Jul 12, 2017

Contributor

Missing block or same line as if ().

- std::map<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator> >::iterator it = mapBlocksInFlight.find(resp.blockhash);
- if (it == mapBlocksInFlight.end() || !it->second.second->partialBlock ||
- it->second.first != pfrom->GetId()) {
+ bool fExpectedBLOCKTXN = false;
@promag

promag Jul 12, 2017

Contributor

expected_BLOCKTXN?

- it->second.first != pfrom->GetId()) {
+ bool fExpectedBLOCKTXN = false;
+ std::pair<BlockDownloadMap::iterator, BlockDownloadMap::iterator> rangeInFlight = mmapBlocksInFlight.equal_range(resp.blockhash);
+ BlockDownloadMap::iterator it = rangeInFlight.first;
@promag

promag Jul 12, 2017

Contributor

Remove it and mutate rangeInFlight.first.

+ while (it != rangeInFlight.second) {
+ if (it->second.first == pfrom->GetId()) {
+ if (it->second.second->partialBlock)
+ fExpectedBLOCKTXN = true;
@promag

promag Jul 12, 2017

Contributor
if (it->second.first == pfrom->GetId()) {
    fExpectedBLOCKTXN = it->second.second->partialBlock;
    break;
}
@@ -2382,7 +2423,10 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
LOCK(cs_main);
// Also always process if we requested the block explicitly, as we may
// need it even though it is not a candidate for a new best tip.
- forceProcessing |= MarkBlockAsReceived(hash);
+ // TODO: Only process if requested from this peer?
+ forceProcessing |= mmapBlocksInFlight.count(hash);
@promag

promag Jul 12, 2017

Contributor

Replace these mmapBlocksInFlight.count() with, for instance:

forceProcessing |= IsBlockInFlight(hash);
src/scheduler.cpp
@@ -139,3 +139,69 @@ size_t CScheduler::getQueueInfo(boost::chrono::system_clock::time_point &first,
}
return result;
}
+
+bool CScheduler::AreThreadsServicingQueue() const {
@promag

promag Jul 12, 2017

Contributor

{ in new line for all functions.

src/scheduler.cpp
+bool CScheduler::AreThreadsServicingQueue() const {
+ return nThreadsServicingQueue;
+}
+
@promag

promag Jul 12, 2017

Contributor

Remove extra line.

@@ -105,7 +105,8 @@ namespace {
bool fValidatedHeaders; //!< Whether this block has validated headers at the time of request.
std::unique_ptr<PartiallyDownloadedBlock> partialBlock; //!< Optional, used for CMPCTBLOCK downloads
};
- std::map<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator> > mapBlocksInFlight;
+ typedef std::multimap<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator>> BlockDownloadMap;
@promag

promag Jul 12, 2017

Contributor

Suggestion, add NodeId to QueuedBlock structure?

Contributor

TheBlueMatt commented Jul 18, 2017

This should likely be un-tagged 0.15. Its not a critical enough bugfix to merit merging at this late stage.

MarcoFalke removed this from the 0.15.0 milestone Jul 18, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment