
Parallelize CheckInputs() in AcceptToMemoryPool() #15169

Open · wants to merge 4 commits into master from sdaftuar:2018-12-parallel-mempool-scriptchecks
Conversation

@sdaftuar (Member) commented Jan 15, 2019

This PR takes advantage of our script check threads (used for block validation) in order to speed up transaction validation when adding to the mempool.

The motivation is to minimize the amount of time we might spend validating a single transaction. I think there are several approaches we could take to mitigate this concern, but this approach seemed straightforward to implement, and would be easy to revert if we later opted for a different solution (such as parallelizing our message handler or dropping transactions into some sort of queue for background processing).

This PR does introduce some behavior changes:

  • we would no longer ban peers for transactions with invalid scripts (however, the status quo banning behavior is spotty and inconsistent anyway, since the set of mandatory consensus flags is years old and we stop at the first script execution error)
  • we could relay transactions from a whitelisted peer that would cause us to be banned by an older peer (but -whitelistforcerelay now defaults to off, mitigating this concern)
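
For readers unfamiliar with the mechanism: a minimal sketch (not the PR's actual code) of how mempool validation can hand script checks to the block-validation worker pool, using the era's CheckInputs()/CCheckQueue interfaces from validation.cpp:

    // Sketch only. Assumes the era's CheckInputs() signature and the
    // scriptcheckqueue worker pool from validation.cpp.
    static bool RunScriptChecksParallel(const CTransaction& tx, CValidationState& state,
                                        const CCoinsViewCache& view, unsigned int flags,
                                        PrecomputedTransactionData& txdata)
    {
        std::vector<CScriptCheck> checks;
        // Ask CheckInputs() to emit one CScriptCheck per input rather than
        // executing the scripts inline.
        if (!CheckInputs(tx, state, view, true /* fScriptChecks */, flags,
                         true /* cacheSigStore */, false /* cacheFullScriptStore */,
                         txdata, &checks)) {
            return false; // non-script failure (e.g. missing or spent inputs)
        }
        // Dispatch the checks to the worker threads and wait for all results.
        CCheckQueueControl<CScriptCheck> control(&scriptcheckqueue);
        control.Add(checks);
        if (!control.Wait()) {
            return state.Invalid(ValidationInvalidReason::CONSENSUS, false, REJECT_INVALID,
                                 "mandatory-script-verify-flag-failed");
        }
        return true;
    }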

@fanquake added the Validation label Jan 15, 2019

@sdaftuar force-pushed the sdaftuar:2018-12-parallel-mempool-scriptchecks branch to 8cd9e2f Jan 15, 2019

@DrahtBot (Contributor) commented Jan 15, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #16486 ([consensus] skip bip30 checks when assumevalid is set for the block by pstratem)
  • #16400 ([refactor] Rewrite AcceptToMemoryPoolWorker() using smaller parts by sdaftuar)
  • #15713 (refactor: Replace chain relayTransactions/submitMemoryPool by higher method by ariard)
  • #15192 (Add missing cs_main locks in ThreadImport(...)/Shutdown(...)/gettxoutsetinfo(...)/InitScriptExecutionCache(). Add missing locking annotations. by practicalswift)
  • #13868 (Remove unused fScriptChecks parameter from CheckInputs by Empact)
  • #13204 (Faster sigcache nonce by JeremyRubin)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@gmaxwell (Contributor) commented Jan 15, 2019

> we could relay transactions from a whitelisted peer that would cause us to be banned by an older peer

This seems bad. The original motivation for whitelisted forced relay is long, long since dead (*). Can we just eliminate this behaviour, since it makes whitelisting less useful and isn't useful itself?

If there is still any need for forced-relay transactions, it could be accomplished by introducing a new P2P message that only whitelisted peers are allowed to use, which carries a TX whose relay is forced.

(*) my recollection is that the behaviour was originally added because people using armory would start armory and their bitcoin node at the same time, and their armory wallet would blast txn at bitcoind, which would end up in its mempool but not be relayed to the network because the node had no actual peers up yet. ... then later attempts by armory to rebroadcast were ineffectual because the txn were already in the node's mempool. However, because it causes all your peers to be flooded by txn relayed by your whitelisted neighbours, it makes whitelisting useless outside of that armory usecase unless you also add the second option to force relay off.
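
A hypothetical sketch of such a message handler; the message name "forcedtx" and the overall shape are invented for illustration:

    // Inside net_processing's message dispatch: only whitelisted peers may
    // use the hypothetical "forcedtx" message, so ordinary tx relay never
    // forces propagation.
    if (strCommand == "forcedtx") {
        if (!pfrom->fWhitelisted) {
            LogPrint(BCLog::NET, "ignoring forcedtx from non-whitelisted peer=%d\n", pfrom->GetId());
            return true;
        }
        CTransactionRef ptx;
        vRecv >> ptx;
        // Submit to the mempool as usual; if accepted (or already present),
        // relay unconditionally -- the behaviour -whitelistforcerelay provided.
    }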

@gmaxwell (Contributor) commented Jan 15, 2019

On the overall change, sounds interesting but without benchmarks I dunno how exciting it is. I guess I'm kinda surprised that there is a speedup at all considering that IBD doesn't seem to scale past 15/16 cores... which I'd assumed was due to synchronization overheads in the script validation thread dispatch.

@sdaftuar (Member, Author) commented Jan 15, 2019

I would love to get rid of the whitelist relay behavior. Any opinions about whether I should open a new PR for that, or just do it here?

My initial benchmark is on a very large transaction that has 640 P2PKH inputs, 2 outputs, and is policy-valid. I compared this PR with master, and I tried this on two computers, a Power9 workstation (which doesn't have the SHA optimizations) and my laptop (a recent macbook pro).

With master (verification threads are unused in mempool acceptance):

Computer      Verification threads   Average scriptcheck time   Average total ATMP time
power9        16                     283.0ms                    285.6ms
macbook pro   8                      176.7ms                    178.5ms

The "scriptcheck time" is a timing I added from just before the first call to CheckInputs(), to just after the second call. (I compare with the total ATMP time to sanity check that this is a meaningful optimization.)

With this PR:

Computer      Verification threads   Average scriptcheck time   Average total ATMP time
power9        16                     35.7ms                     38.2ms
macbook pro   8                      40.7ms                     42.6ms

I'm actually somewhat more interested in benchmarking transactions that use maximal CPU resources but don't get into the mempool, since such transactions are costless for someone to generate (and because they're slower for us to process, at least in this PR as it stands) -- I will try to post those numbers later. Also, if you have any suggestions for transactions that might be more costly to validate than what I've done so far, please let me know.

@MarcoFalke (Member) commented Jan 15, 2019

It would be nice to have the benchmark in master before the changes are made here, so that it is easier to see how much each of the two commits changes the performance.

@JeremyRubin (Contributor) left a comment

Concept ACK.

I think I'd rather see the parallelization determination be abstracted into the CheckQueue itself, although it's difficult to envision how that could be done neatly without incurring the overhead of constructing the check jobs. Maybe the best approach is to make the overhead of using the queue sufficiently low that we don't need to worry about only using it above a certain number of inputs. That is, if the queue is someday reworked so that this overhead goes away, it would be weird for code gating the use of the queue on job parallelization to live outside the queue.

That said, can you provide a benchmark detailing how bad it is to just always use the queue as-is?

bool parallel_verify = false;
if (legacy_inputs >= 15 && nScriptCheckThreads) {
    parallel_verify = true;
}

@JeremyRubin (Contributor) commented Jan 15, 2019
Perhaps marginally less clear, but as follows: we skip the legacy-input count entirely if we don't have parallel threads, and we stop counting as soon as the legacy-input threshold is reached.

    bool parallel_verify = nScriptCheckThreads > 0;
    // Count down toward zero as legacy (non-witness) inputs are seen.
    size_t legacy_inputs_threshold = parallel_verify ? 15 : 0;
    for (size_t i = 0; i < tx.vin.size() && legacy_inputs_threshold; i++) {
        legacy_inputs_threshold -= tx.vin[i].scriptWitness.IsNull();
    }
    parallel_verify &= legacy_inputs_threshold == 0;

@sdaftuar (Member, Author) commented May 13, 2019

I think you're right that it's not worth the complexity of gating the logic on the number of legacy inputs. I did some microbenchmarking on a couple of my computers to see how long it takes to validate small transactions at various thresholds, and the effect seems very small.

@gmaxwell (Contributor) commented Jan 16, 2019

@sdaftuar on forced relay, I think a PR that defaulted it to off could just be opened separately and make it in faster than this work... and this work could continue assuming that issue is resolved elsewhere.

@sdaftuar force-pushed the sdaftuar:2018-12-parallel-mempool-scriptchecks branch from 8cd9e2f to f09e242 May 7, 2019

@DrahtBot removed the Needs rebase label May 7, 2019

@sdaftuar changed the title WIP: Parallelize CheckInputs() in AcceptToMemoryPool() Parallelize CheckInputs() in AcceptToMemoryPool() May 7, 2019

@sdaftuar force-pushed the sdaftuar:2018-12-parallel-mempool-scriptchecks branch from f09e242 to ba7c10e May 14, 2019

@sdaftuar (Member, Author) commented May 14, 2019

Updated to always use parallel validation for mempool transactions if scriptcheck threads are available, and also updated the OP to describe the current behavior.

One remaining open question in my mind is the new behavior where a peer sending a transaction that is invalid according to consensus rules would no longer be disconnected. The motivation for this is to mitigate adversarial behavior: in order to determine whether a transaction is invalid according to consensus rules (rather than policy), we end up validating a transaction's scripts twice, so an adversary attempting a CPU DoS could construct transactions that only fail policy checks but otherwise tie up our CPU for twice as long.

However, in a non-adversarial case, where our peer is merely broken (or intended to operate on a different network) but non-malicious, this means that we will stay connected to the peer longer than we should. This is also not ideal. I have two ideas for addressing this, but I welcome other suggestions:

(1) probabilistically evaluate failing transactions with consensus flags, and use the result to decide whether to disconnect a peer -- say we had a 1% chance of validating a failed transaction a second time, so that if a peer is giving us garbage we can figure it out relatively quickly while still limiting CPU usage for a bad transaction in the general case.
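
A minimal sketch of idea (1); consensus_flags and MaybeDisconnectPeer() are invented names for pieces not shown in this thread:

    // Re-check a failing transaction under consensus-only flags ~1% of the
    // time: a peer feeding us garbage is caught quickly, while an attacker
    // can no longer reliably make us validate every bad transaction twice.
    if (GetRand(100) == 0) {
        CValidationState consensus_state;
        if (!CheckInputs(tx, consensus_state, view, true /* fScriptChecks */, consensus_flags,
                         false /* cacheSigStore */, false /* cacheFullScriptStore */, txdata)) {
            MaybeDisconnectPeer(nodeid); // hypothetical: scripts fail even under consensus rules
        }
    }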

(2) @TheBlueMatt suggested I refactor the checkqueue logic to pass back the reason for a script failure (say, the first script that fails when doing parallel validation), and then use that to determine whether to disconnect a peer based on a single validation of scripts. Assuming there are no conceptual issues with this, it should be optimal behavior for us. It also seems a little invasive, and IMO the gain here is not obviously worth the effort, but Matt suggested we might want this the next time a consensus change is proposed anyway.

Any thoughts on whether the behavior change proposed here is ok, or if I should pursue an alternate approach, such as one of the two ideas above?

@JeremyRubin (Contributor) commented May 14, 2019

The check queue approach is IMO a bad idea because we may fail for different reasons on different executions.

If you want a reason, I would re-run validation in single core and reproduce the first failure.

Otherwise maybe you can produce all failure tuples (index, failure) and then take the sorted first one?
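
A minimal sketch of that last suggestion, assuming workers can record their failures in some shared structure:

    // Each worker records (input index, script error) for any check it
    // fails; afterwards, report the failure with the smallest input index,
    // making the reported reason deterministic regardless of scheduling.
    struct CheckFailure {
        size_t input_index;
        ScriptError error;
        bool operator<(const CheckFailure& other) const { return input_index < other.input_index; }
    };

    // Collection and locking of 'failures' across worker threads elided.
    ScriptError FirstFailure(const std::vector<CheckFailure>& failures)
    {
        assert(!failures.empty());
        return std::min_element(failures.begin(), failures.end())->error;
    }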

@JeremyRubin (Contributor) commented May 14, 2019

You can even track the min txn that any worker has attempted and then just restart from there -- that might save you some work against blocks which have their last txn fail.

@sdaftuar (Member, Author) commented May 15, 2019

@JeremyRubin For a transaction containing multiple failing script inputs, I don't think it should matter whether the failure we associate with it is deterministic, since trying to find the reason for validation failure is only something we're trying to do for a non-adversarial case anyway. Do you have a scenario in mind where it would be an issue?

@sdaftuar force-pushed the sdaftuar:2018-12-parallel-mempool-scriptchecks branch from ba7c10e to b689d4d Jun 6, 2019

@sdaftuar (Member, Author) commented Jun 7, 2019

I added a commit (to be squashed) which uses single-threaded validation for transactions submitted over RPC, since we're not worried about DoS there and users get more detailed error information for invalid transactions when scripts are evaluated serially. @jnewbery -- thanks for the suggestion.

@jnewbery referenced this pull request Jul 8, 2019
@TheBlueMatt (Contributor) left a comment

utACK b689d4d. Some comments, but none all that important; I really think this is a more important change than it gets credit for.

@@ -895,6 +895,20 @@ static bool AcceptToMemoryPoolWorker(const CChainParams& chainparams, CTxMemPool
// This is done last to help prevent CPU exhaustion denial-of-service attacks.
PrecomputedTransactionData txdata(tx);
if (!CheckInputs(tx, state, view, true, scriptVerifyFlags, true, false, txdata)) {
// Assume that this is due to a policy violation, rather than

@TheBlueMatt (Contributor) commented Jul 10, 2019

Hmm, one thing this potentially protects us from is inbound peers on fork chains where the chain doesn't (yet) have any blocks to relay us (such that we'd ban them for invalid headers). We may want to broaden the ConsiderEviction logic to apply some similar (but relaxed -- we don't want to disconnect peers which are in IBD and making progress) rules to inbound peers. No need to do it in this PR, though; I think our inbound peer eviction logic could use some tuning independently, and there's no huge risk even in the case of more fork chains, given the robustness of the outbound peer eviction logic.

src/validation.cpp (resolved)
@sdaftuar (Member, Author) commented Jul 12, 2019

Thanks for the review @TheBlueMatt. I'll hold off on the comment nit unless I end up updating this PR for something else.

@jnewbery (Member) left a comment

A few small comments on the test changes.

A few comments on the git commits:

  • I recommend removing the word 'Refactor' from the commit log for Refactor CheckInputs() to only validate scripts once (since it's not a refactor - it's a change in behaviour)
  • Are you able to update the commit log for [refactor] Eliminate default argument to ATMP to explain why you're making this change?
  • Can you squash the squashme commit?
test/functional/feature_dersig.py (outdated, resolved)
test/functional/feature_cltv.py (outdated, resolved)
test/functional/p2p_segwit.py (outdated, resolved)
@@ -154,8 +154,8 @@ def run_test(self):
self.skip_mine(self.nodes[2], p2sh_ids[NODE_2][WIT_V1][0], True) # block 427

self.log.info("Verify unsigned p2sh witness txs without a redeem script are invalid")
self.fail_accept(self.nodes[2], "mandatory-script-verify-flag", p2sh_ids[NODE_2][WIT_V0][1], False)
self.fail_accept(self.nodes[2], "mandatory-script-verify-flag", p2sh_ids[NODE_2][WIT_V1][1], False)
self.fail_accept(self.nodes[2], "script-verify-flag", p2sh_ids[NODE_2][WIT_V0][1], False)

@jnewbery (Member) commented Jul 15, 2019

The failure string can be "script-verify-flag-failed" here. Even better, with the specific error string "script-verify-flag-failed (Operation not valid with the current stack size)"

@sdaftuar (Member, Author) commented Jul 16, 2019

Leaving this one alone -- someone interested in improving this test can take it on in a separate PR.

@sdaftuar force-pushed the sdaftuar:2018-12-parallel-mempool-scriptchecks branch from b689d4d to cb0c42c Jul 16, 2019

@sdaftuar (Member, Author) commented Jul 16, 2019

Addressed all but one of @jnewbery's comments, and included the comment change that @TheBlueMatt previously suggested.

Prior version of this PR can be compared here: 15169.1

@jnewbery (Member) left a comment

A few comments inline. One more:

  • can you remove MANDATORY_SCRIPT_VERIFY_FLAGS and its comment from policy.h (and replace it in the definition of STANDARD_SCRIPT_VERIFY_FLAGS with SCRIPT_VERIFY_P2SH) now that it no longer has any meaning? (See the sketch below.)
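
A sketch of what that edit might look like, assuming the pre-PR shape in which MANDATORY_SCRIPT_VERIFY_FLAGS was defined as just SCRIPT_VERIFY_P2SH (only the first few standardness flags are shown):

    static constexpr unsigned int STANDARD_SCRIPT_VERIFY_FLAGS =
        SCRIPT_VERIFY_P2SH |      // previously spelled MANDATORY_SCRIPT_VERIFY_FLAGS
        SCRIPT_VERIFY_DERSIG |
        SCRIPT_VERIFY_STRICTENC;  // ... remaining standardness flags unchanged
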
test/functional/p2p_segwit.py (outdated, resolved)
@@ -558,7 +559,7 @@ static bool CheckInputsFromMempoolAndCache(const CTransaction& tx, CValidationSt
}
}

return CheckInputs(tx, state, view, true, flags, cacheSigStore, true, txdata);
return RunCheckInputsMaybeParallel(tx, state, view, flags, cacheSigStore, true, txdata);

@jnewbery (Member) commented Jul 16, 2019

Note for reviewers that there's a change in behaviour here where previously state would be set to NOT_STANDARD and now it's set to CONSENSUS. I think that's correct behaviour, since this check is against consensus rules.

Also note that this should never fail: if CheckInputsFromMempoolAndCache() fails script checking, it's because a policy change has been deployed which is a hard fork against consensus rules. That's a bug.

@@ -558,7 +559,7 @@ static bool CheckInputsFromMempoolAndCache(const CTransaction& tx, CValidationSt
}
}

return CheckInputs(tx, state, view, true, flags, cacheSigStore, true, txdata);
return RunCheckInputsMaybeParallel(tx, state, view, flags, cacheSigStore, true, txdata);

@jnewbery (Member) commented Jul 16, 2019

I believe this breaks script caching - CheckInputs() will only cache script success if running non-parallel.
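
For context, the relevant caching condition in CheckInputs() looks roughly like this (reconstructed from memory of the era's validation.cpp, not quoted from this PR):

    // The full-script cache entry is only added on the inline path: when
    // pvChecks is non-null the checks are merely collected for the queue,
    // not executed, so success isn't known at this point.
    if (cacheFullScriptStore && !pvChecks) {
        // All checks above ran inline and succeeded; safe to cache.
        scriptExecutionCache.insert(hashCacheEntry);
    }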

// depends on the details of how net_processing handles
// such errors).
return state.Invalid(ValidationInvalidReason::CONSENSUS, false, REJECT_INVALID, strprintf("mandatory-script-verify-flag-failed (%s)", ScriptErrorString(check.GetScriptError())));
// NOTE: We return CONSENSUS as the reason here, even

@jnewbery (Member) commented Jul 16, 2019

It seems slightly weird (and scary) to return that this was a consensus failure, when this is a dumb utility function which is just checking whichever rules it gets passed. Perhaps a future PR should remove the CValidationState argument and force the caller to set the validation failure reason:

  • The initial call to CheckInputs() in ATMP should always set the state to TX_NOT_STANDARD on failure
  • The subsequent call to CheckInputs() in ATMP should always set the state to TX_WITNESS_MUTATED on failure
  • ConnectBlock should update this anyway (eg by setting a BlockValidationState after #15921)
  • The call in CheckInputsFromMempoolAndCache() should set this to CONSENSUS (since it's checking against consensus flags)
@sdaftuar (Member, Author) commented Jul 16, 2019

@jnewbery Good catch on the caching bug. I've pushed up a quick fix for now, but probably I should do something better here when I have a chance to revisit.

@@ -925,7 +925,7 @@ static UniValue testmempoolaccept(const JSONRPCRequest& request)
{
LOCK(cs_main);
test_accept_res = AcceptToMemoryPool(mempool, state, std::move(tx), &missing_inputs,
nullptr /* plTxnReplaced */, false /* bypass_limits */, max_raw_tx_fee, /* test_accept */ true);
nullptr /* plTxnReplaced */, false /* bypass_limits */, max_raw_tx_fee, /* test_accept */ true, /* single_threaded_validation */ true);

@fjahr (Contributor) commented Jul 17, 2019

Style-Nit: Would be a bit nicer to have consistency in the ordering of argument and comment.

@MarcoFalke (Member) commented Jul 17, 2019

The arguments might be wrapped into a struct before they are passed at some point in the future, so I guess this style inconsistency will be solved at that point

@@ -1672,6 +1675,23 @@ void ThreadScriptCheck(int worker_num) {
scriptcheckqueue.Thread();
}

bool RunCheckInputsMaybeParallel(const CTransaction& tx, CValidationState &state, const CCoinsViewCache &inputs, unsigned int flags, bool cacheSigStore, bool cacheFullScriptStore, PrecomputedTransactionData& txdata, bool single_threaded_validation) EXCLUSIVE_LOCKS_REQUIRED(cs_main)

@fjahr (Contributor) commented Jul 17, 2019

Question on code structure: I would like to understand the reasoning for wrapping CheckInputs() in RunCheckInputsMaybeParallel(). Instead, I would probably have put that logic into CheckInputs() itself and had it delegate to a CheckInputsParallel() or so. Are there other factors at play that I am not seeing, like keeping the changes in CheckInputs() as minimal as possible?

@sdaftuar (Member, Author) commented Jul 22, 2019

CheckInputs() is also invoked directly from ConnectBlock(), which handles the parallel scriptcheck threads itself.
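
For reference, this is roughly how ConnectBlock() already drives the queue (paraphrased and simplified from the era's validation.cpp; error handling elided):

    CCheckQueueControl<CScriptCheck> control(fScriptChecks && nScriptCheckThreads ? &scriptcheckqueue : nullptr);
    std::vector<PrecomputedTransactionData> txsdata;
    txsdata.reserve(block.vtx.size()); // must outlive the queued checks
    for (const auto& tx : block.vtx) {
        txsdata.emplace_back(*tx);
        std::vector<CScriptCheck> vChecks;
        // Each call appends this transaction's checks to vChecks.
        CheckInputs(*tx, state, view, fScriptChecks, flags, fCacheResults,
                    fCacheResults, txsdata.back(), nScriptCheckThreads ? &vChecks : nullptr);
        control.Add(vChecks);
    }
    if (!control.Wait()) {
        // At least one script check somewhere in the block failed.
        return state.Invalid(ValidationInvalidReason::CONSENSUS, false, REJECT_INVALID, "block-validation-failed");
    }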

src/validation.cpp (outdated, resolved)

sdaftuar added some commits Jan 15, 2019

Change CheckInputs() to only validate scripts once
Previously CheckInputs() could invoke a script twice to determine if a script
failure was due to standardness rules. However, it only performed this in
single-threaded validation mode, and it directly accessed policy variables to
make this determination (even though the caller was passing in the desired
script flags to use).

The policy vs consensus script flags comparison seems out of touch with current
network deployments; the only MANDATORY script flag is P2SH, even though many
other consensus changes have been deployed since. However, adding further flags
to the MANDATORY set risks making the p2p behavior overly restrictive by
disconnecting unupgraded peers unnecessarily.

Moreover, AcceptToMemoryPoolWorker() already invokes CheckInputs multiple
times, with different script flags, to detect certain kinds of segwit
validation errors. Given that the caller is responsible for passing in the
correct validation flags, it feels like a layer violation for CheckInputs() to
be re-running checks with different script flags by itself.

This patch moves all responsibility for determining which script flags are
mandatory to the caller, so CheckInputs() will now always return CONSENSUS for
any script failures. The caller can rewrite the error code itself, eg if
policy-flags were included.

This patch also changes the ban behavior for peers sending transactions with
invalid scripts; we will no longer disconnect such peers. It was already
trivial for an adversary to mutate a transaction to violate a policy rule in
order to relay it and avoid a ban, so we were not meaningfully relying on the
ban behavior before this change.

As this results in changes to the reject reasons and codes we can return, many
of the tests have been updated as well.
[refactor] Eliminate default argument to ATMP
Default arguments make review easier when adding a single new argument to a
widely called function, but are then harder to keep track of (and to ensure
the correctness of callers) when adding further arguments in the future.

@sdaftuar force-pushed the sdaftuar:2018-12-parallel-mempool-scriptchecks branch from 775b71d to 25e60d2 Jul 22, 2019

@sdaftuar (Member, Author) commented Jul 22, 2019

There was a silent merge conflict in the prior version of this PR (15169.2), so I've rebased -- I still intend to revisit the fix for the scriptcache and the outstanding nits.

@sdaftuar force-pushed the sdaftuar:2018-12-parallel-mempool-scriptchecks branch from 25e60d2 to db61ed2 Jul 22, 2019

@sdaftuar (Member, Author) commented Jul 22, 2019

Updated with an improved fix to the script caching issue; @jnewbery @TheBlueMatt, care to take a look?

sdaftuar added some commits Jan 15, 2019

Parallelize script validation in AcceptToMemoryPool()
When receiving transactions over the network, use script check threads (if
available) to reduce worst-case message handling time.

Continue to use single-threaded validation for rpc invocations (which results
in more descriptive error messages, on failure), where DoS is not a significant
concern.
Remove unused MANDATORY_SCRIPT_VERIFY_FLAGS
This variable is no longer used, so remove it.

Previously, this was used by validation logic to determine whether a
transaction with invalid scripts was violating a policy rule or a consensus
rule -- the idea being that we might ban a peer that relays a transaction
violating a consensus rule, but not one violating only a policy rule.

However, consensus rule changes are designed so that nodes need not all upgrade
at once. Consequently, banning a peer for relaying a transaction that is newly
invalid as a result of a recent soft fork would risk needlessly partitioning
old nodes from the network. Because of this type of concern, the only consensus
flag ever added to the MANDATORY set was P2SH.

@sdaftuar force-pushed the sdaftuar:2018-12-parallel-mempool-scriptchecks branch from 1de40f7 to 8b8148d Aug 1, 2019

@DrahtBot (Contributor) commented Aug 2, 2019

Needs rebase