Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster duplicate input check in CheckTransaction (alternative to #14387) #14397

Closed
wants to merge 2 commits into from

Conversation

@sipa
Copy link
Member

@sipa sipa commented Oct 5, 2018

This is a simple improvement to the performance of the duplicate input check in CheckTransaction.

It includes the benchmark from #14387, for which it shows about a 2x speedup compared to master (while #14387 shows a 4.5x speedup, but with a lot more complexity).

@sipa sipa force-pushed the 201810_fastduplicate branch from fb8edd0 to 6d1779d Oct 5, 2018
@JeremyRubin
Copy link
Contributor

@JeremyRubin JeremyRubin commented Oct 5, 2018

cr-ack 6d1779d -- one of my earlier versions looked a lot like that.

I'd also suggest another optimization to make the vector a prevector with say 20 outputs so we can avoid the allocation on most txns.

May I suggest using adjacent_find with std::equal instead of your own loop?
https://en.cppreference.com/w/cpp/algorithm/adjacent_find

@jnewbery
Copy link
Member

@jnewbery jnewbery commented Oct 5, 2018

I prefer this to #14387, but I have the same question for @sipa as I asked Jeremy in that PR:

This makes block propagation faster, but do we have an understanding of how much these milliseconds matter? Is there a way we can determine whether the increased complexity introduced is a reasonable price to pay for the performance win?

Granted, the complexity added here is minimal compared to 14387, but I'd still like to understand the cost/benefit analysis.

@sipa
Copy link
Member Author

@sipa sipa commented Oct 5, 2018

@jnewbery Yeah, that's fair. I just wanted to show an alternative which is a smaller improvement but at much lower review complexity. I'm not convinced we need either approach.

@jamesob
Copy link
Member

@jamesob jamesob commented Oct 5, 2018

For what it's worth, bitcoinperf didn't show any significant performance regression for IBD or reindex after the merge of #14247, so it's not clear that either changeset is needed.

@MarcoFalke
Copy link
Member

@MarcoFalke MarcoFalke commented Oct 5, 2018

@jamesob I think this is not so much about ibd but rather propagation

@JeremyRubin
Copy link
Contributor

@JeremyRubin JeremyRubin commented Oct 5, 2018

@MarcoFalke correct.

It's something like 13ms slower.

@DrahtBot
Copy link
Contributor

@DrahtBot DrahtBot commented Oct 5, 2018

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #14400 (Add Benchmark to test input de-duplication worst case by JeremyRubin)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@gmaxwell
Copy link
Contributor

@gmaxwell gmaxwell commented Oct 5, 2018

Even block propagation is no longer critical for this: When the prior optimization went in, we didn't have relay-before-validate. I'm not opposed to this change-- as it's pretty straight forward to review-- but since this particular validity criteria is pure (a function of the tx itself with no dependency on external state) effort might be better spent reorging things so that this test ends up covered by the wtxid validity caching which, although complementary, would likely give a bigger improvement.

@ryanofsky
Copy link
Contributor

@ryanofsky ryanofsky commented Oct 6, 2018

Is there an explanation of what purpose the duplicate input check will serve if the connect block code is fixed to check for valid spends?

@gmaxwell
Copy link
Contributor

@gmaxwell gmaxwell commented Oct 7, 2018

@ryanofsky Making block connection alone check for duplicate inputs isn't sufficient: The whole reason this code was initially added in PR #443 was to keep duplicate inputs out of the mempool-- block connection did originally prevent the dupes. If both block-connection and mempool processing prevented duplicate inputs, then, indeed, there should be no reason for this code.

@jl2012
Copy link
Contributor

@jl2012 jl2012 commented Oct 10, 2018

I think both set+insert (the existing one) and sort (this PR) are O(NlogN). So sort is faster as it is more optimized?

@sipa sipa force-pushed the 201810_fastduplicate branch from 6d1779d to d5573dc Oct 12, 2018
@sipa
Copy link
Member Author

@sipa sipa commented Oct 12, 2018

@jl2012 You get better memory locality and lower allocation overhead by representing things as a vector rather than a set. The set permits much faster updating though, but that's not something we need here.

@sipa sipa force-pushed the 201810_fastduplicate branch from d5573dc to 7a9ac18 Oct 12, 2018
@@ -182,11 +182,14 @@ bool CheckTransaction(const CTransaction& tx, CValidationState &state, bool fChe

// Check for duplicate inputs - note that this check is slow so we skip it in CheckBlock
if (fCheckDuplicateInputs) {
Copy link
Contributor

@jl2012 jl2012 Oct 13, 2018

Tx with 1 input is quite common. Would it be faster if we skip those?

MarcoFalke added a commit to MarcoFalke/bitcoin-core that referenced this issue Nov 25, 2018
… case

e4eee7d Add Benchmark to test input de-duplication worst case (Jeremy Rubin)

Pull request description:

  Because there are now 2PRs referencing this benchmark commit, we may as well add it independently as it is worth landing the benchmark even if neither patch is accepted.

  bitcoin#14397
  bitcoin#14387

Tree-SHA512: 4d947323c02297b0d8f5871f9e7cc42488c0e1792a8b10dc174a25f4dd53da8146fd276949a5dbacf4083f0c6a7235cb6f21a8bc35caa499bc2508f8a048b987
@DrahtBot
Copy link
Contributor

@DrahtBot DrahtBot commented Nov 25, 2018

Needs rebase

@MarcoFalke
Copy link
Member

@MarcoFalke MarcoFalke commented Apr 19, 2019

There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened.

@MarcoFalke MarcoFalke closed this Apr 19, 2019
Munkybooty added a commit to Munkybooty/dash that referenced this issue Jul 30, 2021
… case

e4eee7d Add Benchmark to test input de-duplication worst case (Jeremy Rubin)

Pull request description:

  Because there are now 2PRs referencing this benchmark commit, we may as well add it independently as it is worth landing the benchmark even if neither patch is accepted.

  bitcoin#14397
  bitcoin#14387

Tree-SHA512: 4d947323c02297b0d8f5871f9e7cc42488c0e1792a8b10dc174a25f4dd53da8146fd276949a5dbacf4083f0c6a7235cb6f21a8bc35caa499bc2508f8a048b987
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet