Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate package transactions with their in-package ancestor sets #26711

Closed
wants to merge 7 commits into from

Conversation

glozow
Copy link
Member

@glozow glozow commented Dec 16, 2022

This contains everything to make mempool/validation logic ready for package relay (see #27463).

Design goals:

  • Be able to gracefully deal with any arbitrary list of transactions (these come from p2p)
  • Validate any ancestor package so that the incentive-compatible transactions end up in our mempool.
    • If the transactions would be accepted individually, they should also be accepted through AcceptPackage. We don't want to accidentally reject things just because we happened to download them as a package.
    • Bug prior to these changes: we required IsChildWithParents but if there were dependencies between the parents, we could end up (1) accepting a low-feerate child or (2) rejecting a high-feerate parent CPFPing another parent. See the "interdependent parents" test case for a specific example.
  • Be DoS-resistant.
    • Avoid quadratic validation costs.
    • Avoid loading a lot of stuff from disk, or loading repeatedly.

There are 2 main improvements to package evaluation here:
(1) We submit transactions with their in-package ancestor sets.
- See unit test package_ppfp: without this change, we would reject everything.
- See unit test package_ppfc: shows that this doesn't let us do "parent pays for child;" we only do this when the individual and ancestor feerates meet mempool minimum feerate
(2) We linearize the package transactions based on ancestor set scoring.
- See unit test package_needs_reorder: without this change, if we use the original order of transactions, we would only accept 1 grandparent, even if we submit subpackages.
- See unit test package_desc_limits: without this change, we accept one of the lower-feerate transactions (a bit more of a "nice to have" than a "must have" example).

A description of the package validation logic (originally #26711 (comment)):

  • Basic sanitization. Quit if it's too big, inconsistent, etc.
  • Linearize (Topological sort only)
  • PreChecks loop For each tx, grab UTXOs to calculate fees and filter out anything we should skip:
    • If already in mempool (or same txid in mempool), mark as skip
    • If missing inputs or conflict, record this failure and mark this and all descendants as skip.
    • If no failures or TX_SINGLE_FAILURE, continue
    • For some failures that we expect due to differing chainstates, skip these transactions and their descendants, but continue.
    • Otherwise, record this failure and mark that we will quit_early (i.e. not do the Subpackage validation loop).
  • Refine our linearization using the fee information.
  • Subpackage validation loop For each transaction in the new linearized order:
    • Get the transaction's ancestor subpackage.
    • If the feerate of this transaction is insufficient, continue;
    • If the feerate of this subpackage is insufficient, continue;
    • Otherwise, try to submit the subpackage, using AcceptSingleTransaction() if it's just 1 tx
    • if at any point we get a non-fee-related error, abort all.
  • Call LimitMempoolSize
  • Backfill results:
    • If the transaction was in mempool, check to see if it's still there in case it was trimmed in LimitMempoolSize.
    • Try to use results from the subpackage validation loop.
    • If that doesn't exist (i.e. we quit early), use results from prechecks loop.
    • If that doesn't exist (i.e. we quit early and we hadn't found a failure with it yet), fill with TX_UNKNOWN.

This means we will call PreChecks for each transaction 2 times (fewer if we quit early), and run all other validation checks at most 1 time. A transaction shouldn't be validated in the subpackage validation loop more than once.

@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 16, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Stale ACK instagibbs, naumenkogs

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28813 (rpc: permit any ancestor package through submitpackage by glozow)
  • #28690 (build: Introduce internal kernel library by TheCharlatan)
  • #28455 (refactor: share and use GenerateRandomKey helper by theStack)
  • #28391 (refactor: Simplify CTxMempool/BlockAssembler fields, remove some external mapTx access by TheCharlatan)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

src/validation.cpp Outdated Show resolved Hide resolved
src/validation.cpp Outdated Show resolved Hide resolved
src/validation.cpp Outdated Show resolved Hide resolved
@glozow glozow changed the title [WIP] validate package transactions with their in-package ancestor sets validate package transactions with their in-package ancestor sets Jan 17, 2023
src/validation.cpp Outdated Show resolved Hide resolved
Copy link
Member

@instagibbs instagibbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dropping some initial comments

src/validation.cpp Outdated Show resolved Hide resolved
src/policy/packages.cpp Outdated Show resolved Hide resolved
src/policy/packages.cpp Outdated Show resolved Hide resolved
src/policy/packages.h Outdated Show resolved Hide resolved
src/policy/packages.h Outdated Show resolved Hide resolved
src/policy/packages.h Outdated Show resolved Hide resolved
src/validation.cpp Show resolved Hide resolved
src/validation.cpp Outdated Show resolved Hide resolved
src/validation.cpp Outdated Show resolved Hide resolved
src/policy/packages.cpp Outdated Show resolved Hide resolved
src/policy/packages.cpp Outdated Show resolved Hide resolved
src/policy/packages.h Outdated Show resolved Hide resolved
src/validation.cpp Outdated Show resolved Hide resolved
Copy link
Member

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review at commit “[txpackages] add AncestorPackage for linearizing packages”.

/** Fees of this transaction. Starts as std::nullopt, can be updated using AddFeeAndVsize(). */
std::optional<CAmount> fee;

/** Virtual size of this transaction (depends on policy, calculated externally). Starts as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the usage of virtual size to build the ancestor score of a package might come with incentive suboptimal block template if you have a significant number of segwit spends constituting it (sounds 90% as of 2023). A 500 vbyte transaction A paying 100 sats with 100 vbytes of witnesses isn’t equivalent to a 500 vbyte transaction B paying 100 sats with 20 vbytes of witnesses considering block max size is checked against MAX_BLOCK_WEIGHT in CheckBlock().

Copy link
Member Author

@glozow glozow Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, I'm pretty sure two transactions with the same virtual size but different amounts of witness data are treated exactly the same when checking against the maximum weight. These are both 1997-2000 weight units.

* If fee and vsizes are given for each transaction, it can also linearize the transactions using
* the ancestor score-based mining algorithm via MiniMiner.
*
* IsIsInMempool() and IsDanglingWithDescendants() can be used to omit transactions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one limitation of the partial set of ancestors being used to evaluate the ancestor-score of a package could be the edge-case where unconfirmed ancestors are replaced in the mempool by better same-nonwitness-data-in-mempool candidates, with increased or diminished witness size. This is not an issue right now as we don’t allow wtxid-replacement in PreChecks().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also not an issue because we omit the already-in-mempool transactions in our linearization.


explicit PackageEntry(CTransactionRef tx_in) : tx(tx_in) {}

// Used to sort the result of Txns, FilteredTxns, and FilteredAncestorSet. Always guarantees
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my understanding of the guarantees of the PackageEntry comparison operator is correct, a breakage of the guarantee would be a yield AncestorPackage that produce a consensus-invalid block template, e.g a more incentive-compatible subset of descendants where the parents are sorted after. This comment to understand how test coverage should be built for this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the sorting is broken, it means that FilteredTxns etc. might return transactions out of order, but this means validation will get a "missing inputs" error and nothing will be submitted. It can also mean that FilteredTxns doesn't give the more incentive compatible transactions first, which can affect whether we accept or reject these transactions.

This function being incorrect can't make us produce a consensus-invalid block template. Transactions still need to pass validation to be submitted, CTxMemPool sorts its entries, and BlockAssembler does its own sanity checks.

//
// If ancestor score-based linearization sequence exists for both transactions, the
// transaction with the lower sequence number comes first.
// If unavailable or there is a tie, the transaction with fewer in-package ancestors comes first (topological sort).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t know if it has been discussed before in the review of this PR, though in case of a tie sounds the breaker should not be on the absolute number of in-package ancestor though on their accumulated vsize ? E.g package entry A might have 1 ancestor of 1 kvB and package entry B might have 3 ancestors of 200 vbytes each.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is just to make sure a descendant doesn't come before its child. If A is an ancestor of B, then all of A's ancestors are also B's ancestors. It must be that A.ancestor_count < B.ancestor_count. Using total ancestor vsize would also work, but count is much easier to check.

@glozow
Copy link
Member Author

glozow commented Nov 2, 2023

Some test commits have been split off into #28764.

glozow added a commit to bitcoin-core/gui that referenced this pull request Nov 3, 2023
…action invariants

fcb3069 Use CheckPackageMempoolAcceptResult for package evaluation fuzzing (Greg Sanders)
34088d6 [test util] CheckPackageMempoolAcceptResult for sanity-checking results (glozow)
651fa40 fuzz: tx_pool checks ATMP result invariants (Greg Sanders)

Pull request description:

  Poached from bitcoin/bitcoin#26711 since that PR is being split apart, and modified to match current behavior.

ACKs for top commit:
  glozow:
    reACK fcb3069, only whitespace changes
  dergoegge:
    ACK fcb3069

Tree-SHA512: abd687e526d8dfc8d65b3a873ece8ca35fdcbd6b0f7b93da6a723ef4e47cf85612de819e6f2b8631bdf897e1aba27cdd86f89b7bd85fc3356e74be275dcdf8cc
fanquake added a commit to bitcoin-core/gui that referenced this pull request Nov 3, 2023
b5a60ab MOVEONLY: CleanupTemporaryCoins into its own function (glozow)
10c0a86 [test util] CreateValidTransaction multi-in/out, configurable feerate, signal BIP125 (glozow)
6ff647a scripted-diff: rename CheckPackage to IsWellFormedPackage (glozow)
da9aceb [refactor] move package checks into helper functions (glozow)

Pull request description:

  This is part of #27463. It splits off the more trivial changes from #26711 for ease of review, as requested in bitcoin/bitcoin#26711 (comment).

  - Split package sanitization in policy/packages.h into helper functions
    - Add some tests for its quirks (bitcoin/bitcoin#26711 (comment))
  - Rename `CheckPackage` to `IsPackageWellFormed`
  - Improve the `CreateValidTransaction` unit test utility to:
    - Configure the target feerate and return the fee paid
    - Signal BIP125 on transactions to enable RBF tests
    - Allow the specification of multiple inputs and outputs
  - Move `CleanupTemporaryCoins` into its own function to be reused later without duplication

ACKs for top commit:
  dergoegge:
    Code review ACK b5a60ab
  instagibbs:
    ACK b5a60ab

Tree-SHA512: 39d67a5f0041e381f0d0f802a98ccffbff11e44daa3a49611189d6306b03f18613d5ff16c618898d490c97a216753e99e0db231ff14d327f92c17ae4d269cfec
achow101 added a commit to bitcoin-core/gui that referenced this pull request Nov 3, 2023
d9cc99d [test] MiniMiner::Linearize and manual construction (glozow)
dfd6a37 [refactor] unify fee amounts in miniminer_tests (glozow)
f4b1b24 [MiniMiner] track inclusion order and add Linearize() function (glozow)
0040759 [test] add case for MiniMiner working with negative fee txns (glozow)
fe6332c [MiniMiner] make target_feerate optional (glozow)
5a83f55 [MiniMiner] allow manual construction with non-mempool txns (glozow)
e3b2e63 [refactor] change MiniMinerMempoolEntry ctor to take values, update includes (glozow)
4aa98b7 [lint] update expected boost includes (glozow)

Pull request description:

  This is part of #27463. It splits off the `MiniMiner`-specific changes from #26711 for ease of review, as suggested in bitcoin/bitcoin#26711 (comment).

  - Allow using `MiniMiner` on transactions that aren't in the mempool.
  - Make `target_feerate` param of `BuildMockTemplate` optional, meaning "don't stop building the template until all the transactions have been selected."
    - Add clarification for how this is different from `target_feerate=0` (bitcoin/bitcoin#26711 (comment))
  - Track the order in which transactions are included in the template to get the "linearization order" of the transactions.
  - Tests

  Reviewers can take a look at #26711 to see how these functions are used to linearize the `AncestorPackage` there.

ACKs for top commit:
  TheCharlatan:
    ACK d9cc99d
  kevkevinpal:
    reACK [d9cc99d](bitcoin/bitcoin@d9cc99d)
  achow101:
    re-ACK d9cc99d

Tree-SHA512: 32b80064b6679536ac573d674825c5ca0cd6245e49c2fd5eaf260dc535335a57683c74ddd7ce1f249b5b12b2683de4362a7b0f1fc0814c3b3b9f14c682665583
@glozow
Copy link
Member Author

glozow commented Nov 3, 2023

add TX_SINGLE_FAILURE / TX_UNKNOWN + tests (+ package-fee-too-low?)

This has been split off in #28785

@glozow
Copy link
Member Author

glozow commented Nov 7, 2023

The "linearize using fees" commit message seems to have an extra line at the end left over from a rebase? The algorithm description seems like it would be better as a code comment rather than a commit message.

Added the algo description to the documentation of the AcceptPackage function

  • linearisation + miniminer + unit/fuzz test and bench
  • acceptpackage loop + subpackage linearization + unit/fuzz tests?

I'd prefer to keep the AncestorPackage changes (along with its unit/bench/fuzz) together with the validation changes in this PR since it's part of the implementation of the algorithm and doesn't have any other uses.

  • permit any ancestor package, functional tests

I've moved this to #28813 which will follow this PR.

The "permit any ancestor package" commit seems like it should be annotated as an RPC change, rather than a "policy" change?

Done, in #28813.

So to recap, I think we are down to 3 PRs to get this done: #28785, then #26711 (this PR), then #28813.

fanquake added a commit to bitcoin-core/gui that referenced this pull request Nov 8, 2023
…for reconsiderable fee failures and skipped transactions

1147e00 [validation] change package-fee-too-low, return wtxid(s) and effective feerate (glozow)
10dd9f2 [test] use CheckPackageMempoolAcceptResult in previous tests (glozow)
3979f1a [validation] add TxValidationResult::TX_RECONSIDERABLE, TX_UNKNOWN (glozow)
5c786a0 [refactor] use Wtxid for m_wtxids_fee_calculations (glozow)

Pull request description:

  Split off from #26711 (suggested in bitcoin/bitcoin#26711 (comment)). This is part of #27463.

  - Add 2 new TxValidationResults
    - `TX_RECONSIDERABLE` helps us encode transactions who have failed fee checks that can be bypassed using package validation. This is distinguished from `TX_MEMPOOL_POLICY` so that we re-validate a transaction if and only if it is eligible for package CPFP. In the future, we will have a separate cache for reconsiderable rejects so these transactions don't go in `m_recent_rejects`.
    - `TX_UNKNOWN` helps us communicate that we aborted package validation and didn't finish looking at this transaction: it's not valid but it's also not invalid (i.e. don't cache it as a rejected tx)
  - Return effective feerate and the wtxids of transactions used to calculate that effective feerate when the error is `TX_SINGLE_FAILURE`. Previously, we would only provide this information if the transaction passed. Now that we have package validation, it's much more helpful to the caller to know how the failing feerate was calculated. This can also be used to improve our submitpackage RPC result (which is currently a bit unhelpful when things fail).
  - Use the newly added `CheckPackageMempoolAcceptResult` for existing package validation tests. This increases test coverage and helps test the changes made in this PR.

ACKs for top commit:
  instagibbs:
    reACK bitcoin/bitcoin@1147e00
  achow101:
    ACK 1147e00
  murchandamus:
    reACK 1147e00
  ismaelsadeeq:
    ACK 1147e00

Tree-SHA512: ac1cd73c2b487a1b99d329875d39d8107c91345a5b0b241d54a6a4de67faf11be69a2721cc732c503024a9cca381dac33d61e187957279e3c82653bea118ba91
glozow and others added 7 commits November 9, 2023 09:37
We cannot require that peers send topologically sorted lists, because we
cannot check for this property without ensuring we have the same chain
tip and ensuring we have the full ancestor set. Instead, add the ability
to handle arbitrarily ordered transaction lists.
The AncestorPackage ctor linearizes the transactions topologically. If
fee and vsize information is added, it uses MiniMiner to refine the
linearization using the ancestor score-based algorithm.
Adds usage of AncestorPackage to linearize transactions and skip ones
that are submitted or already in mempool. This linearization is just
topological for now, but a subsequent commit will add logic to gather
fee information and use that to refine the linearization. Even though
packages are currently required to be sorted, the order may be changed,
since AncestorPackage can produce a different but valid topological
sort.

Use a switch statement to ensure we are handling all possible
TxValidationResults. Also simplifies the diff of later commits.
General algorithm:
1. Basic sanitization to ensure package is well formed and not too big
   to work on.
2. Linearize (based on topological sort).
3. PreChecks loop: call PreChecks on all transactions to get fees and
   vsizes, and to find obvious errors for which we should skip parts of
   the package or abort validation altogether.
4. Linearize (based on ancestor set score).
5. For each transaction in the new linearized order, submit with its
   subpackage (or skip if subpackage/individual feerate is too low).
6. Limit mempool size and backfill map of MempoolAcceptResults.

Q: What's so great about doing a fee-based linearization?
Linearization helps us understand which transactions need to be grouped
together as CPFPs. This helps us skip the low feerate parent and wait
until we see it in the high feerate child's ancestor package; another
approach would be to retry submission with another grouping, but we want
to avoid repeated validation. Linearization also helps us try to accept
the highest feerate subset of the package when we don't don't have room
for all of it. For example, if each of the transactions in the package
share an ancestor whose descendant limits only permit one, we should try
the highest feerate transaction(s) first.

Q: Why use PreChecks to get fees and vsize?
We need the fee and vsize of each transaction, which requires looking up
inputs. While some of the work done in PreChecks might not be necessary,
it applies fail-fast anti-DoS checks that we definitely want (e.g. a
transaction that is too big could thus cause us to load too many UTXOs).
In the future, we may also be interested in using information like the
transaction's mempool conflicts and ancestor set (calculated in
PreChecks) in the linearization or quit early logic. So PreChecks is
best for this purpose.

Q: We don't know if txns have valid signatures yet, so if something is
invalid, we might have a suboptimal linearization/grouping. Should we
regroup or retry in that case?
No. If the peer maliciously constructs a package this way, there isn't
much we can do to find the "optimal subset" without doing a ton of work.
Generally, we hope to only validate each transaction 1 time.
Instead, we just hope we have another honest peer that will send us the
"normal" package.
- package_ppfp shows benefit of assessing subpackages
- package_ppfc shows that this doesn't let us do "parent pays for
  child;" we only do this when the individual and ancestor feerates meet
mempool minimum feerate
- package_needs_reorder shows necessity of linearizing using ancestor
  set scoring. If we used the original order, we would reject
everything, even if we submit subpackages.
- package_desc_limits shows that linearization can help us pick the
  most incentive-compatible tx to accept when transactions affect each
other's eligibility (descendant package limits)
- package_confirmed_parent shows benefit of treating
  txns-already-known as "in mempool" tx
- package_with_rbf tests existing behavior that shouldn't change. Even
  though package RBF is not enabled, it's important that submitting a
transaction as part of a package does not block it from doing a normal
replacement. Otherwise we may blind ourselves to a replacement simply
because it has a child and we happened to download it as a package.

Co-authored-by: Andrew Chow <achow101@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

None yet