Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Replace sets of txiter with CTxMemPoolEntryRefs #28886

Closed
wants to merge 4 commits into from

Conversation

TheCharlatan
Copy link
Contributor

Currently the mempool returns and consumes sets of multiindex iterators in its public API. A likely motivation for this over working with references to the actual values is that the multi index interface works with these iterators and not with pointers or references to the actual values.

However, using the iterator type in the setEntries set provides little benefit in practice as applied currently. Its purpose, ownership, and safety semantics often remain ambiguous, and it is hardly used for actually iterating through the data structures. So replace it where possible with CTxMemPoolEntryRefs.

Since CTxMemPoolEntry itself refers to its Parents and Children by CTxMemPoolEntryRef and not txiter, this allowed for an overall reduction of calls to iterator_to. See the docs on iterator_to for more guidance.

No change in the performance of the mempool code was observed in my benchmarks.

This also makes the goal of eliminating boost types from the headers as done in #28335 more feasible.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 15, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK theuni

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29325 (consensus: Store transaction nVersion as uint32_t by achow101)
  • #29306 (policy: enable sibling eviction for v3 transactions by glozow)
  • #26593 (tracing: Only prepare tracepoint arguments when actually tracing by 0xB10C)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

EXCLUSIVE_LOCKS_REQUIRED(m_mempool->cs, ::cs_main) {
AssertLockHeld(m_mempool->cs);
AssertLockHeld(::cs_main);
const CTransaction& tx = it->GetTx();
const CTransaction& tx = entry.GetTx();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! really think entry is much better.

@maflcko
Copy link
Member

maflcko commented Nov 16, 2023

 test  2023-11-15T23:32:10.957000Z TestFramework (ERROR): Assertion failed 
                                   Traceback (most recent call last):
                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 132, in main
                                       self.run_test()
                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/interface_usdt_mempool.py", line 320, in run_test
                                       self.replaced_test()
                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/interface_usdt_mempool.py", line 263, in replaced_test
                                       assert_equal(event.replacement_fee, replacement_fee)
                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 56, in assert_equal
                                       raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
                                   AssertionError: not(40000 == 45000)

@TheCharlatan TheCharlatan marked this pull request as draft November 16, 2023 08:25
@TheCharlatan TheCharlatan marked this pull request as ready for review November 16, 2023 12:07
@@ -468,10 +465,10 @@ void CTxMemPool::addUnchecked(const CTxMemPoolEntry &entry, setEntries &setAnces
// to clean up the mess we're leaving here.

// Update ancestors with information about this tx
for (const auto& pit : GetIterSet(setParentTransactions)) {
UpdateParent(newit, pit, true);
for (const CTxMemPoolEntry& pentry : GetEntrySet(setParentTransactions)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (const CTxMemPoolEntry& pentry : GetEntrySet(setParentTransactions)) {
for (const CTxMemPoolEntry& entry : GetEntrySet(setParentTransactions)) {

nit: Not a pointer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The p is for parent? Just like before in pit?

@maflcko maflcko self-requested a review November 16, 2023 13:16
@maflcko
Copy link
Member

maflcko commented Nov 16, 2023

I wonder if it is possible to avoid copies of the entry. Currently one one copy is created when inserting into mapTx.

However, now that things are moving toward references, which are easy to copy accidentally. It may lead to bugs such as the one mentioned in the boost docs:

// The following, though similar to the previous code,
// does not work: iterator_to accepts a reference to
// the element in the container, not a copy.
int x=c.back();
c.erase(c.iterator_to(x)); // run-time failure ensues

So disallowing copies would be a good belt-and-suspenders?

@TheCharlatan
Copy link
Contributor Author

Re #28886 (comment)

However, now that things are moving toward references, which are easy to copy accidentally. It may lead to bugs such as the one mentioned in the boost docs:

Thanks for highlighting this.

So disallowing copies would be a good belt-and-suspenders?

I did a quick POC where mapTx.insert is replaced with mapTx.emplace_back and the CTxMemPoolEntry constructor arguments are passed to emplace_back. Then I could delete the CTxMemPoolEntry copy constructor. It takes a bit of shuffling around in the fuzzing and test code, as well as the addUnchecked method to forward the required arguments, so not sure if it should go into this PR or a follow up. Let me know what you think.

@maflcko
Copy link
Member

maflcko commented Nov 17, 2023

There shouldn't be any conflicts, so maybe a separate/independent/parallel pull?

@theuni
Copy link
Member

theuni commented Nov 22, 2023

Concept ACK on avoiding exposing iterators externally.

Exposing references instead seems like a bit of a lateral move though, and I think it's going to take a good bit of convincing to prove that this is at least as safe than what we have now. Are references guaranteed by Boost to stay valid the same as iters are?

@TheCharlatan
Copy link
Contributor Author

Re #28886 (comment)

and I think it's going to take a good bit of convincing to prove that this is at least as safe than what we have now.

We're using the references now to keep track of the parents and children. I don't see what new assumption this patch is introducing compared to this existing practice.

Are references guaranteed by Boost to stay valid the same as iters are?

I'd be interested in getting an answer to this though. I'm guessing, since we can pass objects without copy or move ctors to the multiindex map, they can't be moved in memory, so the references should be safe, right?

@TheCharlatan
Copy link
Contributor Author

Rebased 7248c0a -> fe8c85e (setEntryRefs_0 -> setEntryRefs_1, compare)

@sdaftuar
Copy link
Member

Though #28676 is still in draft (and likely will remain that way for a while), my plan is to completely rewrite most of the mempool logic. To minimize rebase effort, I wonder if it would be reasonable to defer mempool refactors for the time being, and just try to ensure that these goals are reflected in that work when it becomes ready for full review?

@sdaftuar
Copy link
Member

sdaftuar commented Dec 7, 2023

However, using the iterator type in the setEntries set provides little benefit in practice as applied currently. Its purpose, ownership, and safety semantics often remain ambiguous, and it is hardly used for actually iterating through the data structures. So replace it where possible with CTxMemPoolEntryRefs.

I was thinking about this issue of ownership and safety semantics some more -- it seems like that most of the issues around iterators exist with CTxMemPoolEntryRefs as well, which is that they are only valid to hold onto as long as the mempool is locked. Once the lock is released, it's no longer safe to use mapTx iterators or CTxMemPoolEntryRef's, because they can be deleted out from under you.

So if we really want to address ownership and safety semantics, I think we need to do something a bit different? (Unfortunately I don't have any great suggestions.)

@TheCharlatan
Copy link
Contributor Author

Currently the mempool returns and consumes sets of multiindex iterators
in its public API. A likely motivation for this over working with
references to the actual values is that the multi index interface works
with these iterators and not with pointers or references to the actual
values.

However, the iterator type provides little benefit in practice as
applied currently. Its purpose, ownership, and safety semantics often
remain ambiguous, and it is hardly used for actually iterating through
the data structures. So replace it where possible with
CTxMemPoolEntryRefs.

Since CTxMemPoolEntry itself refers to its Parents and Children by
CTxMemPoolEntryRef and not txiter, this allowed for an overall
reduction of calls to iterator_to.

This also makes the long term goal of eliminating boost types from the
headers more feasible.
@TheCharlatan
Copy link
Contributor Author

Rebased c34dcea -> 7882748 (setEntryRefs_4 -> setEntryRefs_5, compare)

@TheCharlatan
Copy link
Contributor Author

Going to close this while work on cluster mempool is happening. Hopefully a solution for leaking the boost types can be found there. Will keep the parent #28335 open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done or Closed or Rethinking
Development

Successfully merging this pull request may close these issues.

None yet

6 participants