New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Super Duper MemPool Limiter #6470
Conversation
@@ -852,6 +877,12 @@ bool AcceptToMemoryPool(CTxMemPool& pool, CValidationState &state, const CTransa | |||
hash.ToString(), nSigOps, MAX_STANDARD_TX_SIGOPS), | |||
REJECT_NONSTANDARD, "bad-txns-too-many-sigops"); | |||
|
|||
// Expire old transactions before trying to replace low-priority ones. | |||
int expired = pool.Expire(GetTime() - GetArg("-mempoolexpiry", DEFAULT_MEMPOOL_EXPIRY) * 60 * 60); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't - GetArg("-mempoolexpiry", DEFAULT_MEMPOOL_EXPIRY) * 60 * 60
be inside pool.Expire?, it may become an attribute of the txmempool in the future or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer not doing that, and keeping CTxMempool as much as possible a dumb data structure - the decisions about what happens to it (policy?) should stay out of it, IMHO.
EDIT: We're already failing at that pretty badly anyway, it seems, with the feerate index and the trim code inside CTxMempool. Too bad, but disregard this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is just as easy to move things from txmempool to the policy dir than it is from main (if not easier, given that main is always the part with more development conflicts [unrelated things in theory conflict in code there]).
But whatever, I guess it will be a new global in main...
re: GetRand slowness, wouldn't insecure_rand() be good enough here? |
I like the approach; I'll review the code in detail soon. |
The idea that a transaction must be "paid for" when evicted by an unrelated transaction is mistaken. It creates zero incentive for the evicted author to pay more fees in the future. He got what he wanted -- his tx was relayed. If the decision does not affect the cost, the cost should not affect the decision. |
The assumption is that when a transaction is evicted, it won't be mined, thus it won't have paid anything at all. |
@sipa In that case, it also received no benefit. |
It got relayed, which has network costs. The purpose of this payment requirement is to prevent someone from spamming the network by constantly replacing the lowest priority one, and only paying once. |
For DoS protection, all that's needed is (new fee/kb) - (old fee/kb) >= (minimum fee/kb). |
@dgenr8 No, that would let a small transaction erase a large transaction, making space for several new transactions to be cheaply relayed. |
|
||
if (!mempool.StageTrimToSize(softcap, entry, stagedelete, nFeesRequired, nFeesDeleted)) { | ||
size_t expsize = mempool.DynamicMemoryUsage() + mempool.GuessDynamicMemoryUsage(entry); | ||
if (expsize > hardcap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use { } around the then block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No objection to changing, but just want to understand what the style request is related to this. All then blocks should have braces or only because the else block is multi-line? A single-line then block without braces appears as example code in developer-notes.md.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact is that braces are not necessary in our clang style. And btw, when they're used, the shouldn't be in the next line but in the same line as the if/for/while. I'm happy changing the style, but that's in clang-format.
@jtimon I'd like to add some sanity checking for the command line arguments that are passed in. I assume that should go in init.cpp? Where is the appropriate place for me to store variables such as hard cap, mempool expiry time, etc.. so they aren't recalculated every time. |
indexed_transaction_set::nth_index<1>::type::reverse_iterator it = mapTx.get<1>().rbegin(); | ||
int fails = 0; // Number of mempool transactions iterated over that were not included in the stage. | ||
int itertotal = 0; | ||
int iterextra = mustTrimAllSize ? 10 : 100; //Allow many more iterations to find large size during SurplusTrim |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind turning these into arguments, that get passed from the 2 callsites instead?
Needs rebase. |
@sipa That is good. You don't want a 500KB spam-monster tx paying (minimum fee/kb) to require a regular-sized tx to pay 1001 * (minimum fee/kb) to dislodge it. Bad guy could get 1000 little txes relayed anyway by sending them first -- the pay-for-evicted rule creates a race to be first or pay double. |
@dgenr8 While that is a potential problem for optimizing what gets into the mempool, note that this PR is attempting to provide a solution so that many small transactions can be used to evict large transaction packages. |
@dgenr8 This conversation has been spread over a couple of PR's and IRC, so I apologize if I'm repeating a prior argument, but I think it would be good to explain the reasoning behind the logic in this pull. Prior to limiting the mempool (or having RBF implemented), we had protection against bandwidth spamming attacks by requiring all transactions that were relayed to be accepted into the mempool and have either fee or priority. Since those transactions are now in the mempool and can't be recalled, those fees are now at risk for being consumed by being included in a block. This pull isn't meant to provide some new protection against that attack, but is only meant to protect against OOM attacks by ever expanding mempools. However, it must be sure not to break that pre-existing protection. Once we create a way for transactions to be removed from the mempool, we have to realize that the fees placed on those transactions are no longer at risk. The approach we've been using is to say that (modulo transactions that have already been mined) the mempool must contain enough fees to pay for the relay cost of all transactions that have been historically broadcast whether they are still in the mempool or not. What this prevents is someone replacing or evicting their own transactions "cheaply" and thereby achieving free relay of their original transactions. The fact that there might be a race to get into the mempool before it fills up is a slight aberration that occurs as a one off. We still need a mechanism by which to adjust the minimum relay fee over time in the event that it is too small to dissuade spam. Your concern about a monster transaction of low fees preventing a small transaction from getting in the mempool is addressed in 4 separate ways in this pull.
No doubt further improvements are possible, but I think this is a good step in the right direction. |
@morcos Thanks. I feel the capping part of this pull is too complex, and could be simplifed (a little) by removing the "pay for eviction" idea, which has no good incentive effect. The person paid is the miner, who doesn't need to be paid ~double to mine the evictor tx. The person who pays is the unlucky evictor. Evictee gets a chance at cheap fees, and others who come after eviction could pay even less, since there might then be extra space for them. The case where evictor == evictee could be handled explicitly, if it were actually possible to target this kind of replacement with all the factors that affect the selection of the evictees, such as propagation variation, node start time variation, mining, and nodes having different mempool size limits. |
I rebased, addressed many of the comments and did some various cleanups in place. Please note I changed the defaults for |
@morcos everybody is creating new globals in main, but I hate that. I would at the very least move them to globals/server ot something, but whatever... Even if you don't do it "the right way" ( @luke-jr wanted to create a dynamic GUI form for command line options using something like jtimon@5a42e27#diff-01e64f27a2a21a3116825fa22aee0537R30 ), for the case of -maxmempool and -mempoolexpiry, you could at least put them in CTxMempool. But, whatever, there's many things to cleanup already and everybody is doing it wrong, even for new policy options (like fRequireStandard), so why would you spend any "mental power" on putting new things in the right place when you're actually solving an urgent problem? |
@jtimon OK thanks. Yes I'm going to leave them as locals for now and if we end up adding sanity checking it can just be done separately in init.cpp for now. But I do like your idea of moving them to a policy class. It puts all the logic of what are the reasonable parameters for these arguments in the same place.. |
I have a few questions,
|
Testing & benchmarking. |
@ABISprotocol, I still believe we need a floating relay fee. I just think it needs to act over a much longer time horizon and be less abusable than the one implemented in the commit I removed. |
@ABISprotocol This was the commit I removed, sipa@6498673. But I think the correct answer is TBD. |
@morcos @sipa Please refer to the number of the pull request for the the floating relay fee in this one at such time when it is created so that the discussion / progress on this can be followed. So far I have been following this as follows: |
Rebased now that #6498 has been merged |
Indexes on: - Tx Hash - Fee Rate (fee-per-kb)
The mempool will now have a soft cap set below its hard cap. After it fills up to the soft cap, transactions much first try using StageTrimToSize to evict the amount of size they are adding. If they fail they can still be let into the mempool if they pass a higher relay minimum. It doubles 10 times between the soft cap and hard cap.
StageTrimToSize will make several attempts to find a set of transactions it can evict from the mempool to make room for the new transaction. It should be aware of any required minimum relay fee that needs to be paid for by the new transaction after accounting for the fees of the deleted transactions.
Use reserve space between soft cap and hard cap as a reservoir of surplus fees that have been paid above the minRelayTxFee and occasionally use the aggregate usage there to trim from the bottom of the mempool.
Improve logging Use insecure_rand in TrimMempool Tweak logic of TrimMempool Add occasional larger SurplusTrim. Bypass eviction on disconnected block txs Additional SurplusTrim for bypassed size Acquire locks appropriately
Rebased and squashed various cleanups and small changes into the last commit. |
Closing in lieu of #6557, will reopen if we decide for some reason we want the mempool limiting without the descendant package tracking... |
I'm beginning to wonder just how far down the rabbit hole this all goes. And what happened to the floating relay fee bit? |
@sipa @sdaftuar @jtimon @petertodd
OK, this is the best combination of approaches I could put together.
I took #6455 and I removed the floating relay fee commit. I really like that idea, but it needs to be much slower acting I think and not subject to potential abuse. That can be a later improvement.
I added 3 things:
The reject rates in my test setup have dropped to 0.3% for 30k feerate tx's and 0.05% for 60k feerate tx's. (See other results in #6455).
I made an attempt to tweak the looping parameters in StageTrimToSize to something that I think made sense, but with the contrived test setup, and only one set of simulation data, they are probably best evaluated on the basis of intuition and not relying entirely on the resulting rejection rates.
It turns out the slowest part of StageTrimToSize was GetRand() by a long shot, so I hacked it out, but I'm sure @sipa will want to replace my hack with something nicer.
The code works as is, but could still use some work, but I think its time to get more eyes on this suggestion for a plan forward.