Fee estimation: extend bucket ranges consistently #21161
Conversation
This is currently giving me somewhat lower feerate estimates on mainnet -- 173 sat/vb becomes 142 sat/vb for 4 blocks, and 82 sat/vb becomes 65 sat/vb for 25 blocks. Not sure if that's fixing a bug or introducing one, though -- I don't think it's noise, because I'm also seeing a reduction when running the estimates against the fixed fee_estimates dat I had in #13990. EDIT: at first glance it looks like these fee differences were due to the mishandling of fail buckets morcos describes below.
Concept ACK
Nice find on the bug, and I think you've more or less identified the problem (I've recapped it below). However, I believe there are two issues with your potential fix.
I'll try to check IRC over the next few days if you want to ping me to discuss further. A solution is not immediately apparent to me. It's possible that the best solution is just to relax the requirement that fee rates be monotonically decreasing as the confirmation target increases. These are edge cases where it's not clear which answer would be better anyway. I think the monotonic requirement was good as a QA check, but at this point it may be causing more harm than good. Problem recap: the estimate for a 3-block target fails at 150 sat/vb but passes when you add in the 100 sat/vb bucket, so it returns a median fee somewhere in the middle of those two buckets. On the other hand, the estimate for a 4-block target passes at 150 sat/vb, but then does not have enough data at 100 sat/vb alone and fails. So it returns a median fee in the 150 sat/vb bucket (HIGHER than the rate for the 3-block target).
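The recap above can be made concrete with a small sketch. This is hypothetical Python (invented numbers and names, not Bitcoin Core's actual C++): it walks buckets from highest to lowest feerate, and shows that resetting the accumulator only when a range passes (the old behaviour) makes the range breaks depend on the confirmation target, while resetting whenever `sufficientTxVal` is reached makes them target-independent.

```python
# Hypothetical illustration of the bucket-range grouping discussed above.
# Buckets are walked from highest to lowest feerate; a range is complete once
# it holds SUFFICIENT_TX_VAL transactions, but the old code only reset the
# accumulator when the completed range also met the success threshold.

SUFFICIENT_TX_VAL = 10   # stand-in for sufficientTxVal
SUCCESS_THRESHOLD = 0.85  # stand-in for the success break point

def range_breaks(buckets, confirmed, reset_on_success_only):
    """Return the bucket ranges (tuples of feerates) the estimator would use.
    buckets: [(feerate, tx_count)], highest feerate first.
    confirmed: {feerate: txs confirmed within the target}."""
    ranges, cur, total, ok = [], [], 0, 0
    for rate, n in buckets:
        cur.append(rate)
        total += n
        ok += confirmed[rate]
        if total >= SUFFICIENT_TX_VAL:
            passed = ok / total >= SUCCESS_THRESHOLD
            if passed or not reset_on_success_only:
                ranges.append(tuple(cur))
                cur, total, ok = [], 0, 0
    if cur:
        ranges.append(tuple(cur))
    return ranges

buckets = [(150, 6), (100, 6), (50, 12)]
conf3 = {150: 5, 100: 6, 50: 11}  # 3-block target: 150+100 range passes
conf4 = {150: 4, 100: 5, 50: 12}  # 4-block target: 150+100 range fails

print(range_breaks(buckets, conf3, True))   # [(150, 100), (50,)]
print(range_breaks(buckets, conf4, True))   # [(150, 100, 50)] -- range kept growing
# With the new reset rule, the breaks no longer depend on the target:
print(range_breaks(buckets, conf3, False) == range_breaks(buckets, conf4, False))  # True
```

The point of the sketch is only the shape of the bug: under the old rule the two targets end up with different bucket ranges, so their medians are computed over different feerate spans.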
@morcos Yes, that makes sense! I think I'm only looking at each bucket once. Agree that the new code is ignoring high-feerate failures now and that that is wrong; I think I've got an idea how to fix that, will see about updating the PR over the weekend.
force-pushed from f7d5e9c to a5e33d3
Updated -- I traded in my hatchet for a scalpel; I think this should exactly fix the problem.
I think in this case the 4-block target would calculate:
and the 3-block target would say:
but that just gives a higher fee to the lower target, which is correct. I think the actual bug needs more than 2 buckets, so that you get:
at which point they start looking at separate bucket sets, and get:
at which point 3-blocks is looking for a median in the 180-250 sat/vb range while 4-blocks is looking for a median in the 190-250 sat/vb range, so the higher target can get a higher fee rate.
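For intuition on why a wider bucket range can yield a lower estimate, here is a toy weighted-median sketch. The numbers are invented (the tables from this comment were lost), and Bitcoin Core additionally interpolates within the final bucket, which this sketch skips.

```python
# Toy weighted median over a feerate bucket range (invented numbers).
# Shows how a 180-250 range can produce a lower median than a 190-250
# range even though both ranges end at the same top bucket.

def weighted_median(pairs):
    """pairs: [(feerate, tx_count)]. Returns the feerate bucket at which
    half of the total transaction count is reached."""
    pairs = sorted(pairs)
    half = sum(n for _, n in pairs) / 2.0
    acc = 0
    for rate, n in pairs:
        acc += n
        if acc >= half:
            return rate

wide = [(180, 10), (190, 4), (250, 4)]   # hypothetical 3-block range
narrow = [(190, 4), (250, 4)]            # hypothetical 4-block range

print(weighted_median(wide))    # 180
print(weighted_median(narrow))  # 190 -- higher target, higher feerate
```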
Concept ACK on initial, non-expert look.
Ah, perhaps I misread on the quadratic issues. Not that it really matters, but I think both of our recaps of the problem were not quite right. In the first part of your recap, the count for a sufficient number of transactions resets when a passing bucket is found, so 100 sat/vb would not be passing for a 4-block target. But you are right that in my version the 3-block target would find the median fee in the 150 bucket, which would still be ok. So yeah, something like your more complicated scenario is probably what happens. What you've done with a5e33d3 looks like it should work. I'd be a little worried there is some unintended consequence, though. What I would have done in the past with a change like this is to run the fee estimator through historical transaction and block data, then look at places where the new and old versions differed and make sure that I thought the difference made sense. Not sure how easy that is for anyone to do right now.
good job
Don't have historical data, but I've been running a comparison. Here are my findings so far: this branch seems to give constantly increasing and higher estimates, and I believe there is definitely something wrong with estimates for large targets. (Charts for the 1-, 3-, 6-, 12-, 24-, 48-, and 100-block targets omitted.) GitHub won't let me upload a csv, so here:
Concept ACK!
When calculating a median fee for a confirmation target at a particular threshold, we analyse buckets in ranges rather than individually, in case some buckets have very little data. This patch ensures the breaks between ranges are independent of the confirmation target.
force-pushed from a5e33d3 to a5e39d3
Rebased so tests can update. @darosior I'm not seeing the behaviour you did (and what you saw doesn't make sense to me). For me, duplicating node state and starting two nodes, one with this PR and one without it, I see fee rates tracking each other pretty precisely. I've put my data in a google sheet (the pr21161 log has ~12 fewer entries than the master one, so they're up to 10 minutes out of sync by the end).
There hasn't been much activity lately. What is the status here? Finding reviewers may take time. However, if the patch is no longer relevant, please close this pull request. If the author lost interest or time to work on this, please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future. |
I don't have a ton of background on the fee estimator, but my understanding here is:

- In `EstimateMedianVal` we group buckets together until they meet `sufficientTxVal`, i.e. have enough data, before we check whether the range meets the success rate. We accumulate the total txns for the current range into `totalNum` and check if there's enough.
- Before, we would reset the stats only if we met the success rate. Now, we reset every time we meet `sufficientTxVal`.
- Before, bucket ranges would depend on what the confirmation target is, because resetting `totalNum` depends on whether the success rate is met. Now, bucket ranges are consistent regardless of what the target is, since the total number of transactions isn't different.
> Here are my findings so far, this branch seems to give constantly increasing and higher estimates and i believe there is definitely something wrong with estimates for large targets.

> I'm not seeing the behaviour you did (and what you saw doesn't make sense to me). For me, duplicating node state, and starting two nodes, one with this pr, and one without it, I see fee rates tracking each other pretty precisely.
Fwiw I hacked together a branch that implements both `EstimateMedianVal`s (i.e. master and this branch) and returns feerates using both implementations from estimatesmartfee, to eliminate differences from having slightly different mempools, timing, etc. They are giving me the same feerates every time. I cleared fee_estimates.dat; same thing. I could have implemented it wrong, but I also don't see how the huge differences could happen based on my understanding of the code.
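The side-by-side approach described here can be sketched as a small harness (illustrative names, not Bitcoin Core APIs): feed identical inputs to both implementations and report any divergence, so that mempool and timing differences cannot account for a gap.

```python
# Sketch of the dual-implementation comparison technique: evaluate two
# estimator implementations over identical inputs and collect every target
# where their results diverge beyond a tolerance.

def compare_estimators(est_a, est_b, targets, tol=1e-9):
    """Return [(target, a_result, b_result)] wherever the two disagree."""
    return [(t, est_a(t), est_b(t))
            for t in targets
            if abs(est_a(t) - est_b(t)) > tol]

# Two identical stand-in estimators produce no divergences:
same = lambda t: 500.0 / t
print(compare_estimators(same, same, [1, 2, 3, 6, 12]))  # []
```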
I don't have a ton of background, but if this doesn't have a significant impact on fee estimates in practice, it seems worth trying. The test still fails semi-regularly (#25179, #20725, #23165).
The issue this is solving occurs intermittently.
The test failed on recent master c325f0f with seed from @mzumsande and passed on this PR rebased on master head c325f0f.
Reading through to understand the patch in progress
Approach ACK

What I understand from the discussion and the code: the issue this PR is solving is that, intermittently, a higher confirmation target can return a higher fee rate than a lower one. This occurs because of the way the bucket ranges were extended.

Hence the number of fee rates (bucket ranges) that we use to determine the success threshold sometimes differs between targets. I will expand on the example @ajtowns gave to show how this occurs. Assuming our fee rate bucket data points are as in that example (tables omitted):

Before this PR:
- Confirmation Target 3 -- Median Fee: 180-250 fee rate bucket
- Confirmation Target 4 -- Median Fee: 190-250 fee rate bucket

The confirmation target 4 fee rate will end up being higher than target 3's.

With this PR:
- Confirmation Target 3 -- Median Fee: 180-199 fee rate bucket
- Confirmation Target 4 -- Median Fee: 180-199 fee rate bucket

Hence it will not be higher, because both targets use the same bucket range.

The above example also indicates that the number of transactions will not change [Txs column] irrespective of confirmation target; the bucket ranges and the data points are the same and will not overlap. Also, originally, e.g. given a bucket range of 200-250 with 40 transactions, if the number of confirmed transactions for target 2 is 32, then the number of confirmed transactions for target 3 must be 32 or above. This PR ensures the fee estimate will decrease monotonically as the confirmation target increases. I will test the fee estimate in this branch against the fee estimate on master to see if there is much difference, but I don't think this will bring any unintended results given this extensive test.
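The monotonicity property claimed here is easy to check mechanically. A minimal sketch, with made-up estimate values:

```python
# Check that fee estimates never increase as the confirmation target grows
# (the invariant this PR restores). The numbers below are invented.

def is_monotone_nonincreasing(estimates):
    """estimates: {confirmation_target: feerate}."""
    targets = sorted(estimates)
    return all(estimates[a] >= estimates[b]
               for a, b in zip(targets, targets[1:]))

good = {1: 210.0, 3: 180.0, 6: 150.0, 12: 150.0, 24: 95.5}
bad = {3: 142.0, 4: 173.0}  # the kind of inversion described in this thread

print(is_monotone_nonincreasing(good))  # True
print(is_monotone_nonincreasing(bad))   # False
```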
Initial ACK a5e39d3
Running both of these locally
```
./test/functional/feature_fee_estimation.py --randomseed 7288975409214300634
./test/functional/feature_fee_estimation.py --randomseed 5585159971690227248
```
fails for me on master and works with this branch rebased to master.
The test itself looks correct and the issue appears to involve the code in question here.
Per the test output with the following diff, it looks like the only feerates changed in the test with this PR (rebased to master) compared to master are related to the failing fee estimate cases, and the fee reasons remain the same.
```diff
diff --git a/src/rpc/fees.cpp b/src/rpc/fees.cpp
index 62396d4c589..d8d82921210 100644
--- a/src/rpc/fees.cpp
+++ b/src/rpc/fees.cpp
@@ -47,6 +47,7 @@ static RPCHelpMan estimatesmartfee()
     RPCResult::Type::OBJ, "", "",
     {
         {RPCResult::Type::NUM, "feerate", /*optional=*/true, "estimate fee rate in " + CURRENCY_UNIT + "/kvB (only present if no errors were encountered)"},
+        {RPCResult::Type::STR, "fee_reason", /*optional=*/true, "fee reason returned by estimator (only present if no errors were encountered)"},
         {RPCResult::Type::ARR, "errors", /*optional=*/true, "Errors encountered during processing (if there are any)",
         {
             {RPCResult::Type::STR, "", "error"},
@@ -91,6 +92,7 @@ static RPCHelpMan estimatesmartfee()
         errors.push_back("Insufficient data or no feerate found");
         result.pushKV("errors", errors);
     }
+    result.pushKV("fee_reason", StringForFeeReason(feeCalc.reason));
     result.pushKV("blocks", feeCalc.returnedTarget);
     return result;
 },
diff --git a/test/functional/feature_fee_estimation.py b/test/functional/feature_fee_estimation.py
index 4f56d585d3f..47b86704740 100755
--- a/test/functional/feature_fee_estimation.py
+++ b/test/functional/feature_fee_estimation.py
@@ -95,6 +95,7 @@ def check_smart_estimates(node, fees_seen):
     mempoolMinFee = node.getmempoolinfo()["mempoolminfee"]
     minRelaytxFee = node.getmempoolinfo()["minrelaytxfee"]
     for i, e in enumerate(all_smart_estimates):  # estimate is for i+1
+        print(e)
         feerate = float(e["feerate"])
         assert_greater_than(feerate, 0)
         assert_greater_than_or_equal(feerate, float(mempoolMinFee))
```
```diff
@@ -283,7 +289,14 @@ double TxConfirmStats::EstimateMedianVal(int confTarget, double sufficientTxVal,
     // we can test for success
     // (Only count the confirmed data points, so that each confirmation count
     // will be looking at the same amount of data and same bucket breaks)
```
It looks like this comment could be updated to be clearer as an introduction to the `partialNum` conditional that is replacing the previous `totalNum` one.
Code review ACK a5e39d3
When calculating a median fee for a confirmation target at a particular threshold, we analyse buckets in ranges rather than individually, in case some buckets have very little data. This patch ensures the breaks between ranges are independent of the confirmation target.
Fixes #20725