
Add feerate histogram to getmempoolinfo #15836

Status: Closed (wanted to merge 2 commits)

Conversation

jonasschnelli (Contributor):

This follows the approach of adding statistical information to Bitcoin Core that would otherwise be inefficient to calculate outside of the codebase.

Adds an optional feerate histogram to getmempoolinfo.

The concept and code are heavily inspired by the stats @jhoenicke runs (https://github.com/jhoenicke/mempool).

If someone has a good idea how to make the feerate-groups dynamic but also semi-constant for similar fee environments, please comment.

If this is a feature we'd like to have in master (concept ACKs), I'll continue by writing tests.

A simple plot of the data is here.
RPC output sample is here.

promag (Member) left a comment:

Concept ACK.

I think we can expose some stats like this out of the box.

I think it may be useful to include the current timestamp in the response - a client can then just run a cron script to call getmempoolinfo and store the result (without changing the JSON response).

throw std::runtime_error(
RPCHelpMan{"getmempoolinfo",
"\nReturns details on the active state of the TX memory pool.\n",
{},
{
{"with_fee_histogram", RPCArg::Type::BOOL, /* default */ "false", "True for including the fee histogram in the response"},
Member:

Instead of this parameter, it could take fee_histogram_bins (defaulting to [], which means no histogram is included in the response). This would replace the feelimits above and also avoid breaking clients' implementations.

std::vector<uint64_t> count(feelimits.size(), 0);
std::vector<uint64_t> fees(feelimits.size(), 0);

LOCK(pool.cs);
Member:

I believe we should move this up (done in #15474).

Member:

That pull was merged, please rebase and remove this lock.

@@ -629,7 +639,8 @@ static const struct {
{"/rest/block/notxdetails/", rest_block_notxdetails},
{"/rest/block/", rest_block_extended},
{"/rest/chaininfo", rest_chaininfo},
-{"/rest/mempool/info", rest_mempool_info},
+{"/rest/mempool/info", rest_mempool_info_basic},
+{"/rest/mempool/info/with_fee_histogram", rest_mempool_info_with_fee_histogram},
Member:

Can't we just start to use query parameters?

jonasschnelli (Author):

That would eventually be better, but it's not in the scope of this PR (which follows the current scheme).

laanwj (Member), Jun 7, 2019:

Still voting against query parameters. With REST the general preference seems to be to turn parameters into URL segments, and query parameters tend to be avoided because they look ugly and are hard to remember.

Member:

I don't think there's a "standard" here, but with REST the URL path usually identifies a resource, a collection of resources, or an action (the verb is also relevant), while parameters are usually set in the URL query: order-independent and optional. I also think this is more flexible; for instance, you could support ...?verbose=true in all endpoints (just an example).

promag (Member), Jul 17, 2019:

This must be before the above line (order is important) otherwise rest_mempool_info_with_fee_histogram is never called.


// distribute feerates into feelimits
for (size_t i = 0; i < feelimits.size(); i++) {
if (feeperbyte >= feelimits[i] && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])) {
Member:

Correct me if I'm wrong but if feelimits is sorted then && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1]) is not necessary.

promag (Member), Apr 17, 2019:

Besides, it could avoid the linear search by using std::find.
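Note that std::find would still scan linearly; since feelimits is sorted, a binary search via std::upper_bound is the sub-linear option. A hypothetical helper, not the patch's code, assuming feeperbyte >= feelimits[0]:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Returns the index of the highest limit <= feeperbyte.
// Precondition: feelimits is sorted ascending and feeperbyte >= feelimits[0],
// otherwise the subtraction below would underflow.
size_t BucketIndex(const std::vector<uint64_t>& feelimits, uint64_t feeperbyte)
{
    // First limit strictly greater than feeperbyte; the bucket is the one before it.
    auto it = std::upper_bound(feelimits.begin(), feelimits.end(), feeperbyte);
    return static_cast<size_t>(it - feelimits.begin()) - 1;
}
```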

kiminuo (Contributor), Mar 6, 2021:

> Correct me if I'm wrong but if feelimits is sorted then && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1]) is not necessary.

Yes, but then the for loop on line 1532 should run in reverse order: for (int i = feelimits.size() - 1; i >= 0; i--) {

jonatack (Contributor) left a comment:

Concept ACK. Useful addition 👍 . Tested RPC output and help man output. Agree with @promag on the current timestamp. Perhaps go with a name-based argument e.g. fee_histogram=true from the start?


// distribute feerates into feelimits
for (size_t i = 0; i < feelimits.size(); i++) {
if (feeperbyte >= feelimits[i] && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])) {
Contributor:

Would it be efficient to memoize feelimits.size() - 1 ? (if the compiler doesn't optimize it automatically, my C++ is rusty)

Contributor:

If && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1]) can be removed, the dependency on feelimits being sorted would need a regression test.
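A minimal guard for that sortedness dependency might look like this (hypothetical, not part of the patch):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Check that the bucket limits are sorted ascending before the
// single-condition bucket lookup is allowed to rely on it.
bool LimitsSorted(const std::vector<uint64_t>& feelimits)
{
    return std::is_sorted(feelimits.begin(), feelimits.end());
}
```

A functional or unit test could assert this against the hard-coded limits so a future edit cannot silently break the lookup.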

Member:

> Would it be efficient to memoize feelimits.size() - 1 ? (if the compiler doesn't optimize it automatically, my C++ is rusty)

It shouldn't impact performance either way.

DrahtBot (Contributor) commented Apr 18, 2019:

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #17564 (rpc: Use mempool from node context instead of global by MarcoFalke)
  • #16365 (Log RPC parameters (arguments) if -debug=rpcparams by LarryRuane)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@@ -1496,16 +1496,76 @@ UniValue MempoolInfoToJSON(const CTxMemPool& pool)
ret.pushKV("mempoolminfee", ValueFromAmount(std::max(pool.GetMinFee(maxmempool), ::minRelayTxFee).GetFeePerK()));
ret.pushKV("minrelaytxfee", ValueFromAmount(::minRelayTxFee.GetFeePerK()));

if (with_fee_histogram) {
Member:

Maybe move this directly into getmempoolinfo? Or another helper?

jonasschnelli (Author):

I thought about another call, but extending getmempoolinfo with an option for "more data" seems most aligned with other calls, where one can optionally get more extended info.

Member:

I mean just have the code outside this function. The RPC would then call both MempoolInfoToJSON and JSONMempoolInfoAddHistogram (or whatever this code gets called).

kristapsk (Contributor):

Concept ACK / tACK 9ef9325

laanwj (Member) commented Jun 7, 2019:

Concept ACK. This gives me a new warning on build:

/home/user/src/bitcoin/src/rpc/blockchain.cpp:1550:9: warning: acquiring mutex 'pool.cs' that is already held [-Wthread-safety-analysis]
        LOCK(pool.cs);
        ^
/home/user/src/bitcoin/src/sync.h:182:42: note: expanded from macro 'LOCK'
#define LOCK(cs) DebugLock<decltype(cs)> PASTE2(criticalblock, __COUNTER__)(cs, #cs, __FILE__, __LINE__)
                                         ^
/home/user/src/bitcoin/src/sync.h:180:22: note: expanded from macro 'PASTE2'
#define PASTE2(x, y) PASTE(x, y)
                     ^
/home/user/src/bitcoin/src/sync.h:179:21: note: expanded from macro 'PASTE'
#define PASTE(x, y) x ## y
                    ^
<scratch space>:138:1: note: expanded from here
criticalblock23
^
1 warning generated.

jonasschnelli (Author):

Fixed the lock issue.
Rebased.

promag (Member) left a comment:

Looks good, here's a test for your consideration 2509daf 0b6ba66.

@@ -0,0 +1,2 @@
// Add predefined macros for your project here. For example:
Member:

Remove these files and maybe update .gitignore?


DrahtBot (Contributor):

Needs rebase

info_sub.pushKV("from_feerate", feelimits[i]);
info_sub.pushKV("to_feerate", i == feelimits.size() - 1 ? std::numeric_limits<int64_t>::max() : feelimits[i + 1]);
total_fees += fees[i];
info.pushKV(std::to_string(feelimits[i]), info_sub);
Member:

Should change this to use ToString

NicolasDorier (Contributor):

Concept ACK, it would be super useful

dergoegge (Member) left a comment:

Concept ACK. This would be useful.

I tested the RPC and the help man output, and created a plot of the output from my node: mempool.png
If anyone else wants to try this, I used this script: https://gist.github.com/dergoegge/ec73e0de2d858d2e75bf31a6a7b3e6b2

@jonasschnelli I also rebased this for you here: https://github.com/dergoegge/bitcoin/tree/histogram_rebase

}
}
CAmount total_fees = 0; //track total amount of available fees in mempool
UniValue info(UniValue::VOBJ);
Member:

JSON objects are unordered collections, so maybe using an array for "fee_histogram" would make more sense since it would always stay sorted.
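For illustration only, an array-based fee_histogram (a hypothetical layout, reusing figures from the sample output later in this thread) would keep the groups ordered without relying on key order:

```json
{
  "fee_histogram": [
    {"from_feerate": 1, "to_feerate": 2, "sizes": 6615, "count": 38, "fees": 7817},
    {"from_feerate": 2, "to_feerate": 3, "sizes": 1553, "count": 5, "fees": 3852}
  ],
  "total_fees": 833952
}
```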

shesek (Contributor) commented Jan 19, 2021:

Concept ACK, I will be using this if available for both bwt and esplora/electrs. Electrum Personal Server can also benefit from it.

kiminuo (Contributor) commented Mar 6, 2021:

@jonasschnelli This https://github.com/kiminuo/bitcoin/tree/feature/2021-03-Feerate-histogram is an attempt to do the rebase work and apply a few review comments:

Applied review comments

Test commands

$ ./bitcoin-cli -testnet getmempoolinfo true # To test the new behavior
$ test/functional/test_runner.py mempool_fee_histogram.py # To run the new test
$ ./bitcoin-cli -testnet help getmempoolinfo # bitcoind has to run for this command to succeed :(
getmempoolinfo ( with_fee_histogram )

Returns details on the active state of the TX memory pool.

Arguments:
1. with_fee_histogram    (boolean, optional, default=false) True for including the fee histogram in the response

Result:
{                            (json object)
  "loaded" : true|false,     (boolean) True if the mempool is fully loaded
  "size" : n,                (numeric) Current tx count
  "bytes" : n,               (numeric) Sum of all virtual transaction sizes as defined in BIP 141. Differs from actual serialized size because witness data is discounted
  "usage" : n,               (numeric) Total memory usage for the mempool
  "total_fee" : n,           (numeric) Total fees for the mempool in BTC, ignoring modified fees through prioritizetransaction
  "maxmempool" : n,          (numeric) Maximum memory usage for the mempool
  "mempoolminfee" : n,       (numeric) Minimum fee rate in BTC/kB for tx to be accepted. Is the maximum of minrelaytxfee and minimum mempool fee
  "minrelaytxfee" : n,       (numeric) Current minimum relay fee for transactions
  "unbroadcastcount" : n,    (numeric) Current number of transactions that haven't passed initial broadcast yet
  "fee_histogram" : {        (json object)
    "<feerate-group>" : {    (json object) Object per feerate group
      "sizes" : n,           (numeric) Cumulated size of all transactions in feerate group
      "count" : n,           (numeric) Amount of transactions in feerate group
      "fees" : n,            (numeric) Cumulated fee of all transactions in feerate group
      "from_feerate" : n,    (numeric) Group contains transactions with feerates equal to or greater than this value
      "to_feerate" : n       (numeric) Group contains transactions with feerates less than this value
    },
    "total_fees" : n         (numeric) Total available fees in mempool
  }
}

Examples:
> bitcoin-cli getmempoolinfo
> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id": "curltest", "method": "getmempoolinfo", "params": []}' -H 'content-type: text/plain;' http://127.0.0.1:8332/

Output on testnet (2021-03-07)

./bitcoin-cli -testnet getmempoolinfo true
JSON output
  {
    "loaded": true,
    "size": 73,
    "bytes": 19620,
    "usage": 108816,
    "total_fee": 0.00833952,
    "maxmempool": 300000000,
    "mempoolminfee": 0.00001000,
    "minrelaytxfee": 0.00001000,
    "unbroadcastcount": 0,
    "fee_histogram": {
      "1": {
        "sizes": 6615,
        "count": 38,
        "fees": 7817,
        "from_feerate": 1,
        "to_feerate": 2
      },
      "2": {
        "sizes": 1553,
        "count": 5,
        "fees": 3852,
        "from_feerate": 2,
        "to_feerate": 3
      },
      "3": {
        "sizes": 251,
        "count": 2,
        "fees": 784,
        "from_feerate": 3,
        "to_feerate": 4
      },
      "4": {
        "sizes": 285,
        "count": 2,
        "fees": 1356,
        "from_feerate": 4,
        "to_feerate": 5
      },
      "5": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 5,
        "to_feerate": 6
      },
      "6": {
        "sizes": 166,
        "count": 1,
        "fees": 1130,
        "from_feerate": 6,
        "to_feerate": 7
      },
      "7": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 7,
        "to_feerate": 8
      },
      "8": {
        "sizes": 225,
        "count": 1,
        "fees": 2000,
        "from_feerate": 8,
        "to_feerate": 10
      },
      "10": {
        "sizes": 168,
        "count": 1,
        "fees": 1808,
        "from_feerate": 10,
        "to_feerate": 12
      },
      "12": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 12,
        "to_feerate": 14
      },
      "14": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 14,
        "to_feerate": 17
      },
      "17": {
        "sizes": 1581,
        "count": 1,
        "fees": 31200,
        "from_feerate": 17,
        "to_feerate": 20
      },
      "20": {
        "sizes": 332,
        "count": 2,
        "fees": 8040,
        "from_feerate": 20,
        "to_feerate": 25
      },
      "25": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 25,
        "to_feerate": 30
      },
      "30": {
        "sizes": 2037,
        "count": 4,
        "fees": 64410,
        "from_feerate": 30,
        "to_feerate": 40
      },
      "40": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 40,
        "to_feerate": 50
      },
      "50": {
        "sizes": 2768,
        "count": 4,
        "fees": 143913,
        "from_feerate": 50,
        "to_feerate": 60
      },
      "60": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 60,
        "to_feerate": 70
      },
      "70": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 70,
        "to_feerate": 80
      },
      "80": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 80,
        "to_feerate": 100
      },
      "100": {
        "sizes": 1079,
        "count": 7,
        "fees": 110042,
        "from_feerate": 100,
        "to_feerate": 120
      },
      "120": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 120,
        "to_feerate": 140
      },
      "140": {
        "sizes": 1998,
        "count": 3,
        "fees": 300000,
        "from_feerate": 140,
        "to_feerate": 170
      },
      "170": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 170,
        "to_feerate": 200
      },
      "200": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 200,
        "to_feerate": 250
      },
      "250": {
        "sizes": 371,
        "count": 1,
        "fees": 100000,
        "from_feerate": 250,
        "to_feerate": 300
      },
      "300": {
        "sizes": 191,
        "count": 1,
        "fees": 57600,
        "from_feerate": 300,
        "to_feerate": 400
      },
      "400": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 400,
        "to_feerate": 500
      },
      "500": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 500,
        "to_feerate": 600
      },
      "600": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 600,
        "to_feerate": 700
      },
      "700": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 700,
        "to_feerate": 800
      },
      "800": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 800,
        "to_feerate": 1000
      },
      "1000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 1000,
        "to_feerate": 1200
      },
      "1200": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 1200,
        "to_feerate": 1400
      },
      "1400": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 1400,
        "to_feerate": 1700
      },
      "1700": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 1700,
        "to_feerate": 2000
      },
      "2000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 2000,
        "to_feerate": 2500
      },
      "2500": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 2500,
        "to_feerate": 3000
      },
      "3000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 3000,
        "to_feerate": 4000
      },
      "4000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 4000,
        "to_feerate": 5000
      },
      "5000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 5000,
        "to_feerate": 6000
      },
      "6000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 6000,
        "to_feerate": 7000
      },
      "7000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 7000,
        "to_feerate": 8000
      },
      "8000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 8000,
        "to_feerate": 10000
      },
      "10000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 10000,
        "to_feerate": 9223372036854775807
      },
      "total_fees": 833952
    }
  }

kiminuo (Contributor) commented Mar 12, 2021:

I have forked this PR: #21422 and I'm willing to continue working on that.

UniValue info(UniValue::VOBJ);
for (size_t i = 0; i < feelimits.size(); i++) {
UniValue info_sub(UniValue::VOBJ);
info_sub.pushKV("sizes", sizes[i]);
Contributor:

What is sizes? I've noticed that when the mempool is full and I add up all the sizes, the total doesn't stay fixed as I would have expected, so it's not the number of bytes used in memory to store the txs. So, what is it?

Contributor:

So the histogram is based on fee rate intervals and is modeled using three vectors:

        std::vector<uint64_t> sizes(feelimits.size(), 0);
        std::vector<uint64_t> count(feelimits.size(), 0);
        std::vector<uint64_t> fees(feelimits.size(), 0);

where sizes[0] represents the cumulative size of txs in the first fee rate interval [1, 2), sizes[1] the cumulative size of txs in the second interval [2, 3), etc.

Line 1527 shows that we add size to sizes[i], where size is defined as int size = (int)e.GetTxSize(); and GetTxSize is defined as:

size_t CTxMemPoolEntry::GetTxSize() const
{
    return GetVirtualTransactionSize(nTxWeight, sigOpCost);
}

What is the virtual size of a transaction? This is explained here: https://bitcoin.stackexchange.com/questions/92689/how-is-the-size-of-a-bitcoin-transaction-calculated.

HTH!

promo: You may have a look at #21422 too. :)
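The virtual size mentioned above can be sketched as the transaction weight divided by 4, rounded up. This is a simplified illustration; Core's GetVirtualTransactionSize additionally factors in sigop cost, which is omitted here:

```cpp
#include <cstdint>

// BIP 141: virtual size = weight / 4, rounded up.
// (Simplified: the sigop-cost adjustment Core applies is left out.)
int64_t VirtualSize(int64_t weight)
{
    constexpr int64_t WITNESS_SCALE_FACTOR = 4;
    return (weight + WITNESS_SCALE_FACTOR - 1) / WITNESS_SCALE_FACTOR;
}
```

This is why the summed sizes shrink as low-feerate txs are evicted from a full mempool: they measure vbytes of the transactions themselves, not the pool's memory usage.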

kiminuo (Contributor) commented Apr 14, 2021:

I'm chasing "concept ACK"s for the reborn version of this PR - namely #21422.

Any other feedback is welcome, I have time to do modifications if needed to increase the chance of getting the PR to be merged.

rebroad (Contributor) commented Apr 18, 2021:

@jonasschnelli the simple plot example doesn't display (seems the website is down).

kiminuo (Contributor) commented Apr 18, 2021:

@rebroad The link redirects to https://bitcoin.jonasschnelli.ch/mempool-histogram/ but there is a bug (missing slash). Anyway, there are no data at the moment.

fanquake (Member):

Closing this, given it's been taken over in #21422.

@fanquake closed this Aug 18, 2021.
The bitcoin repository locked this conversation as resolved and limited it to collaborators Aug 18, 2022.