Add feerate histogram to getmempoolinfo #15836

Closed
wants to merge 2 commits into from

Conversation

@jonasschnelli
Copy link
Member

@jonasschnelli jonasschnelli commented Apr 17, 2019

This follows the approach of adding statistical information to Bitcoin Core that would otherwise be inefficient to calculate outside of the codebase.

Adds an optional feerate histogram to getmempoolinfo.

The concept and code are heavily inspired by the stats @jhoenicke runs (https://github.com/jhoenicke/mempool).

If someone has a good idea how to make the feerate-groups dynamic but also semi-constant for similar fee environments, please comment.

If this is a feature we'd like to have in master (concept ACKs), I'd continue this by writing tests.

A simple plot of the data is here.
RPC output sample is here.

@jonasschnelli jonasschnelli force-pushed the jonasschnelli:2019/04/feeinfo branch from 2f27f15 to 80fbf80 Apr 17, 2019
Copy link
Member

@promag promag left a comment

Concept ACK.

I think that out-of-the-box we can expose some stats like this.

I think it may be useful to include the current timestamp in the response - some client can just run a cron script to call getmempoolinfo and store it (without changing the JSON response).

throw std::runtime_error(
RPCHelpMan{"getmempoolinfo",
"\nReturns details on the active state of the TX memory pool.\n",
{},
{
{"with_fee_histogram", RPCArg::Type::BOOL, /* default */ "false", "True for including the fee histogram in the response"},

@promag

promag Apr 17, 2019
Member

Instead of this parameter, it could have fee_histogram_bins (defaulting to [], which means no histogram is included in the response). This would replace the above feelimits and also avoid breaking client implementations.

std::vector<uint64_t> count(feelimits.size(), 0);
std::vector<uint64_t> fees(feelimits.size(), 0);

LOCK(pool.cs);

@promag

promag Apr 17, 2019
Member

I believe we should move this up (done in #15474).

@promag

promag Apr 17, 2019
Member

That pull was merged, please rebase and remove this lock.

@@ -629,7 +639,8 @@ static const struct {
     {"/rest/block/notxdetails/", rest_block_notxdetails},
     {"/rest/block/", rest_block_extended},
     {"/rest/chaininfo", rest_chaininfo},
-    {"/rest/mempool/info", rest_mempool_info},
+    {"/rest/mempool/info", rest_mempool_info_basic},
+    {"/rest/mempool/info/with_fee_histogram", rest_mempool_info_with_fee_histogram},

@promag

promag Apr 17, 2019
Member

Can't we just start to use query parameters?

@jonasschnelli

jonasschnelli Apr 17, 2019
Author Member

Would eventually be better, but it is out of scope for this PR (which follows the current scheme).

@laanwj

laanwj Jun 7, 2019
Member

Still voting against query parameters. With REST the general preference seems to be to turn parameters into URL segments, and query parameters tend to be avoided because they look ugly and are hard to remember.

@promag

promag Jun 7, 2019
Member

I don't think there's a "standard" here but with REST usually the URL path identifies a resource, a collection of resources, or an action - the verb is also relevant. But parameters are usually set in the URL query, order independent and can be optional. I also think this is more flexible, for instance, you could support ...?verbose=true in all endpoints (just an example).

@promag

promag Jul 17, 2019
Member

This must be before the above line (order is important) otherwise rest_mempool_info_with_fee_histogram is never called.
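
The ordering concern can be illustrated with a minimal sketch. It assumes the dispatcher walks the table in order and picks the first entry whose prefix matches the request path; `Route` and `Dispatch` are hypothetical names for illustration, not the actual REST server code.

```cpp
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for a REST dispatch table: (URL prefix, handler name).
using Route = std::pair<std::string, std::string>;

// Return the handler of the first route whose prefix matches the start of
// `path`, or an empty string if none matches. First match wins, so a short
// prefix listed first shadows every longer prefix that begins with it.
std::string Dispatch(const std::vector<Route>& routes, std::string_view path)
{
    for (const auto& route : routes) {
        const std::string& prefix = route.first;
        if (path.substr(0, prefix.size()) == prefix) return route.second;
    }
    return {};
}
```

Under that assumption, listing "/rest/mempool/info" before "/rest/mempool/info/with_fee_histogram" sends every histogram request to the basic handler, while the reverse order routes both correctly.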


// distribute feerates into feelimits
for (size_t i = 0; i < feelimits.size(); i++) {
    if (feeperbyte >= feelimits[i] && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])) {

@promag

promag Apr 17, 2019
Member

Correct me if I'm wrong but if feelimits is sorted then && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1]) is not necessary.

@promag

promag Apr 17, 2019
Member

Besides, it could avoid linear search by using std::find.
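
A note on the suggestion above: std::find is itself a linear scan, so a binary search over the sorted limits would more likely use std::upper_bound. The following is a minimal sketch under that assumption; `BucketIndex` is a hypothetical helper, not code from this PR, and it requires `limits` to be sorted in ascending order.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Return the index i such that limits[i] <= feerate < limits[i + 1]
// (the last bucket is open-ended), or std::nullopt if feerate is below
// limits[0] or limits is empty.
// Precondition: `limits` is sorted in ascending order.
std::optional<std::size_t> BucketIndex(const std::vector<uint64_t>& limits, uint64_t feerate)
{
    // upper_bound performs a binary search for the first limit strictly
    // greater than feerate; the bucket is the element just before it.
    const auto it = std::upper_bound(limits.begin(), limits.end(), feerate);
    if (it == limits.begin()) return std::nullopt;
    return static_cast<std::size_t>(it - limits.begin()) - 1;
}
```

With only a few dozen limits the difference to a linear scan is probably negligible, so this is mostly a readability choice.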

@kiminuo

kiminuo Mar 6, 2021
Contributor

Correct me if I'm wrong but if feelimits is sorted then && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1]) is not necessary.

Yes, but then the for loop on line 1532 should run in reverse order: for (int i = feelimits.size() - 1; i >= 0; i--) {
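
To make the point concrete, here is a small sketch comparing the two scan directions once the upper-bound check is dropped. Both helpers are hypothetical and assume `limits` is sorted in ascending order; they also assume the loop stops at the first match.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Forward scan with the simplified condition: the first limit that is
// <= feerate is always limits[0] (when feerate >= limits[0]), so every
// transaction lands in the lowest bucket. Returns -1 if nothing matches.
int ForwardFirstMatch(const std::vector<uint64_t>& limits, uint64_t feerate)
{
    for (std::size_t i = 0; i < limits.size(); i++) {
        if (feerate >= limits[i]) return static_cast<int>(i);
    }
    return -1;
}

// Reverse scan with the same condition: the first limit that is <= feerate,
// seen from the top, is the correct bucket. A signed index is used so that
// the loop terminates; an unsigned index would wrap around at zero.
int BucketIndexReverse(const std::vector<uint64_t>& limits, uint64_t feerate)
{
    for (int i = static_cast<int>(limits.size()) - 1; i >= 0; i--) {
        if (feerate >= limits[static_cast<std::size_t>(i)]) return i;
    }
    return -1;
}
```

For limits {1, 2, 3} and a feerate of 2, the forward variant reports bucket 0 while the reverse variant reports bucket 1.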

Copy link
Member

@jonatack jonatack left a comment

Concept ACK. Useful addition 👍 . Tested RPC output and help man output. Agree with @promag on the current timestamp. Perhaps go with a name-based argument e.g. fee_histogram=true from the start?


// distribute feerates into feelimits
for (size_t i = 0; i < feelimits.size(); i++) {
    if (feeperbyte >= feelimits[i] && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])) {

@jonatack

jonatack Apr 17, 2019
Member

Would it be efficient to memoize feelimits.size() - 1 ? (if the compiler doesn't optimize it automatically, my C++ is rusty)

@jonatack

jonatack Apr 17, 2019
Member

If && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1]) can be removed, the dependency on feelimits being sorted would need a regression test.

@promag

promag Apr 18, 2019
Member

Would it be efficient to memoize feelimits.size() - 1 ? (if the compiler doesn't optimize it automatically, my C++ is rusty)

It shouldn't impact performance either way.

@DrahtBot
Copy link
Contributor

@DrahtBot DrahtBot commented Apr 18, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #17564 (rpc: Use mempool from node context instead of global by MarcoFalke)
  • #16365 (Log RPC parameters (arguments) if -debug=rpcparams by LarryRuane)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@@ -1496,16 +1496,76 @@ UniValue MempoolInfoToJSON(const CTxMemPool& pool)
ret.pushKV("mempoolminfee", ValueFromAmount(std::max(pool.GetMinFee(maxmempool), ::minRelayTxFee).GetFeePerK()));
ret.pushKV("minrelaytxfee", ValueFromAmount(::minRelayTxFee.GetFeePerK()));

if (with_fee_histogram) {

@luke-jr

luke-jr Apr 18, 2019
Member

Maybe move this directly into getmempoolinfo? Or another helper?

@jonasschnelli

jonasschnelli Apr 24, 2019
Author Member

I thought about another call, but extending mempoolinfo with an option for "more data" seems most aligned with other calls where one can get more extended info via an option.

@luke-jr

luke-jr Apr 24, 2019
Member

I mean just have the code outside this function. The RPC would then call both MempoolInfoToJSON and also JSONMempoolInfoAddHistogram (or whatever this code gets called)

@jonasschnelli jonasschnelli force-pushed the jonasschnelli:2019/04/feeinfo branch from 80fbf80 to c97a9dd Apr 24, 2019
@jonasschnelli jonasschnelli force-pushed the jonasschnelli:2019/04/feeinfo branch from c97a9dd to 5d47656 May 9, 2019
@kristapsk
Copy link
Contributor

@kristapsk kristapsk commented May 14, 2019

Concept ACK / tACK 9ef9325

@laanwj
Copy link
Member

@laanwj laanwj commented Jun 7, 2019

Concept ACK. This gives me a new warning on build:

/home/user/src/bitcoin/src/rpc/blockchain.cpp:1550:9: warning: acquiring mutex 'pool.cs' that is already held [-Wthread-safety-analysis]
        LOCK(pool.cs);
        ^
/home/user/src/bitcoin/src/sync.h:182:42: note: expanded from macro 'LOCK'
#define LOCK(cs) DebugLock<decltype(cs)> PASTE2(criticalblock, __COUNTER__)(cs, #cs, __FILE__, __LINE__)
                                         ^
/home/user/src/bitcoin/src/sync.h:180:22: note: expanded from macro 'PASTE2'
#define PASTE2(x, y) PASTE(x, y)
                     ^
/home/user/src/bitcoin/src/sync.h:179:21: note: expanded from macro 'PASTE'
#define PASTE(x, y) x ## y
                    ^
<scratch space>:138:1: note: expanded from here
criticalblock23
^
1 warning generated.

@jonasschnelli jonasschnelli force-pushed the jonasschnelli:2019/04/feeinfo branch from 5d47656 to 2f97b31 Jul 17, 2019
@jonasschnelli
Copy link
Member Author

@jonasschnelli jonasschnelli commented Jul 17, 2019

Fixed the lock issue.
Rebased.

@jonasschnelli jonasschnelli force-pushed the jonasschnelli:2019/04/feeinfo branch from 2f97b31 to b94292a Jul 17, 2019
Copy link
Member

@promag promag left a comment

Looks good, here's a test for your consideration 2509daf 0b6ba66.

@@ -0,0 +1,2 @@
// Add predefined macros for your project here. For example:

@promag

promag Jul 17, 2019
Member

Remove these files and maybe update .gitignore?

@DrahtBot
Copy link
Contributor

@DrahtBot DrahtBot commented Dec 16, 2019

Needs rebase

info_sub.pushKV("from_feerate", feelimits[i]);
info_sub.pushKV("to_feerate", i == feelimits.size() - 1 ? std::numeric_limits<int64_t>::max() : feelimits[i + 1]);
total_fees += fees[i];
info.pushKV(std::to_string(feelimits[i]), info_sub);

@luke-jr

luke-jr Jun 10, 2020
Member

Should change this to use ToString

@NicolasDorier
Copy link
Contributor

@NicolasDorier NicolasDorier commented Oct 12, 2020

Concept ACK, it would be super useful

Copy link
Contributor

@dergoegge dergoegge left a comment

Concept ACK. This would be useful.

I tested the rpc, help man output and created a plot for the output from my node: mempool.png
If anyone else wants to try this, I used this script: https://gist.github.com/dergoegge/ec73e0de2d858d2e75bf31a6a7b3e6b2

@jonasschnelli I also rebased this for you here: https://github.com/dergoegge/bitcoin/tree/histogram_rebase

}
}
CAmount total_fees = 0; //track total amount of available fees in mempool
UniValue info(UniValue::VOBJ);

@dergoegge

dergoegge Nov 22, 2020
Contributor

JSON objects are unordered collections, so maybe using an array for "fee_histogram" would make more sense since it would always stay sorted.
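
One way the ordering can go wrong on the client side, as a sketch: if a consumer loads the object into a container that sorts by string key (an assumption about the client, not something this PR does), the feerate groups come back in lexicographic rather than numeric order. `KeysInMapOrder` is a hypothetical helper for illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Build string keys the way the histogram does (std::to_string of each
// limit), store them in a std::map, and return the keys in iteration order.
// std::map orders std::string keys lexicographically, not numerically.
std::vector<std::string> KeysInMapOrder(const std::vector<uint64_t>& limits)
{
    std::map<std::string, uint64_t> by_key;
    for (uint64_t limit : limits) by_key.emplace(std::to_string(limit), limit);

    std::vector<std::string> keys;
    for (const auto& entry : by_key) keys.push_back(entry.first);
    return keys;
}
```

For limits 1, 2, 10, 100 the keys iterate as "1", "10", "100", "2". An array of bucket objects, each carrying its own from_feerate, would keep the order explicit.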

@shesek
Copy link
Contributor

@shesek shesek commented Jan 19, 2021

Concept ACK, I will be using this if available for both bwt and esplora/electrs. Electrum Personal Server can also benefit from it.

@kiminuo
Copy link
Contributor

@kiminuo kiminuo commented Mar 6, 2021

@jonasschnelli This https://github.com/kiminuo/bitcoin/tree/feature/2021-03-Feerate-histogram is an attempt to do the rebase work and apply a few review comments:

Applied review comments

  • #15836 (comment) - "Remove these files and maybe update .gitignore?"
  • #15836 (comment) - "This must be before the above line (order is important) otherwise rest_mempool_info_with_fee_histogram is never called."
  • #15836 (comment) - "This gives me a new warning on build: [...]" This is already addressed, I believe.
  • #15836 (comment) - Simplify if (feeperbyte >= feelimits[i] && (i == feelimits.size() - 1 || feeperbyte < feelimits[i + 1])) {
  • 0b6ba66 - Test proposed by @promag

Test commands

$ ./bitcoin-cli -testnet getmempoolinfo true # To test the new behavior
$ test/functional/test_runner.py mempool_fee_histogram.py # To run the new test
$ ./bitcoin-cli -testnet help getmempoolinfo # bitcoind has to run for this command to succeed :(
getmempoolinfo ( with_fee_histogram )

Returns details on the active state of the TX memory pool.

Arguments:
1. with_fee_histogram    (boolean, optional, default=false) True for including the fee histogram in the response

Result:
{                            (json object)
  "loaded" : true|false,     (boolean) True if the mempool is fully loaded
  "size" : n,                (numeric) Current tx count
  "bytes" : n,               (numeric) Sum of all virtual transaction sizes as defined in BIP 141. Differs from actual serialized size because witness data is discounted
  "usage" : n,               (numeric) Total memory usage for the mempool
  "total_fee" : n,           (numeric) Total fees for the mempool in BTC, ignoring modified fees through prioritizetransaction
  "maxmempool" : n,          (numeric) Maximum memory usage for the mempool
  "mempoolminfee" : n,       (numeric) Minimum fee rate in BTC/kB for tx to be accepted. Is the maximum of minrelaytxfee and minimum mempool fee
  "minrelaytxfee" : n,       (numeric) Current minimum relay fee for transactions
  "unbroadcastcount" : n,    (numeric) Current number of transactions that haven't passed initial broadcast yet
  "fee_histogram" : {        (json object)
    "<feerate-group>" : {    (json object) Object per feerate group
      "sizes" : n,           (numeric) Cumulated size of all transactions in feerate group
      "count" : n,           (numeric) Amount of transactions in feerate group
      "fees" : n,            (numeric) Cumulated fee of all transactions in feerate group
      "from_feerate" : n,    (numeric) Group contains transaction with feerates equal or greater than this value
      "to_feerate" : n       (numeric) Group contains transaction with feerates less than than this value
    },
    "total_fees" : n         (numeric) Total available fees in mempool
  }
}

Examples:
> bitcoin-cli getmempoolinfo
> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id": "curltest", "method": "getmempoolinfo", "params": []}' -H 'content-type: text/plain;' http://127.0.0.1:8332/

Output on testnet (2021-03-07)

./bitcoin-cli -testnet getmempoolinfo true
JSON output
  {
    "loaded": true,
    "size": 73,
    "bytes": 19620,
    "usage": 108816,
    "total_fee": 0.00833952,
    "maxmempool": 300000000,
    "mempoolminfee": 0.00001000,
    "minrelaytxfee": 0.00001000,
    "unbroadcastcount": 0,
    "fee_histogram": {
      "1": {
        "sizes": 6615,
        "count": 38,
        "fees": 7817,
        "from_feerate": 1,
        "to_feerate": 2
      },
      "2": {
        "sizes": 1553,
        "count": 5,
        "fees": 3852,
        "from_feerate": 2,
        "to_feerate": 3
      },
      "3": {
        "sizes": 251,
        "count": 2,
        "fees": 784,
        "from_feerate": 3,
        "to_feerate": 4
      },
      "4": {
        "sizes": 285,
        "count": 2,
        "fees": 1356,
        "from_feerate": 4,
        "to_feerate": 5
      },
      "5": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 5,
        "to_feerate": 6
      },
      "6": {
        "sizes": 166,
        "count": 1,
        "fees": 1130,
        "from_feerate": 6,
        "to_feerate": 7
      },
      "7": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 7,
        "to_feerate": 8
      },
      "8": {
        "sizes": 225,
        "count": 1,
        "fees": 2000,
        "from_feerate": 8,
        "to_feerate": 10
      },
      "10": {
        "sizes": 168,
        "count": 1,
        "fees": 1808,
        "from_feerate": 10,
        "to_feerate": 12
      },
      "12": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 12,
        "to_feerate": 14
      },
      "14": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 14,
        "to_feerate": 17
      },
      "17": {
        "sizes": 1581,
        "count": 1,
        "fees": 31200,
        "from_feerate": 17,
        "to_feerate": 20
      },
      "20": {
        "sizes": 332,
        "count": 2,
        "fees": 8040,
        "from_feerate": 20,
        "to_feerate": 25
      },
      "25": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 25,
        "to_feerate": 30
      },
      "30": {
        "sizes": 2037,
        "count": 4,
        "fees": 64410,
        "from_feerate": 30,
        "to_feerate": 40
      },
      "40": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 40,
        "to_feerate": 50
      },
      "50": {
        "sizes": 2768,
        "count": 4,
        "fees": 143913,
        "from_feerate": 50,
        "to_feerate": 60
      },
      "60": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 60,
        "to_feerate": 70
      },
      "70": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 70,
        "to_feerate": 80
      },
      "80": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 80,
        "to_feerate": 100
      },
      "100": {
        "sizes": 1079,
        "count": 7,
        "fees": 110042,
        "from_feerate": 100,
        "to_feerate": 120
      },
      "120": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 120,
        "to_feerate": 140
      },
      "140": {
        "sizes": 1998,
        "count": 3,
        "fees": 300000,
        "from_feerate": 140,
        "to_feerate": 170
      },
      "170": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 170,
        "to_feerate": 200
      },
      "200": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 200,
        "to_feerate": 250
      },
      "250": {
        "sizes": 371,
        "count": 1,
        "fees": 100000,
        "from_feerate": 250,
        "to_feerate": 300
      },
      "300": {
        "sizes": 191,
        "count": 1,
        "fees": 57600,
        "from_feerate": 300,
        "to_feerate": 400
      },
      "400": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 400,
        "to_feerate": 500
      },
      "500": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 500,
        "to_feerate": 600
      },
      "600": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 600,
        "to_feerate": 700
      },
      "700": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 700,
        "to_feerate": 800
      },
      "800": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 800,
        "to_feerate": 1000
      },
      "1000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 1000,
        "to_feerate": 1200
      },
      "1200": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 1200,
        "to_feerate": 1400
      },
      "1400": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 1400,
        "to_feerate": 1700
      },
      "1700": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 1700,
        "to_feerate": 2000
      },
      "2000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 2000,
        "to_feerate": 2500
      },
      "2500": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 2500,
        "to_feerate": 3000
      },
      "3000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 3000,
        "to_feerate": 4000
      },
      "4000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 4000,
        "to_feerate": 5000
      },
      "5000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 5000,
        "to_feerate": 6000
      },
      "6000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 6000,
        "to_feerate": 7000
      },
      "7000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 7000,
        "to_feerate": 8000
      },
      "8000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 8000,
        "to_feerate": 10000
      },
      "10000": {
        "sizes": 0,
        "count": 0,
        "fees": 0,
        "from_feerate": 10000,
        "to_feerate": 9223372036854775807
      },
      "total_fees": 833952
    }
  }
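
The sample output can be cross-checked against the top-level fields. The sketch below copies the non-empty buckets from the output above and sums them; the struct and function names are illustrative only.

```cpp
#include <cstdint>
#include <vector>

// One histogram bucket, with the three per-bucket totals from the output.
struct Bucket {
    uint64_t sizes; // cumulative virtual size of the transactions
    uint64_t count; // number of transactions
    uint64_t fees;  // cumulative fees in satoshis
};

// Non-empty buckets copied from the testnet sample (empty ones add nothing).
const std::vector<Bucket> kSampleBuckets = {
    {6615, 38, 7817},   {1553, 5, 3852},    {251, 2, 784},
    {285, 2, 1356},     {166, 1, 1130},     {225, 1, 2000},
    {168, 1, 1808},     {1581, 1, 31200},   {332, 2, 8040},
    {2037, 4, 64410},   {2768, 4, 143913},  {1079, 7, 110042},
    {1998, 3, 300000},  {371, 1, 100000},   {191, 1, 57600},
};

// Add up sizes, counts and fees across all buckets.
Bucket SumBuckets(const std::vector<Bucket>& buckets)
{
    Bucket total{0, 0, 0};
    for (const Bucket& b : buckets) {
        total.sizes += b.sizes;
        total.count += b.count;
        total.fees += b.fees;
    }
    return total;
}
```

The sums come to 19620 for sizes, 73 for count and 833952 for fees, matching "bytes", "size" and "total_fees" in the sample (and "total_fee" of 0.00833952 BTC).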

@kiminuo
Copy link
Contributor

@kiminuo kiminuo commented Mar 12, 2021

I have forked this PR: #21422 and I'm willing to continue working on that.

UniValue info(UniValue::VOBJ);
for (size_t i = 0; i < feelimits.size(); i++) {
    UniValue info_sub(UniValue::VOBJ);
    info_sub.pushKV("sizes", sizes[i]);

@rebroad

rebroad Apr 6, 2021
Contributor

What is sizes? I've noticed that when adding up all the sizes while the mempool is full, the number doesn't stay fixed as I would have expected, so it's not the number of bytes used in memory to store the txs. So what is it?

@kiminuo

kiminuo Apr 7, 2021
Contributor

So the histogram is based on fee rate intervals. The histogram is modeled using three vectors:

        std::vector<uint64_t> sizes(feelimits.size(), 0);
        std::vector<uint64_t> count(feelimits.size(), 0);
        std::vector<uint64_t> fees(feelimits.size(), 0);

where sizes[0] represents cumulative size of txs belonging to the first fee rate interval [1, 2), sizes[1] represents cumulative size of txs belonging to the second fee rate interval [2, 3), etc.

Line 1527 shows that we add size to sizes[i], where size is defined as int size = (int)e.GetTxSize(); and GetTxSize() in turn is defined as:

size_t CTxMemPoolEntry::GetTxSize() const
{
    return GetVirtualTransactionSize(nTxWeight, sigOpCost);
}

What is virtual size of a transaction? This is explained here: https://bitcoin.stackexchange.com/questions/92689/how-is-the-size-of-a-bitcoin-transaction-calculated.
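
As a rough sketch of the rule: virtual size is the transaction weight divided by four and rounded up, with a sigop-based floor applied first. The function below is a standalone approximation with hypothetical names; the default of 20 bytes per sigop is an assumption about the node's policy setting, not something stated in this thread.

```cpp
#include <algorithm>
#include <cstdint>

// Weight units per virtual byte, as defined in BIP 141.
constexpr int64_t kWitnessScaleFactor = 4;

// Sketch of the virtual-size rule:
//   ceil(max(weight, sigop_cost * bytes_per_sigop) / 4)
// bytes_per_sigop defaults to 20 here, which is an assumed policy value.
int64_t VirtualSize(int64_t weight, int64_t sigop_cost, int64_t bytes_per_sigop = 20)
{
    const int64_t effective_weight = std::max(weight, sigop_cost * bytes_per_sigop);
    // Adding (factor - 1) before the integer division rounds up.
    return (effective_weight + kWitnessScaleFactor - 1) / kWitnessScaleFactor;
}
```

Because witness data counts for fewer weight units than other bytes, the sum of virtual sizes differs from both the serialized size and the in-memory usage, which is why it does not stay pinned when the mempool is full.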

HTH!

promo: You may have a look at #21422 too. :)

@kiminuo
Copy link
Contributor

@kiminuo kiminuo commented Apr 14, 2021

I'm chasing "concept ACK"s for the reborn version of this PR - namely #21422.

Any other feedback is welcome. I have time to make modifications if needed to increase the chance of getting the PR merged.

@rebroad
Copy link
Contributor

@rebroad rebroad commented Apr 18, 2021

@jonasschnelli the simple plot example doesn't display (seems the website is down).

@kiminuo
Copy link
Contributor

@kiminuo kiminuo commented Apr 18, 2021

@rebroad The link redirects to https://bitcoin.jonasschnelli.ch/mempool-histogram/ but there is a bug (missing slash). Anyway, there are no data at the moment.

@fanquake
Copy link
Member

@fanquake fanquake commented Aug 18, 2021

Closing this, given it's been taken over in #21422.

@fanquake fanquake closed this Aug 18, 2021