RPC: Introduce getblockstats to plot things #10757
It returns per block statistics about several things. It should be easy to add more if people think of other things to add or remove some if I went too far (but once written, why not keep it? EDIT: answer: not to test or maintain them).
The currently available options are: minfee,maxfee,totalfee,minfeerate,maxfeerate,avgfee,avgfeerate,txs,ins,outs (EDIT: see updated list in the rpc call documentation)
For the x axis, one can use height or block.nTime (I guess I could add mediantime if there's interest [EDIT: nobody showed interest but I implemented mediantime nonetheless, in fact there's no distinction between x or y axis anymore, that's for the caller to judge]).
To calculate fees, -txindex is required.
referenced this pull request
Jul 6, 2017
Some ideas for additions:
I would prefer to see both
Should we return non-independent fields, such as
I find that for bitcoin-related data, the median is often more useful than the average of a distribution. Including
Because more code => more bugs and more maintenance effort. I prefer:
If it's not really needed, why add it?
This is perhaps a nice-to-have, but since #8704,
Is there a compelling use-case I'm missing here? This seems like a feature only a small subset of users would be interested in, in which case an offline tool seems more appropriate.
Sorry - not meaning to be negative, but my default reaction to new RPCs/arguments tends towards NACK unless I can see a compelling and widespread use-case.
This code pulls each transaction input's previous outpoint in order to compute transaction fees. Replicating that in RPC would require thousands of calls for most blocks.
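To make the cost concrete, here is a minimal Python sketch of the fee calculation (not the PR's C++ code). The names `tx_fee`, `block_fees` and `prev_outputs`, and the dictionary layout, are hypothetical and chosen only for illustration. The point taken from the comment above is that every input needs a lookup of the output it spends.

```python
def tx_fee(tx, prev_outputs):
    """Fee in satoshis: value of the spent outputs minus value of the new outputs.

    `prev_outputs` is a hypothetical mapping (txid, vout index) -> value in satoshis.
    Over RPC, filling it would take one lookup per input.
    """
    in_value = sum(prev_outputs[(txin["txid"], txin["vout"])] for txin in tx["vin"])
    out_value = sum(tx["vout"])  # assumed here to be a plain list of output values
    return in_value - out_value


def block_fees(txs, prev_outputs):
    """Per-transaction fees for a block, skipping the coinbase (first transaction)."""
    return [tx_fee(tx, prev_outputs) for tx in txs[1:]]
```

With thousands of inputs in a typical block, the lookup inside `tx_fee` is the part that dominates when done from outside the node.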
Ah yes, of course.
EDIT: I'm going to reverse myself again: I don't think +700 lines is worthwhile for something with limited usage for most users. I'm -0 on this.
Sure, but I mean, removing for example the avgfee or avgfeerate won't save much code or testing code, just a few lines. Forget I said this; if there are specific functions to remove because nobody will want them, let's remove those and focus on the ones people want. Adding specific things only a few people want can also happen in their own branches, so it's no big deal.
The only use case is gathering statistics, presumably to plot things and create charts. That is, at least, compelling to me, but I don't think it will have widespread usage. I also don't think all rpc calls have it. Is getchaintxstats, for example, a widespread use case?
If that's enough reason not to merge this, it's fine, I can maintain it as a separate branch that I periodically rebase, it is simple enough, so that won't be a big deal. On the other hand, if I can get it reviewed and merged it'll be less work for me in the long run and I also get the review.
Mhmm, it would be simpler to calculate here from start to end than from genesis. But it's pretty trivial to write a function in any language that returns the total supply for a given height without access to any historic data. Unless you are talking about discounting op_return outputs or something like that. I don't think this is very interesting here. Perhaps that can be done in getchaintxstats?
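As a rough illustration of how little is needed, here is a Python sketch assuming mainnet parameters (50 BTC initial subsidy, halving every 210,000 blocks). The function names are hypothetical, and the result is the scheduled issuance only: it does not discount unspendable outputs or subsidy that miners failed to claim.

```python
COIN = 100_000_000          # satoshis per bitcoin
HALVING_INTERVAL = 210_000  # mainnet halving interval, in blocks


def block_subsidy(height):
    """Scheduled subsidy in satoshis for the block at `height`."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return (50 * COIN) >> halvings


def total_supply(height):
    """Sum of scheduled subsidies for blocks 0..height inclusive, in satoshis."""
    full_eras, remainder = divmod(height + 1, HALVING_INTERVAL)
    total = 0
    for era in range(full_eras):
        # Every block in a completed era pays the same subsidy.
        total += HALVING_INTERVAL * block_subsidy(era * HALVING_INTERVAL)
    # Blocks in the current, partially completed era.
    total += remainder * block_subsidy(full_eras * HALVING_INTERVAL)
    return total
```

No block data is read at any point, which is why this seems a poor fit for a per-block statistics call.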
In fact I'm using weight for everything. I should s/size/weight/ and probably also show size separately.
Yeah, the mediantime takes a little bit longer to be calculated but not much and one can always disable anything. In fact, the height and time shouldn't be treated in any special way for being "the x axis" and should be allowed to be disabled like the rest.
This is a good question. This is mostly what I meant by "why not if it's this easy?".
re median: yeah, that sounds interesting too, good idea!
I was thinking of the more trivial version, rather than the
Partial review, suggestion to use
getperblockstats to just
Thanks again for the great feedback!
@clarkmoody I think I added most of your suggestions, explicitly excluding anything that involved accumulations neither from height=1 nor from height=start.
And for the rest of the redundancies, @jnewbery and @clarkmoody - thanks again for pointing it out -, it's never too late to remove them before merging like a trivial squash and it's never too soon to start saying which ones you would bikesay* out first. Also bikesay the names for the curves and even the order in the list (duplicated for c++ and python).
In the meantime, I embraced redundancy since, as said, it will be trivial for me to remove later. The same goes for the pertinent optimizations to skip calculations when plot_values.count("minfee") == 0, or rather only when the extra calculation is more expensive than searching in plot_values, which is a set of strings.
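The skipping idea can be sketched in Python as follows. This is an illustration only, not the PR's C++ code: `compute_fee_stats` and `requested` are hypothetical names, with `requested` playing the role of the plot_values set.

```python
def compute_fee_stats(fees, requested):
    """Return only the requested statistics for a list of per-transaction fees.

    `requested` is a set of stat names; a membership test on a set is cheap,
    so guarding a calculation only pays off when the calculation costs more
    than the test itself.
    """
    stats = {}
    if not fees:
        return stats
    if "minfee" in requested:
        stats["minfee"] = min(fees)
    if "maxfee" in requested:
        stats["maxfee"] = max(fees)
    if "totalfee" in requested or "avgfee" in requested:
        total = sum(fees)  # shared by both stats, computed once
        if "totalfee" in requested:
            stats["totalfee"] = total
        if "avgfee" in requested:
            stats["avgfee"] = total // len(fees)  # integer average, truncated
    return stats
```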
For example, we have blockfees, reward, subsidy, complying with consensus rule
Which one seems bikesaying in principle. But not in this case.
But it is more interesting to propose new ones than to rename or vote for removal IMO. I believe the most interesting addition to this point was utxo_size_inc, which would welcome some review from people who measure sizes more carefully, like @sipa, since this intentionally doesn't use GetSerializeSize for Coin, independently of the optimization to read Coin if available in the utxo before calling RpcGetTx. I'm still not sure what to do with pre/post segwit feerates. Does anybody care about the pre ones? Which one needs the scale factor? None?
REM CalculateTruncatedMedian doesn't need to be a template at this point, but there's no harm being static IMO
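For readers unfamiliar with the term, a truncated median can be sketched in Python like this. It is an assumption about what the C++ helper does (sort, take the middle element, and for an even count average the two middle elements with integer division), not a copy of it; `truncated_median` is a hypothetical name.

```python
def truncated_median(values):
    """Median of a list of integers, truncating the average of the two
    middle elements when the count is even. Returns 0 for an empty list.

    Note: `//` floors in Python, which matches C++ truncation only for
    non-negative values such as fees and sizes.
    """
    if not values:
        return 0
    ordered = sorted(values)  # does not modify the caller's list
    mid = len(ordered) // 2
    if len(ordered) % 2 == 0:
        return (ordered[mid - 1] + ordered[mid]) // 2
    return ordered[mid]
```

For example, `truncated_median([1, 2, 3, 4])` gives 2 rather than 2.5, which keeps every statistic an integer.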
EDIT: still some TODOs, mostly documentation and pending decisions
changed the title from "RPC: Introduce getperblockstats to plot things" to "RPC: Introduce getblockstats to plot things"
Jul 11, 2017
Without the documentation for the result it was impossible to distinguish a weird choice meant to spark discussion from an implementation mistake. Removed the other TODO comments.
More cleanups can be done, especially in the tests, if we go further with #10757 (comment) and do not calculate in inverse order (there's no point if we don't get the slight optimization).
Reversed the order of the values to the natural one since, as discussed, the optimization of fetching the blocks in reverse order is not worth the loss in clarity of the code.
Perhaps a better name for <stat>_old is <stat>_virtual, _virt or _v. Or perhaps prepend it with "v" just like the tx size in the output of
I just finished calling
Btw, if anyone is interested in the dataset I can share it. Just convo me at freenode irc (nick: "trippysalmon"). It includes some other stats as well, like rolling average hashrates.
@ajtowns now it seems to work 24 hours after generating the data
@TheBlueMatt yeah, it looks simpler now without using CFeeRate or CFeeRate::GetTruncatedFee. Thanks
Independently of that, if we want more precision for feerates (say, move from sat/vbyte to sat/vKB or whatever), now is the right time to decide.
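A small Python sketch of why the unit matters when feerates are integers. The function names are hypothetical, and the virtual size is assumed to be the weight divided by the witness scale factor of 4, rounded up.

```python
WITNESS_SCALE_FACTOR = 4


def vsize_from_weight(weight):
    """Virtual size in vbytes, assumed here to be ceil(weight / 4)."""
    return (weight + WITNESS_SCALE_FACTOR - 1) // WITNESS_SCALE_FACTOR


def feerate_sat_per_vbyte(fee_sat, weight):
    """Integer feerate in satoshis per virtual byte (truncated)."""
    return fee_sat // vsize_from_weight(weight)


def feerate_sat_per_kvb(fee_sat, weight):
    """Integer feerate in satoshis per 1000 virtual bytes (truncated)."""
    return fee_sat * 1000 // vsize_from_weight(weight)
```

A 250 satoshi fee on a transaction of weight 561 (141 vbytes) reports as 1 sat/vbyte but as 1773 sat per 1000 vbytes, so the larger unit keeps three more digits of the same ratio.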