Add mempool statistics collector #8501


@jonasschnelli
Member
jonasschnelli commented Aug 12, 2016 edited

This PR adds a statistics collector class which aims to collect various types of statistics up to the configurable maximum memory target. At the moment, only mempool statistics will be collected.

Instead of constant polling, I think there should be a core class that manages stats collecting that aims for a very small impact on performance.

Motivation

Adding more statistics and visualization to the GUI would increase its usefulness. To do so, we need stats that are collected even when the visualization is not visible (example: the GUI network graph only draws data while it is visible, which makes it of limited use).

How it works

This PR adds a simple stats manager that does the sample collecting without an additional thread.
Avoiding a thread should reduce lock contention and indirectly improve performance.

Instead, it hooks in at the point where the observed data set is mutated (for the mempool, where transactions are added or removed). This results in no explicit locking for stats collecting. To avoid oversampling, the addSample routine enforces a minimum time delta between samples.

This results in samples with non-fixed timespans, which are still easy to handle during visualization (either draw them non-linearly or interpolate).

Two new startup arguments are added:

  • -statsenable: enable statistics collection (default: disabled)
  • -statsmaxmemorytarget: maximum memory target to use for statistics (default: 10MB)

What's included

This PR does not include UI changes (I'm currently working on those), but it includes a simple RPC dump (free of the cs_main lock).
Adding the RPC stats API allows third-party apps to visualize the data without being responsible for endless polling (which could result in performance issues if you poll too aggressively or from multiple applications).

@isle2983 isle2983 commented on an outdated diff Aug 13, 2016
src/stats/stats.cpp
+ }
+ }
+
+ mempoolSamples_t subset(fromSample, toSample + 1);
+
+ // set the fromTime and toTime pass-by-ref parameters
+ fromTime = mempoolStats.startTime + (*fromSample).timeDelta;
+ toTime = mempoolStats.startTime + (*toSample).timeDelta;
+
+ // return subset
+ return subset;
+ }
+
+ // return all available samples
+ fromTime = mempoolStats.startTime + mempoolStats.vSamples.front().timeDelta;
+ ;
@isle2983
isle2983 Aug 13, 2016 Contributor

is this semicolon unintentional?

@isle2983 isle2983 commented on an outdated diff Aug 14, 2016
src/stats/stats.h
+// Distributed under the MIT software license, see the accompanying
+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
+
+#ifndef BITCOIN_STATS_H
+#define BITCOIN_STATS_H
+
+#include <sync.h>
+
+#include <atomic>
+#include <stdlib.h>
+#include <vector>
+
+#include <boost/signals2/signal.hpp>
+
+struct CStatsMempoolSample {
+ uint32_t timeDelta; //use 32bit time delta to save memmory
@isle2983
isle2983 Aug 14, 2016 Contributor

s/memmory/memory/

@isle2983 isle2983 commented on an outdated diff Aug 14, 2016
src/stats/stats.h
+ static std::atomic<bool> statsEnabled;
+ static CStats* DefaultStats(); //shared instance
+
+ /* signals */
+ boost::signals2::signal<void(void)> MempoolStatsDidChange; //mempool signal
+
+ /* add a mempool stats sample */
+ void addMempoolSample(int64_t txcount, int64_t dynUsage, int64_t currentMinRelayFee);
+
+ /* get all mempool samples (non interpolated) */
+ mempoolSamples_t mempoolGetValuesInRange(uint64_t& fromTime, uint64_t& toTime);
+
+ /* set the target for the maximum memory consuption (in bytes) */
+ void setMaxMemoryUsageTarget(size_t maxMem);
+
+ /* get the statictis module help strings */
@isle2983
isle2983 Aug 14, 2016 Contributor

s/statictis/statistics/

@isle2983 isle2983 commented on an outdated diff Aug 14, 2016
src/stats/stats.h
+ static const size_t DEFAULT_MAX_STATS_MEMORY; //default maximum of memory to use
+ static const bool DEFAULT_STATISTICS_ENABLED; //default value for enabling statistics
+
+ static std::atomic<bool> statsEnabled;
+ static CStats* DefaultStats(); //shared instance
+
+ /* signals */
+ boost::signals2::signal<void(void)> MempoolStatsDidChange; //mempool signal
+
+ /* add a mempool stats sample */
+ void addMempoolSample(int64_t txcount, int64_t dynUsage, int64_t currentMinRelayFee);
+
+ /* get all mempool samples (non interpolated) */
+ mempoolSamples_t mempoolGetValuesInRange(uint64_t& fromTime, uint64_t& toTime);
+
+ /* set the target for the maximum memory consuption (in bytes) */
@isle2983
isle2983 Aug 14, 2016 Contributor

s/consuption/consumption/

@isle2983 isle2983 commented on an outdated diff Aug 14, 2016
src/stats/stats.cpp
+ if (!statsEnabled)
+ return;
+
+ uint64_t now = GetTime();
+ {
+ LOCK(cs_stats);
+
+ // set the mempool stats start time if this is the first sample
+ if (mempoolStats.startTime == 0)
+ mempoolStats.startTime = now;
+
+ // ensure the minimum time delta between samples
+ if (mempoolStats.vSamples.size() && mempoolStats.vSamples.back().timeDelta + SAMPLE_MIN_DELTA_IN_SEC >= now - mempoolStats.startTime)
+ return;
+
+ // calculate the current time detla and add a sample
@isle2983
isle2983 Aug 14, 2016 Contributor

s/detla/delta/

@isle2983 isle2983 commented on an outdated diff Aug 14, 2016
src/stats/stats.cpp
+
+#include "stats/stats.h"
+
+#include "memusage.h"
+#include "utiltime.h"
+
+#include "util.h"
+
+const uint32_t CStats::SAMPLE_MIN_DELTA_IN_SEC = 2;
+const int CStats::CLEANUP_SAMPLES_THRESHOLD = 100;
+size_t CStats::maxStatsMemory = 0;
+const size_t CStats::DEFAULT_MAX_STATS_MEMORY = 10 * 1024 * 1024; //10 MB
+const bool CStats::DEFAULT_STATISTICS_ENABLED = false;
+std::atomic<bool> CStats::statsEnabled(false); //disable stats by default
+
+CStats defaultStats;
@isle2983
isle2983 Aug 14, 2016 Contributor

This 'defaultStats' instance is not used.

@isle2983
Contributor

Hi Jonas,

I understand the concern in your description for the addMempoolSample() stat
bookkeeping designed to be as lightweight as possible in the critical execution
path. However, I have a few (perhaps under-informed, neophyte) questions which
would help me understand the design considerations better:

  1. the comment in rpc_stats.cpp hints that the overhead of the JSON string
    generation is best avoided by using this 'flat' encoding as opposed to some
    encoding like:

{ "fieldNames" : ["delta_in_secs", "tx_count", "dynamic_mem_usage", "min_fee_per_k"], "samples" : [[val1, val2, val3, val4], [val1, val2, val3, val4], ] }

Is the 'flat' encoding strictly needed? Or is there some other concern with
outputting a slightly more convenient format than 'flat'?

  2. It appears possible to set the maximum memory target very high such that
    many, many samples are collected and the overhead of the computation in
    mempoolGetValuesInRange() inside the lock might become onerous (assuming I am
    correctly understanding how the lock works and the implications of holding it
    too long). Have you considered taking a copy of 'mempoolStats' in a way that
    lets you release the lock earlier, and doing the dataset computation outside
    the lock? (Is that even possible under the current execution model?)

Cheers,

Isle

@jonasschnelli
Member

@isle2983 Welcome to github.
Thanks for your feedback and your nitpicks. I really appreciate it, and I'll process them during the next hours.

For your questions/inputs:

  1. My idea with the JSON flat output was to bypass the JSON encoding/decoding overhead. [val1, val2, val3, val4], [val1, val2, val3, val4] should also work. I just thought a single string would result in faster encoding and decoding. But your approach seems to be the better choice, although I'm not sure if we want to use UniValue for encoding or just append strings... maybe we should start with the former and use a more optimized encoding if the JSON overhead becomes a problem.

  2. Yes, that's a good point. Copying the samples vector could result in a memory peak when using a large -statsmaxmemorytarget.

@MarcoFalke MarcoFalke commented on an outdated diff Aug 19, 2016
src/stats/rpc_stats.cpp
+ "\nExamples:\n" +
+ HelpExampleCli("getmempoolstats", "") + HelpExampleRpc("getmempoolstats", ""));
+
+ // get stats from the core stats model
+ uint64_t timeFrom = 0;
+ uint64_t timeTo = 0;
+ mempoolSamples_t samples = CStats::DefaultStats()->mempoolGetValuesInRange(timeFrom, timeTo);
+
+ // use "flat" json encoding for performance reasons
+ std::string flatData;
+ for (struct CStatsMempoolSample& sample : samples) {
+ flatData += std::to_string(sample.timeDelta) + ",";
+ flatData += std::to_string(sample.txCount) + ",";
+ flatData += std::to_string(sample.dynMemUsage) + ",";
+ flatData += std::to_string(sample.minFeePerK) + ",";
+ }
@MarcoFalke
MarcoFalke Aug 19, 2016 Member

Would it make sense to add a line break after each sample's values?

@MarcoFalke
MarcoFalke Aug 19, 2016 Member

Or just add the names of the columns as another entry in the dict.

Otherwise I fail to see how this rpc call is useful.

@isle2983
Contributor

I have been playing around making my own changes on top of these commits (isle2983:getmempoolstats), mostly just to get some hands-on experience with the code and bring my C++ up to par.

But anyway, I made the rpc output of the samples full JSON:

{
  "enabled": true,
  "maxmemorytarget": 10485760,
  "currentusage": 1734416,
  "time_from": 1471573271,
  "time_to": 1471657376,
  "sampleCount": 27131,
  "sampleFieldNames": [
    "timeDelta", 
    "txCount", 
    "dynMemUsage", 
    "minFeePerK"
  ],
  "samples": [
    [
      0, 
      1, 
      1728, 
      0
    ], 
    [
      4, 
      11, 
      15232, 
      0
    ], 
    ...
    (snip)
    ]
}

The JSON 'pretty' print through bitcoin-cli is definitely unwieldy. However, the computational overhead in doing the wrangling doesn't seem so bad.

The 1.7 MB of stat data is from collecting just overnight. With that data, I can pull it off the node, parse and convert the JSON into CSV with a Python script, and plot it in gnuplot in under a second.

$ time myJunk/plotTxCount.sh 

real    0m0.966s
user    0m0.460s
sys     0m0.128s

Not sure how this compares with the Qt GUI branch that is in progress, but it doesn't seem too bad on the face of it.

Also, if getting this info from the node to the UI quickly is a concern, perhaps a denser, binary-like format is worth considering, i.e.:

{"stats_blob":"8b292cf....."}

One could imagine it being more efficient than even the 'flat' format, depending on the sophistication.

@jonasschnelli
Member

Thanks @isle2983 for the testing, benchmarks and improvements.
I have switched to the proposed array format for the samples (rather than the flat structure). A more performant binary format (inside the JSON) would be a hack. More performance would probably be possible over ZMQ... but it's currently a push-only channel.

I also thought again about copying the samples vector before filtering it. I came to the conclusion that it's not worth generating a memory peak (by copying the whole samples vector) in order to allow a faster release of the lock. The filtering should be very fast because it only compares some uint32 values and constructs a new vector from a pair of from/to iterators (which should also perform fast).

@jonasschnelli
Member

Needed rebase.

@jonasschnelli
Member

Rebased.

@MarcoFalke MarcoFalke added this to the 0.14.0 milestone Nov 10, 2016
@MarcoFalke
Member

Assigning "would be nice to have" for 0.14 per meeting today.

@morcos
Contributor
morcos commented Dec 5, 2016

Just saw @gmaxwell's comment on #8550 (which I completely agree with), and it reminded me to look at that PR and this one. Sorry for not getting involved sooner, but I really like the idea. Unfortunately, I can think of many, many stats (dozens and dozens) that we might want to collect, both to show to users in whiz-bangy GUIs and that would be useful for developers and businesses trying to understand unusual behavior on the network.

If we envision that there might be 1 KB of different stats data, then maybe rather than just saving sample data points and trimming them when they hit the memory limit, we should be smart about saving them along different time frames. For instance, we could have second, minute, hour, and day sampling intervals; we could save 1000 points or more on each and still have quite reasonable memory usage, and they could be auto-trimmed... so if you wanted to look at data from hours ago, you couldn't look at it on the second time frame.

@jonasschnelli
Member

@morcos: thanks for the comment. Yes, I completely agree. I think this is a first step, and the current design allows for features like the ones you mentioned.
I once started with interpolating values instead of just trimming the back: you could in theory reduce the "density" of the samples and interpolate the in-between values (up to a point where this still makes sense).
But yes, adding more stats will probably require individual limits and trim behaviours.

@morcos
Contributor
morcos commented Dec 5, 2016

@jonasschnelli Well I guess what I was thinking was that one general framework might fit all stats. You log it with whatever frequency you want. And it's stored in up to 4 different histories (by second, minute, hour, day) and each of those is trimmed to some limit (say 1000 or 2000 data points each). Is there any type of stat that such a general framework might not work well with?

@jonasschnelli
Member

@morcos: the original idea I was trying to follow was to not collect at a fixed frequency: a) to avoid dedicating a thread just to collecting stats samples, and b) to not collect the same value over and over if it is unchanged.
Take the traffic report as an example: if you want to collect traffic stats for all peers, segmented by all available p2p commands, you would probably "lose" plenty of memory by storing samples with identical values.

I had the idea of recording samples in the most restrained way possible: collect lock-free and only when values have changed, recording the corresponding timestamp.
If you want to retrieve data at a fixed frequency/step size, interpolate.

But not sure if this is stupid.

jonasschnelli added some commits Aug 12, 2016
@jonasschnelli jonasschnelli Add mempool statistics collector c0af366
@jonasschnelli jonasschnelli [RPC] Use JSON array for mempool samples b7c021d
@jonasschnelli
Member

Rebased (main split)

@luke-jr
Member
luke-jr commented Dec 10, 2016

c0af366 has unresolved conflicts :(

@paveljanik
Contributor

@luke-jr All checks have passed ;-)

@luke-jr
Member
luke-jr commented Dec 10, 2016

@paveljanik They're removed in the subsequent commit.

@paveljanik
Contributor

Yes (I noticed that), this is why we should take Travis' results as a help only.

@luke-jr luke-jr added a commit to bitcoinknots/bitcoin that referenced this pull request Dec 21, 2016
@jonasschnelli @luke-jr jonasschnelli + luke-jr Add mempool statistics collector
Github-Pull: #8501
Rebased-From: 71c402b
039d1d5
@luke-jr luke-jr added a commit to bitcoinknots/bitcoin that referenced this pull request Dec 21, 2016
@jonasschnelli @luke-jr jonasschnelli + luke-jr [RPC] Use JSON array for mempool samples
Github-Pull: #8501
Rebased-From: 5722e27
2619df5
@morcos
Contributor
morcos commented Jan 5, 2017

@jonasschnelli whatever happened to this plan?

09:19:49 < jonasschnelli> I think you convinced me to do the 1000s 1000m 1000h 1000d approach.
09:19:59 < jonasschnelli> maybe the 1000 is configurable.
09:20:04 < morcos> doesn't matter to me how we do it... i think a delta version could be just as good
09:20:14 < morcos> and you could just be smart about trimming the delta list or something
09:20:28 < morcos> yes, 1000 should be configurable i think... actually maybe isn't enough for a default
09:20:45 < jonasschnelli> also, I liked the configurability of the buffer in MB.
09:20:52 < jonasschnelli> That's what you probably care about.
09:20:56 < jonasschnelli> Not the 1000
09:20:59 < morcos> 1000 secs is just 16 minutes...  you would not want to have to only have 16 data points
09:21:12 < jonasschnelli> You would say, I reserve 300MB for stats.
09:21:37 < jonasschnelli> Right... just en 
09:21:45 < jonasschnelli> just as an example
09:22:14 < jonasschnelli> So,.. you convinced me for high frequency recent and low frequency long time horizon,...
09:22:21 < morcos> ok cool...   any approach that automatically keeps both recent fine-grained and long time horizon bigger step, is fine with me
@jonasschnelli jonasschnelli removed this from the 0.14.0 milestone Jan 12, 2017