Benchmark for calling ConnectBlock on a block with many inputs #2372

Merged
merged 1 commit into zcash:master on May 19, 2017

Conversation


@str4d str4d commented May 12, 2017

Requires placing block-107134.tar.gz (containing the block, and a fake CoinsDB containing its inputs) into the base directory of the repository.

To facilitate generation of the fake CoinsDB, an additional field valueZat has been added to getrawtransaction, containing the integer number of zatoshis instead of a decimal number of ZEC.

Closes #2355.

@str4d str4d added I-dos Problems and improvements with respect to Denial-of-Service. I-performance Problems and improvements with respect to performance I-SECURITY Problems and improvements related to security. labels May 12, 2017
@nathan-at-least nathan-at-least added this to the 1.0.9 milestone May 12, 2017
@str4d str4d force-pushed the 2355-connectblock-bench branch 5 times, most recently from 5992cc0 to 6c02a6c Compare May 16, 2017 03:45
@str4d str4d changed the title [WIP] ConnectBlock benchmark Benchmark for calling ConnectBlock on a block with many inputs May 16, 2017

str4d commented May 16, 2017

Oddly, when I run the time benchmark on my laptop, the time for each round sometimes gets progressively slower across the 10 rounds of the benchmark (a spread of around a second on a baseline of around 8.5s per round). I don't know whether that is influenced by the fact that this test relies on globals.

function extract_benchmark_data {
if [ ! -f "block-107134.tar.gz" ]; then
zcashd_stop
echo "block-107134.tar.gz not found."
Contributor Author

Once we have the file hosted somewhere, I plan to link to it here.

Or perhaps the script should just download the file if it isn't present? But if we do that, we'd need to somehow explain to the user what is going on and what the additional file is, without polluting stdout (which is interpreted by CI).

Contributor

Let's figure out hosting/distribution later (outside of this PR). However, for this PR, can we do a sha256sum check so that the hash is in this script (and revision control)? That might save us some headaches in diagnosing issues in the future, and it ensures we have some revision history for the test vector.

Contributor

Echoing the sha256sum request again so that it's included in the GitHub review.

@nathan-at-least nathan-at-least left a comment

The only change I'm requesting is to bake the sha256sum of the test vector into the script. If this is infeasible because of nondeterminism in create_fake_coinsdb_from_block.py, I'd like to learn where that nondeterminism comes in (and in that case I might relax this requirement).

For the benchmark itself, it looks good enough, although there's a lot going on that I don't understand. I think we can refine this benchmark as we learn, or create new ones.

db.close()

if __name__ == '__main__':
get_input_coins('0000000007cdb809e48e51dd0b530e8f5073e0a9e9bd7ae920fe23e874658c74')
Contributor

Does this script generate the test vector block-107134.tar.gz? How long does that process take?

Contributor Author

No, it only generates the LevelDB stored inside. It takes about ten minutes (on my laptop), and requires a running zcashd with txindex=1 (so arbitrary transactions can be fetched).


CAnchorsMap &mapAnchors,
CNullifiersMap &mapNullifiers) {
return false;
}
Contributor

Is BatchWrite called during the benchmark? If so, it seems like an uncaptured cost. It's fine to merge this with uncaptured costs as long as we know what they are, perhaps by documenting them in comments for benchmark_connectblock_slow.

Contributor Author

No; because fJustCheck is set to true, no write-effects are triggered. This was necessary because I couldn't mock out those globals, both because of the assumptions ConnectBlock is written with and because the benchmarks as-written run on a live (regtest) zcashd.
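
For readers unfamiliar with the flag, here is a minimal self-contained sketch of the fJustCheck pattern (simplified stand-in types, not the actual zcashd ConnectBlock): validation always runs, but the write-effects on shared state are skipped when fJustCheck is true.

#include <cassert>
#include <map>

// Simplified stand-ins for illustration only; the real ConnectBlock operates
// on CBlock/CCoinsViewCache and the zcashd globals discussed above.
struct Block { int height; };
static std::map<int, Block> g_blockIndex;  // stands in for global chain state

bool ConnectBlockSketch(const Block& block, bool fJustCheck)
{
    bool valid = true;  // ...expensive validation of the block and its inputs...
    if (!valid || fJustCheck)
        return valid;   // stop before touching global or on-disk state

    g_blockIndex[block.height] = block;  // write-effect, skipped in the benchmark
    return true;
}

int main()
{
    Block b{107134};
    assert(ConnectBlockSketch(b, /*fJustCheck=*/true));  // no state written
    assert(g_blockIndex.empty());
    return 0;
}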

if (!fp) throw new std::runtime_error("Failed to open block data file");
CAutoFile blkFile(fp, SER_DISK, CLIENT_VERSION);
blkFile >> block;
blkFile.fclose();
Contributor

This is an 'instrumentation cost' of loading the vector from disk, which doesn't represent a cost in production performance. Some of these costs will be unavoidable, though ideally we'd minimize them and document the ones we're aware of. Presumably if this function were called repeatedly the FS layer might cache this vector, so I would expect the time-to-execute to go down slightly because of this, if my model is accurate.

@str4d str4d May 17, 2017

I could shift this out of the benchmark function, i.e. benchmark_connectblock_slow(CBlock& block). However, note that the benchmark timing is only measured around the actual ConnectBlock call, so I don't think this should have any effect.

I do, however, see a general trend of the benchmark taking longer on each iteration (a spread of ~1s on ~8.5s), which suggests that there is some global state affecting the measurement that I haven't identified.
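
To illustrate the point about the timed window, a self-contained sketch follows (using std::chrono rather than the benchmark framework's own timer helpers, with the actual work stubbed out): only the ConnectBlock call sits between the two time measurements, so file loading and cleanup do not contribute to the reported runningtime.

#include <chrono>
#include <iostream>

// Sketch only: the real benchmark loads block-107134.dat, builds the fake
// coins view, calls ConnectBlock(block, state, &index, view, true), and then
// undoes its alterations to global state.
double one_round()
{
    // ...load the block and set up the fake CoinsDB here (untimed)...
    auto tv_start = std::chrono::steady_clock::now();
    // ...the ConnectBlock call is the only work inside the timed window...
    auto duration = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - tv_start).count();
    // ...undo alterations to global state here (untimed)...
    return duration;
}

int main()
{
    for (int round = 0; round < 10; round++)
        std::cout << "runningtime " << one_round() << std::endl;
    return 0;
}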

Contributor

Ok. It's interesting that the trend is opposite to my prediction. Let's continue with this PR as-is and learn over time how to refine the measurement.

@nathan-at-least
Contributor

Can I tack on one more change request? That would be: instructions for running it locally.

@bitcartel
Contributor

@nathan-at-least While pairing with @str4d we discussed the fact that the test can be invoked via RPC and changes global state (e.g. mapBlockIndex). There is therefore a risk that those changes impact the user's running node, which may be connected to mainnet, and are then persisted to disk. So we discussed adding a check to ensure that this benchmark is only ever run when the node is in regtest mode. Since the benchmark script already runs individual tests in regtest, we could also apply this check to all tests.

@nathan-at-least
Contributor

@bitcartel ACK. I didn't realize the existing benchmarks were already run in regtest mode. In that case I have no objections to requiring it for safety.

Requires placing block-107134.tar.gz (containing the block, and a fake CoinsDB
containing its inputs) into the base directory of the repository. This can be
generated using qa/zcash/create_benchmark_archive.py (see the script for usage
details).

To facilitate generation of the fake CoinsDB, an additional field 'valueZat' has
been added to 'getrawtransaction' containing the integer number of zatoshis
instead of a decimal number of ZEC.

Closes zcash#2355.

str4d commented May 17, 2017

To run the benchmark, use the same command as for any other benchmark of ours:

./qa/zcash/performance-measurements.sh time connectblockslow

That command will print instructions about how to proceed with setting up the benchmark.

@daira daira left a comment

utACK modulo comments.

os.mkdir('benchmark')
with open('benchmark/block-%d.dat' % blk['height'], 'wb') as f:
f.write(binascii.unhexlify(subprocess.check_output([ZCASH_CLI, 'getblock', blk_hash, 'false']).strip()))
Contributor

Split this onto multiple lines for better exception line reporting. (As it is, either f.write or subprocess.check_output could cause I/O errors that would be reported as the same line, for instance.)

wb = db.write_batch()
bar = progressbar.ProgressBar(redirect_stdout=True)
print 'Collecting input coins for block'
for tx in bar(unique_inputs.keys()):
Contributor

Rename tx to txid.

while 2+b*8 < len(rawtx['vout']):
zero = True
i = 0
while i < 8 and 2+b*8+i < len(rawtx['vout']):
Contributor

Nit: 2 + b*8 + i for better readability (also on the next line).

assert(ConnectBlock(block, state, &index, view, true));
auto duration = timer_stop(tv_start);

// Undo alterations to global state
Contributor

Hmm, what happens if an exception causes this cleanup to be bypassed? Or if other code accesses the state concurrently?

Contributor Author

If an exception caused the cleanup to be bypassed, the benchmark would be aborted (via the catch block outside the function), so in all expected usage (bearing in mind this benchmark is only available in regtest) no lasting harm would result.

Concurrent state access is a potential problem; it's one hypothesis I have for why running the benchmark multiple times usually produces increasing results. Unconfirmed as yet.

ZCIncrementalMerkleTree t;

public:
FakeCoinsViewDB(std::string dbName, uint256& hash) : CCoinsViewDB(dbName, 100, false, false), hash(hash) {}
Contributor

Perhaps add short documentation on how this class and constructor differ from CCoinsViewDB?
E.g. what is the meaning of calling the CCoinsViewDB constructor with the last two parameters, fMemory and fWipe, always false?

@str4d str4d May 18, 2017

Those parameters are already documented in CCoinsViewDB; setting them to false here is just copying the default parameters for a non-reindexing node (which has fMemory = false and fWipe = fReindex). The purpose of this class and constructor is solely to select a different database file, because the existing Bitcoin code assumes there is only ever one CoinsDB location.
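
A self-contained sketch of that idea follows, with stub stand-ins for the zcashd types (the real constructor is the one quoted above): the subclass exists only to point CCoinsViewDB at a different database location, with fMemory = false (a real on-disk LevelDB) and fWipe = false (do not erase the pre-generated fake CoinsDB), matching a non-reindexing node.

#include <cstddef>
#include <string>

// Stub stand-ins so this sketch compiles on its own; not the real classes.
struct uint256 { unsigned char data[32] = {}; };
struct CCoinsViewDB {
    CCoinsViewDB(std::string dbName, std::size_t nCacheSize, bool fMemory, bool fWipe) {}
};

class FakeCoinsViewDB : public CCoinsViewDB {
    uint256 hash;
public:
    FakeCoinsViewDB(std::string dbName, uint256& hash)
        : CCoinsViewDB(dbName,  // a separate DB location, not the node's chainstate
                       100,     // cache size
                       false,   // fMemory: back the view with an on-disk LevelDB
                       false),  // fWipe: keep the pre-generated benchmark data
          hash(hash) {}
};

int main()
{
    uint256 blockHash;
    FakeCoinsViewDB fakeDB("benchmark-coinsdb", blockHash);  // path is illustrative
    return 0;
}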

Contributor

But the filename seems to just be passed as a parameter to the CCoinsViewDB class, so it seems this class can also work with any file you choose.

return false;
}

bool GetNullifier(const uint256 &nf) const {
Contributor

Why does GetNullifier always immediately return false?
If we are benchmarking time, don't you need to simulate the time it takes to search for something in the nullifier set?

Contributor Author

There is no need to fake it, because the block being benchmarked does not contain any JoinSplits. I specifically selected a block without them, for two reasons:

  • Checking JoinSplits would require the benchmarks to have access to the mainnet parameters, which is significant additional development effort.
  • The goal of this PR is to benchmark the slowest blocks we know about, which at this point are the blocks containing many-input transactions. A block completely full of JoinSplits would in theory be slower, but we have yet to see such a block, and the majority of the time cost is already captured in the existing verifyjoinsplit benchmark.

double benchmark_connectblock_slow()
{
// Test for issue 2017-05-01.a
SelectParams(CBaseChainParams::MAIN);
Contributor

Why is it using mainnet params here, when rpcwallet.cpp says it should only run on regtest?

Contributor Author

Because this benchmark is operating on a real mainnet block, which must be validated against mainnet parameters. The reason for the regtest check is that the benchmark itself is running inside a zcashd and alters global state, and we don't want users running this benchmark on their main nodes and accidentally corrupting their on-disk state. The benchmark script sets up a transient regtest node solely for benchmarking which gets deleted afterwards, so there is less concern.
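
As a self-contained sketch of the separation described here (a stub enum instead of zcashd's chainparams machinery): the RPC entry point refuses to run unless the node is on regtest, and only then does the benchmark body switch to mainnet parameters to validate the real block.

#include <stdexcept>

enum class Network { MAIN, TESTNET, REGTEST };  // stand-in for chainparams

void run_connectblockslow(Network nodeNetwork)
{
    // Guard in the RPC handler: refuse to touch global state on a live node.
    if (nodeNetwork != Network::REGTEST)
        throw std::runtime_error("This benchmark may only be run on regtest");

    // Inside the benchmark: select mainnet parameters, validate the real
    // mainnet block 107134, then restore regtest parameters and global state.
}

int main()
{
    run_connectblockslow(Network::REGTEST);  // transient regtest node: allowed
    return 0;
}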

Contributor

Oh, so you can tell a regtest node to use mainnet parameters?

@@ -141,6 +167,10 @@ case "$1" in
incnotewitnesses)
zcash_rpc zcbenchmark incnotewitnesses 1 "${@:3}"
;;
connectblockslow)
extract_benchmark_data
zcash_rpc zcbenchmark connectblockslow 1
Contributor

What does that parameter mean? Why do we use 1 here and 10 elsewhere?

Contributor Author

That parameter is the number of times to run the benchmark (and therefore the number of timing measurements the call will return). For timing benchmarks, we want several data points so that statistics can be drawn (which is done automatically by the CI system). For memory benchmarks and valgrind, we only run once because the instrumentation greatly slows the execution time.
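
A self-contained sketch of how that count is used (hypothetical helper name; the real statistics are computed by the CI system, not by zcashd): the RPC runs the benchmark `samples` times and returns one runningtime per run.

#include <iostream>
#include <vector>

// Hypothetical stand-in for one ConnectBlock benchmark round.
double run_connectblockslow_once() { return 3.5; }

int main()
{
    const int samples = 10;  // the second zcbenchmark argument; several for timing, 1 for memory/valgrind
    std::vector<double> runningtimes;
    for (int i = 0; i < samples; i++)
        runningtimes.push_back(run_connectblockslow_once());

    // The CI system then derives statistics (mean, spread) from these values.
    for (double t : runningtimes)
        std::cout << "runningtime " << t << std::endl;
    return 0;
}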

@bitcartel bitcartel left a comment

I followed all the instructions to try to run the test, but ran into a problem:

block-107134.tar.gz

  • sha256: 96748b599920f9b01b0e17c872a6b7776785307a8e68a41148624b1901710795
  • 2578980 bytes in size

qa/zcash/performance-measurements.sh time connectblockslow

block-107134.tar.gz: FAILED
/usr/bin/sha256sum: WARNING: 1 computed checksum did NOT match

@nathan-at-least nathan-at-least left a comment

utACK.

For testing I'm considering this plan:

  1. merge this PR.
  2. deploy a change to CI to add this benchmark.
  3. trigger a CI run against master to test this code and produce the first baseline measurement.


if [ -f "block-107134.tar.gz" ]; then
# Check the hash of the archive:
"$SHA256CMD" $SHA256ARGS -c <<EOF
299a36b3445a9a0631eb9eb0b9e76c3e9e7493a98d6621ffd6dc362d3d86cbe8 block-107134.tar.gz
Contributor

👍

@nathan-at-least
Contributor

@bitcartel Make sure to save your copy of block-107134.tar.gz (which appears to have sha256 96748b599920f9b01b0e17c872a6b7776785307a8e68a41148624b1901710795) off to the side, then either remove the hash check or change the hash to match, and retest to see whether everything else works functionally for you.

Let's timebox making the test vector generation deterministic. If we can't find and remove the source of non-determinism by the end of today, let's merge this without determinism. In that case, I think we still want the hash check, and we'll just have this one special tarball as the one true source of benchmarking data for the time being.

@bitcartel
Contributor

@str4d Can you sanity check my running times, given that my test data has a different hash? Thanks.

[
  {
    "runningtime": 3.547354
  },
  {
    "runningtime": 3.452053
  },
  {
    "runningtime": 3.736808
  },
  {
    "runningtime": 3.629379
  },
  {
    "runningtime": 3.538181
  },
  {
    "runningtime": 3.61263
  },
  {
    "runningtime": 3.476884
  },
  {
    "runningtime": 3.595423
  },
  {
    "runningtime": 3.46568
  },
  {
    "runningtime": 3.53467
  }
]


str4d commented May 19, 2017

@bitcartel What machine are you running this on? My times are over twice that, but that's on my laptop (i7-6600U CPU @ 2.60GHz).


str4d commented May 19, 2017

The fact that you are getting results means that the benchmark is running correctly, though, because there's an assertion that ConnectBlock returns true.

@nathan-at-least
Contributor

I'm calling the timebox closed by timeout. I filed #2388 to follow up on the non-determinism issue. Meanwhile, we've already deployed the CI changes for this with @str4d's special-case instance of the test data, so:

@zkbot r+


zkbot commented May 19, 2017

📌 Commit c66c731 has been approved by nathan-at-least


zkbot commented May 19, 2017

⌛ Testing commit c66c731 with merge 8214ebc...

zkbot added a commit that referenced this pull request May 19, 2017
Benchmark for calling ConnectBlock on a block with many inputs

Requires placing `block-107134.tar.gz` (containing the block, and a fake CoinsDB containing its inputs) into the base directory of the repository.

To facilitate generation of the fake CoinsDB, an additional field `valueZat` has been added to `getrawtransaction` containing the integer number of zatoshis instead of a decimal number of ZEC.

Closes #2355.

zkbot commented May 19, 2017

☀️ Test successful - pr-merge
Approved by: nathan-at-least
Pushing 8214ebc to master...

@zkbot zkbot merged commit c66c731 into zcash:master May 19, 2017
@str4d str4d deleted the 2355-connectblock-bench branch May 19, 2017 04:35
@nathan-at-least nathan-at-least moved this from In Progress to Awaiting Review in Security and Stability May 30, 2017
@nathan-at-least nathan-at-least moved this from Awaiting Review to Complete in Security and Stability May 30, 2017