Benchmark for calling ConnectBlock on a block with many inputs #2372

Merged
merged 1 commit into zcash:master on May 19, 2017

Conversation


@str4d str4d commented May 12, 2017

Requires placing block-107134.tar.gz (containing the block, and a fake CoinsDB containing its inputs) into the base directory of the repository.

To facilitate generation of the fake CoinsDB, an additional field valueZat has been added to getrawtransaction, containing the integer number of zatoshis instead of a decimal number of ZEC.

Closes #2355.

@str4d str4d added I-dos Problems and improvements with respect to Denial-of-Service. I-performance Problems and improvements with respect to performance I-SECURITY Problems and improvements related to security. labels May 12, 2017
@nathan-at-least nathan-at-least added this to the 1.0.9 milestone May 12, 2017
@str4d str4d force-pushed the 2355-connectblock-bench branch 5 times, most recently from 5992cc0 to 6c02a6c Compare May 16, 2017 03:45
@str4d str4d changed the title [WIP] ConnectBlock benchmark Benchmark for calling ConnectBlock on a block with many inputs May 16, 2017

str4d commented May 16, 2017

Oddly, when I run the time benchmark on my laptop, the time for each round sometimes gets progressively slower across the 10 rounds of the benchmark (a spread of around a second on a baseline of around 8.5s per round). I don't know whether that is influenced by the fact that this test relies on globals.

function extract_benchmark_data {
if [ ! -f "block-107134.tar.gz" ]; then
zcashd_stop
echo "block-107134.tar.gz not found."
Contributor Author

Once we have the file hosted somewhere, I plan to link to it here.

Or perhaps the script should just download the file if it isn't present? But if we do that, we'd need to somehow explain to the user what is going on and what the additional file is, without polluting stdout (which is interpreted by CI).

Contributor

Let's figure out hosting/distribution later (outside of this PR). However, for this PR, can we do a sha256sum check so that the hash is in this script (and revision control)? That might save us some headaches in diagnosing issues in the future, and it ensures we have some revision history for the test vector.

Contributor

Echoing the sha256sum request again so that it's included in the GitHub review.

@nathan-at-least nathan-at-least left a comment

The only change I'm requesting is to bake the sha256sum of the test vector into the script. If this is infeasible because of nondeterminism in create_fake_coinsdb_from_block.py, I'd like to learn where that nondeterminism comes in (and in that case I might relax this requirement).

For the benchmark itself, it looks good enough, although there's a lot going on that I don't understand. I think we can refine this benchmark as we learn, or create new ones.

db.close()

if __name__ == '__main__':
get_input_coins('0000000007cdb809e48e51dd0b530e8f5073e0a9e9bd7ae920fe23e874658c74')
Contributor

Does this script generate the test vector block-107134.tar.gz? How long does that process take?

Contributor Author

No, it only generates the LevelDB stored inside. It takes about ten minutes (on my laptop), and requires a running zcashd with txindex=1 (so arbitrary transactions can be fetched).


CAnchorsMap &mapAnchors,
CNullifiersMap &mapNullifiers) {
return false;
}
Contributor

Is BatchWrite called during the benchmark? If so, it seems like an uncaptured cost. It's fine to merge this with uncaptured costs as long as we know what they are, perhaps by documenting them in comments for benchmark_connectblock_slow.

Contributor Author

No; because fJustCheck is set to true, no write-effects are triggered. This was necessary because I couldn't mock out those globals, both because of the assumptions ConnectBlock is written with and because the benchmarks as-written run on a live (regtest) zcashd.
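
For readers unfamiliar with the flag, here is a minimal self-contained sketch of the fJustCheck pattern (simplified stand-in types, not the actual zcashd ConnectBlock): validation always runs, but the write-effects on shared state are skipped when fJustCheck is true.

#include <cassert>
#include <map>

// Simplified stand-ins for illustration only; the real ConnectBlock operates
// on CBlock/CCoinsViewCache and the zcashd globals discussed above.
struct Block { int height; };
static std::map<int, Block> g_blockIndex;  // stands in for global chain state

bool ConnectBlockSketch(const Block& block, bool fJustCheck)
{
    bool valid = true;  // ...expensive validation of the block and its inputs...
    if (!valid || fJustCheck)
        return valid;   // stop before touching global or on-disk state

    g_blockIndex[block.height] = block;  // write-effect, skipped in the benchmark
    return true;
}

int main()
{
    Block b{107134};
    assert(ConnectBlockSketch(b, /*fJustCheck=*/true));  // no state written
    assert(g_blockIndex.empty());
    return 0;
}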

if (!fp) throw new std::runtime_error("Failed to open block data file");
CAutoFile blkFile(fp, SER_DISK, CLIENT_VERSION);
blkFile >> block;
blkFile.fclose();
Contributor

This is an 'instrumentation cost' of loading the vector from disk, which doesn't represent a cost in production performance. Some of these costs will be unavoidable, though ideally we'd minimize them and document the ones we're aware of. Presumably if this function were called repeatedly the FS layer might cache this vector, so I would expect the time-to-execute to go down slightly because of this, if my model is accurate.

@str4d str4d May 17, 2017

I could shift this out of the benchmark function, i.e. benchmark_connectblock_slow(CBlock& block). However, note that the benchmark timing is only measured around the actual ConnectBlock call, so I don't think this should have any effect.

I do, however, see a general trend of the benchmark taking longer on each iteration (a spread of ~1s on ~8.5s), which suggests that there is some global state affecting the measurement that I haven't identified.
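
To illustrate the point about the timed window, a self-contained sketch follows (using std::chrono rather than the benchmark framework's own timer helpers, with the actual work stubbed out): only the ConnectBlock call sits between the two time measurements, so file loading and cleanup do not contribute to the reported runningtime.

#include <chrono>
#include <iostream>

// Sketch only: the real benchmark loads block-107134.dat, builds the fake
// coins view, calls ConnectBlock(block, state, &index, view, true), and then
// undoes its alterations to global state.
double one_round()
{
    // ...load the block and set up the fake CoinsDB here (untimed)...
    auto tv_start = std::chrono::steady_clock::now();
    // ...the ConnectBlock call is the only work inside the timed window...
    auto duration = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - tv_start).count();
    // ...undo alterations to global state here (untimed)...
    return duration;
}

int main()
{
    for (int round = 0; round < 10; round++)
        std::cout << "runningtime " << one_round() << std::endl;
    return 0;
}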

Contributor

Ok. It's interesting that the trend is opposite to my prediction. Let's continue with this PR as-is and learn over time how to refine the measurement.

@nathan-at-least
Contributor

Can I tack on one more change request? That would be: instructions for running it locally.

@bitcartel
Contributor

@nathan-at-least While pairing with @str4d we discussed the fact that the test can be invoked via RPC and changes global state (e.g. mapBlockIndex). There is therefore a risk that those changes impact the user's running node, which may be connected to mainnet, and are then persisted to disk. So we discussed adding a check to ensure that this benchmark is only ever run when the node is in regtest mode. Since the benchmark script already runs individual tests in regtest, we could also apply this check to all tests.

@nathan-at-least
Contributor

@bitcartel ACK. I didn't realize the existing benchmarks were already run in regtest mode. In that case I have no objections to requiring it for safety.

Requires placing block-107134.tar.gz (containing the block, and a fake CoinsDB
containing its inputs) into the base directory of the repository. This can be
generated using qa/zcash/create_benchmark_archive.py (see the script for usage
details).

To facilitate generation of the fake CoinsDB, an additional field 'valueZat' has
been added to 'getrawtransaction' containing the integer number of zatoshis
instead of a decimal number of ZEC.

Closes zcash#2355.

str4d commented May 17, 2017

To run the benchmark, use the same command as for any other benchmark of ours:

./qa/zcash/performance-measurements.sh time connectblockslow

That command will print instructions about how to proceed with setting up the benchmark.

@daira daira left a comment

utACK modulo comments.

os.mkdir('benchmark')
with open('benchmark/block-%d.dat' % blk['height'], 'wb') as f:
f.write(binascii.unhexlify(subprocess.check_output([ZCASH_CLI, 'getblock', blk_hash, 'false']).strip()))
Contributor

Split this onto multiple lines for better exception line reporting. (As it is, either f.write or subprocess.check_output could cause I/O errors that would be reported as the same line, for instance.)

wb = db.write_batch()
bar = progressbar.ProgressBar(redirect_stdout=True)
print 'Collecting input coins for block'
for tx in bar(unique_inputs.keys()):
Contributor

Rename tx to txid.

while 2+b*8 < len(rawtx['vout']):
zero = True
i = 0
while i < 8 and 2+b*8+i < len(rawtx['vout']):
Contributor

Nit: 2 + b*8 + i for better readability (also on the next line).

assert(ConnectBlock(block, state, &index, view, true));
auto duration = timer_stop(tv_start);

// Undo alterations to global state
Contributor

Hmm, what happens if an exception causes this cleanup to be bypassed? Or if other code accesses the state concurrently?

Contributor Author

If an exception caused the cleanup to be bypassed, the benchmark would be aborted (via the catch block outside the function), so in all expected usage (bearing in mind this benchmark is only available in regtest) no lasting harm would result.

Concurrent state access is a potential problem; it's one hypothesis I have for why running the benchmark multiple times usually produces increasing results. Unconfirmed as yet.

ZCIncrementalMerkleTree t;

public:
FakeCoinsViewDB(std::string dbName, uint256& hash) : CCoinsViewDB(dbName, 100, false, false), hash(hash) {}
Contributor

Perhaps add short documentation on how this class and constructor differ from CCoinsViewDB?
E.g. what is the meaning of calling the CCoinsViewDB constructor with the last two parameters, fMemory and fWipe, always false?

@str4d str4d May 18, 2017

Those parameters are already documented in CCoinsViewDB; setting them to false here is just copying the default parameters for a non-reindexing node (which has fMemory = false and fWipe = fReindex). The purpose of this class and constructor is solely to select a different database file, because the existing Bitcoin code assumes there is only ever one CoinsDB location.
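
A self-contained sketch of that idea follows, with stub stand-ins for the zcashd types (the real constructor is the one quoted above): the subclass exists only to point CCoinsViewDB at a different database location, with fMemory = false (a real on-disk LevelDB) and fWipe = false (do not erase the pre-generated fake CoinsDB), matching a non-reindexing node.

#include <cstddef>
#include <string>

// Stub stand-ins so this sketch compiles on its own; not the real classes.
struct uint256 { unsigned char data[32] = {}; };
struct CCoinsViewDB {
    CCoinsViewDB(std::string dbName, std::size_t nCacheSize, bool fMemory, bool fWipe) {}
};

class FakeCoinsViewDB : public CCoinsViewDB {
    uint256 hash;
public:
    FakeCoinsViewDB(std::string dbName, uint256& hash)
        : CCoinsViewDB(dbName,  // a separate DB location, not the node's chainstate
                       100,     // cache size
                       false,   // fMemory: back the view with an on-disk LevelDB
                       false),  // fWipe: keep the pre-generated benchmark data
          hash(hash) {}
};

int main()
{
    uint256 blockHash;
    FakeCoinsViewDB fakeDB("benchmark-coinsdb", blockHash);  // path is illustrative
    return 0;
}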

Contributor

But the filename seems to just be passed as a parameter to the CCoinsViewDB class, so it seems this class can also work with any file you choose.

return false;
}

bool GetNullifier(const uint256 &nf) const {
Contributor

Why does GetNullifier always immediately return false?
If we are benchmarking time, don't you need to simulate the time it takes to search for something in the nullifier set?

Contributor Author

There is no need to fake it, because the block being benchmarked does not contain any JoinSplits. I specifically selected a block without them, for two reasons:

  • Checking JoinSplits would require the benchmarks to have access to the mainnet parameters, which is significant additional development effort.
  • The goal of this PR is to benchmark the slowest blocks we know about, which at this point are the blocks containing many-input transactions. A block completely full of JoinSplits would in theory be slower, but we have yet to see such a block, and the majority of the time cost is already captured in the existing verifyjoinsplit benchmark.

double benchmark_connectblock_slow()
{
// Test for issue 2017-05-01.a
SelectParams(CBaseChainParams::MAIN);
Contributor

Why is it using mainnet params here, when rpcwallet.cpp says it should only run on regtest?

Contributor Author

Because this benchmark is operating on a real mainnet block, which must be validated against mainnet parameters. The reason for the regtest check is that the benchmark itself is running inside a zcashd and alters global state, and we don't want users running this benchmark on their main nodes and accidentally corrupting their on-disk state. The benchmark script sets up a transient regtest node solely for benchmarking which gets deleted afterwards, so there is less concern.
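
As a self-contained sketch of the separation described here (a stub enum instead of zcashd's chainparams machinery): the RPC entry point refuses to run unless the node is on regtest, and only then does the benchmark body switch to mainnet parameters to validate the real block.

#include <stdexcept>

enum class Network { MAIN, TESTNET, REGTEST };  // stand-in for chainparams

void run_connectblockslow(Network nodeNetwork)
{
    // Guard in the RPC handler: refuse to touch global state on a live node.
    if (nodeNetwork != Network::REGTEST)
        throw std::runtime_error("This benchmark may only be run on regtest");

    // Inside the benchmark: select mainnet parameters, validate the real
    // mainnet block 107134, then restore regtest parameters and global state.
}

int main()
{
    run_connectblockslow(Network::REGTEST);  // transient regtest node: allowed
    return 0;
}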

Contributor

Oh, so you can tell a regtest node to use mainnet parameters?

@@ -141,6 +167,10 @@ case "$1" in
incnotewitnesses)
zcash_rpc zcbenchmark incnotewitnesses 1 "${@:3}"
;;
connectblockslow)
extract_benchmark_data
zcash_rpc zcbenchmark connectblockslow 1
Contributor

What does that parameter mean? Why do we use 1 here and 10 elsewhere?

Contributor Author

That parameter is the number of times to run the benchmark (and therefore the number of timing measurements the call will return). For timing benchmarks, we want several data points so that statistics can be drawn (which is done automatically by the CI system). For memory benchmarks and valgrind, we only run once because the instrumentation greatly slows the execution time.
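
A self-contained sketch of how that count is used (hypothetical helper name; the real statistics are computed by the CI system, not by zcashd): the RPC runs the benchmark `samples` times and returns one runningtime per run.

#include <iostream>
#include <vector>

// Hypothetical stand-in for one ConnectBlock benchmark round.
double run_connectblockslow_once() { return 3.5; }

int main()
{
    const int samples = 10;  // the second zcbenchmark argument; several for timing, 1 for memory/valgrind
    std::vector<double> runningtimes;
    for (int i = 0; i < samples; i++)
        runningtimes.push_back(run_connectblockslow_once());

    // The CI system then derives statistics (mean, spread) from these values.
    for (double t : runningtimes)
        std::cout << "runningtime " << t << std::endl;
    return 0;
}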

@bitcartel bitcartel left a comment

I followed all the instructions to try to run the test, but ran into a problem:

block-107134.tar.gz

  • sha256: 96748b599920f9b01b0e17c872a6b7776785307a8e68a41148624b1901710795
  • 2578980 bytes in size

qa/zcash/performance-measurements.sh time connectblockslow

block-107134.tar.gz: FAILED
/usr/bin/sha256sum: WARNING: 1 computed checksum did NOT match

@nathan-at-least nathan-at-least left a comment

utACK.

For testing I'm considering this plan:

  1. merge this PR.
  2. deploy a change to CI to add this benchmark.
  3. trigger a CI run against master to test this code and produce the first baseline measurement.


if [ -f "block-107134.tar.gz" ]; then
# Check the hash of the archive:
"$SHA256CMD" $SHA256ARGS -c <<EOF
299a36b3445a9a0631eb9eb0b9e76c3e9e7493a98d6621ffd6dc362d3d86cbe8 block-107134.tar.gz
Contributor

👍

@nathan-at-least
Contributor

@bitcartel Make sure to save your copy of block-107134.tar.gz (which appears to have sha256 96748b599920f9b01b0e17c872a6b7776785307a8e68a41148624b1901710795) off to the side, then either remove the hash check or change the hash to match, and retest to see whether everything else works functionally for you.

Let's timebox making the test vector generation deterministic. If we can't find and remove the source of non-determinism by the end of today, let's merge this without determinism. In that case, I think we still want the hash check, and we'll just have this one special tarball as the one true source of benchmarking data for the time being.

@bitcartel
Contributor

@str4d Can you sanity check my running times, given that my test data has a different hash? Thanks.

[
  {
    "runningtime": 3.547354
  },
  {
    "runningtime": 3.452053
  },
  {
    "runningtime": 3.736808
  },
  {
    "runningtime": 3.629379
  },
  {
    "runningtime": 3.538181
  },
  {
    "runningtime": 3.61263
  },
  {
    "runningtime": 3.476884
  },
  {
    "runningtime": 3.595423
  },
  {
    "runningtime": 3.46568
  },
  {
    "runningtime": 3.53467
  }
]


str4d commented May 19, 2017

@bitcartel What machine are you running this on? My times are over twice that, but that's on my laptop (i7-6600U CPU @ 2.60GHz).


str4d commented May 19, 2017

The fact that you are getting results means that the benchmark is running correctly, though, because there's an assertion that ConnectBlock returns true.

@nathan-at-least
Contributor

I'm calling the timebox closed by timeout. I filed #2388 to follow up on the non-determinism issue. Meanwhile, we've already deployed the CI changes for this with @str4d's special-case instance of the test data, so:

@zkbot r+


zkbot commented May 19, 2017

📌 Commit c66c731 has been approved by nathan-at-least


zkbot commented May 19, 2017

⌛ Testing commit c66c731 with merge 8214ebc...

zkbot added a commit that referenced this pull request May 19, 2017
Benchmark for calling ConnectBlock on a block with many inputs

Requires placing `block-107134.tar.gz` (containing the block, and a fake CoinsDB containing its inputs) into the base directory of the repository.

To facilitate generation of the fake CoinsDB, an additional field `valueZat` has been added to `getrawtransaction` containing the integer number of zatoshis instead of a decimal number of ZEC.

Closes #2355.

zkbot commented May 19, 2017

☀️ Test successful - pr-merge
Approved by: nathan-at-least
Pushing 8214ebc to master...

@zkbot zkbot merged commit c66c731 into zcash:master May 19, 2017
@str4d str4d deleted the 2355-connectblock-bench branch May 19, 2017 04:35
@nathan-at-least nathan-at-least moved this from In Progress to Awaiting Review in Security and Stability May 30, 2017
@nathan-at-least nathan-at-least moved this from Awaiting Review to Complete in Security and Stability May 30, 2017