
Low memory prover #2670

Merged
merged 6 commits into from
Nov 6, 2017

Conversation

arielgabizon
Contributor

This PR integrates @ebfull's low-memory changes: https://github.com/zcash/zcash/pull/2243/commits
on top of @str4d's work bringing in libsnark as a subtree
4699d0e

@str4d
Contributor

str4d commented Oct 19, 2017

Nice! I'm going to review the PR later this week, but one initial comment: please rebase this PR to edit the first commit to restore its original commit message (use git rebase -i master, change the first commit from pick to r, change the commit message, and then git push -f to update this PR).

@str4d str4d added this to the 1.0.13 milestone Oct 19, 2017
@arielgabizon
Contributor Author

Done.

@daira
Contributor

daira commented Oct 21, 2017

The commit message "Delete libsnark tests that we've merged into gtest." does not match the contents of that commit.

@arielgabizon
Contributor Author

arielgabizon commented Oct 21, 2017

My mistake. Should be fine now. Rebasing still makes me a little uncomfortable, though, so I would be happy if someone verified it.

const size_t chunks = omp_get_max_threads(); // to override, set OMP_NUM_THREADS env var or call omp_set_num_threads()
#else
const size_t chunks = 1;
#endif
Contributor

Suggestion for another PR: refactor this to const size_t chunks = MULTICORE_CHUNKS(); and move the preprocessor conditional to a header.
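
A minimal sketch of what that refactor could look like, assuming a new header (the file name and the helper function `multicore_chunks` are hypothetical; only `MULTICORE_CHUNKS()`, the `MULTICORE` conditional, and `omp_get_max_threads()` come from this thread or from OpenMP):

```cpp
// Hypothetical header, e.g. multicore.hpp (the name is an assumption).
#include <cstddef>

#ifdef MULTICORE
#include <omp.h>
#endif

// Number of chunks to split parallel work into: the OpenMP thread limit
// when built with MULTICORE, otherwise a single chunk.
inline std::size_t multicore_chunks()
{
#ifdef MULTICORE
    // To override, set OMP_NUM_THREADS or call omp_set_num_threads().
    return static_cast<std::size_t>(omp_get_max_threads());
#else
    return 1;
#endif
}

// Macro spelling used in the suggestion above.
#define MULTICORE_CHUNKS() multicore_chunks()
```

Call sites would then shrink to `const size_t chunks = MULTICORE_CHUNKS();` with no preprocessor conditional of their own.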

Contributor

Whitespace

@@ -439,10 +513,6 @@ r1cs_ppzksnark_proof<ppT> r1cs_ppzksnark_prover(const r1cs_ppzksnark_proving_key
{
enter_block("Call to r1cs_ppzksnark_prover");

#ifdef DEBUG
assert(constraint_system.is_satisfied(primary_input, auxiliary_input));
#endif
Contributor

Please retain this; we will remove it in #2005 (when verification succeeds), but I want to see the effect of the rest of the PR on performance/memory usage independently of the effect of #2005.

Contributor Author

Is there some git trick to add this to the commit where it was deleted? Or should I make a new commit?

Contributor

@str4d str4d Oct 23, 2017

  • git rebase -i <COMMIT_ID_BEFORE_THIS>
  • Select this commit to be edited (change pick to e)
  • Edit the file to add this back in (copy from this diff, and remove the leading - characters)
  • Stage the change with git add
  • git commit --amend
  • Run git diff HEAD^..HEAD and check that this part of the diff is no longer there.
  • git rebase --continue to finish.

Alternatively:

  • Make a new commit that exactly reverses this one change.
  • git rebase -i <COMMIT_ID_BEFORE_THIS>
  • Move the new commit line underneath this commit's line, and change it from pick to s (squash)

Contributor Author

Great explanation!

daira
daira previously requested changes Oct 22, 2017
assert(pk.C_query.domain_size() == qap_wit.num_variables()+2);
assert(pk.H_query.size() == qap_wit.degree()+1);
assert(pk.K_query.size() == qap_wit.num_variables()+4);
#endif
Contributor

These debug assertions have been dropped. Is this intentional?

Contributor

They could be left in.

Contributor Author

I think this can't be left in because it needs the whole proving key at once.

Contributor

Most of this could be split across the various functions. The for loop would be harder though, and given that we don't set DEBUG, I'm fine with leaving these out.

const Fr<ppT> t = Fr<ppT>::random_element();
qap_instance_evaluation<Fr<ppT> > qap_inst = r1cs_to_qap_instance_map_with_evaluation(constraint_system, t);
assert(qap_inst.is_satisfied(qap_wit));
#endif
Contributor

Is dropping this debug assertion intentional?

Contributor

It was intentional when we did it. I can't remember the reason why though. It could be left in -- after all, we don't enable DEBUG anyway.

Contributor Author

added back


return ZCProof(r1cs_ppzksnark_prover<ppzksnark_ppT>(
pk,
// TODO
Contributor

What is to do here?

Contributor

I can't remember, it should be removed.

Contributor Author

removed

pk,
// TODO
if(!fh.is_open()) {
throw std::runtime_error((boost::format("could not load param file at %s") % pkPath).str());
Contributor

We don't use boost::format, we use tinyformat.

Contributor Author

fixed

Contributor Author

also second instance

Contributor

@str4d str4d left a comment

Concept ACK and preliminary utACK. I've checked that the refactor appears to be a no-op. Daira's comments need addressing.

ebfull
ebfull previously requested changes Oct 26, 2017
const Fr<ppT> t = Fr<ppT>::random_element();
qap_instance_evaluation<Fr<ppT> > qap_inst = r1cs_to_qap_instance_map_with_evaluation(constraint_system, t);
assert(qap_inst.is_satisfied(qap_wit));
#endif
Contributor

It was intentional when we did it. I can't remember the reason why though. It could be left in -- after all, we don't enable DEBUG anyway.

assert(pk.C_query.domain_size() == qap_wit.num_variables()+2);
assert(pk.H_query.size() == qap_wit.degree()+1);
assert(pk.K_query.size() == qap_wit.num_variables()+4);
#endif
Contributor

They could be left in.

@ebfull
Contributor

ebfull commented Oct 26, 2017

The math is good, the code and logic are good, I like the API changes, nice and clean.

Just might as well put those DEBUG assertions back even if we're not enabling them ever. (I consider that kind of pre-processor-macro-debug-assertion to be an anti-pattern though, since it doesn't expand templates and create static assertions and things like that.)

@bitcartel
Contributor

bitcartel commented Oct 27, 2017

There might be a memory leak. Tops out with virtual memory at 9GB and actual usage spiking to 7GB.

VIRT     RES     Note
978988   26544   after ./zcashd
2936904  1.471g  after ./zcash-cli zcbenchmark createjoinsplit 1
4960356  3.080g  repeat...
6983808  4.689g
9007260  6.298g
9007260  6.462g
9007260  6.462g  repeat... final 6th invocation

The above was conducted with zcash in regtest mode.

@arielgabizon
Contributor Author

@bitcartel I wonder if it's related to this comment 929b9db#r120993316

If I understand what @daira said there, it's not a problem?

@arielgabizon
Contributor Author

Btw, wouldn't you get a similar thing with the regular joinsplit before these changes? I remember memory not going down at the end of a joinsplit when checking with htop

@arielgabizon
Contributor Author

corrected link to Daira's comment #2243 (comment)

@tromer
Contributor

tromer commented Oct 27, 2017

@arielgabizon This depends on the libc implementation (did the allocator use the heap pool or dedicated pages? if it's the heap pool, how fragmented is it?). This is non-deterministic platform-specific voodoo, and I would not be surprised if "often" the memory remains reserved as far as the OS can see. To be sure it's really freed up, we'd have to take over and do the mmap() by ourselves.
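
A rough sketch of what "doing the mmap() ourselves" could mean on a POSIX system. The class name `MappedBuffer` is hypothetical and nothing like it exists in this PR; the point is only that memory obtained from `mmap` is handed back to the OS by `munmap`, independently of how the libc allocator manages its heap.

```cpp
#include <sys/mman.h>

#include <cstddef>
#include <stdexcept>

// Owns an anonymous private mapping and unmaps it on destruction (POSIX only).
class MappedBuffer {
public:
    explicit MappedBuffer(std::size_t bytes) : size_(bytes)
    {
        void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            throw std::runtime_error("mmap failed");
        }
        data_ = p;
    }

    // munmap removes the mapping, so the pages no longer count against the process.
    ~MappedBuffer() { munmap(data_, size_); }

    MappedBuffer(const MappedBuffer&) = delete;
    MappedBuffer& operator=(const MappedBuffer&) = delete;

    void* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    void* data_ = nullptr;
    std::size_t size_;
};
```

Wiring such a buffer into the prover's containers would need a custom allocator, which is the platform-specific part mentioned above.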

@arielgabizon
Contributor Author

@tromer I don't know this stuff, and not sure how you do it yourself; would be happy to fix it if someone explained it to me. @daira said here #2679 that it currently allocates on the stack and we should change that. Though she also said in the comment I linked above that it can wait for a future PR.

@arielgabizon
Contributor Author

Ohh you perhaps were talking about the joinsplit before this code change.

@tromer
Contributor

tromer commented Oct 27, 2017

Yes. @daira is right that it should be on the heap, and that's all that matters for the current PR.

An additional issue, which should not be addressed in the current PR, is that even when using the heap (whether it's the old prover or the new prover), the memory may remain reserved unless you use a special allocator. Fixing that would probably be platform-specific and a bit tricky.

@arielgabizon
Contributor Author

@bitcartel you might want to try this version I wrote. https://github.com/arielgabizon/zcash/tree/lowmemmovetoheap
It moves the main variable to the heap, and you can see more (though not all) of the memory reclaimed after z_sendmany.

@bitcartel
Contributor

bitcartel commented Oct 27, 2017

@arielg I tried the other branch, same results with zcbenchmark createjoinsplit. Anyway, I don't think this is a blocker, given that this issue already exists and has been documented.

@arielgabizon
Contributor Author

Well, glad it's not a blocker. I see more memory released in the other branch when using z_sendmany directly. I wonder if the memory increase is because the benchmark uses a stack variable.
https://github.com/arielgabizon/zcash/blob/lowmemmovetoheap/src/zcbenchmarks.cpp#L118

@zkbot
Contributor

zkbot commented Nov 2, 2017

💔 Test failed - pr-merge

@str4d
Contributor

str4d commented Nov 2, 2017

One of the Boost tests is failing. The error message:

unknown location(0): fatal error: in "rpc_wallet_tests/rpc_z_sendmany_internals": memory access violation at address: 0x00000848: no mapping at fault address
test/rpc_wallet_tests.cpp(1171): last checkpoint

@daira
Contributor

daira commented Nov 2, 2017

Ugh, address looks like an offset from NULL.
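
To illustrate why a small fault address such as 0x848 suggests a null object pointer: reading a member through a pointer touches the pointer value plus the member's offset. The struct below is an invented layout purely for illustration; the real type and member involved are not identified at this point in the thread.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: 0x848 bytes of other members precede `field`.
struct Example {
    char other_members[0x848];
    int field;
};

// Address that a read of `p->field` touches when `p` has numeric value `base`.
inline std::uintptr_t field_address(std::uintptr_t base)
{
    return base + offsetof(Example, field);
}

static_assert(offsetof(Example, field) == 0x848,
              "field sits 0x848 bytes into the object");
```

With `base` equal to zero (a null pointer), the access lands at 0x848, an address that is never mapped, which matches the shape of the error reported above.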

@bitcartel
Contributor

Reproducible. test/test_bitcoin -t rpc_wallet_tests fails in this branch, but passes fine on master branch.

@arielgabizon
Contributor Author

Looks like that test fails only with the last commit, which puts the ifstream object on the heap and increases running time by 0.5 seconds according to @str4d. Maybe just drop that commit for now?

@str4d
Contributor

str4d commented Nov 5, 2017

Force-pushed to remove 2f598a5 from this PR.

@zkbot try

@zkbot
Contributor

zkbot commented Nov 5, 2017

⌛ Trying commit 4305a56 with merge 94e19023bc8ca82661ae015c3b898dd6b5fbde48...

@str4d
Contributor

str4d commented Nov 5, 2017

Still seeing that failure:

unknown location(0): fatal error: in "rpc_wallet_tests/rpc_z_sendmany_internals": memory access violation at address: 0x00000270: no mapping at fault address
test/rpc_wallet_tests.cpp(1171): last checkpoint
*** 1 failure is detected in the test module "Bitcoin Test Suite"
test_bitcoin: /home/zcbbworker/latent-0/debian8/build/depends/x86_64-unknown-linux-gnu/share/../include/boost/thread/pthread/condition_variable_fwd.hpp:102: boost::condition_variable::~condition_variable(): Assertion `!ret' failed.

@zkbot
Contributor

zkbot commented Nov 5, 2017

💔 Test failed - pr-try

BasicTestingSetup::BasicTestingSetup()
{
assert(init_and_check_sodium() != -1);
ECC_Start();
pzcashParams = ZCJoinSplit::Unopened();
Contributor

This is the cause of the crash. Previously TestingSetup had access to pzcashParams because it inherits from BasicTestingSetup. With this change, it doesn't have that defined any more, leading to a null pointer error in rpc_wallet_tests (which uses TestingSetup). I think the best solution is to have TestingSetup inherit from JoinSplitTestingSetup.
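
A simplified sketch of the fixture hierarchy being proposed. The types here are stand-ins: `FakeJoinSplitParams` is hypothetical, the global only mimics `pzcashParams`, and it is an assumption that `JoinSplitTestingSetup` is the fixture that sets the pointer.

```cpp
// Stand-in for the ZCJoinSplit parameters object (hypothetical type).
struct FakeJoinSplitParams {};

// Stand-in for the global pzcashParams pointer.
FakeJoinSplitParams* pzcashParams = nullptr;

struct BasicTestingSetup {
    // With this change the basic fixture no longer sets pzcashParams.
    BasicTestingSetup() {}
};

struct JoinSplitTestingSetup : BasicTestingSetup {
    // Assumed behaviour: this fixture is the one that provides the parameters.
    JoinSplitTestingSetup() { pzcashParams = &params_; }
    ~JoinSplitTestingSetup() { pzcashParams = nullptr; }

private:
    FakeJoinSplitParams params_;
};

// Proposed fix: inherit from the fixture that loads the parameters,
// so tests using TestingSetup never see a null pzcashParams.
struct TestingSetup : JoinSplitTestingSetup {};
```

A test built on `TestingSetup` then finds the pointer set for its whole lifetime, whereas one built directly on `BasicTestingSetup` would still see it as null.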

Contributor

Oh, and if we do this, the test that was triggering this bug then fails:

test/rpc_wallet_tests.cpp(1377): error: in "rpc_wallet_tests/rpc_z_shieldcoinbase_internals": check string(e.what()).find("JoinSplit verifying key not loaded")!= string::npos has failed

@@ -154,10 +110,6 @@ class JoinSplitCircuit : public JoinSplit<NumInputs, NumOutputs> {
uint64_t vpub_new,
const uint256& rt
) {
if (!vk || !vk_precomp) {
throw std::runtime_error("JoinSplit verifying key not loaded");
Contributor

I'm going to just alter the test to check for the error we actually get now, given that this error has been removed.

@str4d
Contributor

str4d commented Nov 5, 2017

@zkbot try

@zkbot
Contributor

zkbot commented Nov 5, 2017

⌛ Trying commit bef1b5c with merge 9e03db9...

zkbot added a commit that referenced this pull request Nov 5, 2017
@zkbot
Contributor

zkbot commented Nov 5, 2017

☀️ Test successful - pr-try
State: approved= try=True

Contributor

@daira daira left a comment

ACK bef1b5c

@daira
Contributor

daira commented Nov 6, 2017

@zkbot r+

@zkbot
Contributor

zkbot commented Nov 6, 2017

📌 Commit bef1b5c has been approved by daira

@zkbot
Contributor

zkbot commented Nov 6, 2017

⌛ Testing commit bef1b5c with merge 6f9f09d...

zkbot added a commit that referenced this pull request Nov 6, 2017
@zkbot
Contributor

zkbot commented Nov 6, 2017

☀️ Test successful - pr-merge
Approved by: daira
Pushing 6f9f09d to master...

@zkbot zkbot merged commit bef1b5c into zcash:master Nov 6, 2017
@karelbilek
Contributor

Thanks for merging! I will be able to run an actual release soon :)

@arielgabizon
Contributor Author

Interestingly, the memory peak, both now and previously, occurs not during proof generation but while creating the joinsplit gadget https://github.com/zcash/zcash/blob/master/src/zcash/JoinSplit.cpp#L276.
@madars and other libsnark authors might want to know about this.
