
Add MerkleTree::gen_nth_proof and Proof::index. #36

Merged: 5 commits, Aug 9, 2018

Conversation

@afck (Contributor, Author) commented May 17, 2018

Also fixes Clippy warnings in the test code and runs Clippy with --tests --all-features in Travis.

Fixes #31

.travis.yml (outdated)

```diff
@@ -16,7 +16,7 @@ matrix:
   - env: NAME='nightly'
     rust: nightly
   - env: NAME='rustfmt'
-    rust: nightly
+    rust: nightly-2018-05-14
```
Contributor commented:
You want to keep just nightly here, and add rust: nightly-2018-05-14 to the clippy build.

Contributor Author replied:

But the rustfmt version is tied to a particular Rust nightly version, so if this stays at plain nightly while the clippy build's Rust version is pinned to nightly-2018-05-14, the two diverge.

This has already happened: I used nightly-2018-05-14 locally to make sure I can use the same Clippy version, and I used the corresponding rustfmt. That made the build fail because the latest rustfmt already wanted to make some other changes!

It's really inconvenient if for every commit you make you have to switch between two Rust versions to make sure you apply the right rustfmt and clippy.

@afck (Contributor, Author) replied, May 18, 2018:

Alternatively, I can upgrade the clippy build's Rust nightly version to a later one, and use a later Clippy, if you prefer.

(I'm also happy to revert this change and keep both versions as they are, and apply rustfmt with latest nightly, and clippy with 2018-05-14, of course.)

Contributor commented:

> It's really inconvenient if for every commit you make you have to switch between two Rust versions to make sure you apply the right rustfmt and clippy.

That is a good point! Maybe it's best then to pin the same nightly version for both rustfmt and clippy? Should release pinning also be done for rustfmt?

Another idea is just to keep both tools on their respective latest releases and building with nightly, and if clippy breaks, just temporarily add it to allow_failures.

What do you think @romac?

Contributor Author replied:

> Should release pinning also be done for rustfmt?

I think that's automatic: Since (unlike clippy) rustfmt is installed via rustup, its version will always be the same for any given version of Rust itself. So with nightly-2018-05-14, we should always get rustfmt 0.7.0.

Contributor commented:

I was actually looking for documentation on how rustup handles this earlier today, but didn't see it in the rustup README. Can you point me to it? And how can I see exactly what version of a component will be built against a given Rust version? Thanks.

Contributor Author replied:

I also can't find it in the documentation, but if I add the rustfmt-preview component to different toolchains, I get different versions of it:

```shell
$ cargo +nightly-2018-02-05 fmt --version
0.3.8-nightly (346238f 2018-02-04)
$ cargo +nightly-2018-04-19 fmt --version
rustfmt 0.4.2-nightly (dd807e2 2018-04-18)
$ cargo +nightly-2018-05-19 fmt --version
rustfmt 0.7.0-nightly (bf2581b 2018-05-17)
```

(Note that I installed the first one today, to try it out, and it still installed old version 0.3.8, because that is part of the nightly-2018-02-05 toolchain.)

@afck (Contributor, Author) commented Jun 4, 2018

Have you already decided how to proceed with rustfmt? I'm happy to update the PR either way.

@psivesely (Contributor) commented:

I've been dealing with other stuff and I know @romac is quite busy as a grad student, so apologies for the late response. Since this is something you could use, it seems like we just shouldn't worry about rustfmt for now and figure out a strategy later.

@afck (Contributor, Author) commented Jun 9, 2018

Thank you, it would be great to get this merged!
And I'm happy with any of the above (possibly temporary) solutions: I can keep or revert that Rust nightly line, and/or apply the latest rustfmt and/or upgrade Clippy — anything that doesn't break CI.

@psivesely (Contributor) left a review:

The recursion here is pretty complicated, but as far as I can tell after staring at it for a while, it's correct. Nice work!

I noticed you are basically recursing the tree twice here: first to find the value, and second to generate the inclusion proof. It would be more efficient to do this in a single pass. The way that nth_leaf moves down the tree seems more efficient because it navigates more or less directly down to our value, whereas the inclusion proof generation code visits each leaf sequentially until reaching the nth. So if you were to combine them, I think you could also change the inclusion proof generation code to work more like the nth_leaf code. This optimization is entirely optional, though; I'm in favor of merging this as written and noting the suggestion in an issue we can open.

Given its complexity, I think it would be good to include more thorough testing, including some failure conditions, to ensure that neither of us missed anything.

One more small note: where you use a bitshift (>> 1) I would use / 2 (as you did elsewhere), as it's clearer and the LLVM IR should be the same.

@afck (Contributor, Author) commented Jun 11, 2018

I merged the value getter and lemma generation to traverse the tree only once. Not sure whether it's cleaner, but it's shorter and probably a bit more efficient. The >> 1 is also gone.

I added two test cases where None is expected.

> you could also change the way the inclusion proof generation code works

You mean: first find the value's index, then call Lemma::new_by_index? That might be faster than the current code, yes.
But regarding efficiency, I wonder whether a much bigger change like #4 should be reconsidered.

@psivesely (Contributor) commented Jun 11, 2018

If a tree is well-formed, you know exactly how it's going to be structured. I think this code should assume the tree is well structured and act accordingly. Then there's no need to traverse all the way down the left side until we hit a leaf, realize count != 1, go back up one node and check the other leaf, see that still count != 1, go back up two nodes, and so on. I don't think there's even a need for match at all. Given the leaf count of a tree, we should be able to determine the exact sequence of left and right steps down the tree to a given index, navigating directly to our index/value and creating the inclusion proof along the way.
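The idea above can be sketched as follows. This is a hypothetical illustration, not the crate's API (`Direction` and `path_to_leaf` are made-up names), assuming the left subtree of a well-formed tree holds `count.next_power_of_two() / 2` leaves, as in the PR's `left_count` computation:

```rust
/// Direction of one step down the tree (hypothetical helper, not crate API).
#[derive(Debug, PartialEq)]
enum Direction {
    Left,
    Right,
}

/// Returns the sequence of turns from the root of a well-formed tree with
/// `count` leaves down to the leaf at `index`. No backtracking is needed:
/// the path depends only on `index` and `count`.
fn path_to_leaf(mut index: usize, mut count: usize) -> Vec<Direction> {
    assert!(index < count);
    let mut path = Vec::new();
    while count > 1 {
        // By assumption, the left subtree holds the next power of two,
        // halved: e.g. 4 of 5 leaves, or 4 of 7 leaves.
        let left_count = count.next_power_of_two() / 2;
        if index < left_count {
            path.push(Direction::Left);
            count = left_count;
        } else {
            path.push(Direction::Right);
            index -= left_count;
            count -= left_count;
        }
    }
    path
}

fn main() {
    use Direction::*;
    // In a 5-leaf tree (left subtree: 4 leaves), leaf 4 is the lone right leaf.
    assert_eq!(path_to_leaf(4, 5), vec![Right]);
    // Leaf 2 of 5: into the 4-leaf left subtree, then right, then left.
    assert_eq!(path_to_leaf(2, 5), vec![Left, Right, Left]);
}
```

Reaching `count == 1` exactly when a leaf is hit corresponds to the sanity check discussed below.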

@afck (Contributor, Author) commented Jun 11, 2018

Isn't that pretty much what Lemma::new_by_index already does? It never backtracks and checks other leaves: the path it takes depends only on the index and count. It's just that when it hits a leaf, I added the sanity check that count is indeed 1: if it isn't, the tree isn't structured correctly. Note that count isn't the size of the whole tree, but the expected size of the subtree at the current node.

(For the existing by-value proof constructor, we still have to check all the leaves sequentially, of course, since we don't know the index of that value.)

@psivesely (Contributor) commented Jun 11, 2018 via email

@psivesely (Contributor) commented:

> Isn't that pretty much what Lemma::new_by_index already does? It never backtracks and checks other leaves: the path it takes depends only on the index and count.

It is now, because of how you branch using this predicate. But if you look at the code before you combined the value getter and inclusion proof generation into a single traversing function, you'll see here that you were traversing sequentially as I described before. At least I think I have this all right 😅.

Having just skimmed it, it looks good, but I'll take a closer look and give a proper follow-up review tomorrow.

@afck (Contributor, Author) commented Jun 12, 2018

I see what you mean. But also the old code would never backtrack more than one step: It's just that it would do the index vs. count check inside the new_by_index call, and immediately return if it needs to take the other branch. I agree that wasn't a good way to do it.
(But it was still O(log n) and not sequential.)

@psivesely (Contributor) commented:

> I see what you mean. But also the old code would never backtrack more than one step: It's just that it would do the index vs. count check inside the new_by_index call, and immediately return if it needs to take the other branch. I agree that wasn't a good way to do it.

Ah, you're right. When I first read this if statement, I mentally wrote it off as only being significant the first time new_by_index was called (by gen_nth_proof), an initial check, rather than an essential part of the traversal logic that works in tandem with new_tree_proof_by_index. Still, as you pointed out, the new version is more efficient by a linear factor, and as another plus, I think it's easier to follow.

@psivesely (Contributor) left a review:

What do you think about just adding an index field to the Proof struct? It would still be good to have something like the index function here as a validation function, but if you need to find the index of a leaf, you may as well save it upfront and avoid doing the computation more than once. Since this would modify a decent amount of code outside this new functionality and represent slightly larger changes, I think it'd be work for another PR. Just wanted to put the idea out there for now.
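As a rough sketch of that idea (the struct is simplified and hypothetical relative to the crate's actual Proof; only the suggested index field matters here):

```rust
/// Simplified sketch of a proof that records its leaf index at construction
/// time, so callers never need to recompute it from the lemma chain.
pub struct Proof<T> {
    pub root_hash: Vec<u8>,
    pub value: T,
    /// Index of `value` among the tree's leaves, saved when the proof is
    /// generated (hypothetical field suggested in this review).
    pub index: usize,
}

fn main() {
    let proof = Proof {
        root_hash: vec![0u8; 32],
        value: "hello",
        index: 3,
    };
    // The index is available without walking the lemma chain.
    assert_eq!(proof.index, 3);
}
```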

I think this looks good; I will approve pending the one small question in the in-line review comment.

src/proof.rs (outdated)

```rust
/// Returns the index of this lemma's value, given the total number of items in the tree.
pub fn index(&self, count: usize) -> usize {
    let left_count = count.next_power_of_two() / 2;
    match (self.sub_lemma.as_ref(), self.sibling_hash.as_ref()) {
```
Contributor commented:

When will (None, Some(_)) ever match here? In new_by_index, every Lemma created will match (Some(_), Some(_)) (here) or (None, None) (here). Am I missing something again?

Contributor Author replied:

No, I think you're right; I wasn't sure about that when I wrote it…
Would you prefer it to panic on (Some(_), None) and (None, Some(_))?

Contributor replied:

I briefly considered whether it should return a Result, but I think it's a fair assumption that users won't manually edit Lemmas, so if we ever hit these conditions something is wrong with our own code, and we should panic.

Contributor Author replied:

Actually, now that I think of it, I think panicking here would be dangerous: If this is used in any networking application, an attacker could send us a Lemma that breaks this invariant over the wire. That would successfully deserialize and then cause our node to panic here.

If these combinations really don't make sense, that should be reflected in the type itself, e.g.:

```rust
pub struct Lemma {
    pub node_hash: Vec<u8>,
    pub sub_lemma: Option<(Positioned<Vec<u8>>, Box<Lemma>)>,
}
```

If you prefer to keep the current Lemma structure, we should instead make sure that the invariant (either both None or both Some) cannot be broken: we should adapt the serde and protobuf deserialization methods so that they return an error on None+Some, and add the invariant to the type's documentation.
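A minimal sketch of that validation, assuming a simplified Lemma (the crate's actual sibling_hash holds a Positioned value; validate_shape is a hypothetical name that a deserializer could call before returning):

```rust
/// Simplified stand-in for the crate's Lemma (field types reduced for brevity).
pub struct Lemma {
    pub node_hash: Vec<u8>,
    pub sibling_hash: Option<Vec<u8>>,
    pub sub_lemma: Option<Box<Lemma>>,
}

impl Lemma {
    /// Checks the invariant discussed above: `sub_lemma` and `sibling_hash`
    /// must both be `Some` (inner node) or both be `None` (leaf).
    /// Returning an error, rather than panicking, keeps attacker-supplied
    /// lemmas from crashing the process after deserialization.
    pub fn validate_shape(&self) -> Result<(), &'static str> {
        match (&self.sub_lemma, &self.sibling_hash) {
            (Some(sub), Some(_)) => sub.validate_shape(),
            (None, None) => Ok(()),
            _ => Err("Lemma invariant broken: sub_lemma and sibling_hash \
                      must both be present or both absent"),
        }
    }
}

fn main() {
    let leaf = Lemma { node_hash: vec![1], sibling_hash: None, sub_lemma: None };
    assert!(leaf.validate_shape().is_ok());

    let broken = Lemma { node_hash: vec![2], sibling_hash: Some(vec![3]), sub_lemma: None };
    assert!(broken.validate_shape().is_err());
}
```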

Contributor replied:

Good point. How about we do panic, but make that clear in the method's docstring and tell the user to run Proof::validate on the input before ever calling Proof::index?

I do like tying "being an object of a certain type" to "that object being valid" as much as possible, but there are limits to this in anything that isn't a formal proving language. I think the suggestion to restructure Lemma would be appropriate for an issue, some further discussion, and perhaps a follow-up PR. For now, however, I think we should keep this scoped and work with the structs we already have.

Contributor Author replied:

I added the panic and documented it. I'll open a new issue for this.

@psivesely (Contributor) commented:

Okay, so just that last small change, and then I think this is good to merge.

Now I just got to learn about this whole BFT business. Looks important!

@afck (Contributor, Author) commented Jun 14, 2018

> BFT business. Looks important!

It's certainly fun and interesting. 🙂

@psivesely (Contributor) commented:

Blocked by #40.

@psivesely (Contributor) commented:

Update: blocked by #44, not #40 anymore.

@afck force-pushed the master branch 4 times, most recently from ebc3062 to fd676cb on July 30, 2018 at 16:37.
@afck (Contributor, Author) commented Jul 30, 2018

I merged it with latest master.

I was also meaning to add --tests to the Clippy call in .travis.yml, but either I'm hallucinating (I blame the hot weather!) or I ran into a false positive.

@romac (Member) commented Aug 9, 2018

@afck Can you please rebase it on top of master, so that I can merge it?

And use the same Rust nightly version for Clippy and Rustfmt.
@afck (Contributor, Author) commented Aug 9, 2018

Done. (I think — there were plenty of conflicts.)

This currently fails because of a false positive:
rust-lang/rust-clippy#2979
@romac merged commit 36fb645 into SpinResearch:master on Aug 9, 2018.
@romac (Member) commented Aug 9, 2018

Sweet, thanks :) Expect a v1.9.0 release sometime this week.


Successfully merging this pull request may close these issues.

Generate proof by index.
3 participants