Binary Merkle Tree Hash #14334

zelig · 2017-04-15T17:17:27Z

bmt is a new package that provides hashers for binary Merkle tree (BMT) hashes on size-limited chunks.
The main motivation is that using the BMT hash as the chunk hash of the swarm hash offers logarithmic-size inclusion proofs for arbitrary files at 32-byte resolution, which makes them viable to use in challenges on the blockchain.

lmars

Some general comments on the Go code; I haven't really reviewed the algorithm thoroughly yet, though.

lmars · 2017-08-07T19:04:25Z

bmt/bmt.go

+		select {
+		case <-self.c:
+			self.count--
+		default:


This default clause is redundant: it just leads to the next loop iteration trying to select on the channel again, so it may as well be:

for len(self.c) > n {
	<-self.c
	self.count--
}

True, as the lock makes sure it is not called concurrently.
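The simplified loop can be tried in isolation. The following is a minimal sketch, assuming a buffered channel of empty structs and a caller that already holds whatever lock guards the channel; the name drain and the returned count are hypothetical and not part of the bmt package.

```go
package main

import "fmt"

// drain receives from the buffered channel c until at most n items remain
// and returns how many items were removed.
// Assumption: the caller holds the lock guarding c, so no other goroutine
// sends or receives while the loop runs and the receive never blocks.
func drain(c chan struct{}, n int) int {
	removed := 0
	for len(c) > n {
		<-c
		removed++
	}
	return removed
}

func main() {
	c := make(chan struct{}, 8)
	for i := 0; i < 5; i++ {
		c <- struct{}{}
	}
	fmt.Println(drain(c, 2), len(c)) // prints "3 2"
}
```

Without the lock, another goroutine could empty the channel between the len check and the receive, and the receive would block; that is the situation the select with a default clause guards against.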

lmars · 2017-08-07T19:09:31Z

bmt/bmt.go

+
+// blocks until it returns an available BMTree
+// it reuses free BMTrees or creates a new one if size is not reached
+func (self *BMTreePool) Reserve() *BMTree {


Could this benefit from using sync.Pool instead?

I thought of that, but it is OK as it is.
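For comparison, a sync.Pool based variant might look like the sketch below. The tree type and newTreePool are hypothetical stand-ins, not the PR's BMTree or BMTreePool. Note that sync.Pool.Get never blocks, the pool places no cap on the number of objects in use, and idle objects may be dropped by the garbage collector, so it does not give the blocking, size-limited behaviour that the Reserve comment describes.

```go
package main

import (
	"fmt"
	"sync"
)

// tree is a hypothetical stand-in for the PR's BMTree.
type tree struct {
	buf []byte
}

// newTreePool returns a sync.Pool that allocates a tree with a buffer of
// the given size whenever no idle tree is available.
func newTreePool(size int) *sync.Pool {
	return &sync.Pool{
		New: func() interface{} {
			return &tree{buf: make([]byte, size)}
		},
	}
}

func main() {
	p := newTreePool(64)
	tr := p.Get().(*tree) // reuses an idle tree or allocates a new one
	fmt.Println(len(tr.buf))
	p.Put(tr) // hand the tree back for later reuse
}
```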

lmars · 2017-08-07T19:19:56Z

bmt/bmt.go

+		return NewEOC(nil)
+	}
+	rightmost := i == int(max-1)
+	last := atomic.AddInt32(&self.max, 1) == max


This looks like a race: two goroutines atomically read self.max, both find int(max) > i, both potentially calculate rightmost to be true, and then race to set self.segment?

This code is unused; I will remove it for now.

lmars · 2017-08-07T19:25:39Z

bmt/bmt_test.go

+}
+
+func TestBMTHasherReuseWithRelease(t *testing.T) {
+	testBMTHasherReuse(t)


This test is identical to TestBMTHasherReuseWithoutRelease above.

lmars · 2017-08-13T12:33:57Z

bmt/bmt_r.go

+	l := len(d)
+	left := d
+	var right []byte
+	if l > rh.section {


Should we not be handling the case where hashsize < l <= 2*hashsize here and calculating the hash of both left and right?

For example, consider the BMT of 64 zero bytes. I would expect it to be the sha3 of the concatenation of two sha3 hashes of 32 zero bytes:

sha3 := func(data ...[]byte) []byte {
	h := sha3.NewKeccak256()
	for _, v := range data {
		h.Write(v)
	}
	return h.Sum(nil)
}
zeroes := make([]byte, 32)
fmt.Printf("BMT of 64 bytes: %x\n", sha3(sha3(zeroes), sha3(zeroes)))

Running that I get:

$ go run bmt.go
BMT of 64 bytes: 633dc4d7da7256660a892f8f1604a44b5432649cc8ec5cb3ced4c4e6ac94dd1d

However this reference implementation gives a different hash (which is just sha3(zeroes + zeroes)):

h := bmt.NewRefHasher(sha3.NewKeccak256, 128)
fmt.Printf("Ref BMT of 64 bytes: %x\n", h.Hash(make([]byte, 64)))

$ go run ref-bmt.go
Ref BMT of 64 bytes: ad3228b676f7d3cd4284a5443f17f1962b36e491b30a40b2405849e597ba5fb5

Am I missing something here?

Yes, it is defined as per the reference implementation.
Did you find an inconsistency between the reference result and the concurrent one,
or do you disagree with the definition?

I am disagreeing with the definition.

To summarise, the ref implementation generates the BMT of 64 bytes as just sha3(64 bytes), whereas I think it should be sha3(sha3(first 32 bytes) + sha3(second 32 bytes)).

I side with @zelig here: the change proposed by @lmars would slightly decrease the complexity of the control logic of verification, but at the cost of having to calculate more hashes. There is no way that it will result in less gas use in the verification contract, which is one of the most important design considerations.

OK, that's fine. I think we need to explicitly document that this is the intended implementation to avoid any confusion, and perhaps add an explicit test demonstrating the behaviour.

nagydani

It is not clear from the test source that important corner cases are explicitly tested, such as:

Empty chunk (eight zero bytes)
Chunks shorter than one hash length
Chunks of lengths that are not integer multiples of one hash length

lmars · 2017-08-14T19:11:17Z

I agree with @nagydani; we need some exhaustive tests for data of various lengths.

zelig · 2017-08-16T12:03:33Z

@nagydani @lmars the tests cover all these: https://github.com/ethereum/go-ethereum/pull/14334/files#diff-b6e39ee34f8b8d3a7c7b4d8515bf94aaR49

lmars · 2017-08-16T13:24:37Z

@zelig so two comments on the tests:

There seems to be a lot of indirection in the tests with the numerous testXXX functions, and not a single comment on what is being tested, which I believe is what leads us to ask these questions.
I think the reference implementation itself should be thoroughly tested; it looks like it is currently untested.

lmars · 2017-08-16T15:53:14Z

@zelig I decided to just write some tests and I'm still not sure if I know what the reference implementation is supposed to be doing.

For example, I tried to hash 65 bytes of data and expected it to be:

sha3(sha3(data[0:64]), sha3(data[64:65]))

but I get a different result. Is my expectation again incorrect?

Here is the test: https://gist.github.com/lmars/67f232dfbdbf8635364ec1901343e51b

I really think it would be beneficial to be explicit and document the expected hashes and how they are constructed just using sha3 for say 0 <= length <= 256 (ideally in a test).

zelig · 2017-08-16T16:28:25Z

Your assumption is indeed incorrect again.
But your suggestion is completely valid.
You can rewrite the tests if you like, and we should indeed document the spec :)
Singleton branches (no longer than a segment) are not hashed, just appended to the left branch hash,
as per line 76, so

BMTHash(d_65) := sha3(sha3(d[0:64]), d[64:65])

lmars · 2017-08-18T00:43:17Z

@zelig I've added a test of RefHasher in c179801; it is pretty verbose and exhaustive, but I find it avoids any confusion about what the hashes should be.

zelig · 2017-08-18T01:08:27Z

@lmars great stuff.

nagydani

After an interactive review session with @zelig, approved.

zelig added feature labels Apr 15, 2017

zelig self-assigned this Apr 15, 2017

zelig mentioned this pull request Apr 29, 2017

Merkle Tree Hash ethersphere/swarm#48

Closed

zelig mentioned this pull request Jun 26, 2017

swarm related PRs - Q2 merge plan #14706

Closed

13 tasks

jmozah force-pushed the bmt branch 2 times, most recently from 0200fce to 6170e9e Compare June 28, 2017 20:26

lmars reviewed Aug 7, 2017

orenyodfat and others added 3 commits August 10, 2017 14:42

swarm/storage: binary merle tree with proof of inclusion

d2b3fdd

bmt: new package for binary merkle tree hash

f8bc194

bmt: rename, cleanup and fix benchmarks

7985ec3

* remove fmt prints in test
* remove unused segmentWriter related code
* added missing comments and package premable doc
* fix pool reuse benchmarks
* rename functions

zelig force-pushed the bmt branch from 1385d2f to 7985ec3 Compare August 11, 2017 03:17

zelig added review and removed in progress labels Aug 11, 2017

zelig mentioned this pull request Aug 11, 2017

swarm: BMT improvements #14962

Closed

2 tasks

lmars reviewed Aug 13, 2017

nagydani reviewed Aug 14, 2017

zelig mentioned this pull request Aug 16, 2017

swarm/storage: binary merkle tree with proof of inclusion #13879

Closed

ethereum deleted a comment from GitCop Aug 16, 2017

bmt: Add RefHasher tests for numerous byte lengths

c179801

Signed-off-by: Lewis Marshall <lewis@lmars.net>

bmt: test all possible length for correctness

0d04b56

nagydani approved these changes Aug 18, 2017

fjl merged commit 2bacf36 into ethereum:master Sep 5, 2017

karalabe added this to the 1.7.0 milestone Sep 5, 2017

zelig mentioned this pull request Sep 5, 2017

swarm binary merkle hash over 32-byte segments with segment proofs #3451

Closed

3 tasks

gbalint deleted the bmt branch May 25, 2018 14:44

zelig mentioned this pull request Jun 26, 2018

BMT outstanding features ethersphere/swarm#749

Closed

2 tasks

adamschmideg added type:feature and removed type:feature labels Dec 14, 2018

zelig commented Apr 15, 2017

lmars left a comment

lmars Aug 7, 2017

zelig Aug 11, 2017

lmars Aug 7, 2017

zelig Aug 11, 2017

lmars Aug 7, 2017

zelig Aug 11, 2017

lmars Aug 7, 2017

zelig Aug 11, 2017

lmars Aug 13, 2017

zelig Aug 13, 2017

lmars Aug 13, 2017

lmars Aug 13, 2017

nagydani Aug 14, 2017

lmars Aug 14, 2017

nagydani left a comment

lmars commented Aug 14, 2017

zelig commented Aug 16, 2017

lmars commented Aug 16, 2017

lmars commented Aug 16, 2017

zelig commented Aug 16, 2017

lmars commented Aug 18, 2017

zelig commented Aug 18, 2017

nagydani left a comment
