Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Tiger Tree Hash and Merkle Hash Tree #55

Open
ivan386 opened this issue Oct 27, 2016 · 7 comments
Open

Support for Tiger Tree Hash and Merkle Hash Tree #55

ivan386 opened this issue Oct 27, 2016 · 7 comments

Comments

@ivan386
Copy link

ivan386 commented Oct 27, 2016

@jbenet
The Tiger tree hash is a widely used form of hash tree. It uses a binary hash tree (two child nodes under each node), usually has a data block size of 1024 bytes and uses the cryptographically secure Tiger hash.

Tiger tree hashes are used in Gnutella, Gnutella2, and Direct Connect P2P file sharing protocols and in file sharing applications such as Phex, BearShare, LimeWire, Shareaza, DC++ and Valknut.

Example: RBOEI7UYRYO5SUXGER5NMUOEZ5O6E4BHPP2MRFQ

Short name: TTH
Alternative name: Tree Tiger Hash
URN: urn:tree:tiger:[TTH in base32]
Magnet: magnet:?xt=urn:tree:tiger:[TTH in base32]
Default size: 192 bits
Default encoding: Base32

@OCTAGRAM
Copy link

OCTAGRAM commented Dec 17, 2016

Short summary: we might need to allocate new numbers for:

  1. Hacked "TTH+64bit size" (a hack to match TTH in real p2p networks, suitable for Blob and List only) - need very much
  2. Correct TTH (correct implementation, just as if sha1sum being replaced with tthsum; but does not match TTH calculated by any real p2p network) - just in case let people store protobuf stuff in real p2p networks. Jigdo-style List description can be stored in protobuf format this way in real p2p.
  3. Tiger - don't really need, but may be introduced for completeness

Long comment:
In the current implementation just adding Tiger or TTH won't provide desired interoperability as tree building algorithms differ. TTH does not allow arbitrary block sizes. Instead, different levels of Merkle tree can be used to verify blocks of different size, but the bottom block is 1K. And wrt Jigdo-style files, (list objects), the spirit of existing P2P solutions is to get TTH of Jigdo description and source files, and let some clients represent Jigdo-style file as being virtually combined. The current system processed a hell lot of files, and its combination of flexibility and strictness has advantages that should be preserved. Identical file will always be matched and found. Flexibility of IPFS is somewhat killing the good properties of TTH for no good reason. As I stated before, in normal p2p one can have both TTH and Jigdo, and IPFS is only Jigdo at the moment.

I have investigated the situation and propose the following algorithm. A special hashing function introduced that is only supposed to be able to hash Blob and List. Contrary to normal hashing functions it has ProtoBuf parser built-in to handle Blob and List. If this parser meets octet sequence, that is Blob, and TTH Leaf is being calculated for Blob.data (but not for ProtoBuf bytes describing length of Blob). If this parser meets array, that is List, and it prepares to calculate TTH Node. In case of unexpected byte sequence a fallback is possible.

I have thought about the possibility to provide several ProtoBuf List descriptions with different sizes, and a hacked TTH might match all of them to the same TTH, and that can be used for tricking IPFS, which is not good. So I have decided to extend TTH with size, and fallback should happen when any tampering with sizes detected.

Real tools having TTH often also have size, so they can always construct IPFS hash from TTH and size.

UPD. Oh, I see. So now it's CBOR, not ProtoBuf. Alright, anyway TTH (24 bytes) + size (8 bytes) = 32 bytes, that should be IPFS hash for Merkle tree of properly hashed file. We just need a number to prepend.

@ivan386
Copy link
Author

ivan386 commented Mar 27, 2018

Merkle Hash Tree (1024)

Segment size - 1024 bytes
Data prefix - one byte with value 0 (0x00)
Hash pair prefix - one byte with value 1 (0x01)
Unpaired hashes move on next level unchanged.

++ - concatenation

Leaf hash: Hash( 0x00 ++ ( data 1024 bytes or less if this is last segment ) )
Internal hash: Hash(0x01 ++ ( Fist hash ) ++ ( Second hash ))

Level     Tree Root
|                |
V                V  
4:            __ 21 ___ <-------
             /         \        \
3:       __15__         20 <---- Internal Hashes
        /      \          \          /
2:     7        14         20 <-----/
      / \      /  \        /\      /
1:   3   6    10   13    18  19 <--
    /\   /\   /\   /\    /\   \
0: 1  2 4  5 8  9 11 12 16 17 19 <---- Leaf Hashes
   |_________ file ____________|

Numbers from 1 - 21 is optimized hash calculation order

Links:

Tree Hash EXchange format (THEX)
Tiger:
A Fast New Cryptographic Hash Function (Designed in 1995)

wikipedia: Tiger (cryptography)
wikipedia: Merkle tree#Tiger tree hash

@ivan386 ivan386 changed the title Support for Tiger Tree Hash Support for Tiger Tree Hash and Merkle Hash Tree May 16, 2018
@ivan386
Copy link
Author

ivan386 commented May 16, 2018

@Stebalien
Some one alive here?

@Stebalien
Copy link
Member

(not quite dead, just 6 feet under github notifications ⚰️)

I'm not sure if we need to allocate a range of codecs. I'd just define a new code per hash type and stick all the format-specific options inside the hash itself.

For example, one might want a digest (just the root):

<tiger-digest-mc>-<len>-(<hash-mc>-<len>-<root-hash>)

Or a serialized tree:

<tiger-tree-mc>-<len>-(<hash-mc>-<Len>-<tree-type>-<tree-depth>-<tree...>)

Does that sound reasonable or am I missing something?

@ivan386
Copy link
Author

ivan386 commented Jun 3, 2018

@Stebalien

Merkle Hash Tree multihash

<merkle-hash-tree-mc>-<len>-(<hash-tree-type>-<hash-mc>-<hash-len>-(<hash-value>)-[<hash-value>...])
<0x0400>-<len>-(<0x01>-<hash-mc>-<hash-len>-(<hash-value>)-[<hash-value>...])

0x0400 - Merkle Hash Tree (1024 decimal) multicodec value

0x01 - Hash Tree Type with this settings:

  • Data segment size - 1024 bytes
  • Data prefix - one byte with value 0 (0x00)
  • Hash pair prefix - one byte with value 1 (0x01)
  • Unpaired hashes move on next level unchanged.
  • hashes in tree packed in breadth-first order from root to leafs.

breadth-first order:

Level           Tree Root
|                   |
V                   V  
4:            _____ 1 _____ <-------
             /             \         \
3:       ____2____           3 <---- Internal Hashes
        /         \           \          /
2:     4           5           6 <------/
      /  \       /   \        / \      /
1:   7    8     9     10    11  12 <--
    /\    /\    /\    /\    /\   \
0: 13 14 15 16 17 18 19 20 21 22 23 <---- Leaf Hashes
   |___________ data _____________|

0x02 - Hash Tree Type with this settings:

  • Data segment size - 1024 bytes
  • Data prefix - one byte with value 0 (0x00)
  • Hash pair prefix - one byte with value 1 (0x01)
  • Unpaired hashes move on next level unchanged.
  • Level of hash tree
    1. level - 1 hash = root-hash
    2. level - 2 hashes
    3. level - 3-4 hashes
    4. level - 5-8 hashes
      and so on

Tiger multihash:

<tiger-mc>-<len>-<hash>
<0x7A>-<0x18>-<0x3293ac630c13f0245f92bbb1766e16167a4e58492dde73f3>

0x7A - Tiger-hash multicodec value

Tree Tiger multihash:

<merkle-hash-tree-mc>-<len>-(<hash-tree-type>-<tiger-mc><tiger-len>(<tiger-hash>)[<tiger-hash>...])

Tree sha2-256 multihash:

<merkle-hash-tree-mc>-<len>-(<hash-tree-type>-<sha2-256-mc>-<sha2-256-len>-(<sha2-256-hash>)[<sha2-256-hash>...])

ivan386 added a commit to ivan386/multihash that referenced this issue Jun 13, 2018
multiformats#55

## Merkle Hash Tree multihash

`<merkle-hash-tree-mc>-<len>-(<hash-tree-type>-(<multihash>[<multihash>...]))`
`<0x0400>-<len>-(<0x01>-(<multihash>[<multihash>...]))`

0x0400 - Merkle Hash Tree (1024 decimal) multicodec value 

0x01 - Hash Tree Type with this settings:
* Data segment size - 1024 bytes
* Data prefix - one byte with value 0 (0x00)
* Hash pair prefix - one byte with value 1 (0x01)
* Unpaired hashes move on next level unchanged.
* Multihashes in tree multihash packed in breadth-first order from root to leafs.

## Tiger multihash:

`<tiger-mc>-<len>-<hash>`
`<0x7A>-<0x18>-<0x3293ac630c13f0245f92bbb1766e16167a4e58492dde73f3>`

0x7A - Tiger-hash multicodec value

## Tree Tiger multihash:
`<merkle-hash-tree-mc>-<len>-(<hash-tree-type>-(<tiger-root-multihash>[<tiger-multihash>...]))`
## Tree sha2-256 multihash:
`<merkle-hash-tree-mc>-<len>-(<hash-tree-type>-(<sha2-256-root-multihash>[<sha2-256-multihash>...]))`

## Links:
[Tree Hash EXchange format (THEX)](https://adc.sourceforge.io/draft-jchapweske-thex-02.html#anchor2)
[Tiger:
A Fast New Cryptographic Hash Function (Designed in 1995)](http://www.cs.technion.ac.il/~biham/Reports/Tiger/)
wikipedia: [Tiger (cryptography)](https://en.wikipedia.org/wiki/Tiger_(cryptography))
wikipedia: [Merkle tree#Tiger tree hash](https://en.wikipedia.org/wiki/Merkle_tree#Tiger_tree_hash)

License: MIT
Signed-off-by: Ivan <ivan386@users.noreply.github.com>
ivan386 added a commit to ivan386/multicodec that referenced this issue Jun 13, 2018
multiformats/multihash#55

## Merkle Hash Tree multihash

`<merkle-hash-tree-mc>-<len>-(<hash-tree-type>-(<multihash>[<multihash>...]))`
`<0x0400>-<len>-(<0x01>-(<multihash>[<multihash>...]))`

0x0400 - Merkle Hash Tree (1024 decimal) multicodec value 

0x01 - Hash Tree Type with this settings:
* Data segment size - 1024 bytes
* Data prefix - one byte with value 0 (0x00)
* Hash pair prefix - one byte with value 1 (0x01)
* Unpaired hashes move on next level unchanged.
* Multihashes in tree multihash packed in breadth-first order from root to leafs.

## Tiger multihash:

`<tiger-mc>-<len>-<hash>`
`<0x7A>-<0x18>-<0x3293ac630c13f0245f92bbb1766e16167a4e58492dde73f3>`

0x7A - Tiger-hash multicodec value

## Tree Tiger multihash:
`<merkle-hash-tree-mc>-<len>-(<hash-tree-type>-(<tiger-root-multihash>[<tiger-multihash>...]))`
## Tree sha2-256 multihash:
`<merkle-hash-tree-mc>-<len>-(<hash-tree-type>-(<sha2-256-root-multihash>[<sha2-256-multihash>...]))`

## Links:
[Tree Hash EXchange format (THEX)](https://adc.sourceforge.io/draft-jchapweske-thex-02.html#anchor2)
[Tiger:
A Fast New Cryptographic Hash Function (Designed in 1995)](http://www.cs.technion.ac.il/~biham/Reports/Tiger/)
wikipedia: [Tiger (cryptography)](https://en.wikipedia.org/wiki/Tiger_(cryptography))
wikipedia: [Merkle tree#Tiger tree hash](https://en.wikipedia.org/wiki/Merkle_tree#Tiger_tree_hash)

License: MIT
Signed-off-by: Ivan <ivan386@users.noreply.github.com>
ivan386 added a commit to ivan386/multihash that referenced this issue Mar 9, 2019
multiformats#55

License: MIT
Signed-off-by: Ivan <ivan386@users.noreply.github.com>
@ivan386
Copy link
Author

ivan386 commented Mar 9, 2019

@ivan386
Copy link
Author

ivan386 commented Jul 21, 2019

@Stebalien I write spec for that case: https://github.com/ivan386/multihash-tree

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants