This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

Spec refining: Multiple links in the link object #5

Closed
nicola opened this issue Jul 26, 2016 · 15 comments
Labels
status/deferred Conscious decision to pause or backlog

Comments

@nicola
Member

nicola commented Jul 26, 2016

In my original design, the link object could hold a set of pointers.
One reason, for example: linking with multiple hash functions (in case one breaks).

// one hash function
{
  name: {'/': 'SHA2-Hash/test' }
}

// multiple using arrays
{
  name: {'/': ['SHA2-Hash/test', 'SHA3-Hash/test']}
}
// in this case, the parser will know we are talking about a link
// since the property `/` will define the type (in CBOR as in JSON)

Pro:

  • Have a way to specify multiple hashes from different hash functions

Cons:

  • the user has no guarantee that SHA3-Hash points to the same hash as SHA2 (unless they use algebraic hash functions (?))
  • the user must specify the order of priorities of the hash functions

cc @dignifiedquire, @mildred, @Stebalien

@dignifiedquire
Member

I don't think this should be added to the spec as it increases complexity without actually giving much more benefit. If you want to provide multiple links, e.g. in different hashes you can do that one level higher like this:

{
  "name": [
    {"/": "SHA2-Hash/test"},
    {"/": "SHA3-Hash/test"}
  ]
}

and handle it in userland

@nicola
Member Author

nicola commented Jul 26, 2016

That works if the resolution is in userland, however this would turn into /name/0 and /name/1.
This has some problems: (1) users will not have clean paths, (2) they have to handle crypto breaks themselves (and they will not!) (3) once sha2 breaks, /name/0 will be a broken link

I personally don't see it adding too much complexity: (1) the check for a link is just looking at a tag, and the choice of the pointer in the list is just a matter of re-ordering and picking one, (2) if hash functions do break, links are not broken.

However, this can add complexity if there are cases where developers need to read the link object: in that case they cannot assume that it is a string, since it could be an array.
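To make the reading cost concrete, here is a minimal sketch of what a consumer would have to do if the `/` value could be either a string or an array. The helper name `link_targets` is hypothetical, and the sketch assumes a link is an object whose only key is `/`; neither comes from the spec.

```python
from typing import Any, List, Optional

LINK_KEY = "/"


def link_targets(value: Any) -> Optional[List[str]]:
    """Return the targets of a link object, or None if value is not a link.

    Assumption: a link is a dict whose only key is "/", holding either one
    string or a non-empty list of strings.
    """
    if not isinstance(value, dict) or set(value.keys()) != {LINK_KEY}:
        return None
    target = value[LINK_KEY]
    if isinstance(target, str):
        # Single-pointer form: normalize to a one-element list.
        return [target]
    if isinstance(target, list) and target and all(isinstance(t, str) for t in target):
        # Multi-pointer form: keep the author's order.
        return list(target)
    return None


single = link_targets({"/": "SHA2-Hash/test"})
multi = link_targets({"/": ["SHA2-Hash/test", "SHA3-Hash/test"]})
not_a_link = link_targets({"name": "test"})
```

Every caller that previously read `obj["/"]` as a string would need a normalization step like this one.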

@dignifiedquire
Member

We might be able to cope with the complexity, but we will not be able to cope with the uncertainty that the two links might be totally different, which gives far fewer guarantees to the user.

In addition, if sha2 is actually broken the links are not immediately broken, they are just vulnerable to attacks. But this vulnerability does not go away if you have multiple links as it will still be the first one and might resolve to the content of the attacker.

@nicola
Member Author

nicola commented Jul 26, 2016

Great to point out that links are actually not broken, they will still resolve. If you have multiple links and sha2 breaks and is known to be broken, your parser will just not resolve sha2 but fall back on sha3. (If it is not known to be broken, you are in trouble.)
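The fallback described here could be sketched roughly as follows. The target format `"<hash-fn>:<digest>/<path>"`, the function name `pick_target`, and the `known_broken` set are all assumptions made for illustration, not part of any spec.

```python
from typing import Optional, Sequence, Set


def pick_target(targets: Sequence[str], known_broken: Set[str]) -> Optional[str]:
    """Return the first target whose hash function is not known to be broken.

    Assumption: each target looks like "<hash-fn>:<digest>/<path>".
    """
    for target in targets:
        hash_fn, _, _ = target.partition(":")
        if hash_fn not in known_broken:
            return target
    # Every listed hash function is considered broken.
    return None


targets = ["sha2-256:abc/test", "sha3-256:def/test"]
before_break = pick_target(targets, set())
after_break = pick_target(targets, {"sha2-256"})
```

Note that this only helps once the break is known to the parser, which is the caveat raised in the comment above.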

@nicola nicola mentioned this issue Aug 4, 2016
@jbenet
Contributor

jbenet commented Aug 6, 2016

The only way to make this work the way you want it to is to force the tool to resolve all objects and verify all paths are exactly the same. Note that this is hard because i could do:

{ /: [ <sha2-256-256-mh>/a/b/c, <sha3-256-256-mh>/d ] }

So i would have to resolve both and make sure it's the exact same datastruct. This seems impossible in some cases, actually.


Maybe a further restriction is to enforce the paths must be the same and only the hash can be different:

{ /: [ <sha2-256-256-mh>/a/b/c, <sha3-256-256-mh>/a/b/c ] }

therefore can verify the object simply.
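A minimal sketch of that restriction, assuming each entry is written as `<multihash>/<path>` as in the examples above. The helper names are hypothetical. The check is purely syntactic: it confirms that the paths match, not that the two hashes identify the same object.

```python
from typing import Sequence, Tuple


def split_target(target: str) -> Tuple[str, str]:
    """Split "<multihash>/<path>" into (multihash, path)."""
    head, _, path = target.partition("/")
    return head, path


def paths_agree(targets: Sequence[str]) -> bool:
    """True when every target carries the same path after its hash."""
    paths = {split_target(t)[1] for t in targets}
    return len(paths) <= 1


same_path = paths_agree(["<sha2-256-256-mh>/a/b/c", "<sha3-256-256-mh>/a/b/c"])
different_path = paths_agree(["<sha2-256-256-mh>/a/b/c", "<sha3-256-256-mh>/d"])
```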


  • at first pass, i agree with @dignifiedquire this seems to me a user-land thing.
  • at first pass, i do agree that it would be nice if natively we had the ability to roll up to more hashes.
  • at first pass, it reminds me of my desire to create "mutable & immutable references" (2 links, so you have a snapshot AND have a way to get the newest version)
  • but what happens if i dont have one of the hash functions? do i just trust it? what if it's truly broken? or the hash function is bogus and i dont know it?
  • the expectations created by this addition makes me think that this is way too complicated, and not simple.
  • there may be a way to simplify all these concerns, but im not seeing it atm.

@Stebalien
Contributor

Stebalien commented Aug 8, 2016

What if this were built into the multihash spec? That is, given /ipfs/HASH1,HASH2,HASH3 (example syntax), you'd look up HASH1, then verify against all three hashes. If verification failed, you'd move on to HASH2, etc.

However, IMO, the overhead just isn't worth it...
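The lookup-then-verify loop could be sketched like this, using Python's `hashlib` for the digests. The in-memory `store`, its `"<fn>:<hexdigest>"` key format, and the function names are assumptions for illustration only; real multihashes are encoded differently.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

# Hash functions this sketch knows how to compute.
HASHERS = {"sha2-256": hashlib.sha256, "sha3-256": hashlib.sha3_256}


def digest(fn: str, data: bytes) -> str:
    return HASHERS[fn](data).hexdigest()


def verifies_all(data: bytes, hashes: List[Tuple[str, str]]) -> bool:
    """True only if data matches every (function, digest) pair."""
    return all(digest(fn, data) == expected for fn, expected in hashes)


def lookup(store: Dict[str, bytes], hashes: List[Tuple[str, str]]) -> Optional[bytes]:
    """Try each hash as a lookup key; accept the first block that verifies against all."""
    for fn, expected in hashes:
        data = store.get(f"{fn}:{expected}")
        if data is not None and verifies_all(data, hashes):
            return data
    return None


content = b"test"
hashes = [(fn, digest(fn, content)) for fn in HASHERS]
store = {
    f"{hashes[0][0]}:{hashes[0][1]}": b"evil",   # tampered block under HASH1
    f"{hashes[1][0]}:{hashes[1][1]}": content,   # genuine block under HASH2
}
found = lookup(store, hashes)
```

Here the block stored under the first hash fails verification, so the lookup moves on to the second hash and returns the genuine content.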

@jbenet
Contributor

jbenet commented Aug 8, 2016

( @Stebalien that's a good idea, and achievable with a multihash function code too -- the multimultihash function? hahaha). but yeah dont think it's that useful.

@Stebalien
Contributor

Just to repeat my comment from the meeting here, hashes (and crypto in general) usually don't break all at once so we'll have time to migrate. Also, due to the structured nature of IPLD, this can be done programmatically. It would make it harder to compare object equality but, IMO, that's an inherent issue in transitioning from one hash to another.

@nicola
Member Author

nicola commented Aug 9, 2016

@Stebalien are you then advocating for having

{
 "/": [link1, link2]
}

and give no guarantee about the two hashes being equal?

@Stebalien
Contributor

No! I'm advocating for never linking with two hashes in the same object. When migrating to a new hash algorithm, just convert all known objects to the new hash algorithm all at once (as much as possible at least). If you're worried about breaking links (and don't just want to keep the old objects around), the IPFS datastore can store two merkledags (one for each hash algorithm) behind the scenes (but it doesn't have to store the actual data twice).

Again, hashes don't usually break overnight. There will likely be a period where the old hash algorithm is slightly broken and not recommended but still secure enough to trust long enough to transition to a new algorithm.
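As a rough illustration of doing the migration programmatically, the sketch below re-hashes a small DAG bottom-up and records an old-key to new-key mapping. Everything here is an assumption for the example: nodes are JSON, links are `{"/": "<hexdigest>"}` with no path, and `put` and `migrate` are hypothetical helpers over plain dicts rather than any real IPFS datastore API.

```python
import hashlib
import json
from typing import Any, Callable, Dict

Store = Dict[str, bytes]


def put(store: Store, node: Any, hasher: Callable) -> str:
    """Serialize a node deterministically and store it under its digest."""
    raw = json.dumps(node, sort_keys=True).encode("utf-8")
    key = hasher(raw).hexdigest()
    store[key] = raw
    return key


def migrate(key: str, old: Store, new: Store, hasher: Callable,
            mapping: Dict[str, str]) -> str:
    """Re-hash the node at `key` and everything it links to; return the new key."""
    if key not in mapping:
        node = json.loads(old[key].decode("utf-8"))
        rewritten = _rewrite(node, old, new, hasher, mapping)
        mapping[key] = put(new, rewritten, hasher)
    return mapping[key]


def _rewrite(value: Any, old: Store, new: Store, hasher: Callable,
             mapping: Dict[str, str]) -> Any:
    """Copy a value, replacing each link with a link to the migrated child."""
    if isinstance(value, dict):
        if set(value) == {"/"} and isinstance(value["/"], str):
            return {"/": migrate(value["/"], old, new, hasher, mapping)}
        return {k: _rewrite(v, old, new, hasher, mapping) for k, v in value.items()}
    if isinstance(value, list):
        return [_rewrite(v, old, new, hasher, mapping) for v in value]
    return value


old_store: Store = {}
new_store: Store = {}
leaf = put(old_store, {"data": "test"}, hashlib.sha256)
root = put(old_store, {"name": {"/": leaf}}, hashlib.sha256)

mapping: Dict[str, str] = {}
new_root = migrate(root, old_store, new_store, hashlib.sha3_256, mapping)
```

The `mapping` dict plays the role of the behind-the-scenes index: old links can still be answered by translating them to the new keys.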

@nicola
Member Author

nicola commented Aug 9, 2016

The problem you are describing is transitioning after the breakage.
The problem I was originally trying to solve is writing multiple hashes up front, as a safeguard against breakage.

I think we have all agreed that this should be handled in user-space (or at least to do some research if this is what the user is fine with)

@Stebalien
Contributor

My point is that this just isn't worth designing for at the spec level. Basically, there are three cases:

  1. sha256 is never broken.
  2. sha256 is eventually broken but sha3 is never (not ever) broken.
  3. sha256 and sha3 are both broken eventually (more generally, all members of the chosen set of hash functions are eventually broken).

This proposal only applies to case 2 which, personally, I find a highly unlikely case. So, given that:

  1. Hash algorithm breaks are few and far between.
  2. We have no reason to believe that, given a set of hash functions, at least one is likely to last forever.

We might as well just assume that we'll eventually have to transition from one hash function to another and deal with it when we get to it.


Basically, this proposal doesn't fix the problem, it just postpones it.

@nicola
Member Author

nicola commented Aug 10, 2016

🎉 Perfect, we are all on the same page to not support multiple links

@jbenet
Contributor

jbenet commented Aug 11, 2016

+1

(Someone will complain about this eventually, so let's make sure to have a
good example of how to do it in userland)
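One possible shape for such a userland example, following the array-of-links layout suggested earlier in the thread. This is only a sketch under stated assumptions: the link string format `"<fn>:<hexdigest>"`, the in-memory `store`, and the helpers `make_link` and `resolve_any` are all hypothetical, and the caller supplies the priority order of hash functions.

```python
import hashlib
from typing import Dict, List, Optional, Sequence, Tuple

# Hash functions this sketch can verify.
HASHERS = {"sha2-256": hashlib.sha256, "sha3-256": hashlib.sha3_256}


def make_link(fn: str, data: bytes) -> Dict[str, str]:
    """Build a link object of the assumed form {"/": "<fn>:<hexdigest>"}."""
    return {"/": f"{fn}:{HASHERS[fn](data).hexdigest()}"}


def resolve_any(links: Sequence[Dict[str, str]], store: Dict[str, bytes],
                preference: Sequence[str]) -> Optional[bytes]:
    """Resolve a list of alternative links, trying hash functions in preference order."""
    by_fn: Dict[str, List[Tuple[str, str]]] = {}
    for link in links:
        fn, _, expected = link["/"].partition(":")
        by_fn.setdefault(fn, []).append((link["/"], expected))
    for fn in preference:
        if fn not in HASHERS:
            continue  # cannot verify, so do not trust
        for key, expected in by_fn.get(fn, []):
            data = store.get(key)
            if data is not None and HASHERS[fn](data).hexdigest() == expected:
                return data
    return None


content = b"test"
node = {"name": [make_link("sha2-256", content), make_link("sha3-256", content)]}
store = {link["/"]: content for link in node["name"]}

resolved = resolve_any(node["name"], store, ["sha2-256", "sha3-256"])
resolved_sha3_only = resolve_any(node["name"], store, ["sha3-256"])
resolved_untrusted = resolve_any(node["name"], store, [])
```

Dropping a hash function from `preference` is how an application would stop trusting it; as discussed above, nothing here guarantees that the alternative links point to the same content.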

@daviddias daviddias added the status/deferred Conscious decision to pause or backlog label Mar 19, 2018
@rvagg
Member

rvagg commented Aug 14, 2019

Closing due to staleness as per team agreement to clean up the issue tracker a bit (ipld/team-mgmt#28). This doesn't mean this issue is off the table entirely, it's just not on the current active stack but may be revisited in the near future. If you feel there is something pertinent here, please speak up, reopen, or open a new issue. [/boilerplate]

@rvagg rvagg closed this as completed Aug 14, 2019