DagCBOR: be more precise about foreign tags #227

vmx · 2019-11-28T18:07:49Z

In the DagCBOR spec, be more precise on what happens if you parse CBOR and it contains tags.

Should they be ignored?
Should they be preserved?
Should there be an error?

I suggest that we just ignore those tags and when you write DagCBOR, the only tag you ever write will be the one for CID (tag 42).

rvagg · 2019-11-28T21:05:40Z

+1 on your suggested approach

warpfork · 2019-11-30T08:59:49Z

I'd say there should be an error by default, and a "parseLoosely" method as an option.

Preserving the extra data without a way to read or to mutate it doesn't make for a pleasant user experience (nor an efficient library -- btw, cbor tags are technically allowed to contain other tags, and if you try to implement this you will be very very sad at how many allocs this can force you to make if you actually support it).

Ignoring the extra data means we're not really validating that it's DagCBOR. This is potentially dangerous. (Think: how the comment field in PDFs made it much, much easier to create hash collisions in (already weak) hashes. Even if it's not immediately broken, it's wading out towards the sharks in a very unnecessary way.)

Erroring explicitly is the choice that remains standing.

We can still also do the "parseLoosely" features in libraries in practice. This would buy us is being able to load CBOR documents that aren't exactly DagCBOR and manipulate them. But I don't think the specs themselves should hedge on this. And users should need to very much opt into such a usage: the number of cases where a user will want this, versus the number of cases where a user would be surprised how much bending-over-backwards the library is doing to support features they didn't need/want, is very leaning towards the latter, I'd bet.

rvagg · 2020-01-01T03:07:54Z

To connect to other discussions happening elsewhere (no link, I thought it was in a team meeting but perhaps in a private discussion), there's also other parts of CBOR which can be "loose", e.g. the number 255 can be one of 18FF, 1900FF, 1A000000FF, 1B00000000000000FF. So while we're at considering what to do with tags outside of 42 we should also be considering whether we want there to be only one way of encoding data with CBOR for valid DagCBOR and perhaps our parsers all need to have a "strict" mode so that we can enforce that.

I think the remaining discussion on this is regarding how much it's worth the effort compared to everything else we're trying to do? It obviously has dedupe benefits but I'm still a little sceptical that the other reasons for doing this raise above the level of "it's just cleaner!".

/cc @ribasushi

ribasushi · 2020-01-06T15:03:25Z

@rvagg actually deduplication is not the primary reason for wanting this. It is a part of maintaining the symmetry of "this cryptographic id represents this thing, and this thing is only represented by this cryptographic id" ( theoretical limits notwithstanding )

Yes, it is trivial to add a single padding byte somewhere and change every id. But it should not be possible to express bit-for-bit identical content under multiple cryptographic IDs. The simplest use-case I can think of are takedown requests.

Again it comes back to what is actually being built: a product or a platform. If the latter - I strongly believe any low-hanging fruit allowing disambiguation should be baked into the lowest levels of the fabric of said platform.

rvagg · 2020-01-20T22:04:52Z

deterministic map ordering is in this rough category too, discussed in https://github.com/ipld/team-mgmt/blob/8d612e3ce1222de9dd531ca14bdeeb662bd2a3a1/meeting-notes/2020/2020-01-20--ipld-sync.md#notes, we need to make sure our codecs document rules for these while we're documenting these other "strict" encoding/decoding details.

warpfork · 2020-01-22T09:54:32Z

Sounds like we should prepare to have a Data Model Spec Sprint Day or something...? I think we're starting to accumulate a decent number of these things and we should get em all on an agenda and knock em out.

mikeal · 2020-01-22T17:58:56Z

ya, we need another close-a-thon in the specs repo in general, but it would probably be most productive to have at least one session focused on Data Model only.

vmx · 2020-01-22T18:02:12Z

@mikeal it's more a "getting-a-lot-done" rather then closing. Most open things are valid things that need some work (in the previous round we were able to close old/out of scope things).

jonnycrunch · 2020-01-22T23:18:48Z

so, just to give my thoughts on this subject.

In my use-case for the IPID DID spec that uses IPLD, it would be very nice to support tag 98 (COSE signing) of CID as tag 42. This provides me with all of the semantics and tooling necessary to accomplish my goal natively in CBOR and IPLD. I'm sure this could also be a method to help with data integrity for DAG sync. Also, @hsanjuan might find this approach interesting for shared clustering.

98(
  [
    / protected / h'a10300' / {
        \ content type \ 3:0
      } / , 
    / unprotected / {}, 
    / payload /  42: 'bafyreibtkfbzqiwdhd5bv53wrag43jbz4ytyixigde6wa5zuddxciq7s2m', 
    / signatures / [
      [
        / protected / h'a10126' / {
            \ alg \ 1:-7                    \\  alg = ECDSA 256 \\
          } / , 
        / unprotected / {
          / kid / 4:'did:ipid:12D3KooWDXDEZtLW5k16DjeHWkFZEeTKUF7PnzzY7m2UocmduhZV;/publicKey/0'
        }, 
        / signature / h'e2aeafd40d69d19dfe6e52077c5d7ff4e408282cbefb5d06cbf414af2e19d982ac45ac98b8544c908b4507de1e90b717c3d34816fe926a2b98f53afd2fa0f30a'
      ]
    ]
  ]
)

Ref: #227

rvagg · 2020-01-23T06:34:12Z

#236

Ref: #227

Ref: ipld#227

rvagg mentioned this issue Jan 20, 2020

CAR format spec doc #230

Closed

rvagg added a commit that referenced this issue Jan 23, 2020

DagCBOR: tighten up spec, additional strictness requirements

55435dd

Ref: #227

rvagg added a commit that referenced this issue Jan 23, 2020

DagCBOR: tighten up spec, additional strictness requirements

4ddc9d0

Ref: #227

rvagg added a commit that referenced this issue Jan 23, 2020

DagCBOR: tighten up spec, additional strictness requirements

9bf725c

Ref: #227

rvagg added a commit that referenced this issue Jan 23, 2020

DagCBOR: tighten up spec, additional strictness requirements

d5ea2a7

Ref: #227

rvagg mentioned this issue Jan 23, 2020

DagCBOR: tighten up spec, additional strictness requirements #236

Merged

rvagg added a commit that referenced this issue Mar 31, 2020

DagCBOR: tighten up spec, additional strictness requirements

528714d

Ref: #227

rvagg added a commit that referenced this issue Mar 31, 2020

DagCBOR: tighten up spec, additional strictness requirements

51eb2f6

Ref: #227

prataprc pushed a commit to iprs-dev/ipld-specs that referenced this issue Oct 13, 2020

DagCBOR: tighten up spec, additional strictness requirements

7d8b896

Ref: ipld#227

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DagCBOR: be more precise about foreign tags #227

DagCBOR: be more precise about foreign tags #227

vmx commented Nov 28, 2019

rvagg commented Nov 28, 2019

warpfork commented Nov 30, 2019

rvagg commented Jan 1, 2020

ribasushi commented Jan 6, 2020

rvagg commented Jan 20, 2020

warpfork commented Jan 22, 2020

mikeal commented Jan 22, 2020

vmx commented Jan 22, 2020

jonnycrunch commented Jan 22, 2020

rvagg commented Jan 23, 2020

DagCBOR: be more precise about foreign tags #227

DagCBOR: be more precise about foreign tags #227

Comments

vmx commented Nov 28, 2019

rvagg commented Nov 28, 2019

warpfork commented Nov 30, 2019

rvagg commented Jan 1, 2020

ribasushi commented Jan 6, 2020

rvagg commented Jan 20, 2020

warpfork commented Jan 22, 2020

mikeal commented Jan 22, 2020

vmx commented Jan 22, 2020

jonnycrunch commented Jan 22, 2020

rvagg commented Jan 23, 2020