Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use digestMultibase with base64-url-nopad encoding for integrity. #1490

Closed
wants to merge 1 commit into from

Conversation

msporny
Copy link
Member

@msporny msporny commented May 27, 2024

This PR is an attempt to address issue #1489 by choosing a specific digest mechanism (Multihash locked to SHA2 and SHA3) with a specific base encoding format (Multibase locked to base64-url-nopad). IOW, the WG could not come to consensus on Option C, the current spec implements Option A (which does not have consensus), and this PR implements Option B.


Preview | Diff

@brianorwhatever
Copy link

I don't see where sha2 or sha3 is specified. Am I missing it somewhere?

@msporny
Copy link
Member Author

msporny commented May 27, 2024

I don't see where sha2 or sha3 is specified. Am I missing it somewhere?

It's specified in the reference to the Data Integrity spec in this section:

https://pr-preview.s3.amazonaws.com/w3c/vc-data-model/pull/1490.html#integrity-of-related-resources

Which refs this section:

https://www.w3.org/TR/vc-data-integrity/#resource-integrity

Which refs this section:

https://www.w3.org/TR/vc-data-integrity/#multihash

which defines SHA-2 and SHA-3 usage via Multihash header values. I admit that it's a fairly twisty path to get there, but the normative refs do exist.

@iherman
Copy link
Member

iherman commented May 28, 2024

Some comments on things that should also be done (eventually) if the PR is accepted (possibly as part of this PR):

  • There is no need to refer to SRI anymore. This means
    • Remove references to SRI in §B1. The definition for the context's digest should rely on digestMultibase
    • The whole section on sriString dataype (i.e., §B3.1 ) removed
  • The vocabulary (and its documentation) should also change:
    • Removal of the sriString datatype from the vocabulary (ie, §4.3)
    • Removal of the digestSRI property (including from the vocabulary diagram), i.e., §4.1.4

I also wonder whether the definition of the digestMultibase property should not be moved to the controller document, rather than being kept in the DI document. The role of the controller document is to provide a clean pattern of normative references, and we agreed that the VCDM should not normatively depend on the DI spec.

@iherman
Copy link
Member

iherman commented May 28, 2024

I don't see where sha2 or sha3 is specified. Am I missing it somewhere?

It's specified in the reference to the Data Integrity spec in this section:

https://pr-preview.s3.amazonaws.com/w3c/vc-data-model/pull/1490.html#integrity-of-related-resources

Which refs this section:

https://www.w3.org/TR/vc-data-integrity/#resource-integrity

Which refs this section:

https://www.w3.org/TR/vc-data-integrity/#multihash

which defines SHA-2 and SHA-3 usage via Multihash header values. I admit that it's a fairly twisty path to get there, but the normative refs do exist.

See also my remark in #1490 (comment): this dependency on the DI spec may not be what we want.

Also, and to refer to @brianorwhatever's question: I think adding a non-normative reference (e.g., as part of a note) to the SHA algorithms in the VCDM would be helpful, IMHO. For a general reader, going down this "twisty path" should be avoided, IMHO.

@msporny
Copy link
Member Author

msporny commented May 28, 2024

@iherman wrote:

we agreed that the VCDM should not normatively depend on the DI spec.

We definitely DID NOT agree to that. :)

We agreed to the opposite, which is why this normative "RECOMMENDED" text exists in the VCDM around securing mechanisms:

https://w3c.github.io/vc-data-model/#securing-mechanisms

VCDM has a normative dependency on VC-JOSE-COSE and VC-DATA-INTEGRITY, on purpose, because we wanted to RECOMMEND at least the securing mechanisms that the group was normatively defining. The group felt that if we didn't do that, there was no normative way to actually make the credential "verifiable". We did not do this for the v1.0 and v1.1 work and did not want to continue making the mistake of not normatively defining how to secure a VC. That ship sailed last year some time.

There seem to be a few people in the WG that keep forgetting this, so I wanted to make sure to remind everyone of where we got to well BEFORE we went into CR.

@iherman
Copy link
Member

iherman commented May 28, 2024

@iherman wrote:

we agreed that the VCDM should not normatively depend on the DI spec.

We definitely DID NOT agree to that. :)

We agreed to the opposite, which is why this normative "RECOMMENDED" text exists in the VCDM around securing mechanisms:

I stand corrected :-)

Nevertheless: the controller document already has items on Multihash and Multibase (and the datatype for the latter), I still think that it would be a more natural place for the definition of digestMultibase. But I definitely will not lie down the road over this...

@brentzundel
Copy link
Member

mesur can't yet support this PR, as our implementation uses digestSRI, but I have started a conversation internally to discuss switching.

@msporny msporny added normative The PR is a normative change to the CR specification CR1 This item was processed during CR1 labels May 29, 2024
@iherman
Copy link
Member

iherman commented May 29, 2024

The issue was discussed in a meeting on 2024-05-29

  • no resolutions were taken
View the transcript

4.2. Use digestMultibase with base64-url-nopad encoding for integrity. (pr vc-data-model#1490)

See github pull request vc-data-model#1490.

Manu Sporny: the other one is the question around digest values.
… we have 3 options that have been put in front of the group.

Manu Sporny: digest options: #1489 (comment).

Manu Sporny: A. use digestSRI.
… B. use digestMultibase.
… C. use both.
… all of these have objections.
… so we need to make a decision.
… I'll note that the folks arguing for digestSRI want it to be base64url...but SRIs in browsers are not URL encoded, just base64.
… the same folks object to digestMultibase.
… pretty much no one wants both...
… so we need to have a path through these.

Brent Zundel: (representing measur.io, not as a chair) I believe the digestSRI stuff is in the spec because Mike Prorock wanted it because Mesur.io was implementing it.
… I've opened an internal conversation to ask about how we're using it.
… we're supporters of it currently, so would not want it removed.
… but I will gather more data.
… any other thoughts on this PR?

Dave Longley: +1 to the PR ... approved already.

Manu Sporny: it would be good to get feedback on the PR.
… I will note that multibase is definitely being used by DHS.
… among those were Drivers Licenses and Employment Authorization Documents.
… multibase stuff is also in production in TruAge.
… it is now very much defined in a specification and is in use in production.
… that's where we are with that technology.
… we would prefer to pick one, but not object if the group picked both.
… it's annoying if the group picked both, but still doable.
… and not hard to implement either one.

Brent Zundel: thank you, manu.

Copy link
Member

@brentzundel brentzundel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mesur has a customer requirement to use digestSRI, so we unfortunately need to object to its removal.

@msporny
Copy link
Member Author

msporny commented May 30, 2024

mesur has a customer requirement to use digestSRI

To be clear, Mesur has a customer that is insisting that SRI be used? Or is their requirement that a digest value be available to be checked?

so we unfortunately need to object to its removal.

Thanks for checking with your team, @brentzundel.

Would mesur.io, or anyone else reviewing this PR, object to any of the alternatives below?

  1. Supporting either digestSRI or digestMultibase in the core spec (implementers can choose to support either one, or both)?
  2. Moving the definition of resourceIntegrity and digestSRI to the JOSE-COSE and keeping digestMultibase in the Data Integrity specification, and supporting all options in the VCDM v2 JSON-LD Context? (implementers can choose to support either one, or both)?

Are there other options that could gain broader consensus that Mesur can propose?

@brentzundel
Copy link
Member

I'm not opposed to either of the options you've outlined @msporny

@msporny
Copy link
Member Author

msporny commented May 30, 2024

Nevertheless: the controller document already has items on Multihash and Multibase (and the datatype for the latter), I still think that it would be a more natural place for the definition of digestMultibase.

IMHO, Multibase, Multihash, Multikey, and digestMultibase belong in the Data Integrity spec, because they're about "data integrity". :)

For political reasons, they now exist in the Controller Document because some of the proponents of the Controller Document do not, under any circumstance, want to refer to the Data Integrity specification (even though they are already doing that in VCDM v2). Multihash isn't used at all in the Controller Document. Multibase is used to encode Multikeys, which are defined in the Controller Document (but are probably better defined in Data Integrity). Again, a definition of Multibase in Data Integrity with a reference from the Controller Document to Data Integrity would be better.

We have what we have now, and I don't think any of the proponents of Data Integrity will have an issue with normatively referring to the Controller Document spec for whatever they need, but I wanted to highlight that the current structure is subpar from a technical/references perspective.

@iherman
Copy link
Member

iherman commented May 31, 2024

  1. Supporting either digestSRI or digestMultibase in the core spec (implementers can choose to support either one, or both)?
  2. Moving the definition of resourceIntegrity and digestSRI to the JOSE-COSE and keeping digestMultibase in the Data Integrity specification, and supporting all options in the VCDM v2 JSON-LD Context? (implementers can choose to support either one, or both)?

I would vote for (1). The integrity of an external resource is not an issue of the securing mechanism, but the credential itself. I do not see any reason for binding the approach to that mechanism.

The question that I raised in #1490 (comment) would still be valid, though. If we have both digestSRI and digestMultibase, they should be defined alongside one another. I do not really care whether that is the VCDM or the controller document; maybe the former requires less work.

@iherman
Copy link
Member

iherman commented May 31, 2024

We have what we have now, and I don't think any of the proponents of Data Integrity will have an issue with normatively referring to the Controller Document spec for whatever they need, but I wanted to highlight that the current structure is subpar from a technical/references perspective.

I know :-). It is as it is now, and we should make it as clean as possible. That is my only motivation...

@msporny
Copy link
Member Author

msporny commented May 31, 2024

@iherman wrote:

The integrity of an external resource is not an issue of the securing mechanism, but the credential itself.

While we're dancing on the head of a pin :), Data Integrity is not purely a securing mechanism. It is a collection of technologies, that includes a securing mechanism, that helps one secure graphs of information.

The "data integrity" security happens at a lower level than the credential (though the credential might tie multiple proofs together, which is what relatedResource can do). These are among the things can provide data integrity and they are independent things from relatedResource:

  • cryptographic hashes
  • cryptographic signatures
  • proof of work
  • proof of stake
  • proof of set membership

... and so on. You can, via the use of id and digestMultibase and/or proof, protect the integrity of data in a graph without having to use relatedResource at all. It's, debatably, useful to protect @contexts, but only because the group decided that they're not going to allow hashlink-based security on@context values, and some implementers don't think that static contexts are "secure enough".

Additionally, the relatedResource feature is applicable to ANY bounded graph of information, not just "credentials". So, we shouldn't presume that "credentials" are the only thing that can express "related resources" and the more proper thing to do would be to define "related resource" in the Data Integrity specification as well. That would be the right thing to do if we didn't have a small number of WG Members that insist on objecting to us logically grouping these things in a way that made the most sense based on a misguided notion that NOT normatively referencing Data Integrity via the Controller Document is going to achieve something useful. The only thing it's achieving at this point is establishing the definitions in a place that no one will object to, maybe.

I know :-). It is as it is now, and we should make it as clean as possible. That is my only motivation...

Well, we're going to do something that is not as clean as possible, and is some hacky-thing that no one will object to (hopefully)... :) ... but, I get what you're saying, which I think is:

  • Define Multibase, Multihash, and Multikey in Controller Document.
  • Define relatedResource in VCDM v2.
  • Define digestMultibase and digestSRI in Controller Document.

We could alternative define digestMultibase and digestSRI in VCDM, OR define digestMultibase in Data Integrity and digestSRI in VC-JOSE-COSE.

Any thoughts on where we define digestMultibase and digestSRI?

@brentzundel
Copy link
Member

fwiw, I have a slight preference for defining them in VCDM

@iherman
Copy link
Member

iherman commented May 31, 2024

I know :-). It is as it is now, and we should make it as clean as possible. That is my only motivation...

Well, we're going to do something that is not as clean as possible, and is some hacky-thing that no one will object to (hopefully)... :) ... but, I get what you're saying, which I think is:

  • Define Multibase, Multihash, and Multikey in Controller Document.

Yes.

  • Define relatedResource in VCDM v2.

Yes.

  • Define digestMultibase and digestSRI in Controller Document.

We could alternative define digestMultibase and digestSRI in VCDM,

Yes. Both are possible, and I do not have strong feeling about this either way. The only point is that they should be at the same place, i.e.,

OR define digestMultibase in Data Integrity and digestSRI in VC-JOSE-COSE.

is what I would not want. After all, all the usages of digestMultibase could be done with digestSRI as well.

Any thoughts on where we define digestMultibase and digestSRI?

As I said, I do not have a preference either way. I see that @brentzundel has a (slight) preference for VCDM, and that is o.k. with me.

@msporny
Copy link
Member Author

msporny commented May 31, 2024

@iherman wrote:

After all, all the usages of digestMultibase could be done with digestSRI as well.

No, that's not correct. :)

digestMultibase can do everything digestSRI can do. The reverse is not true; digestSRI cannot do everything that digestMultibase can do. I wonder if others in the WG don't realize this either. This is not a comparison between two equally-matched features.

digestSRI is limited to things standardized 23 years ago, during the early 2000s; it cannot do anything beyond SHA-2, and it can't be expressed in anything other than base64pad, and the VCWG has no control over the ability to modernize it. These limitations with digestSRI are documented under Option A.

Following @brentzundel and @iherman's slight preferences, I'll go ahead and raise another PR to add digestMultibase back in to the spec, in parallel to digestSRI, in the VCDM v2 specification. I will note that this is exactly what PR #1484 did before I was asked to back out the change, and I expect the same people that objected before to object again. At that point, we'll be in a position to straw poll all three options, call for FOs for each option, see if any other options appear as ones that won't draw FOs, pick something and then move on.

@msporny msporny added the DO NOT MERGE PR contains something that should not be merged. label May 31, 2024
@iherman
Copy link
Member

iherman commented Jun 5, 2024

The issue was discussed in a meeting on 2024-06-05

  • no resolutions were taken
View the transcript

5.2. Use digestMultibase with base64-url-nopad encoding for integrity. (pr vc-data-model#1490)

See github pull request vc-data-model#1490.

Manu Sporny: other thing we need to decide on is what digest formats we will support for related resource formats.
… only support digestSri has objections that would turn into formal objections, same with only supporting digestMultibase.

See github pull request vc-data-model#1492.

Manu Sporny: have not heard if anyone would formally object to supporting both, is in PR 1492.
… at this point, need to ask group if anyone would object to supporting both digestSRI and digestMultibase.

Brent Zundel: the two options that are before the group that have a chance of avoiding formal objection are.
… a, get rid of description of related resource.
… b, support both, allow implementations to choose on digestSRI vs digestMultibase.
… people don't like option b, but does anyone dislike it enough to formally object?

Phillip Long: +1 to let the market decide.

Brent Zundel: not hearing or seeing anyone object to the option to include both.
… that is what we have decided to do.

Manu Sporny: thank you, we will merge 1492, supporting both formats, to give more data to the group. I had to implement this this past weekend and it took 30min.
… had to update respec-vc to generate hashes in both formats, took 30 mins, was not difficult.

Ted Thibodeau Jr.: which digest form were you adding?

Manu Sporny: both, added support for every variation of both digest formats.

Brent Zundel: additional PRs we should look at?

Manu Sporny: yes, next set, 6 of these to remove at risk issue markers (confidenceMethod, renderMethod, refreshService, terms of use).

@msporny
Copy link
Member Author

msporny commented Jun 9, 2024

This PR has been superseded by PR #1492, which has been merged, closing.

@msporny msporny closed this Jun 9, 2024
@msporny msporny deleted the msporny-digestMultihash branch June 9, 2024 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CR1 This item was processed during CR1 DO NOT MERGE PR contains something that should not be merged. normative The PR is a normative change to the CR specification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants