Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the value of the base context normative. #1158

Merged
merged 4 commits into from
Jun 30, 2023
Merged

Conversation

msporny
Copy link
Member

@msporny msporny commented Jun 19, 2023

This PR implements two resolutions made during the last special topic call:

  1. https://www.w3.org/2017/vc/WG/Meetings/Minutes/2023-06-13-vcwg-special#resolution1
  2. https://www.w3.org/2017/vc/WG/Meetings/Minutes/2023-06-13-vcwg-special#resolution2

Namely:

Resolution #1: The v2 context URL will remain normative (https://www.w3.org/ns/credentials/v2), its value will be made normative through the use of a hash digest.

Resolution #2: Add issue markers saying that the value of the hash digest for the v2 context may change before PR and that’s expected.

One editorial fix the Editor's might make after this PR is to move the Base Context section from the appendix into the main specification (given that having a normative appendix section is a bit strange).


Preview | Diff

Copy link
Member

@iherman iherman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with the changes proposed in this PR, hence my approval.

However, with regard to your comment:

One editorial fix the Editor's might make after this PR is to move the Base Context section from the appendix into the main specification (given that having a normative appendix section is a bit strange).

I am a -1 to that. There is a difference between saying that the URL of the context file is "normative" (also reinforced through the hash value) and listing the content of the context file in the specification, which makes every line of the context normative as well. This leads to the sort of confusion that I have described, for example, in #1103 (comment):

If we relied on @context […] as the source of that normative information, then we hit a bunch of additional questions like: what is the status of the mapping of nonce to https://w3id.org/security#nonce v.a.v. the security specification? How do we account for mappings like name to https://schema.org/name? In essence: which part of the @context is normative and which one is not? Etc.

My proposal is not to include the context file in the specification at all. The URL of the context file is de-referencable, and we can also be sure that one will download the right file if the hash value is checked. By keeping only the reference in the spec, i.e., by being silent about the normativeness of the content of the @context file, we avoid all these issues, we avoid the set of open PR-s that resulted from that confusion, etc.

My 2 cents...

Copy link
Member

@TallTed TallTed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

<h3>Base Context</h3>

<p class="issue" title="(AT RISK) Hash values might change during Candidate Recommendation">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there an issue we can tag to tack this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's necessary to create an issue to track this statement. It creates more work for everyone w/o really having any upside. What's the issue going to say other than "the hash values might change" -- there's nothing to be discussed/debated there -- it's just a statement of fact.

Not every issue marker needs to map to an issue. This is just guidance to readers and implementers.

@brentzundel
Copy link
Member

I agree that having a normative appendix is weird. Could this language be added instead to the @context section?
Just a thought, I approve either way.

@msporny
Copy link
Member Author

msporny commented Jun 22, 2023

I agree that having a normative appendix is weird. Could this language be added instead to the @context section?

Yes, excellent idea... that's probably the ideal location for this text. I will update the PR to move the text to that location.

Copy link
Contributor

@OR13 OR13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be aligned with normative requirements of data integrity and sub resource integrity.

index.html Outdated
Implementations MUST ensure that the base context value, located at
<code>https://www.w3.org/ns/credentials/v2</code>, matches the following
SHA-256 digest of the value:
<strong><code>dc28cf86d8a3199beb9d14e29d1da662346b8660abf2edca873098c88ca2a457</code></strong>.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the same approach for data integrity and context integrity see #1140

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rational for "why" alignment matters.

Implementers are required to load dependencies to support normative statements, when implementing the specification.

It should be a design goal of this working group to reduce the amount of vendor specific libraries (non standard implementations) that an implementation must use to implement this standard.

If implementers are already required to load a library that supports multibase (even a trivial library implemented by a startup), then they should be able to use that library to implement support for normative requirements the same way, everywhere they show up.

Here is a counter example (showing what not to do):

  • v2 core context secured with sha512 hex encoded.
  • resourceIntegrity feature secured with sha384 (base64url encoded)
  • proofValue in data integrity secured with multibase (base58btc encoded)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rational for "why" alignment matters.

Implementers are required to load dependencies to support normative statements, when implementing the specification.

It should be a design goal of this working group to reduce the amount of vendor specific libraries (non standard implementations) that an implementation must use to implement this standard.

If implementers are already required to load a library that supports multibase (even a trivial library implemented by a startup), then they should be able to use that library to implement support for normative requirements the same way, everywhere they show up.

Here is a counter example (showing what not to do):

  • v2 core context secured with sha512 hex encoded.
  • resourceIntegrity feature secured with sha384 (base64url encoded)
  • proofValue in data integrity secured with multibase (base58btc encoded)

i agree in principle and think that there is a desire to have this pr (or a follow on to it) to align with #1140 and vice versa. This would mean that our primary normative doc (the spec itself) is consistent.

regarding the reference to proof value (i believe only a reference from core data perspective in the context) i think that this should be handled separately (e.g. what goes in the context since we agreed it is normative?)

Broadly, i think the questions of encodings for data integrity and what is done under the data integrity spec belong there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(e.g. what goes in the context since we agreed it is normative?)

I agree, but unfortunately, we are making the context normative with logs of things already in it... so your solution does not work.

Broadly, i think the questions of encodings for data integrity and what is done under the data integrity spec belong there.

I also agree with this, but the working group disagreed, when I attempted to separate things.

So now we have to make them consistent, or someone else needs to try to separate them in a way so that they can diverge without causing further implementer burden.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OR13s/logs of things already in it/lots of things already in it/ (that is, logs->lots)?

index.html Outdated Show resolved Hide resolved
index.html Outdated Show resolved Hide resolved
Copy link
Contributor

@mprorock mprorock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@msporny minor adjustments suggested here to bring things in line with SRI encoding of digest values and examples in integrity metadata - also brings this in line with latest updates (3ecd1b8) on #1140

@msporny
Copy link
Member Author

msporny commented Jun 27, 2023

@msporny minor adjustments suggested here to bring things in line with SRI encoding of digest values and examples in integrity metadata - also brings this in line with latest updates (3ecd1b8) on #1140

I'm fine w/ the changes and will approve when @OR13 accepts these changes.

@OR13
Copy link
Contributor

OR13 commented Jun 27, 2023

I assume the base64 vs base64 url is correct?

@OR13
Copy link
Contributor

OR13 commented Jun 27, 2023

I will approve if changes are applied.

@dlongley
Copy link
Contributor

dlongley commented Jun 27, 2023

We should express the hash in a way that's compatible with #1140 (to make it easier on developers who have implemented that mechanism). Notably, we could express the hash in multiple ways or add new ways editorially in the future as needed -- and this doesn't cause implementation burden for anyone.

index.html Outdated
Comment on lines 4940 to 4942
Implementations MUST ensure that the base context value, located at
<code>https://www.w3.org/ns/credentials/v2</code>, matches the following
SHA-256 digest of the value:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are implementations supposed to do if they don't match?

Copy link
Member Author

@msporny msporny Jun 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the base context... almost certainly throw an error (if you are not using a cached copy, which all VC processors are expected to use).

@jyasskin, we did have approval to merge this on a call today, would you prefer that we don't and address this particular concern before we merge this PR? Or would you be ok w/ an issue marker that tracks the concern?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any opinion about the timing of fixing this (i.e. feel free to merge and just queue the issue), but I think the spec will need to eventually tell the processors what to do instead of just saying they need to check. This might need a change to 5.2.5 in https://www.w3.org/TR/json-ld-api/#algorithm to make it possible to check contexts against a hash value when fetching them, or you could possibly require that all implementations treat this URL as "previously dereferenced" (step 5.2.4) with a particular value, in which case you don't need the hash at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the URL is previously dereferenced, it doesn't need to be a URL, it could be a URN.

Context needs to dereference to the same bytes... Context identifier does not actually matter...

Using a URN, or some other IRI would also protect W3C from being forced to serve the JSON-LD file that is always assumed to be previously dereferenced over and over again.

Making the value a URL encourages traffic, and implies it will be resolvable forever... It's not just a security thing, there is a cost / sustainability component to how this is made normative as well.

Copy link
Contributor

@dlongley dlongley Jun 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OR13,

If the URL is previously dereferenced, it doesn't need to be a URL, it could be a URN.

We've had this discussion before (setting aside also that a "URN" is now also a "URL" per WHATWG). To increase interoperability in the linked data ecosystem, it should not be a URN. Only the people who come, read, and implement the VCDM spec will be able to (and should be required to, IMO) treat the context as already dereferenced. We should not block other processors written by people who haven't read the spec from being able to consume VC data as general linked data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jyasskin wrote:

What are implementations supposed to do if they don't match?

Added an issue marker for your concern here:

5505cd8

Co-authored-by: Mike Prorock <mprorock@users.noreply.github.com>
@msporny msporny requested a review from OR13 June 27, 2023 21:22
index.html Outdated Show resolved Hide resolved
@iherman
Copy link
Member

iherman commented Jun 28, 2023

The issue was discussed in a meeting on 2023-06-27

  • no resolutions were taken
View the transcript

1.1. Make the value of the base context normative. (pr vc-data-model#1158)

See github pull request vc-data-model#1158.

Michael Prorock: Suggest a quick diversion to 1158 since its related to 1140. There are a couple change I suggested to bring the language into alignment. If we are talking about hashing.
… we should use the same mechanism with the core data model.
… 1158 as is is inconsistent with 1140. I provided text to make those two consistent.

Kristina Yasuda: lets talk about 1158.

Manu Sporny: fine with these changes. Concern that it sounds like an argument is constructed based on these changes that puts us on a slippery slope.
… This changes argues over how we express the hash as base64 or hexidecimal. We can make this change. +1, but I have a feeling its not the last we will hear about this.
… Is this part of a larger suggest to change the way we represent hashes and other values across our specifications.

Kristina Yasuda: How is this a slippery slope.

Manu Sporny: The next argument could make that all DI specification need to follow the SRI encoding format.

Orie Steele: CEPC lets assume positive intent.

Dave Longley: -1 to doing anything that would prohibit or align the use of digestMultibase.

Kristina Yasuda: clarify that the scope is only the VC DM. Would not apply to data integrity.

Dave Longley: +1 to Orie, let's assume positive intent, there's no goal here to prohibit or malign digestMultibase through this action.

Michael Prorock: This PR is solely to bring it up to 384 and encode the digest according to the SRI spec. this is focused just on the core data model.

Manu Sporny: As long as what MikeP is saying holds, we're all good here. :).

Kristina Yasuda: thanks Orie and dave re positive intent comments.

Orie Steele: I am still unsure if we can normatively depend on multiformats / multibase in a W3C TR... but that is not what resource integrity or securing the core context is about.

Dave Longley: Orie: we can, and it's already been done in the past (e.g., DID core).

Orie Steele: agree with mike prorock. When we look at resource integrity for context, context is a resource. we found a way to make the context values normative.
… by specifying a cryptographic hash function. many ways to represent hashes. Data integrity proof relies on multiformat. totally separate from describing how to secure resource securely.
… in the core data model.

Manu Sporny: Good, I'm glad we have confirmation from both Orie and MikeP on those points.

Manu Sporny: We're good to merge.

Manu Sporny: (afaict).

Dave Longley: What Orie said is why this was so related to #831 -- and does some of what it was trying to do.

Orie Steele: I suspect we may see formal objections in the "hash" is used to make "down refs" normative, but we will see.

Kristina Yasuda: PR 1158 we have agreement to align with 1140.
… we have 15 minutes left. Any other PRs related to this topic?

Orie Steele: Can we close the PRs that are not getting consensus?

Kristina Yasuda: are we OK to bring up 1100 and 1101?

Brent Zundel: Can we get a sense of where each of the PRs are? we have 15 open PRs. Can we go through each of the PRs briefly?

Kristina Yasuda: WIll take up PRs where we have authors on the call.

Copy link
Contributor

@OR13 OR13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming we will resolve what having a normative context with the current values in it means, for processors.

Assuming we have issue markers for addressing @jyasskin comments.

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
index.html Outdated Show resolved Hide resolved
@msporny
Copy link
Member Author

msporny commented Jun 30, 2023

Normative, multiple reviews, changes requested and made (issues raised for unresolved issues), no objections, merging.

@msporny msporny merged commit f147ce5 into main Jun 30, 2023
1 check passed
@msporny msporny deleted the msporny-context-normative branch June 30, 2023 09:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants