Clarify that DI should focus on VCs but ensure generality. #96

Merged
merged 3 commits into from
Mar 23, 2022

msporny
Member

@msporny msporny commented Mar 4, 2022

This PR attempts to ensure that the group doesn't lose sight of the fact that the "proofs of integrity" utilize generalized solutions (JWTs, JWPs, DI). That is, in general, we shouldn't take a generalized solution and make it so specialized to VCs that you can only ever use the solution with VCs.



@iherman
Member

iherman commented Mar 5, 2022

We are walking on thin ice here, in view of the discussions we had around the RCH (née LDS) WG chartering.

Whatever we do, if the method we develop can be generalized for other types of data, it will be done by someone, somewhere. We obviously will not say in the documents that you MUST NOT use this and this approach for something else than VC, and we can even be mindful of that when writing the spec. But I think this change would possibly open flood gates.

I would prefer not to change the text; we do not need a war over this now.

@pchampin

pchampin commented Mar 5, 2022

I share Ivan's concern. I think it wiser to keep the text as is. I don't think that it prevents the WG from ensuring the generality of the approach, or even from publishing non-normative guidelines for applying the proposed mechanisms to non-VC use cases. At least, not if enough people in the WG are willing to push in that direction. Or am I too naive?

Co-authored-by: Dave Longley <dlongley@digitalbazaar.com>
@msporny
Member Author

msporny commented Mar 5, 2022

@pchampin wrote:

Or am I too naive?

There has been a consistent multi-year effort spanning all the way back to 2016 to delay this work at W3C, and those efforts continue. One way to hobble the DI work is to hard code it to VCs (which was never the intention). I wouldn't put it past some in the group to attempt this. This text attempts to make it clear that the DI work is intended to be a generalized solution (but we will certainly focus on applying it to VCs and make sure it works there first).

@iherman wrote:

We obviously will not say in the documents that you MUST NOT use this and this approach for something else than VC

If it's obvious, we should state it just to be safe -- especially because there have been multiple interactions in the past with individuals that are hostile towards DI. :)

At the very least, we need to get it on the record that the assumption here is that the solution IS generalizable and the WG won't do anything to endanger that.

Is there alternate language that either of you would prefer to convey that?

@iherman
Member

iherman commented Mar 6, 2022

At the very least, we need to get it on the record that the assumption here is that the solution IS generalizable and the WG won't do anything to endanger that.

Is there alternate language that either of you would prefer to convey that?

I do not see why we "need" to get anything like that on the record. We can talk about this publicly at some point in the WG's life, the SW community may pick that up if it is convincing, and we will all be happy if that happens. But the charter of this WG ought to talk about VCs and nothing else.

Personally, I do not think it is worth making any change to the current charter text on this subject. At best, it buys us nothing as far as the WG goes; at worst, we are open to possible attacks from some decisive voices in the SW community when it comes to the AC vote.

Contributor

@Sakurann Sakurann left a comment

The charter is for vc-data-model-v2 and proofs of integrity covered are for VCs. The current text is sufficient.


@selfissued selfissued left a comment

I agree with @Sakurann 's assessment:

The charter is for vc-data-model-v2 and proofs of integrity covered are for VCs. The current text is sufficient.

Please close this PR without merging it.

@OR13
Contributor

OR13 commented Mar 9, 2022

If this language is not added, JWTs don't provide data integrity... explicit is better here.

index.html Outdated
@@ -237,7 +237,7 @@ <h3>
<dd>
<p>
This family of specifications consists of documents that each define how to
- express proofs of integrity for verifiable credentials using a number of
+ express and associate proofs of integrity with data, focusing on verifiable credentials and
Member Author

@msporny msporny Mar 9, 2022

Alternate #1:

Suggested change
- express and associate proofs of integrity with data, focusing on verifiable credentials and
+ express and associate proofs of integrity with well-bounded datasets, focusing on verifiable credentials and

Alternate #2:

Suggested change
- express and associate proofs of integrity with data, focusing on verifiable credentials and
+ express and associate proofs of integrity with digital documents, focusing on verifiable credentials and

Contributor

@Sakurann Sakurann Mar 9, 2022

I would probably be ok with "focusing only on verifiable credentials".

Member Author

I would probably be ok with "focusing only on verifiable credentials".

Ok, I think that compromise might be fine. To be clear, you'd be ok w/ this language?

express and associate proofs of integrity with well-bounded datasets, focusing only on verifiable credentials

(modulo an alternative to "well-bounded datasets" that everyone can agree to, such as "document-bounded datasets", as @dlongley suggested during the call today).

Contributor

@OR13 OR13 Mar 9, 2022

^ I would not be ok with this, unless we drop the data integrity work as a whole. We either do it well, or we don't do it... I am against "letting it in" and then "crippling it" by forcing it to be only about VCs.

Member Author

We either do it well, or we don't do it... I am against "letting it in" and then "crippling it" by forcing it to be only about VCs.

I agree with your desire, @OR13 -- that's exactly what I'm concerned about.

That said, what part of @Sakurann's suggestion makes you feel that adding the word "only" runs the risk of crippling the work? I guess one interpretation of "only" is that we're "forbidden to talk about Data Integrity in any context other than VCs?"... so if someone was like: "Oh, if we do that in the Data Integrity work, that'll make it impossible to do X in YZWG!", then someone could respond with "Our charter doesn't allow us to contemplate how Data Integrity applies to YZWG!".

Member

Adding the term "well-bounded datasets" would make things worse. What do we mean by "well-bounded datasets"? How would we decide/define whether dataset XYZ coming from ZZWG is well-bounded? What happens if it is: do we have then a charter obligation to fulfil the requirements of the ZZWG even if the dataset XYZ has absolutely nothing to do with VC data? A definite -1 from me adding that term.

Member Author

Adding the term "well-bounded datasets" would make things worse.

What about "documents" or "documents (well-bounded digital objects containing information)"?

Member

Define well-bounded? In any relevant context?

Member Author

Well, what about leaving the term "well bounded" out and just saying "documents" or "digital documents"? It avoids the thing DanB was concerned about wrt. "arbitrarily large graphs", right?

Member

The problem is: however we try, it is a non-trivial task to define the class of datasets for which our technologies should be applicable. We all have a vague "feeling" of what they are, but coming up with a precise terminology out of the blue for the purpose of this charter will not work. May I remind you that, in the community we are talking about, we also have very mathematical minds who would criticize such an attempt in excruciating detail (and for good reasons)? Do we need that here and now? I do not think so.

@Sakurann
Contributor

Sakurann commented Mar 9, 2022

@OR13 that is not true. JWTs are already an IETF standard and provide "data integrity". This is about the scope of Data Integrity 1.0, i.e., the former LDP, which is being standardized in this group.

@Sakurann
Contributor

Sakurann commented Mar 9, 2022

This only says that this specification will define how JWTs, LDPs, etc. work with VCs; it does not mean that JWTs, LDPs, etc. cannot work with other "data". Those are separate things and are out of scope of this specification.

@iherman
Member

iherman commented Mar 9, 2022

The issue was discussed in a meeting on 2022-03-09

  • no resolutions were taken

2.2. Clarify that DI should focus on VCs but ensure generality. (pr vc-wg-charter#96)

See github pull request vc-wg-charter#96.

Brent Zundel: PR 96 clarifies that DI should focus on VCs but ensure generality.

Manu Sporny: This came out of something Jeremie Miller said: the way I read the charter, data integrity isn't in scope at all. That was concerning to me because we've been trying to get DI in scope for a while. People still believe the stuff that is in scope isn't. When it isn't in scope people are literally reading the charter, so the DI is only for VC. The danger is people in the WG get the impression that we're doing a point solution for VCs when it can be used beyond that.
… to generalized tech. The misinterpretation is concerning to me. The language is making it clear the WG is not harming the generality of the DI solution. We can point to the minutes.

Ivan Herman: Some of us have gone through a long and excruciating discussion with the Semantic Web community when this document was going through another WG charter. I don't remember the number of emails, but it went well over 100. The core problem that we hit was people saying "if you want to solve DI in general, then consider X, Y, Z issues too", which included things like how to secure the integrity of Google's knowledge graph.
… The basic argument is that what works for relatively small graphs does not scale to bigger ones. If we want to go to generality, we need to make a statement about what it applies to. In the end we decided that it would not fly; the message we got back was: do it for VCs, because that's what this group has expertise in.

Dave Longley: well bounded graphs / datasets.

Ivan Herman: My big fear is that if we go down that way we will get FOs from members who care about this; we should not open the floodgates.
… It is perfectly fine if we say "members of this group should not put in obstacles for generalization", but that is crazy, because nobody would do that anyway. As far as I'm concerned we're going down a dangerous road, and the charter is good as is.

Brent Zundel: I felt roughly the same until I read the PR. I fear the floodgates being opened, but I don't think the language does that.

Orie Steele: I think the language is helpful; it's better to be more explicit. We know in these cases that the two formats, embedded and external, both secure data. Both formats have been applied to things other than VCs; if we're objecting to this language, we're making an absurd statement about the usage of some of the tech in a wider scope.

Manu Sporny: I think just asking the question in the group is also helpful. Security VCs is the handwavy name; my expectation is we're not going to try to shove VC JWT, integrity, etc. into a single spec.
… Is anyone here intending to make the DI spec a point solution for VCs? If no one is speaking up, then we have no intent in the group to do that.
… Modified the language to focus on VCs and well-bounded datasets.

Kyle Den Hartog: The one thing I'd say is we should double check with Jeremie, as he is dissenting but not on the call.
… I'm stating it because it was for the purposes of gaining an understanding without actually forcing FOs.


@OR13
Contributor

OR13 commented Mar 9, 2022

@Sakurann please review the commit... I don't think you are responding to the actual language that is changing.

verifiable credential is JSON protected by a proof... the proof has 2 formats... both protect integrity.

both of those "proof formats" are used to secure other data (things that are not verifiable credentials).

You are right that we are not defining JWT here, and that we are defining "Data Integrity Proofs" here... this PR makes that clearer and should be accepted.

If you want to object to "defining data integrity proofs here"... do that in a separate PR / issue.

@iherman
Member

iherman commented Mar 10, 2022

Because (of course...) I understand the intention of this PR, I was wondering about some alternative texts. Just putting out there the following:

This family of specifications consists of documents that each define how to express and associate proofs of integrity for verifiable credentials and concrete serializations for each of the defined syntaxes. The Working Group would welcome to see the usage of these techniques for data in general, but expressing those are not in the scope of this deliverable.

The specific set of concrete serializations included will be determined by the Working Group. The following are a non-exhaustive selection of expected input documents:

This is probably not a final text, but my intention is to put something in the text that makes it difficult to actively object to a general usage without jeopardizing the acceptance of the charter... I am sure some of you may come up with a better text.

@kdenhartog
Member

Because (of course...) I understand the intention of this PR, I was wondering about some alternative texts. Just putting out there the following:

This family of specifications consists of documents that each define how to express and associate proofs of integrity for verifiable credentials and concrete serializations for each of the defined syntaxes. The Working Group would welcome to see the usage of these techniques for data in general, but expressing those are not in the scope of this deliverable.
The specific set of concrete serializations included will be determined by the Working Group. The following are a non-exhaustive selection of expected input documents:

This is probably not a final text, but my intention is to put something in the text that makes it difficult to actively object to a general usage without jeopardizing the acceptance of the charter... I am sure some of you may come up with a better text.

+1 to this suggestion. I think the direction it's heading in will tread the line carefully and leave all parties happy without opening a can of worms.

@Sakurann
Contributor

+1 to Ivan's suggestion.
I think the text as-is is sufficient, but if we are to modify it, the nuance that "for this WG, only defining how proofs of integrity are used with VCs is in scope. Nothing in the charter prohibits using them for purposes other than VCs, but defining how is out of scope of this WG" should be clear.

index.html Outdated
Comment on lines 240 to 241
express and associate proofs of integrity with data, focusing on verifiable credentials and
concrete serializations for each of the defined syntaxes. The specific set of
Member

Suggested change
- express and associate proofs of integrity with data, focusing on verifiable credentials and
- concrete serializations for each of the defined syntaxes. The specific set of
+ express and associate proofs of integrity for verifiable credentials and concrete
+ serializations for each of the defined syntaxes. The Working Group would welcome
+ to see the usage of these techniques for data in general, but expressing those are
+ not in the scope of this deliverable. The specific set of

Contributor

I think this is a good compromise.

Member Author

@msporny msporny Mar 16, 2022

I prefer @brentzundel's base language with @dlongley's modification below: #96 (review)

@iherman
Member

iherman commented Mar 17, 2022

The issue was discussed in a meeting on 2022-03-16

  • no resolutions were taken

3.3. Clarify that DI should focus on VCs but ensure generality. (pr vc-wg-charter#96)

See github pull request vc-wg-charter#96.

Brent Zundel: This PR tries to make sure that the work taken on in the WG on data integrity can be more general than strictly VCs -- as long as we can still do VC stuff. This is just an attempt to make sure that the charter language isn't seen as overly binding.
… So people can work on things that are the reason why they joined the group.

Orie Steele: So I joined this group to do two things. The first is to make VCs work with JOSE, to work on VC-JWT as an envelope format for moving VCs around. I also joined this group because there's work I've been involved in for many years on linked data integrity, JSON-LD proofs, interesting things like selective disclosure with RDF normalization.
… A lot of technology just like JWTs apply to more than VCs, this other tech also applies to more than VCs.
… One of the interesting things is that JWTs are already an established standard and we acknowledge that they work for things other than VCs. I don't think anyone would accept language that says JWTs only work for VCs. We shouldn't accept any language for the same thing for the LD proofs. The alternative is to remove all of that from the WG because I'm here for JWTs too.

Michael Prorock: +1 Orie.

Orie Steele: If we're doing something right, we should do it right. We should get over that fight before we do it. It's in scope for the charter. If it's removed we won't be having these discussions. I'm very strongly for keeping it in and doing it right. It's one of the main reasons I joined the group.

Brent Zundel: To channel Ivan, he's raised a number of concerns -- that if we make these changes he's worried about arguments from other folks who killed the DI work when it was too general. He said he won't lay down in the road for this.
… Looking at the changes, I believe some have been made. Are these changes sufficient to approve merging this PR, Mike?
… And Kristina?

Kristina Yasuda: I've agreed to those suggestions, if those slight suggestions go in -- and I think Manu liked my comments. If he can make those changes, we'd be good to go.

Orie Steele: https://github.com/w3c/vc-wg-charter/pull/96/files.

Brent Zundel: There are two suggestions in there; are those the ones you're referring to, Kristina?

Michael Jones: I don't know what a well-bounded dataset is, so that's strange language.
… And I don't know what digital documents are.

Orie Steele: I think you know what it is.

Michael Jones: It's not standard industry terminology.

Orie Steele: Yes, it is. Industries understand what RDF is, what digital documents are (JSON is a digital document), lots of people know what these things are. These concepts are very understood by some folks and poorly understood by others.

Michael Jones: Say RDF if that's what you mean.

Orie Steele: What we're talking about is a form of proof around a structure of JSON. Sometimes it's JSON and sometimes it's canonicalized RDF. I would really prefer not to be in a position where, in every call, we're talking about only doing vanilla JSON or only doing RDF dataset canonicalization.
… The reality is that neither should be used for all use cases. That's a really bad idea.

Ted Thibodeau Jr.: um... most JSON can be printed (as can many if not all serializations of RDF) ... and most of the things that can be done with the digital version can be done with the physical, albeit by hand instead of by cpu...

Orie Steele: The reason that there's objection and sensitivity here is because there seems to be some maneuvering around here. Not understanding RDF and objecting to it all the time is not a path forward. We should just get over that and acknowledge we're doing it together or we're not and removing it. I'd be ok with either approach.
… But doing it and making it better means understanding it when you're objecting.
… I'm going to be direct and tell Mike and Kristina that you're not familiar enough with the technology to object in the way that you are. You need to understand it first -- not just object first without understanding. That isn't helpful.
… I want us to focus on doing the remaining things well.

Michael Jones: I appreciate you being frank. I'm not criticizing the tech. That's a mischaracterization of my comments in the PR. I'm echoing Ivan's comments -- who understands W3C process -- that we'd be opening a Pandora's box. I've made no comments on RDF/etc.
… I propose we defer this until Ivan says what he thinks here.

Brent Zundel: Ivan has said he will support whatever the group wants to do here.

Kristina Yasuda: I think some strong statements are being made that mischaracterize things here. There's no objection to working on data integrity proofs here. If you look at the history of why it moved to this group, Ivan clearly elaborated on it in last week's call. When it was tried to define it generally there were a lot of questions.
… Be it in a cloud database like Google or in IoT for Microsoft. So this work moved here so we can have LD / data integrity proofs for VCs. I'm fully supportive of that. I wouldn't be here if I wasn't.
… But saying we'd work on data integrity proofs beyond the VC use cases, in a group that just works on VCs, is too broad.
… Drawing on what you said, Orie, saying we'll work on JWT for the OpenID spec would not make sense.
… I think we should focus on getting data integrity proofs right for VCs first. If other people want to explore using DI proofs for other use cases they can do so, no one can stop them.
… When thinking about the limited time this WG has -- we should focus on DI proofs for VCs. That's the context.
… If that doesn't correspond with your definition of getting them right, I don't know what to do.

Michael Prorock: Data integrity proofs are kind of required for VCs to work.

Dave Longley: "The group will focus on defining DI proofs to solve VC use cases" language may help.

Markus Sabadello: I think Kristina's comments / suggestions could be a good compromise and could help. I like some of the suggestions to agree and state that this can be used for other things than VCs, but we think the scope of the group is to define it for VCs specifically.

Michael Prorock: +1 Markus.

Brent Zundel: If people think Ivan's language would get approved then we can do that.

Orie Steele: the suggestions are going in the right direction.

Michael Prorock: I think that language is an improvement; it's very helpful. I think for VCs to work for our customers, that JSON linked data aspect and semantic aspect is key. We need to sign over RDF data properties. If we can't clean that up and express that very directly, if we can't do that here, we're going to create problems, and that's not acceptable.

Dave Longley: I do think there is some language we can agree to.
… that can reach a compromise here. We want to be very careful that the language doesn't say 'someone else can define how to use Data Integrity Proofs elsewhere, but not here'.
… because that implies that another WG has to be created, to go and define how to do that.
… we should take the VC use cases into consideration, and those are the use cases that we work with, and make sure Data Integrity can be used with those use cases.
… and if somebody else finds Data Integrity useful for other use cases than VCs, then we've succeeded.
… the language needs to say:
… "This WG is going to define Data Integrity Proof to solve VC use cases".
… and if it solves other stuff and is generalized, that's great.

Brent Zundel: +1 dlongley.

Dave Longley: but if we explicitly say "but we're not going to define it for other things", that automatically implies that some other WG has to.

Michael Jones: Brent, you can feel free to not answer this / not have the WG answer this. Before I was a member of the WG I suggested that the DI work should be done in its own WG. That was closed without action as far as I can tell.
… I know there is W3C background, I don't know what it is. The people in the connect working group so we created the JOSE WG and we did it. They are general purpose. It seems never non-orthogonal and weird and doing general RDF and signing.

Orie Steele: Agree with Mike, we are doing both "general" and "specific" RDF Signing.

Dave Longley: the short answer to that, Mike, is - there are people in the RDF community.
… btw, this is where the "well-bounded" and "digital document" text comes from.
… so, RDF community said "we can't use this text to sign everything on the internet".
… because that's an unbounded set of data among all web pages.
… what we want to address is - you have a well-bounded single digital document, that's the use case we want to solve.
… that's incidentally what a VC is. but it applies to other things too.
… the only way we could easily move forward to this work, is to say - let's go to some group that has use cases that have to do with well-bounded documents, and work on data integrity there. so that's this WG.

Brent Zundel: Dave, if you wanted to do an alternate wording you can do that in the github issue.

Co-authored-by: Dave Longley <dlongley@digitalbazaar.com>

@selfissued selfissued left a comment

Works for me

@brentzundel
Member

PR reflects language which gained most consensus in the group and is approved. Merging.

@brentzundel brentzundel merged commit 0b6f3cd into main Mar 23, 2022