
Consider abandoning drafts for non-data-integrity-proof securing formats #1315

Closed
OR13 opened this issue Oct 11, 2023 · 13 comments

Comments

@OR13
Contributor

OR13 commented Oct 11, 2023

Based on the dialog on #1308

I feel the working group made a major improvement by limiting the scope of the data model to focus only on JSON-LD.

That decision has been criticized, and I agree with some of the criticism: https://tess.oconnor.cx/2023/09/polyglots-and-interoperability

I feel there is an opportunity to further improve the quality of the deliverables, should the working group be able to come to consensus on specifying only data integrity proofs as a securing mechanism.

As many of you probably know, I don't believe data integrity proofs are a good choice for securing JSON-LD (or arbitrary content types), but that does not mean our working group can do a good job of describing both data integrity proofs and other securing mechanisms. We should consider delivering quality documents for data integrity rather than spreading our attention across multiple recommended securing formats for the same data model.

It's our job to make choices that eliminate optionality and help these technologies protect users. Sometimes that means focusing on what we can do well, and not delivering many documents that are substitutes for each other, when fewer documents with simpler recommendations might have done a better job.

@msporny
Member

msporny commented Oct 12, 2023

-1, it would be a mistake to abandon the vc-jose-cose specification (if that's what's being implied here; it's hard to tell). People are securing VCs using JWTs today, so we should be clear about how to do that in the WG that is working on VCs.

@OR13
Contributor Author

OR13 commented Oct 12, 2023

In v1, people secured VCs with JWTs, AnonCreds and LinkedDataProofs.
In v2, people can secure things with SD-JWTs, CoseSign1, DataIntegrityProofs or... other media types developed by other SDOs.

My point is that v1 had three formats from W3C but defined only two of them concretely, and the JWT format was not defined well enough to ensure interop.

In v2, the VCWG might define SD-JWT and DataIntegrityProofs well enough for implementers to achieve interop for both... but that's still a failure IMO; the SDO should make the choices here, based on what the market has adopted and needs.

Of course, we can close this issue and keep doing what we are doing.

I raise the issue to plead for a better outcome for W3C, but it's possible it's not achievable.

@awoie
Contributor

awoie commented Oct 13, 2023

I don't think I saw even one non-experimental open-source implementation of AnonCreds using the approach proposed in W3C VCDM 1.0/1.1, which is an indication that those proof types should be defined by their home SDO to be successful. Note that this approach was different from the one that uses AnonCreds + DI. Also note that Hyperledger kept going with its own AnonCreds representation for a long time after VCDM 1.0.

@awoie
Contributor

awoie commented Oct 13, 2023

The reason W3C VCDM 1.0/1.1 with JWT-VC is the way it is today is largely that it was a compromise made to satisfy the non-JWT people in the W3C VCWG, who were the majority. It simply didn't serve its own community, which is centered around a different set of SDOs. Please note that I don't want to blame those people (including myself, as someone who contributed a lot to the JWT-VC spec).

@alenhorvat

From experience, the topic is much more complex than it appears, since signature validation is bound to many elements, of which the content format being protected is the least important: key resolution, the signer's identity and identifier, revocation checks, timestamps, validity dates, and so on. The only standards (that I know of) where these things are defined sufficiently (in an implementable way) are ISO and ETSI; they actually define full profiles for their use cases.

These profiles are usually tightly bound to actual use case requirements and legal constraints.

a) the WG can define a "template" profile and a collection of profiles
b) the WG can define a specific profile that serves as a baseline, but we can expect that the use cases will add additional constraints/requirements

If SDOs are willing to collaborate and read each other's specs, a high level of interoperability can be achieved. But if SDOs have other interests, ...

@OR13
Contributor Author

OR13 commented Oct 24, 2023

@alenhorvat the working group is not likely to do anything beyond either advancing vc-jose-cose to TR or abandoning it.

In either case the core data model is JSON-LD and the core data model spec is filled with guidance regarding data integrity proofs.

I think it's misleading to suggest that the securing specifications proposed by this WG are getting the kind of expert review that they deserve. I'm trying, but I am not observing enough engagement to meet my "quality bar"; other folks will probably have a different level of review they are comfortable with.

@awoie
Contributor

awoie commented Oct 24, 2023

VCs conforming to VCDM 2.0 are JSON-LD representing an underlying RDF graph.

I believe (with my very personal hat on) that it is actually required (or at least the cleanest approach) that all "securing mechanisms" that secure VCDM 2.0 ensure the underlying RDF graph cannot be tampered with. This requires signing over the RDF graph/N-quads. To do that with vc-jose-cose, it would require a detached JWS that signs over the N-quads of the VC. In that case, it would also have an implication for vc-data-integrity, since only RDF canonicalization algorithms could be used.

I don't think it is a good idea that vc-jose-cose, vc-data-integrity (with JCS), and vc-data-integrity (with RDF canonicalization) have different guarantees of data integrity. The first two guarantee data integrity on the visible payload (which looks like JSON) whereas the latter guarantees data integrity of the underlying RDF graph. That does not seem to be right to me.
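[Editor's note: a minimal Python sketch of the contrast being discussed. Sorted-key JSON serialization stands in for a real canonicalization such as JCS (RFC 8785) or RDFC-1.0; the documents and identifiers are made up for illustration.]

```python
import hashlib
import json

def digest_bytes(data: bytes) -> str:
    # Integrity over the exact bytes, as in JOSE/COSE signing.
    return hashlib.sha256(data).hexdigest()

def digest_canonical(doc: dict) -> str:
    # Toy canonical form: sorted keys, minimal separators. A simplified
    # stand-in for JCS or RDFC-1.0, used only to show the contrast.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = b'{"name":"Alice","id":"did:example:123"}'
b = b'{"id":"did:example:123","name":"Alice"}'  # same data, different bytes

# A byte-level signature treats these as two different documents...
assert digest_bytes(a) != digest_bytes(b)
# ...while a canonicalization-based one treats them as the same document.
assert digest_canonical(json.loads(a)) == digest_canonical(json.loads(b))
```

The two schemes therefore make different promises: the first protects a serialization, the second an abstraction of it.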

@alenhorvat

Securing the RDF is different from securing using RDF. Securing using RDF does not mean that RDF is also secured (if referenced, it can be tampered with). RDF can be secured with any signature (jose/cose/...)

IMO, it should be clarified how to secure the different elements.

@awoie
Contributor

awoie commented Oct 24, 2023

Securing the RDF is different from securing using RDF. Securing using RDF does not mean that RDF is also secured (if referenced, it can be tampered with). RDF can be secured with any signature (jose/cose/...)

IMO, it should be clarified how to secure the different elements.

My assumption was that if I produce the N-quads and sign over them, I secure the RDF graph described by those N-quads.

@peacekeeper
Contributor

The first two guarantee data integrity on the visible payload (which looks like JSON) whereas the latter guarantees data integrity of the underlying RDF graph.

This probably refers to my recent demonstration on this: https://medium.com/@markus.sabadello/json-ld-vcs-are-not-just-json-4488d279be43

What I forgot to mention in that article is that with Data Integrity it's also possible to sign just the JSON document (using JCS canonicalization instead of RDFC-1.0).

That does not seem to be right to me.

I believe it was you who proposed using JWT to secure the VCDM in the first place, which has seen a lot of adoption due to its simplicity :)

While I agree that it's strongly preferred to sign the underlying RDF graph in the case of a JSON-LD document, I'm undecided whether I would go as far as abandoning all pre-existing work on vc-jose-cose. As long as the details are properly understood, I think vc-jose-cose can potentially still be used to secure the VCDM, e.g. if the content of the JSON-LD context is secured in some other way via hashlink or a separate signature, or if there is some other out-of-band way to agree on the JSON-LD context in a reliable way.
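[Editor's note: a toy sketch of the digest-pinning idea behind the hashlink suggestion, not the actual hashlink specification. The context body and digest scheme here are hypothetical; real hashlinks encode more than a bare hex digest.]

```python
import hashlib

def context_digest(context_bytes: bytes) -> str:
    # Digest of the context document a verifier expects to resolve.
    return hashlib.sha256(context_bytes).hexdigest()

def context_matches(fetched: bytes, pinned_digest: str) -> bool:
    # The verifier compares the fetched context against the pinned digest.
    return context_digest(fetched) == pinned_digest

original = b'{"@context":{"name":"https://schema.org/name"}}'
pinned = context_digest(original)
assert context_matches(original, pinned)

# A swapped context that silently remaps a term no longer matches.
tampered = b'{"@context":{"name":"https://evil.example/name"}}'
assert not context_matches(tampered, pinned)
```

This only pins the context bytes; it does not by itself resolve the ordering and merging questions raised later in the thread.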

@awoie
Contributor

awoie commented Oct 25, 2023

What I forgot to mention in that article is that with Data Integrity it's also possible to sign just the JSON document (using JCS canonicalization instead of RDFC-1.0).

Yes, I also mentioned that in #1327 (comment), and I think it would also mean not using JCS in DI if we go down this path.

That does not seem to be right to me.

I believe it was you who proposed using JWT to secure the VCDM in the first place, which has seen a lot of adoption due to its simplicity :)

That is correct but JSON-LD is only required since VCDM 2.0.

While I agree that it's strongly preferred to sign the underlying RDF graph in the case of a JSON-LD document, I'm undecided whether I would go as far as abandoning all pre-existing work on vc-jose-cose. As long as the details are properly understood, I think vc-jose-cose can potentially still be used to secure the VCDM, e.g. if the content of the JSON-LD context is secured in some other way via hashlink or a separate signature, or if there is some other out-of-band way to agree on the JSON-LD context in a reliable way.

I think the hashlink approach does not work for the reasons I explained in #1327 (comment) -> 1) and 2). Additionally, I think it might have issues with preserving the order of the context URLs if the idea is to merge all context definitions into one thing. Also, there is no algorithm defined for this; we would need to define one somewhere to allow interoperability.

Furthermore, I thought RDF canonicalization was invented to solve exactly this problem. If we are saying there is another way of protecting the underlying RDF graph (obtained via context resolution), then I don't understand why RDF canonicalization is the preferred option in DI.

I still believe that having securing mechanisms that ensure different levels of data integrity (RDF graph vs. plain JSON) is a bad idea and can lead to (security) issues in practice.

@OR13
Contributor Author

OR13 commented Oct 31, 2023

Typically "data integrity" means "preserving the bytes", not "preserving the abstract information that has been canonicalized by an algorithm that is exponential in the input size in the worst case, and that requires deserializing and graph processing of untrusted data".

Regardless of how many media types exist, securing them as bytes will continue to be the safest way to protect their integrity.
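[Editor's note: a minimal sketch of the byte-preserving notion of integrity described above. The shared-secret HMAC and key are illustrations only, standing in for the asymmetric signatures a real JOSE/COSE deployment would use.]

```python
import hashlib
import hmac

KEY = b"demo-key"  # illustration only; real systems use asymmetric keys

def sign_bytes(payload: bytes) -> bytes:
    # Detached tag over the exact bytes: no parsing, no canonicalization,
    # no graph processing of untrusted input before verification.
    return hmac.new(KEY, payload, hashlib.sha256).digest()

def verify_bytes(payload: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(sign_bytes(payload), tag)

credential = b'{"id":"did:example:123","name":"Alice"}'
tag = sign_bytes(credential)
assert verify_bytes(credential, tag)

# Any byte change, even semantically neutral whitespace, fails verification.
assert not verify_bytes(b'{"id":"did:example:123", "name":"Alice"}', tag)
```

The safety argument is exactly that the verifier rejects anything it did not see byte-for-byte, at the cost of treating equivalent serializations as different documents.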

I still think that vc-jose-cose should be abandoned, primarily because it appears to reflect a W3C consensus that I don't believe exists.

However, since I am in the rough on this point, I will close the issue.

If you feel that absolute consensus is required to proceed, you can reopen it.

@OR13 OR13 closed this as completed Oct 31, 2023
@TallTed
Member

TallTed commented Nov 2, 2023

I realize that this has been closed, but still ask that its title be changed from Consider abandoning drafts for none data integrity proof securing formats to Consider abandoning drafts for non-data-integrity-proof securing formats.

@brentzundel brentzundel changed the title Consider abandoning drafts for none data integrity proof securing formats Consider abandoning drafts for non-data-integrity-proof securing formats Nov 4, 2023

6 participants