Consider mandating securing RDF graph for all securing mechanisms #1327

Closed
awoie opened this issue Oct 24, 2023 · 23 comments

Comments

@awoie
Contributor

awoie commented Oct 24, 2023

VCs conforming to VCDM 2.0 are JSON-LD representing an underlying RDF graph.

(With my very personal hat on) I believe that it is actually required (or at least the cleanest approach) that all "securing mechanisms" that secure the VCDM 2.0 make sure that the underlying RDF graph cannot be tampered with. This requires signing over the RDF graph/N-Quads. Doing that with vc-cose-jose would require a detached JWS that signs over the N-Quads of the VC. It would also have an implication for vc-data-integrity, since only RDF canonicalization algorithms could be used.

(With my very personal hat on) I don't think it is a good idea that vc-cose-jose, vc-data-integrity (with JCS) and vc-data-integrity (with RDF canonicalization) have different guarantees of data integrity. The first two guarantee data integrity on the visible payload (which looks like JSON) whereas the latter guarantees data integrity of the underlying RDF graph. That does not seem to be right to me.

(With my very personal hat on) We might want to consider that all securing mechanisms must integrity protect the underlying RDF graph. Thoughts?
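
To make the difference in guarantees concrete, here is a minimal sketch rather than a real JSON-LD processor. It assumes a toy `expand` function that maps terms to IRIs through a context, and two hypothetical versions of the context served at the same URL. The hash over the visible JSON payload never sees the context contents, while the hash over the derived statements does.

```python
import hashlib
import json


def expand(doc: dict, context: dict) -> list:
    """Toy stand-in for JSON-LD expansion: map each term to an IRI.

    Real JSON-LD processing is far richer; this only illustrates the idea.
    """
    subject = doc["id"]
    statements = []
    for term, value in doc.items():
        if term in ("id", "@context"):
            continue
        statements.append(f'<{subject}> <{context[term]}> "{value}" .')
    return sorted(statements)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical credential; the context is referenced by URL only.
credential = {
    "@context": "https://example.org/ctx",
    "id": "urn:example:1",
    "name": "Alice",
}

# Hypothetical contents of that URL before and after a change.
context_v1 = {"name": "https://schema.org/name"}
context_v2 = {"name": "https://example.org/nickname"}

# Hash over the visible payload: computed without the context contents.
payload_bytes = json.dumps(credential, sort_keys=True, separators=(",", ":")).encode()
payload_hash = sha256_hex(payload_bytes)

# Hash over the derived statements: depends on what the context says.
graph_hash_v1 = sha256_hex("\n".join(expand(credential, context_v1)).encode())
graph_hash_v2 = sha256_hex("\n".join(expand(credential, context_v2)).encode())

print(graph_hash_v1 == graph_hash_v2)
```

In this toy setup the payload hash is identical whichever context version is served, so only the graph-level hash changes when the term mapping changes.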

Originally posted by @awoie in #1315 (comment)

@alenhorvat

alenhorvat commented Oct 24, 2023

Signing over the RDF graph doesn't protect the graph itself if it is referenced. RDF/Context can be protected by either embedding the context or computing the digest of the context (fully dereferenced) and protecting it in the payload or protected header.

Once this is done, RDF may or may not be used to protect the payload, depending on the signature. If the @context is protected, JSON-LD can be protected the same way any other JSON is protected today.
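
A rough sketch of the digest approach described here, under stated assumptions: the context URL, the stand-in context bytes, and the `contextDigests` claim name are all hypothetical and not taken from any specification. The point is only that the digest travels inside the payload, so whatever signature covers the payload also covers the digest.

```python
import base64
import hashlib


def digest_sri(data: bytes) -> str:
    """Return an SRI-style string: algorithm prefix plus base64 digest."""
    digest = hashlib.sha384(data).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


# Hypothetical context URL and a stand-in for the fully dereferenced file.
context_url = "https://example.org/contexts/v1"
context_bytes = b'{"@context":{"name":"https://schema.org/name"}}'

payload = {
    "@context": [context_url],
    "id": "urn:example:credential:1",
    # Hypothetical claim name, used here for illustration only.
    "contextDigests": {context_url: digest_sri(context_bytes)},
}

print(payload["contextDigests"][context_url])
```

The same value could instead be carried in a protected header; either way it must be inside the signed portion to be of any use to a verifier.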

@awoie
Contributor Author

awoie commented Oct 24, 2023

Signing over the RDF graph doesn't protect the graph itself if it is referenced. RDF/Context can be protected by either embedding the context or computing the digest of the context (fully dereferenced) and protecting it in the payload or protected header.

Sorry, I'm not following entirely. I was talking about the N-Triples/Quads that describe the RDF graphs of the VC which are themselves encoded as JSON-LD.

IMHO, I think using the digest approach you described has at least two issues:

  1. It does not solve the problem, because it does not capture all required aspects, especially if you have context definitions nested several levels deep. Consequently, it won't work in all cases.
  2. It introduces further complexity on top of VCs especially for those that are using data integrity with RDF canonicalization.

Once this is done, RDF may or may not be used to protect the payload, depending on the signature. If the @context is protected, JSON-LD can be protected the same way any other JSON is protected today.

RDF does not protect any payload, it is always the content hash of the canonical form of the N-quads that does.

Updated:

  • N-Quads extend N-Triples with a graph identifier.
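
As an illustration of "content hash of the canonical form", here is a deliberately simplified sketch. It assumes the statements contain no blank nodes, so sorting and de-duplicating the lines is enough to get a stable form; real RDF canonicalization also relabels blank nodes deterministically, which this code does not attempt. The sample statements are made up.

```python
import hashlib


def hash_nquads(nquads: str) -> str:
    """Hash N-Quads statements independently of the order they appear in.

    Assumption: no blank nodes, so sort + de-duplicate is a stable form.
    """
    lines = sorted({line.strip() for line in nquads.splitlines() if line.strip()})
    canonical = "".join(line + "\n" for line in lines)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


doc_a = """
<urn:example:1> <https://schema.org/name> "Alice" .
<urn:example:1> <https://schema.org/email> "alice@example.org" .
"""

# Same statements, different order.
doc_b = """
<urn:example:1> <https://schema.org/email> "alice@example.org" .
<urn:example:1> <https://schema.org/name> "Alice" .
"""

# Same visible values, but one predicate IRI differs.
doc_c = doc_a.replace("https://schema.org/name", "https://example.org/nickname")

print(hash_nquads(doc_a) == hash_nquads(doc_b))
print(hash_nquads(doc_a) == hash_nquads(doc_c))
```

A signature would then be computed over this hash (or over the canonical text itself), which is why a changed predicate IRI invalidates it even when the visible JSON values look the same.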

@awoie
Contributor Author

awoie commented Oct 24, 2023

@peacekeeper wrote an interesting article recently, perhaps he could provide some input on the discussion as well.

@alenhorvat

I responded to that article :)

@context can be fully dereferenced into a JSON file. If that file is protected, the problem is solved. To expand into N-quads, you need the full context file.

What's explained above is a well-established way of protecting external content (at least in the domain of advanced electronic signatures).

If the context file is unprotected, how will you expand into N-quads? Hence, protecting the RDF differs from using RDF (using RDF to create N-quads).

@awoie
Contributor Author

awoie commented Oct 24, 2023

I responded to that article :)

@context can be fully dereferenced into a JSON file. If that file is protected, the problem is solved. To expand into N-quads, you need the full context file.

What's explained above is a well-established way of protecting external content (at least in the domain of advanced electronic signatures).

If the context file is unprotected, how will you expand into N-quads? Hence, protecting the RDF differs from using RDF (using RDF to create N-quads).

Why is this approach not used by Data Integrity over RDF canonicalization if it can achieve the same thing but it seems to be a simpler method?

@awoie
Contributor Author

awoie commented Oct 24, 2023

I responded to that article :)
@context can be fully dereferenced into a JSON file. If that file is protected, the problem is solved. To expand into N-quads, you need the full context file.
What's explained above is a well-established way of protecting external content (at least in the domain of advanced electronic signatures).
If the context file is unprotected, how will you expand into N-quads? Hence, protecting the RDF differs from using RDF (using RDF to create N-quads).

Why is this approach not used by Data Integrity over RDF canonicalization if it can achieve the same thing but it seems to be a simpler method?

IMO, if this approach works and is feasible in all cases then it seems to be a simpler option than what is currently proposed in Data Integrity using RDF canonicalization, right?

@awoie
Contributor Author

awoie commented Oct 24, 2023

@alenhorvat can you provide an example of how this works, including with several layers of context definitions, i.e., a context definition that references another one by URL?

@OR13
Contributor

OR13 commented Oct 24, 2023

It's not clear to me that securing the RDF graph is more valuable than securing the conforming documents.

If it is more valuable, then I would suggest that signing application/n-quads is a much more direct solution than the ones we've seen for signing JSON-LD, which is sometimes transformed to RDF and sometimes not.

@mtaimela

mtaimela commented Oct 25, 2023

The base issue is that all referenced links need long-term availability and/or integrity protection, including the @context property. I would prefer not to dwell on the fact that the link happens to dereference into RDF, as the base issue is the same for all kinds of links.

Support for this can live in the signature mechanism or in the VCDM; integrity protection for @context could reasonably be expected in the VCDM.

@alenhorvat

@alenhorvat can you provide an example of how this works, including with several layers of context definitions, i.e., a context definition that references another one by URL?

I posted some examples: https://code.europa.eu/ebsi/ecosystem/-/issues/18#note_74857
I just took the example from ld-playground. It can work with any JSON-LD.
(The previous version used the protected header claims, but with the latest evolution of the VCDM it is possible to protect the VCDM-related resources within the VCDM itself, which is great, since it nicely separates the layers.)

I would also like to bring to your attention https://w3c.github.io/json-ld-bp/#consuming
We also observe this in most VC exchanges. VCs are processed as JSON and use cases define whether they need the RDF/... other LD features or not.

@awoie
Contributor Author

awoie commented Nov 28, 2023

https://code.europa.eu/ebsi/ecosystem/-/issues/18#note_74857

Thanks for the work of putting this together.

All the options assume the issuer has full control over all the context information and how other context information is referenced. What if there are existing immutable multi-level context definitions?

IMO, it would be just cleaner to define requirements for securing mechanisms, and define that the RDF graphs need to be protected, otherwise securing mechanisms have different security guarantees.

@alenhorvat

@awoie, as demonstrated, if the source is not integrity-protected, protecting the RDF itself will not help.
If the source changes and you cannot detect it, the RDF you derive won't match the one that was used to protect the content.

@awoie
Contributor Author

awoie commented Nov 28, 2023

@awoie, as demonstrated, if the source is not integrity-protected, protecting the RDF itself will not help. If the source changes and you cannot detect it, the RDF you derive won't match the one that was used to protect the content.

It helps with detecting that the context was tampered with at verification time.

I personally don't care anymore whether we mandate securing the RDF graph or not, as long as integrity is guaranteed for all securing mechanisms. I'm speaking of deep integrity protection (e.g., contexts that reference other contexts, and so on), not shallow protection of single resources in the VC. This is currently not the case. It is also not the case that integrity of related resources is mandated. So, either we make tampering at least detectable for a verifier by mandating securing the RDF graph, or we seal it with one of the approaches you outlined in your notes, but with deep integrity protection, @alenhorvat.
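
The deep-versus-shallow distinction can be sketched as follows. This is an illustration under stated assumptions, not a JSON-LD implementation: `DOCUMENTS` is a hypothetical in-memory stand-in for the web, the URLs are invented, and `referenced_urls` only looks at URL strings in `@context` arrays, `@import`, and scoped contexts. A shallow approach would digest only the top-level context; the walk below follows references until every reachable context has a digest.

```python
import base64
import hashlib
import json

# Hypothetical stand-in for the web: URL -> context document bytes.
DOCUMENTS = {
    "https://example.org/ctx/top": json.dumps(
        {"@context": ["https://example.org/ctx/base", {"name": "https://schema.org/name"}]}
    ).encode(),
    "https://example.org/ctx/base": json.dumps({"@context": {"id": "@id"}}).encode(),
}


def referenced_urls(context_value):
    """Yield context URLs referenced from a @context value."""
    if isinstance(context_value, str):
        yield context_value
    elif isinstance(context_value, list):
        for item in context_value:
            yield from referenced_urls(item)
    elif isinstance(context_value, dict):
        for key, value in context_value.items():
            if key == "@import" and isinstance(value, str):
                yield value
            elif isinstance(value, dict) and "@context" in value:
                # Scoped context inside a term definition.
                yield from referenced_urls(value["@context"])


def deep_digests(start_urls, load):
    """Digest every context reachable from start_urls, following references."""
    digests = {}
    pending = list(start_urls)
    while pending:
        url = pending.pop()
        if url in digests:
            continue  # already handled; also guards against cycles
        data = load(url)
        encoded = base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
        digests[url] = "sha256-" + encoded
        pending.extend(referenced_urls(json.loads(data).get("@context", [])))
    return digests


digests = deep_digests(["https://example.org/ctx/top"], DOCUMENTS.__getitem__)
print(sorted(digests))
```

Here the credential names only the top context, yet the resulting map also holds a digest for the base context it pulls in, which is the property being asked for.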

@peacekeeper
Contributor

@context can be fully dereferenced into a JSON file. If that file is protected, the problem is solved.

I continue to believe that securing the RDF graph is preferable over securing the JSON-LD document (even if the context is also secured somehow).

I think this difference will matter once we come to selective disclosure. If you do selective disclosure with SD-JWT, you selectively disclose JSON members. If you do selective disclosure with Data Integrity, you selectively disclose RDF Quads. I need to think about this a bit more, but I feel like there's a potential "part 2" of my article that could reveal some additional interesting effects.

@alenhorvat

@awoie, the issuer is fully liable for the content it is signing. In all cases, it must protect any reference, no matter where/how it is referenced. If a vocab references another vocab, the issuer will always protect both, no matter how complex the relationship is.

@peacekeeper, as presented, if the source is protected, the graph will also be. Protecting the source is generic and works for any referenced content, not just vocabs.

@mtaimela

Fully agree with @alenhorvat. The current proposal of having relatedResource as an array is well equipped to declare integrity of any URI, in any depth or structure, including the context that references multiple contexts and so on.

@awoie
Contributor Author

awoie commented Nov 30, 2023

@awoie, the issuer is fully liable for the content it is signing. In all cases, it must protect any reference, no matter where/how it is referenced. If a vocab references another vocab, the issuer will always protect both, no matter how complex the relationship is.

@peacekeeper, as presented, if the source is protected, the graph will also be. Protecting the source is generic and works for any referenced content, not just vocabs.

There is no reliable way for the verifier to detect whether the referenced content was tampered with.

@awoie
Contributor Author

awoie commented Nov 30, 2023

Fully agree with @alenhorvat. The current proposal of having relatedResource as an array is well equipped to declare integrity of any URI, in any depth or structure, including the context that references multiple contexts and so on.

I don't think the following line covers vocabs that reference vocabs. If it does then it has to be made more precise:

If relatedResource is present, there MUST be an object in the array for each remote resource for each context used in the verifiable credential.

And my point was that relatedResource has to be made mandatory for cases where there is no signature over the N-Quads.

@alenhorvat

@awoie can you please provide an example?

  • digests of all referenced files are protected
  • original files are either resolved or shared in an unprotected header
    This is how all external references are protected today. If any element changes, the verifier detects it.

Can you give an example where this wouldn't work?

@awoie
Contributor Author

awoie commented Dec 1, 2023

@awoie can you please provide an example?

I'll provide an example on Monday.

  • digests of all referenced files are protected

Does this include vocabs that include other vocabs? According to the spec text I'm not sure. If it does, then it should be made more precise, see my comment above. And it should be made mandatory for securing mechanisms that don't use N-Quads. Otherwise securing mechanisms have different security guarantees.

  • original files are either resolved or shared in an unprotected header
    This is how all external references are protected today. If any element changes, the verifier detects it.

I don't understand this comment. If they are resolved without a digest and if they are in the unprotected header, then nothing is protected and nothing can be detected. Perhaps I misunderstood.

@awoie
Contributor Author

awoie commented Dec 1, 2023

@awoie can you please provide an example?

  • digests of all referenced files are protected
  • original files are either resolved or shared in an unprotected header
    This is how all external references are protected today. If any element changes, the verifier detects it.

Can you give an example where this wouldn't work?

I just read your comment again, so you are saying if ...

  • digests of all referenced files are protected
  • original files are either resolved or shared in an unprotected header

then it is secure. If this includes vocabs that include vocabs, then I'd agree with that. I still think that this should be made required otherwise securing mechanisms don't share the same integrity guarantees.

This is how all external references are protected today. If any element changes, the verifier detects it.

I'm not sure what this refers to, but I guess you were speaking generally. Again, I would only agree if the protection is deep and not shallow.
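
For the verifier side of the two bullets above, a minimal sketch follows. It assumes each signed entry carries an `id` and a `digestSRI` value, names borrowed from the relatedResource proposal discussed in this thread and treated here as assumptions; the URLs and documents are invented, and the signature check over the entries themselves is out of scope.

```python
import base64
import hashlib


def sri_sha384(data: bytes) -> str:
    """Return an SRI-style string for the given bytes."""
    return "sha384-" + base64.b64encode(hashlib.sha384(data).digest()).decode("ascii")


def find_tampered(related_resources, load):
    """Return the ids whose retrieved bytes do not match the signed digest."""
    tampered = []
    for entry in related_resources:
        if sri_sha384(load(entry["id"])) != entry["digestSRI"]:
            tampered.append(entry["id"])
    return tampered


# Hypothetical documents as they were when the issuer signed.
original = {
    "https://example.org/ctx/top": b'{"@context":["https://example.org/ctx/base"]}',
    "https://example.org/ctx/base": b'{"@context":{"name":"https://schema.org/name"}}',
}

# Entries assumed to sit inside the signed credential, one per resource,
# including the context that is only referenced from another context.
related_resources = [
    {"id": url, "digestSRI": sri_sha384(data)} for url, data in original.items()
]

# What the verifier retrieves later: the nested context has changed.
served = dict(original)
served["https://example.org/ctx/base"] = b'{"@context":{"name":"https://example.org/nickname"}}'

print(find_tampered(related_resources, original.__getitem__))
print(find_tampered(related_resources, served.__getitem__))
```

The change is caught only because the nested context has its own entry; with a shallow list covering just the top-level context, the same modification would go unnoticed.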

@awoie
Contributor Author

awoie commented Dec 5, 2023

related issue w3c/vc-jose-cose#188 (comment)

@awoie
Contributor Author

awoie commented Dec 5, 2023

I believe we can close this issue, since it is really asking for consistent requirements on securing mechanisms that verifiers can rely on, which might be covered by PR #1338.

@awoie awoie closed this as completed Dec 5, 2023
5 participants