
Precise specification of the DID Core Vocabulary #404

Closed
iherman opened this issue Sep 19, 2020 · 42 comments

@iherman
Member

iherman commented Sep 19, 2020

(I decided to spin-off this issue from #403, because it becomes a bit intricate and having it as a separate issue may be helpful.)

At the moment the Core spec contains property specifications (authentication, alsoKnownAs, etc.) in English prose. These are scattered all over the place, and that is confusing. There is no one place where all the terms, classes, and constraints are properly defined, referencable, and easily usable for testing or by implementations. I think it is necessary to provide a clean, crisp, and unambiguous specification as part of DID-Core. This specification should account for both the JSON-LD (i.e., Linked Data) and pure JSON worlds, as well as CBOR.

The only reference I found at the moment is The Security Vocabulary, which contains the DID Core terms (although it is incomplete; alsoKnownAs, for example, is missing). This is the vocabulary referred to (via the https://w3id.org/security# namespace) from the JSON-LD @context, i.e., this is where one gets by "following one's nose". (Note that this set of references is only meaningful for JSON-LD.)

But there are problems with the Security Vocabulary:

  • I did not find any RDF(S) version of that vocabulary specification, i.e., no formal vocabulary definition. (That may be my fault.)
  • The vocabulary seems to be much larger than what is used by DID Core. It is, therefore, not a faithful representation of the DID Core vocabulary, and that may become a source of confusion.
  • It is also not clear what the exact relationship is between the DID Core Registry and this vocabulary.
  • The vocabulary is a CG Report. It is therefore not considered a stable document, i.e., it is not appropriate as a normative reference in the DID Core spec (as an aside, it is not referred to from the spec at all…)

I believe this is what we may want to do:

  1. There should be a separate section (appendix or elsewhere) in the DID Core spec, including the definition of all the DID Core terms, with constraints defined in English prose using, when appropriate, the INFRA vocabulary. The terms, the relevant constraints, etc., should be seen as part of the abstract data model, the common reference point for JSON, JSON-LD, and CBOR.
  2. We should have a separate RDFS ontology file for the DID Core Vocabulary. This should be part of the DID Core spec (exactly how is an editorial question) and should probably be normative.
    • Caveat: ideally, the (RDF) DID Core Vocabulary should be in a different namespace than https://w3id.org/security#; it should use a namespace URI that would return the normative RDFS file of DID-Core if dereferenced as an RDF graph. That may create some hiccups vis-à-vis the current security vocabulary, though.
  3. We should also have a separate SHACL shape definition to express the various constraints on the vocabulary ("there can only be one subject", etc.).
    • Caveat: there are far fewer tools and less expertise around for SHACL, so this may be a more difficult task to accomplish. There is a SHACL playground that can greatly help, though.
  4. We should also have a separate but, alas!, definitely non-normative JSON Schema to express the various constraints on the vocabulary when used from JSON.
    • Caveat: unfortunately, the JSON Schema specification is a moving target, and the various JSON Schema versions are not necessarily backward compatible. Hence I do not think it is appropriate to refer to a JSON Schema normatively. I would be happy to be proven wrong.
  5. I am not sure what the equivalent formalism is for CBOR, to express the constraints…
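To make items (2) and (3) a bit more concrete, here is a minimal sketch of what such an RDFS vocabulary entry and SHACL shape could look like. All IRIs, class names, and ranges here are illustrative placeholders, not agreed-upon definitions:

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix did:  <https://www.w3.org/ns/did#> .   # hypothetical namespace; see the caveat in (2)

# Item (2): a formal RDFS definition of a DID Core term
did:alsoKnownAs a rdf:Property ;
    rdfs:label   "also known as" ;
    rdfs:comment "Other URIs referring to the same DID subject." ;
    rdfs:range   xsd:anyURI .                  # illustrative; the exact range is to be agreed

# Item (3): a SHACL shape surfacing a cardinality constraint
did:DIDDocumentShape a sh:NodeShape ;
    sh:targetClass did:DIDDocument ;           # hypothetical class name
    sh:property [
        sh:path     did:subject ;
        sh:maxCount 1 ;                        # "there can only be one subject"
    ] .
```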

I believe (1) above is the most important for the integrity of the DID Core specification. The other items may be non-normative, but should definitely be considered stable and be part of the Working Group's output.

@iherman
Member Author

iherman commented Sep 21, 2020

To put my money where my mouth is...

I have created an initial draft for a DID Core Vocabulary and for a DID Core SHACL Shape. Both are based on my understanding of the vocabulary from reading the spec; I may have misunderstood it here and there. (Based on my comment earlier today, the vocabulary separates out a subject term. I realize that is still to be discussed.)

This is obviously not final but maybe it will help us to come up with a really precise specification of the vocabulary. Also, my experience/knowledge of SHACL is fairly shallow, although I could make sense of the spec based on my background in RDF. But if it becomes a "production" version we may want to talk to someone who is more of an expert than I am.

If you want to test all this (and somebody should...), I used a Python-based SHACL implementation called pySHACL, based on RDFLib. It installed easily and worked out of the box. I used the following command line:

pyshacl -s DID-core-shape.ttl -e DID-core.ttl -i rdfs -f human test.ttl

which means that the tool performed RDFS inferencing on the data, using the vocabulary, before checking the shape (that is important to get the subclassing worked out).
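For reference, a minimal test.ttl one could feed to that command might look like the following; the did: namespace and the DIDs themselves are illustrative and not taken from the drafts:

```turtle
# test.ttl — a tiny, hypothetical DID document expressed as RDF,
# suitable as the data graph for the pyshacl invocation above
@prefix did: <https://www.w3.org/ns/did#> .    # assumed vocabulary namespace

<did:example:123456789abcdefghi>
    did:controller  <did:example:bcehfew7h32f32h7af3> ;
    did:alsoKnownAs <https://example.com/me> .
```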

Note that I did not try to incorporate the JWK key structures into the vocabulary or the shapes. It is just an unknown class for now.

Note also that this may be a good tool to create tests on DID documents...

Cc @OR13 @msporny @dlongley @talltree @peacekeeper

@OR13
Contributor

OR13 commented Sep 21, 2020

Is DID Core SHACL compatible with the JSON-only representation?... I think they are fundamentally incompatible because JSON-only supports arbitrary properties that are not included in the registries.

@iherman
Member Author

iherman commented Sep 21, 2020

Is DID Core SHACL compatible with the JSON-only representation?... I think they are fundamentally incompatible because JSON-only supports arbitrary properties that are not included in the registries.

I have to think it over, but I am not sure. SHACL is a bit like JSON schema (actually, it would be interesting to create a JSON Schema from the SHACL). It puts constraints on properties that are in use, but it is silent on other properties. Isn't it similar to JSON?

Put another way: if we have a JSON file for a DID Document, we can turn it into JSON-LD in a trivial manner, check it with SHACL and, I believe, the resulting validation errors would be valid for the original source, too.
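The "trivial" conversion I have in mind is essentially just injecting the context. A sketch in Python, using only the standard library — the context URL and the document contents are illustrative:

```python
import json

# A plain-JSON DID document; the property names follow the spec,
# the values are made up for illustration.
doc_json = '{"id": "did:example:123", "alsoKnownAs": ["https://example.com/me"]}'
doc = json.loads(doc_json)

# The trivial JSON -> JSON-LD step: prepend a @context so that a
# JSON-LD processor interprets the same keys as Linked Data terms.
# (Assumed context URL; substitute whatever the WG publishes.)
doc = {"@context": "https://www.w3.org/ns/did/v1", **doc}

print(json.dumps(doc, indent=2))
```

Any SHACL validation error found on the resulting JSON-LD then points at a key/value pair that is present, verbatim, in the original JSON.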

@OR13
Contributor

OR13 commented Sep 21, 2020

if we have a JSON file for a DID Document, we can turn it into JSON-LD in a trivial manner

this is not true today, and a number of open PRs are seeking to make it even less true tomorrow.

one of the major features for JSON-only is that it has no reliance on "schemas / registries"... I suspect a lot of the JSON-only representations would be incompatible with RDF triples, because if that's a thing people cared about, they would probably have just used JSON-LD....

I think members of this WG, including @dhh1128 @jricher @peacekeeper @talltree and @selfissued, have a strong desire for the JSON representation to be capable of being incompatible with RDF.... and the flexibility that a separate JSON representation enables is viewed as a benefit by some and a security / interop failure by others.

@dhh1128 has some great comments advocating against an abstract data model for DIDComm here: decentralized-identity/didcomm-messaging#112 (comment) (which I agree with).

(I was a proponent of this abstraction, BTW -- but mostly because I detested the concrete model that had been put forth, not because I wanted to facilitate lots of alternatives...)

I've not seen a lot of work on JSON-only, but I have seen a lot of work trying to make it incompatible with JSON-LD.... this leads me to believe that a primary design goal of JSON-only representation is to support did document representations that are NOT compatible with RDF / linked data.... this incompatibility (flexibility) is viewed as a feature which is only possible to achieve by creating a new representation for JSON-only....

In short the argument of some goes like this:

  1. All DID Documents MUST NOT be RDF / Linked Data compatible
  2. Some DID Documents MUST support arbitrary properties that don't map to RDF triples

I don't agree with these arguments, for a number of reasons, but its clear that many in this working group also don't agree with me :)

@iherman
Member Author

iherman commented Sep 22, 2020

@OR13, I try to keep away from the general, slightly ideological discussions...

At the moment the DID Core document is based on (1) an abstract data model and (2) a manifestation of that data model in JSON-LD and JSON. (And possibly CBOR; let me put that aside for now, not being a CBOR expert at all.) Also, at the moment, the DID Core document refers to terms like controller or alsoKnownAs in a syntax-independent way, but, as far as I can see, both the JSON-LD and the JSON manifestations (serializations) use these names/keys/terms (whatever terminology we use).

As I said in #404 (comment), we should have all the constraints on the model described in clear terms, using INFRA terminology, because that should be part of the abstract model. E.g., that the alsoKnownAs property has one or several URIs as values. We can then translate the constraints into whatever formalism a practical manifestation/serialization uses.

For the JSON-LD version, because, in effect, we define a LD vocabulary, we are supposed to provide a vocabulary file in RDFS and, because RDFS cannot express constraints, SHACL comes into play. This is just being good LD citizens, but it also gives us a proper, unambiguous specification of the vocabulary.

For JSON, ideally, JSON Schemas could be used to express those same constraints. Except for the problems around JSON Schemas, as mentioned in my aforementioned comments.

My off-hand remark about turning the JSON into JSON-LD was not meant as a general, fundamental step, just a very down-to-earth, practical one: if we do not have a JSON Schema, then one way of checking a JSON serialization of the model (which, at this moment, is still based on using the same property names as the JSON-LD one) is to do that. That being said, I agree that it is not really clean and stirs up too many questions, so I would be fine having a (non-normative) JSON Schema for the same purpose. I am not very good at JSON Schema but, based on my experience, it should be easy to translate the SHACL constraints into JSON Schema.

Both the SHACL and the JSON Schema can express the open-ended requirement, namely the fact that there may be additional properties added to the DID document, not condoned by this specification. I do not see that as a problem.
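As an illustration of that open-endedness on the JSON Schema side, a hypothetical fragment for the alsoKnownAs constraint: it restricts the properties it knows about while explicitly allowing any others (the draft version and the property details are my assumptions, not WG output):

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "alsoKnownAs": {
      "type": "array",
      "minItems": 1,
      "items": { "type": "string", "format": "uri" }
    }
  },
  "additionalProperties": true
}
```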

@rhiaro
Member

rhiaro commented Sep 22, 2020

one of the major features for JSON-only is that it has no reliance on "schemas / registries"

Just jumping in to say whaaat?!? I thought the point of the DID spec registries was to support interop between JSON implementations, because they have no other way to talk to each other about the terms they use except through a centralized registry. My recollection is that people pushing for JSON-only support were the ones pushing for the registries in the first place.

Sorry for that slight tangent, Ivan. Agreed we need a cleaner layout of the core vocab. I thought this was something that might happen in the Registries, but then everything got reshuffled so there was less differentiation between properties from DID Core and other places. I made a half-hearted attempt at a handy listing at the start of the Core Properties section, but that obviously pales in comparison to what you're working on.

I hadn't thought that whether we'd use a different namespace from security was in question? https://www.w3.org/ns/did/ has been mentioned in various places for a while. It's more that DID Core uses some terms from security v1 (in a way that only matters to JSON-LD implementations, as it is only apparent in the context) than that security v1 is intended to cover everything for DID Core.

@iherman
Member Author

iherman commented Sep 22, 2020

I hadn't thought that whether we'd use a different namespace from security was in question? https://www.w3.org/ns/did/ has been mentioned in various places for a while. It's more that DID Core uses some terms from security v1 (in a way that only matters to JSON-LD implementations, as it is only apparent in the context) than that security v1 is intended to cover everything for DID Core.

I do not know. At this moment there is no trace of anything else but the security vocabulary.

@OR13
Contributor

OR13 commented Sep 22, 2020

Just jumping in to say whaaat?!?

I know... surprisingly, some of the proponents of creating the registries and an abstract data model desire to enable DID representations that do not use the registries or the abstract data model, and that are not RDF-esque...

I am supportive of this, as long as we don't destroy JSON and JSON-LD common sense to achieve it.

@dhh1128
Contributor

dhh1128 commented Sep 23, 2020

I mostly don't have a strong opinion about the question implied by the issue title, but I feel the need to comment on Orie's initial characterization of my thinking, since he mentioned me by GitHub handle. He's not entirely wrong, but I want to add nuance. :-)

I am not opposed to a DID document being interpretable as JSON-LD, if I don't have to lift a finger to make this true. I am opposed to going out of my way to make it so.

The typical assertion that I hear from JSON-LD proponents at this point is: "Good news. You don't have to lift a finger. Just include @context and be done." But that is not reality. I raise issues against the DID core spec and I'm told, "No, we don't want to do it that way because it muddies RDF triples. You have to use subject-predicate-object." This is forcing me to model data structures differently to accommodate JSON-LD -- which is not supposed to be a requirement of the ADM. I'm told that if I want to register my extension, I have to jump through JSON-LD hoops to do so. Etc. That's the sort of thing that makes the simple story fall apart.

I have never believed that JSON-LD-style extensibility is a good idea for DID docs; I think it complicates something that should be dirt simple. Less extensibility = more traction for the core.

I was never in favor of creating registries; I accepted them as a tolerable but undesirable compromise. I would be happier to just extend JSON by putting new fields into it, documenting the update, and letting chips fall where they may. That's how the JSON or XML of early RESTful web services used to work: duck typing FTW. I viewed registries as something insisted upon by JSON-LD proponents. I'm surprised at the implication in this thread, that ADM proponents were behind the registries movement. Maybe I'm just misinformed...

I'm not trying to argue a point here. I'm reconciled to the compromise we worked out in the F2F. I will make some of my DID method work JSON-LD compatible to broaden its appeal, and I will probably register some extensions (and corresponding JSON-LD contexts) to broaden their appeal, too. But I don't feel a need to register every extension, or to register at the beginning of an extension's lifecycle; I might register an extension only when it has enough gravitas to make interop important. That doesn't mean I don't care about interop, or that I'm actively trying to sabotage JSON-LD. It just means JSON-LD extensibility has little value to me WRT DIDs, so I'm minimizing the attention I pay to it. (And lest you think I'm down on JSON-LD, I feel quite differently about its relevance to VCs.)

@msporny
Member

msporny commented Sep 23, 2020

@dhh1128 wrote:

I viewed registries as something insisted upon by JSON-LD proponents. I'm surprised at the implication in this thread, that ADM proponents were behind the registries movement. Maybe I'm just misinformed...

Haha, no way! The JSON-LD proponents were very much opposed to the registries -- they are unnecessary for JSON-LD. The registries were put in place by the JSON-only folks -- because they needed a place to specify terms so they could interop. JSON-LD doesn't need such a mechanism because it already has @context. I fear you're misinformed on how the registries came to be. They were largely driven by Sam Smith (ConsenSys), Mike Jones (Microsoft), Justin (Secure Key), and Drummond (Evernym), IIRC. We were looking at a series of formal objections if the group didn't find a compromise, which we did.

You may also be surprised to know that the "define the registry in JSON-LD" exists because the JSON-only folks then didn't show up (zero PRs submitted) to define JSON Schema and other stuff for the JSON-only representation.

But I don't feel a need to register every extension, or to register at the beginning of an extension's lifecycle; I might register an extension only when it has enough gravitas to make interop important.

Yep, that's the JSON-LD way -- lazy registration and progressive interop -- none of this heavyweight "registry in the beginning" stuff... but we made the compromise we did to placate the JSON-only people that wanted absolutely nothing to do with JSON-LD.

Now, to be fair... the registry has grown on me a bit -- we were able to build a bigger tent because of it and the abstract data model. It does come in handy for representations that do not have a mechanism like @context, such as pure CBOR and pure JSON. It does provide a single point where people can go and see what the current state of attempted interop is in the ecosystem. It's not all bad.

... but don't think for a second that the JSON-LD folks argued for a registry -- it was the opposite. The registry was a compromise that was made, that has ended up not being perfect, but brought more people to the table with the downside of additional complexity.

@msporny
Member

msporny commented Sep 23, 2020

I have created an initial draft for a DID Core Vocabulary

Oh no! :((((

The DID Spec Registries is the DID Core Vocabulary... if it's not, we're duplicating a ton of work.

We probably need to get a resolution on this quickly... don't want to generate a ton of unnecessary work for everyone... we need to converge on the concept above.

@iherman
Member Author

iherman commented Sep 23, 2020

@msporny the Spec Registries document is

  • an HTML document and not a machine-readable format of the vocabulary (of the ontology, if we prefer a much more elegant word :-))
  • a Note, and not a normative document, i.e., it cannot be used as a normative reference from the Core for the (normative) terms of the core vocabulary

Hence, in my view, the need for a formal vocabulary; that is what I did in Turtle. I have no intention of re-writing the Registries' text, that is for sure!

@OR13
Contributor

OR13 commented Sep 23, 2020

We've come full circle!

I fear I have wasted a tremendous amount of time trying to get the registries to work for multiple representations with mostly no help.... If I seem frustrated with the abstract data model, this is why.

At this point, I don't understand why the registries were even created: they are not required for JSON or JSON-LD, nobody contributes to them, AND we appear to need to define all the core terms in DID Core anyway, in an abstract data model, with support for special types like URIs, Dates and Cryptographic Keys..... Perhaps we should just start from scratch?

  • As a developer I want to go to did core and see all the terms that are defined in did core.
  • As a developer I want to go to did core and see all the types for all the term property names and term property values.
  • As a developer I want to go to did core and see how to produce a representation from a map of abstract data model property names and values.
  • As a developer I want the production and consumption rules for did documents to match the conventions associated with the representation: don't force JSON to use JSON-LD, but also don't forbid (deny-list) arbitrary properties in JSON representations, including @context and $schema.

I need to see an INFRA definition for every normatively defined did document property, resolution option, and piece of resolver metadata, to believe that INFRA is a good idea....

In the absence of that, I believe it must be removed from the spec.... If we are going to have an abstract data model, we must do it right... if we are not, we should not pretend we have an ADM.... Strong consensus can lead to a lot of compromises that are confusing for spec readers / implementers....

Are the folks who proposed the ADM ready to do the PRs necessary to define every term in INFRA?

If we can't get consensus on defining every term in INFRA, what happens to the ADM?

@msporny
Member

msporny commented Sep 23, 2020

Are the folks who proposed the ADM ready to do the PRs necessary to define every term in INFRA?

This was done a while ago for every property in the DID Document. If there is a DID Core property that exists today that isn't defined in INFRA, that's a bug and should be fixed.

Can you please point to a DID Core property that is not defined in terms of INFRA?

It is true that the resolution/dereferencing options aren't described in INFRA, but that's fairly easy to do since they're all strings at a minimum, with more complex types (URL, DateTime) if the group comes to consensus on including those. Otherwise, we'll just fall back to strings for everything resolution/dereferencing related.

@OR13
Contributor

OR13 commented Sep 23, 2020

@msporny publicKeyJwk and publicKeyBase58; although id, type and controller are defined, these are not, and without them the verification method is actually not useful.

https://w3c.github.io/did-core/#verification-methods

@msporny
Member

msporny commented Sep 23, 2020

they are not required for JSON

Not true, that keeps being asserted, but the DID Registries exist partly because of the desire to interop in JSON-only ecosystems. Granted, that's mostly theoretical at this point since everyone seems to be using JSON-LD... but the theory holds. You don't know if you can interop in a JSON-only world w/o a registry -- that's the path the JSON-only folks chose to follow for JSON-only interop. It also helps cross-representation interop because folks know what properties are used in a global sense.

@msporny
Member

msporny commented Sep 23, 2020

publicKeyJwk, publicKeyBase58 (are not specified in INFRA)

It's arguable that DID Core should specify those in INFRA, as those properties are really LD Security properties and should be defined over there. That said, if we want to define them in DID Core, it's easy enough to do in INFRA:

publicKeyJwk - If specified, the value of the property MUST be a single map(INFRA) containing allowed properties according to RFC7517(JWK). The properties MUST NOT include those that are marked as "Private" in the "Parameter Information Class" column in the JOSE Registry.

publicKeyBase58 - If specified, the value of the property MUST be a string(INFRA). The string is expected to conform to the Base58 Bitcoin encoding rules(BASE58-I-D).

So, easy to do in INFRA. Also, suggest that we don't do that and just push definition off to the Security Vocabulary or particular Linked Data Security suite.

What else do you need to see to convince you that we can use INFRA as the base for everything, Orie? I can continue to write definitions in here until we've exhausted all of them.
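As a further illustration that such prose definitions are mechanically checkable, a sketch of the publicKeyBase58 value constraint in Python — the helper name is hypothetical; the alphabet is the Base58 "Bitcoin" alphabet, which excludes 0, O, I, and l:

```python
# Base58 "Bitcoin" alphabet: digits and letters minus 0, O, I, l.
BASE58_ALPHABET = set(
    "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
)

def is_base58(value):
    """Return True if value is a non-empty string over the Base58 alphabet."""
    return (
        isinstance(value, str)
        and len(value) > 0
        and all(ch in BASE58_ALPHABET for ch in value)
    )

print(is_base58("H3C2AVvLMv6gmMNam3uVA"))  # plausible key material -> True
print(is_base58("0OIl"))                   # excluded characters    -> False
```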

@iherman
Member Author

iherman commented Sep 23, 2020

Just reacting to the distress call of @OR13 :-) I raised this issue, i.e., I may be responsible for the mess. I guess it is up to me to try to settle the confusion.

TL;DR: I do not really think we have such a big problem as this may seem :-)

The way I see the various building blocks is as follows:

  1. We do have the ADM. After all, it is fairly simple: we are talking about an INFRA map, i.e., key/value pairs. We use the term "property" in our document for a key (maybe we should stick to the INFRA term?). The values are maps, literals, or numbers (to be defined), or lists thereof.
  2. The Core specification defines a number of specific maps and keys (properties). We need a precise specification for all the properties, i.e., what their possible values are, what their constraints are (e.g., can there be only one for a specific map) and, of course, what their meaning (semantics) is. This specification MUST be made in the Core specification (because that is the only way of doing it normatively). This is not really a matter of "defining them in INFRA", it is just using the INFRA terminology.
  3. The Registries document should concentrate on the non normative terms that are defined elsewhere and are submitted to the community. To answer @OR13's question, yes, I believe, the correct way of doing this is that the registry entry is described in terms of INFRA terms, but that is just a fancy way of saying that the key and the corresponding value must be properly defined, including its constraints.
  4. The Registries document may decide (and has decided) to have the references to the Core vocabulary entries, too, but that is only a reference. Not a definition.
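Point (1) above can be pictured very concretely: the ADM is just a map. A minimal sketch — the property names are the ones from the spec, but the values and the verification method type are purely illustrative:

```python
# The abstract data model of point (1): an INFRA-style map of
# key/value pairs, where keys are strings and values are strings,
# maps, or lists thereof. Independent of JSON vs. JSON-LD.
did_document = {
    "id": "did:example:123",
    "alsoKnownAs": ["https://example.com/me"],   # a list of URI strings
    "verificationMethod": [                      # a list of maps
        {
            "id": "did:example:123#key-1",
            "type": "Ed25519VerificationKey2018",  # illustrative type
            "controller": "did:example:123",
        }
    ],
}

# A constraint from point (2) -- e.g. "alsoKnownAs has one or several
# URIs as values" -- is then a statement about this map, expressible
# later in SHACL, JSON Schema, or CDDL for concrete serializations.
assert isinstance(did_document["alsoKnownAs"], list)
print(sorted(did_document.keys()))
```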

Some of my comments in #401 and #403 were editorial in this respect: some of the keys (properties) are not properly defined as being part of the Core set of terms (or it is not clear that they are part of it), and they should be made editorially much more visible. The constraints may not be fully clear. All editorial stuff, see my comments there.

That is on the abstract level. We then do have a separate section on the JSON-LD, JSON, and CBOR mapping to and from the ADM. It is there (modulo some editorial issues), and not really complicated; after all, INFRA already defines the mapping to and from JSON.

My original comment in this issue was on the JSON-LD conversion. Indeed, JSON-LD means connecting our ADM and the property definitions and constraints with the Linked Data world. My statement is that for this case, if we want to be good citizens, we need more than what we have, namely, we need a proper RDF based vocabulary (and of course a separate @context, but we mostly have that). Also, I propose to use SHACL shapes to surface the constraints. That puts us on solid ground for Linked Data and can be done as an addition of what is out there.

I also think that it is worth doing something similar for the LD-less JSON world, and the obvious candidate would be to express the terms and their constraints in JSON Schema. I don't think it would be complicated.

(Doing SHACL, as well as JSON Schema, also forces us to reveal all details of our vocabulary spec, so it helps us! And it is a service to the community, too.)

@iherman
Member Author

iherman commented Sep 23, 2020

publicKeyJwk, publicKeyBase58 (are not specified in INFRA)

It's arguable that DID Core should specify those in INFRA, as those properties are really LD Security properties and should be defined over there. That said, if we want to define them in DID Core, it's easy enough to do in INFRA:

publicKeyJwk - If specified, the value of the property MUST be a single map(INFRA) containing allowed properties according to RFC7517(JWK). The properties MUST NOT include those that are marked as "Private" in the "Parameter Information Class" column in the JOSE Registry.

publicKeyBase58 - If specified, the value of the property MUST be a string(INFRA). The string is expected to conform to the Base58 Bitcoin encoding rules(BASE58-I-D).

So, easy to do in INFRA. Also, suggest that we don't do that and just push definition off to the Security Vocabulary or particular Linked Data Security suite.

I think I disagree with you, @msporny, but only on the last sentence above. Having re-read the spec last week, it is fairly clear that both publicKeyBase58 and publicKeyJwk play a fundamental role in our spec. They are at the basis of the core scaffolding provided by the core vocabulary, i.e., they should be normatively defined. That means they should be part of the Core spec. It may be questionable whether the key types (RsaVerificationKey2018, etc.) should be part of the core vocabulary, but the terms themselves are a different matter.

@msporny
Member

msporny commented Sep 23, 2020

Having re-read the spec last week, it is fairly clear that both publicKeyBase58 and publicKeyJwk play a fundamental role in our spec.

Well, if we start defining those things in DID Core things become strange... because those are things that a Linked Data Security WG should be defining :). Our charter also, arguably, forbids us from working in this area:

https://www.w3.org/2019/09/did-wg-charter.html

Out of scope: Defining specific authentication, signing, or cryptography mechanisms. Scope is limited to defining extension points for these mechanisms. (See "Scope".)

Given that Microsoft fought to have that language added to the Charter, I was surprised when @selfissued suggested that we should define those sorts of things.

I'm fine either way, just pointing out that there be dragons.

@iherman
Member Author

iherman commented Sep 23, 2020

I'm fine either way, just pointing out that there be dragons.

:-)

But we are not defining a crypto mechanism. A crypto mechanism, in my book, would be if we defined a new crypto algorithm for the purposes of DIDs, for example, and that would be a big no-no. We do not even define how we would express a JWK; we just say "this is the way we refer to it". I think this is a very kind baby dragon :-)

@msporny
Member

msporny commented Sep 23, 2020

I think this is a very kind baby dragon :-)

I don't disagree with you on that point (in general)... alright, well... you get to feed that baby dragon... it's yours, man. Let's just hope it doesn't grow up too quickly. :P

@msporny
Member

msporny commented Nov 2, 2020

I'm trying to figure out if there is anything that needs to be done here. Yes, it would help to have formal definitions of everything, but that said, it doesn't seem like people are having that hard of a time implementing the specification as it stands today. I suggest we make precise definitions a stretch goal and if we don't meet that stretch goal, that's fine. At present, everything is defined in JSON-LD and INFRA. @jonnycrunch is working on CDDL. No one really contributed to the JSON Schema stuff. I suggest all of this stuff is non-normative helpful stuff that we don't need to put in the critical path for CR or Proposed Rec.

@msporny msporny added pre-cr-p2 ready for pr Issue is ready for a PR labels Nov 2, 2020
@iherman
Member Author

iherman commented Nov 2, 2020

Yes, it would help to have formal definitions of everything, but that said, it doesn't seem like people are having that hard of a time implementing the specification as it stands today.

I do not think we can jump to that conclusion. We do not have a comprehensive test suite that would include all the various constraints on the terms that we may or may not have, and we have no idea whether implementations arrived at the same conclusions based on a specification that does not make the term definitions, and their constraints, crystal clear. See my note at the end of my comment, b.t.w.

I suggest we make precise definitions a stretch goal and if we don't meet that stretch goal, that's fine.
[...]
I suggest all of this stuff is non-normative helpful stuff that we don't need to put in the critical path for CR or Proposed Rec.

I do not think I agree. Where would you want to have a precise definition if not in the standard?

At present, everything is defined in JSON-LD and INFRA. @jonnycrunch is working on CDDL.

I would not agree with "is defined in JSON-LD". There is a generic narrative but no real formal definition. By using JSON-LD, this specification defines a vocabulary in RDF/Linked Data, and the proper way of doing that is to define the vocabulary formally (e.g., to make it very clear what we define in this spec and what is, possibly, defined elsewhere). At the minimum, something like the RDF vocabulary draft I created should be updated, checked, and published alongside the spec at the right namespace URL.
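For illustration, a formal RDF(S) entry for a single DID Core term might look like the Turtle sketch below. This is purely illustrative: the choice of term, label, comment, and defining document are assumptions for the sake of the example, not the content of any actual draft.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sec:  <https://w3id.org/security#> .

# Hypothetical formal definition of one DID Core term.
sec:controller a rdf:Property ;
    rdfs:label   "controller" ;
    rdfs:comment "An entity authorized to make changes to the resource this property is attached to." ;
    rdfs:isDefinedBy <https://www.w3.org/TR/did-core/> .
```

A file of such triples, published at the namespace URL, is what "following one's nose" from the JSON-LD context would resolve to.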

I like CDDL and, if that works, I am happy to use it for a formal specification of all the constraints. In that case the draft SHACL shape may indeed be "non-normative helpful stuff". But while I was writing the shape down I realized I was, in fact, not sure about the constraints, because the spec was not clear (note that @jonnycrunch reported similar issues). From the spec's point of view I would actually consider both the SHACL and @jonnycrunch's CDDL drafts as very special implementations, and that contradicts your statement that "it doesn't seem like people are having that hard of a time implementing the specification as it stands today"

@msporny msporny added needs special call Needs a special topic call to make progress and removed ready for pr Issue is ready for a PR labels Dec 1, 2020
@msporny
Copy link
Member

msporny commented Dec 1, 2020

What we need is the following:

  • Human readable definition in DID Core.
  • RDF definition in some file TBD.
  • JSON-LD definition in did-core/v1.
  • CDDL/SHACL schema definition for specific serializations.
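As a concrete (and simplified) view of the third item, the JSON-LD deliverable amounts to a context that maps each DID Core term to an IRI. The fragment below is an illustrative sketch only, not a copy of the published context; the IRIs follow the namespaces discussed in this thread and should be checked against the actual did-core/v1 file.

```json
{
  "@context": {
    "@protected": true,
    "id": "@id",
    "alsoKnownAs": {
      "@id": "https://www.w3.org/ns/activitystreams#alsoKnownAs",
      "@type": "@id"
    },
    "controller": {
      "@id": "https://w3id.org/security#controller",
      "@type": "@id"
    }
  }
}
```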

@rhiaro
Copy link
Member

rhiaro commented Dec 1, 2020

JSON-LD definition in did-core/v1.

Does this mean w3.org/ns/did/v1 ?

The RDF definition file can be at w3.org/ns/did/v1 as well, with conneg. And I would propose the file itself lives in the DID spec registries repo (e.g. with the JSON-LD context). I don't think this needs to hold up CR.

I think the rest of this issue is about making sure there are no terms mentioned in the spec prose which aren't actually defined in the spec. The rewrite of the Verification Methods section should sort that out once and for all (see #423 and #240). Then we need to highlight in any examples where non-core terms are used (for illustrative purposes), which has its own issue: #447.

Finally it sounds like a summary table of the core properties and their constraints (in INFRA, not rdf) would be helpful. At the start of the core properties section there is a bullet list of them to help navigate. This could be turned into a table with a little more information - would that help @iherman? I'm happy to put a PR in for that if so.

All that to say: I think the critical part of this issue is being covered by other issues, and what remains is editorial. I re-read this thread to make sure I didn't miss something, but let me know if that doesn't sound right.

@iherman
Copy link
Member Author

iherman commented Dec 2, 2020

Finally it sounds like a summary table of the core properties and their constraints (in INFRA, not rdf) would be helpful. At the start of the core properties section there is a bullet list of them to help navigate. This could be turned into a table with a little more information - would that help @iherman? I'm happy to put a PR in for that if so.

I am not sure that this navigational aid can be easily extended. To give you a specific example, the id term is defined for the DID subject in §5.1.1 but, in fact, that very same term can be used for other purposes in a DID Doc, e.g., for a specific verification method. Of course, for this particular property and for a JSON-LD user this is natural, because that property is an alias of a JSON-LD keyword, but it is not obvious for a JSON/CBOR user. The various constraints (e.g., can I use an alsoKnownAs property as part of a verification method?) have to be specified, too.
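To make the example concrete, `id` appears both on the DID subject and inside each verification method, with different constraints on its value in each position (a hand-made sketch; the DID and key values below are placeholders):

```json
{
  "id": "did:example:123",
  "verificationMethod": [{
    "id": "did:example:123#key-1",
    "type": "Ed25519VerificationKey2018",
    "controller": "did:example:123",
    "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV"
  }]
}
```

At the top level `id` must be a DID; inside a verification method it is a DID URL with a fragment. Nothing in a per-term bullet list captures that positional difference.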

But we may have to experiment with various approaches. Note, b.t.w., that we have agreed to have a topic call on this, so it is probably better to wait until that call before you propose a PR.

@selfissued
Copy link
Contributor

Creating another parallel normative representation of the spec in RDF is an unnecessarily large body of work and will almost certainly result in inconsistencies between the actual spec and the RDF. Therefore, we should decide not to do any RDF work.

If the spec text is unclear in particular ways, we should absolutely make clarifications in the normative text. Trying to do this work by creating a parallel description in RDF is unhelpful and unnecessary.

@OR13
Copy link
Contributor

OR13 commented Dec 15, 2020

@selfissued should we use any kind of security tooling, like CDDL, JSON Schema, or SHACL, or should we go out of our way not to define the shape of the ADM in anything other than INFRA and hand-crafted normative statements?

I don't care about RDF... I do care about relying on English-language interpretation for security.

@selfissued
Copy link
Contributor

As stated in https://w3c.github.io/did-core/#conformance , "As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative." Therefore, there are many, many normative statements in the spec that do not use RFC 2119 language. Our tests should therefore test important normative statements that do not use RFC 2119 language as well as those that do.

@OR13
Copy link
Contributor

OR13 commented Dec 15, 2020

@selfissued sure, my point is that writing tests against English prose sucks, and writing tests against JSON Schema, CDDL, or SHACL is a much better solution! :)
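To illustrate the difference, a constraint such as "a verification method must have an id, a type, and a controller" can be stated in a few lines of machine-checkable CDDL. This is a simplified sketch of my own, not the draft CDDL being developed for the spec; the rule names and the regexp are assumptions.

```cddl
; Illustrative subset of a DID document schema
did-document = {
  id: text .regexp "did:[a-z0-9]+:.+",
  ? alsoKnownAs: [+ text],
  ? verificationMethod: [+ verification-method],
}

verification-method = {
  id: text,          ; a DID URL
  type: text,
  controller: text,  ; a DID
  * text => any,     ; representation-specific key material
}
```

A validator can check a candidate document against such rules mechanically, which is exactly what plain English prose cannot give us.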

@iherman
Copy link
Member Author

iherman commented Dec 15, 2020

We define the DID document serialization in, among others, JSON-LD. The "LD" in JSON-LD stands for Linked Data, and JSON-LD is, in fact, a JSON-based syntax for RDF. We are already there.

What we are discussing (beyond the issue of properly representing constraints, which is to be done regardless of RDF) is doing the work properly, ie, if we use RDF then the vocabulary should be described in the format that the Semantic Web community expects. Otherwise, we would be doing a sloppy job.

The real issue is not RDF. The real issue is to express all the terms and their constraints in an unequivocal way. Creating the RDF-formatted vocabulary is not a real issue.

@selfissued
Copy link
Contributor

Oops - my last comment was meant to be in #384. See #384 (comment) .

@iherman
Copy link
Member Author

iherman commented Dec 16, 2020

The issue was discussed in a meeting on 2020-12-15

  • no resolutions were taken
View the transcript

2. DID Core vocab

See github issue #404.

Brent Zundel: this was raised by ivan, folks have commented.
… we have 10 minutes or so to talk about it.

Manu Sporny: ivan, this is mostly a question to you - what are the expectations here?
… we've had discussion, the core thing we're trying to get to is we need a formal RDF vocabulary for DID core.
… what I want to make sure is that we're all aligned on what we need to do to make that happen.
… we've got a human readable part of this which is the registries to a certain degree, it doesn't have rdfa.
… the expectation is that we will probably have some rdf schema definition somewhere.
… I believe that's primarily what you're asking for?

Orie Steele: Some related conversation on this dead PR.

See github pull request did-spec-registries#170.

Orie Steele: I linked a PR. I had a conversation with amy related to this on the did spec registries.
… it's a PR for a thing that's probably not going to get merged but there's good conversation on there.
… my initial proposal would be: let's set up the html side of the vocab definition the same way activitystreams have (see thread) and use the DID spec registries repo for it for the time being, and work towards getting stable urls behaving in the way we want them to, and then work on setting up proper redirects for the vocab after that.
… from the perspective of what is needed, it's a versioned html and jsonld files and I think we can do all that in the did spec registries and work towards a working solution there before we worry about fixing the top level urls.

Ivan Herman: I was probably not clear in what I wrote, but manu, unfortunately, this is not what I meant.
… forget about RDF.
… I used RDF as an example simply because that's what I understand well.
… the problem is that if you read the document you have a bunch of properties that are defined there and it is not really clear... two things.
… one is (maybe it has improved) which properties and classes are defined as part of the core standard and which are only there as an example.
… I have said that many times.
… and it hasn't been clearly spelled out somewhere.
… and what is even worse is that there are a number of constraints that are not clearly spelled out for the reader.
… among the various examples, the one at the top of my head is the id property, which in some cases can have only certain values and in other cases can be used with different values; sometimes only a did as a value, in some cases you can use whatever url you imagine.
… there is the controller property, which can be used on some types of objects and cannot be used elsewhere - what does it mean in each case.
… there is a set of clear definition and constraints that are missing.
… how to have those constraints spelled out as part of the spec is largely an editorial thing.
… what I did was spell out those constraints for the RDF side by using shacl and I realise that this was more an exercise for myself because you don't necessarily know shacl.
… in a way what I saw done by jonathan for cddl is doing something very similar.
… I am not a real expert in cddl but what I understood is describing more or less the same things.
… I don't care in which format we do that, but the spec must be clear.
… in my reading it is not.
… Apart from that (second issue), yes we define a vocab for the usage of RDF and linked data and yes for that we should have a formal vocab that describes the vocab in terms of rdf.
… that is not the same as the context file.
… it's a RDF specification or something.
… that is something that should be done, it is not necessarily part of the spec, not normative, should be present and reachable, that's something we can do very easily once the spec is out.
… that's okay, that's not the real issue.
… the real issue for me is the proper specification.
… I think that amy began to think about doing some kind of table representation of the terms to express the various constraints.
… that can also work for me.
… it's up to the editors.

Manu Sporny: that is helpful, thanks ivan.
… there are a couple of things that have happened since your review that have improved things.
… we're in the process of really nailing down the ADM and the representations, so hopefully it's far more clear what you can have associated with each property across all representations.
… that's one level of improvement, not done we need another pass.
… hopefully that addresses one of your main concerns.
… the other thing we've got now: jonathan put all that work into the cddl stuff so we have a syntax enforcement mechanism now for things that support it, like json, jsonld and cbor.
… we have that shacl/json schema mechanism but in this case we're using cddl to do that.
… the third thing we have are the jsonld contexts which are in flux but we're getting them shaped up.
… the only missing item is a good rdf vocabulary.
… something that more of the Semantic Web folks could utilise.
… and we should probably look at gregg kellogg's tool that converts csv files into full-blown human-readable vocabs plus the rdf schema stuff in turtle format and whatever different formats we want.
… I feel like that's the last missing thing.
… that can be autogenerated into the RDF things.
… with all of those things in place my hope is it addresses all the concerns you raised.
… I just don't know if it would.

Ivan Herman: See Draft RDFS Vocabulary.

Ivan Herman: See Draft SHACL version.

Ivan Herman: as I said I am perfectly fine to use cddl for that. If that's the way we do it, that's fine..
… the only question I have, and maybe it's an obvious answer, is: is the cddl part of the normative spec?

Manu Sporny: no.

Ivan Herman: it should be.
… that's the major point.
… it's the definition of all the constraints in a clearly defined way.
… this is one point.
… the rdf vocabulary I did something back then which is already on our wiki page. this has to be updated because we don't have the whole thing.
… it was not perfect at that point.
… but that is there and as I said if the cddl is normative then this rdfs vocab doesn't have to be normative.
… it is perfectly fine to have it... it should be properly done but it's not something that holds up the CR.
… and we can start it from there.

Jonathan Holt: the cddl spec itself is normative and it extends the abnf rules, it is a normative description of the constraints and type definitions.
… which is what I think the rdf attempts to do with symbol representation for concepts.
… as ASCII strings, it represents something.

Manu Sporny: Editor's note: It is not normative right now... it would take a lot of work to get it to be normative.

Jonathan Holt: I think cddl tries to not boil the ocean in that way, it really is a constraint satisfaction which is mathematically proven.
… you can do the closures to make sure it captures everything you hope.
… I've demoed that before.
… you can generate examples using cddl which I've found really helpful.
… cddl does have a lot of power to help along the spec and really help constrain it into what types are allowed.
… as we've been finding there are some challenges when you go into an abstracted version of that.
… still a challenge with numbers.
… the number representation in ipld is still challenging to represent when you have limitations with certain representations.

Michael Jones: I push back on the idea that rdf is necessary or even useful.
… if we do that it would be one more parallel representation of what we already have normatively in the spec.
… if we believe the spec is not sufficiently clear in places that ivan described we should make the spec clear.
… we shouldn't fall back to trying to maintain and create a separate parallel representation of the spec.
… I think we should do no rdf work.

Brent Zundel: if we could have those comments occur in the issues I think that would be best.
… issue 404.

@iherman
Copy link
Member Author

iherman commented Jan 5, 2021

As discussed on the call on 2021-01-05, current actions are:

@iherman
Copy link
Member Author

iherman commented Jan 12, 2021

@msporny I do not think this needs a special call any more. Should we remove the label?

@msporny msporny removed the needs special call Needs a special topic call to make progress label Jan 12, 2021
@iherman
Copy link
Member Author

iherman commented Jan 18, 2021

Admin comment in the issue status.

With the merge of #536 (and closure of #528) I would think there is no reason to keep the pre-cr-2 flag; we should remove it.

However, I would prefer not to close this issue, because w3c/did-spec-registries#182 and w3c/did-spec-registries#183 are also related to it; in other words, we should close the current issue only if those two PR-s are, in one way or other, handled and settled.

@msporny
Copy link
Member

msporny commented Jan 18, 2021

I would think there is no reason to keep the pre-cr-2 flag; we should remove it.

I'd rather not remove the flag as that's how we're keeping track of all open issues for this spec... we could downgrade to p3, or more preferably, get 182 and 183 merged so we can close this issue.

@iherman
Copy link
Member Author

iherman commented Jan 19, 2021

I'd rather not remove the flag as that's how we're keeping track of all open issues for this spec... we could downgrade to p3, or more preferably, get 182 and 183 merged so we can close this issue.

  • Using p3 is also an alternative
  • Merging the two PR-s on the spec registries is o.k. with me. We do have some open issues on those (mostly choosing the right URIs), and there will be refinements on the vocabularies and/or schemas (I already have some minor changes on my machine, and I also plan to work on a ShEx version of the JSON-LD constraints, as an alternative to SHACL), but those improvements can be made afterwards. After all, those vocabularies are not required to be final in all their details for CR.

I let the higher authorities decide :-)

@iherman
Copy link
Member Author

iherman commented Feb 8, 2021

With the merge of w3c/did-spec-registries#183 I believe this issue can be closed. @msporny ?

@msporny
Copy link
Member

msporny commented Feb 8, 2021

Agreed, closing.

@msporny msporny closed this as completed Feb 8, 2021