What is being discussed in issue 4 (clarification of TERMX via use-cases, spec pointers, and PR) #190

Closed
ewelton opened this issue Feb 12, 2020 · 48 comments
Labels: discuss (Needs further discussion before a pull request can be created), pending close (Issue will be closed shortly if no objections)

@ewelton

ewelton commented Feb 12, 2020

Issue #4 is about the label for a concept that is not "optimally clear" and could benefit from use-cases or other clarifications. We will denote the selected term for the relationship as TERMX so that connotations around the meaning of TERMX don't flow into the discussion. If we called it controller, delegate, or cryptographic megatronix overlord then we might find ourselves drifting into confusion based on our preconceptions of just what a "cryptographic megatronix overlord" can, or ought, or ought not be able to do.

The opening question by @jandrieu introduces the TERMX concept as follows:

The third role has been called "controller", to the dismay of some. This role is an entity capable of exercising one or more authentication capabilities (as specified in the authentication property of the DID Document). This authentication demonstrates the entity's legitimate authority to act on behalf of the DID Subject. In particular, this authentication to act on behalf of the DID Subject does NOT include the ability to change the DID Document. (If an entity has both the authority to act on behalf of the DID Subject and to change the DID Document, they simply have both roles).

The design goal, as discussed to a point of consensus at TPAC, is to support limited-use keys for Controlling a DID Document with more frequently used keys for authentication. This approach limits the exposure due to key compromise from authentication, which is anticipated to be a more frequent activity across more parties. It also supports use cases where control of who gets to authenticate on behalf of a DID is restricted (for example to an HR department), while the use of authentication can be exercised by a larger yet controlled set (for example to employees in a particular group).

The next challenge is to associate the above with specific elements within the spec. It is not yet clear exactly what elements in the spec refer to TERMX, but this will be clarified in the upcoming PR.

In addition, it is not clear that there is any essential use-case documented beyond one-line assessments such as the HR reference above. Several additional use-cases were briefly suggested throughout the discussion but it is not clear if there is a specific use case in the use-case document that expresses the need for TERMX clearly and focally.

A theme in the discussion was the tension between VC-mediated TERMX configuration and TERMX configuration in the DID/DID-Doc itself. It does appear clear that TERMX could be done in either place, and probably is done with VCs already, and that there are proponents both for and against TERMX in each place.

TERMX behaviour, in the context of issue #4, focused on authentication, but it is not clear if it pertains more broadly to "verificationRelationships" - as per this note from the spec:

NOTE: Verification methods
A public key is just one type of verification method. A DID document expresses the relationship between the DID subject and a verification method using a verification relationship. Examples of verification relationships include: authentication, capabilityInvocation, capabilityDelegation, keyAgreement, and assertionMethod. A DID controller MUST be explicit about the verification relationship between the DID subject and the verification method. Verification methods that are not associated with a particular verification relationship MUST NOT be used for that verification relationship. See Section § 7.4 Authentication for a more detailed example of a verification relationship.

It is also not clear whether TERMX is related to the following, from the spec:

Regarding cryptographic key material, public keys can be included in a DID document using, for example, the publicKey or authentication properties, depending on what they are to be used for. Each public key has an identifier (id) of its own, a type, and a controller, as well as other properties that depend on the type of key it is. For more information, see Section § 7.3 Public Keys.

It is not clear whether the word controller here is related to TERMX:

  "authentication": [
    "did:example:123456789abcdefghi#keys-1",
    "did:example:123456789abcdefghi#biometric-1",
    {
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:example:123456789abcdefghi",
      "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV"
    }
  ],

Finally, when I read a comment like

Rather, it identifies the entity who is authorized to use that authentication means to authenticate on behalf of the subject.

it makes a lot of sense ('it' = TERMX) - but I get very confused as to how this is to be enforced, or is TERMX merely guidance? How is this substantively different from guidance that "my password is 'pa$$w0rb' and only user 'eric' is authorized to use it" (aside from the fact that it is not the password itself, but the private side of the pa$$w0rb key pair)? If I advertise this guidance, then it is not clear that this "self-assertion of preferences" about proof-purposes is anything other than "helpful guidance" that can be incorporated into audit trails and other compliance statements.

There may well be some 'standard libraries' or conventions, such that 'standard best practices' mean that implementers of services ought to play by the rules. But where do we find documentation if there is some stronger way to enforce this "self-asserted wishlist in key use"? And if that is not in the spec, then perhaps TERMX should be pulled out of the spec and packaged up with the documentation that describes how to apply TERMX.

All in all - I like TERMX - but I have a mental model in my head about where it belongs, and it turns out that the issue #4 TERMX is different from my TERMX - which is why I'm likely going to start using "cryptographic megatronix overlord" where I otherwise use the term TERMX.

Can anyone help me understand one or more of the following?
1 - specific, documented use cases demonstrating the expected use of TERMX
2 - understand where, specifically, in the DID-document and spec TERMX applies
3 - how to determine when TERMX is better done with a self-asserted VC and the inherent semantic richness, vs. when is TERMX appropriately done "in the DID-Doc"
4 - how TERMX relates, if at all, to the method and requirements upon method implementers

If the conversation grows, we can split this into multiple threads - but perhaps, by then, we'll have a PR and consensus on the name of TERMX.

I'm very interested in hearing more about TERMX, and perhaps this will help clarify TERMX for others.

@jandrieu
Contributor

jandrieu commented Feb 12, 2020

@ewelton Thanks for kicking this off. Let me identify a few use cases then get into some of the questions you raise.

Use Cases for Authentication

  • UC.A0. Control
  • UC.A1. Hot & Cold Storage
  • UC.A2. Team Authentication
  • UC.A3. Guardianship
  • UC.A4. Signing
  • UC.A5. Assertion
  • UC.A6. Delegation
  • UC.A7. Pre-rotation
  • UC.A8. Scoped Updates

For anyone chiming in on this, please either speak to one of these use cases, or add your own, with an appropriate new UC.A* number so we can refer to these explicitly in the thread.

UC.A0. Control

Mike wants to enable additional proofs to be able to control the DID Document. One of these he keeps offline as a BIP39 passphrase. Another he shards and distributes to recovery advocates: his bank, his mom, and his best friend. All of these keys, as well as whatever control method is supported by the DID method, share the ability to update the entire DID Document. Please note that this is an odd requirement that several current DID methods would find difficult or impossible to support. However, we have had people describe control as the scope of use for the "authentication" section.

UC.A1. Hot & Cold Storage

Ava uses a hardware wallet to store her primary keys offline (cold storage) and a cloud-based wallet for frequent transactions (hot storage). To minimize the exposure and process fatigue from regular use of her primary keys, she uses cold storage for controlling her DID and DID Documents while using her hot storage for authenticating into online services.

UC.A2. Team Authentication

Millie, Roxanne, and Anna are all members of the chaperone support club at Merriweather High School, which is administered by Mitzi, a teacher at the school. The club supports various extra-curricular activities such as field trips, dances, and sporting events to provide appropriate parental oversight. Every member is vetted by the school, including a background check and interview before being formally approved as a chaperone. Many events require a chaperone for specific interactions, such as accessing locker rooms. Any member of the chaperone support club can authenticate as a chaperone and open the locker room door. However, none of the chaperone club can change the list of authorities in the "authentication section". In this case, the authentication as a member is not the same as authenticating as the club. The club members have the privilege of authenticating on behalf of the Subject, but they are not the Subject.

UC.A3. Guardianship

A Guardian Paige has a court-appointed relationship with the Subject Jaqueline, her foster daughter. Paige is not Jaqueline and does not have the authority to control the DID Document, which is under the control of the Ventura Superior Court. When Jaqueline had trouble in school, Paige signed an application for transfer on her behalf.
I welcome a better example here. I was stretching a bit. In particular, this feels better handled by a VC from the court presented whenever Paige needs to assert her legal guardianship.

UC.A4. Signing

Abby uses a DID for all of her corporate contracts. She signs such contracts cryptographically. Her signature indicates that she has read the contract, understands and agrees to its terms and conditions, is legally authorized and mentally competent to do so, and upon completion of countersignatures (if any) does thereby enter into the contract.

UC.A5. Assertion

Constance uses a DID for signing parental notes for her daughter Margaret's teachers, such as a note that Margaret has a doctor's appointment and will need to be picked up at 10am on Tuesday the 12th of January. She creates these notes as Verifiable Credentials and signs them such that the school can verify the signature using the DID already on file with the School's registrar.

UC.A6. Delegation

Anastasia has a digital capability for accessing her bank account information, issued to a DID under her control. When invoked, this "initial" capability authenticates her as acting on behalf of the DID so the bank can allow her access to all the functionality of her account. She later delegates a read-only capability to her CPA so that he can inspect her banking transactions, but can't initiate any. When she creates that delegated capability from her initial capability, she signs the delegated capability such that the bank can verify SHE made the delegation, using the cryptographic material specified in the DID Document.

UC.A7. Pre-rotation

@SmithSamuelM Do you imagine pre-rotation being facilitated by an entry in the authentication section? That idea crossed my mind, but I couldn't get too far in a write up of how that would work without feeling like I was making stuff up. If it isn't in the authentication section, is there anything in the DID Document to support pre-rotation?

UC.A8. Scoped Updates

Isabelle has an information fiduciary, Jimmy, whom she needs to be able to authenticate on her behalf, and who is legally bound to do so only in case of her death. Because Isabelle has incredible foresight, she also authorizes Jimmy to update his cryptographic material in her DID Document. At some point in the relationship, Jimmy gets enamored with a new DID method and updates his entry in Isabelle's document to point to a new, different DID using his new preferred DID provider.

The big question

The most important question, to my mind: can any of these use cases be deferred to a later layer and still effectively be realized, if desired at all? For example, if we accept an external delegation mechanism we can probably reduce the complexity of the DID Document without loss of capability. Ideally, that mechanism should be open ended rather than a specific mechanism prescribed. Of course, some of these use cases may just be better off unsupported.

Partial language already in the spec

Note that the spec does call out "Verification relationships" in section 7.3 https://w3c.github.io/did-core/#public-keys. Unfortunately, that section refers to an explanation of verification relationships in the "Authentication" section which doesn't actually exist. Given that paragraph, I believe these "verification relationships" are meant to specify the kind of scope distinguished by these use cases (speaking broadly) and to be specified for each entry in "Authentication".

I believe this was the intention of the latest iteration of that section: each "authentication" entry would include the scope of the authentication as a "verification relationship". However, that did not flow through to the rest of the language and "verification relationship" is not a property in any part of the DID Document. Personally, I think this would be the lightest lift to address these use cases: make verificationRelationship or authenticationPurpose or some other term a property of entries in the authentication so that a relying party can understand the valid scope of operations appropriate for that authentication mechanism.
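For illustration only, here is a sketch of what a scoped "authentication" entry might look like if such a property were added, using the authenticationPurpose name floated above; the property and its values are hypothetical, not current spec language:

  "authentication": [
    {
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:example:123456789abcdefghi",
      "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV",
      "authenticationPurpose": ["signing", "assertion"]
    }
  ]

A relying party could then check the listed purpose before accepting a proof made with that key for a given operation.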

Method specific versus method-independent authentication

One important distinction I want to make with the language in the current spec is that all of the language describes how authentication and delegation work with respect to the DID Method's CRUD operations. However, only UC.A0, UC.A6 and UC.A7 suggest method-dependent behavior. The other use cases are method-independent. In particular, the spec states that "Each DID method MUST define how authorization and delegation are implemented, including any necessary cryptographic operations." It then lists the controller pattern and capabilities. However, there is nothing keeping someone other than a Method implementer from building a cryptographic delegation strategy completely independent of the Method. In fact, I would recommend such a separation using something like zCaps or even Verifiable Credentials that assert a delegation. We should clarify the DID Method requirement as stated is ONLY for authorization and delegation for DID Method operations.

In particular, every conversation I've had about DID-Auth assumed that once a DID was resolved to a DID Document, that authentication proceeds without further interaction with the DID Method. If that is off-base, please clarify.

Other questions

1. If we solve this generally, is "authentication" the right name for the property?
At the moment, I'm leaning toward delegationAuthorities.
2. Are there sufficiently high usage delegations that we want to embed them in the DID Document simply because a subsequent delegation invocation is too expensive at run-time?
Authentication, Signing, Assertion, and Delegation may be both frequent enough and common enough to justify the overhead.
3. Do we need to support both direct listing of cryptographic material and other DIDs?
If we are supporting any authentication (for any purpose) by an entity other than the controller, we probably want to support DIDs to enable the delegate to rotate their keys as necessary without requiring a DID Document update.
4. What do we name the property that scopes a given authentication entry?
XXXPurpose is my inclination, where XXX is based on the decision in question 1. If we stick with "authentication" then "authenticationPurpose". If we switch to "delegationAuthorities" or "delegations", then "delegationPurpose".

Note to @philarcher, depending on how this thread resolves, we should pick some of these use cases to add to the UCR document.

Eric's specific questions

Can anyone help me understand one or more of the following?
1 - specific, documented use cases demonstrating the expected use of TERMX

Hopefully the above use cases provide sufficient illustration.

2 - understand where, specifically, in the DID-document and spec TERMX applies

Except for UC.A0 Control and UC.A8 Scoped Updates, the term doesn't apply to the DID Document; it applies to interactions outside the DID Document. For UC.A0, it applies to the entire DID Document. For UC.A8, it would presumably apply only to particular properties in the specific authentication entry.

3 - how to determine when TERMX is better done with a self-asserted VC and the inherent semantic richness, vs. when is TERMX appropriately done "in the DID-Doc"

This is a huge question that deserves a longer conversation, but here is a brief synopsis. I recommend doing as minimal as possible in the DID Document to keep interoperability as simple as possible. Further I'd argue for directed capabilities instead of VCs. VCs are assertions, which require the recipient to understand the meaning of the assertions as well as any intended logic wrt chained delegations. There is no guarantee that a VC delegating banking privileges to a CPA would be understood or accepted by the bank. Directed capabilities, such as zCaps, have explicit delegation semantics and explicitly support service-specific scoping and open-ended chained delegations. For example, in UC.A6. Delegation, Anastasia's CPA can delegate the capability further to allow a particular associate at the firm to invoke the capability, all without needing to bother either Anastasia or the Bank. And since the bank created the initial capability in the first place, they have seamless control over the scope of privileges that can be delegated, such as read only, deposit only, daily transaction limits, etc.

The short rule of thumb for capabilities versus credentials is that credentials are useful when presenting an assertion made by the issuer to another party. Capabilities are useful when triggering actions at the service provider who issued the capability. In the credential-based, self-asserted delegation as you described, the bank will ultimately be the "decider in chief" about whether or not your self-asserted delegation is something they support. And because VCs are open-ended semantically, banks are going to have a hard time accepting such free-form delegation until explicit, shared semantics are established. It's basically no better than a signed affidavit giving someone authority to withdraw funds. The bank may accept it. They may not. They may, but only if it is notarized. And they may or may not accept subsequent delegations by your delegate. The bank also has a burden of identity-proofing the DID used to issue your delegated VC, which they may or may not be capable of.

Capabilities, in contrast, allow the Bank to specify exactly which scopes are delegatable, giving them control of the possible actions that are delegation-enabled in their software, and the initial capability is always issued to a DID that has already been bound to your identity-proofed account. In short, they get to set the rules and the root authority of the capability, and you get to decide what you delegate to whom without operational involvement by the bank.
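As a rough sketch of the shape such a delegated capability might take - the field names loosely follow the zCap drafts and every identifier here is made up for illustration, not normative:

  {
    "@context": "https://w3id.org/security/v2",
    "id": "urn:uuid:read-only-delegation-for-cpa",
    "parentCapability": "urn:uuid:root-capability-issued-by-the-bank",
    "invocationTarget": "https://bank.example/accounts/anastasia",
    "invoker": "did:example:cpa",
    "caveat": [{ "type": "ReadOnlyCaveat" }],
    "proof": {
      "type": "Ed25519Signature2018",
      "proofPurpose": "capabilityDelegation",
      "verificationMethod": "did:example:anastasia#keys-1",
      "jws": "eyJhbGciOiJFZERTQSJ9..."
    }
  }

The bank, as issuer of the root capability, verifies the delegation chain back to the capability it originally issued before honoring an invocation.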

4 - how TERMX relates, if at all, to the method and requirements upon method implementers

I believe only UC.A0, UC.A6 and UC.A7 impact methods. Everything else is downstream, post-resolution.

@msporny added the discuss (Needs further discussion before a pull request can be created) label Feb 12, 2020
@dlongley
Contributor

However, that did not flow through to the rest of the language and "verification relationship" is not a property in any part of the DID Document.

A "verification relationship" describes a class of property; it is not a property itself. Examples of actual properties that are "verification relationships" include "authentication", "assertionMethod", "capabilityInvocation", "capabilityDelegation", "keyAgreement", and "contractAgreement". A DID controller authorizes a particular verification method for use for a specific purpose by associating the DID subject with the verification method via a verification relationship. These terms may also appear in proofs created by the controller of the private key associated with a verification method (as a "proofPurpose") -- clearly identifying the verification relationship that is required.

While many of the use cases above are related to the concept of authentication, creating a proof for the purpose of "authentication" more specifically means: "This proof can be used to authenticate me, e.g., prove to a particular party that I am who I say I am." This kind of proof is the sort of thing that is used to "log in" to a service or to demonstrate that a presentation of Verifiable Credentials came from an authentic holder. Creating a proof for this reason is fundamentally different from creating a proof as a method to demonstrate authorship over some assertion, e.g. "The sky is blue", that is never intended to, for example, log you into a service.

While digital signature based proofs all involve "authentication" at some level, the reason why a proof is created should be more fine grained to avoid a number of ambient authority/confused deputy problems. For example, suppose Alice signs the hash of a message that she believes is an assertion "Bob is my friend". She gives that signed assertion to Bob so he can share it to prove her relationship with him. It should be very difficult for this proof to be misused to allow Bob to authenticate himself as Alice, logging into her services. Even if Bob is able to fool Alice into signing an obfuscated authentication message that could be abused in this way, if Alice has different keys for authentication and for making assertions, then Bob's attempt should fail. Alice gets this protection by listing one verification method under "authentication" (the one she uses to log into services) and listing a different one under "assertionMethod" (the one she uses as a method for merely making assertions).

Alice only allows the software she uses to sign "The sky is blue" assertions to have access to the private key associated with the verification method that is authorized via a "verification relationship" of "assertionMethod". This protects Alice from this attack (or an honest mistake).
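A minimal DID document fragment reflecting that separation might look like the following (the key identifiers are made up for illustration):

  "authentication": [
    "did:example:alice#login-key"
  ],
  "assertionMethod": [
    "did:example:alice#statement-key"
  ]

Software holding only the private key for #statement-key can sign assertions but cannot produce a proof that satisfies Alice's authentication relationship.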

Similarly, many of the use cases listed above should be broken down along these lines. If Alice is going to invoke object capabilities to do things, she should list specific verification methods under the "capabilityInvocation" verification relationship for that purpose. Then software that verifies capability invocation proofs knows exactly where to look to see if the verification method referenced in the proof is authorized for that purpose. This should not be conflated with "authentication" for the reasons stated above.

@jandrieu
Contributor

@dlongley writes:

A "verification relationship" describes a class of property; it is not a property itself. Examples of actual properties that are "verification relationships" include "authentication", "assertionMethod", "capabilityInvocation", "capabilityDelegation", "keyAgreement", and "contractAgreement". A DID controller authorizes a particular verification method for use for a specific purpose by associating the DID subject with the verification method via a verification relationship. These terms may also appear in proofs created by the controller of the private key associated with a verification method (as a "proofPurpose") -- clearly identifying the verification relationship that is required.

As is often the case, we have some teasing to do to align our words (everything in a JSON object is a property), but I'm fairly sure we are on the same page with the underlying approach, modulo where verification relationships are specified.

Are we in agreement that, as currently written the spec fails to illustrate how verification relationships other than "authentication" are expressed?

On my read, there is nowhere in a DID Document that allows a controller to specify the verification methods to be used for, "assertionMethod", "capabilityInvocation", "capabilityDelegation", "keyAgreement", and "contractAgreement".

There is an omnibus "authentication" property that is referred to in Section 7.3, but which has no specific way to do anything other than "authentication". Which I believe is the point of your last sentence about avoiding conflating authentication with capabilityInvocation.

Where does a controller "authorize[] a particular verification method for use for a specific purpose by associating the DID subject with the verification method via a verification relationship"?
Or, with your other example, how does the controller express "a verification method[] under the "capabilityInvocation" verification relationship for that purpose"? What property in the document is used for this capabilityInvocation verification relationship? Because there is no "capabilityInvocation" property, nor a verification relationship property.

Since this has been discussed in the context of the authentication property, my interpretation is that the "authentication" property is ill-defined and would be better called "delegations" or, perhaps if we adopt your language "verificationMethods", which would list the appropriate verification methods and their associated verification relationships. The structure of this is not important at this stage, as much as understanding what we MUST be able to express.

As it reads today, there simply is no discussion of how a DID Document expresses any of the verification relationships listed in section 7.3 other than "authentication".

@dlongley Have I missed something in the spec or are we on the same page with that starting point?

@dlongley
Contributor

dlongley commented Feb 12, 2020

As is often the case, we have some teasing to do to align our words (everything in a JSON object is a property)...

Yes, we're just miscommunicating. The term "verification relationship" is not a property in a JSON DID Document. But "authentication", "capabilityInvocation", "capabilityDelegation", "assertionMethod", etc. are properties. They just fall under one class of properties, referred to as "verification relationships".

Are we in agreement that, as currently written the spec fails to illustrate how verification relationships other than "authentication" are expressed?

Yes, we are lacking in examples and have failed to explain that, in the examples we do have, "authentication" is just one possible verification relationship (i.e., one possible property of this class of properties). Clearly, the spec needs to elaborate on this, and this has come up in a number of different issues both in this repo and elsewhere in the community.

An example did:v1 DID Document includes other verification relationships:

{
  "@context": ["https://w3id.org/did/v0.11", "https://w3id.org/veres-one/v1"],
  "id": "did:v1:test:nym:z279wbVAtyvuhWzM8CyMScPvS2G7RmkvGrBX5jf3MDmzmow3",
  "authentication": [
    {
      "id": "did:v1:test:nym:z279wbVAtyvuhWzM8CyMScPvS2G7RmkvGrBX5jf3MDmzmow3#authn-1",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:v1:test:nym:z279wbVAtyvuhWzM8CyMScPvS2G7RmkvGrBX5jf3MDmzmow3",
      "publicKeyBase58": "4PKSV6Q4ags8ceqNBuP9xhgJht6rzSXaCJnA1Uxsv54M"
    }
  ],
  "capabilityDelegation": [
    {
      "id": "did:v1:test:nym:z279wbVAtyvuhWzM8CyMScPvS2G7RmkvGrBX5jf3MDmzmow3#delegate-1",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:v1:test:nym:z279wbVAtyvuhWzM8CyMScPvS2G7RmkvGrBX5jf3MDmzmow3",
      "publicKeyBase58": "9hW8DYWJZzAdBDTTFFpo2zfXBC3fY4oDizQ46RZkTJkc"
    }
  ],
  "capabilityInvocation": [
    {
      "id": "did:v1:test:nym:z279wbVAtyvuhWzM8CyMScPvS2G7RmkvGrBX5jf3MDmzmow3#invoke-1",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:v1:test:nym:z279wbVAtyvuhWzM8CyMScPvS2G7RmkvGrBX5jf3MDmzmow3",
      "publicKeyBase58": "EsP4zycDtPDT2K6YDnu2LS1hWFsyZDyLXrH6xkGkaW5s"
    }
  ],
  "assertionMethod": [
    {
      "id": "did:v1:test:nym:z279wbVAtyvuhWzM8CyMScPvS2G7RmkvGrBX5jf3MDmzmow3#assertion-1",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:v1:test:nym:z279wbVAtyvuhWzM8CyMScPvS2G7RmkvGrBX5jf3MDmzmow3",
      "publicKeyBase58": "EsP4zycDtPDT2K6YDnu2LS1hWFsyZDyLXrH6xkGkaW5s"
    }
  ]
}

As does the did:btcr example @ChristopherA just provided via the mailing list recently:

{
  "@context": "https://w3id.org/did/v0.11",
  "id": "did:btcr:xul5-9rzp-q3xh-z4l",
  "publicKey": [{
    "id": "did:btcr:xul5-9rzp-q3xh-z4l#satoshi",
    "controller": "did:btcr:xul5-9rzp-q3xh-z4l",
    "type": "EcdsaSecp256k1VerificationKey2019",
    "publicKeyBase58": "dfbXB9ZCgDYGTviaKxY4B5FDV52RKx6MZCYb2QPHVfHG"
  }, {
    "id": "did:btcr:xul5-9rzp-q3xh-z4l#vckey-0",
    "controller": "did:btcr:xul5-9rzp-q3xh-z4l",
    "type": "EcdsaSecp256k1VerificationKey2019",
    "publicKeyBase58": "dfbXB9ZCgDYGTviaKxY4B5FDV52RKx6MZCYb2QPHVfHG"
  }],
  "authentication": ["#satoshi"],
  "assertionMethod": ["#vckey-0"]
}

On my read, there is nowhere in a DID Document that allows a controller to specify the verification methods to be used for, "assertionMethod", "capabilityInvocation", "capabilityDelegation", "keyAgreement", and "contractAgreement".

The spec does say this, but very poorly (and thus it can be argued that a reader wouldn't think it does say it as you have). It should be clear that you specify these the same way you do "authentication" -- as you can see from the examples above. We're very much on the same page that this needs to be remedied.

@jandrieu
Contributor

jandrieu commented Feb 12, 2020

Ok. I now understand your point that verification relationships are a class of property, but I had to look up the @context to figure it out.

EVERY single entry in the @context needs a verbal description in the spec explaining what it means and why.

It's a bit crazy that we pull in terms like "capability", "capabilityAction", "capabilityChain", "capabilityDelegation", etc., without explaining what those are or where they are used in the prose of the specification.

It's hard enough to get people to read the spec. We can't expect them to ALSO pull up https://www.w3.org/ns/did/v1 and try to make sense of it.

@dlongley
Contributor

@jandrieu,

EVERY single entry in the @context needs a verbal description in the spec explaining what it means and why.

It's a bit crazy that we pull in terms like "capability", "capabilityAction", "capabilityChain", "capabilityDelegation", etc., without explaining what those are or where they are used in the prose of the document.

Agreed, but it's less "crazy" and more the result of resource starvation and other blockers of that sort. I think you would be hard pressed to find someone who thinks we should not be elaborating completely on this stuff in the spec.

@ewelton
Author

ewelton commented Feb 13, 2020

@jandrieu @dlongley this is fantastic! This discussion addresses one of the key areas where I think people are broadly confused. I've taken up some of this in #193 .

I've opened the dialogue about capabilities in #194 - as per Joe's comment

This is a huge question that deserves a longer conversation, but here is a brief synopsis. I recommend doing as minimal as possible in the DID Document to keep interoperability as simple as possible. Further I'd argue for directed capabilities instead of VCs

With regards to

It's hard enough to get people to read the spec. We can't expect them to ALSO pull up https://www.w3.org/ns/did/v1 and try to make sense of it.

I would not think of it as an 'ALSO' - the sort of material we are talking about does not belong in the spec. Without some sort of external expansion point, we will force the spec to generically model all possible key use cases for all classes of people, organizations, and things, and all situations - and once we have "mapped the future of authentication technologies and DID uses" we can wrap up the spec and people can begin implementing. The work of modeling "verification relationships" and "verification methods" and "supporting material" was already done; I just think the community didn't understand it well - and when you factor in the LD-allergy there was a need to "re-invent" the solution.

To me, the pre-AMS model looked like this (in pseudo-grammar)
<verification-relationship> = ( <verification-method>.... )*
where verification-relationships were defined in @contexts which gave a handle around which institutions could say "I support this" - which gets to the zCap/VC issue above (see #194). For now, the confusion around how keys are related to DIDs (#193) feels like it is being turned into some version of this:
"authentication" = ( <public-key> + list-of-verification-relationships-supported )*
because the group primarily talks about authentication, is biased in favor of only using key-pairs for verification, and is seeking some sort of global, centrally managed, corporate-controlled register of allowable proof-purposes.

Syntactically the above are equivalent - personally, I found the pre-AMS context-driven syntax much more elegant and powerful, but that is just personal taste - ultimately you can convert the data between these formats (and you could fold in the publicKeys field to this as well) systematically - each representation format is just a different version of the same abstract data.
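A sketch of those two shapes side by side, using a made-up key identifier and a hypothetical property name for the second form:

  "capabilityInvocation": [
    "did:example:123456789abcdefghi#keys-2"
  ]

versus

  "authentication": [
    {
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:example:123456789abcdefghi",
      "supportedVerificationRelationships": ["capabilityInvocation"]
    }
  ]

Either shape carries the same information; the difference is whether the relationship or the key is the top-level grouping.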

In terms of 'method specific' delegation issues - I think we can table that for the time being, because those are, to me, more relevant when we are talking about enforcement. I think I saw something from @dlongley about how, in general, a relying party could choose to honor a public key, verification relationship, etc. but this is not related to the method. In short, because the proof purposes beyond "alteration of the DID-document" have no contact w/ the method, but only with remote 3rd parties operating on the honor-system, I question their participation in the DID-core relative to controller-asserted general-purpose information about the subject (see #194).

This leads us to the use cases - and 1000x thank you @jandrieu for this roadmap - it is excellent, and precise, and I hope we can use these to drive a productive group discussion of the whole 'verification relationship' and 'in-DID/out-of-DID capabilities' - which is why I want to run through them in summary, and suggest some extensions/classification.

Out of 'digital exhaustion' (in the sense of tired fingers) I will defer this a short bit - and likely open them as a separate issue.

@ChristopherA
Contributor

I'll add to the mix that I believe that the authority(ies) that can deactivate a DID and/or rotate key lists in a DID is not a 'controlling' authority; it is something else.

Way back when (pre-anti-"ownership" thoughts), there were two explicit keys. There was proof of control of a private key associated with a public DID Document, and then there was ownership of the DID (a bad name for this now, but I don't really know what replaces it in the current DID spec), which was proof that you could deactivate/rotate a DID, and this key was explicitly NOT a controller key.

In the original docs of the BTCR method (the first proof-of-concept of a blockchain-based DID Method), the controller was the private key in the spent blockchain transaction which could sign and thus authenticate a DID Document, and the owner was the holder of the hash-protected key (i.e. the public key itself is not published, only a hash of it) in the unspent transaction that could only be used to revoke or rotate that spent transaction control key.

Though we never prototyped it, our thoughts were also that you could have a multisig key that could conditionally-deactivate a control key (a social group that watches out for my bad behavior), which then another multisig key (my cold keys or social recovery group) could be used within 48 hours to rotate the control key to new ones else the original key was permanently deactivated.

We've somehow lost good names/differentiation for these along the way, and EACH may have TERMX forms. You can TERMX the proof of control, and you can TERMX the capability to revoke, and TERMX the capability to rotate.

Another reason why I'm worried about TERMX is that I'm becoming particularly concerned about the security of DID Documents. As I see new proposed DID Documents from different methods grow and have more stuff in their bag, the bag becomes more correlatable. Adding lots of delegation/stewardship/endpoints/more keys, etc. makes DID Documents not only more correlatable but also more censorable. I could, for instance, not accept DID documents where control was being delegated to a party I didn't like.

Related, it isn't clear to me now what is the consumer of a DID Document — a DID Resolver needs different things than an app that just wants enough keys to be able to do a proof of control of a Verifiable Credential or a DIDAuth/Comm connection. In fact, that app likely doesn't want to deal with endpoints either, and just wants the resolver to give it the unrevoked/valid keys that it needs, and no more.

-- Christopher Allen

@dhh1128
Contributor

dhh1128 commented Feb 24, 2020

I'm just adding a comment to see if I can get github subscriptions to work better for me. (I subscribed to this issue almost 2 weeks ago but have seen no notifications of any comments. Today when I came back and saw how much discussion has occurred, I was surprised. Github, please start telling me about this issue! :-)

@msporny assigned jandrieu and unassigned ewelton Feb 27, 2020
@jandrieu
Contributor

The question here is how a DID Document specifies what kinds of authorizations/verificationMethods/delegations it supports.

There are at least three different classes of such:

VerificationMethods

@dlongley likes to call a class of such properties "verificationMethods", but there is no explanation in the spec other than that they are a thing and that there are a few. Authentication is one.

Controller property

This apparently means that the DID method should somehow treat another DID as if it were the controller of this DID. This is not a proof mechanism, but literally a DID (if I understand what is currently specified).

The actual DID Controller

DID Methods have their own ways to manage the CRUD operations on a DID Document. They may or may not expose those proof mechanisms for changing a DID Document in that same DID Document.

Related to this is a bag of public-keys which are typically referred to in the VerificationMethod properties. These keys may or may not be related to how one actually controls a DID.

HOWEVER, the actual question here was raised by @ewelton and seconded by @Oskar-van-Deventer

Eric & Oscar, have your questions and concerns been addressed?

One action, no matter where this goes, is that this ALL needs a PR or three describing how these forms of authorization work with DIDs and whether or not they exist in DID Documents.

@SmithSamuelM

By definition, at inception, a self-certifiable identifier (some DIDs are self-certifying but not all) has one and only one public-private key pair that is authoritative for that DID. The self-certification property declares a digital crypto signing scheme for the associated key pair. This declared signing scheme is the sole verification method at inception. The authoritative key-pair is unambiguous. Its authority is independent of any infrastructure. This key pair is the root authority for that DID. It provides the cryptographic root-of-trust and the single source of truth for any authorizations that may be made about that DID. These properties are what makes it self-certifying. Subsequent authorizations may authorize other signing keys and verification methods, but at inception there is one and only one. Subsequent authorizations may transfer authoritative control to different key-pairs (as in the case of key rotation) or some other root-of-trust, but those subsequent authorizations can be verified independently by construction.

A DID that derives its root authority from some infrastructure besides a self-certifying key-pair is not self-certifying by definition. It doesn't matter that the DID includes the public key as a prefix. If that key pair at inception is not the root authority, i.e. the root-of-trust and single source of truth for that DID, then by virtue of not satisfying those properties it is indeed not self-certifying and derives its authority somewhere else. This might be registration on a block chain. In this case the block chain or registrant or some combination is the root authority (root-of-trust) and source of truth for that identifier. In this case, one has to examine the block chain to determine what is or is not an authoritative authorization.

CRUD operations on a DID Doc may be constructed in such a way that the chain of authority may be established or not. This is up to the method. This is the problem. Every method has a different "method" for establishing control authority by which one can verify if a given authorization in a DID Doc is authoritative or merely a hack of the DID resolver. A verifiable signature does not by itself ensure the chain of authority leading to that signing key pair is valid.

A truly self-certifying identifier, on the other hand, provides a universally unique, unambiguous, decentralized, infrastructure-independent starting point for its chain of authority. It may then later authorize infrastructure as a secondary root-of-trust, which means it can change that authorization to a different infrastructure. This makes self-certifying identifiers potentially truly portable. Because there is only one root key pair at inception, the DID-Docs for self-certifying identifiers can be very simple to start.
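To illustrate how simple such a document could be, here is a hypothetical sketch in which the method-specific identifier is just the incepting public key, so the single authentication entry can be checked against the identifier itself; the method name and key value are illustrative, not from any real method:

  {
    "@context": "https://www.w3.org/ns/did/v1",
    "id": "did:example:H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV",
    "authentication": [{
      "id": "did:example:H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV#keys-1",
      "type": "Ed25519VerificationKey2018",
      "controller": "did:example:H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV",
      "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV"
    }]
  }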

@SmithSamuelM

The design goal of KERI is to provide a standard universal infrastructure independent construction for establishing and maintaining control authority over a self-certifying identifier.

@SmithSamuelM

SmithSamuelM commented Feb 28, 2020

With respect to @ChristopherA, for a self-certifying identifier, by definition, the controller of the incepting public-private key pair is the only authority at inception. There is unambiguously one controller, that is, the entity in control of the private key. In contrast, a block-chain (totally ordered distributed consensus ledger) derived identifier gets its authority by definition from the ledger. In this case it could have multiple authoritative keys of various types at inception, all verifiable from the initial registration event in the block-chain ledger associated with whatever transaction scheme was used to create the identifier and register it on the block chain. The registration process determines the set of authoritative keys for that identifier. I believe that this type of identifier is implied by your comments above. What this means is that the ledger and registration scheme become the effective certificate authority for that class of identifier. The identifier is thereby locked to the ledger from which it derives the authority for all its key pairs. One could, as you describe, thereby at registration declare various keys for various authorized functions, because the ledger registration event defines what is authoritative or not. The entity that is first to register an identifier is in this sense the "controller". In this case no key-pair need be any more or less authoritative with respect to the identifier, because the registration event can declare as many or as few keys, of as many types, as is desired. The ledger is the logically centralized source of truth and therefore a centralized authority despite being governed in a decentralized way.

In contrast, because a self-certifying identifier, by definition has one and only one authoritative key pair at inception there is one and only one controller of a single key pair at inception. No other key pairs can be authoritative. The construction of the identifier binds one and only one key pair to the identifier. With self-certification, by definition, any other key pairs may only be authorized by virtue of a later signed authorization with that incepting key pair. A subsequent authorization could transfer control (rotate) the incepting key pair to another key pair or a multi-sig set of key-pairs. But when or where this is done is at the discretion of the incepting controller. The advantage of this approach is that it allows the normalization of all authorizations as belonging to a chain of key events rooted in the inception event of the self-certifying identifier with a root control authority. This then means that this normalized set of key events may be provided merely with an immutable log that may be supported by differing infrastructures over time (including ledgers). As long as one consistent copy of the log exists, any reader of that copy may verify the chain of authority. This provides fully portable and therefore more completely self-sovereign identifiers than identifiers that by construction are ledger locked.

@ewelton
Author

ewelton commented Feb 28, 2020

@SmithSamuelM thank you for that clear presentation of the self-certification challenge. I had originally thought of the rubric project as being a "report card" that could model exactly that sort of trust-consideration. For many use-cases, being registered on a block-chain, with some linkage between the identifier and the genesis document is "good enough" and "a definite improvement" - and, largely, these are the use-cases I see moving forward with the greatest vigor. By the time we have perfected DIDs, digital identity will be managed by private-corporate wallets with interoperability worked out as meat-management contracts between executives.

What I'm not clear about yet, in my head, is the degree to which we are blocked in achieving key-based (not infrastructure based) authority using current registries (like bitcoin, eth, your-local-chain, etc.). I'm not convinced yet that we can't layer what we need on top of just about any infrastructure - but I also can't convince myself that we can always do that - it is just a limitation in my understanding, and one I hope to clear up.

I need to clear it up because I see DIDs as becoming very secondary at this point. If we can layer it across any infrastructure, and we can access that infrastructure with URLs (w/o any intrinsic binding to the DNS), and if we can use those access points for VCs, then why use a DID at all?

The TERMX issue is still not clear to me - it is controller, and verificationMethod, and authentication, and delegation (the selected term in the PR, yes?) - and there is now this strange hodge-podge of registry-defined proof-purpose (or should I say delegation?) sections - these remain poorly defined and unclear. I understand the idea behind them, my concern is about placement.

We went from a decentralized extension model to a centralized, dictatorial model where we now have to wait to find out what sort of "model of the cryptographic universe" we must have in order to use DIDs - likewise, the first D in DIDs is intensely political - so the barrier to DIDs is now "I must have the politics and the right model of my cryptographic universe" - DIDs are only interesting if I fit that pigeonhole. DIDs are not a "new form of identifier" - they are limited in scope by politics and world-view.

The use cases above - I actually started an analysis of them - but have just been too busy to complete it. I really appreciate that work, @jandrieu, and I want to come back w/ something equivalent; I've just been struggling for time. I still do not see them as representative of what I perceive to be the space-of-dids - most of which would be devices and artifacts, and very, very, very few would be "public coordination and correlation points" like you would use for coordinating the chaperoning of a high school dance. Consider that use case relative to "Jerry the janitor will have keys to the rooms - call him if you need access" - the sort of solution that humans have employed for quite some time, warts and all.

What I was hoping to do was look at text like this

Constance uses a DID for signing parental notes for her daughter Margaret's teachers, such as a note that Margaret has a doctor's appointment and will need to be picked up at 10am on Tuesday the 12th of January. She creates these notes as Verifiable Credentials and signs them such that the school can verify the signature using the DID already on file with the School's registrar.

and figure out why the DID-core document is the appropriate place to have a TERMX field, and which TERMX fields would be involved. In the above use case, I do not think that "complex TERMX modeling" in the DID-core is warranted - rather, simple, minimal cryptographic information in the DID-core is required - and credentials matter. Note that I can do the credentials w/o DIDs, effectively saying that the above use case may not benefit from DIDs at all. I see nothing in the above use case that tells me about essential properties of DID-core, the part that needs to be modeled so that it can be in a lawyer's Adobe Inc. produced PDF.

Likewise, consider Coquelicotizers like myself (w/ a nod to @dlongley for the gift of Coquelicot!)... I do not understand the need for DID-core to say "for the purpose of 'Coquelicotification' you can use any of the following keys" - what I do not understand is why I have to wait for someone in a global standards group to tell me whether or not Coquelicotification is a valid thing I ought to be doing, or if I am simply living my life incorrectly and fundamentally committing sins against Allah by Coquelicotizing. If I identify as Coquelicot and you don't like it, then damn it, I'll reach into my american roots and invite you to a civilized dinner where you, me, and Mr's Smith, Wesson, and Kalashnikov can discuss it politely while you fix your thinking.

When faced with these sorts of religious concerns my fallback is always "let the user decide" - and, in this case, let the did-controller say whatever they f***ing want about the subject. The role of this technology is not to proscribe the conversation, it is to protect the conversation and to document the cryptographic provenance. We are protecting and encouraging free speech, not dictating and limiting it (e.g. the registry approach).

In supporting the right of the controller to talk to the world we need some mechanism to advertise feature sets - if my Coquelicotizing buddies and I want to collaboratively Coquelicotize then we can advertise this to each other with the Coquelicotization support clearly indicated in our DID-doc - and we can advertise to the world what it means to be a Coquelicotizer by publishing some sort of.... oh, what's the word..... "context information?" in a place well known to Coquelicotizers. Right now I've got to get some whitebread in Europe to accept my intense desire to Coquelicotize and agree that i am not offending Allah - W3C Ahkbar!

What has happened is that we are not sure if our coquelicot peccadillo will or will not wind up in the spec - so we opt for a "companion document" - the expression, by the did-controller, about the did-subject, that relies on the minimal linkage with DID-core - namely some sort of public key tied into a root-of-trust infrastructure w/ properties ranging from 'self-certification' to 'according to walmart'. If I can get to that identifier/key binding in some way, and I have a companion document, then I am free and I can ignore the deliberations about TERMX, and that is exactly what is happening. If W3C resists my desire for Coquelicot we'll just use the companion doc - but the heart wants what the heart wants and I will Coquelicotize until the red corn rose drains from my veins.

The confusion over TERMX dictates that developers "back off from DIDs for a while, let W3C determine what sort of cryptography model we can have, and then evaluate whether or not W3C guessed right about our current and future needs" - in the meantime, focus on a "companion control document" that sits next to the DID, and describes the roles of the keys in the DID, service endpoints, and whatever else. The DID-companion document is the set of assertions a key-controller can make and link w/ DIDs - and the only link w/ the DID-document is the publicKeys section - once DIDs are perfected and in broad adoption with a stable spec - that is the time to step back and say "can I move some of the companion-information into the did-core and would that help me in some way?" - perhaps moving it from the companion-control document to DID-core would aid in interoperability - but we are 10-15 years away from that problem, and I think that traditional IAM and PKI w/ mild "SSI-like" enhancements will have effectively captured the opportunity during our deliberations.

So "delegation" - the value that #4 gave to TERMX - remains unclear. The concept is sensible, but the current implementation in DID-core and the relationship to the subject/controller nexus remains confusing. It is this confusion, and not the ability to include service_endpoints in a PDF document, that effectively blocks DID from useful adoption. It makes sense now to "see what happens w/ DIDs" but focus on VC ecosystems based on URLs and, to the extent possible, self-certifying identifiers rooted on traditional infrastructure - and once the W3C determines whether or not Coquelicot is a valid constituent of my rainbow, we can evaluate the cost/benefit tradeoff of migrating some portion of the "companion document" into the DID-core.

Based on TERMX, I feel the only professionally responsible path open is to suggest "follow DIDs", but focus on a simpler architecture based on root-of-trust principles, like self-certification, combined with the ability of a controller to express whatever they f***ing want about an identifier.

@ewelton
Author

ewelton commented Feb 28, 2020

@SmithSamuelM thank you again for a very clear and precise expression of the relationship between inception and rotation.

I'm still struggling a little though and perhaps we could clarify one item:

Is it possible to create a SCID (Self-Certifying IDentifier) at http://<dns-root>/<identifier> such that the document returned could be trusted?

Clearly I could access that URL and get back an "I Has Cheezeburger" cat meme - and clearly I could say "nah, no good" - but assuming I get back a document containing proof material, could I "certify" that the document is in good standing (although not necessarily current)?

These conditions - the conditions under which the resolution can be verified - scope the TERMX discussion. There is little point in negotiating the details of TERMX when the root-of-trust cannot be established - but there is value in discussing TERMX if we can itemize the "threat model" attached to a resolution method.

@Oskar-van-Deventer
Contributor

Oskar-van-Deventer commented Feb 28, 2020

Eric & [Oskar], have your questions and concerns been addressed?

I am afraid so. The discussion seems to settle on "answer d): any and all of the above".
A DID can identify
-a generic bag of keys with provenance and/or
-a self-certified bag of keys and/or
-a legal or natural person that has control over those keys and/or
-a legal or natural person that is different from whoever controls those keys and/or
-a piece of electronics that can control those keys and/or
-an entity that cannot control keys, see e.g. #199 and/or
-whatever else an implementer comes up with
The semantics of what a DID identifies could even change during a DID's lifetime.

The current W3C-DID-WG DID document specification allows any-and-all-of-the-above. My question is: do we want to be able to signal in a DID document explicitly which of the above applies to a particular DID document?

@ewelton
Copy link
Author

ewelton commented Feb 28, 2020

@Oskar-van-Deventer I agree with you, but I am not happy with where we wound up

You wrote:

I am afraid so. The discussion seems to settle on "answer d): any and all of the above".
A DID can identify
- a generic bag of keys with provenance and/or
- a self-certified bag of keys and/or
- a legal or natural person that has control over those keys and/or
- a legal or natural person that is different from whoever controls those keys and/or
- a piece of electronics that can control those keys and/or
- an entity that cannot control keys, see e.g. #199, and/or
- whatever else an implementer comes up with
The semantics of what a DID identifies could even change during a DID's lifetime.

The current W3C-DID-WG DID document specification allows any-and-all-of-the-above. My question is: do we want to be able to signal in a DID document explicitly which of the above applies to a particular DID document?

And out of that - the signature point, for me, is:

My question is: do we want to be able to signal in a DID document explicitly which of the above applies to a particular DID document?

[Comment removed by Chairs]

As @Oskar-van-Deventer shows above, there are a lot of edge cases with different overlaps. The idea of a "1-size DID-doc to fit all cases" (the registry model, the ADM concept) is a DID-killer. Under the current model, a "registry convergent data model" means that we all need to look up a "companion document" in order to be useful. This is the indication that we need to "abandon the standard" and do something that helps people deliver product instead of wallowing in comfortable what-if beaches. DID-docs are 100% "right" and "cool" - but they do none of the work and their "features" are dramatically underutilized - this is the shape of a standard that succeeds on paper and fails in practice.

What the world needs is a "new form of identifier" that is suited to "connecting public/private key pairs" to a public registry. This is not DIDs - this is something different. A key concept behind the neo-DIDs is that the controller can assert, to the world, something about the subject.

Modern DIDs impose draconian, one-size-fits-all control and destroy any concept of interoperability - Credential Users should be advised to no longer support DID-based credentials, unless the did+method can be confirmed and translated into the appropriate resolution on current infrastructure.

What is most important is a "pathway off of DIDs" and "onto URLs" - perhaps with self-certifying characteristics - we need to safeguard the VC space from the foibles of DID-land.

@Oskar-van-Deventer
Copy link
Contributor

Oskar-van-Deventer commented Feb 28, 2020

My question is: do we want to be able to signal in a DID document explicitly which of the above applies to a particular DID document?

[Comment removed by Chairs]

I am not sure why my remark warrants such a negative response. The purpose of signalling something is to aid interoperability. I am pretty sure that my six examples cover most of the relevant cases. We can include extensibility, if that would address your unspecified worry. However, if we don't signal something about what a DID is supposed to identify, and leave it to the DID-method implementors and VC implementors, then that could lead to a lot of confusion in the market and interoperability issues.

@burnburn
Copy link

burnburn commented Feb 28, 2020

[Comment removed by Chairs]

@ewelton, the above language/accusations/analogies/etc. are not appropriate. While disagreements are acceptable here, incivility and/or racially/religiously challenging language is not. This is a warning from the Chairs not to do this again, EVER. @brentzundel @iherman

@ewelton
Copy link
Author

ewelton commented Feb 29, 2020

My sincerest apologies to the community - i've been quite fuzzy from some heavy respiratory medication and not tracking quite clearly - clearly my judgement is impaired - i'm happy the content has been removed and I agree it was inappropriately vulgar. Again, please do accept my apologies for the aggressive tenor of the text - and i am happy it has been removed for the benefit of the community.

@dmitrizagidulin
Copy link

@Oskar-van-Deventer

My question is: do we want to be able to signal in a DID document explicitly which of the above applies to a particular DID document?

Oskar, I agree with you that this is a really important use case.

It would be incredibly useful to be able to (optionally, of course) determine whether a DID belongs to a person, an organization or a software agent. (And note that this would be only for public-facing DIDs, such as those of corporations, public officials, and other appropriate situations.)

On a historical note, one of the things about WebID Profiles (an early web-based form of DIDs + DID Docs) that is both a strength and a weakness is that they bundled a bunch of stuff into one document - public key material (like DID Docs), but also a type (person/organization etc.), and various social-media-profile-like info. It was convenient because there's only one document to fetch, but it was also a fairly major liability, for all the reasons why DID Docs do not have those attributes (PII + immutable ledger concerns, privacy concerns, and more).

So how do we solve this use case with our current DID Core spec?

The strong consensus so far, among this group, is that DID Documents should be kept as spare as possible (so, only proof material and service endpoints).

Given that state of affairs, how can we give more information about (the public aspect of) the entity controlling the DID? It sounds like using a service endpoint to link to some sort of profile is the only option.

For example, one of the service endpoints that's been previously discussed is a GDPR Proxy (which links to an anonymizing service that nevertheless allows authorized parties to get more information). That endpoint could provide the entity type that we're discussing in this issue.
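
A rough sketch of what that could look like in a DID Document follows; the service type name and URLs are hypothetical, and only the general shape of the service array follows the current draft:

```typescript
// Sketch only: "GdprProxyService" and the URLs are hypothetical; the general
// shape of the "service" array follows the current DID Core draft.
const didDocument = {
  "@context": "https://www.w3.org/ns/did/v1",
  id: "did:example:org123",
  service: [{
    id: "did:example:org123#gdpr-proxy",
    type: "GdprProxyService",
    serviceEndpoint: "https://proxy.example.com/org123"
  }]
};
// An authorized party would query the proxy (out of band, under whatever
// consent flow applies) to learn the entity type, rather than reading it
// directly from the DID Document.
```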

@jandrieu
Copy link
Contributor

jandrieu commented Mar 5, 2020

Unfortunately, I must oppose the notion of the DID Document revealing or easing the discovery of the Subject.

Yes, I understand the goal. People want to turn DID trust anchors into general purpose directories where you list anything you want, which, unfortunately, turns DIDs into a vector for PII.

There is a growing body of argument that even pseudonymous identifiers, when stored on a blockchain, are de facto violations of GDPR.

My strong opinion is that the best practice that will emerge--and may ultimately be required--is to require that all additional information about the subject be explicitly conveyed via a separate channel.

You want my discovery endpoint? Ask for it.

You want credentials that attest to my personhood? Ask for it.

The architecture itself should not reveal anything about me. Including whether or not I'm a person, a corporation, a group, or a dog.

The question before us, IMO, is not what can we jam into a DID Document, but rather what can we remove and handle at another layer in the architecture.

If we do this right, DIDs will become the ubiquitous layer for digitally verifiable identity, online and off. That means we are talking about the privacy risks of essentially everyone on the planet. IMO, we should not be creating an architecture that encourages privacy-violating patterns. Rather, we should identify how to improve privacy through suitable separation of concerns.

It would be relatively trivial to stand up a discovery service where someone can register a DID and any associated resources that might be served up to inquiring parties after whatever consent or authorization process is desired.

Given the simplicity of that, my answer is to just set that up as a separate service. The slight benefit of leveraging the identifier layer for additional data is not worth the innate privacy violations that would result.

To answer your question directly:

My question is: do we want to be able to signal in a DID document explicitly which of the above applies to a particular DID document?

I do not. And I would prefer an architecture which does not encourage such behavior. Instead, manage those signals through explicit, consent-based, purpose-bound interactions.

@dmitrizagidulin
Copy link

@jandrieu I agree with much of what you say. Couple things to clarify though:

There is a growing body of argument that even pseudonymous identifiers, when stored on a blockchain are de facto violations of GDPR

This is not the case for public facing entities.

You want my discovery endpoint? Ask for it.

Ask for it how?

@jandrieu
Copy link
Contributor

jandrieu commented Mar 6, 2020

@dmitrizagidulin wrote:

There is a growing body of argument that even pseudonymous identifiers, when stored on a blockchain are de facto violations of GDPR

This is not the case for public facing entities.

Actually, I am a public facing entity and GDPR still restricts putting my Personally Identifiable Information on a non-redactable substrate.

You want my discovery endpoint? Ask for it.

Ask for it how?

I'd suggest using the same channel that you got my DID from.

If you don't have a channel to reach me, then you also don't have a channel through which I can secure consent and establish terms of use and purpose binding.

For example, at least one DID method developer recently championed:

  1. I need a decentralized substrate wherein I can deterministically find all IDs published on it & resolve PKI/routing states

  2. Any observer must be able to independently find all IDs & compute the same state

Building a DID method that allows data miners to enumerate all DIDs of that method will inevitably enable those same miners to

  1. probe every service endpoint to see if they can get a response, including through service identification and social engineering that service,
  2. scour databases of VCs for those same DIDs to find attestations about the Subject, and
  3. analyze the nature of the service endpoints, verification methods, and any correlations found in (2) to perform critical intelligence analysis to figure out who the subject is, what they are doing, and who they are interacting with.

All of which is likely to be made available to anyone resolving the DID to the DID Document.

Fortunately, not all DID Methods enable enumerating all the DIDs of that method, but some do. That in itself is a privacy problem.

What makes it worse is the baked-in exposure that WILL happen when consumer-facing services start integrating with DID infrastructure. Most people have no idea that their Philips Hue smart light bulbs completely expose their usage patterns to anyone with a nearby receiver (and I have heard IoT security experts blithely declare that's just fine as long as control is secured). Just like they shrug and sigh when faced with Alexa's monitoring their every command. Just like they smile when they buy police-connected surveillance cameras to mount on their doors and in their houses. All of this in the rampant pursuit of convenience and fancy features.

If we care about building this infrastructure out in a morally responsible way, we can't simply line up the mechanisms of exploitation and smile all the way to the bank. We need to think critically about how these technologies are going to be used and abused, then adjust the system to minimize the unnecessary exposure to all parties, ESPECIALLY to those parties who are never going to understand what's going on under the hood. 99% of the world is never going to read a DID Document. But most of them are going to be exposed to whatever privacy implications we bake into this system.

IMO, if you have no channel with the subject, you have zero business contacting them, unless that is your business, Mr. Sales Person, Ms. Politician, Miss Charity Fundraiser, Mr. Criminal. And I have no interest in giving those parties added access to anything.

The more information you put into that DID Document, the more likely you are going to cause privacy harms and the more likely the application or business based on that practice will struggle with GDPR compliance.

@dmitrizagidulin
Copy link

dmitrizagidulin commented Mar 6, 2020

@jandrieu

Ask for it how?

I'd suggest using the same channel that you got my DID from.

Specifically - I'm a wallet implementor. I get a Verifiable Presentation, containing a VC that is issued by a DID. I'd like to show a bit more information (in the Issuer column). As in, my UI will be slightly different if it's an organization, or a person, or a software agent.

As a wallet, how am I supposed to know about where the user got the DID from?

@jandrieu
Copy link
Contributor

jandrieu commented Mar 6, 2020

IMO, this is like asking how a file system knows what to name a folder.

Just ask the user.

The illusion that you might "automatically" do something is exactly what led to surveillance capitalism in the first place. What you can glean from any "automatic" look up may or may not be accurate and it may or may not align with how the individual wants to organize their records. The only definitive answer you can get is to do what the user asks you to do.

IMO, the right answer for the wallet provider is to make it easy for individuals to select (and change) which entity type is used for rendering each DID.

HOWEVER, you could also host a directory service that turns any DID into a set of values for things like Icons, colors, names, entity type, etc. Then a wallet could use that directory to automagically use the representations from that directory.

THAT directory could be a third party screen scraping data hoarder, or it could be an opt-in listing by DID Controllers.

THAT directory could also manage deletion requests, consent mechanisms, and everything else GDPR requires.

My point here is that first and foremost, we are solving for an identifier registry problem. Some of us also want to build directories. If we conflate those two, we will create privacy harms and likely run afoul of regulations. If we keep them cleanly separate, then ANYONE can build their own awesome directory service and ANY wallet can choose to use it as they see fit.

Just build your own directory and be done with it. We can even advocate a microformat that can be published on a web site to associate resources with a DID. You still need a spider crawling the web to build its index, but that gives a lovely form of decentralization along with the necessary central control over a directory where compliance and consent issues can be managed.
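
As a sketch of the split I'm describing, a wallet might do something like the following; the directory URL, the response shape, and the askUser prompt are all hypothetical:

```typescript
// Sketch of the wallet-plus-directory split described above. The directory
// URL, the DisplayHints shape, and askUser are hypothetical, not from any spec.
interface DisplayHints {
  name?: string;
  entityType?: "Person" | "Organization" | "SoftwareAgent";
  icon?: string;
}

// Stub for the wallet's own UI prompt; the user's answer is always authoritative.
async function askUser(did: string): Promise<DisplayHints> {
  return { name: did };
}

async function displayHintsFor(did: string): Promise<DisplayHints> {
  const directory = "https://directory.example.com/lookup"; // a directory the user opted into
  try {
    const res = await fetch(`${directory}?did=${encodeURIComponent(did)}`);
    if (res.ok) return (await res.json()) as DisplayHints;
  } catch {
    // directory unreachable or DID not listed: fall through to the user
  }
  return askUser(did);
}
```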

@dhh1128
Copy link
Contributor

dhh1128 commented Mar 6, 2020

I wanted to second everything that Joe said. True and well said.

I also wanted to throw out an idea that I've been ruminating on for a long time now, that may offer a partial compromise. It is only half-baked, but it may spark some useful innovation in this group. It is based on the insight that knowing a route to a person, and knowing/discovering an identifier to a person, may be partially separable, and that there may be reasons to promote the distinction.

To understand what I mean by this separation, consider the classic plot device from thriller movies where the establishment/law enforcement and the lone maverick on the run are talking on the phone/internet, and the establishment is trying to trace the call/network traffic. If the lone maverick hangs up in time, he's safe; if he talks too long, his location is pinpointed, and he's caught.

What I realized is that the lone maverick in this case is not particularly endangered by the establishment having a name (identifier) for him, or by exchanging words. It's the traceability that's problematic, because the knowledge of routing is what leaks dangerous metadata. The hacker cooperative known as Anonymous apparently doesn't mind that people know they exist, since they've given the world a handle to use. And they probably don't mind that people can speak to them (e.g., by posting in public places to get their attention). I speculate that in theory, they probably wouldn't even mind speaking back. In this respect, "Anonymous" is actually an ironic name; it's not the anonymity but the untraceability that they really value. But usually you don't get untraceability without anonymity, so they went with the word that would help all of us make the right assumptions.

Anyway, here's a writeup of the idea. I'd be interested particularly in @jandrieu and @dmitrizagidulin 's reaction.

https://docs.google.com/document/d/1M_PmELevT6uIGIENmZebM1oHFkU8OPTrHqORohGEdjA/edit#

@SmithSamuelM
Copy link

SmithSamuelM commented Mar 6, 2020 via email

@dmitrizagidulin
Copy link

@jandrieu @dhh1128 - If I'm understanding your comments correctly, does that mean you would like to see Service Endpoints removed from the DID Core data model?

@SmithSamuelM
Copy link

SmithSamuelM commented Mar 6, 2020 via email

@jandrieu
Copy link
Contributor

jandrieu commented Mar 6, 2020

For me, yes, although I realize that runs counter to expectations from most in the group.

I'm struggling with service endpoint versus the privacy costs.

To wit, if we don't have service endpoints, we may well have no need for matrix parameters, which themselves are a problem today.

We also have an ill-defined algorithm for aggregating path and query parts across the DID-URL and service endpoints. Still very much an open issue.

@dmitrizagidulin
Copy link

@dhh1128

Anyway, here's a writeup of the idea.

Oh interesting, that's a very cool design. I'll need some time to properly absorb it / think it through.

@Oskar-van-Deventer
Copy link
Contributor

Oskar-van-Deventer commented Mar 7, 2020

I'm struggling with service endpoint versus the privacy costs.

Different service endpoints offer different levels of untraceability (to use @dhh1128's term). IP addresses and HTTP endpoints are fairly traceable. Throw-away email addresses are less traceable. TOR onion addresses are even less traceable. Using the TOR onion address from an aggregator on TOR may make you even less traceable. Not publishing a service endpoint (or publishing it only via PeerDID) makes you even less traceable. Not communicating at all makes you likely the least traceable. How much privacy do you want?

@jandrieu
Copy link
Contributor

jandrieu commented Mar 9, 2020

The ability to retrieve or verify the public cryptographic material associated with my DID is all that is required architecturally.

Everything else is an additional privacy risk for features which can easily be handled with other systems.

As @dhh1128 proposed, there are any number of schemes that can perform discovery. Some are more decentralized than others, some more privacy preserving.

My argument isn't that directories or discovery is bad. In fact, they provide a useful component in the resulting system. However, embedding discovery and/or directories into the DID architecture is both unnecessary and a fundamental privacy problem.

So, let's separate the two so people can more clearly make distinctions at the architectural layer.

The primary topological advantage of DIDs over other cryptographically verifiable identifiers is indirection between the identifier and the crypto, which enables key rotation without changing identifiers. In some Methods that indirection also enables verifiable proof that the "current" cryptographic material is the authoritative version. Also in some methods, the transitions between versions of that cryptographic material can be audited. In others, it can even be reversed through elevated authentication to enable "undo" features in case of improper rotation. It is the decentralized indirection of DIDs that enable these capabilities.

These are the features that dramatically increase the usability, flexibility, and security of cryptographic identifiers and cryptography in general.

These are the features that offer some hope to resolving the usability issues of cryptography-based authentication, authorization, and control.

THAT is what we need to standardize on.

Turning DID Methods into directories doesn't support any of that. It's an extra feature that is simply "nice-to-have".

Even listing verificationMethods is debatable (and is being debated elsewhere.)

Even if you don't buy into my harms-based analysis, there is a simplicity argument. The more we stuff into the DID Document, the more we shoehorn "convenient" features into the base layer, the more complicated it is to build, deploy, and use methods, resulting in less interoperability and less adoption.

If we can do it at another layer in the architecture, we should. "Nice to have" isn't sufficient at this layer. Our goal should be the simplest capability that achieves the architectural shift we are here to create.
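
To make the indirection point above concrete, here is an illustrative before/after of a rotation; the method, keys, and values are made up, and only the principle (same DID, new verification material) is the point:

```typescript
// Illustrative only: the same DID before and after rotation; all values are made up.
const beforeRotation = {
  "@context": "https://www.w3.org/ns/did/v1",
  id: "did:example:alice",
  verificationMethod: [{
    id: "did:example:alice#key-1",
    type: "Ed25519VerificationKey2018",
    controller: "did:example:alice",
    publicKeyBase58: "<old key>"
  }]
};

const afterRotation = {
  "@context": "https://www.w3.org/ns/did/v1",
  id: "did:example:alice",               // identifier unchanged
  verificationMethod: [{
    id: "did:example:alice#key-2",        // rotated key replaces the old one
    type: "Ed25519VerificationKey2018",
    controller: "did:example:alice",
    publicKeyBase58: "<new key>"
  }]
};
// Relying parties keep the same identifier; resolution picks up the new key,
// and (method permitting) the transition itself is auditable.
```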

@dmitrizagidulin
Copy link

@jandrieu I agree with most of what you say, again.

Except that - for current implementations, the service endpoint feature is not 'nice to have', it's critical (until we standardize some other mechanism to perform that same function).

@Oskar-van-Deventer
Copy link
Contributor

Except that - for current implementations, the service endpoint feature is not 'nice to have', it's critical (until we standardize some other mechanism to perform that same function).

+1
Note that the service endpoint is optional, so big institutions may choose to use it to advertise their public reachability, and individuals may choose not to include it in public DIDs, or to advertise it only via PeerDID.

@jandrieu
Copy link
Contributor

jandrieu commented Mar 10, 2020

@dmitrizagidulin Can you elaborate?

I'm open to learning about use cases that require service endpoints. Mostly what I'm seeing is features that are desired for convenience rather than fundamentally required.

Don't get me wrong, just like the architecture of VCs was built on the anticipation of DIDs, so too will DIDs be built on the anticipation of other layers in the architecture, such as directories and data hubs. There will be directories and data hubs.

However, just as VCs deferred the details of DIDs, we would probably do well to defer the details of directories.

As long as we have a cryptographically verifiable identifier, we can create discovery protocols, directory services, data hubs, and delegation mechanisms that use those identifiers. These "4Ds" can readily be built on top of DIDs without DIDs or DID Documents doing anything to proactively support them. There is, to my understanding, no functional requirement to put hooks into DID Documents to realize those layers. And, to wit, there are no use cases in the use cases document that demonstrate the value of such a requirement (although the document does, unfortunately, presume that service endpoints are a thing, which leaks the solution into the opportunity statement).

Just like the web spawned Google and AWS, DIDs will spawn services at a higher level. But none of the web stack (html, http, URLs, etc.) required those services. Heck, HTML didn't even have JavaScript or CSS when Netscape went public.

Doing any one of the 4Ds "right" is a hard problem, one that almost certainly requires explicit separation from the DID layer to achieve requirements related to consent, GDPR, and privacy. If we encourage the use of DID Documents to enable directories, discovery, delegation, and data hubs, we have to both figure out and get consensus on how to properly enable all four of those mechanisms.

We basically have three options.

Option 1 DIDs w/o built-in support for the 4Ds: limited in scope, simpler, and avoids certain harms. It will be easiest to do this right.

Option 2 DIDs w/ mechanisms for all 4Ds, done right: greater scope, more complex, but can avoid many harms--but at this stage we can't be sure if we can avoid all harms from conflating the layers. It will be hardest to do this in a compact timeframe and even harder to understand the ramifications of supporting these additional layers.

Option 3 DIDs w/ mechanisms for the 4Ds just bolted on: more complex with both known and unknown harms. This we can do faster than option 2, but not as quickly as option 1. It will introduce considerable risks of both complexity and harm. And it ignores the ramifications of supporting those layers.

All three of these options can support discovery, directories, data hubs, and delegation. Option 1 simply puts them at a different layer, a layer which would be out of scope for the DID Core spec.

Option 1, IMO, already seems hard enough. We have a ton of strong, smart contributors with passionate notions of how to do DIDs "right". It is going to take us time to work through the competing requirements, even for this simplest version of the architecture.

We have been pursuing Option 3. This conversation is, in part, a realization that Option 3 may be a bad idea. In which case, we can either hunker down and work through the complexities in Option 2 in order to keep hooks for the 4Ds or we can simplify the work and defer the 4Ds to future standardization (option 1). Or, of course, we can keep going down the current path.

My concern is that if we continue down either Option 2 or Option 3, we will end up with a complicated mess that is inseparable from its privacy problems. It won't meet the needs of those committed to privacy protections and it will introduce so much complexity that interoperability beyond Option 1 is a joke anyway.

For example, we still do not have a reasonable agreement on the right way to handle the processing of the path and query parts of a DID URL when the service endpoint in the DID Document also has path and query parts. I have repeatedly raised this issue and I will continue to, as the answers I get keep changing. To wit: the DID Resolution spec's Service Endpoint Construction algorithm requires that

  • both the DID-URL and Service Endpoint may have a path
  • they can't both have a query component
  • they can't both have a fragment component
  • the service endpoint URL MUST be an HTTPs URL

These requirements are NOT in the DID Core spec, and I was surprised to see that the algorithm isn't simply appending the query and path as described by others: before we even get to the algorithm, constraints are placed that simply invalidate many legitimate-looking combinations. And, yes, there is no discussion of what happens when one of the three constraints listed above is violated. And this binary constraint between service endpoints and DID-URLs has never been raised to me in the discussions I've had.

The algorithm goes on to specify a set of transformations that is different than what I last heard from @peacekeeper (one of the editors of that spec), who had suggested that instead of defining a unique algorithm, we use the "Relative Resolution" algorithm from RFC 3986. (To Markus's credit, that suggestion is also proposed in the current DID Resolution spec.) Unfortunately, the relative-reference approach would still have the problem of stripping the filename from the service endpoint if it has a path part that ends in anything other than "/". This would be an unfortunate hole if my service endpoint is https://example.com/joe and the DID-URL has a path part, e.g.
did:example;service=a:mike
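
The stripping behaviour is easy to demonstrate with plain RFC 3986-style relative resolution (the WHATWG URL API behaves the same way in this case); assuming a DID-URL path part of "mike" as in the example:

```typescript
// Demonstrates the "filename stripping" problem with relative resolution when
// the service endpoint's path does not end in "/". Values are from the example above.
const serviceEndpoint = "https://example.com/joe";
const didUrlPathPart = "mike";

console.log(new URL(didUrlPathPart, serviceEndpoint).href);
// -> "https://example.com/mike"      ("joe" is silently dropped)

console.log(new URL(didUrlPathPart, serviceEndpoint + "/").href);
// -> "https://example.com/joe/mike"  (only a trailing "/" keeps the segment)
```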

All of which is just to say service endpoints are a huge complication.

I don't mean to say these issues are unsolvable problems. Rather, they are hard problems that, IMO, will take considerable resources to solve. The group has already established consensus around deferring resolution to a separate specification precisely because of this complexity.

So why do we think it is going to be effective to stick service endpoints in a DID Document when we can't even describe how these likely use scenarios are going to work (and break)?

What I see is the acceptance of a SIGNIFICANT amount of added, unbound complexity to support features that are not fundamentally required to realize the unique value of decentralized identifiers.

So, to restate my question, @dmitrizagidulin can you identify a use case for service endpoints that cannot be achieved as a second layer?

@dmitrizagidulin
Copy link

@jandrieu

So, to restate my question, @dmitrizagidulin can you identify a use case for service endpoints that cannot be achieved as a second layer?

Let's start with - first and foremost, service endpoints are needed to link to the second layer.
All those directories? The DID controller needs to be able to state which ones they've authorized to use, or which ones they're participating in.
Or is the expectation that there's going to be just one directory?

@jandrieu
Copy link
Contributor

Hmmm...

My web page doesn't need to link to Google. Nor to Duck Duck Go. Nor to Bing.

SSL certs don't link to Google's certificate transparency log.

So there is no explicit requirement that the DID Document itself link to a "preferred directory"

What you need is simply the existence of one or more directory services that might be used to retrieve publicly published information. Users will use the directories they like or trust. If such information is to be authoritative by the Controller, it's easy to sign credentials to demonstrate that. And, just like one might submit a website to a search engine, people can submit their DIDs and associated service endpoints to their preferred directory/ies.

If you want information that is not publicly published, then get it directly from the DID Controller.

@brentzundel
Copy link
Member

brentzundel commented Mar 10, 2020 via email

@SmithSamuelM
Copy link

SmithSamuelM commented Mar 10, 2020 via email

@Oskar-van-Deventer
Copy link
Contributor

If we would strip down DID and DID document to their minimum essential function ...

bootstrap of information to establish control authority over a DID.

..., and hence remove all communication-related stuff (encryption keys, service end points), then how does the result differ from classic X.509?

@philarcher
Copy link
Contributor

This thread is now too long to read and absorb everything that everyone's said, so I apologise for the less-than-detailed response. But what I take from it - and I stand very much ready to be corrected - is that the decision point is around what should be in the core spec, specifically, whether things like service endpoints should be included in the DID doc, or rather, whether the core spec should define service endpoints as a thing. I'm struck by the arguments here and elsewhere about "wishing we hadn't ever coined the term DID document."

My 2 cents is that no, service endpoints should not be mentioned in the core spec. And if we can write the core spec without mentioning DID documents, so much the better.

Justification:

I agree with those commenting that DID core is about the identifier and the means of proving control of that identifier. Anything else is another layer that can be defined and added separately. The URI spec does not define HTTP, HTTP does not define any data format and so on. Layers have a history of working well - or being replaced with the benefit of hindsight.

In my own work - which has a good deal in common with aspects of this discussion - we first defined a structure for HTTP URIs that included our (GS1) identifiers.

That was version 1.0 of the spec, formally published as a GS1 standard in August 2018.

Then we worked on resolvers, link types (our version of service endpoints), semantics and more. That was published as version 1.1 of the spec. It could have been - and at W3C probably would have been - a separate standard in the same suite. We're about to start work on version 1.2 that will include the next layer. Version 1.1 did not change what was in version 1.0 - the core - and nor will version 1.2.

I don't think it makes any sense - I'm being polite - to talk about service endpoints without also defining the resolution piece. As the discussion here shows, resolution and service endpoints are so closely inter-dependent that discussion of one without the other is bordering on meaningless.

So fwiw, my preferred way forward would be:

  1. Define DID-core purely in terms of the URI syntax and the provision of necessary material to prove ownership. We might hope to get that done very quickly.
  2. Take all the service endpoint stuff and make that part of the resolution spec that also goes on the Rec Track.

That does not make the current ideas around DID docs redundant, indeed, we should be careful not to annoy people in the community by making their current DID doc-based services suddenly non-conformant, but working in layers, I believe, will give us the flexibility to get each bit right in its turn.

@SmithSamuelM
Copy link

SmithSamuelM commented Mar 11, 2020 via email

@ewelton
Copy link
Author

ewelton commented Mar 11, 2020

Would a self-signed certificate establish me as a root of trust and allow me to begin a chain of key-event traces? It is not clear that I'd be any less trustworthy than some arbitrary blockchain - but it would begin a witnessed lifecycle for a public key, against which evidence of trustworthiness could accumulate.

I'm curious how different did:key would be from a self-signed certificate issued against the distinguished name "key=X" - not in terms of "common use of certificates", but just mechanistically? I mean, we definitely do not want to value the distinguished name component attested to by any CA - and such certs don't have to be viable in the common roles of certificates, but they feel similar - i'm curious how on/off target that is?
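
For the mechanistic side, my rough understanding of did:key is something like the following sketch; the 0xed01 multicodec prefix and the base58btc "z" prefix follow the did:key draft, while the bs58 dependency and the input key bytes are assumptions of this sketch:

```typescript
// Rough sketch of did:key construction for an Ed25519 public key, for
// comparison with a self-signed certificate over the same key. The bs58
// dependency and the input bytes are assumptions of this sketch.
import bs58 from "bs58";

function didKeyFromEd25519(publicKey: Uint8Array): string {
  const prefixed = new Uint8Array(2 + publicKey.length);
  prefixed.set([0xed, 0x01]);                  // multicodec "ed25519-pub"
  prefixed.set(publicKey, 2);
  return `did:key:z${bs58.encode(prefixed)}`;  // "z" = base58btc multibase
}
// A self-signed X.509 cert with subject "key=X" binds the same key to a name
// chosen by the key holder; mechanically both are self-asserted wrappers
// around the key, differing mainly in envelope format and tooling.
```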

The model I've been working with recently is purely URL-based for VC and ZCAP - but in order to avoid the "centralization" aspect, I've been exploring requiring the anchor domains to act as lightweight CAs (which provides a certain X.509 interop), and to provide mutual witness support for key-event traces. Under the hood the model uses self-issued VCs to build up the verification method and service endpoint profile, which can likely be rendered as a did:web DID document but would also work by simply dropping the did:web part. It's just something I'm evaluating - so would love any feedback.

Note - of related interest: https://github.com/WebOfTrustInfo/rwot9-prague/blob/master/topics-and-advance-readings/X.509-DID-Method.md

@ewelton
Copy link
Author

ewelton commented Mar 18, 2020

This issue was really intended for discussion - and I think we got a lot of discussion here. There is no target PR here, just a quest for clarification and understanding about what's going on with DIDs.

I'll close this in a few days if there is no objection or remaining concern that needs to be addressed.

@msporny msporny added the pending close Issue will be closed shortly if no objections label May 26, 2020
@msporny
Copy link
Member

msporny commented May 26, 2020

This will be closed in 7 days unless there are objections.

@brentzundel
Copy link
Member

no objections raised, closing.
