Remove matrix parameters from the DID specification #159

Closed
selfissued opened this issue Jan 29, 2020 · 73 comments · Fixed by #285

@selfissued
Contributor

We should enable the use of standard URI parsers by removing matrix parameters from the DID specification.

As I wrote at #137 (comment), in my view, the working group should make every attempt to not introduce the matrix parameters syntax at all. There are already two mechanisms for passing parameters in URLs - query parameters and fragments. One or the other should suffice in all cases.

Even if it's necessary to do something like dedicate particular query and/or fragment names for DID purposes, that would arguably be preferable to introducing yet a third parameter passing mechanism that requires non-standard URL parsing to use.
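To illustrate the parsing concern, here is a minimal sketch using the WHATWG `URL` class found in browsers and Node.js. The DID URLs are illustrative only. A standard parser splits off the query and fragment, but any `;name=value` segment stays embedded in the path, so extracting it would need extra, non-standard code.

```javascript
// A standard URL parser has no notion of matrix parameters.
const withMatrix = new URL('did:ex:123;service=files/myresume/doc?version=latest#intro');

// The ";service=files" segment is left inside the path.
console.log(withMatrix.pathname); // 'ex:123;service=files/myresume/doc'
console.log(withMatrix.search);   // '?version=latest'
console.log(withMatrix.hash);     // '#intro'

// Query parameters, by contrast, are parsed out of the box.
const withQuery = new URL('did:ex:123?service=files');
console.log(withQuery.searchParams.get('service')); // 'files'
```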

I'd taken an action item during a recent WG call to file an issue to drive discussion on removing or defining the use of matrix parameters. This is that issue.

I'll also note that the specification currently appears to be inconsistent on whether matrix parameters are actually supported or not. For instance, they're missing from the descriptions of DID portions in https://w3c.github.io/did-core/#terminology, but they're present in the Generic DID Syntax at https://w3c.github.io/did-core/#generic-did-syntax. The specification should eventually be made self-consistent in this regard.

@selfissued selfissued added the high priority Discuss during the next WG call label Jan 29, 2020
@selfissued selfissued self-assigned this Jan 29, 2020
@peacekeeper
Contributor

Thanks for creating this issue. To help with the discussion, I recently shared some materials about matrix parameters in DID URLs:

@selfissued
Contributor Author

Note that the specification does not uniformly recognize matrix parameters as a component of a DID URL. For instance, the definition of DID URL at https://w3c.github.io/did-core/#dfn-did-urls is:

A DID plus an optional DID path, optional ? character followed by a DID query, and optional # character followed by a DID fragment.

This implies that, if present, any matrix parameters are part of the DID, because they are not enumerated in the list of DID URL components above. Yet the DID definition at https://w3c.github.io/did-core/#dfn-decentralized-identifiers does not state that matrix parameters are part of the DID.

@selfissued
Contributor Author

The intended semantics of this text in the Generic DID Parameter Names section at https://w3c.github.io/did-core/#generic-did-parameter-names is unclear:

Some generic DID parameter names (for example, for service selection) are completely independent of any specific DID method and MUST always function the same way for all DIDs. Other DID parameter names (for example, for versioning) MAY be supported by certain DID methods, but MUST operate uniformly across those DID methods that do support them.

@selfissued
Contributor Author

This text means that the specification is not self-contained:

The exact processing rules for these parameters are specified in [DID-RESOLUTION].

Having the definitions of parameters defined in this specification in another spec that is not normatively referenced by this specification and that is not being standardized by this working group is unacceptable.

@peacekeeper
Contributor

Just a quick note that this topic was discussed at the recent DID WG F2F meeting in Amsterdam.

@jandrieu and I have a task to describe

  1. at least one use case enabled by matrix parameters, and
  2. if/how that use case could be implemented without matrix parameters.

We will share a document with the WG once we have an initial draft of this. Also see the slides that were used at the F2F meeting on this topic.

@peacekeeper
Contributor

The following issues are dependent on this one: #35, #36

@peacekeeper
Contributor

As discussed during the F2F, I have created (and @jandrieu has reviewed) the following document:
https://docs.google.com/document/d/1ttRWB2lwYSw7bZMRY6wTY9lzGaHcSvOKFYfNZaBBS_4/

In my opinion there are now numerous resources (the above document, plus the F2F slides, plus a RWoT#10 paper) that explain how matrix parameters enable the "Web Address Portability" use case as well as several other functionalities. I'd also like to note that this use case had strong support at the F2F meeting.

@OR13
Contributor

OR13 commented Feb 25, 2020

@csuwildcat initial values use case for sidetree, #70

@OR13
Contributor

OR13 commented Feb 28, 2020

Another example of potential use of matrix parameters: decentralized-identity/didcomm-messaging#33

In transitioning from ephemeral to permanent dids.

In supporting immediate resolution of permanent / ledger anchored dids.

@msporny @dlongley interested to hear your thoughts on this issue.

@OR13
Contributor

OR13 commented Feb 28, 2020

Related issue here: digitalbazaar/did-method-key#5

@awoie
Contributor

awoie commented Mar 2, 2020

Another example of potential use of matrix parameters: decentralized-identity/didcomm-messaging#33

In transitioning from ephemeral to permanent dids.

In supporting immediate resolution of permanent / ledger anchored dids.

@msporny @dlongley interested to hear your thoughts on this issue.

initial-state could help make some DID methods more privacy-friendly by putting less information about the DID subject on the blockchain. In particular, onboarding of DIDs should have as little friction as possible, and requiring a blockchain transaction is cumbersome for most users. For advanced features, which I don't need to detail here, this would still be beneficial though. initial-state could provide everything needed to establish a secure connection with DID controllers from the beginning without requiring a blockchain transaction. So I would also see value in matrix parameters.

@philarcher
Contributor

philarcher commented Mar 10, 2020

I've finally got around to looking into this topic and reviewing the arguments. They are very pertinent to me as they mirror a lot of the discussion around a standard with which I am intimately involved, GS1 Digital Link. I don't expect anyone here to read 145 pages of PDF but there is a lot in common between GS1 Digital Link and what DIDs are about, especially in terms of resolution and addressing service endpoints. So, here goes...

TL;DR - I do not believe that DID URLs need matrix parameters.

I'd go further - I am sceptical that we need anything other than HTTPS.

The use case for us at GS1 is making barcodes something you can look up on the Web. Whether that's the stripy barcode you see on just about everything or a QR code, a Data Matrix or whatever, what those things carry is one or more identifiers. There are many others but I'll stick with the best known - the GTIN (Global Trade Item Number). Let's use 9506000134352 as an example.

Imagine a product on the shelf with that number encoded in a 1D barcode.

You can use any number of apps to scan that and they'll take you to wherever the app developer thinks is a good idea - usually a proprietary, locked-down data store that they operate.

What GS1 Digital Link does is to provide a URI structure into which you can put that GTIN. Again, there are much more complex examples I could use but let's keep it as simple as it can be:

https://example.com/gtin/9506000134352

Now, I've used example.com there deliberately as I want to emphasise that there are two separate things there:

  1. The GTIN - an identifier for the product that is guaranteed to be unique within the GS1 system.
  2. The location of a resolver (in this case example.com, but it could be example.com/foo/bar/ before you get to the bit that says "here's a GS1 identifier" and you're into a structured string).

We completely separate out the resolver's location and the ID to be resolved.

You can look up that GTIN on any number of resolvers. Unlike, say, DOIs or ORCIDs, each resolver is free to return whatever information it wants. Each resolver is sovereign. In plain terms, you're asking each resolver "what can you tell me about the thing identified by this GTIN?" And you may not get the same answer from each one.

We want to be able to attach multiple resources/APIs to that GTIN - in DID language, call them service endpoints. Product master data, consumer information page, recall status API, promotions, instruction manuals and more.

If you're a member of staff at a retailer, you're likely to want something like the recall status of the item before you put it on the shelf. If you're a logistics company, you want to know where to record the fact that you picked the consignment up at time X from location Y. Those are specialised operations for which you want to use a service endpoint associated with that GTIN (or pallet identifier or whatever).

In DID world, the default is to return a DID document. The proposal is to use matrix parameters to bypass the DID doc and go straight to a service endpoint.

In GS1 Digital Link world, the default is that you go to wherever the brand owner decides is the default, most likely a consumer-facing product information page or perhaps a current promotion. But specialist apps can go straight to the service endpoint they need.

Let's try that... this is a GS1 Digital Link URI that works

https://id.gs1.org/gtin/9506000134352

So it's using the GS1 Global Office resolver to look up GTIN 9506000134352.

You'll get a simple redirect to a product information page. That's the default. There's a different default if you happen to speak Vietnamese. Set your browser to that language or, for simplicity, add ?lang=vi and you'll get the page in Vietnamese.

Now try this:

https://id.gs1.org/gtin/9506000134352?linkType=gs1:recipeInfo

You'll see what just happened. The query string parameter linkType does what the DID WG is considering using matrix parameters for - i.e. providing the resolver with an instruction to process the request in a particular way, in this case to provide a recipe idea for the product. By the way, we don't have the recipe in Vietnamese, so https://id.gs1.org/gtin/9506000134352?linkType=gs1:recipeInfo&lang=vi still goes to the same English language default.

Notice that when you're redirected, the query parameter is passed on.

You'd get the same result with

https://id.gs1.org/gtin/9506000134352?linkType=gs1:recipeInfo&foo=bar

That is, the resolver passes on whatever is in the incoming query string - because it doesn't matter. Well, it shouldn't. Try any Web page - add on any old junk in the query string and it won't matter because those pages will ignore what they don't understand. At least, that's the idea. We have found examples where this doesn't hold and so we have a feature that you can suppress the query string if you need to, but the default is that it gets passed on.
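As a rough sketch of that pass-through behaviour (not the actual GS1 resolver code), the following picks a target from a hypothetical link table and forwards the incoming query string unchanged. The `brand.example` targets and the `redirectTarget` helper are invented for illustration.

```javascript
// Hypothetical link table for one GTIN; the target URLs are placeholders.
const linkTable = new Map([
  ['default', 'https://brand.example/product-page'],
  ['gs1:recipeInfo', 'https://brand.example/recipe'],
]);

// Pick the target for the requested linkType and pass the query string on.
function redirectTarget(requestUrl, table) {
  const incoming = new URL(requestUrl);
  const linkType = incoming.searchParams.get('linkType');
  const target = new URL(table.get(linkType) ?? table.get('default'));
  // Forward everything, including parameters the resolver does not understand.
  target.search = incoming.search;
  return target.href;
}

console.log(redirectTarget(
  'https://id.gs1.org/gtin/9506000134352?linkType=gs1:recipeInfo&foo=bar',
  linkTable
));
// 'https://brand.example/recipe?linkType=gs1:recipeInfo&foo=bar'
```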

There are several ways of getting the full list of available links (service endpoints). Try this:

curl -I https://id.gs1.org/gtin/9506000134352

That is, a HEAD request - and note the (long) Link header.

If you want all that as JSON, try

curl -H "Accept: application/json" https://id.gs1.org/gtin/9506000134352?linkType=all

Or just click https://id.gs1.org/gtin/9506000134352?linkType=all for the HTML page with the JSON embedded.

That's our equivalent of the DID-doc. It's the full list of available service endpoints although of course there's none of the crypto authentication material that is so important in DIDs.

What about passing on the GTIN in a template? That is, imagine a service like

https://example.com/recallStatus?gtin={gtin}

We can do that on the resolver too - that is, provide a rewrite rule to take the GS1 identifiers from the incoming URL and put them into a different template. We haven't formally standardised that yet but we soon will.
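A rewrite rule of that kind could look roughly like this. Since the feature is not yet standardised, the `fillTemplate` helper and the simplified path parsing are assumptions made purely for illustration.

```javascript
// Fill {name} placeholders with identifiers taken from the incoming URL.
function fillTemplate(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? encodeURIComponent(values[name])
      : match); // leave unknown placeholders untouched
}

// Simplified extraction: assumes a path of the form /<key>/<value>.
const incomingUrl = new URL('https://id.gs1.org/gtin/9506000134352');
const [, idKey, idValue] = incomingUrl.pathname.split('/');

const rewritten = fillTemplate('https://example.com/recallStatus?gtin={gtin}', { [idKey]: idValue });
console.log(rewritten); // 'https://example.com/recallStatus?gtin=9506000134352'
```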

Yes, this is a grab on URL space. However, we state clearly that all URIs are dumb identifiers. Applications must be aware that https://example.com/gtin/9506000134352?linkType=all is a perfectly valid URL that may or may not end up at a GS1-conformant resolver. Deal with it. And if you find a QR code with a GS1 Digital Link URI in it, you're perfectly free to swap the embedded resolver for another of your choice. So there's no single point of failure (although we do define canonical URIs as being on id.gs1.org).

We do 'reserve' the linkType parameter and one or two more (again, recognising that in other contexts, those same params can be used for something else - we don't control the whole URL space).

Including instructions to the resolver in the query string is not a problem. Passing on params in a query string to your service endpoint shouldn't be a problem.

I don't claim credit for any of this, nor does anyone else at GS1. It's sort of HATEOAS and Linked Data and... well, it's the Web.

@csuwildcat
Contributor

Matrix parameters play a valuable role in the DID document processing phase, if properly scoped strictly to that phase, imo. Matrix parameters should:

  1. Be confined to the resolution phase of DID URI parsing to determine the correct DID Document.
  2. Be limited in function (beyond document resolution) to selecting a portion of the document and/or forming a URL in the process.
  3. Not be used for things outside of the two purposes above.

If we were to use URL params alone, you would absolutely need to define a namespace for DID-related parameters (e.g. _did-PARAM-NAME=). For generic reserved parameters, you would further need to create a subspace within the general DID param namespace to distinguish them, for example: _did__version=.

There also exists the strange question of what to do with DID-related and DID-reserve parameters after resolution? Are they passed after parse to downstream application-level code via a generated URL? Are they removed after processing?

Using URL params alone does not deliver you any off-the-shelf simplicity or ease of integration, as you will certainly need to:

  1. Define the custom namespacing and reserve param syntaxes I highlighted above
  2. Define how DID resolution-only params are processed
  3. Define whether or not certain DID resolution-only params are dropped from the generated URLs resolution may produce
  4. Take into account how user agents should represent a DID URI that contains a mixture of jumbled params, some of which may be active for DID resolution-only, and others that are meant for userland code for traditional URL param handling. Also something to consider: will ordering of DID resolution-only params ever require order-dependent processing? If so, what new DSL syntax will we need to invent to define this?

Could you do all this without matrix params? Sure, but it's a fantasy that we're going to do it without introducing a bunch of convoluted, specialized processing steps and library code that significantly diverges from how URL params are ordinarily handled.
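For a sense of what the query-only route involves, here is a minimal sketch of just the first step: separating namespaced resolver parameters from application parameters. The `_did-` prefix follows the example above and is not defined by any spec; the `splitParams` helper is hypothetical. Whether the resolver parameters should then be dropped or forwarded is exactly the open question raised in the list.

```javascript
// Hypothetical prefix for resolver-only parameters; not defined by any spec.
const DID_PREFIX = '_did-';

// Separate resolver-only parameters from those meant for application code.
function splitParams(didUrl) {
  const url = new URL(didUrl);
  const resolverParams = new URLSearchParams();
  const appParams = new URLSearchParams();
  for (const [name, value] of url.searchParams) {
    if (name.startsWith(DID_PREFIX)) {
      resolverParams.append(name.slice(DID_PREFIX.length), value);
    } else {
      appParams.append(name, value);
    }
  }
  return { resolverParams, appParams };
}

const split = splitParams('did:ex:123?_did-service=files&version=latest');
console.log(split.resolverParams.toString()); // 'service=files'
console.log(split.appParams.toString());      // 'version=latest'
```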

@msporny
Member

msporny commented Mar 11, 2020

@csuwildcat wrote:

Be confined to the resolution phase of DID URI parsing to determine the correct DID Document.

If you're going to confine it to the resolution phase (which I do agree is a good idea), then it should be an argument to the resolution process, possibly in the resolution request (instead of in the DID URL itself). It feels like this is an argument /against/ matrix parameters instead of for them.

At this point, I do think that we have consensus that DID parameters are used during the resolution phase or URL rewriting phase... do we have any use cases for using a DID parameter outside of those two phases?

@msporny
Member

msporny commented Mar 11, 2020

As a data point, Digital Bazaar has never needed or used DID parameters or matrix parameters to date. Not in the Veres One implementation. Not with any of Digital Bazaar's customers' use cases. I do think that's instructive, as we do have a variety of very complex use cases and none of them require the use of matrix parameters. In fact, none of them need DID parameters encoded in the DID URL.

We do need to express what are now called DID parameters, but doing so during the resolution phase, in a resolution request is good enough for all of our use cases, IIRC.

@peacekeeper
Contributor

@philarcher thanks for this demo and very clear explanation! I especially like how you use the Link: header to be compatible with the basic concept of Web Linking.

Here are some comments:

I am sceptical that we need anything other than HTTPS.

Keep in mind that DID Resolution is an abstract function that can be bound to HTTP(S) but doesn't require it. You can resolve a DID / dereference a DID URL by calling an HTTP(S) endpoint, but you can also do that by invoking a library, command line tool, etc.

We completely separate out the resolver's location and the ID to be resolved.

Nice idea, this reminds me of some other persistent identifier (PID) concepts such as ARK ID which I learned about at a conference on this topic (see this blogpost if you're interested).

https://id.gs1.org/gtin/9506000134352?linkType=gs1:recipeInfo
You'll see what just happened. The query string parameter linkType does what the DID WG is considering using matrix parameters for

That is, the resolver passes on whatever is in the incoming query string - because it doesn't matter.

Yes, this is a grab on URL space.

We do 'reserve' the linkType parameter and one or two more

I think this design is a mistake for some reasons outlined in the Google doc that describes the "Web Address Portability" use case. Here is a summary:

  • This "steps" on the URL information space that should be left to the service endpoint to define. What if a service endpoint has its own use for a "linkType" query parameter that conflicts with your resolution "linkType" query parameter?
  • This gets worse when you start considering other parameters besides this one. For example, we have a matrix parameter for resolving a specific version of a DID document. In your example, how would you distinguish between specifying the version of your [equivalent of the] DID document, and the version of the risotto recipe?
  • This will "break URLs" to some extent, since it will only work with service endpoints that have query strings which follow the key=value pattern. Yes many do, but RFC3986 also allows query strings in other formats.
  • I could imagine privacy and security implications if a parameter that is meant for selecting a service is forwarded to that service (similar to an HTTP Referer header).

I think it's inherently dangerous to intermix two separate sets of parameters (parameters for resolution, parameters for service endpoints) into a single syntactical construct. Note that URNs (RFC8141) also have two separate syntactic constructs for this, for good reasons.

@peacekeeper
Contributor

peacekeeper commented Mar 11, 2020

@philarcher one more question, do GTINs support what is called "partial redirection" in PURLs?

E.g. could you do something like

https://id.gs1.org/gtin/9506000134352/photo.jpg?linkType=gs1:recipeInfo

and expect to get redirected to this? (note the path /photo.jpg that is added to the URL).

https://dalgiardino.com/mushroom-squash-risotto/photo.jpg?linkType=gs1:recipeInfo

@philarcher
Contributor

Thanks @peacekeeper.

If linkType is used for something else by another server, there's no problem since we only 'reserve' it for the GS1 ecosystem. Outside that, of course, all URLs are dumb strings. We make it plain that applications should be aware of this. How do you know that you're addressing a GS1 resolver? There MUST be a Resolver Description File at /.well-known/gs1resolver. No file there? Don't assume anything about the linkType parameter. That's not bullet proof, but it's a start.

And yes, other query string formats are usable, sure. We don't stop anyone using those. You can define a rule in the resolver that turns a conformant URI into whatever template your service endpoint needs. It makes no demand on the target. And if needs be, you can suppress the default behaviour of forwarding the full query string.

To your separate question, no, https://id.gs1.org/gtin/9506000134352/photo.jpg?linkType=gs1:recipeInfo is not a conformant GS1 Digital Link URI and would return a 400 Bad Request error (you can't add random stuff to the path segments, only the query string). This does not affect the behaviour of other servers which, of course, remain sovereign.

I don't expect to persuade you, Markus, but I wanted to record how we're doing it and thus show that alternatives are possible. There are factors at play for DIDs that are not relevant to us, but we do have a distributed system for resolving identifiers and discovering related resources.

@OR13
Contributor

OR13 commented Mar 12, 2020

Why can't we reserve the query string param "matrix-parameters", and use encodeURIComponent on it?

That way people who don't want to use them can use that, and people who do can translate from that to matrix parameters if they encounter it.

Of course we will still have the problem of query parameter sorting...

https://support.cloudflare.com/hc/en-us/articles/360031777052-Caution-when-enabling-Query-String-Sort-with-WordPress-admin-pages

Can't I issue a 302 redirect based on processing query string params?

http://example.com/resolve/did:ex:123;service=gs1Resolver/gtin/9506000134352?linkType=all

becomes:

https://id.gs1.org/gtin/9506000134352?linkType=all

http://example.com/resolve/did:ex:123?matrix-params=encodeURIComponent(service=gs1Resolver/gtin/9506000134352)&linkType=all

becomes:

https://id.gs1.org/gtin/9506000134352?linkType=all

Does this work?
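A minimal sketch of the encoding round trip, using the `matrix-params` name from the example URL above (the name is only a proposal). A standard parser recovers both the encapsulated value and the ordinary query parameter without custom parsing.

```javascript
// The would-be matrix parameter content, encapsulated as one query value.
const matrixValue = 'service=gs1Resolver/gtin/9506000134352';
const encodedDidUrl =
  'did:ex:123?matrix-params=' + encodeURIComponent(matrixValue) + '&linkType=all';
console.log(encodedDidUrl);
// 'did:ex:123?matrix-params=service%3Dgs1Resolver%2Fgtin%2F9506000134352&linkType=all'

// A standard parser decodes the value and keeps linkType separate.
const parsedDidUrl = new URL(encodedDidUrl);
console.log(parsedDidUrl.searchParams.get('matrix-params')); // 'service=gs1Resolver/gtin/9506000134352'
console.log(parsedDidUrl.searchParams.get('linkType'));      // 'all'
```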

@philarcher
Contributor

I see no objection to those redirections @OR13. After all, they're performed by the example.com resolver which can do whatever it wants - all domains are sovereign. And your example shows the independence of the resolver from the GS1 identifier to be resolved. I'm all for that!

Reserving the matrix-parameters name for DID URLs might be OK, sure, but, as we did, you'll need to warn that it's only in the specific context of a DID resolver that the query parameter has any meaning and applications cannot assume that it means that everywhere - it can mean something quite different in any other context.

As for query string sorting? Really? No. No way, never. No. Stop that nonsense. Query params are un-ordered. If a param is repeated, the last value wins. If a server can't handle that then it's a mess and needs to be put out of its misery with the judicious use of the delete button.

You might want to define a canonical form of a URL so you can hash and sign it - OK (we might be about to do just that) - but as a URL it is isomorphic with any URL that has the same params in a random order. Or am I making an impetuous fool of myself here ;-) ?
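One possible canonical form, sketched here as an assumption rather than anything GS1 has defined, is to sort the query parameters by name before hashing or signing. The `canonicalize` helper is hypothetical.

```javascript
// Sort query parameters by name so equivalent URLs serialize identically.
function canonicalize(urlString) {
  const url = new URL(urlString);
  url.searchParams.sort(); // stable sort; repeated names keep their relative order
  return url.href;
}

const first = canonicalize('https://id.gs1.org/gtin/9506000134352?linkType=all&lang=vi');
const second = canonicalize('https://id.gs1.org/gtin/9506000134352?lang=vi&linkType=all');
console.log(first);            // 'https://id.gs1.org/gtin/9506000134352?lang=vi&linkType=all'
console.log(first === second); // true
```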

@OR13
Contributor

OR13 commented Mar 12, 2020

+1 for canonical URLs... I imagine that will involve sorting ;)

However, if you canonicalize a WordPress admin URL in order to sign it, it would probably not work any more... I'm all for deleting WordPress, but I suspect that's not going to happen :)

I think it's worth having a fallback for matrix params that works with traditional URL parsers, because there won't be any software that supports them out of the box.

@peacekeeper
Contributor

You can define a rule in the resolver that turns a conformant URI into whatever template your service endpoint needs

I don't think we want the DID URL dereferencing process to be dependent on custom resolver rules or templates.

All we want is to take the service endpoint URL from the DID document as a "base URL", and apply the DID URL's path+query+fragment as a standard relative URI reference, as shown in slides 159 and 160 of the F2F meeting.

I haven't seen any proposals yet on how to achieve this without matrix parameters.

@dlongley
Contributor

All we want is to take the service endpoint URL from the DID document as a "base URL", and apply the DID URL's path+query+fragment as a standard relative URI reference, as shown in slides 159 and 160 of the F2F meeting.

I understand that there has been interest in solving the problem that way, but I do think that's actually a potential solution to the problem vs. the actual problem. The problem, as I understand it, is that we want to be able to move the authority for some path+query+fragment for a relative-ref "at will" by changing the authority part of the URL via a service description in a DID Document -- whilst keeping a stable URL for consumers. This is, in fact, precisely what slides 159-160 show happening.

I haven't seen any proposals yet on how to achieve this without matrix parameters.

I think what others are saying is that we can address this use case by solving it in a different way. To get specific, looking at slide 159, instead of this:

did:ex:123;service=files/myresume/doc?version=latest#intro

It seems we could do this:

did:ex:123?service=files&relative-ref=%2Fmyresume%2Fdoc%3Fversion%3Dlatest%23intro

And reserve service and relative-ref for DID URLs. The same HTTPS URLs from the slide would result from the resolution process. The resolution process would involve using standard URL parsing on the DID URL, for example:

const u = new URL('did:ex:123?service=files&relative-ref=%2Fmyresume%2Fdoc%3Fversion%3Dlatest%23intro');

Which yields:

URL {
  href: 'did:ex:123?service=files&relative-ref=%2Fmyresume%2Fdoc%3Fversion%3Dlatest%23intro',
  origin: 'null',
  protocol: 'did:',
  username: '',
  password: '',
  host: '',
  hostname: '',
  port: '',
  pathname: 'ex:123',
  search: '?service=files&relative-ref=%2Fmyresume%2Fdoc%3Fversion%3Dlatest%23intro',
  searchParams: URLSearchParams { 'service' => 'files', 'relative-ref' => '/myresume/doc?version=latest#intro' },
  hash: ''
}

From here, the searchParams would be used to obtain the fragment name for the service (files). This would be appended to the full DID path along with the hash character to produce the service ID: did:ex:123#files.

This would be used to obtain the service description from slide 159:

{
  "id": "did:ex:123#files",
  "serviceEndpoint": "https://filestore.org/user123/"
}

And the serviceEndpoint URL would be retrieved: https://filestore.org/user123/. Then the relative-ref query param value would be URL-appended to that value producing:

https://filestore.org/user123/myresume/doc?version=latest#intro

As you can see, the same HTTPS URL is output from this process as the slide. There's also no conflict between the DID URL query parameters and whatever the HTTPS server may use -- as processing must be done on a DID URL independently to produce the HTTPS URL. Once the serviceEndpoint is changed to https://selfhosted.me:8080/ the process correctly outputs: https://selfhosted.me:8080/myresume/doc?version=latest#intro just like the slide.
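Put together, the steps above can be sketched roughly as follows. The in-memory DID documents and the `dereference` helper are hypothetical, and "URL-appended" is interpreted here as dropping the leading "/" so that the reference resolves beneath the endpoint path; that interpretation is an assumption.

```javascript
// Hypothetical in-memory DID document; a real resolver would look it up
// according to the DID method.
const didDocument = {
  id: 'did:ex:123',
  service: [{ id: 'did:ex:123#files', serviceEndpoint: 'https://filestore.org/user123/' }],
};

function dereference(didUrl, doc) {
  const u = new URL(didUrl);
  const did = u.protocol + u.pathname; // 'did:' + 'ex:123'
  const serviceId = did + '#' + u.searchParams.get('service');
  const service = doc.service.find((s) => s.id === serviceId);
  if (!service) throw new Error('No service with id ' + serviceId);
  const relativeRef = u.searchParams.get('relative-ref') ?? '';
  // Assumption: strip the leading "/" so the reference is appended to the
  // endpoint path instead of replacing it.
  return new URL(relativeRef.replace(/^\//, ''), service.serviceEndpoint).href;
}

const exampleDidUrl =
  'did:ex:123?service=files&relative-ref=%2Fmyresume%2Fdoc%3Fversion%3Dlatest%23intro';
console.log(dereference(exampleDidUrl, didDocument));
// 'https://filestore.org/user123/myresume/doc?version=latest#intro'

// Moving the service only requires changing the endpoint in the DID document.
const movedDocument = {
  id: 'did:ex:123',
  service: [{ id: 'did:ex:123#files', serviceEndpoint: 'https://selfhosted.me:8080/' }],
};
console.log(dereference(exampleDidUrl, movedDocument));
// 'https://selfhosted.me:8080/myresume/doc?version=latest#intro'
```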

The same works for slide 160:

did:ex:123;service=socialnetwork => did:ex:123?service=socialnetwork

Again, with the same HTTPS URLs resulting from the resolution process. Note that any query parameters/fragments that are part of the DID URL itself are only handled by DID resolvers, not mixed or combined in any way with query parameters/fragments intended for the server. It doesn't look as pretty, but we shouldn't have any "mixing" issues, because you just have to encapsulate service and relative-ref values using URI encoding.

The difference with this approach may only be in where the transformation from a non-DID URL to a DID URL would occur. You can't "edit" the DID URL in place in the same way you would edit the HTTPS URL; e.g., you can't just add/remove path components using regular URL tools. You have to understand that it's a DID URL and work within the "relative-ref" value. That would seem to be the main trade-off and perhaps that's where the point of contention is. If so, I think it would help to surface that better.

Is that right? You'd prefer to have consumers be able to edit an existing DID URL without knowing it's a DID URL -- to make changes to the path, query, fragment, etc.? This as opposed to actually just resolving it?

IMO, I think it's not too much of a burden to have to either parse or resolve the DID URL first before editing it (and then, subsequently translate it back to a DID URL as needed). I think that's a better trade-off vs. creating new URL parsers.

@jandrieu
Contributor

Using the relative URI reference algorithm will strip the last part of the service endpoint (everything after the last "/"). If the endpoint were itself a DID, e.g., did:example:joe, this could remove the entire DID URL (if it has no "/").
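The stripping behaviour is easy to see with a standard parser. The endpoint URLs below are made up for illustration; only a trailing "/" on the base keeps its last segment.

```javascript
// Without a trailing slash, the last segment of the base ("base") is replaced.
const stripped = new URL('doc', 'https://example.com/files/base').href;
console.log(stripped); // 'https://example.com/files/doc'

// With a trailing slash, the reference is resolved beneath the base path.
const preserved = new URL('doc', 'https://example.com/files/base/').href;
console.log(preserved); // 'https://example.com/files/base/doc'
```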

One thing I'm seeing--even in my own thinking & writing--is the desire to dereference a DID and, ultimately, return either a resource or a URL. So, despite my concern over privacy issues, this expectation may need to be supported.

However, here's a proposal to achieve what you want (redirection) without matrix parameters:

  1. Allow one and only one service endpoint, which we might as well call a "redirect"
  2. When resolving a DID, you return the DID Document
  3. When dereferencing a DID, you return the single redirect
  4. In the service endpoint (redirect) property, we add an aggregationMethod property that specifies how any "extra parts" of the DID URL are merged with the extra parts of the service endpoint. This supports both the portable hierarchy use case as well as situations where different rules might be appropriate for a particular service endpoint type.

At least four algorithms are immediately apparent as useful:

  1. replace (DID URL path/query parts replace service endpoint path/query parts)
  2. ignore (DID URL path/query parts are dropped, the service endpoint is dereferenced unmodified)
  3. aggregate (a modification of the relative reference URL that preserves the file part)
  4. relative (use the relative reference algorithm directly)
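To make the four options concrete, here is one possible reading of each as code. The `mergeEndpoint` helper and the exact semantics (especially for "replace" and "aggregate") are assumptions for illustration, since the proposal does not define them in detail. In this sketch, "aggregate" drops any query on the endpoint itself.

```javascript
// Hypothetical merge of a DID URL's "extra parts" with a service endpoint.
function mergeEndpoint(endpoint, extra, method) {
  const base = new URL(endpoint);
  switch (method) {
    case 'ignore':
      // Extra parts are dropped; the endpoint is used unmodified.
      return base.href;
    case 'replace':
      // Extra path/query replace the endpoint's own path/query.
      return new URL(extra.startsWith('/') ? extra : '/' + extra, base.origin).href;
    case 'relative':
      // Standard relative reference resolution (may strip the last segment).
      return new URL(extra, base).href;
    case 'aggregate': {
      // Like 'relative', but the endpoint's last segment is preserved.
      const kept = base.pathname.endsWith('/') ? base.pathname : base.pathname + '/';
      return new URL(extra.replace(/^\/+/, ''), base.origin + kept).href;
    }
    default:
      throw new Error('Unknown aggregationMethod: ' + method);
  }
}

const sampleEndpoint = 'https://example.com/files/base';
for (const method of ['replace', 'ignore', 'aggregate', 'relative']) {
  console.log(method, mergeEndpoint(sampleEndpoint, 'doc?v=1', method));
}
// replace   https://example.com/doc?v=1
// ignore    https://example.com/files/base
// aggregate https://example.com/files/base/doc?v=1
// relative  https://example.com/files/doc?v=1
```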

Another proposal would be to select the service using a reserved query term, like
did:example:abc?_DID_service=myService

If you want to do it without a possible collision in the query name space, just have one and only one service endpoint that is always used when dereferencing.

Most privacy advocacy on this issue has suggested that the best way to deal with my concerns vis-a-vis consent, privacy, and GDPR is to put information behind a single service. Thus, a single service requirement would support BOTH service dereferencing and minimizing privacy risks without needing matrix parameters.

So, there are two proposals for you.

@peacekeeper
Contributor

@dlongley thanks for this great analysis and write-up. I agree this would work as an alternative to matrix parameters and that it would fulfill the use case.

Is that right? You'd prefer to have consumers be able to edit an existing DID URL without knowing it's a DID URL -- to make changes to the path, query, fragment, etc.?

Yes pretty much. I think the idea that the path+query+fragment "fully belong to" the consumer (the DID controller) is elegant and powerful, just like in the case of HTTP URLs the path+query+fragment "fully belong to" the domain owner. The path+query+fragment of the DID URL could be freely edited, and the relative URI dereferencing algorithm would just continue to work. Personally I prefer this to having to introspect the "relative-ref" value. But I can understand if others see it differently.

If we decide to do it that way, I would probably propose next to remove the "path" component from DID URLs, since I can't think of any use for it anymore, and the following spec text would not be accurate anymore:

A DID path SHOULD be used to address resources available via a DID service endpoint.

@peacekeeper
Contributor

@jandrieu

Allow one and only one service endpoint, which we might as well call a "redirect"

I agree this would work, but it would be a bit like having an HTTP URL that is dereferenced to a web page (or an RDF graph) which is only allowed to have one link to another HTTP URL. This is not how the web and relationships between resources (see Web Linking) should work.

At least four algorithms are immediately apparent as useful: replace, ignore, aggregate, relative.

Funny, we had a feature similar to this in XRI Resolution (an "append" attribute in your XRD document - see section 13.7.1. of XRI Resolution 2.0)

Most privacy advocacy on this issue has suggested that the best way to deal with my concerns vis-a-vis consent, privacy, and GDPR is to put information behind a single service.

I believe this idea has been brought up before by @dlongley (see w3c-ccg/did-spec#90 (comment)) and is being tracked as an issue in DID Resolution (see w3c-ccg/did-resolution#35). But it would introduce a dependency on an intermediary service you'd have to trust, no?

@iherman
Member

iherman commented Mar 17, 2020

@kdenhartog

In state 2 (did:example:123;version=2) this did refers to Alice who's the controller of the keys and the subject of the did.

Forgive me if I sound like a broken record. In my reading of the DID syntax, did:example:123;version=2 is not a DID, it is a DID URL (in today's terminology). Again, per the current spec (and, I believe, the spirit of the spec, too), the subject as well as the controller is identified by a DID and not a DID URL.

If the president changes after a new election, then the DID document must be (somehow) updated and the controller's value must be issued a new DID and not have the matrix parameter change.

I am actually more and more concerned that too many possibilities in creating a DID URL muddle the water and create confusion... (This is not picking on matrix parameters, but on the usage of paths, queries, fragments, matrix parameters, and the combination of all the above.)

@jandrieu
Contributor

In the Google doc on the hierarchical portability use case, I just outlined how that use case actually has no need for a service matrix parameter. The assumption that we might need one was driven by another assumption: that we need more than one service endpoint. However, we have no use cases that demonstrate that allowing explicitly correlated (not just correlatable but correlated) service endpoints for a single DID is necessary.

Yes, it's convenient, but it is not required for DIDs to be decentralized and do what they can uniquely do.

If we can accept that zero or one service endpoint per DID Document is all you need, then we can further simplify the DID-URL that @csuwildcat wants so that it need not be two-tier at all. Simply use a query parameter to pass your initial state, because the service endpoint need not be specified in the DID-URL. For what it's worth, if you absolutely need the path hierarchy portability, then don't use initial state. Is there a use case that needs both? Not one that has been articulated.

@csuwildcat
Contributor

csuwildcat commented Mar 17, 2020

In the Google doc on the hierarchical portability use case, I just outlined how that use case actually has no need for a service matrix parameter. The assumption that we might need one was driven by another assumption: that we need more than one service endpoint.

There are certainly many reasons one might want more than one service endpoint - for example: You could be a DID owner who wants to list endpoint descriptors for professional profiles, git/code presence, social presence, etc.

However, we have no use cases that demonstrate that allowing explicitly correlated (not just correlatable but correlated) service endpoints for a single DID is necessary.

^ For others, please note: the person who left the comment above actively maintains multiple public, explicitly correlated endpoints that link to their singular persona presence in the world. They are using their real name, headshot photo, and bio descriptions to link a singular persona to multiple services, activities, and public data presentations. They are essentially arguing against enabling the very activity they are engaged in.

If we can accept that zero or one service endpoint per DID Document is all you need, then we can further simplify the DID-URL that @csuwildcat wants so that it need not be two-tier at all. Simply use a query parameter to pass your initial-state

This (again) divulges a clear misunderstanding about the technical/utility aspects of the initial-state parameter. If you have a DID you want to use instantly after generation that has implicitly, securely resolvable endpoints associated with it, you need a standard way to convey your DID's initial state, including its endpoint(s) <-- hopefully folks notice the obviousness of this chicken/egg situation. We now have 4-5 different DID Methods and their dev/reps who understand the technicals/utility of this, but I don't want to go into it here any further just because one person seems unable to grasp what they do.

because the service endpoint need not be specified in the DID-URL. For what it's worth, if you absolutely need the path hierarchy portability, then don't use initial state.

^ This is simply unacceptable, and I will object if folks try to box critical features out and neuter the ability of Method implementers to provide important functionality.

Now here's where Joe's comment gets game-theoretically interesting: though he conflates a bunch of personal feelings/misunderstandings about initial-values, his effort to force everyone to use only one service endpoint for a DID is bittersweet for me. Why? Because if that happened, it would all but assure that everyone would subsequently be forced to support a shared Identity Hub-like standard that enables a single endpoint to support all forms of relay, routing, comms, semantic data exchange, and all other activities a person will engage in. Essentially, Joe is arguing to hand the One Endpoint Ring to folks like me who strongly desire the outcome it would lead to. Though Joe's suggestion would likely accelerate realization of my ambitions on a front he probably wasn't considering, I don't think it would be ethical to achieve my ends by supporting something that essentially forced it on everyone.

@jandrieu
Contributor

@csuwildcat wrote

There are certainly many reasons one might want more than one service endpoint - for example: You could be a DID owner who wants to list endpoint descriptors for professional profiles, git/code presence, social presence, etc.

Yes. I said it would be convenient.

It isn't necessary.

^ For others, please note: the person who left the comment above actively maintains multiple public, explicitly correlated endpoints that link to their singular persona presence in the world. They are using their real name, headshot photo, and bio descriptions to link a singular persona to multiple services, activities, and public data presentations. They are essentially arguing against enabling the very activity they are engaged in.

That is absolutely correct. I choose to live a public life, others don't. Our job isn't to empower those of us making similar choices, but to protect those who don't.

This (again) divulges a clear misunderstanding about the technical/utility aspects of the initial-state parameter. If you have a DID you want to use instantly after generation that has implicitly, securely resolvable endpoints associated with it, you need a standard way to convey your DID's initial state, including its endpoint(s) <-- hopefully folks notice the obviousness of this chicken/egg situation. We now have 4-5 different DID Methods and their dev/reps who understand the technicals/utility of this, but I don't want to go into it here any further just because one person seems unable to grasp what they do.

I appreciate that you believe you are correct on this and therefore everyone who disagrees with you is incompetent or malicious. But I know exactly how initial state works and what you want from it. Three points you seem to be missing or, at the least, dismissing. FIRST, you can do that with query parameters, NOT matrix parameters, as multiple contributors have pointed out. SECOND, initial state is a convenience for DID Methods. It is not a requirement. THIRD, you still have not produced a valid use case demonstrating the value that will be created because someone gets a fully formed DID Document instantly instead of after a period of time.

Several DID Methods, such as did:key, did:ethr, and did:jolo, already enable instantly usable DIDs with deterministic DID Documents; the latter even supports registration to get all the fancy features people want to cram into DID Documents.

But DID creation itself is not a value-creating use case. People don't create DIDs just because DIDs are awesome. They aren't awesome or innately joy-inducing. People also don't get more value out of creating DIDs that are instantly convertible to sophisticated DID Documents. Doing a thing that doesn't create value FASTER doesn't make it more valuable.

Let me put it differently, what is so darn important to an actual user that their DID is instantly resolvable to a sophisticated DID Document? What are they DOING that actually benefits from that?

  • Are they minting a DID in order to open a first aid kit in an emergency?
  • Are they minting a DID in order to gain access to a house on fire?
  • Are they minting a DID to make a timely market transaction they had no idea they would be making 10 minutes ago?

These are use cases. Can you provide one compelling one for instant use DIDs that resolve to a sophisticated DID Document?

^ This is simply unacceptable, and you can expect a formal objection from me if folks try to box critical features out and neuter the ability of Method implementers to provide important functionality.

Your convenience features are not a requirement of decentralizing identifiers. If you feel it is appropriate to object because your missing feature isn't in the spec, you are certainly entitled to do that. Just as anyone else is entitled to oppose the unnecessary features you desire. However, your saber rattling at this stage of the conversation is unnecessarily provocative. There is no consensus on this issue one way or another. Since you can do what you are asking for with query parameters, no one is threatening your pet feature. What we are having here is a conversation about how we reconcile competing visions for what DIDs and DID Documents do.

I favor a minimalist, privacy-respecting approach that ensures DIDs maximize the ability for individuals and organizations to act privately without encouraging privacy-leaking behaviors that are likely to pull DIDs into an unwinnable battle with GDPR regulators.

Others are also hooked on conveniences fundamentally unnecessary for the decentralization of identifiers. They are NICE to have. YES, people want them. But are they required to decentralize identifiers? Mostly, NO. And unless the fundamental shift enabled by DIDs NEEDS a particular feature, IMO, that feature should be provided at another layer or standardized in a later revision. We are over-complicating this layer of the infrastructure, adding huge amounts of complexity to support features that often are just not well ironed out, from matrix parameters to a surfeit of service endpoints and verification methods. These are useful. Valuable even. But forcing them into 1.0 of the DID Core spec is premature complexification without due process to the architectural harms that may be created.

@csuwildcat
Contributor

@jandrieu there's a lot of confusion, misunderstandings, and topical deviation in your reply, so it's hard to know where to start. I'll number this in an effort to compartmentalize:

  1. The entire thrust of my replies over the last couple days is that it may be possible to avoid Matrix parameters entirely and still basically do all the things everyone has mentioned. Manu and others seem to be fine with the proposed direction, which happened to be almost identical. I feel like you're not even attacking me, you're attacking a Matrix-parameter-strawman at this point.
  2. I literally can't understand how anyone could fail to recognize the value in having a full, resolvable DID Document at the inception of a DID that you could turn around and confidently use milliseconds after generation. To only have access to a key for N time afterward (e.g. until decentralized systems propagate something) is very limiting, and there are many use cases that would be hindered:
    1. I desire to create a DID and immediately want to communicate encrypted messages over a standardized DID-linked data/relay endpoint without inventing a one-off, nonstandard handshake protocol to connect us in-band.
    2. I meet someone, generate a DID, and mirror some subset of info about me to a new personal data store/relay instance, which others can instantly access via a standard, resolvable conduit.
    3. I want to create a DID that can be instantly resolved, so someone can resolve it and drop me a message a minute or two later, without requiring an in-band, active connection.
  3. You are certainly free to personally decline to use such features/capabilities, and disagree with all the people who recognize the utility of robust parameter support for this and other uses (for example: Augmentation with DID Parameters w3c-ccg/did-method-key#5). But at the point you make moves to actively block the ability for many, many others to accomplish their needs is when I go from being merely annoyed, to justifiably ticked-off.

Joe, at this point I honestly can't tell if your responses are due to a technical misunderstanding, narrow view of possible usage scenarios, or some other indiscernible reason you have for attacking the general, generative functionality we have mentioned in this Issue, but it has ended up hijacking a thread that seemed close to a resolution with rather broad support.

@kdenhartog
Member

kdenhartog commented Mar 17, 2020

@iherman

In state 2 (did:example:123;version=2) this did refers to Alice who's the controller of the keys and the subject of the did.

Forgive me if I sound like a broken record. In my reading of the DID syntax, did:example:123;version=2 is not a DID, it is a DID URL (in today's terminology). Again, per the current spec (and, I believe, the spirit of the spec, too), the subject as well as the controller is identified by a DID and not a DID URL.

If the president changes after a new election, then the DID document must be (somehow) updated and the controller's value must be issued a new DID and not have the matrix parameter change.

I am actually more and more concerned that too many possibilities in creating a DID URL muddle the water and create confusion... (This is not picking on matrix parameters, but on the usage of paths, queries, fragments, matrix parameters, and the combination of all the above.)

That's a compelling argument for the question "Are matrix parameters the correct way to do this?". Since the inclusion of the matrix parameter does change it from a DID to a DID URL it would leave this as a resounding no it's not the right way to handle this case. I had to go reread the URI and URL specs after your comment to understand this better. It's an important point that I missed. I'll abstain from objecting now.

@csuwildcat
Contributor

@kdenhartog if we removed Matrix parameters, would you support (as I believe @msporny and @dlongley do - but please chime in if not!) using basic URL parameters at the DID level for DID/DID Method resolution (either prefixed or not), while other URL params that are passed to subsequently generated 'normal' non-DID URLs would reside inside the associated parameter values themselves?

@kdenhartog
Member

kdenhartog commented Mar 18, 2020

yeah, I'm in favor of the method proposed above by the combination of people in this thread. I think this solves the point that @peacekeeper has been raising around PURLs too without being ambiguous. This comment by Orie is what makes me believe that: #159 (comment)

@kdenhartog
Member

kdenhartog commented Mar 18, 2020

@jandrieu

In the Google doc on the hierarchical portability use case, I just outlined how that use case actually has no need for a service matrix parameter. The assumption that we might need one was driven by another assumption: that we need more than one service endpoint. However, we have no use cases that demonstrate that allowing explicitly correlated (not just correlatable but correlated) service endpoints for a single DID is necessary.

Personally, I believe the requirement of one service endpoint is an unnecessary constraint. Say I wanted to accept DIDComm messages over XMPP, HTTP, and SMTP because I'm a DIDComm mediation service provider; then I would like the ability to list multiple service endpoints. Forcing me to use different DIDs to use this service makes the routing layer in DIDComm overly complex. I would have to specify different DIDs to the recipient than what the recipient would specify to me in order to route the message through the mediator service.

With the use of only one DID, it makes the route far simpler to reason about and to implement and so, therefore, I can't build on that same assumption.

@burnburn

The heated exchange that recently occurred between @csuwildcat and @jandrieu has now been Overtaken By Events. They have had some offline conversation on this topic that should result in some new PRs and/or points on this thread. Stay tuned.

@peacekeeper
Contributor

peacekeeper commented Mar 19, 2020

I think @csuwildcat 's example in #159 (comment) helps to further illustrate how query parameters can be used instead of matrix parameters, building on @dlongley 's earlier analysis in #159 (comment).

I tried to capture both approaches in this image, please correct me if this is not an accurate summary:

[Image: did-parameters]

@peacekeeper
Contributor

peacekeeper commented Mar 19, 2020

Following up on my previous comment, I acknowledge that there are many good comments on how matrix parameters could be removed from DID URL syntax. Nevertheless, I still think they are the better solution. I believe they are simpler for a number of reasons, including:

  • they don't require URI encoding of standard URI components (path, query, fragment).
  • the example with matrix parameters is shorter and easier to read, and the standard URI components of DID URLs map directly to the standard URI components of the DID's service endpoints.
  • the standard URI components (path, query, fragment) would be controlled by the DID controller, in analogy with HTTP URLs being controlled by the domain name controller.
  • in the two-tier query string approach it's unclear what a path of the DID URL would mean (it should then probably be removed from the spec)
  • in the two-tier query string approach it's unclear what a fragment of the DID URL would mean when there is also an encoded fragment inside the service-path. The example has a URI-encoded fragment #frag1 and a regular fragment #key-1, but a single DID URL can't be dereferenced to BOTH a key in the DID document AND a service endpoint URL.
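For readers without the image, the two-tier query string approach discussed in these bullets can be sketched roughly as follows. The parameter names `service` and `service-path`, the helper names, and the example values are assumptions for illustration; the encoding itself is ordinary percent-encoding.

```python
from urllib.parse import quote, parse_qs


def build_did_url(did: str, service: str, service_path: str, fragment: str = "") -> str:
    """Tier 1: DID-level query params. Tier 2: the endpoint's own
    path/query/fragment, percent-encoded inside one param value."""
    url = (
        f"{did}?service={quote(service, safe='')}"
        f"&service-path={quote(service_path, safe='')}"
    )
    return url + ("#" + fragment if fragment else "")


def parse_did_url(did_url: str):
    """Return (did, DID-level params, DID URL fragment)."""
    rest, _, fragment = did_url.partition("#")
    did, _, query = rest.partition("?")
    # parse_qs percent-decodes the values, recovering the inner URL parts.
    params = {k: v[0] for k, v in parse_qs(query).items()}
    return did, params, fragment


url = build_did_url("did:ex:123", "files", "/docs/readme.md?v=1#frag1", "key-1")
print(url)
print(parse_did_url(url))
```

The output shows both points at once: the inner path, query, and `#frag1` all have to be encoded, and the resulting DID URL carries two fragments (`frag1` inside `service-path`, `key-1` outside) with no obvious rule for which one wins.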

We should enable the use of standard URI parsers by removing matrix parameters from the DID specification.

Quoting the very first sentence in this very long thread, I want to re-emphasize one thing: Matrix parameters do NOT break URI parsers. URI syntax in RFC3986 actually defines a class of characters called "sub-delimiters". The semicolon is one of these sub-delimiters. As the name suggests, they are meant for URIs that want to define what RFC3986 calls "other subcomponents" that are part of the "generic URI components". This is exactly what we are doing.

   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="
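As a quick illustration of this point, the generic splitting regular expression from RFC 3986, Appendix B accepts a DID URL with matrix parameters and simply leaves them in the path component. The `split_matrix` helper and the example DID URL are hypothetical; splitting on the semicolon is the DID-specific "subcomponent" step a generic parser would not do for you.

```python
import re

# Generic URI-splitting regex from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri: str) -> dict:
    """Split a URI into the five generic components."""
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def split_matrix(path: str):
    """DID-specific step (illustrative): pull ';name=value' pairs out of
    the part of the generic path that precedes the first '/'."""
    head, slash, tail = path.partition("/")
    method_id, *raw = head.split(";")
    params = dict(p.partition("=")[::2] for p in raw)
    return "did:" + method_id, params, slash + tail


parts = split_uri("did:example:123;version=2/some/path?q=1#key-1")
print(parts)
print(split_matrix(parts["path"]))
```

So the generic parse succeeds; what is non-standard is only the second step, which every DID-aware consumer would need to implement.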

@csuwildcat
Contributor

csuwildcat commented Mar 19, 2020

I think @csuwildcat 's example in #159 (comment) helps to further illustrate how query parameters can be used instead of matrix parameters, building on @dlongley 's earlier analysis in #159 (comment).

Yes, this is correct. If we did this, it would preserve the ability to have DID resolution parameters, as well as any others that were targeted/meant for transformed output paths. Of all the proposed solutions, I believe this is the best compromise that still supports all the uses/utilities folks have mentioned.

@OR13
Contributor

OR13 commented Mar 19, 2020

I agree with @peacekeeper regarding URI parsers... sounds more like we are worried about "URL" parsers.... or at least the popular ones... in any case, current URL parsers don't support DID URLs... and @selfissued is in favor of keeping the name DID URL, but against matrix parameters... so worth noting that removing them will NOT solve the URL parser issue... and MIGHT help with URI parsers that don't comply with RFC3986... but does not in fact solve any parsing issues completely...

I'm in favor of (in priority order):

  1. reserving matrix parameters, creating a bi-directional equivalence between them and query params... and recommending the use of query params...

pro: more readable
pro: separation of concerns
con: 2 ways to do things...
con: maybe problems with parsers (beyond what we have already)

  2. removing matrix parameters from the spec entirely.

pro: one less special thing to argue about
con: unreadable URIs

@iherman
Member

iherman commented Mar 20, 2020

Thanks @peacekeeper for the example in #159 (comment). Let me just note that there are some hidden assumptions there which all need precise specifications in the document.

  • service=files: what you mean is that you choose the service did:ex:123#files for the action. But that means the 'system' (probably the resolver) must know that in a service=xxx parameter it must look for a service whose identifier is DID#xxx, right? In other words, the value of xxx is equivalent to an <a href='#xxx'>, as some sort of relative URI/URL. This means all these must be precisely defined in the spec.
  • Is this dynamic redirection a valid mechanism for all services? This is probably not true because the notion of service is open-ended. What if the service endpoint refers to a URI that does not have the notion of a path (say, a URL for an ISBN)? How does one specify whether a dynamic redirection makes sense or not? How should a resolver react to an error?
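To show what would need specifying, here is a minimal sketch of the two assumptions above: that `service=xxx` selects the service whose id is `DID#xxx`, and that redirection only makes sense for endpoints with a hierarchical path. The document shape, the `dereference` helper, the `urn:isbn` endpoint, and the choice of exceptions are all illustrative assumptions, not spec behavior.

```python
from urllib.parse import urlsplit, urljoin

# Hypothetical DID document with one hierarchical and one opaque endpoint.
DOC = {
    "id": "did:ex:123",
    "service": [
        {"id": "did:ex:123#files", "serviceEndpoint": "https://example.com/files/"},
        {"id": "did:ex:123#book", "serviceEndpoint": "urn:isbn:0451450523"},
    ],
}


def dereference(did_doc: dict, service: str, service_path: str = "") -> str:
    """Select a service by relative id and apply an optional path."""
    # Assumption 1: service=xxx means the service whose id is DID#xxx.
    service_id = f"{did_doc['id']}#{service}"
    match = next((s for s in did_doc.get("service", []) if s["id"] == service_id), None)
    if match is None:
        raise LookupError(f"no service with id {service_id}")
    endpoint = match["serviceEndpoint"]
    if not service_path:
        return endpoint
    # Assumption 2: a path can only be applied to an endpoint that has
    # an authority (e.g. https://...); anything else is an error.
    if not urlsplit(endpoint).netloc:
        raise ValueError(f"endpoint {endpoint} has no notion of path")
    return urljoin(endpoint, service_path.lstrip("/"))


print(dereference(DOC, "files", "/docs/readme.md"))
print(dereference(DOC, "book"))
```

Whether the error case should fail, fall back to the bare endpoint, or be governed by the service type is exactly the kind of question the spec would have to answer.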

There are probably other assumptions. My impression is that if we want to do this properly spec-wise, we will have to work quite a lot to get it right, test it right, etc., and all this for, after all, a feature which is not essential for DID.

If I am right with this assessment, I wonder whether a possible approach may be:

  • We should not standardize matrix parameters in DID v1.
  • The whole topic should go back for a more detailed incubation into the CCG, making decisions on whether a really detailed specification can be incubated there with implementations, proper spec text, etc, and bring this back to this or, more probably, the continuation of this WG as an additional feature for DID v2.

W3C will, most probably, introduce a lighter process this year that allows a quicker and easier update of recommendations when the update means adding a new feature.

@peacekeeper
Contributor

This means all these must be precisely defined in the spec. [...] There are probably other assumptions. [...] a feature which is not essential for DID.

@iherman I agree that some details of this and other features need to be further specified. But if you/we think that this is not currently mature enough, then it seems this would be an argument not only against the matrix parameter approach, but also against the alternative query parameter proposal?

@peacekeeper
Contributor

Just for historical reference, the query parameter only approach was also explored by @dmitrizagidulin in an analysis in the CCG about 2 years ago, named "Option 2" in this comment: w3c-ccg/did-spec#90 (comment).

@OR13
Contributor

OR13 commented Apr 7, 2020

Excited by the IRC poll:

RESOLVED: Use query parameter syntax for encoding DID Parameters, reserving matrix parameter characters for possible use in a future specification.

@peacekeeper
Contributor

For the record, I continue to believe that this change is a mistake, since it means that the DID Core spec and DID registries - rather than the DID controller - will be in control of the semantics and functionality of the query string in DID URLs.

However, I really appreciate the large amount of time the WG has put into this discussion, and I respect the resolution on today's DID WG call (link here).
