Add section on service privacy #515

Merged
merged 7 commits into from
Jan 10, 2021

Conversation

dhh1128
Contributor

@dhh1128 dhh1128 commented Dec 22, 2020

This is an alternative embodiment of the "Service Privacy" guidance as proposed by Adrian Gropper in this comment. If accepted, it would supersede #511 and address issue #382.

Signed-off-by: Daniel Hardman daniel.hardman@gmail.com



Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
@agropper
Contributor

agropper commented Dec 22, 2020 via email

@agropper agropper mentioned this pull request Dec 23, 2020
index.html Outdated
service endpoint as the root. In some cases, the publication mechanism might
reference a DID Document with no service endpoints at all. For category 2,
prefer using only one service that points to an authorization server or to a
mediator / proxy that can provide a kind of herd immunity, or both. For
Member

Suggested change
- mediator / proxy that can provide a kind of herd immunity, or both. For
+ mediator/proxy that can provide a kind of herd immunity, or both. For

index.html Outdated
category 3, avoid the use of multiple service endpoints for a DID because some
of these (e.g. an authorization server) are likely to be reused with other,
related DIDs. Place correlatable service endpoints behind a “firewall”, if
possible, or introduce a mediator / proxy as a sole service endpoint in
Member

Suggested change
- possible, or introduce a mediator / proxy as a sole service endpoint in
+ possible, or introduce a mediator/proxy as a sole service endpoint in

@msporny
Member

msporny commented Jan 3, 2021

This PR is waiting on @dhh1128 to respond to review comments and either accept changes, or reject them with reasoning.

@dhh1128
Contributor Author

dhh1128 commented Jan 4, 2021

I'm just noting that I'm aware of these suggestions/points of feedback, and working through them is on my to-do list.

I don't have strong opinions about most of this stuff; I submitted the content in this PR but attributed @agropper as author. So, while I can easily merge updates to polish the PR, I think Adrian should express an opinion about how much of the feedback aligns with the thinking he wanted to propose.

@agropper
Contributor

agropper commented Jan 4, 2021 via email

@agropper
Contributor

agropper commented Jan 4, 2021 via email

@agropper
Contributor

agropper commented Jan 4, 2021 via email

dhh1128 and others added 5 commits January 4, 2021 14:41
Co-authored-by: Dave Longley <dlongley@digitalbazaar.com>
Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
Co-authored-by: Manu Sporny <msporny@digitalbazaar.com>
Co-authored-by: Manu Sporny <msporny@digitalbazaar.com>
Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
@TallTed
Member

TallTed commented Jan 4, 2021

@agropper --

I think my revision was not communicated clearly by GitHub's email gateway. This was the revised suggestion (which @dhh1128 merged, anticipating your OK) --

unintended consequences. DIDs can identify documents, services, schemas, and
other things that may be associated with individual people,

I don't feel strongly about "services" vs "service endpoints" here.

I think that the "... as well as ..." phrasing implies that the thing that follows "as well as" is implicit (and so could have been left unstated), while the things that precede "as well as" had to be explicit -- i.e., that the "as well as" phrasing implies that DIDs are mostly about identifying "services" (or "service endpoints") and "documents, things, and schemas" are surprising bonus things to be identified by DIDs.

(I am now a bit concerned that this text now implies that DIDs cannot identify people because people are not listed as being identified by DIDs, only as being inversely associated with things that are identified by DIDs...)

@agropper
Contributor

agropper commented Jan 5, 2021 via email

@TallTed
Member

TallTed commented Jan 5, 2021

DIDs are indeed Identifiers, which take the form of URIs/URLs/IRIs in the did: scheme, which we are defining, and otherwise conform to RFC 3986. Just like any other URI/URL/IRI, they can be used to identify (a/k/a name, a/k/a denote) people, places, services, concepts, schemas, documents, and any other thing, material or imaginary, any of which might be associated or correlated with individual persons or groups thereof.


Note 0 — My list of classes of entities of which DIDs may Identify individuals is longer than your original, in an effort to increase clarity -- i.e., DIDs may Identify anything (though there will be many classes which just don't need DIDs, and the world will be perfectly fine continuing to Identify them with CIDs such as HTTPS-based Github or Twitter or Facebook or example.com URIs).

Note 1 -- I put "any other thing" at the end of that list of classes, because every other class (every other "thing") in the list is a sub-type of "thing". "Things" don't belong in the middle, because there's nothing else -- no super-class "xyz" -- that fits as an "any other xyz" phrase to end the list.

Note 2 -- A "person" was the first "thing" conceived of as needing a DID to identify him/her/it-self, i.e., to be the Controller of their own Decentralized Identifier a/k/a DID. Leaving "people" out of the list of "things" which a DID may identify is problematic, even if another longer list is found elsewhere in the same document. It would inevitably cause confusion as readers encounter that truncated list and say, "Clearly people can't be identified by DIDs -- people aren't in this list of things that DIDs can Identify!" I think the privacy focus calls even louder for inclusion of "people".

@agropper
Contributor

agropper commented Jan 5, 2021 via email

@TallTed
Member

TallTed commented Jan 7, 2021

@agropper

The phrase "...most documents, things, and many schemas are..." seems to suggest that "schemas" are not "things", and that does not work for me. That short phrase could be changed to "...many documents, schemas, and other things are..." and I'd be satisfied with your latest paragraph -- but I don't think that paragraph flows with the rest of this section of the document.


I think this is getting harder to work on without larger context. Below, there are 3 versions of the paragraph of immediate focus. The original from the PR; my previously suggested revision (adjusted a bit after feedback); and my currently suggested revision.

I stand by the second version below. I would also accept the third.

Original:

The degree of additional privacy risk caused by using multiple service 
endpoints in one DID document can be difficult to estimate. Privacy 
harms are typically unintended consequences. DIDs can identify documents, 
things, and schemas as well as services. As such, they will be associated 
with individual people, households, clubs, and employers &mdash; and 
correlation of their service endpoints could become a powerful 
surveillance and inference tool.

My almost-original suggested change:

The degree of additional privacy risk caused by using multiple service 
endpoints in one DID document can be difficult to estimate. Privacy 
harms are typically unintended consequences. DIDs can identify documents, 
services, schemas, and other things that may be associated 
with individual people, households, clubs, and employers &mdash; and 
correlation of their service endpoints could become a powerful 
surveillance and inference tool.

My current suggestion, trying to incorporate what you've responded with:

The degree of additional privacy risk caused by using multiple service 
endpoints in one DID document can be difficult to estimate. Privacy 
harms are typically unintended consequences. DIDs can directly identify 
specific individuals or groups of people. DIDs can also indirectly and 
unintentionally identify individual people, households, clubs, employers, 
and other groups of people by directly identifying documents, schemas, and 
other things that may be associated with those individuals or groups
through correlation, traffic analysis, AI/ML techniques, etc.

@agropper
Contributor

agropper commented Jan 7, 2021 via email

@TallTed
Member

TallTed commented Jan 7, 2021

@agropper

The third version adds: "DIDs can directly identify specific individuals..." I really don't think we want to go there. This has been discussed by @jandrieu in other threads. I see no reason to bring it up here.

I don't know why you don't want to go there, in a privacy section, where the fact that DIDs can directly identify individuals is the most obvious top-level privacy risk, and the fact that DIDs which identify non-human things (including services, documents, schemas, etc.) can be used (with various technologies) to indirectly identify individuals is a less-obvious second-, third-, or lower-level privacy risk. I believe both should be noted in a privacy section!

The third version also introduces "unintentionally" which feels wrong.

I don't understand your objection to "unintentionally", which is essentially equivalent to "unintended consequences" (your words).

A thing is not a service. In the context of a privacy issues section it is simple and clear. This section is not about defining what DIDs are or are intended to be.

Yes, this section is not about defining (nor redefining, which seems to me one unintended side-effect of leaving out the "direct identification of individual people") DIDs, but it is about discussing how what DIDs are can lead to privacy issues.

"Services" is a subclass of "things" (a/k/a "entities", a/k/a "concepts"). All instances of class "services" are also instances of class "things". On the other hand, many instances of class "things" are not instances of class "services".

My couch is not a service; my refrigerator is not a service; my house is not a service -- but all three of these are things, and they may be identified by DIDs. Correlating the facts that :alice sits on my couch, and eats from my refrigerator, and sleeps in my house, one might infer that { :alice foaf:knows :tallted } and { :tallted foaf:knows :alice }. That's a real-world privacy risk.

Please stay with the first version or get other people involved.

I'm happy to have other people involved. I don't believe anyone else who saw my initial suggested edit took issue with it. Indeed, @dhh1128 merged it, anticipating that you would agree with it. I wonder if @msporny, @dlongley, maybe @burnburn have some thoughts to contribute?

@dhh1128
Contributor Author

dhh1128 commented Jan 7, 2021

I have been quiet for a while, but just wanted to say that all of the versions of the paragraph in question feel okay to me -- Adrian's, or any of Ted's revisions. The key point is that the ideal from a privacy perspective is to disclose nothing beyond control keys in a DID document. If a DID identifies a person and its document is publicly viewable, then the more we step away from that ideal, the more we incur a privacy risk; leaked metadata is exploitable. The language differs in its precision and the finer points of how it explains this concern, but the principle is right in all versions of the text.
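
To make the "leaked metadata is exploitable" point concrete, here is a minimal sketch of how an observer could link DIDs whose publicly viewable documents disclose the same service endpoint. It assumes DID documents are available as plain dictionaries using the `service` and `serviceEndpoint` properties; the function name `correlate_by_endpoint`, the example DIDs, the service types, and the URLs are all hypothetical, and only string-valued endpoints are considered.

```python
from collections import defaultdict


def correlate_by_endpoint(did_documents):
    """Group DIDs by the service endpoint URLs their documents disclose.

    Hypothetical helper for illustration only; it is not part of any DID library.
    """
    by_endpoint = defaultdict(set)
    for doc in did_documents:
        for service in doc.get("service", []):
            endpoint = service.get("serviceEndpoint")
            # Simplifying assumption: ignore endpoints that are not plain strings.
            if isinstance(endpoint, str):
                by_endpoint[endpoint].add(doc["id"])
    # An endpoint shared by two or more DIDs links those DIDs together.
    return {ep: sorted(dids) for ep, dids in by_endpoint.items() if len(dids) > 1}


# Made-up documents: two DIDs reuse one authorization server, one DID lists no services.
docs = [
    {
        "id": "did:example:alice-work",
        "service": [
            {"id": "#auth", "type": "AuthorizationServer",
             "serviceEndpoint": "https://as.example.com/alice"},
            {"id": "#inbox", "type": "Inbox",
             "serviceEndpoint": "https://inbox.example.com/1"},
        ],
    },
    {
        "id": "did:example:alice-home",
        "service": [
            {"id": "#auth", "type": "AuthorizationServer",
             "serviceEndpoint": "https://as.example.com/alice"},
        ],
    },
    {"id": "did:example:bob", "service": []},
]

linked = correlate_by_endpoint(docs)
print(linked)
```

In this made-up data the two `alice` DIDs are linked through the reused authorization server, while the document with no service endpoints gives the observer nothing to correlate on, which is the ideal described above.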

@msporny
Member

msporny commented Jan 10, 2021

Editorial, multiple reviews, changes requested and made, no objections, merging.

@msporny msporny merged commit c709e4b into w3c:main Jan 10, 2021
6 participants