Add more meta data in the Tracking Status Resource #22

Closed
rvaneijk opened this Issue Mar 12, 2017 · 14 comments

Contributor

rvaneijk commented Mar 12, 2017

This issue is raised to address the need to add more meta data in the TSR. I would like to discuss further to what extent we could add meta-data fields to the TSR. The aim is to provide as many fields as possible in order to make the DNT protocol fit for purpose as a consent protocol. For consent to be valid in the EU, providing information is key (fairness and transparency principle). Under the draft ePrivacy Regulation (EU) 2017/0003 proposal (hereinafter: ePR), consent may be expressed by using the appropriate technical settings of a software application enabling access to the internet. Furthermore, users shall be given the possibility to withdraw their consent at any time.

Below I list the legal requirements under (1) the ePR and (2) the GDPR. Both go hand in hand. This issue is not about what constitutes personal data. Moreover, this issue is not about normative requirements (MAY, MUST, SHOULD). This issue is about exploring what we have covered already, where we want to refer with a hyperlink and what properties we can add.

EPR:
The TSR should at least contain (a) the modalities of the collection, (b) its purpose, (c) the person responsible for it, and (d) the other information required under the GDPR where personal data are collected, as well as (e) any measure the end-user of the terminal equipment can take to stop or minimize the collection. My observation is that we have covered some requirements in the TSR already.

GDPR:
The other information required under the GDPR referred to under point d above is as follows:
(a) the identity and the contact details of the controller and, where applicable, of the controller's representative;
(b) the contact details of the data protection officer, where applicable;
(c) the purposes of the processing for which the personal data are intended as well as the legal basis for the processing;
(d) where the processing is based on point (f) of Article 6(1), the legitimate interests pursued by the controller or by a third party;
(e) the recipients or categories of recipients of the personal data, if any;
(f) where applicable, the fact that the controller intends to transfer personal data to a third country or international organization and the existence or absence of an adequacy decision by the Commission, or in the case of transfers referred to in Article 46 or 47, or the second subparagraph of Article 49(1), reference to the appropriate or suitable safeguards and the means by which to obtain a copy of them or where they have been made available.
2. In addition to the information referred to in paragraph 1, the controller shall, at the time when personal data are obtained, provide the data subject with the following further information necessary to ensure fair and transparent processing:
(a) the period for which the personal data will be stored, or if that is not possible, the criteria used to determine that period;
(b) the existence of the right to request from the controller access to and rectification or erasure of personal data or restriction of processing concerning the data subject or to object to processing as well as the right to data portability;
(c) where the processing is based on point (a) of Article 6(1) or point (a) of Article 9(2), the existence of the right to withdraw consent at any time, without affecting the lawfulness of processing based on consent before its withdrawal;
(d) the right to lodge a complaint with a supervisory authority;
(e) whether the provision of personal data is a statutory or contractual requirement, or a requirement necessary to enter into a contract, as well as whether the data subject is obliged to provide the personal data and of the possible consequences of failure to provide such data;
(f) the existence of automated decision-making, including profiling, referred to in Article 22(1) and (4) and, at least in those cases, meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject.
3. Where the controller intends to further process the personal data for a purpose other than that for which the personal data were collected, the controller shall provide the data subject prior to that further processing with information on that other purpose and with any relevant further information as referred to in paragraph 2.

EMBEDDED_RESOURCES PROPERTY:
Now that we have all the pieces of information, we should at least discuss whether and, if not already covered, how we could address these transparency elements as attributes in the TSR. One example is adding an array of known embedded third-party resources. The example webpage is my own natuurlijkehaarkleuring.nl/afspraak. It includes a third-party online scheduling API of acuityscheduling.com. The third party uses external embedded resources to measure JavaScript errors (usage.trackjs.com, js-agent.newrelic.com, bam.nr-data.net). Adding an embedded_resource property makes sense to me.
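To make the proposal concrete, here is a minimal sketch of what a tracking status representation carrying such a property could look like. The `embedded_resource` member is the hypothetical property suggested in this issue and is not defined by the TPE; the `tracking` value and the controller URL are placeholders rather than the site's real TSR.

```typescript
// Shape of a tracking status representation (TSR) extended with the
// proposed property. "embedded_resource" is NOT part of the TPE spec;
// it is the hypothetical property suggested in this issue.
interface TrackingStatusWithEmbedded {
  tracking: string;             // tracking status value (placeholder below)
  controller?: string[];        // links identifying the data controller(s)
  "same-party"?: string[];      // domains claimed to be the same party
  embedded_resource?: string[]; // hypothetical: known embedded third-party hosts
}

// Hosts taken from the example page described above; other values are placeholders.
const exampleTsr: TrackingStatusWithEmbedded = {
  tracking: "N",
  controller: ["https://natuurlijkehaarkleuring.nl/"],
  embedded_resource: [
    "acuityscheduling.com",
    "usage.trackjs.com",
    "js-agent.newrelic.com",
    "bam.nr-data.net"
  ]
};

// Serialized form, as it might be served from the site-wide
// tracking status resource at /.well-known/dnt/
const tsrJson: string = JSON.stringify(exampleTsr, null, 2);
```

Because the property is just an extra member of the JSON object, a recipient that does not know it would simply skip it.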

Regards,
Rob

royfielding Mar 13, 2017

Collaborator

The consent mechanism is a web page provided by the data controller that informs, asks for specific consent, and registers that consent in the API (and perhaps a cookie).

The TSR provides basic information to obtain tracking policies, pre or post-consent, and enough of the site's general policies to satisfy a user agent's "minimum bar" for presenting a consent dialog. It doesn't play much of a role in the consent itself. The TSR doesn't even need to be accessed; a site cannot assume the user has looked at the TSR, so it will repeat the same (or more specific) information when requesting consent.

I don't see any point in having an embedded resources property. The actual embedded resources are obtained by the user agent when it requests an HTML page and the user agent has full control over whether or not to access those resources. A separate list in the TSR is redundant (and guaranteed to be less accurate).

However, in general, the tracking status representation is an extensible JSON object. That means a site can add whatever additional metadata it might want, and such additional metadata might be required by whichever specifications they claim in Compliance. Recipients are required to ignore additional elements that they do not understand, so there should be no limit to the TSR's future (or regional) extensibility.
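The "ignore what you do not understand" rule can be sketched as follows. This is only an illustration of the idea, not code from any user agent: the list of understood members, the function name `splitTsrMembers`, and the extension member `x-regional-notice` are all assumptions made for the example.

```typescript
// Members this hypothetical recipient understands. Anything else in the
// JSON object is treated as an extension and skipped rather than rejected.
const KNOWN_MEMBERS: string[] = ["tracking", "compliance", "controller", "same-party", "policy"];

interface SplitTsr {
  understood: { [key: string]: unknown };
  ignored: string[];
}

function splitTsrMembers(jsonText: string): SplitTsr {
  const parsed: unknown = JSON.parse(jsonText);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("A tracking status representation must be a JSON object");
  }
  const source = parsed as { [key: string]: unknown };
  const result: SplitTsr = { understood: {}, ignored: [] };
  const keys = Object.keys(source);
  for (let i = 0; i < keys.length; i++) {
    const key = keys[i];
    if (KNOWN_MEMBERS.indexOf(key) !== -1) {
      result.understood[key] = source[key];
    } else {
      result.ignored.push(key); // unknown extension: skip, do not fail
    }
  }
  return result;
}

// "x-regional-notice" is a made-up extension member for illustration.
const sample = splitTsrMembers(
  '{"tracking":"N","compliance":["https://example.com/regional-profile"],"x-regional-notice":"https://example.com/notice"}'
);
```

A recipient that also implements the regional specification named in `compliance` would add the extra member to its own list of understood names; everyone else carries on unaffected.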

mschunte2 added this to the TPE-CR-April-2017 milestone Mar 13, 2017

rvaneijk Mar 13, 2017

Contributor

This ties in loosely with issue #2.

rvaneijk Mar 20, 2017

Contributor

Ideally the TSR should contain information about the purpose of the interference by the first party, the person or organisation responsible for the interference, information on how the end-user can withdraw his consent, and all further information required under the GDPR when personal data are collected.

Moreover, a distinction between the first party and (sub)contractors could IMHO be made in separate TSR properties, i.e., one for the first party (which is the same-party property), and another property listing known and trusted/contracted web resources (which would be a new web resource property).

royfielding Mar 20, 2017

Collaborator

The first party includes its subcontractors. Third parties would be parties not controlled by the first party.

rvaneijk Mar 20, 2017

Contributor

Hi Roy,
Agree, but what about a common subprocessor use case?
For example, the webpage natuurlijkehaarkleuring.nl/afspraak includes an API from its processor Acuity Scheduling. A signed contract is in place. However, Acuity Scheduling relies on its own subprocessors, i.e. a pixel from TrackJS, and a JavaScript and a UID cookie from New Relic, both for measuring API problems across Acuity's install base. No signed contract is in place between them and the first party, so the subprocessors are not controlled by the first party. My point is that a property listing the subprocessors not controlled by the first party would be useful to be more specific about the span of control of the first party. What are your thoughts on this use case?

mschunte2 May 1, 2017

Collaborator

2017-05-1: Draft Agreements:

  • Mutual Transparency: If a site-wide exception exists, the browser SHOULD inform the site which subset of third parties was hindered (i.e., not loaded and/or not sent DNT:0).
  • Constraining Third Parties: A site should be able to tell a browser to only load third parties that are explicitly listed.

OPEN:

  • If a site-wide exception exists, to what extent can a user (via their UA) decide which subset of third parties to accept or not (while being transparent about it)?

mschunte2 May 9, 2017

Collaborator

2017-05-1: Consensus
Agreements:

  1. We introduce an optional field otherParties that MAY contain a machine-readable list of URL patterns that indicates what third parties may be used by the site.
  2. If a site-wide exception exists, the browser SHOULD inform the site which subset of third parties did not receive DNT:0.

mschunte2 May 9, 2017

Collaborator

Text proposal for otherParties (from Rob):
.5.x Other-party Property

Since a user's experience on a given site might be composed of resources that are assembled from multiple domains, it might be useful for a site to distinguish those domains that are not subject to their own control (i.e., no information might be obtained via the controller property or the same-party property). An origin server MAY send a property named other-party with an array value containing a list of (sub)domain names that the origin server claims to include, to the extent they are referenced by the designated resource, and if all data collected via those references do not share the same data controller as the designated resource.
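As a rough illustration of how a recipient might use such a list, the sketch below classifies a host against the existing same-party array and the proposed other-party array. The `other-party` member is only the proposal from this thread, the function name `classifyHost` is made up, and the assignment of hosts to lists is just one possible reading of the scheduling-page use case discussed earlier.

```typescript
type PartyClass = "same-party" | "other-party" | "undeclared";

interface PartyLists {
  "same-party"?: string[];
  "other-party"?: string[]; // proposed in this thread, not in the published TPE
}

// Exact, case-insensitive host comparison; no wildcard or subdomain logic.
function classifyHost(host: string, lists: PartyLists): PartyClass {
  const same = lists["same-party"] || [];
  const other = lists["other-party"] || [];
  const h = host.toLowerCase();
  if (same.indexOf(h) !== -1) { return "same-party"; }
  if (other.indexOf(h) !== -1) { return "other-party"; }
  return "undeclared"; // the site makes no statement about this host
}

// Illustrative values only: the contracted processor is listed as same-party,
// its subprocessors as other-party.
const exampleLists: PartyLists = {
  "same-party": ["acuityscheduling.com"],
  "other-party": ["usage.trackjs.com", "js-agent.newrelic.com", "bam.nr-data.net"]
};

const newRelicClass = classifyHost("bam.nr-data.net", exampleLists);
const unknownClass = classifyHost("cdn.example.net", exampleLists);
```

What a user agent should then do with an "undeclared" host is exactly the open question in this thread; the sketch only reports the category.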

michael-oneill May 10, 2017

Collaborator

It does not need the (sub). The set of "domain names" includes sub-domain names

michael-oneill May 10, 2017

Collaborator

Full wildcards (i.e. just "*") would be equivalent to not having otherParties at all, so they are pointless. Partial wildcards (e.g. "*.example.com") would be OK, but they end up making the code less efficient (if it is implemented in JS). It is a lot faster to use the Array.indexOf function to match against URLs; partial wildcards mean you have to search in JS.
We have not documented partial wildcards up to now in the TPE.
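The difference between the two matching styles can be sketched like this. Since partial wildcards are not documented in the TPE, the `*.example.com` syntax and its meaning here (any subdomain, but not the bare domain) are assumptions, as are the function names; the sketch also compares host names rather than full URLs to keep it short.

```typescript
// Exact matching: one native Array.indexOf lookup over the list.
function matchesExact(host: string, entries: string[]): boolean {
  return entries.indexOf(host) !== -1;
}

// Partial wildcards: every entry has to be inspected in script code.
// Assumed semantics: "*.example.com" matches any subdomain of example.com,
// but not "example.com" itself.
function matchesWithWildcards(host: string, entries: string[]): boolean {
  for (let i = 0; i < entries.length; i++) {
    const entry = entries[i];
    if (entry === host) { return true; }
    if (entry.slice(0, 2) === "*.") {
      const suffix = entry.slice(1); // e.g. ".example.com"
      if (host.length > suffix.length && host.slice(-suffix.length) === suffix) {
        return true;
      }
    }
  }
  return false;
}

const partyEntries: string[] = ["tracker.example.net", "*.example.com"];
const exactHit = matchesExact("cdn.example.com", partyEntries);
const wildcardHit = matchesWithWildcards("cdn.example.com", partyEntries);
```

With exact matching, the wildcard entry is just another string that never equals a real host, which is why supporting patterns needs the extra loop.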

dwsinger May 15, 2017

Collaborator

I'm still unsure what this means to the user or user-agent. Perhaps something like:

An origin server MAY send a property named otherParty with an array value containing a list of third-party domain names of sites incorporated by reference into the origin site, that the origin server is aware of, but those references do not share the same data controller as the designated resource.

dwsinger May 15, 2017

Collaborator

Another possible direction.

Insert into the site-specific exception API call: The arrayOfDOMStrings in the site-specific exception call MUST only contain DOMStrings that also occur in the union of otherParties+sameParty(+ other fields?) arrays. When the explicit or default arrayOfDOMStrings value of "" is used, otherParties MUST also have an explicit or default value of "".

and in the otherParties documentation, say that the default value is "*", so it reads in total:

An origin server MAY send a property named otherParty with an array value containing a list of third-party domain names of sites incorporated by reference into the origin site, that the origin server is aware of, but those references do not share the same data controller as the designated resource. The default value is "*", i.e. any other party may be involved in this site.
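A check along these lines could enforce the subset rule. This is a sketch of the rule as proposed in this comment, not of any implemented API: the interface, the function name `undeclaredTargets`, and the handling of an omitted otherParties as the default "*" are assumptions based on the wording above.

```typescript
interface DeclaredParties {
  sameParty?: string[];
  otherParties?: string[]; // omitted is read as the default "*": any other party
}

// Returns the requested exception targets that are NOT covered by the
// union of sameParty and otherParties. An empty result means the request
// is consistent with what the site declared.
function undeclaredTargets(requested: string[], declared: DeclaredParties): string[] {
  if (declared.otherParties === undefined) { return []; }      // default "*"
  if (declared.otherParties.indexOf("*") !== -1) { return []; } // explicit "*"
  const union = (declared.sameParty || []).concat(declared.otherParties);
  return requested.filter(function (domain) {
    return union.indexOf(domain) === -1;
  });
}

const rejected = undeclaredTargets(
  ["a.example", "c.example"],
  { sameParty: ["a.example"], otherParties: ["b.example"] }
);
```

A user agent could refuse the exception call, or drop only the rejected entries, when the result is non-empty; the proposal above does not say which.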

rvaneijk May 22, 2017

Contributor

Reply to https://lists.w3.org/Archives/Public/public-tracking/2017May/0108.html

Hi David,

I think your text may be merged with my proposal. My suggestions are in [[ ]].

[[Since a user's experience on a given site might be composed of resources that are assembled from multiple domains, it might be useful for a site to distinguish those domains that are not subject to their own control (i.e., no information might be obtained via the controller property or the same-party property).]] OtherParties is a list of [[domains]] that are operated by data controllers other than the first party, that may be referenced by the first party site and are [[domains]] to its operation, and for whom the first party has [some confidence? assurance? belief?] that they will at least respect the DNT header. There is no assurance or statement made about parties encountered on the first party that are not in the union of the first party, the sameParty array, and the otherParties array. [[An origin server MAY send a property named other-party with an array value containing a list of (sub)domain names that the origin server claims to include, to the extent they are referenced by the designated resource, and if all data collected via those references do not share the same data controller as the designated resource.]]

-----Original message-----
From: David Singer
Sent: Friday, May 19 2017, 6:39 pm
To: Rob van Eijk
Cc: Shane Wiley; Matthias Schunter (Intel Corporation); public-tracking@w3.org
Subject: Re: Issue-22, possible other direction

Do we have proposed spec. text?

Rob, I am still concerned that the ‘transparency’ may be a myth if I am right and the array can be wrong:

a) by omission; the first party site may pull in sites not mentioned in the otherParty array (quite likely, full coverage may be very hard to achieve);
b) by inclusion: the array might mention sites that are not, in fact, pulled in on a given visit (quite likely, as what other sites are pulled in depends on a host of factors)

If these are both true, then the array could be a complete myth and still conformant. In that case, what use is it to anyone?
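The two failure modes in the quoted message can be made concrete with a small comparison between what a site declares and what a user agent actually observes on one visit. This is only a sketch: the function name, the result shape, and the example hosts are invented for illustration.

```typescript
interface ListAccuracy {
  omitted: string[]; // loaded on this visit but not declared (case a)
  unused: string[];  // declared but not loaded on this visit (case b)
}

function compareDeclaredWithObserved(declared: string[], observed: string[]): ListAccuracy {
  return {
    omitted: observed.filter(function (host) { return declared.indexOf(host) === -1; }),
    unused: declared.filter(function (host) { return observed.indexOf(host) === -1; })
  };
}

// Hypothetical visit: one declared host is never loaded, one loaded host
// was never declared.
const visitReport = compareDeclaredWithObserved(
  ["ads.example", "metrics.example"],
  ["metrics.example", "fonts.example"]
);
```

Both lists can be non-empty while the TSR remains conformant, which is the concern raised above about how much the array tells anyone.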
