Addressing use cases with “subsets” #96

Open
helenyc opened this issue Aug 2, 2022 · 10 comments

@helenyc

helenyc commented Aug 2, 2022

[Note: This issue is related to the changes proposed in PR #91 and summarized on issue #92]

The updated First-Party Sets proposal uses “subsets” to map each domain added to a set to a defined use case.

The previous First-Party Sets proposal required all domains added to a set to have a common privacy policy, a common brand, and common ownership. We received feedback from the ecosystem that tying all of these characteristics together did not support certain use cases well. For example, an entity may have a more restrictive privacy policy for a domain that is targeted at children, or may have a different policy depending on the regulatory concerns in the country where the company is operating.

The new approach ensures that user privacy is thus met by accommodating the different use cases that a multi-domain entity may have. We have introduced the “subset” approach to design rules for the different use cases in each subset.

Does the “subsets” approach to FPS meet all intended use cases? Where are the limitations?

@alexmerk-ringieradvertising

We at Ringier Advertising are interested in setting up a subset and hope this feature will be adopted by most browsers, as it is important for us to share our first-party cookie data across our own large portfolio of sites.

@rblanck

rblanck commented Aug 22, 2022

I would like to note that we at Axel Springer believe that First-Party Sets should not be restricted or defined by a purely technical definition of associated domains, or of domains with a vague definition of affiliation, but should instead use an ownership-centric model as a baseline for the discussed use cases.

Ownership is a clear legal process, which also includes clear responsibility and governance for the websites exposed in the imprint and legal compliance requirements (e.g. responsible data controller under GDPR).

Even if a publisher operates different websites, it is still one publisher and one legal entity. What is the point behind this, if the technical ideas here go beyond known and used legal principles? Especially if you violate legal requirements at the same time for technical reasons (service subsets (CDN)).

We also think that this part of the discussion is not technical in nature, but a pure governance discussion that should not be held in GitHub tickets. This question is crucial for publishers and should be discussed with publishers, website owners, and their business or legal advisors.

@dmarti

dmarti commented Aug 22, 2022

Hi @rblanck -- ownership was previously considered as one of the criteria for forming a set, but previous discussions on this project raised several issues that would have made it impractical to apply for many categories of sites. (Summary and links, that mostly still apply to the current version).

The current version of FPS would eliminate the Independent Enforcement Entity (IEE) and replace it with a public review process, on a GitHub repository or similar. All of the corporate ownership research that would have been difficult and costly for the IEE would also be hard for public reviewers.

@rblanck

rblanck commented Aug 22, 2022

SSL certs are a practical live model for ownership that can be easily tweaked.

As somebody who has gone through the extended validation (EV) certificate process, I can say that this proof is stable. It is a trust mechanism that even banks, with their high security needs, rely on. Of course, this has costs, but if you are interested in building a first-party set for nontechnical reasons, this kind of cost should not be a problem.

Have you done any quantitative exploration of this topic? I believe it could be crucial for website owners and legal entities, but I think they don't even know about this discussion among the specialists :)

Some short answers to the summary points, one by one:

data usage:
I'm sure that data usage matters to users, but the regulatory environment is the space we act in, and it is very clear there that ownership is the dimension we are held responsible for, regarding privacy and everything else.

costs:
Website owners and browsers are not the organizations that should enforce or perform the proof. Even now, it is trustees who do this job in a scalable way. I am not talking here about standard SSL certs with nothing behind them.

building entity:
There should be something that prevents the building up of "mass structures", but I think that's easy to solve.

attackers:
Please show me the case where you can easily falsify the ownership on a highly trusted extended validation cert. Perhaps there were a few small cases in the past, but otherwise, nobody would rely on ownership of domains.

In summary, if governance and policy people are willing to do this, it would work. To me this sounds more like an "agenda" where everyone brings their arguments. But this discussion should not be held in a W3C forum on GitHub; it should be held with relevant people outside of these tech forums, including Europeans, who have other regulations and laws in place.

It would be interesting to see people directly from publishers in this discussion, along with a good non-technical explainer on the pros and cons of these proposals and the different opinions.

@dmarti

dmarti commented Aug 22, 2022

@rblanck There was an early discussion of the use of EV certificates for FPS, but it turned out to have some complex problems (see #12 for why this was rejected).

@rblanck

rblanck commented Aug 23, 2022

Because the other issue is closed, I am also copying my answer from #12 here:

_Everyone is explaining what does not work in the current infrastructure under EV.
And yes, that system does not seem perfect at the moment, but it's still in use, although browsers do not show this EV status anymore.

But the question remains: what are the requirements for a safe legal-entity check (ownership check), and does it make sense to set up a body or a third-party trustee to take on the role of producing proof of ownership? In FPS subsets, ownership requirements still seem much lower than business ownership proof. That sounds a bit strange to me. I'm not sure whether the public suffix list process, for example, is safe enough and governed correctly.

I believe there is a need for trusted legal entity ownership in many of these processes._

@npdoty

npdoty commented Aug 30, 2022

I don't understand how this proposal "ensures that user privacy is thus met by accommodating the different use cases that a multi-domain entity may have".

I can understand how this change accommodates more use cases for companies that wish to share data across multiple domains, but I don't see any documentation on how that helps people or protects user privacy.

For example: why does the user wish to have their data from three (or more, maybe up to an infinite number?) companies with different privacy policies automatically combined? Does having a common logo somewhere on the page of multiple companies match to an important concept of privacy that many people have? Do people generally review the "About" pages of multiple domains prior to deciding whether to present a single identity to all of them?

@johannhof
Member

Yeah, that sentence depends a bit on the context/perspective of the reader.

The goal is for users to be able to use services that for technical, business or other reasons have a need to implement user flows that cross multiple sites while maintaining some common state, in a world with otherwise restricted cross-site data flow. This should happen while minimizing risk of abuse (sharing data without specific user benefit, understanding and participation), i.e. things that hurt user privacy. As Helen mentioned, the previous criteria weren't optimal for achieving either and we've heard this feedback. As an alternative approach, it's interesting to consider that not all sites that need to share data exhibit the same risk, technical needs and user understanding.

This is where the new FPS helps, by "tagging" use cases for cross-site data sharing, so that browsers may utilize this information when considering Storage Access API requests. Chrome would likely choose to automatically grant access where we're confident that contextual integrity between two sites is ensured for users based on FPS (such as same-name ccTLDs or members of a small subset that per policy clearly presents its association). Other browsers may decide to prompt (possibly in addition to looking at FPS) or apply different heuristics to gain that confidence.
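
A minimal sketch of how a browser might use subset tags when deciding on a Storage Access API request. Nothing here comes from the FPS proposal or from Chrome: the `FirstPartySet` type, the tag names (`cctld`, `associated`, `service`), the auto-grant policy, and the domains are all hypothetical, chosen only to illustrate the grant-or-prompt split described above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Decision(Enum):
    GRANT = "grant"    # auto-grant without asking the user
    PROMPT = "prompt"  # fall back to a prompt or other heuristics


@dataclass
class FirstPartySet:
    """Hypothetical in-memory view of one set: a primary plus tagged members."""
    primary: str
    # Maps each non-primary member domain to its subset tag (assumed names).
    members: Dict[str, str] = field(default_factory=dict)


# Assumed policy: only these tags are trusted enough for an automatic grant.
AUTO_GRANT_TAGS = {"primary", "cctld", "associated"}


def subset_tag(fps: FirstPartySet, site: str) -> Optional[str]:
    """Return the subset tag for a site, or None if it is not in the set."""
    if site == fps.primary:
        return "primary"
    return fps.members.get(site)


def decide_storage_access(fps: FirstPartySet, top_level: str, embedded: str) -> Decision:
    """Grant only if both sites are in the set and both tags are auto-grantable."""
    tags = (subset_tag(fps, top_level), subset_tag(fps, embedded))
    if all(tag in AUTO_GRANT_TAGS for tag in tags):
        return Decision.GRANT
    return Decision.PROMPT


example_set = FirstPartySet(
    primary="example.com",
    members={"example.co.uk": "cctld", "example-cdn.com": "service"},
)
print(decide_storage_access(example_set, "example.com", "example.co.uk"))
print(decide_storage_access(example_set, "example.com", "example-cdn.com"))
```

A browser that prefers to prompt in every case could simply use an empty `AUTO_GRANT_TAGS` and treat the tag as one input to its prompt text or heuristics.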

@npdoty

npdoty commented Aug 30, 2022

Please forgive my continued lack of understanding.

As I understand it, there was feedback that some companies wanted to combine data across sites that had different privacy policies and practices, like combining data across child-focused sites and adult-focused sites, or combining data across sites that have different regulatory requirements in different countries, or combining data between multiple companies that want to connect their data about a user. Those don't sound like examples of maintaining contextual integrity; rather, they sound like prototypical examples of crossing contexts that have different norms, laws or expectations.

There was also feedback, from the Privacy CG (including myself), that sought in vain to understand how users would benefit or why users would want to automatically combine their data across more sets.

User benefit, user understanding and user participation sound like promising (although perhaps not exhaustive) ways to evaluate different proposals. And contextual integrity could also be one framework for considering privacy. I haven't yet seen in the documentation how expanding to cover more business use cases with different kinds of subsets with fewer restrictions contributes to user benefit, user understanding, user participation or contextual integrity. My questions above could be one place to start, or I'd be happy to provide more if that would be helpful.

@dmarti

dmarti commented Aug 30, 2022

@npdoty I agree. The way to tell if a set is valid is to see if there is an existing user expectation that the two domains are the same context or "thing" they're interacting with. One way to think about whether a set is valid could be: would the user be more surprised if a piece of information (like a preference) were not shared between set member domains than if it were shared?


7 participants