Permissioning layer for match keys #39

Open
csharrison opened this issue Feb 28, 2023 · 2 comments

@csharrison
Contributor

Currently in IPA, a match key set by a site can be used by any other site. There is no mechanism in the system where a site could choose to keep a match key to itself. I have serious concerns about this.

In practical terms, to use IPA to achieve cross-device attribution, match keys would likely be derived from PII. This means there is a tension between using IPA to its fullest, and pervasively sharing your device graph / PII-derived user data with the rest of the web ecosystem.

Like most other APIs that store user data, IPA should abide by the Same Origin Policy by default, and not expose read access to match keys, even in an encrypted form. If sites want to share their match keys with others, we can support selectively exposing access in an opt-in way with a permissioning system. This could look something like a set policy declared at setMatchKey time:

setMatchKey(<key>, {
  exposeToOrigins: ['https://foo.com', 'https://bar.*.com']
});

Here we could support pattern matching using the URLPattern API infrastructure, which would let a provider allow a specific set of report collector origins (or everyone, with "*") to consume its match keys.
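To make the proposed check concrete, here is a minimal sketch of how a browser might evaluate an `exposeToOrigins` policy. Everything in it is an assumption for illustration: the `store` map, the `canReadMatchKey` helper, and the four-argument `setMatchKey` are hypothetical, and a small hand-written wildcard matcher stands in for URLPattern so the example runs anywhere. Same-origin reads are allowed by default; every other reader must match a declared pattern.

```javascript
// Hypothetical stand-in for URLPattern: '*' matches any run of characters
// other than '/', so 'https://bar.*.com' matches 'https://bar.shop.com'.
function originPatternToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('[^/]*');
  return new RegExp(`^${escaped}$`);
}

// Hypothetical storage: provider origin -> { key, exposeToOrigins }.
function setMatchKey(store, providerOrigin, key, { exposeToOrigins = [] } = {}) {
  store.set(new URL(providerOrigin).origin, { key, exposeToOrigins });
}

// Returns true if readerOrigin may consume the provider's match key.
function canReadMatchKey(store, providerOrigin, readerOrigin) {
  const provider = new URL(providerOrigin).origin;
  const reader = new URL(readerOrigin).origin;
  const entry = store.get(provider);
  if (!entry) return false;
  // Same Origin Policy default: the provider can always read its own key.
  if (reader === provider) return true;
  // Otherwise the reader must match an opted-in pattern ('*' means everyone).
  return entry.exposeToOrigins.some(
    (pattern) => pattern === '*' || originPatternToRegExp(pattern).test(reader)
  );
}

const demoStore = new Map();
setMatchKey(demoStore, 'https://provider.example', 'mk-123', {
  exposeToOrigins: ['https://foo.com', 'https://bar.*.com'],
});
console.log(canReadMatchKey(demoStore, 'https://provider.example', 'https://foo.com'));
console.log(canReadMatchKey(demoStore, 'https://provider.example', 'https://other.example'));
```

A real implementation would presumably delegate matching to URLPattern, whose wildcard semantics differ in detail from this simplified matcher.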

Additionally, we could support dynamic, just-in-time permissions, e.g.

  • getMatchKey could automatically grant permission to a report collector if the API was called within a document whose origin matches the provider.
  • If we support an HTTP API for getting match keys (see #25, "Match keys without JavaScript (for browser implementations)"), we could allow a provider to redirect to a report collector and append its match key.
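The two just-in-time ideas above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `JitMatchKeyStore` class, the three-argument `getMatchKey`, the choice to persist a grant once made, and the `match-key` query parameter name are all hypothetical and not part of any IPA proposal.

```javascript
// Hypothetical browser-side state: provider origin -> { key, granted }.
class JitMatchKeyStore {
  constructor() {
    this.entries = new Map();
  }

  setMatchKey(providerOrigin, key) {
    const provider = new URL(providerOrigin).origin;
    this.entries.set(provider, { key, granted: new Set() });
  }

  // documentOrigin: origin of the document calling the API.
  // Returns the (encrypted) match key, or null if the collector has no access.
  getMatchKey(documentOrigin, providerOrigin, collectorOrigin) {
    const provider = new URL(providerOrigin).origin;
    const collector = new URL(collectorOrigin).origin;
    const entry = this.entries.get(provider);
    if (!entry) return null;
    // A call made from the provider's own document grants access to the
    // collector (assumption: the grant persists for later calls).
    if (new URL(documentOrigin).origin === provider) {
      entry.granted.add(collector);
    }
    return entry.granted.has(collector) ? entry.key : null;
  }
}

// HTTP variant: the provider redirects to the report collector and appends
// its match key as a query parameter (parameter name is an assumption).
function buildCollectorRedirect(collectorUrl, encryptedMatchKey) {
  const url = new URL(collectorUrl);
  url.searchParams.set('match-key', encryptedMatchKey);
  return url.toString();
}
```

In this sketch a report collector calling `getMatchKey` from a third-party document gets `null` until the provider has made the same call from its own origin.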

These changes would allow sites to use IPA without fear of leaking their users' data to parties outside their control.

Note: We might need something more sophisticated if we want match key setting to persist beyond a single browser / app (e.g. storage mediated by an operating system), but I think we can discuss that later on.

@benjaminsavage
Collaborator

Before responding to the suggestion about permissioning match keys itself, I'd like to more fully understand your concerns with the current proposal.

You said:

In practical terms, to use IPA to achieve cross-device attribution, match keys would likely be derived from PII. This means there is a tension between using IPA to its fullest, and pervasively sharing your device graph / PII-derived user data with the rest of the web ecosystem

And later said:

These changes would allow sites to use IPA without fear of leaking their users' data to parties outside their control.

Could you please flesh out in more concrete terms what exactly the threat / risk is?

You've alluded to "pervasively sharing your device graph / PII-derived user data with the rest of the web ecosystem", but I do not understand what you're referring to. In the current proposal, I do not see any way for another site to learn any of the following:

  1. Any value of any match key
  2. Any PII
  3. Any way to link one record to a record from another site
  4. The number of devices / browsers utilizing the same match key
  5. Any metadata of any use or interest about the user/device graph of the match key provider

Maybe I'm missing something and there is a specific attack vector I haven't considered which allows an attacker to exfiltrate one or more of these pieces of information? If so, could you outline the attack?

As it is, the phrase "leaking their users' data to parties outside their control" sounds very scary, but I just don't see that reflected in fact.

@csharrison
Contributor Author

There are a few pieces here:

  1. Sites may not want to expose their users' data to others in the event of a security compromise. Obviously, if the helper system is compromised (e.g. the keys are leaked), all privacy for the users is gone. It is worse still if this also becomes a major security event for match key providers who have exposed user data to a possibly unbounded number of third parties.
  2. Relatedly, some company policies consider encrypted data "user data" and treat it similarly to cleartext data in terms of limiting the scope of its sharing.
  3. It may be possible to leak aggregate proprietary information about a competitor using the IPA system. While I don't have an exhaustive list of techniques for how to do this, IPA also has no formal protections against it, comparable to, e.g., differential privacy (DP), which offers formal protection for users' privacy against any possible attack. Rough attacks along these lines could involve comparing a competitor's match key performance against a same-device match key, or using an auxiliary system that uses PII joins to generate output. I also want to note that we expect IPA to evolve its capabilities over time to support new use-cases (e.g. using IPA for reach reporting), and if leaking this kind of information is something we want to prevent, that could constrain this future innovation.

cc @michaelkleber
