Early design review request: IPA #823
Comments
Hi @benjaminsavage, @martinthomson and @eriktaubeneck. We are looking at this in our W3C TAG meeting today. Two questions for you.
Hi @hadleybeeman, glad to hear that you will be looking at it today! Let me do my best to answer your questions, but please follow up if you need any clarity.
Some general feedback and thoughts:
I just want to pick up on this one point:
In IPA, websites (advertisers, publishers, etc.) pay, not user agents.
It is not clear what the relationship is between the User Agent Vendor and the three Helper Parties, except for the fact that the User Agent must trust the Helper Parties to not collude. Given that (by design) 2 parties colluding would be disastrous for user privacy, there's a strong incentive for the User Agent Vendor to operate one of the Helper Parties.
Operating a helper node is something Mozilla has considered, but we're not inclined to do so for a few reasons. Foremost of those is that we're looking to use something like the CA/Browser Forum as a reference model for governance. That is, we want to have a common set of helper party networks that are trusted and overseen by a group formed by multiple browsers. In other words, having browsers oversee the operators of those networks. Having the browser involved both in operation and oversight would introduce some fairly gnarly conflicts of interest that seemed best to avoid.

As for the other points you make @ShivanKaul:

Regarding users vs. sites (point 1). Yes, this is directly acknowledged in the explainer. It's come up a number of times in PATCG meetings, specifically in the context of the priority of constituencies. This is something that I recognize each of us weighs differently, but we'll note that the priority is necessarily loose, so there are a few ways that I think you can justify doing something like attribution.

The magnitude of the benefit here needs to be considered. The IPA design deliberately imposes a very low cost on users. Leaving aside trivial amounts of bandwidth and compute, the primary cost is the privacy loss (in the formal DP sense) that accrues through providing sites with the ability to perform aggregated attribution. Mozilla's position here is that - provided that we can find an acceptable set of parameters, especially for the

Again, we acknowledge that the benefits that users see are likely to be indirect, at best. Access to ad-supported content is not automatic here. The advertising industry has some pretty bad incentive structures and it might be that the current trend away from ad-supported content will continue, with the benefits to users not being realized.

But we do believe that advertising has demonstrated an ability to provide support to sites that can be more equitable than other business models, as it largely shifts the burden onto those who are more willing or able to support advertisers. A progressive taxation system, if you will.

Ultimately, there are a lot of things to consider here. It's understandable that you might distrust the advertising industry. There are a lot of shady practices that probably won't stop as a result of us building this stuff. Many actors are unhappy with the share of revenue taken by intermediaries (what's new). Hell, some of that will probably get worse despite our efforts, but that is a risk for a lot of the stuff we build. We could, as I think you are implying, refuse to do anything here, but there are those of us who think that leads to undesirable outcomes, like a far less equitable web. What we are trying to do here is to avoid the worst pitfalls and build safeguards for the rest: technical if possible, procedural and policy-based for the gaps.

You identify a few areas that are particularly challenging for IPA. Some of its flexibility comes with inherent trade-offs around things like user transparency. You also identified one area that continues to be challenging for us with the point you make about match key providers. All I can really say is that these represent some of the harder trade-offs we've made in the design. Having some more discussion about these choices relative to some of the alternatives might be the best way to proceed, because some of those choices can be hard to rationalize without putting them into the broader context.

I also want to acknowledge explicitly that the context I'm talking about includes not only how sites receive support, but how browsers support themselves. (You might also add bad behaviour from information brokers and regulatory interventions into what is turning out to be a pretty complicated picture.)
Sorry for the late response, you know how IETF weeks are...
I don’t think the Priority of Constituencies is “loose”; it’s plain, and the exceptions that are listed in the “Web Platform Design Principles” document are unrelated to what’s being proposed here.
While the (formal, DP) privacy loss for users in IPA is definitely something we should reason about, I suspect that it is also the more attractive one for us to solve as engineers; the more important user concerns here are i. around transparency and trust, and ii. piercing the privacy boundary of the browser by intentionally linking events that happen outside the browser with events that happen within the browser.

The proposed governance model is especially concerning to me: it looks like we’re building complicated and expensive new Web infrastructure/governance structures here, similar to the CA/Browser Forum as you mentioned, except that with IPA there is not even a security or any other similar benefit to users. I really don’t think CAB is the model to be emulating.

This is the first W3C proposal (that we’re aware of) that requires the use of trusted, non-user-auditable centralized servers for privacy protections. Beyond the clear risk of catastrophic privacy harm here (e.g., a misconfigured server), this approach seems incompatible with several TAG findings / W3C principles, including “enhancing individuals’ control and power”, “the web is transparent” and “the web must make it possible for people to verify the information they see”.

This proposal has the goal of intentionally linking behaviors in the browser with behaviors outside the browser. This is a new category of privacy harm that the proposal would enable, and the first time we’ve seen it as an explicit goal in a proposal. This has already resulted in attacks like patcg-individual-drafts/ipa#57.

As best we can tell, this technology is being proposed to benefit sites and browser vendors, at a risk to users and to the openness and transparency of the platform as a whole.
Regarding priorities and "loose", I was loosely referring to this important qualification:
That said, even a strict ordering justifies our conclusion, though it requires acknowledging that some benefits are indirect. That is, the indirect benefit to users as a result of serving the needs of authors (again, via an ability to more effectively support their work with advertising) outweighs, or at least balances, the loss associated with those users participating in an aggregated measurement system. And the benefit to authors is potentially significant.
Thank you everyone for the feedback thus far. I wanted to update the group about a change that we have recently made to the IPA proposal. In light of both:
We've opted to remove the setMatchKey API from this proposal. Perhaps, in future, we will find solutions to these problems, but until that time, we would like to explore a simpler proposal which only includes a

The underlying identifier being secret shared in this case would just be a random number, generated by the user-agent, which would never be revealed to any party, just stored on the device. We hope this simplification will address a number of the concerns listed above.
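As a rough illustration of what "secret sharing a random number" among three helper parties can look like, here is a minimal sketch. It assumes a three-party replicated additive scheme in which each helper holds two of three additive parts; the thread does not specify the scheme, and the 64-bit width and all function names (`new_match_key`, `share`, `reconstruct`) are hypothetical. The sketch is consistent with the concern raised earlier that two colluding helpers would be enough to recover the value, while a single helper learns nothing.

```python
import secrets

BITS = 64            # hypothetical match-key width, not taken from the proposal
MOD = 1 << BITS


def new_match_key() -> int:
    """Random identifier generated on the device (never sent in the clear)."""
    return secrets.randbits(BITS)


def share(value: int) -> list[tuple[int, int]]:
    """Split `value` into three additive parts; helper i gets parts i and i+1."""
    p0 = secrets.randbelow(MOD)
    p1 = secrets.randbelow(MOD)
    p2 = (value - p0 - p1) % MOD
    parts = [p0, p1, p2]
    return [(parts[i], parts[(i + 1) % 3]) for i in range(3)]


def reconstruct(holdings: dict[int, tuple[int, int]]) -> int:
    """Recombine the value from the holdings of at least two helpers."""
    parts: dict[int, int] = {}
    for helper, (first, second) in holdings.items():
        parts[helper] = first
        parts[(helper + 1) % 3] = second
    if len(parts) < 3:
        raise ValueError("a single helper's holdings do not determine the value")
    return sum(parts.values()) % MOD


key = new_match_key()
helper_shares = share(key)
# Any two helpers together hold all three parts.
recovered = reconstruct({0: helper_shares[0], 2: helper_shares[2]})
```

Each pair of parts a single helper sees is uniformly random and independent of the match key, which is why the design relies on the helpers not colluding.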
To clarify, Match Key Providers (and their associated API call,
It would also be good to update the Explainer then.
We talked about this today during our call, and it's our understanding that there is a promising path forward to merge IPA, PAM and the relevant portions of ARA. Given that, we don't think it's prudent to review the details of IPA since this is subject to change. We're happy to see these attempts to converge on a way of measuring advertising effectiveness that is more privacy preserving. We encourage you to keep fine-tuning the privacy properties of your proposals, and then to open a new design review request when it's ready and we'll take a look then. Thanks!
That makes sense. More details on this hybrid proposal are forthcoming. Once there has been time for more discussion of it, we'll open a new design review.
Hello TAG!
I'm requesting a TAG review of Interoperable Private Attribution (IPA).
IPA proposes a system that enables cross-site attribution. The idea is to provide businesses that use advertising with a way to measure how their advertising is performing without having to rely on tracking. To do this, IPA assigns each user an identifier - a match key - that cannot be used outside of a multi-party computation (MPC) system. The MPC system only executes a specific protocol that has been vetted to ensure that it only provides aggregated information.
Further details:
You should also know that...
The security and privacy questionnaire covers two key challenges, which I will highlight again here:
This proposal uses information - match keys - that might be used to perform cross-site tracking if the protections in the proposal were to fail. The API allows any web site to request and receive this information from user agents. The proposal includes a number of measures that are designed to protect this information.
The aggregated information that is provided to sites is based on the use of match keys. The use of differential privacy ensures that there is some protection for the contribution of individual users. The design limits the rate at which sites gain this information, so while the amount of information released each week is strictly limited, the cumulative amount released over time increases without bound.
Any conclusions about the privacy properties of the API will depend on an assessment of the adequacy of these protections.
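To make the second challenge concrete, here is a small sketch of how a per-week privacy budget adds up. It assumes basic sequential composition of differential privacy, under which the worst-case total epsilon is the sum of the per-week epsilons. The weekly value used below is purely hypothetical (no parameters are stated in this request), and `cumulative_epsilon` is an illustrative name.

```python
def cumulative_epsilon(weekly_epsilon: float, weeks: int) -> float:
    """Worst-case total privacy loss under basic sequential composition."""
    if weekly_epsilon < 0 or weeks < 0:
        raise ValueError("epsilon and week count must be non-negative")
    return weekly_epsilon * weeks


HYPOTHETICAL_WEEKLY_EPSILON = 1.0  # assumption for illustration only

for weeks in (1, 4, 52, 520):
    total = cumulative_epsilon(HYPOTHETICAL_WEEKLY_EPSILON, weeks)
    print(f"after {weeks:3d} weeks: total epsilon <= {total}")
```

The bound in any single week is fixed, but the total grows linearly with the number of weeks, so any assessment has to consider both the per-week parameter and how long a site can keep querying.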
We'd prefer the TAG provide feedback as 🐛 open issues in our GitHub repo for each point of feedback. We're happy to engage with general feedback, commentary, and questions in this thread; we expect some feedback to be very broad in nature.