Take up Private Attribution Proposals #9

Closed
AramZS opened this issue Jan 13, 2022 · 23 comments
Labels
agenda+ Request to add this issue to the agenda of our next telcon or F2F

Comments

@AramZS
Contributor

AramZS commented Jan 13, 2022

Much work has been done on the Attribution API proposals as different proposals converge on a single standard. It has been proposed that this group take up the current set of proposals and move them towards the next step of standardization through this CG. Currently we are seeing proposals in the repo under issue 2, which intends to cover private attribution processes, and an API proposal in issue 1.

As per discussion here, our goal should be to examine current proposals in this specific space, start establishing the problems these proposals address, what use cases should be approached, and more. I think comments below do a good job explaining desired process here.

We will attempt to move into this section quickly, after we set aside time for initial setup and editorial groups, but I suspect day 1's slots will start at the halfway point of the meeting and continue until the end, while we will aim to start day 2's slots 30m into the meeting.

Speaking Slots:

@AramZS added the agenda+ label Jan 13, 2022
@martinthomson
Contributor

For me, this is the most important action we can take right now (with the possible exception of kicking off a working group). Having something concrete on our slate is what will really start the group off.

What I suggest we do is discuss whether this is the first technical work item and then - assuming that I'm right and we agree - we ask the various proponents to give us a brief overview of their proposal. Not details, but the shape of the problem as they see it, and the principles they applied to their solutions.

My sense is that we will pick up on a couple of themes in that discussion. Discussing those and maybe reaching some conclusions on them might help us decide between the 5+ different options that are currently floating around in this space. A great many things are possible here.

There are some challenging decisions we might have to make in the abstract before getting into the details of proposals. This includes deciding that providing information about cross-site interactions is necessary to perform attribution, which in itself might be hard to handle for some. How that release might then be managed opens up a host of further questions: Can information be tied to individuals? Is it enough to try to break the connection between the information released and individuals, or do we want to aggregate results from multiple users? What other sorts of constraints or conditions might we apply to improve privacy? What costs are we willing to ask advertising businesses to pay in order to access this information? (I've a longer list of these sorts of questions in a document I might share, though it's now looking a bit dated.)

We probably need to work through at least some of those questions so that we have some goals for the project. Then we might start to make progress on picking a proposal, set of proposals, or synthesis of proposals as a starting point.

In terms of people that might help here, I would start with @benjaminsavage, @csharrison, and @erik-anderson. I would love to include @johnwilander in that set, but Apple are not currently members of the CG. All of these folks have thought very deeply about the problem and have made proposals in the space.

@ekr
Contributor

ekr commented Jan 13, 2022

I agree with @martinthomson's points here. I would further add that we should make actually picking a proposal out of scope for this next meeting unless there is just obvious consensus. Rather, we should use the time to flesh out requirements and the option space and make sure everyone is up to speed. Then the proponents can go away and do whatever changes they want, including merges, and then we can try to pick one a bit later. This will allow us to have a good discussion without people feeling proprietary about their ideas at this stage.

@marianapr
Contributor

I would also like to propose discussing aggregation proposals here, based on existing instantiations such as Prio and DPFs from the Privacy Preserving Measurement framework (https://datatracker.ietf.org/doc/draft-gpew-priv-ppm/).

@ekr
Contributor

ekr commented Jan 28, 2022

I'd definitely like to see the WG look at those. Can you link to some of the proposals for how to use PPM for those applications?

@marianapr
Contributor

The functionality supported by Chrome's aggregate API (https://github.com/WICG/conversion-measurement-api/blob/main/AGGREGATE.md) matches the aggregation that Prio and DPFs support.

@bmayd

bmayd commented Jan 28, 2022

... we ask the various proponents to give us a brief overview of their proposal. Not details, but the shape of the problem as they see it, and the principles they applied to their solutions.

I think it would be helpful to list, at an abstract level, the primitives each proposal relies on so that we can get an idea of what general services might provide support for various models and to make it easier to compare and contrast models.

@AramZS
Contributor Author

AramZS commented Jan 28, 2022

I agree with the group here: our approach should be to understand these proposals more abstractly, what they intend to do, how they intend to do it, and how other proposals that approach this process similarly do so. I think that our main goal should be to understand how we can join the variety of related proposals together, with a single standard that covers these proposals' use cases being the product that we would hand off to a Working Group. My inclination was to cover each of these in order and start to talk through their similarities, especially with a mind towards specific behaviors. I see from @martinthomson's response how that can be unclear in my phrasing. With that in mind, I will edit the top post and remove patcg/proposals#8 so that it is clearer that this is intended to be a single process.

@AramZS changed the title from "Take up Attribution API Proposal" to "Take up Private Attribution Proposals" Jan 28, 2022
@AramZS
Contributor Author

AramZS commented Jan 28, 2022

I think that we can decide whether this is technical work we wish to bring up in advance of the meeting; I would assume yes? I see all participants on this thread as interested. I understand we likely will not see a rep from Apple at this time, but I am looking for any others who can make it clear they are available to talk during one of the days we have available.

@erik-anderson

@AramZS Microsoft (not me specifically) would like some time to provide an overview of our understanding of various private attribution/measurement proposals and how they compare. Would you like a separate issue in this repo for that or should we coordinate folks that also want to present in this space within this issue? Understanding how much time we expect to allocate per speaker will be helpful as well.

@csharrison
Collaborator

A few of us on the Google side would also want to have some time to present, but it's hard to prepare for this upcoming meeting without a more concrete agenda. How much time will be devoted to the topic? I think this discussion will look very different if we have 1 hour vs. 3.

@AramZS
Contributor Author

AramZS commented Feb 7, 2022

I think it is reasonable to assume that the first meeting day will be, once we set up and say go, mostly dedicated to this conversation. @csharrison @erik-anderson I would like to spend the time diving as deep as we can into the varied proposals so that we can come out of this meeting with a good enough understanding to start reconciling various proposals. Do you have preferences on how long to speak to these individually, or on bringing forward multiple people from your teams?

@ekr
Contributor

ekr commented Feb 7, 2022 via email

@csharrison
Collaborator

@AramZS yes that's reasonable. I think spending a whole day on this is totally fine with me :)

@ekr that sounds reasonable to me. In particular I think it will be beneficial to discuss known trade-offs in the design space.

@AramZS
Contributor Author

AramZS commented Feb 8, 2022

Hello all, after reviewing discussion here and talking through a structure with @seanturner, we think that the best approach is to move in the direction discussed here. We'll have an introduction in our first meeting and a short period to go through work mode and review the agenda (details soon to be entered into the GitHub repo), and will set into the agenda five 30m slots, with three on day 1 and two on day 2, and some flexibility in the schedule if we see additional folks stand up to speak to this.

We would then follow this up on day 2 with 30m-1h to review discussed constraints, conditions, and comparisons, to paraphrase @erik-anderson and @martinthomson here. (I will open up an issue to track that part of the agenda.)

I think we can look at 3 of the five slots taken by existing proposal authors.

I believe that even though he is not a formal member of this group, @johnwilander will be in attendance as an expert observer and open to answering questions during a dedicated time slot, leaving 1 more time slot open. We can include @rmirisola's presentation of #12 or, if participants here feel that Prio and related concepts represent a unique enough approach, perhaps @marianapr can talk through them with this group or suggest someone who would.

@AramZS
Contributor Author

AramZS commented Feb 8, 2022

@AramZS Microsoft (not me specifically) would like some time to provide an overview of our understanding of various private attribution/measurement proposals and how they compare. Would you like a separate issue in this repo for that or should we coordinate folks that also want to present in this space within this issue? Understanding how much time we expect to allocate per speaker will be helpful as well.

@erik-anderson I've set up a specific time for this type of review towards the second half of day 2. Would 30m work for the person on your team who wishes to present? Yes, let's talk it through on this other issue: #17

@erik-anderson

@AramZS yes, I think a day 2 overview would be good. I am confirming availability. I'll note ahead of time that there may or may not be some differences of opinion in terms of statements made during such an overview/comparison (we're trying to minimize opinions and instead focus on our understanding), but either way the conversation should be helpful to get aligned.

For the main slots, @joelpf will talk about Masked LARK.

We would also like to have at least 15 minutes for @betuldurak to give an overview of the "bucketization" proposal, which is attempting to tackle similar use cases to those covered by Prio and DPFs, with a similar threat model but a different underlying approach. This is separate from Masked LARK and could potentially be integrated with other proposals. Ideally this content would be presented around the same time as @marianapr's presentation on Prio and DPFs since we are providing comparisons with those.

@AramZS
Contributor Author

AramZS commented Feb 8, 2022

@erik-anderson That sounds reasonable, I've added @joelpf in on the main schedule and moved things around to give an extra 15m slot to @betuldurak on day 2 in the agenda. fca1a4f

@marianapr
Contributor

marianapr commented Feb 9, 2022

Hello, I see a presentation from me on Prio and DPFs referenced above. When is this scheduled? I see that the schedule says that @betuldurak will talk about Prio/DPF, but I think what @erik-anderson meant was that I will talk about Prio/DPF and @betuldurak will talk about a related bucketization approach.

@AramZS
Contributor Author

AramZS commented Feb 9, 2022

@marianapr Yes, are you prepared to present on Prio/DPF? I was unclear from your comment if you were asking for such a presentation or if you were prepared to present. If you are prepared to present I've set aside a slot for the discussion following @betuldurak on Day 2 if that works?

@marianapr
Contributor

Sorry if it was not clear. Yes, I am prepared to present. Since I know both Prio/DPF and also the work of @betuldurak, it might be better to swap the order and first cover Prio/DPFs and set the stage for Betul's work.

@AramZS
Contributor Author

AramZS commented Feb 9, 2022

Noting to speakers @benjaminsavage, @csharrison, @joelpf, @marianapr, @betuldurak: @johnwilander will be available to answer questions in a limited way as an observer from Apple, but only during the first two hours of our meeting today, so I have shifted discussion slots to put a conversation about Apple's PCM in the first slot and shifted everyone else down.

@rmirisola

Not sure where we landed on #12 but let me know if I should present.

@AramZS
Contributor Author

AramZS commented Feb 11, 2022

@rmirisola We didn't have enough time in the agenda for this meeting, but I think let's aim to talk it through in the next one.

@AramZS AramZS closed this as completed Feb 11, 2022
8 participants