Agenda Request - Private Conversion Optimisation #117

Closed
benjaminsavage opened this issue Apr 21, 2023 · 1 comment

Agenda+: Private Conversion Optimisation

One important advertising use-case we have tangentially discussed a number of times is "conversion optimisation". The problem statement is simple:

  • A site has an opportunity to show an advertisement.
  • There are many, many ads to choose from.
  • Strategy: for each eligible ad, estimate the likelihood that showing it will lead to a "conversion event" (some valuable business outcome, e.g. a purchase).
  • Select the ad expected to generate the maximum business value (e.g. the likelihood of generating a "conversion event", multiplied by the value of that event to the advertiser).
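
The selection rule above can be sketched in a few lines. This is only an illustration: the `Ad` class, its field names, and the example numbers are hypothetical and are not taken from any real ad system.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    ad_id: str
    p_conversion: float      # estimated likelihood that showing the ad leads to a conversion
    conversion_value: float  # value of that conversion to the advertiser


def expected_value(ad: Ad) -> float:
    """Expected business value: conversion likelihood times conversion value."""
    return ad.p_conversion * ad.conversion_value


def select_ad(eligible_ads: list[Ad]) -> Ad:
    """Return the eligible ad with the highest expected value."""
    if not eligible_ads:
        raise ValueError("no eligible ads to choose from")
    return max(eligible_ads, key=expected_value)


# Made-up numbers, purely for illustration.
ads = [
    Ad("shoes", p_conversion=0.02, conversion_value=40.0),
    Ad("laptop", p_conversion=0.005, conversion_value=900.0),
    Ad("coffee", p_conversion=0.10, conversion_value=5.0),
]
best = select_ad(ads)
print(best.ad_id)  # laptop
```

Note that the ad with the highest conversion likelihood ("coffee") is not the one selected, because the value of the conversion also enters the product.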

The step of "estimate the likelihood that showing [the ad] will lead to a conversion event" is the tricky part. How does that work? Let's walk through an example:

  • Assume the site selecting the ad is a news website.
  • For each ad, they have some metadata (e.g. the topic, the dimensions, the format, etc.).
  • They have some contextual information (e.g. the topic of the article on the page).
  • They might have some information about the person to whom the ad would be shown (e.g. what other articles they have recently read, which advertisements they have clicked on in the past, the region of the world the reader is in as indicated by their IP address, or, if the person has registered an account, any additional information they chose to provide to the news website, such as the ad topics they are interested in).

In this case, the task is to find a function F(x), which estimates the likelihood an ad impression will lead to a conversion, for some set of parameters x that include the things listed above. If the site has this function F(x), it can just run through all N available ads, compute the parameters x for each ad in this opportunity, and invoke the function once per ad, N times in total.
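
To make the "invoke the function N times" step concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the logistic form of `F`, the three features, the dictionary keys, and the weights are all hypothetical.

```python
import math


def F(x: list[float], weights: list[float], bias: float) -> float:
    """Hypothetical F(x): a logistic model mapping parameters x to a probability."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def build_features(ad: dict, context: dict, reader: dict) -> list[float]:
    """Compute the parameters x for one ad in this particular opportunity."""
    return [
        1.0 if ad["topic"] == context["article_topic"] else 0.0,  # ad matches the article
        1.0 if ad["topic"] in reader["interests"] else 0.0,       # ad matches stated interests
        1.0 if ad["format"] == "video" else 0.0,                  # ad metadata
    ]


def score_ads(ads, context, reader, weights, bias) -> dict:
    """Invoke F once per available ad (N invocations for N ads)."""
    return {
        ad["id"]: F(build_features(ad, context, reader), weights, bias)
        for ad in ads
    }


# Made-up inputs, purely for illustration.
ads = [
    {"id": "a", "topic": "sports", "format": "banner"},
    {"id": "b", "topic": "travel", "format": "banner"},
]
context = {"article_topic": "sports"}
reader = {"interests": {"travel"}}
scores = score_ads(ads, context, reader, weights=[1.0, 1.5, -0.5], bias=-4.0)
for ad_id, p in scores.items():
    print(ad_id, round(p, 4))
```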

But how does the site find this function F(x) that does a not-terrible job of predicting the likelihood an ad will lead to a conversion? This is generally done by looking at historical data that comes out of a conversion measurement system, and finding the function F(x) that does the best job predicting the historical results that were observed over the past few weeks.
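
As a rough sketch of what "finding the F(x) that best predicts the historical results" can look like, the following fits a logistic regression by plain gradient descent on a toy dataset. The choice of logistic regression, the hyperparameters, and the data are assumptions for illustration; real systems use far richer models and far more data.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b) -> float:
    """Estimated conversion likelihood for parameters x."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(rows, labels, epochs=500, lr=0.5):
    """Fit weights w and bias b by gradient descent on the mean log-loss."""
    n, n_features = len(rows), len(rows[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = predict(x, w, b) - y  # gradient of the log-loss w.r.t. the logit
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy "historical data": one feature (1.0 if the ad topic matched the article),
# with label 1 if the impression led to a conversion. Made-up values.
rows = [[1.0], [1.0], [1.0], [1.0], [0.0], [0.0], [0.0], [0.0]]
labels = [1, 1, 1, 0, 0, 0, 0, 1]
w, b = fit_logistic(rows, labels)
print(predict([1.0], w, b), predict([0.0], w, b))
```

On this toy data the fitted model assigns a higher conversion likelihood to impressions where the topic matched, which is the pattern present in the historical rows.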

So this use-case is closely related to our discussion of "private measurement". The connection is that one approach to "Private Conversion Optimisation" is to attempt to train an ML model directly on the outputs of a "Private Measurement" system.

The Google Chrome team has explored an approach that I will call "Event DP", and it has shown promise. In short, the idea is to just have the "private measurement" system emit noisy event-level reports.

  • The Chrome team has proposed the "ARA - Event level API" with this goal in mind. link.
  • @csharrison also recently filed an issue to discuss this type of approach in general link
  • The Criteo team recently published an article, documenting their efforts to use the "ARA - Event level API" for exactly this purpose link, and concluded that "Event-level reports provide data that can be used for Machine Learning and campaign optimization".
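
To give a feel for what "noisy event-level reports" can mean, here is a simplified sketch using binary randomized response: each report's conversion bit is flipped with a probability determined by a privacy parameter epsilon, and aggregate rates are corrected for the known noise. This is an illustrative assumption and not the actual mechanism or interface of the "ARA - Event level API"; the function names are hypothetical.

```python
import math
import random


def flip_probability(epsilon: float) -> float:
    """Probability of reporting the opposite of the true bit."""
    return 1.0 / (1.0 + math.exp(epsilon))


def noisy_report(converted: bool, epsilon: float, rng: random.Random) -> bool:
    """Emit one event-level report whose conversion bit may have been flipped."""
    if rng.random() < flip_probability(epsilon):
        return not converted
    return converted


def debiased_rate(noisy_reports: list, epsilon: float) -> float:
    """Estimate the true conversion rate from noisy reports (requires epsilon > 0)."""
    q = flip_probability(epsilon)
    observed = sum(noisy_reports) / len(noisy_reports)
    # E[observed] = q + true_rate * (1 - 2q), so invert that relation.
    return (observed - q) / (1.0 - 2.0 * q)


# Simulated impressions with a made-up 10% true conversion rate.
rng = random.Random(0)
truth = [rng.random() < 0.10 for _ in range(20000)]
reports = [noisy_report(t, epsilon=2.0, rng=rng) for t in truth]
print(sum(reports) / len(reports))          # raw noisy rate, biased upwards
print(debiased_rate(reports, epsilon=2.0))  # corrected estimate
```

Each individual report is unreliable, but the noise is unbiased in a known way, which is why a model or an aggregate estimate built on many reports can still recover the underlying signal.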

But I would like to discuss a more general question:
As we endeavour to standardise a "Private Measurement" API, do we want to explicitly try to solve for the "Private Conversion Optimisation" use-case?

If we do want to support this use-case, there are a variety of approaches we can consider. Returning event-level reports is not the only approach to this problem.

First Goal for this discussion:

Introduce the group to a few high-level approaches to the "Private Conversion Optimisation" problem that have been explored:

  1. Event DP
  2. Label DP
  3. Private Logistic Regression
  4. DP SGD

While there are more, I think that should provide a "taste" of the various types of approaches, and I believe I can cover these in a reasonable amount of time.

Second Goal for this discussion:

Having laid out a bit of a landscape of potential approaches we could take to this problem, see if the PAT-CG members are interested in taking up an explicit goal to try to support this use-case with whatever "private measurement" system we eventually standardise.

Third Goal for this discussion:

Assuming we do want to take up the objective of ensuring our "private measurement" system can support this use-case, get a quick temperature check from the group about how they feel about the 4 approaches outlined above. Are there any approaches the group does NOT want to explore? Are there any they feel particularly optimistic about?

Time

I think I will need 20 minutes to cover the 4 approaches listed above, and I think we need at least 25 minutes for discussion, so a total of 45 minutes.

Links

I've put all the content and links inline here in this issue.

benjaminsavage added the agenda+ label (request to add this issue to the agenda of our next telcon or F2F) on Apr 21, 2023

AramZS commented Apr 24, 2023

Thanks for this clear, detailed session proposal! Added to day 1.

AramZS removed the agenda+ label on May 24, 2023
AramZS closed this as completed on Jul 10, 2023