Preventing conversion fraud: trust token integration w/ event-level API #13

Open
csharrison opened this issue Nov 7, 2019 · 8 comments
Labels: anti-fraud / auth, possible-future-enhancement (Feature request with no current decision on adoption)


@csharrison
Collaborator

csharrison commented Nov 7, 2019

Today on the W3C web advertising call, Ben Savage from Facebook mentioned an interesting case where fraud might occur in the event-level API:

  1. User clicks on an ad on publisher.com, and publisher.com scrapes the impression id from the tag
  2. User does not convert
  3. Some time later publisher.com sends a fake conversion report

Publishers are incentivized to show that they are converting more users than they actually are, so this case seems plausible.

The suggestion would be to augment the API to, at conversion time, have the reporting domain also issue the browser a token attesting that this conversion was legitimate. This token would be included in the subsequent conversion report.
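A minimal sketch of that flow from the reporting domain's side. This is not the trust token protocol: an HMAC tag stands in for the token purely to show where issuance and checking happen, and unlike a real blind-signed token it lets the issuer link issuance to redemption. `SERVER_KEY`, `issue_token`, `accept_report` and the report layout are all hypothetical names.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret held by the reporting domain (assumption, not part of any API).
SERVER_KEY = secrets.token_bytes(32)

# Nonces of tokens that have already been redeemed, to reject replays.
_spent = set()


def issue_token():
    """Called when the reporting domain itself observes a real conversion."""
    nonce = secrets.token_bytes(16)
    tag = hmac.new(SERVER_KEY, nonce, hashlib.sha256).digest()
    return nonce, tag


def accept_report(report):
    """Accept a conversion report only if it carries a valid, unspent token."""
    token = report.get("token")
    if token is None:
        return False
    nonce, tag = token
    expected = hmac.new(SERVER_KEY, nonce, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected) or nonce in _spent:
        return False
    _spent.add(nonce)
    return True
```

A publisher that scraped the impression id but never received a token cannot produce a report that `accept_report` lets through. Note that the token here does not cover the conversion metadata at all.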

Privacy implications

Since the browser can just drop conversions that have invalid tokens, the presence of a token does not reveal any extra information about the conversion metadata.

However, there are implications to how much data the token can sign over. In particular, we can't sign over the conversion metadata, because doing so would reveal when the browser sends a noised value.

@dvorak42, @michaelkleber FYI.

csharrison changed the title from "Preventing conversion fraud: trust token integration" to "Preventing conversion fraud: trust token integration w/ event-level API" on Nov 7, 2019
@benjaminsavage

Thanks csharrison for filing this issue!

In addition to this particular aspect of the threat model, I wanted to call out two more. I think the level of concern is far lower here than it is for the Safari proposal: https://github.com/WICG/ad-click-attribution/

  1. If an entity dislikes a publisher and just wants to interfere with their ads measurement capabilities, they may attempt to send non-existent conversion reports, just to ensure that publisher cannot provide correct reports to their customers. In the Safari proposal this is trivial since randomly generated campaign_ids are very likely to have a high overlap with real campaign IDs given the low number of bits. This does seem like less of an issue for this proposal, as 64-bit campaign IDs are much harder to guess. Also, since the conversion event would be linkable to a particular ad impression, individual abusive users might stand out as outliers in the data and one could attempt to filter them out. Of course, one would not have to do this if there were some kind of token from the reporting domain to certify the conversion was legitimate.

I suppose a sufficiently motivated entity could create a browser plugin that scrapes campaign_ids, and attempt to convince a large number of people to install it. The plugin could then send the scraped campaign_id and reporting domain pairs to a command and control center that generates fake conversion events en masse.

  2. If a particular advertiser wants to interfere with a direct competitor who also advertises on the same platform, they might also have an incentive to send fake conversion reports. The aim would be either to make their competitor believe they were getting far better return on ad spend than they actually were, in an effort to get them to spend a lot of budget on fake conversions, or simply to render their competitor's ads data worthless by flooding it with millions of fake conversions.

As with #1, this is far, far easier in the Safari proposal, where it's trivially easy to guess legitimate campaign IDs. One would probably have to employ a similarly complex browser plugin scheme for the Chrome proposal.

  3. I think you definitely captured this one already. Publishers who utilize ad networks have a strong financial incentive to report fake conversions, as their CPMs are determined by their post-impression conversion rate.

Knowing that a real ad was actually shown to a real person who was actually using their website is not enough. The publisher can, as you say, scrape the campaign_id and send a fake conversion.
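To put rough numbers on the guessing attack in the first two points: the chance that one uniformly random ID lands on a live campaign is the number of live campaigns divided by the size of the ID space. The 6-bit width and the figure of 20 live campaigns below are assumptions chosen only for illustration; the 64-bit width is the one mentioned above.

```python
from fractions import Fraction


def hit_probability(id_bits, live_campaigns):
    """Chance that a single uniformly random ID matches a live campaign."""
    return Fraction(live_campaigns, 2 ** id_bits)


# Assumed values for illustration only.
LIVE_CAMPAIGNS = 20
small_space = hit_probability(6, LIVE_CAMPAIGNS)
large_space = hit_probability(64, LIVE_CAMPAIGNS)

print("6-bit IDs: ", float(small_space))
print("64-bit IDs:", float(large_space))
```

With the small ID space a meaningful fraction of blind guesses hit; with 64-bit IDs an attacker effectively has to scrape real IDs, which is why the plugin scenario matters.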

You mention not being able to sign the conversion metadata due to the differential privacy. This is unfortunate. Could there be an attack whereby a publisher can fraudulently "upgrade" the conversion metadata from low-value "landing page view" conversions up to "purchase" conversions?

@csharrison
Collaborator Author

Thanks for the response and clarification. I actually think the problem in (1) and (2) is worse than you say, because in lots of cases in the display advertising world, ads are rendered in same-origin iframes so other third parties running in the publisher context might be able to see these ids as well.

In this case, no browser extension is needed to scrape impression ids.

> You mention not being able to sign the conversion metadata due to the differential privacy. This is unfortunate. Could there be an attack whereby a publisher can fraudulently "upgrade" the conversion metadata from low-value "landing page view" conversions up to "purchase" conversions?

If a publisher has access to tokens, they can perform this attack. However, it shouldn't be possible to extract tokens from a client's device without infecting it with malware, so the publisher would need to either:
a. Infect legitimate clients (those that get tokens) with malware, and perform the steps you list as well
b. Figure out a way to trick the conversion endpoint into believing the conversion was legitimate, and get tokens that way

@benjaminsavage

Would it be possible to hide the campaignID attribute entirely from JavaScript code to make these scraping attacks more difficult? It seems like the campaignID attribute only needs to be readable by the browser itself, not client code on the page.

@csharrison
Collaborator Author

Yes I think this is possible. Here are some quick and dirty ideas:

  1. Do this with some HTTP-only hooks. We'd need to think through the design but you could imagine e.g. the reporting_domain getting a request on click that asks for an id. An open question is what metadata to include in that request if third party cookies are unavailable. You could imagine a "public" id that is set on the tag, which is sent to the reporting domain, which returns back a "private" id that isn't visible to JS.

  2. Have the impression id generated from a cross-origin iframe in the reporting domain's context. This is really wacky, but you could imagine an ad click sending an event to the reporting domain's iframe, which calls some JS to register a private id for the click.

  3. Just do the blind signature approach, same as conversion registration. Send the clicked impression id to the reporting domain, which sends back a blind signature. Ids are still scrapable, but they are useless without a token. Because the id is not subject to noise, we could probably include it as a sort of "public metadata" in the token, and have the signature sign over the entire id.

@csharrison
Collaborator Author

I thought about this a bit more the other day and I think you can sign the conversion metadata in a way that prevents upgrade attacks.

In the trust token model, you "blind sign" a nonce, without revealing the nonce to the signer. We could do something similar where we blind sign something that embeds the (possibly noised) conversion metadata. The signer won't know the true value of the metadata, but if it trusts the client (i.e. a real transaction occurred), it will sign the blind value.

Later on, the browser will submit a conversion report along with an anonymous token. Critically, the token is bound to the conversion metadata in the previous step. This means that you can't just take any old token and pretend it is the "high value" one.
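A toy sketch of this idea using textbook RSA blind signatures with a single key pair. This is an illustration of the binding property only, not the trust token construction: the key is built from two small known primes, there is no padding, and every name here (`digest`, `blind`, `sign_blinded`, `unblind`, `verify`) is hypothetical.

```python
import hashlib
import math
import secrets

# Toy RSA key from two Mersenne primes. Far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # signer's private exponent (Python 3.8+)


def digest(nonce, metadata):
    """Hash nonce and (possibly noised) metadata together into an integer mod N."""
    h = hashlib.sha256(nonce + bytes([metadata])).digest()
    return int.from_bytes(h, "big") % N


def blind(message):
    """Browser side: hide the message from the signer with a random factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return (message * pow(r, E, N)) % N, r


def sign_blinded(blinded):
    """Signer side: signs without learning the nonce or the metadata."""
    return pow(blinded, D, N)


def unblind(blind_sig, r):
    """Browser side: strip the blinding factor to get a normal signature."""
    return (blind_sig * pow(r, -1, N)) % N


def verify(nonce, metadata, sig):
    """Report receiver: recompute the digest from the report and check it."""
    return pow(sig, E, N) == digest(nonce, metadata)


# Browser picks a nonce, embeds metadata value 3, and gets a blind signature.
nonce = secrets.token_bytes(16)
blinded, r = blind(digest(nonce, 3))
token = unblind(sign_blinded(blinded), r)
```

Because the verifier recomputes the digest from the metadata in the report, the same token fails to verify if the report claims a different metadata value, which is the property that blocks the upgrade attack. The signer uses one key for every value, so the key itself carries no information about the metadata.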

@benjaminsavage

This is similar to the idea from our proposal: https://github.com/siyengar/private-fraud-prevention

The difference is that in our proposal, the nonce which is being signed is a totally random value that was generated by the browser.

We liked the idea of the signature only being valid for the specified value of the conversion metadata. Assuming there are only 3 bits of conversion value (2^3 = 8 possible values), this is trivially achieved by storing 8 public-private key pairs and using the appropriate one. Anyone can look up the public key corresponding to the conversion metadata in the report and validate that the signature is valid for the value of "nonce" in the report.
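A rough sketch of the one-key-per-value idea, again with toy textbook RSA. Blinding is left out to keep the key-selection step visible, the moduli reuse a handful of known primes (which would be insecure in practice), and the names `KEYS`, `PUBLIC_KEYS`, `sign` and `verify` are hypothetical rather than taken from the linked proposal.

```python
import hashlib
from itertools import combinations

E = 65537
# A few known Mersenne primes; pairs of them give 8 toy moduli.
MERSENNE = [2**k - 1 for k in (61, 89, 107, 127, 521)]


def make_key(p, q):
    """Build one toy RSA key pair (modulus n, private exponent d)."""
    return {"n": p * q, "d": pow(E, -1, (p - 1) * (q - 1))}


# One key pair per 3-bit conversion metadata value (0..7).
KEYS = [make_key(p, q) for p, q in list(combinations(MERSENNE, 2))[:8]]
PUBLIC_KEYS = [key["n"] for key in KEYS]  # published by the signer


def digest(nonce, n):
    return int.from_bytes(hashlib.sha256(nonce).digest(), "big") % n


def sign(nonce, metadata):
    """Signer: pick the key that corresponds to the metadata value it observed."""
    key = KEYS[metadata]
    return pow(digest(nonce, key["n"]), key["d"], key["n"])


def verify(nonce, metadata, sig):
    """Anyone: look up the public key for the reported metadata and check."""
    n = PUBLIC_KEYS[metadata]
    return pow(sig, E, n) == digest(nonce, n)


signature = sign(b"browser-chosen-nonce", 5)
```

A signature made for value 5 verifies only against the public key for value 5, so a report cannot relabel the conversion without invalidating it.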

@benjaminsavage

BTW - there is a very similar issue opened on the WebKit proposal. I would love for us all to converge on a common approach to preventing conversion fraud. Care to chime in on that issue @csharrison?

privacycg/private-click-measurement#27

@csharrison
Collaborator Author

Thanks, I am familiar with the proposal. The problem with using public metadata (i.e. 8 different key pairs) is that it is difficult to integrate with noise as I mentioned in the original post.

In the simplest case, if you sign the conversion with your public metadata, when you receive a conversion report, you now have essentially an unnoised piece of conversion metadata. To support noise we need to force the server to sign using only a single key-pair. The simplest way to implement this (just generate a generic token associated with a conversion) is vulnerable to upgrade attacks, but those can be mitigated.
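A small simulation of why per-value keys defeat the noise. Signature checking is modelled abstractly: the server can only ever sign for the value it actually observed, so a report validates only when the reported value equals the signed one. The 5% noise rate, the uniform replacement rule and all names are assumptions for illustration.

```python
import random

NOISE_RATE = 0.05   # assumed noise probability, for illustration only
NUM_VALUES = 8      # 3 bits of conversion metadata


def make_report(true_value, rng):
    """Browser: with some probability, replace the metadata with a random value."""
    noised = rng.random() < NOISE_RATE
    reported = rng.randrange(NUM_VALUES) if noised else true_value
    # The signature was issued under the key for the value the server saw.
    return {"signed_for": true_value, "reported": reported, "noised": noised}


def validates(report):
    """Receiver: the check passes only under the key matching the signed value."""
    return report["reported"] == report["signed_for"]


rng = random.Random(0)
reports = [make_report(rng.randrange(NUM_VALUES), rng) for _ in range(10_000)]

failing = [r for r in reports if not validates(r)]
passing = [r for r in reports if validates(r)]
changed = [r for r in reports if r["reported"] != r["signed_for"]]
```

Every report that fails validation is known to be noise, and every report that passes carries the true value, so the receiver ends up with unnoised metadata. Forcing a single key pair removes that signal.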

maudnals added the "inactive?" label (Issue may be inactive) on Jun 17, 2021
csharrison added a commit that referenced this issue on Nov 10, 2022
Partially addresses #13, for aggregatable triggers only.

* Add trigger_attestation.md
* Fix typo
* review comments
* update TOC

Co-authored-by: John Delaney <40465606+johnivdel@users.noreply.github.com>
csharrison added the "possible-future-enhancement" label (Feature request with no current decision on adoption) and removed the "inactive?" label (Issue may be inactive) on Apr 10, 2023