
Multiple concerns with this proposal #20

Closed · fregas opened this issue Mar 19, 2020 · 6 comments

@fregas commented Mar 19, 2020

Hello all. I have read the spec twice and I believe I have the gist of it, even if I don't completely understand how all the details would work in practice. Full disclosure: I work for a DSP in the real-time bidding (RTB) programmatic advertising industry. As such, my concerns and questions focus specifically on what this proposal would mean for my company and similar ones in our space, and on what it would do to our clients and their expectations of our product offerings.

Here are some of my questions/concerns around specific proposals from the README:

the site operator can add people to a number of interest groups. 

How are they going to make a consistent list of "interest groups"? (Think of one site that uses "automobiles" while another uses "cars".) Doesn't this create a lot of work for website owners? Will every new piece of content need to be tagged with interest groups for advertisers?

This model can support the use case where an advertiser is unwilling to run their ad on pages about a certain topic,

The gist of this seems to be that ad networks must rely on accurate topic choices by the web publisher. This also seems ripe for fraud from low-quality sites that want to put as many people as possible into as many interest groups as possible, since the ad network has no way of determining this information for itself, or of blocking "low-quality domains" that try to show as many ads as possible with minimal relevant content or content of poor quality.

The motivating use cases seem centered almost exclusively on site retargeting and segments, but there are a lot of use cases outside of retargeting.

I'm concerned about the "equalizing effect" this will have on the ad industry, where new products and innovation become basically impossible, since everyone will have to use the same exact segmentation/interest-group approach. It will no longer be possible to infer new data elements based on user behavior. Maybe this is the intention: to put many or most independent ad agencies out of business and prevent new ones from appearing.

Someone recently asked about ML approaches that would also be severely restricted since we are relying on the browser (and vendors) to do everything. This echoes my concern.

If the winning ad is interest-group targeted, then the browser renders it inside some sort of new environment, an "opaque iframe", which does not allow information exchange with the surrounding page: no postMessage, no way to crawl the window tree using window.parent or window.frames[], etc.

Our advertisers definitely want to know what in-page keywords or domains won the bid, for future improvement. In this case, we'd have no way of knowing this, and therefore no way of focusing on targeting domains that have better conversion or click-through rates, or pages (even without domain information) that have better keywords, since we have no way of seeing the DOM tree of the winning page in this opaque-iframe model. This approach severely limits the ability of advertisers to see ROI.

Similarly, budgeting precision for interest-group-targeted ad campaigns will suffer when the interest-group requests happen only a few times per day. Since interest-group-targeted ads tend to be relatively valuable to advertisers, we expect this loss of budget precision will be a cost worth paying.

This is probably fine for larger advertisers and ad networks with large budgets that can absorb temporary overspends. However, smaller advertisers that count advertising budgets in the tens of dollars instead of thousands will likely balk at such a lack of granular control over budget. We experience this firsthand with our clients.

Blind rendering in opaque iframes. This is hard because it requires ads that can render without network access, and also requires the switch to aggregate reporting. 

Again, client advertisers currently use 3rd-party ad servers so they can have a neutral third party measure ad effectiveness via metrics (clicks, impressions, etc.). Disabling network access for ads eliminates this possibility entirely. We seem to be asking these advertisers to just "trust the web browser(s)" for measurement, which will likely not be acceptable to many.

Other questions:

  • How do ad networks get notified that their ad was downloaded, viewed, or interacted with (clicked)? How do we determine conversion metrics?
  • Is TURTLEDOVE intended to eventually expand to native mobile ads (iPhone, Android, in-app)?
  • Wouldn't having auctions run client-side potentially be ripe for fraud? A malicious script altering the execution of bidding seems possible with this model. Would the bidding JS code be completely sandboxed so that the publisher's page cannot alter it?
@michaelkleber (Collaborator)

Hello! Your DSP RTB use case is definitely one that we want to support with TURTLEDOVE. Thanks for commenting, let's go through it all.

the site operator can add people to a number of interest groups.

How are they going to make a consistent list of "interest groups"? (Think of one site that uses "automobiles" while another uses "cars".) Doesn't this create a lot of work for website owners? Will every new piece of content need to be tagged with interest groups for advertisers?

The "site operator" in that quote is an advertiser, or some other buy-side party. That is, a TURTLEDOVE interest group is created by some party on the buy-side, and then targeted by that same party. If you build audiences based on 3p cookies today, this API is about a new way for you to keep building your audiences in a post-3p-cookies world.

To be clear, publishers can absolutely tag all of their content with their opinions of the topic, or of the people it will attract, if they want to put in all that work. But those targeting signals would fall into the "contextual / first-party" bucket, and they would work much as they do today; that's not what the TURTLEDOVE proposal is about.
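For concreteness, the README sketches this buy-side flow: when someone visits the advertiser's site, the advertiser (or a DSP acting on its behalf) asks the browser to join a group the advertiser owns, roughly like this:

```js
// Per the README's sketch: a shoe store adds this browser to its own
// "athletic-shoes" interest group, readable by its two ad networks.
const kSecsPerDay = 24 * 60 * 60;
const myGroup = {
  owner: 'www.wereallysellshoes.com',
  name: 'athletic-shoes',
  readers: ['first-ad-network.com', 'second-ad-network.com'],
};
window.navigator.joinAdInterestGroup(myGroup, 30 * kSecsPerDay); // 30-day membership
```

The same party that made this call is the one that later targets the group, which is why no cross-site vocabulary agreement is needed.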

This model can support the use case where an advertiser is unwilling to run their ad on pages about a certain topic,

The gist of this seems to be that ad networks must rely on accurate topic choices by the web publisher. This also seems ripe for fraud from low-quality sites that want to put as many people as possible into as many interest groups as possible, since the ad network has no way of determining this information for itself, or of blocking "low-quality domains" that try to show as many ads as possible with minimal relevant content or content of poor quality.

There's one paragraph in the proposal about this topic (it starts "In the case where the interest-group response came from some other ad network that buys through the publisher's ad network..."), but I didn't go into much detail, so let's explore a little more here. If it helps, I can incorporate this discussion back into the doc itself.

Here's what happens at the time of a page visit, calling out the things that I glossed over in the explainer:

  1. Person navigates to publisher page

  2. Publisher's ad network has a script on the page which issues the contextual/1p ad request to their ad server, like today. This includes all the normal information about what page the ad would appear on.

  3. Server-side, some exchange sends RTB call-outs to various DSPs, including contextual and 1p signals. In today's world, the responses are bids that go into an auction.
    In a TURTLEDOVE world: The DSP's response could include more stuff — some signals encoding that DSP's opinion about the topic of the publisher page.

  4. Exchange runs an auction, and ends up shipping the best contextual+1p-targeted creative back to the browser.
    In a TURTLEDOVE world: The thing sent back to the browser can also include "contextual signals", which can come from anyone involved in the whole server-side fan-out. So the publisher's ad network, the exchange, and all the RTB participants have an opportunity to offer up signals that make it back to the browser.

  5. In-browser auction time. Each interest-group-targeted ad came from some buyer, and came along with a JS bidding function. That JS bidding function could use signals sent by that same buyer in steps 3+4. It could also use signals provided by the publisher's ad network, if the ad network offers them up in some documented way and the buyer decides those signals are useful.

Of course the details on how this works would all need to be decided by the ads industry — the browser never sees what information gets passed around among servers as part of the RTB process. My point is that it's entirely possible for a DSP or other buyer to do all of their decision-making on their own, without relying on anyone else's judgement, if the RTB industry agrees on that model.
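To make step 5 concrete, here is one hypothetical shape such a bidding function could take. The explainer deliberately leaves the signature and the contents of the signals open for discussion, so every name below is illustrative rather than specified:

```js
// Hypothetical sketch only: the explainer does not fix this signature, and
// all field names here are illustrative. The buyer ships this function with
// its interest-group ad; the browser calls it at on-device auction time.
function bidForInterestGroupAd(adSignals, contextualSignals) {
  // adSignals: sent by this buyer with its interest-group ad response.
  // contextualSignals: sent back by this buyer (or shared by the publisher's
  // ad network) on the contextual ad request for the current page (steps 3-4).
  if (contextualSignals.dspTopicOpinion === 'brand-unsafe') {
    return 0; // decline to bid on pages this buyer itself classified as unsuitable
  }
  const base = adSignals.baseBidCpmMicros;               // buyer-set base price
  const boost = contextualSignals.pageQualityScore ?? 1; // buyer's own page score
  return base * boost; // the highest in-browser bid wins the slot
}
```

Note that the function only consumes signals its own author chose to send; it never touches the publisher page.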

Does this clear some things up? The rest of my replies will skip around a bit, because I think the fact that interest groups and page signals both come from the buyer addresses several of the other questions you asked.

The motivating use cases seem centered almost exclusively on site retargeting and segments, but there are a lot of use cases outside of retargeting.

I agree. I'm trying to support a range of use cases where a buyer builds an audience. (Since TURTLEDOVE is a browser API, it does impose the additional requirement that you can only build an audience out of browsers in which you had an opportunity to run some code.)

Our advertisers definitely want to know what in-page keywords or domains won the bid, for future improvement. In this case, we'd have no way of knowing this, and therefore no way of focusing on targeting domains that have better conversion or click-through rates, or pages (even without domain information) that have better keywords, since we have no way of seeing the DOM tree of the winning page in this opaque-iframe model. This approach severely limits the ability of advertisers to see ROI.

You should be able to get at this information using the Aggregated Reporting API. You can learn in-page keywords or domains as part of the contextual ad request, pass them back in your contextual signals, use that information in your JS bidding function, and report on that information after you win.

Aggregated reporting will mean that your report can't identify individual events. But as long as you're reporting on something that is aggregated over many different ad impressions, you should be able to satisfy your advertisers' needs for insight.
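The exact shape of the Aggregated Reporting API was still under design when this thread was written, so the following is only a sketch of the kind of analysis it is meant to enable: the buyer receives noisy counts bucketed by dimensions it chose (here, the publisher domain it passed through its contextual signals) and derives per-domain click-through rates from them. The report format is hypothetical:

```js
// Hypothetical aggregate report: each row is aggregated over many impressions
// and noised by the browser/aggregation service, never tied to individuals.
const aggregateReport = [
  { dimensions: { domain: 'news.example' }, impressions: 120400, clicks: 3010 },
  { dimensions: { domain: 'blog.example' }, impressions:  80400, clicks:  410 },
];

// Derive per-domain CTR to steer future targeting toward better domains.
for (const row of aggregateReport) {
  const ctr = row.clicks / row.impressions;
  console.log(`${row.dimensions.domain}: CTR ${(ctr * 100).toFixed(2)}%`);
}
```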

Similarly, budgeting precision for interest-group-targeted ad campaigns will suffer when the interest-group requests happen only a few times per day. Since interest-group-targeted ads tend to be relatively valuable to advertisers, we expect this loss of budget precision will be a cost worth paying.

This is probably fine for larger advertisers and ad networks with large budgets that can absorb temporary overspends. However, smaller advertisers that count advertising budgets in the tens of dollars instead of thousands will likely balk at such a lack of granular control over budget. We experience this firsthand with our clients.

Can you give me a sense of your needs here? If you're dealing with a budget in the tens of dollars, how much time is that spread over?

Blind rendering in opaque iframes. This is hard because it requires ads that can render without network access, and also requires the switch to aggregate reporting.

Again, client advertisers currently use 3rd-party ad servers so they can have a neutral third party measure ad effectiveness via metrics (clicks, impressions, etc.). Disabling network access for ads eliminates this possibility entirely. We seem to be asking these advertisers to just "trust the web browser(s)" for measurement, which will likely not be acceptable to many.
[...]

  • How do ad networks get notified that their ad was downloaded, viewed, or interacted with (clicked)? How do we determine conversion metrics?

Ad networks and neutral third parties will still be able to measure things, using the Aggregated Reporting API as well. Specifically for navigational clicks and conversions, check out the dedicated Conversion Measurement API.
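At the time of this thread, the event-level Conversion Measurement explainer sketched registering clicks via attributes on the ad's anchor element and registering conversions via a redirect on the advertiser's site. The attribute names below are from that early explainer and have since evolved (the proposal became the Attribution Reporting API), so treat this as a rough sketch:

```js
// Early (2020) event-level Conversion Measurement sketch; names later changed.
const adLink = document.createElement('a');
adLink.href = 'https://shoes.example/athletic-shoes';
adLink.textContent = 'Shop athletic shoes';
adLink.setAttribute('conversiondestination', 'https://shoes.example'); // where conversions occur
adLink.setAttribute('impressiondata', '200400600');                    // click-side metadata
adLink.setAttribute('reportingdomain', 'https://ad-tech.example');     // who receives reports
document.body.appendChild(adLink);

// Later, on shoes.example, a conversion pixel redirects through roughly:
//   https://ad-tech.example/.well-known/register-conversion?conversion-data=4
// and the browser schedules a delayed, noised report to the reporting domain.
```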

  • Is TURTLEDOVE intended to eventually expand to native mobile ads (iPhone, Android, in-app)?

This is a web-only proposal for now, but it certainly seems like something that could later expand to support native mobile, if the mobile platforms like the idea.

  • Wouldn't having auctions run client-side potentially be ripe for fraud? A malicious script altering the execution of bidding seems possible with this model. Would the bidding JS code be completely sandboxed so that the publisher's page cannot alter it?

Yes, that's right: each bidding script would be run by the browser in an isolated environment, where it and the publisher page cannot interact.

@PedroAlvarado

In-browser auction time. Each interest-group-targeted ad came from some buyer, and came along with a JS bidding function. That JS bidding function could use signals sent by that same buyer in steps 3+4. It could also use signals provided by the publisher's ad network, if the ad network offers them up in some documented way and the buyer decides those signals are useful.

@michaelkleber Is it fair to say that a JS bidding function executing on-device will have access to all the signals regardless of source? That is, can these bidding functions read adSignals and contextualSignals originating from different ad networks?

@michaelkleber (Collaborator)

I was imagining that each piece of in-browser JS would receive signals from one ad network — the same ad network that wrote the JS in the first place.

It seems reasonable for multiple ad networks to make some sort of agreement with each other to consume one another's signals if they mutually decide to do so. But nobody should be required to share signals if they don't want to... and the browser can deploy encryption to preserve that.
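Purely as an illustration of that last point (the mechanism is not specified anywhere), the browser-bound signals could be delivered as per-reader ciphertexts, assuming each network published a public key:

```js
// Hypothetical only: one way signals could stay private per network.
// The browser would hand each bidding function only the entry addressed to
// its own network, so cross-network sharing requires an explicit agreement.
const contextualResponse = {
  signalsByReader: {
    'first-ad-network.com':  '<ciphertext only first-ad-network can decrypt>',
    'second-ad-network.com': '<ciphertext only second-ad-network can decrypt>',
  },
};
```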

@PedroAlvarado

Being able to leverage the signals already present in the browser, as a result of new agreements between ad networks, could yield easier and faster integrations while also potentially reducing resource consumption (e.g., redundant or duplicate signals between networks). It'd be great for this idea to be considered early on, at least from a design perspective.

@fregas (Author) commented Apr 14, 2020

Hi @michaelkleber thanks for getting back to me. Sorry about the slow response here, but as you can imagine, things have been hectic.

Here are my thoughts on your responses.

The "site operator" in that quote is an advertiser, or some other buy-side party

That makes more sense, thank you.

That JS bidding function could use signals sent by that same buyer in steps 3+4. It could also use signals provided by the publisher's ad network, if the ad network offers them up in some documented way and the buyer decides those signals are useful.

My point is that it's entirely possible for a DSP or other buyer to do all of their decision-making on their own, without relying on anyone else's judgement, if the RTB industry agrees on that model.

Does this clear some things up?

I think so. Essentially, the DSP or similar 3rd party is the one tagging people into interest groups and also the one targeting them, just not as individuals but in bulk. So it controls both sides of the system. My confusion was the idea that the website/publisher controlled the tagging.

The motivating use cases seem centered almost exclusively on site retargeting and segments, but there are a lot of use cases outside of retargeting.

I agree. I'm trying to support a range of use cases where a buyer builds an audience.

Our DSP has a lot of use cases that rely on 3rd-party cookies to some degree but have nothing to do with site retargeting or traditional "segments."

Aggregated reporting will mean that your report can't identify individual events. But as long as you're reporting on something that is aggregated over many different ad impressions, you should be able to satisfy your advertisers' needs for insight.

That also sounds less disruptive than losing all insight into keywords or domains. Thank you.

Can you give me a sense of your needs here? If you're dealing with a budget in the tens of dollars, how much time is that spread over?

With some campaigns we alter bidding, budgets, keywords, and other targeting info throughout the day. And of course, we have scripts and machine-learning models that do some of this automatically. So we may notice a trend and then modify a keyword or bid immediately across multiple campaigns. I'm not sure whether the aggregated reporting being proposed here would give us the granularity to be this reactive. How often could we get this data?

Ad networks and neutral third parties will still be able to measure things, using the Aggregated Reporting API as well.

But in those cases, they are essentially pulling the same exact data from the browser, right? We're asking everyone to "trust the browser" instead of coming up with their own methodology for ensuring all of this is measured correctly. That is a benefit in one way, since the numbers should always match up, but it has the downside that, as the technology changes, there's no way for a 3rd party to come up with better filtering of, say, fraudulent clicks or impressions that weren't viewable.

each bidding script would be run by the browser in an isolated environment, where it and the publisher page cannot interact.

That sounds good.

@JensenPaul (Collaborator)

Closing this issue as it represents past design discussion that predates more recent proposals. I believe some of this feedback was incorporated into the Protected Audience (formerly known as FLEDGE) proposal. If you feel further discussion is needed, please feel free to reopen this issue or file a new issue.
