Auction Performance Testing #287

Open
MarcoLugo opened this issue Apr 12, 2022 · 4 comments

@MarcoLugo

To better understand the FLEDGE API and its behavior, I created a containerized environment to use the API for both manual and automated tests and see what I could learn.

One of the tests was an attempt to observe what would occur if Chrome had to deal with computationally-intensive bidders in the FLEDGE auction. The summary below expands on this test.

Setup

In the auction, we have 201 participants and the experiment tries to discover what happens when a strict subset of these bidders require significant computation. In practical terms, this was done with an infinite loop within the bidding function. Auctions are repeated many times with a randomized number of infinite-loop bidders in order to try to assess the impact of their presence.
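As a sketch of what such a stalling bidder might look like (the `stall` signal, ad values, and URL are illustrative assumptions, not taken from the actual test harness), a FLEDGE `generateBid()` worklet function could be:

```javascript
// Hypothetical bidding worklet script. When per-buyer signals ask it to
// stall, it spins forever and relies on the browser's per-buyer timeout
// to cut it off; otherwise it returns an ordinary bid.
function generateBid(interestGroup, auctionSignals, perBuyerSignals,
                     trustedBiddingSignals, browserSignals) {
  if (perBuyerSignals && perBuyerSignals.stall) {
    while (true) {}  // never returns; simulates a computationally-intensive bidder
  }
  return {
    ad: 'example-ad-metadata',
    bid: 1,
    render: interestGroup.ads[0].renderUrl,
  };
}
```

In the experiment described above, a randomized subset of the 201 bidders would take the stalling branch on each auction run.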

Expected

I expected either the infinite-loop bidders would time out without affecting the rest of the auction participants or, if the system was not robust enough to handle this, the auction to freeze and thus fail.

Results

The reality landed somewhere in the middle of the expectations spectrum. Auctions did conclude and produce winners but a pattern emerged very clearly: more computationally-intensive bidders translated into more time for the auction to conclude. In some cases, this could mean seconds more to conclude. See the graphs below for more detail:

One Bidder per Buyer
[graph omitted]

More Than One Bidder per Buyer
[graph omitted]

This is problematic because the longer the auction takes, the longer the ads take to display, which would likely lead to higher bounce rates for websites, lower ad performance, and a degraded overall user experience.

This test also proved to be a laptop battery hog, substantially reducing battery life. I think ultimately everyone would benefit from quantifying the impact of FLEDGE on battery life under more normal conditions.

Caveats

  • The computationally-intensive bidder was created with a DoS or stress test in mind. One could argue that this is merely an edge case. However, it is reasonable to expect some bidders to be compute-heavy. A more realistic test could be warranted in the near future.
  • The test was run on a single laptop, not across many different devices. We cannot count on everyone having powerful hardware: an appreciable number of people still have only two physical CPU cores, and Steam’s statistics may be biased towards higher-end computers.
  • The test was run as the only CPU-intensive task on the computer. Under normal conditions, the user could be running other demanding processes that would make the browser compete for resources and degrade user experience beyond the browser.
  • There could be a mistake in my understanding of the API or in the experiment setup and if so maybe this is an opportunity to clarify certain things and come up with a better experiment. The fact that the test harness is open sourced may help mitigate this and hopefully enable others to build on top of it if they wish to.

Takeaways

If the results are correct, and assuming that the number of bidders, as well as the complexity of the bidders themselves, will increase over time, then we may run into the performance issues outlined above. I welcome WebAssembly as a way to let bidders do more with the same computing resources, and even perform calculations that would not be possible without it, but I do not think WebAssembly alone would fix these issues; it would perhaps just delay their appearance. Accepting a potentially unbounded number of bidders with a finite amount of computing resources does not seem like a sustainable path forward. The suggestions in #79 and/or #268 could be among the possible solutions.

@MattMenke2
Contributor

Those results seem largely in line with expectations.

Buyer scripts run in their own isolated processes, for security reasons. Since processes are fairly heavyweight, Chromium has a global limit of 10 such processes (and a limit of 3 for seller processes), which explains the step-function-like behavior, particularly when no IGs share owners. These values were chosen arbitrarily, and we may experiment with them down the line. Currently, network requests are issued by those processes, so we only load resources for buyers/sellers that we currently have a process for, though one could imagine doing a bit better here.
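Under these limits, a rough back-of-the-envelope model (an illustrative assumption, not a measurement from the thread) of the delay added by stalled bidders would be: with at most `processes` buyer processes running concurrently and each stalled bidder running until its per-buyer timeout, N stalled bidders add roughly ceil(N / processes) timeout rounds of delay.

```javascript
// Back-of-the-envelope model of auction delay from stalled bidders,
// assuming a global limit of `processes` concurrent buyer processes
// (Chromium's default is 10) and a per-buyer timeout in milliseconds
// (the spec default is 50 ms). This is a simplification: it ignores
// process startup, network time, and scheduling details.
function worstCaseStallDelayMs(stalledBidders, processes = 10, timeoutMs = 50) {
  return Math.ceil(stalledBidders / processes) * timeoutMs;
}
```

This simple model reproduces the step-function shape: delay only jumps each time the number of stalled bidders crosses a multiple of the process limit.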

Per spec, sellers concerned about slow buyers can set an auctionConfig's perBuyerTimeouts to limit how long buyer scripts are allowed to run, with a default timeout of 50 milliseconds. That timeout is why these auctions completed instead of hanging. You can lower this and see how it affects auction performance.
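A minimal sketch of such a configuration (the seller and buyer origins, URLs, and timeout values here are placeholders, not from the thread):

```javascript
// Hypothetical auctionConfig showing perBuyerTimeouts. Per-buyer entries
// override the '*' fallback; values are in milliseconds.
const auctionConfig = {
  seller: 'https://seller.example',
  decisionLogicUrl: 'https://seller.example/decision-logic.js',
  interestGroupBuyers: ['https://buyer1.example', 'https://buyer2.example'],
  perBuyerTimeouts: {
    'https://buyer1.example': 20,  // stricter limit for a known-slow buyer
    '*': 50,                       // fallback matching the default timeout
  },
};
// In a page context, this would be passed to navigator.runAdAuction(auctionConfig).
```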

There's currently no way to specify network timeouts, though that will likely be added (Some discussion in #280).

There has been discussion around ways to limit the buyers that can participate in a single auction, as you point out. This will clearly be needed, though the exact details are very much to be worked out. Browsers should generally try to put as much power as they can in the hands of the sellers running the auction, since sellers are really the only ones who know how large an auction they want to run, how long they want to wait for results, and what makes them consider a buyer's script not worth waiting on, or even running.

It's also generally up to the sellers to weed out malicious buyers - the browser isn't in a position to decide what's malicious and what's not, particularly when the seller is explicitly listing a buyer in its auctionConfig. If a seller feels that slow buyers that don't make bids are malicious, browsers should give them tools to identify and weed them out (presumably by no longer listing them as a buyer).

It would be good to invest more in understanding the causes of slowness. Some relevant potential causes of slowness:

  • Loading interest groups.
  • Launching one process per IG owner / seller.
  • Network requests (both when served from the cache and when not)
  • Script parsing/compilation/execution. The first two of these potentially play less of a role if one buyer repeatedly uses the same script.
  • Process limit.

In different cases, it's likely different parts of the problem will dominate. All of these potentially come into play when increasing the number of IGs or buyers. It looks like in your experiment, script execution may well have dominated the cost (though it is unclear what "more than one bidder" means - two bidders with the same scripts may be quite a bit different from 10 bidders with different scripts, etc.).

@MarcoLugo
Author

Thank you for the thoughtful reply. I will take some time to explore the different potential causes you suggest and add the results on this thread.

Regarding this behavior being largely in line with expectations: I agree that the 50 ms timeout, along with always having at least one legitimate bidder per auction, is what allowed auctions to conclude. However, I was mistakenly expecting the timeout to act globally, such that all bidders would time out roughly 50 ms from the start of the auction, not 50 ms from the start of their own participation in it. One could argue that the current behavior is fair to each bidder but, as seen above (potential experiment issues aside for now), it may ultimately lead to an undesirable outcome at the collective or aggregate level. If I am understanding correctly, this would only become an issue after 10 or so bidders?

@MattMenke2
Contributor

I've added a FLEDGE tracing category to about:tracing. You can go there, start recording, choose custom categories, and select FLEDGE, run an auction in another tab, then return and stop tracing. The trace will have a list of "auction" and "bid" objects (a bid covers the phases of running generateBid() and then scoreAd() for a single IG, though there's some funkiness around component auctions; the auction object records once-per-auction events, like loading interest groups).

The traces are not remotely user friendly, unfortunately - they do let you deduce how long things are blocked on process startup, but don't explicitly show process startup, or process sharing, for instance.

Anyhow, it may or may not be useful when experimenting with performance. I'm happy to take a look at exported logs as well. Note that exported traces do include interest group and seller URLs.

@MattMenke2
Contributor

Also, it's only available in Chrome Canary, 103.0.5056.0 or later (which was just released today).
