High volume of source-storage-limit #1103

Open
michal-kalisz opened this issue Nov 9, 2023 · 8 comments

Comments

@michal-kalisz

Hi,

For approximately 6% of all attempts to register ARA sources, we encounter the source-storage-limit. For PCs, it reaches 10% (for phones, it's about 3%).
For some source websites (often those for which we display the most ads), it goes above 20% (up to a maximum of 33%).

ARA is currently being tested only on a small scale, and more ad tech companies are likely to adopt it as 3rd-party cookies are phased out, so this problem will become more noticeable. In particular, larger players with many ads and partners could hit the limit, preventing new registrations until old ones expire.

Maybe a good solution would be to introduce a storage limit that also considers the reporting origin. Additionally, it seems important to have a mechanism that allows overwriting or deleting previous registrations.

Another idea is to define separate limits for the different types of ARA sources (event, navigation). For instance, an event-type source can be registered multiple times, even without any user interaction, unlike a navigation-type source.
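
To make the two proposals above concrete, here is a minimal sketch of a storage check that counts sources per reporting origin and per source type. It is only an illustration of the idea, not how the browser implements the limit. The class name, the method name and all limit values are assumptions made up for this example.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical limits, chosen only for illustration; not values from the API.
PER_REPORTING_ORIGIN_LIMIT = 3
PER_TYPE_LIMITS = {"event": 4, "navigation": 2}


@dataclass
class SourceStore:
    """Toy model of the pending sources stored for a single source origin."""

    by_reporting_origin: Counter = field(default_factory=Counter)
    by_type: Counter = field(default_factory=Counter)

    def try_register(self, reporting_origin: str, source_type: str) -> bool:
        """Return True if the source is stored, False if a limit rejects it."""
        # Separate budget per source type, so event sources cannot
        # crowd out navigation sources.
        if self.by_type[source_type] >= PER_TYPE_LIMITS[source_type]:
            return False
        # Separate budget per reporting origin, so one ad tech cannot
        # use up the storage shared with the others.
        if self.by_reporting_origin[reporting_origin] >= PER_REPORTING_ORIGIN_LIMIT:
            return False
        self.by_type[source_type] += 1
        self.by_reporting_origin[reporting_origin] += 1
        return True
```

With these made-up limits, a fourth event source from the same reporting origin is rejected, while a navigation source from another reporting origin is still accepted.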

Best regards,
Michal

@csharrison
Collaborator

Thanks for the report. Let me also cc @agarant and @akashnadan. We can look into this.

@agarant
Collaborator

agarant commented Nov 10, 2023

Thanks for the report @michal-kalisz. This limit will be increased from 1024 to 4096, effective from M120. We will be monitoring the impact of the increase. In parallel, we are looking into additional mitigation measures.

@AramZS

AramZS commented Nov 13, 2023

I generally think that "a mechanism that allows overwriting or deleting previous registrations" seems wise, especially if it can be handled both by count and by date (delete x registrations, starting with the least recently registered and moving toward the most recently registered).
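
A rough sketch of what deletion by count and by date could look like. Nothing like this exists in the API today. `StoredSource`, `evict_oldest` and their parameters are hypothetical names used only to illustrate the ordering.

```python
from dataclasses import dataclass


@dataclass
class StoredSource:
    source_id: str        # hypothetical identifier, for illustration only
    registered_at: float  # registration time, in seconds since the epoch


def evict_oldest(sources, count=0, older_than=None):
    """Return the sources that remain after eviction.

    Drops every source registered before `older_than` (if given), then
    drops up to `count` more, least recently registered first.
    """
    ordered = sorted(sources, key=lambda s: s.registered_at)
    if older_than is not None:
        ordered = [s for s in ordered if s.registered_at >= older_than]
    # Slicing removes the `count` oldest of whatever is left.
    return ordered[count:]
```

For example, `evict_oldest(sources, count=1, older_than=20.0)` first drops everything registered before t=20, then the single oldest remaining source.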

@michal-kalisz
Author

Hi!

Is there any update regarding this issue?

After increasing the limit, 1.8% of sources are not being registered due to the source-storage-limit.

Do we have any insight into whether the suggestion to adjust this limit on a per-reporting-origin basis was addressed?

Michal

@johnivdel
Collaborator

Thanks @michal-kalisz, that is useful data.

We've discussed adding a separate reporting-origin scoped limit here, but likely this would be a lower limit. We can look more into this given the numbers here.

I'd be interested to understand if there are any patterns which result in this limit being hit more frequently. Are multiple impressions for the same ad a significant contribution? One thing we have discussed in the past is allowing for impression deduplication: when a source is registered it provides a dedup key which deletes previous registrations that share that same key.

It would be helpful to know if this kind of approach would help.
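
For reference, the deduplication idea could behave roughly like the sketch below. This is only a toy model under assumed semantics, not a proposed design. The `dedup_key` parameter, the class name and the limit value are all hypothetical.

```python
class DedupingSourceStore:
    """Toy store in which a new source replaces older ones with the same key."""

    def __init__(self, limit):
        self.limit = limit  # hypothetical storage limit
        self.sources = []   # (dedup_key, payload) pairs, oldest first

    def register(self, payload, dedup_key=None):
        """Store a source; return False if the storage limit rejects it."""
        if dedup_key is not None:
            # Delete earlier registrations that share this dedup key, so
            # repeated impressions of one ad occupy a single slot.
            self.sources = [s for s in self.sources if s[0] != dedup_key]
        if len(self.sources) >= self.limit:
            return False
        self.sources.append((dedup_key, payload))
        return True
```

In this model, repeated impressions registered with the same key never consume more than one slot, so only distinct keys (and sources without a key) count toward the limit.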

@michal-kalisz
Author

Hi John,

I apologize for the delayed response.

The idea seems very interesting. However, I'd like to revisit the idea of dividing the limit by source type. Registrations of the "event" type may occur frequently, while "navigation" requires user interaction and thus happens less often. It would therefore be worth considering such a division, so that one type does not crowd out the other.

From what we've observed, the problem occurs across many SSPs and many publishers. For some publishers the percentage of reported errors is high, with one of the large publishers even reaching 26% for 10% of users.

A more interesting approach may be to look at it per user. Looking at the entire month, 90% of users who hit the source-storage-limit at least once had fewer than 287 ARA source registrations (calculations based on verbose debug reports). The higher percentiles are:
p95 - 562,
p98 - 1087,
p99 - 1641.
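
For anyone who wants to reproduce this kind of per-user breakdown, here is a minimal sketch of the calculation. It assumes you can attribute each registration to a user identifier on your own side, which is an assumption of this example and not something the debug reports provide directly. The function names are made up, and the nearest-rank method is just one common percentile definition.

```python
import math
from collections import Counter


def registrations_per_user(user_ids):
    """Count registrations per user from one user id per registration."""
    return Counter(user_ids)


def nearest_rank_percentile(values, p):
    """Return the p-th percentile (integer 0 < p <= 100), nearest-rank method."""
    ordered = sorted(values)
    # Smallest rank whose share of the data is at least p percent.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]
```

Typical use would be `nearest_rank_percentile(registrations_per_user(ids).values(), 90)`, restricted to users who hit the limit at least once.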

I'd be happy to discuss this further at Monday's meeting.

If you would like any additional statistics, please let me know.

Michal

@arpanah
Collaborator

arpanah commented Apr 26, 2024

cc: @akashnadan @vikassahu29
