New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ambiguity on "maximum data size" limit and enforcement #38

Closed
igrigorik opened this Issue Oct 26, 2016 · 16 comments

Comments

Projects
None yet
4 participants
@igrigorik
Member

igrigorik commented Oct 26, 2016

The user agent should restrict the maximum data size to ensure that beacon requests are able to complete quickly and in a timely manner.

The sendBeacon method returns true if the user agent is able to successfully queue the data for transfer. Otherwise it returns false.

(note) If the user agent limits the amount of data that can be queued to be sent using this API and the size of data causes that limit to be exceeded, this method returns false. A return value of true implies the browser has queued the data for transfer.

As of today, the spec does not specify a specific maximum data size limit. This was an intentional decision when we were iterating on the early drafts; we wanted to allow UA's to experiment and adjust the limits in the future as needed. Fast forward a few years...

  • Edge and Firefox enforce a 64KB limit per beacon.
  • Chrome enforces a 64KB quota across all beacon-initiated requests within same nav context.

Technically, all implementations are "spec compliant", because we did not spell out the limit or exact conditions for how it should be enforced. However, the differences are also causing failures in Chrome for the proposed web-platform-tests: web-platform-tests/wpt#4024 (comment). Some thoughts and options...

a) I don't think we should spec a hard limit; I think it still makes sense to allow UAs to adjust this if and when needed. However, as a matter of general guidance to developers, I think we probably should add a non-normative note in the doc indicating that multiple browsers picked 64KB for their initial implementation.

b) Given that we don't spec a hard limit, I don't think web-platform-tests should hardcode 64KB either. It's good to have tests for "super large payloads are not allowed", but perhaps we can just change that number to something much larger in our tests.. e.g. >=1MB. That way, if an existing implementation decides to increase their limit, they're not immediately greeted with a bunch of test failures.

c) sendBeacon adoption has been growing steadily in Chrome, and our telemetry for quota exceeded shows that very few pages hit the 64KB quota; I've not heard developer complaints about it, so far, at least.. As such, I'm inclined to say that this behavior shouldn't be treated as a test failure either. It may be the case that it's a bit too restrictive (e.g. SPA app that sticks around for hours~days and uses sendBeacon under the hood), and we might want to revisit this behavior in the future (e.g. by raising the quota limit; only apply the quota requirement for beacons that fire when queued when the page is onloading; switch to same per-beacon enforcement as Edge/FF; ...), but I think it's a reasonable implementation under current spec language and should be allowed.. unless we can point to specific cases where it breaks. My proposal for this one is:

  • Add a note in the spec documenting the differences in implementation
  • Update submitted tests and split them into separate files to allow Chrome to pass

Additional notes and related discussions:

/cc @toddreifsteck @yoavweiss @nicjansma @plehegar

@igrigorik igrigorik added this to the Level 1 milestone Oct 26, 2016

@toddreifsteck

This comment has been minimized.

Show comment
Hide comment
@toddreifsteck

toddreifsteck Oct 27, 2016

Member

This type of thing is exactly the reason that we should all be submitting tests and discussing the outcome.

@igrigorik I don't agree that we should simply add a note. The current spec allows for an implementation that WILL harm SPAs that use sendBeacon to send 1k per minute in ~1 hour. I assert that we should tighten the specifications language around quotas to ensure that sendBeacon continues working and that quotas must either be "for the active queue" or "per beacon" or something similar.

If very few sites are hitting this behavior, I assert that is because very few SPAs are utilizing sendBeacon even with its rapid adoption.

Member

toddreifsteck commented Oct 27, 2016

This type of thing is exactly the reason that we should all be submitting tests and discussing the outcome.

@igrigorik I don't agree that we should simply add a note. The current spec allows for an implementation that WILL harm SPAs that use sendBeacon to send 1k per minute in ~1 hour. I assert that we should tighten the specifications language around quotas to ensure that sendBeacon continues working and that quotas must either be "for the active queue" or "per beacon" or something similar.

If very few sites are hitting this behavior, I assert that is because very few SPAs are utilizing sendBeacon even with its rapid adoption.

@igrigorik

This comment has been minimized.

Show comment
Hide comment
@igrigorik

igrigorik Oct 28, 2016

Member

That's fair. I think there are two threads here, let me try to unpack..

Should we specify a hard limit int he spec?

My position is that we should not, but we should add a note indicating what existing implementations have settled on. I don't hear any loud disagreements on this one?

Specify enforcement model?

@toddreifsteck I think your points are fair. That said, the primary reason why Chrome went with the quota approach is to provide some guarantee that we won't be saddled with a large POST payload as the page is being unloaded -- e.g. "please upload this 3MB trace of my app while you unload, kthnx". Enforcing a per beacon limit helps, but does not address this entirely, as one could just queue a few dozen of those. Perhaps we can find a happy medium here...

  • It probably doesn't make sense to enforce a lifetime quota while the page is active. In fact, I think one could also argue that per-beacon limit is unnecessary.
  • We could add a provision indicating that UA's may impose additional limits on beacon requests when page is being unloaded?
Member

igrigorik commented Oct 28, 2016

That's fair. I think there are two threads here, let me try to unpack..

Should we specify a hard limit int he spec?

My position is that we should not, but we should add a note indicating what existing implementations have settled on. I don't hear any loud disagreements on this one?

Specify enforcement model?

@toddreifsteck I think your points are fair. That said, the primary reason why Chrome went with the quota approach is to provide some guarantee that we won't be saddled with a large POST payload as the page is being unloaded -- e.g. "please upload this 3MB trace of my app while you unload, kthnx". Enforcing a per beacon limit helps, but does not address this entirely, as one could just queue a few dozen of those. Perhaps we can find a happy medium here...

  • It probably doesn't make sense to enforce a lifetime quota while the page is active. In fact, I think one could also argue that per-beacon limit is unnecessary.
  • We could add a provision indicating that UA's may impose additional limits on beacon requests when page is being unloaded?
@toddreifsteck

This comment has been minimized.

Show comment
Hide comment
@toddreifsteck

toddreifsteck Oct 28, 2016

Member
  1. I agree that we should not specific the specific hard limit.
  2. I think we should clean up the spec with regard to the enforcement model.

I can only think of one requirement which led to the current quotas:

  • Protect user from large network requests being kept open after the page has navigated away

@igrigorik Can you think of any other requirements? After we've locked down requirements, we can take a shot at some rough spec language to lock in this requirement.

For what its worth, I agree that all of our quotas fail to be elegant solutions to that requirement and believe if we can lock this down for Beacon which is in heavy use today, we will have helped push the concept forward.

Member

toddreifsteck commented Oct 28, 2016

  1. I agree that we should not specific the specific hard limit.
  2. I think we should clean up the spec with regard to the enforcement model.

I can only think of one requirement which led to the current quotas:

  • Protect user from large network requests being kept open after the page has navigated away

@igrigorik Can you think of any other requirements? After we've locked down requirements, we can take a shot at some rough spec language to lock in this requirement.

For what its worth, I agree that all of our quotas fail to be elegant solutions to that requirement and believe if we can lock this down for Beacon which is in heavy use today, we will have helped push the concept forward.

@igrigorik

This comment has been minimized.

Show comment
Hide comment
@igrigorik

igrigorik Oct 28, 2016

Member

Protect user from large network requests being kept open after the page has navigated away

Yep, as far as quota is concerned, I think that's it.

Member

igrigorik commented Oct 28, 2016

Protect user from large network requests being kept open after the page has navigated away

Yep, as far as quota is concerned, I think that's it.

@toddreifsteck

This comment has been minimized.

Show comment
Hide comment
@toddreifsteck

toddreifsteck Nov 2, 2016

Member

However... I think that specifying that the outstanding sendBeacon request queue has a rolling quota that is incremented on request and decremented when the request is completed. I also assert that we should note that 64k is the recommended size of this quota. Firefox, Chrome and Microsoft Edge have all chosen the same number.

I think this model is also a bit cleaner with regard to Service Worker and protects users more than the per-request 64k quotas implemented in Firefox/Microsoft Edge. It also improves upon Chrome's 64k limit for a context by allowing SPAs to continue working after sending 64k of payload over time.

I'm certain that Microsoft Edge could implement this quickly and we could submit updated tests.

@igrigorik What do you think?

Member

toddreifsteck commented Nov 2, 2016

However... I think that specifying that the outstanding sendBeacon request queue has a rolling quota that is incremented on request and decremented when the request is completed. I also assert that we should note that 64k is the recommended size of this quota. Firefox, Chrome and Microsoft Edge have all chosen the same number.

I think this model is also a bit cleaner with regard to Service Worker and protects users more than the per-request 64k quotas implemented in Firefox/Microsoft Edge. It also improves upon Chrome's 64k limit for a context by allowing SPAs to continue working after sending 64k of payload over time.

I'm certain that Microsoft Edge could implement this quickly and we could submit updated tests.

@igrigorik What do you think?

@igrigorik

This comment has been minimized.

Show comment
Hide comment
@igrigorik

igrigorik Nov 2, 2016

Member

@toddreifsteck I like that.

@cbentzel @tyoshino curious to hear your thoughts on this one.

Member

igrigorik commented Nov 2, 2016

@toddreifsteck I like that.

@cbentzel @tyoshino curious to hear your thoughts on this one.

@tyoshino

This comment has been minimized.

Show comment
Hide comment
@tyoshino

tyoshino Nov 10, 2016

I agree that the decision on choosing the limit should be left to each implementation.

I think we should have a histogram of the size of payload used for sendBeacon() instead of getting the bool of "quota exceeded or not". If developers (even third-party contents providers) have been adjusting the payload size appropriately (otherwise, they'll lose the signal), we shouldn't see increase in the graph.

That said, as we're not hearing any complaint on it, I agree that 64k should be enough currently.

Re: enforcement model, I agree that applying the limit to the queue (not per-request) is the right thing to do given the requirement is that we protect the user from consumption of the network resource with unloaded page. So, the proposal is that we spec that sendBeacon's queue would fail when its size is exceeding the limit when the page unloads?

It probably doesn't make sense to enforce a lifetime quota while the page is active.

It makes sense with regard to the purpose of the limit, but I guess people would just limit the total amount of data they queue as it's hard to observe which sendBeacon has finished (and no longer occupying the queue). I don't object to it.

tyoshino commented Nov 10, 2016

I agree that the decision on choosing the limit should be left to each implementation.

I think we should have a histogram of the size of payload used for sendBeacon() instead of getting the bool of "quota exceeded or not". If developers (even third-party contents providers) have been adjusting the payload size appropriately (otherwise, they'll lose the signal), we shouldn't see increase in the graph.

That said, as we're not hearing any complaint on it, I agree that 64k should be enough currently.

Re: enforcement model, I agree that applying the limit to the queue (not per-request) is the right thing to do given the requirement is that we protect the user from consumption of the network resource with unloaded page. So, the proposal is that we spec that sendBeacon's queue would fail when its size is exceeding the limit when the page unloads?

It probably doesn't make sense to enforce a lifetime quota while the page is active.

It makes sense with regard to the purpose of the limit, but I guess people would just limit the total amount of data they queue as it's hard to observe which sendBeacon has finished (and no longer occupying the queue). I don't object to it.

@tyoshino

This comment has been minimized.

Show comment
Hide comment
@tyoshino

tyoshino Nov 10, 2016

I'm fine with adding a note that most of UAs have chosen 64k.

tyoshino commented Nov 10, 2016

I'm fine with adding a note that most of UAs have chosen 64k.

@tyoshino

This comment has been minimized.

Show comment
Hide comment
@tyoshino

tyoshino Nov 10, 2016

but I guess people would just limit the total amount of data they queue as it's hard to observe which sendBeacon has finished (and no longer occupying the queue).

Regarding this comment of mine, Ilya, is it already common that sendBeacon() is used for periodical beacon-ing than using async XHR / Fetch API until unload happens?

tyoshino commented Nov 10, 2016

but I guess people would just limit the total amount of data they queue as it's hard to observe which sendBeacon has finished (and no longer occupying the queue).

Regarding this comment of mine, Ilya, is it already common that sendBeacon() is used for periodical beacon-ing than using async XHR / Fetch API until unload happens?

@igrigorik

This comment has been minimized.

Show comment
Hide comment
@igrigorik

igrigorik Nov 10, 2016

Member

So, the proposal is that we spec that sendBeacon's queue would fail when its size is exceeding the limit when the page unloads?

I believe the proposal is: limit sendBeacon to 64kb of in-flight data, regardless of the point in the lifecycle of the page. As part of the queue algorithm: increment in-flight data counter, once request has finished, decrement it by the request account. If attempting to queue > in-flight max, reject and return false.

Regarding this comment of mine, Ilya, is it already common that sendBeacon() is used for periodical beacon-ing than using async XHR / Fetch API until unload happens?

It doesn't appear so.. as I would have expected to hear from those developers about our 64kb max. That said, I do think that this is a pattern we want to encourage, so it is something we should fix.

Member

igrigorik commented Nov 10, 2016

So, the proposal is that we spec that sendBeacon's queue would fail when its size is exceeding the limit when the page unloads?

I believe the proposal is: limit sendBeacon to 64kb of in-flight data, regardless of the point in the lifecycle of the page. As part of the queue algorithm: increment in-flight data counter, once request has finished, decrement it by the request account. If attempting to queue > in-flight max, reject and return false.

Regarding this comment of mine, Ilya, is it already common that sendBeacon() is used for periodical beacon-ing than using async XHR / Fetch API until unload happens?

It doesn't appear so.. as I would have expected to hear from those developers about our 64kb max. That said, I do think that this is a pattern we want to encourage, so it is something we should fix.

@toddreifsteck

This comment has been minimized.

Show comment
Hide comment
@toddreifsteck

toddreifsteck Nov 10, 2016

Member

I've done a lot of reviewing of XHR/sendBeacon resource timings on top 100 in the past week. Although sendBeacon is being adopted, a lot of telemetry is still on top of XHR. Also... all sample sendBeacon code I've seen has fallback code for using XHR when sendBeacon returns false. This means it is unlikely for sites to complain. (Said another way, we should fix this.)

Member

toddreifsteck commented Nov 10, 2016

I've done a lot of reviewing of XHR/sendBeacon resource timings on top 100 in the past week. Although sendBeacon is being adopted, a lot of telemetry is still on top of XHR. Also... all sample sendBeacon code I've seen has fallback code for using XHR when sendBeacon returns false. This means it is unlikely for sites to complain. (Said another way, we should fix this.)

@tyoshino

This comment has been minimized.

Show comment
Hide comment
@tyoshino

tyoshino Nov 11, 2016

I believe the proposal is: limit sendBeacon to 64kb of in-flight data,

OK. I was misunderstanding the timing of enforcement. Thanks, Ilya.

As part of the queue algorithm: ...

Seems I was understanding this part correctly.

I rethought the issue of lack of way to know the queue size I mentioned in #38 (comment). As we're going to implement the keepalive feature for the Fetch API, I think we'll suggest that developers who want to precisely control the queue size use the Fetch API with keepalive feature, and we just align the algorithm of sendBeacon with it. So, I now think your proposal of cap on in-flight makes sense.

Thanks Todd for the analysis.

tyoshino commented Nov 11, 2016

I believe the proposal is: limit sendBeacon to 64kb of in-flight data,

OK. I was misunderstanding the timing of enforcement. Thanks, Ilya.

As part of the queue algorithm: ...

Seems I was understanding this part correctly.

I rethought the issue of lack of way to know the queue size I mentioned in #38 (comment). As we're going to implement the keepalive feature for the Fetch API, I think we'll suggest that developers who want to precisely control the queue size use the Fetch API with keepalive feature, and we just align the algorithm of sendBeacon with it. So, I now think your proposal of cap on in-flight makes sense.

Thanks Todd for the analysis.

igrigorik added a commit that referenced this issue Nov 11, 2016

enforce payload limits via Fetch API
Beacon requests are allowed to continue while/if a fetch group is being
terminated - e.g. as the page unloads. To ensure that such requests do
not negatively affect the consequent navigation, or incur high costs for
the user, we limit the amount of data that can be queued via this
method.

The enforcement is applied within Fetch API, which tracks the amount of
in-flight data queued with the `keepalive` flag set, and returns a
network error if the limit is exceeded.

Background: #38
@igrigorik

This comment has been minimized.

Show comment
Hide comment
@igrigorik

igrigorik Nov 11, 2016

Member

Thanks everyone, sounds like we have consensus on the high-level approach.

Let's continue in #39. There is some layering bits that we need to figure out there..

Member

igrigorik commented Nov 11, 2016

Thanks everyone, sounds like we have consensus on the high-level approach.

Let's continue in #39. There is some layering bits that we need to figure out there..

@andrewwakeling

This comment has been minimized.

Show comment
Hide comment
@andrewwakeling

andrewwakeling Dec 18, 2016

Apologies if this is the wrong place to raise this concern.

If a create a series of empty payloads and choose to transmit data via the URL, what is the expected outcome, with regard to the quota and the behaviour of the browser?

andrewwakeling commented Dec 18, 2016

Apologies if this is the wrong place to raise this concern.

If a create a series of empty payloads and choose to transmit data via the URL, what is the expected outcome, with regard to the quota and the behaviour of the browser?

@igrigorik

This comment has been minimized.

Show comment
Hide comment
@igrigorik

igrigorik Dec 21, 2016

Member

@andrewwakeling the quota is only enforced on the body payload. We don't restrict the number of requests in flight.

Member

igrigorik commented Dec 21, 2016

@andrewwakeling the quota is only enforced on the body payload. We don't restrict the number of requests in flight.

@igrigorik

This comment has been minimized.

Show comment
Hide comment
@igrigorik

igrigorik Mar 1, 2017

Member

Resolved via #39.

Member

igrigorik commented Mar 1, 2017

Resolved via #39.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment