Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetch-based design #70

Open
achristensen07 opened this issue Mar 16, 2023 · 53 comments
Open

Fetch-based design #70

achristensen07 opened this issue Mar 16, 2023 · 53 comments
Labels
api Issue with API specs discussion fetch-based-api Fetch-based API design

Comments

@achristensen07
Copy link

The design presented today was better than the get-and-post-only design previously proposed, but still not quite as good as I think it can be. I think we still need to think less about designing around the current use case and ask ourselves "what is the minimal addition to fetch that would allow me to do what I need to do?" and I think the answer is just some kind of signal to send now and two new values in the RequestInit dictionary: background timeout seconds and a boolean value saying this is deferred until unload. If you want to change the URL or content, you abort the fetch and start a new one.

@mingyc
Copy link
Collaborator

mingyc commented Mar 17, 2023

What you suggest is similar to an alternative approach we previously proposed: adding deferSend: true into RequestInit.

I think this is a question about to what level of abstraction we should provide. One reason for this previous proposal (PendingBeacon) is to encapsulate the best practics around sending beacons, that the browser takes care of filling up the irrelevant details in Request and also takes care of sending at destroying of document. We also want to preserve this in the fetch-based design.

@noamr
Copy link

noamr commented Mar 17, 2023

Yea I think the alternative approach works better, with handling background-timing and the rest of the semantics (including the visibility-related things) in userland. deferSend or so that works only on document destruction and aborting/reinstating when you want to update the contents or do something special on visibility/timeout does the job.

We can provide a "best practices" pure JS wrapper that looks like the PendingRequest proposal if it's necessary, but I sense that this API will be used mainly by RUM providers/CMS/more advanced users at least at the start so perhaps that can wait.

@yoavweiss
Copy link
Collaborator

One more question to answer is: how can developers know that a pending fetch was or wasn't yet sent, when they have more data to potentially append to it?

What we could do is resolve the Fetch promise when the request is starting to be sent (rather than when a response arrives), and resolve it without a response. There may be better alternatives to get that signal.

Another question would be around the ergonomics of aborting a fetch() every time the data is updated. It may be interesting to explore code samples to that effect.

@noamr
Copy link

noamr commented Mar 17, 2023

Ergonomics-wise I think it's fairly simple but let's see what we come up with.
What we do need though is some additional options for the backgrounded timeout for the bfcache case (as you lose your JS context at that point).

Is the response promise not enough for understanding if your beacon was sent? Isn't it actually better, as in, if there was a network issue or 500 error or what not you can still re-send it?

If we do need a resolve-on-request, I think this should have its own call (or sendBeacon arg) rather than fetch with an extra parameter because it changes the semantics of fetch's return value.

@fergald
Copy link
Collaborator

fergald commented Mar 17, 2023

@yoavweiss

What we could do is resolve the Fetch promise when the request is starting to be sent (rather than when a response arrives), and resolve it without a response. There may be better alternatives to get that signal.

So something like

let hasBeenSent = false;
let abort = new AbortController(); 
fetch(...).then(() => {hasBeenSent = true;};

function updateData(data)
  if (!hasBeenSent) {
    abort.abort();
  }
  hasBeenSent = false;
  fetch(...).then(() => {hasBeenSent = true;};
}

We have to worry about the timing of microtasks. Thinking about Chrome's impl, as long as the browser sending is blocked on "go ahead" from the renderer (when it's still alive) then we can post that microtask when the renderer gives that "go ahead". As long as it runs before any other JS, we're OK. If some any other JS could sneak in before it runs then it will see hasBeenSent == false but actually it has been sent.

I'm not familiar enough with JS microtask queueing to know what happens if e.g. we have a a queued JS task and then we post a microtask before that task runs. Does the microtask run immediately or will it run at the end of the already queued JS task?

@noamr
Copy link

noamr commented Mar 17, 2023

Imagining userland code that uses it would look something like this:

class UserlandPendingBeacon {
  #abortController = null;

  // Including the {defer} structure, timeouts etc
  #requestInfo = null;
  
  // Let's imagine it's a list of plain objects.
  #pendingEntries = [];
  
  constructor(requestInfo, initialEntries) {
    this.#requestInfo = requestInfo;
    this.#pendingEntries = initialEntries;
    this.#send();
  }

  async #send() {
    if (this.#abortController) {
      this.#abortController.abort();
    }

    this.#abortController = new AbortController();
    const sentSignal = new SentSignal();
    fetch(this.#requestInfo, {signal: this.#abortController.signal, sentSignal, body: JSON.stringify(this.#entries)});
    sentSignal.addEventListener("sent", () => {
      this.#pendingEntries = [];
      this.#abortController = null;
    });
  }

  // This will abort an ongoing fetch, and will schedule a new deferred fetch
  append(entry) { 
    this.#pendingEntries.push(entry);
    this.#send();
  }
}

@noamr
Copy link

noamr commented Mar 17, 2023

I'm not familiar enough with JS microtask queueing to know what happens if e.g. we have a a queued JS task and then we post a microtask before that task runs. Does the microtask run immediately or will it run at the end of the already queued JS task?

Posted microtasks always run before queued tasks.

@yoavweiss
Copy link
Collaborator

Is the response promise not enough for understanding if your beacon was sent? Isn't it actually better, as in, if there was a network issue or 500 error or what not you can still re-send it?

Waiting for the response promise to resolve would be racy, and can lead to wasted client and server resources.

A few constraints that the original design had in mind:

  • We want to minimize the number of beacons servers have to deal with, as those have a linear cost to analytics collectors
  • We want to update the beacon frequently
  • We want multiple unrealted entities to be able to safely use this, without losing data or wasting resources

If we waited for the response, we run a risk of aborting an in-flight request, either wasting resources if it waas terminated halfway through, or reporting the same data twice, if it we aborted after the response was sent from the server.

If we do need a resolve-on-request, I think this should have its own call (or sendBeacon arg) rather than fetch with an extra parameter because it changes the semantics of fetch's return value.

I think this was one of the considerations that lead to the PendingBeacon design. An alternative is to have the fetch() Promise never resolve, and have a separate signal for "this request started sending, so too late to update it".

@noamr
Copy link

noamr commented Mar 17, 2023

I think this was one of the considerations that lead to the PendingBeacon design. An alternative is to have the fetch() Promise never resolve, and have a separate signal for "this request started sending, so too late to update it".

... or it can still resolve if there was a response. why not?
See #70 (comment) for a sample implementation with a "sent" signal.

@yoavweiss
Copy link
Collaborator

See #70 (comment) for a sample implementation with a "sent" signal.

Nice! (although, if we're using an event there may be a task queueing race? If so, being able to sync check SentSignal would solve that, I think)

@noamr
Copy link

noamr commented Mar 17, 2023

See #70 (comment) for a sample implementation with a "sent" signal.

Nice! (although, if we're using an event there may be a task queueing race? If so, being able to sync check SentSignal would solve that, I think)

Event doesn't imply anything about tasks. But also signal.sent is fine

@yoavweiss
Copy link
Collaborator

Event doesn't imply anything about tasks.

That's fair..

@fergald
Copy link
Collaborator

fergald commented Mar 17, 2023

I'm not familiar enough with JS microtask queueing to know what happens if e.g. we have a a queued JS task and then we post a microtask before that task runs. Does the microtask run immediately or will it run at the end of the already queued JS task?

Posted microtasks always run before queued tasks.

Microtasks posted from a JS task will run before other queued JS tasks but sending of a beacon in Chrome is an IPC from the browser to the renderer to say "hey I want to send this, if it's not already cancelled, mark it sent and tell me I can go ahead". So I guess as long as that posts an empty JS tasks that then runs this microtask and as long as that whole thing happens before any other JS tasks that happened to be enqueued already, we're ok. I don't know if this is simply an implementation problem or if it is impossible spec-wise to say "here's a micro task, it must jump the queue on anything else that's already queued". I don't think the spec has any concept of non-JS tasks, so there is no JS task for this microtask to run after.

@noamr
Copy link

noamr commented Mar 17, 2023

You can't "queue a microtask" via IPC. It would have to be a regular task that resolves a promise.
If it's the browser's decision when to send the beacon in practice, I think sentSignal would always be somewhat racy.

@fergald
Copy link
Collaborator

fergald commented Mar 18, 2023

OK, so then implementation-wise, in that IPC we would want to synchronously run an empty JS task with this microtask. This seems doable at least.

@mingyc
Copy link
Collaborator

mingyc commented Mar 20, 2023

Offline discussion today:

PendingBeacon Size Limit

Fetch Spec enforces a 64KB size limit for keepalive request per fetch group. And PendingBeacon underlying is a keepalive fetch.

Scenario 1: if user makes a X (>64KB) size PendingBeacon request, it should fail, throwing network error. User should switch to non-keepalive fetch instead.

Scenario 2: if user makes a X (<=64KB) size PendingBeacon request, it should be queued to send. If after queuing, the total excceeds the 64KB limit, the browser flushes the queues earlier than the time of page destruction.

Can enabling BackgroundFetch permission allows to bypass this limit?

SendSignal

Proposed as boolean pending in PendingBeacon.

The discussions were around whether to bundle both AbortSignal and SendSignal together, or let users handle them by themselves.

To prevent from race condition, the renderer should be authoritative to the request's send state when it's alive. (Also discussed before for PendingBeacon).

fetch Promise

Need more user input.

To maintain the same semantic, browser should resolve Promise when the request is sent.
But it may introduce unnecessary belief to the Promise for this usage: in reality PendingBeacon may never resolve.

backgroundTimeout

Need more user input.

Lifetime of PendingBeacon

Previously discussed in #30 that a request can be kept if another same partition page is open.

mingyc added a commit to mingyc/pending-beacon that referenced this issue Mar 24, 2023
- Previous discussions about fetch-based API are in WICG#70, WICG#52, WICG#50, WebKit/standards-positions#85 (comment)
- Related new discussions in WICG#71, WICG#72, WICG#73, WICG#74, WICG#75
mingyc added a commit to mingyc/pending-beacon that referenced this issue Mar 24, 2023
- Previous discussions about fetch-based API are in WICG#70, WICG#52, WICG#50, WebKit/standards-positions#85 (comment)
- Related new discussions in WICG#71, WICG#72, WICG#73, WICG#74, WICG#75
mingyc added a commit to mingyc/pending-beacon that referenced this issue Mar 24, 2023
- Previous discussions about fetch-based API are in WICG#70, WICG#52, WICG#50, WebKit/standards-positions#85 (comment)
- Related new discussions in WICG#71, WICG#72, WICG#73, WICG#74, WICG#75
mingyc added a commit to mingyc/pending-beacon that referenced this issue Mar 24, 2023
- Previous discussions about fetch-based API are in WICG#70, WICG#52, WICG#50, WebKit/standards-positions#85 (comment)
- Related new discussions in WICG#72, WICG#73, WICG#74, WICG#75, WICG#76, WICG#77
@mingyc mingyc added discussion api Issue with API specs labels Mar 24, 2023
mingyc added a commit that referenced this issue Mar 24, 2023
- Previous discussions about fetch-based API are in #70, #52, #50, WebKit/standards-positions#85 (comment)
- Related new discussions in #72, #73, #74, #75, #76, #77
@fergald
Copy link
Collaborator

fergald commented Apr 24, 2023

Can't you learn from the server?

For the RUM use-case (and ads metrics etc), these things tend to be write-only. There is no end-point that you can query to tell if a specific chunk of data has been received and building one would require indexing all the incoming data for serving and keeping that around for long enough. We should avoid that if possible.

@annevk
Copy link

annevk commented Apr 25, 2023

In a large number of cases isn't that what you would have to do if you want to know the answer given OOM? This part of the feature would only be relevant if you successfully manage to get in the back-forward cache, which is optional anyway.

@noamr
Copy link

noamr commented Apr 25, 2023

In a large number of cases isn't that what you would have to do if you want to know the answer given OOM?

Not sure how OOM is related? What you this tries to solve is to know whether the fetch was transmitted due to the backgrounding timeout. Yes, this is only relevant to being restored from BFCache. We only care about whether these fetches were transmitted, not about the answer.

@mingyc
Copy link
Collaborator

mingyc commented Apr 25, 2023

@noamr is right. The Motivation section from the explainer already makes it clear that we want an API to sending a bundle of data to a backend server, without expecting a particular response, ideally at the ‘end’ of a user’s visit to a page

@annevk
Copy link

annevk commented Apr 25, 2023

What I'm saying is that in a large number of cases you won't know whether they are transmitted via this kind of feature as the page will not be in the back-forward cache.

@mingyc I don't see how that relates to determining success (and only in the back-forward cache case)?

@noamr
Copy link

noamr commented Apr 25, 2023

Correct. But for those cases it's unnecessary as they would be transmitted when unloaded.
You need this feature only if BFCache worked, to avoid sending duplicate data. BFCache-restore is the only case where your document is alive and you're not sure whether your fetch had been transmitted.

@fergald
Copy link
Collaborator

fergald commented Apr 25, 2023

What I'm saying is that in a large number of cases you won't know whether they are transmitted via this kind of feature as the page will not be in the back-forward cache.

@mingyc I don't see how that relates to determining success (and only in the back-forward cache case)?

The goal is not to determine success or failure.

One use case is a beacon that accumulates data. You have some new data. If the beacon has not been sent, you cancel it and create a new one with the accumulated data. If the beacon has been sent you start a new one with this new data as the first piece of accumulated data.

You need a sent/not-sent signal in order to make decisions about beacons when they can spontaneously be sent.

@annevk
Copy link

annevk commented Apr 27, 2023

Thanks for your patience in explaining this additional wrinkle. I discussed this with colleagues and we're supportive of addressing it. Here's an API that would make sense to us: a fetchLater() method that essentially takes the same arguments as fetch() (modulo additional parameters) and returns a promise for a dictionary containing a single boolean done member. That promise is only ever rejected with an exception or resolved with { done: true }. The latter can only happen after it's survived bfcache at least once.

@noamr
Copy link

noamr commented Apr 27, 2023

Thanks for seriously discusses the use cases @annevk, I'm supportive of this solution. IMO it can also be a Promise<void> rather than Promise<{done: true}> but that's more of a bikeshedding issue than a material difference.

@annevk
Copy link

annevk commented Apr 27, 2023

Yeah, that's fair. Probably best to stick the most minimal thing (i.e., your suggestion) as that can also be extended in much the same way.

noamr added a commit to noamr/fetch that referenced this issue May 2, 2023
Add a JS-exposed function to request a deferred fetch.

A deferred fetch would be invoked in one of two scenarios:
- The document is destroyed (the fetch group is terminated)
- The document is backgrounded (the fetch group is deactivated)
  and not restored after a certain time.

A few constraints:
- Deferred fetch body sizes are limited to 64KB per origin.
  Exceeding this would immediately reject with a QuotaExceeded.
- Request body streams are not allowed. A request body, if exists,
  has to be a byte sequence.

The JS function is called `requestDeferredFetch` but that's
bikesheddable.

See WICG/pending-beacon#70
@noamr
Copy link

noamr commented May 2, 2023

See initial fetch PR: whatwg/fetch#1647

noamr added a commit to noamr/fetch that referenced this issue May 8, 2023
Add a JS-exposed function to request a deferred fetch.

A deferred fetch would be invoked in one of two scenarios:
- The document is destroyed (the fetch group is terminated)
- The document is backgrounded (the fetch group is deactivated)
  and not restored after a certain time.

A few constraints:
- Deferred fetch body sizes are limited to 64KB per origin.
  Exceeding this would immediately reject with a QuotaExceeded.
- Request body streams are not allowed. A request body, if exists,
  has to be a byte sequence.

The JS function is called `requestDeferredFetch` but that's
bikesheddable.

See WICG/pending-beacon#70
noamr added a commit to noamr/fetch that referenced this issue Jun 11, 2023
Add a JS-exposed function to request a deferred fetch.

A deferred fetch would be invoked in one of two scenarios:
- The document is destroyed (the fetch group is terminated)
- The document is backgrounded (the fetch group is deactivated)
  and not restored after a certain time.

A few constraints:
- Deferred fetch body sizes are limited to 64KB per origin.
  Exceeding this would immediately reject with a QuotaExceeded.
- Request body streams are not allowed. A request body, if exists,
  has to be a byte sequence.

The JS function is called `requestDeferredFetch` but that's
bikesheddable.

See WICG/pending-beacon#70
mingyc added a commit to mingyc/pending-beacon that referenced this issue Jul 4, 2023
mingyc added a commit to mingyc/pending-beacon that referenced this issue Jul 4, 2023
mingyc added a commit to mingyc/pending-beacon that referenced this issue Jul 4, 2023
mingyc added a commit to mingyc/pending-beacon that referenced this issue Jul 4, 2023
mingyc added a commit that referenced this issue Jul 4, 2023
This PR adds overview and example codes for the draft `fetchLater()` API spec from whatwg/fetch#1647

The API will address the discussions in #70 #72 #73 #74 #75 #76.
noamr added a commit to noamr/fetch that referenced this issue Aug 20, 2023
Add a JS-exposed function to request a deferred fetch.

A deferred fetch would be invoked in one of two scenarios:
- The document is destroyed (the fetch group is terminated)
- The document is backgrounded (the fetch group is deactivated)
  and not restored after a certain time.

A few constraints:
- Deferred fetch body sizes are limited to 64KB per origin.
  Exceeding this would immediately reject with a QuotaExceeded.
- Request body streams are not allowed. A request body, if exists,
  has to be a byte sequence.

The JS function is called `requestDeferredFetch` but that's
bikesheddable.

See WICG/pending-beacon#70
noamr added a commit to noamr/fetch that referenced this issue Sep 19, 2023
Add a JS-exposed function to request a deferred fetch.

A deferred fetch would be invoked in one of two scenarios:
- The document is destroyed (the fetch group is terminated)
- The document is backgrounded (the fetch group is deactivated)
  and not restored after a certain time.

A few constraints:
- Deferred fetch body sizes are limited to 64KB per origin.
  Exceeding this would immediately reject with a QuotaExceeded.
- Request body streams are not allowed. A request body, if exists,
  has to be a byte sequence.

The JS function is called `requestDeferredFetch` but that's
bikesheddable.

See WICG/pending-beacon#70
noamr added a commit to noamr/fetch that referenced this issue Mar 12, 2024
Add a JS-exposed function to request a deferred fetch.

A deferred fetch would be invoked in one of two scenarios:
- The document is destroyed (the fetch group is terminated)
- The document is backgrounded (the fetch group is deactivated)
  and not restored after a certain time.

A few constraints:
- Deferred fetch body sizes are limited to 64KB per origin.
  Exceeding this would immediately reject with a QuotaExceeded.
- Request body streams are not allowed. A request body, if exists,
  has to be a byte sequence.

The JS function is called `requestDeferredFetch` but that's
bikesheddable.

See WICG/pending-beacon#70
noamr added a commit to noamr/fetch that referenced this issue Apr 11, 2024
Add a JS-exposed function to request a deferred fetch.

A deferred fetch would be invoked in one of two scenarios:
- The document is destroyed (the fetch group is terminated)
- The document is backgrounded (the fetch group is deactivated)
  and not restored after a certain time.

A few constraints:
- Deferred fetch body sizes are limited to 64KB per origin.
  Exceeding this would immediately reject with a QuotaExceeded.
- Request body streams are not allowed. A request body, if exists,
  has to be a byte sequence.

The JS function is called `requestDeferredFetch` but that's
bikesheddable.

See WICG/pending-beacon#70
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Issue with API specs discussion fetch-based-api Fetch-based API design
Projects
None yet
Development

No branches or pull requests

6 participants