Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make read requests abortable #1103

Closed
domenic opened this issue Jan 25, 2021 · 15 comments
Closed

Make read requests abortable #1103

domenic opened this issue Jan 25, 2021 · 15 comments

Comments

@domenic
Copy link
Member

domenic commented Jan 25, 2021

Noted by @reillyeon in WICG/serial#112 (comment) .

For some cases it would be nice to dequeue (and possibly abort?) a given read request, without canceling the stream entirely. That is,

const controller = new AbortController();

// Assume "rs" contains chunks A and B.
const reader = rs.getReader();

const promise1 = reader.read();
const promise2 = reader.read({ signal: controller.signal });
const promise3 = reader.read();

controller.abort();

The intent here is that promise1 gets fulfilled with chunk A, and promise3 gets fulfilled with chunk B.

Some questions:

  • What does this do to promise2? Probably reject it with an "AbortError" DOMException (like in other AbortSignal scenarios).
  • What happens if abort is signaled after the read request is already fulfilled? Probably nothing (like in other AbortSignal scenarios).
  • Does this do more than just dequeue the read request? For example, could it be communicated all the way down to the underlying source, I guess via pull()? I don't think this helps much, but it's worth checking...

This is looking relatively straightforward to me to spec, and the use case @reillyeon presents is a good one. Should we just go for it?

@reillyeon
Copy link

The desired behavior seems analogous to the preventCancel and preventAbort options to pipeTo() and pipeThrough().

IIRC calling reader.cancel() will cause promise2 to resolve with done set to true. Using the AbortSignal should probably cause it to reject with an "AbortError" DOMException instead since the stream isn't really "done".

If we added an AbortSignal parameter to pull() then it would make sense to signal it in this case, but only if there were no other reads queued.

@MattiasBuelens
Copy link
Collaborator

  • What does this do to promise2? Probably reject it with an "AbortError" DOMException (like in other AbortSignal scenarios).

Yes, that seems most appropriate. We do the same for rs.pipeTo(dest, { signal }).

While we're on the topic: I think rs.pipeTo(dest, { signal }) should also use the given signal for all of its read requests. That will allow the pipe to shutdown faster.

  • What happens if abort is signaled after the read request is already fulfilled? Probably nothing (like in other AbortSignal scenarios).

Yup, that makes sense.

I think it's also important to guarantee that if we do reject the read request with an AbortError, then the next read request will receive the "missed" chunk. That is, aborting any read request does not "lose" any chunks. (We should have some tests for this.)

  • Does this do more than just dequeue the read request? For example, could it be communicated all the way down to the underlying source, I guess via pull()? I don't think this helps much, but it's worth checking...

I think just dequeuing the read request is fine.

A while back, I suggested we could pass an AbortSignal to pull(), but that was tied to rs.cancel(). Not sure where we want to go with that. 🤷


Could we also make this work for BYOB readers? We'd probably want something like reader.read(view, { signal }), but this has some problems:

  • We need to return ownership of the backing array buffer after the read is aborted, but we can't do that if we merely reject with an AbortError.
  • Before we can return ownership of the buffer, we need to take it away from any pending byobRequest. But how will we know when the underlying sink is done using byobRequest.view?

@reillyeon
Copy link

For BYOB readers I think we'll need to pass an AbortSignal to the underlying source and it has the option of either ignoring the signal or immediately indicating that the read has completed with whatever data has been written into the buffer so far.

As I understand the AbortSignal interface it is valid for an API accepting a signal to ignore it. This would be required for an underlying source implementation which passed the buffer to an API which did not support aborting an operation.

As for resolving reads with partial data, it might not be an issue in practice. For the underlying source implementations I maintain (in the Web Serial API and its polyfill based on WebUSB) supporting BYOB readers would never create a situation in which the buffer is partially filled as it will be returned immediately to the caller whenever any data is available.

@yutakahirano
Copy link
Member

Generally sounds good. This is safe in terms of the locking semantics because we cannot release a lock while there is a pending read request. Maybe that is worth noting.

Is there a practical use-case for aborting a read request which has subsequent read requests?

@reillyeon
Copy link

I don't have one. The use case for this coming from Web Serial API developers is being able to unlock the stream without closing it.

@yutakahirano
Copy link
Member

Then how about adding a method abortPendingRequestsAndReleaseLock to the reader? That method returns a promise.

@ricea
Copy link
Collaborator

ricea commented Feb 4, 2021

I haven't heard a sufficiently compelling use case for cancelling a single read() while leaving other reads pending.

Strictly speaking, if a call to read() caused backpressure to be relieved, then cancelling the read() should cause backpressure to be re-established, but we have no mechanism to do that.

The fact that this makes byte streams into a special case worries me greatly.

@domenic
Copy link
Member Author

domenic commented Feb 24, 2021

WICG/serial#122 is another case where this seems to be useful, although again it's possible there are alternate strategies... I'm inclined to think this is a good idea though.

I guess the remaining work is:

  • Figure out whether the use cases on hand, and any others we can think of, are better served by invalidating individual reads vs. abortPendingRequestsAndReleaseLock().

  • Figure out byte stream integration.

@jasnell
Copy link

jasnell commented Feb 24, 2021

Coming from the Node.js perspective, it would be better for the AbortController to be associated with the Reader as a whole rather than the individual read operation. So...

const controller = new AbortController();

// Assume "rs" contains chunks A and B.
const reader = rs.getReader({ signal: controller.signal });

const promise1 = reader.read();
const promise2 = reader.read();
const promise3 = reader.read();

controller.abort();

There are a couple reasons for this...

  1. Not all read operations are cancelable. libuv fs operations, for instance, cannot be canceled once they are picked up by the libuv thread pool. You can only cancel them while they are still queued. If we consider each discreet read a single libuv request, then, the read-level abort is of limited use and unreliable.

  2. In most operations, it's the entire sequence of reads that are critical. It will be rare to abort a single read. What we'd want is to abort the entire set of read operations and clean up the underlying resource.

When the abort() is called, I would expect any already fulfilled reads to just ignore it but all subsequent still pending read promises to be rejected. So, for the sake of the example, let's assume that promise1 ends up resolving synchronously (for whatever reason) but promise2 and promise3 are still pending. Both promise2 and promise3 would reject with AbortError.

I really cannot think of a case where I would want to abort a single read without aborting the entire subsequent sequence of reads that also may be pending.

@domenic
Copy link
Member Author

domenic commented Feb 24, 2021

Thanks @jasnell, that's very helpful. It sounds similar to the discussion a few comments up in #1103 (comment).

The case in WICG/serial#122 seems like it might desire canceling the read without canceling subsequent reads, but upon reflection, I think you could write it in a different way which avoids that, hmm...

@annevk
Copy link
Member

annevk commented Feb 25, 2021

It seems this might be the low-level primitive whatwg/fetch#180 wants.

@MattiasBuelens
Copy link
Collaborator

From #1168, it turns out rs.pipeTo(ws) is actually cheating. To correctly express the desired behavior when preventCancel = true, we may in fact need abortable reads ourselves. 😅

Figure out whether the use cases on hand, and any others we can think of, are better served by invalidating individual reads vs. abortPendingRequestsAndReleaseLock().

For the use case of pipeTo(), we don't really need an AbortSignal for each reader.read() call, a single signal associated with the entire reader would also suffice.

Figure out byte stream integration.

All read(view) calls reject with an error, so they don't get their BYOB view back. (This is similar to cancel(), except that resolves pending reads with { done: true, value: undefined } instead.)

The BYOB request should remain valid, so the underlying byte source can continue writing into it. However, it's no longer associated with a pending read-into request, which makes things trickier. Right now, the specification assumes that the first pull-into descriptor corresponds to the first read-into request, but this change would break this assumption.

Instead, we want committing this "detached" BYOB request to behave more like calling controller.enqueue(): copying bytes into any non-detached pull-into descriptors (and resolving read(view) requests), and putting the remaining bytes into the stream's queue.

...Sounds like a lot of spec refactoring to be done. 😛

@dead-claudia
Copy link

FYI, there's some user-level security implications for #1143 that this can help address. While it wouldn't be as important on the client side, Deno servers would greatly benefit from it for securely reading bodies.

domenic pushed a commit that referenced this issue Jan 13, 2022
Currently, reader.releaseLock() throws an error if there are still pending read requests. However, in #1103 we realized that it would be more useful if we allowed releasing a reader while there are still pending reads, which would reject those reads instead. The user can then acquire a new reader to receive those chunks.

For readable byte streams, we also discard pull-into descriptors corresponding to (now rejected) read-into requests. However, we must retain the first pull-into descriptor, since the underlying byte source may still be using it through controller.byobRequest. Instead, we mark this descriptor as "detached", such that when we invalidate this BYOB request (as a result of controller.enqueue() or byobRequest.respond()), the bytes from this descriptor are pushed into the stream's queue and used to fulfill read requests from a future reader.

This also allows expressing pipeTo()'s behavior more accurately. Currently, it "cheats" by calling ReadableStreamReaderGenericRelease directly without checking if [[readRequests]] is empty. With the proposed changes, we can safely release the reader when the pipe finishes (even if there's a pending read), and be sure that any unread chunks can be read by a future reader or pipe.
@reillyeon
Copy link

Am I correct that with the change above we can consider this issue closed because releaseLock() functions as an "abort" method for read() which doesn't abort the whole stream. What is the status of implementation of this behavior in browsers?

@MattiasBuelens
Copy link
Collaborator

Correct, #1168 fixes this issue. Closing.

Instead of adding a new API, we changed the behavior of the existing releaseLock() method. So the use case in the original question would now look like this:

const reader = rs.getReader();

const promise1 = reader.read();
const promise2 = reader.read();
const promise3 = reader.read();

reader.releaseLock();
// This is now always allowed, even if one or more read() promises are still pending.
// For example, if rs only contains 2 chunks, then promise1 and promise2 are resolved,
// but promise3 will reject with a TypeError.

If you want to use an AbortController, you can do that manually:

const controller = new AbortController();
const reader = rs.getReader();
controller.signal.addEventListener("abort", () => {
  reader.releaseLock();
});

const promise1 = reader.read();

controller.abort();

What is the status of implementation of this behavior in browsers?

This spec change is hot off the press, so no browser has implemented it yet. We've filed issues with all major browsers, you can have a look at the "implementation bugs" listed in #1168 to track their progress.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

8 participants