Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a way to identify Abort errors originating from controllers #1033

Closed
benjamingr opened this issue Nov 21, 2021 · 39 comments
Closed

Add a way to identify Abort errors originating from controllers #1033

benjamingr opened this issue Nov 21, 2021 · 39 comments
Labels
addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest topic: aborting AbortController and AbortSignal

Comments

@benjamingr
Copy link
Member

Hey,

I recently realized that it's now impossible (since .reason and .abort(reason)) to identify AbortErrors, it would be really useful if there was some way to identify them as originating from cancellation (timeout is cancellation in that it's not an exceptional case that should warrant monitoring probably).

It would be useful either to have a property on DOMExceptions that signal cancellation/timeout users can check or a branding check AbortSignal.isAbortError or similar.

This would enhance the ergonomics of using AbortSignals while allowing the flexibility of passing reason.

@kanongil
Copy link

kanongil commented Nov 21, 2021

Won't it always be safer to check signal.aborted / signal.reason when handling rejection errors? Even if you brand the errors, it could have originated from a different signal.

async function a({ signal }) {
  await b();
}

async function b() {
  const ab = new AbortController();
  ab.abort();
  throw ab.signal.reason;
}

try {
  const ac = new AbortController();
  await a({ signal: ac.signal });
}
catch (err) {
  console.log(err); // 'AbortError' DOMException
  console.log(ac.signal.aborted); // false
}

@benjamingr
Copy link
Member Author

If you have a direct reference to the signal you might. You often do not have a reference to the signal which means you can't check err === signal.reason or signal.aborted.

Think "I'm instrumentation that needs to decide if an error is really exceptional and I need to wake up someone in the middle of the night or I'm just cancellation which I will happily ignore".

@kanongil
Copy link

kanongil commented Nov 21, 2021

You often do not have a reference to the signal.

I don't agree with this claim – especially the "often" part.

Do you have any concrete examples where you need to handle abort errors without access to the signal? Instrumentation could also access it, eg. fetch(…, { signal }) provides it as a parameter, which the instrumentation could extract. Also, I'm not sure it makes sense for instrumentation to determine the difference – it seems like something that should be handled at the app level.

@kanongil
Copy link

kanongil commented Nov 21, 2021

Hmm, it's not actually possible to brand all reason values, as any non-undefined value is a valid reason. ac.abort(false) is expected to reject with false.

@benjamingr
Copy link
Member Author

benjamingr commented Nov 21, 2021

@kanongil In this case "instrumentation" is something like APM (that presumably doesn't override fetch)

We have spent time talking to API consumers about this and have plenty of examples and use cases. Here is a common one (I'll use the web terminology since this is whatwg/dom but there are plenty of others).

  • I have a framework over my web components / React.
  • Components can "cancel rendering" because the user navigated away or the time allotment for rendering is up or an API request timed out.
  • I have a high order component above those that needs to catch the timeout/cancellation and abort it (componentDidCatch in React speak).
  • That component presumably does not have access to the signal (coming from a hook in the framework React speak).

This is just an example, it can be worked around (by for example making the signal global, injecting the signal to all components) but that would create pretty ugly code from what I recall.

I am happy to loop in library/sdk authors we have talked to about this in the past if that'd help.

@benjamingr
Copy link
Member Author

Hmm, it's not actually possible to brand all reason values, as any non-undefined value is a valid reason. ac.abort(false) is expected to reject with false.

It would be great if we could prohibit non DOMExceptions as reasons even if it means a lot more work for Node.js in the future because I am a lot more worried about the "API surface increased and cancellation is hard to determine" problem than the "Node will have portability issues with it" problem.

@benjamingr benjamingr added the topic: aborting AbortController and AbortSignal label Nov 21, 2021
@domenic
Copy link
Member

domenic commented Nov 21, 2021

I think the premise here is wrong. If the controller wants to send an "AbortError" DOMException, knowing some consumers will discriminate based on that, it can do so. But if it wants to send a TypeError, knowing some consumers are not specifically watching for that, it can. You should generally not write code that assumes all controller-caused errors are "third state" abort errors. The signal is telling you about an error, but that error has the usual semantics based on its type: some are "third state" abort errors, some are fatal failures of a different sort, some are timeouts, etc.

Perhaps it would have been better if AbortSignal were named ExceptionSignal, but the historical naming here shouldn't cause people to assume all controller-initiated exceptions are treated the same way (e.g. ignored).

@benjamingr
Copy link
Member Author

But if it wants to send a TypeError, knowing some consumers are not specifically watching for that, it can.

Why is that capability good/useful?

You should generally not write code that assumes all controller-caused errors are "third state" abort errors.

Are there any non-third-state errors that are aborted with?

@domenic
Copy link
Member

domenic commented Nov 21, 2021

That capability is useful in the same way that exceptions in general are useful.

A signal-accepting API is one which gives its caller some amount of control over what exceptions it throws/rejections it returns/etc. Those exceptions can communicate anything: sometimes an abort, sometimes a timeout, sometimes a TypeError, sometimes a HTTP error (in a Node server), etc. The caller could use its controller to control the function and have it give a TypeError in exactly the situations a TypeError would normally be appropriate. (E.g., the caller has done some further async validation work and realized that the inputs are bad.)

Are there any non-third-state errors that are aborted with?

Any error can be the rejection reason. That's the point. When you accept a controlled signal, you are just giving the caller the ability to participate in your function's control flow.

@domenic
Copy link
Member

domenic commented Nov 21, 2021

Stated another way, the fact that a function accepts a signal should not change how you handle its errors. If you expect a function to return error types {X, Y, ...other stuff you don't care about}, and you want special handling for X and Y, then just add special handling for those, and rethrow the rest. It doesn't matter which among X/Y/other stuff come from logic provided by the controller, vs. the function body.

@Jamesernator
Copy link

Jamesernator commented Nov 23, 2021

Just to point out another example, accepting an AbortSignal that can signal any error is not significantly different from accepting a function that can throw an arbitrary error. In either case if you don't know where the signal/function is coming from, then you can't reasonably expect to handle any error it could throw.

i.e. Like this could throw any error:

async function someTask(transformResult=(result) => result) {
    const data = await loadDataSomehow();
    const transformed = await transformResult(data);
    return transformed;
}

If you're not in control of the transformResult function, then you can't really expect to be able to distinguish errors thrown from transformResult vs from within loadDataSomehow. In this way AbortSignal works the same, if you don't have the signal/controller you can't do anything to control how it behaves, or even know how the errors were generated.

@benjamingr
Copy link
Member Author

@Jamesernator that's a good analogy but you'll find few (if at all) places in the DOM that have this behaviour. Event listeners for example "wrap" error handling internally and call the host error handler rather than "throw it back" to the dispatchEvent call.

@domenic I think I was confused about how this works?

If a user does

const ac = new AbortController();
const fetchResult = fetch('.../', { signal: ac.signal });
ac.abort(new TypeError());

Is fetchResult rejected with an ERR_ABORT DOMException and the user can access ac.signal.reason for metadata or is the goal (eventually) for promise APIs to reject with ac.signal.reason?

@annevk
Copy link
Member

annevk commented Nov 23, 2021

The goal is to reject with the reason. #1030 tracks this.

@benjamingr
Copy link
Member Author

@annevk isn't that a big breaking API change potentially changing how code passing an external signal to fetch should behave?

I am wondering if I am missing something since it feels like the API surface of AbortSignal exploded and (more importantly) the responsibility of "what do I throw with" moved from each API to the signal?

Would Node.js still be at liberty to not reject with signal.reason in order to provide stronger guarantees to users regarding what core methods throw?

@annevk
Copy link
Member

annevk commented Nov 24, 2021

I don't see how it's breaking. Any existing code should still work the same way. If one part of your code adopts something new and another part does not, that might result in breakage, but that seems expected.

If Node.js did that it would go against https://dom.spec.whatwg.org/#abortcontroller-api-integration. Note that @jasnell also reviewed this addition.

@jasnell
Copy link
Contributor

jasnell commented Nov 24, 2021

I also don't see it as breaking in the general case. Node.js has, for some time, used our own AbortError instance to signal cancelations, and it will be breaking for Node.js users if/when we start switching those specific cases to use signal.reason or DOMException(AbortErr), but that's a Node.js-specific implementation detail. We'd just need to walk changes there through a proper deprecation cycle.

Node.js (and any application) is at liberty to ignore the new signal.reason if they wish. That will be the case, for instance, for any existing code in the wild that was deployed before reason was added. We shouldn't expect those to be updated and we can't really depend on signal.reason being used at all. Code written after reason was introduced should start making use of it and where those two cases intersect, yes, there may be some edge cases where devs might have to jump through a few hoops.

Where developers might need to bridge those environments, at least in Node.js, it would be a good idea for Node.js' AbortError to support the cause option. That would at least give us the option of doing something like new AbortError('it was canceled!', { cause: signal.reason }) ---- For that matter, it would be a good idea for new DOMException() to support specifying the cause also, but that's a separate conversation).

@benjamingr
Copy link
Member Author

I also don't see it as breaking in the general case. Node.js has, for some time, used our own AbortError instance to signal cancelations, and it will be breaking for Node.js users if/when we start switching those specific cases to use signal.reason or DOMException(AbortErr), but that's a Node.js-specific implementation detail. We'd just need to walk changes there through a proper deprecation cycle.

I am not sure Node.js will align (there will be significant resistance internally), remember: I am not the one who pushed Node's own AbortError and those parties will have to be convinced or we will have to end up with a non-spec-compliant signal. Our consensus based governance is that blocking is very easy and the TSC doesn't like calling shots when there is resistence.

That said: I believe we can handle Node.js - I was more concerned with everyone on the web using fetch though (rather than people in Node.js) - these people are expecting AbortErrors to mean the signal was aborted and cancellation otherwise.

I am honestly kind of terrified by this change and don't have the capacity to "dig into" this since I'm stuck home with a sick kid who won't let me get more than 10+ of computer time :) Once I get more time I'll probably spend it on symbols in signals since that will teach me more about WebIDL and there appears to be consensus.


That said, to avoid the breaking change I think would make sense to:

  • Add a base ExceptionController type and ExceptionSignal.
  • That can signal any sort of exception.
  • Make AbortSignal a subtype of ExceptionSignal (and AbortController of ExceptionController).
  • AbortSignal/AbortController can only raise DOMExceptions of certain types (or even just AbortError) and treat .reason as metadata.
  • Make fetch and other DOM APIs use ExceptionController, people would still be able to pass in AbortController.

That way we fix the problematic terminology, end up with a clearer API and a smaller one with better API guarantees. Then Node.js can choose which APIs get ExceptionController and which do not.

@benjamingr
Copy link
Member Author

Also, while I understand we (node) do not need to fulfill the AbortController integration bits since those only apply to web APIs - I would much rather talk this out (when we can) and figure out what I (or others) are misunderstanding.

The fact Node.js "can" implement its own APIs while staying spec compliant with throw new AbortError() doesn't mean that situation is optimal and my goal (personally) is for Node.js to implement the spec "to the spirit" rather than "to the letter" as much as possible.

@annevk
Copy link
Member

annevk commented Nov 26, 2021

That seems like an awful lot of complexity. Thus far there has been no problem adopting this change for the various operations that consume an AbortSignal today. Unless someone comes forward with an extremely compelling argument, I somewhat doubt we'll revisit this.

Do you still think we need to keep this issue open?

@benjamingr
Copy link
Member Author

This one? Sure - I am still interested in a way to detect an error originated from a signal (tagging or other) so that Node can provide a way for users to migrate from checking `err.name === 'AbortError'.

@annevk
Copy link
Member

annevk commented Nov 26, 2021

The design doesn't allow for that. At least, it's not clear to me how you'd do that for controller.abort(1) or equivalent.

@kanongil
Copy link

The design doesn't allow for that. At least, it's not clear to me how you'd do that for controller.abort(1) or equivalent.

Indeed. Without access to the signal, code that wants to detect abort errors will need to rely on fallible heuristics, as is already done in node (someone could have made an AbortError that is not used for abort signals). Node will have to rely on assumptions / conventions for it to work.

Also, is it really an issue? If someone does controller.abort(1), they will most likely catch any errors themselves, and handle it however they want.

@kanongil
Copy link

Also, is it really an issue?

It can actually be an issue for library code that does have access to the signal, if it uses the heuristics to do abort specific stuff (eg. for logging / avoid retrying a request). Here it needs to use the signal for detection to always work correctly.

@benjamingr
Copy link
Member Author

@kanongil

Also, is it really an issue? If someone does controller.abort(1), they will most likely catch any errors themselves, and handle it however they want.

Apparently, when I went around asking library authors and project members identifying AbortErrors was important to them so I am here representing them figuring out if we can do this.

@annevk

The design doesn't allow for that. At least, it's not clear to me how you'd do that for controller.abort(1) or equivalent.

I would have honestly much preferred if it was only possible to abort with errors (or even just DOMExceptions) - but if it isn't possible to make that happen - then I would be content with a way to solve this case just for errors.

Something like:

// Detect if an error was signaled as a cancellation reason
// throws TypeError if `err` isn't an Error (or DOMException, but preferably error)
// returns true if the error was tagged as signaling cancellation, false otherwise
AbortController.isAbort(err); 

I realize this can be done in Node.js land as a static method - but I don't think this is a Node.js specific concern.

@annevk annevk added addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest labels Nov 26, 2021
@annevk
Copy link
Member

annevk commented Nov 26, 2021

@benjamingr thanks, I think that's a concrete suggestion that could work. Let's leave this open to track interest in that. I think it would help a lot if those library developers could make the case for this feature as at least for me it's somewhat hard to do so given what's been presented thus far. (It might also be worth moving this to a new issue as this contains a lot of other discussion, but I'll leave that decision up to you.)

@benjamingr
Copy link
Member Author

@annevk Thanks - I feel like I need to communicate this. I know you to be very reasonable and I understand it is terrible timing to come here with this problem after the feature landed and James gave an LGTM. I could have raised concerns during that period and missed it - I acknowledge that.

I am not trying to create frustration and I am personally very invested in aiding AbortSignal/AbortController succeed as APIs in many environments. I think this is represented in the (many) APIs we've integrated it into.

I just think I've found an issue with the API change that can cause issues and I feel like I need to make sure we don't create issues for users.

I also acknowledge that it's entirely possible this is just a misunderstanding on my part and that in practice people will only abort with AbortErrors when signaling abort (and other errors when aborting to not actually abort but to communicate other intent). I am just not convinced changing API guarantees like that is safe.

I will ping the people who have spoken up about this in the past.

@benjamingr
Copy link
Member Author

Some members (e.g. @littledan) are on a break so unfortunately it's going to be hard to get their feedback.

Pinging @benlesh @ronag @bterlson @mcollina @jakearchibald in the meantime - if no one response I will ping 4 more (in level of engagement in the past to not create a lot of pings) :) Let me know if you want additional context and I'll write a summary.

@benlesh
Copy link

benlesh commented Nov 26, 2021

I think the design was too specialized before this discussion, and it seems to be getting more specialized.

Personally, I'm unsure of the value of passing a reason to .abort. Generally, I think it would be great if AbortSignal were just Signal. And merely a remote way to flip a flag and fire and event. What code does after it sees the flag is up to that code. It would be more useful beyond just cancellation, etc.

Providing a new AbortError type that people can throw and check for should be sufficient for identifying cancellation. And they could also use it for their own cancellation mechanisms, which definitely already exist in the wild now.

@domenic
Copy link
Member

domenic commented Nov 26, 2021

I'm still strongly against the direction @benjamingr is pushing and believe it's based on a misunderstanding of this feature. But I've made that case already in these comments #1033 (comment) so I won't repeat myself. I just wanted to be clear I am against any additions to address this use case since I think it's a bad use case.

@Jamesernator
Copy link

I generally agree with @domenic on this, although I might be more convinced about the need with some concrete examples of code that can't access the signal but still needs to observe cancellation.

Being able to identify cancellation from sources you don't have a signal/controller for seem a lot like arbitrary stack inspection, like the fact that an abort controller/signal is even used should generally be transparent to you unless you're actually in a position to observe it. The one possible exception to this is promise-accepting APIs, however these are pretty rare, more often things will take a function which returns a promise in which case you should have an opportunity to pass in an abort signal/controller.

This is just an example, it can be worked around (by for example making the signal global, injecting the signal to all components) but that would create pretty ugly code from what I recall.

While I do agree with the sentiment, the fact that cancellation had to be done in userland and efforts at TC39 failed, means that wiring abort signals through all functions is just what is neccessary.

I would certainly still like to see some revival of such integration of cancellation, but I'm not sure it's likely to happen. If function decorators happen at some point in future, it may be possible to create such a thing in userland (especially easy if we could get an await hook or something in decorators).

This is yet-another place where Zone-style functionality would help, because frameworks could create zones that can vend abort controllers/signals associated to the Zone and ensure that errors don't escape. I dunno what happened to the other proposal that added a couple smaller primitives similar to zones, but that sort've functionality would definitely enable such things as well.

@domenic
Copy link
Member

domenic commented Nov 28, 2021

still needs to observe cancellation.

Let's be clear: the use case being asked here is not observing cancelation. It's observing who caused an arbitrary exception.

In the following example:

const controller = new AbortController();

const resultPromise = fetch(url, { signal: controller.signal });

setTimeout(() => {
  controller.abort(new TypeError("actually, it turns out the URL was bad in some application-specific sense, and we didn't validate it until now!"));
}, 10);

resultPromise is not being canceled. It is failing, with a TypeError, because url was invalid. The fact that this is coming from the controller, instead of from fetch(), is irrelevant.

Code that cares about cancelations (abortions) should check for e.name === "AbortError". Code that cares about timeouts should check for e.name === "TimeoutError". Code that cares about input validation errors should check for e.name === "TypeError". It doesn't matter whether that came from the controller, or the fetch() implementation itself. For example, if instead of checking for e.name === "AbortError", you checked for e === signal.reason, you would miss aborted fetches due to other parts of the fetch() spec, e.g. those due to the page unloading or the body stream being canceled.

Fundamentally, code which is inspecting the implementation details of how a given exception appeared, is unsound. I like @Jamesernator's comparison to stack inspection; you shouldn't need to do either such thing in order to handle exceptions.

@kanongil
Copy link

kanongil commented Nov 28, 2021

@domenic Checking the error is not sufficient if you don't control the signal. When you expose an API that takes a signal, you no longer control it. Eg. for this, you now1 need to check the signal to safely drop retries:

export async function retryableGetBlob(url, { signal, retries } = {}) {
  try {
    const res = await fetch(url, { signal });

    if (res.status !== 200 && res.status !== 204) {
       throw new BadResponseError(res.status);
    }

    return await res.blob();
  }
  catch (err) {
    if (signal?.aborted) {
      throw signal.reason !== undefined ? signal.reason : new DOMException('User abort', 'AbortError');
    }

    if (!retries || isFatalError(err)) {
      recordUpstreamFailureMetric(err);
      throw err;
    }

    // retry on soft errors

    return retryableGetBlob(url, { signal, retries: retries - 1 });
  }
}

Here you don't want to call recordUpstreamFailureMetric() on API caller aborts, no matter how it is aborted. Additionally without the guard, an ac.abort(/* <something that looks like a soft error> */) would repeatedly call fetch with an aborted signal until retries === 0.

This is probably how it should be, and I like the power that signal.reason provides the AbortController owner. It just means that API providers will need to check signal.aborted before checking the error when calling an API with a forwarded signal. Similar to how API providers need to check signal.aborted after calls to async methods that don't support cancellations.

Footnotes

  1. once signal.reason is used in fetch

@benjamingr
Copy link
Member Author

@kanongil I think @domenic 's point is that the fact a signal originated from a controller is by design opaque to you and cancellation is still going to always be .name === 'AbortError'. So his point is that you shouldn't check this in instrumentation.

I think this is probably fine but the guidance (always throw signal.reason) is going to receive significant pushback in Node.js. In particular people were fine with cancellation but the ability to throw any error into the API will likely receive pushback since in Node.js APIs are very particular about what errors they throw and any error changes are semver-major and this means "every API taking a signal can throw anything if you don't control the signal".

@benjamingr
Copy link
Member Author

(I'm reaching out to library authors and core members who strongly objected to this in the past to see if their opinion changed and we can just close this)

@ronag
Copy link

ronag commented Nov 28, 2021

I’m a little concerned that this will break assumptions and make the flow harder to reason about.

How about keeping it as AbortError but reason goes as either the message or a property on the aborterror? Or maybe use the existing cause property?

@jasnell
Copy link
Contributor

jasnell commented Nov 28, 2021

That seems backwards. If anything, code listening for the abort would throw their own errors with signal.reason as the cause. E.g.

signal.on('abort', () => {
  throw new Error('...', { cause: signal.reason });

@benjamingr
Copy link
Member Author

@ronag that is likely what we'll do but that is not the guidance and in my experience when there were misunderstandings like this it was 90% just a misunderstanding. Intuitively this indeed does greatly increase the API surface of all of our APIs taking signals.

The guidance to throw signal.reason makes sense for stuff like AbortSignal.timeout since it means that APIs like fetch don't need to change and it's a generic mechanism to throw exceptions into stuff./

@jasnell note that in this case signal.addEventListener('abort' consuming code is expected to throw signal.reason and not throw new AbortError('...', { cause: signal.reason }). The spec is pretty specific:

Convey that the operation got aborted by rejecting the promise with AbortSignal object’s abort reason.

So while we can keep throw new AbortError('...', { cause: signal.reason }) because our APIs (like fs.readFile aren't web APIs) that does not move Node.js closer to the web platform nor does it enable universal code nearly as much.

@annevk
Copy link
Member

annevk commented Jul 27, 2022

I'm closing this as working as designed. We decided that AbortSignal is a way to signal an arbitrary exception. (Arguably this could be further clarified in the standard. That would best be tracked by a new issue at this point.)

@annevk annevk closed this as completed Jul 27, 2022
@benlesh
Copy link

benlesh commented Dec 15, 2022

Sorry for the very, very late response here, but actually this likely is fine for RxJS's needs and may help discriminate cancellations that are expected from unexpected cancellations.

For RxJS's part, we might internally ac.abort(someExpectedCancellationInstance) or something, and then we can check for our own known cancellation. We could then even discern between expected and unexpected cancellations within RxJS itself, and/or discern between RxJS cancellation and external cancellations from other apis.

Now, that said, the only real concern I see is the fact that in many cases ac.signal.reason might be keeping an Error instance around for a long time, and Errors tend to retain everything that was in scope when they're created in memory. But that's more of an implementation detail I suppose.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest topic: aborting AbortController and AbortSignal
Development

No branches or pull requests

8 participants