Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a timeout option, to prevent hanging #951

Closed
ianstormtaylor opened this issue Oct 12, 2019 · 17 comments
Closed

Add a timeout option, to prevent hanging #951

ianstormtaylor opened this issue Oct 12, 2019 · 17 comments

Comments

@ianstormtaylor
Copy link

@ianstormtaylor ianstormtaylor commented Oct 12, 2019

I'd like to propose adding a timeout option to the Fetch API.

Prior issue: I know this was in #20, but that discussion was long and winding. One of the blockers was aborting requests, which was a big rabbit hole, but is now solved with AbortController! So I'd like to open this fresh issue to discuss the importance and ergonomics of timeouts.

Please bare with me! I just want a blank slate to argue the case…

Timeouts are critical!

Before talking implementation… it's important to reiterate just how critical timeouts are. Since TCP/HTTP is inherently unpredictable, you can't actually guarantee that any call to fetch will ever resolve or reject. It can very easily just hang forever waiting for a response. You have no guarantee of a response. To reiterate:

Any HTTP request without a timeout might hang your code forever.

That should scare you if you're trying to write stable code. Because this applies at all levels of the callstack. If you call into an async function that calls fetch without a timeout, you can still hang forever.

The tried-and-true solution for this uncertainty is timeouts. Pretty much every HTTP-requesting library has a way to use them. As long as you've set a timeout for a response, you've returned to a state of certainty. Your request might fail, sure, but at least you're now guaranteed not to be left in an infinitely hanging state.

That is the critical piece: timeouts eliminate uncertainty.

I think Python's requests documentation sums up the severity nicely (emphasis mine):

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests. Failure to do so can cause your program to hang indefinitely.

It underscores just how widespread the use case is. If your code is striving to be stable and bug-free, then every call to fetch should have an associated timeout.

But to make that goal possible, specifying a timeout needs to be easy. Right now it's not.

Prior concern: In #20, people brought up other types of timeouts, like "idle timeouts", which ensure that at least some data is received for a request every interval of time. These can be useful, but they are definitely not the 90% case. And importantly, they don't actually eliminate the infinite hanging uncertainty. Other types of timeouts can either be implemented in userland or taken up in a separate issue if there's a need.

AbortController is great, but not enough.

The AbortController and AbortSignal features are great! And they seem really well designed for use across different APIs and for low-level use cases. I have no qualms with them at all, and I think they are perfect to use to make a timeout property available.

But I don't think that just adding them has solved "timeouts" for Fetch. Because they aren't ergonomic enough to offer a good solution for the common case.

Right now you'd need to do...

const controller = new AbortController()
const timeoutId = setTimeout(() => controller.abort(), 5000)
const res = await fetch('https://example.com', { signal: controller.signal })
const body = await res.json()
clearTimeout(timeoutId)

This is a lot of code.

It's too much code for a use case that is so common. (Remember, this is something that should be done for almost every call to fetch.) Not only is it a lot, it requires learning and mastering a new AbortController concept, just to get a guarantee of certainty for fetch. For most users this is unnecessary.

And it's easy to get wrong too!

Most people want to wait for the entire body to be received (notice how the timeout is cleared after res.json()), but most examples of using AbortController in the wild right now do not properly handle this case, leaving them in an uncertain state.

Not only that, but prior to AbortController (and still now) people would use libraries like p-timeout and almost certainly added unexpected bugs because it is common to see people recommend things like:

const res = pTimeout(fetch('https://example.com'), 5000)
const body = await res.json()

That example also has the potential to hang forever!

What's the ideal UX?

Most people are currently using either no timeouts, or incorrectly written timeouts that still leave their code in a potentially infinitely hanging state. And these subtle bugs are only getting more common as more and more people switch to using the isomorphic-fetch or node-fetch packages. (And remember, this bubbles up the callstack!)

I think this is something that fetch should solve. And to do that, we really need something as simple as:

fetch('https://example.com', {
  timeout: 5000
})

It needs to be simple, because it needs to be something you can add to any call to fetch and be guaranteed that it will no longer hang forever. Simple enough that is "just works" as expected. Which means that if people are reading the body (and they often are), the timeout should cover you when reading the body too.

Prior concern: In #20, people brought up that because fetch breaks responses down into "headers" and "body" across two promises, it's unclear what a timeout property should apply to. I think this is actually not a problem, and there's a good solution. (Keep reading! It's a solution that is used in Go for their Timeout parameter.)

A potential solution…

For the timeout option to match user expectations, it needs to incorporate reading the full body when they do.

This is just how people think about HTTP responses—it's the 90% use case. They will absolutely expect that timeout encompasses time to read the entire response they're working with, not just the headers. (This mental model is also why people are incorrectly libraries like p-timeout to add timeouts right now.)

However! Keep reading before you assume things…

The decision does not have to be the black-and-white "either the timeout only applies to the headers, or it only applies to the body" that was dismissed in #20. It can just as easily apply to either just the headers, or both the headers and the body, in a way that ensures it always meets user expectations and gives them the guarantee of certainty.

This is similar to how Go handles their default Timeout parameter:

Timeout specifies a time limit for requests made by this Client. The timeout includes connection time, any redirects, and reading the response body. The timer remains running after Get, Head, Post, or Do return and will interrupt reading of the Response.Body. Source

This is smart, because it allows timeout to adapt to how the user's code handles the response. If you read the body, the timeout covers it. If you don't you don't need to worry either.

Here's what that looks like in terms of where errors are thrown…

const res = await fetch('https://example.com', { timeout: 5000 })
// Right now different types of network failures are thrown here.
//
// With a timeout property, timeout failures will also be thrown and 
// thrown here *if* the headers of the request have not been received 
// before the timeout elapses.
const json = await res.json()
// Right now network failures *and* JSON parsing failures are thrown here.
// 
// With a timeout property, timeout failures will also be thrown and 
// thrown here *if* the headers of the request were received, but the
// body of the request was not also received within the timeout.

To summarize what's going on there in English...

The timeout property can be passed into fetch calls, which specifies a total duration for the read operations on a response. If only the headers are read, the timeout only applies to the headers. If both the headers and body are read the timeout applies to the headers and body.

A real example.

So you can do:

const example = () => {
  const res = await fetch('https://example.com', { timeout: 10000 })
  const json = await res.json()
  return json
}

Which guarantees you received that full JSON body in 10 seconds.

Or, if you don't care about the body...

const example = () => {
  const res = await fetch('https://example.com', { timeout: 1000 })
  return res.ok
}

Which guarantees you receive the headers in 1 second.

This aligns well with user expectations and use cases. As far as a user is concerned, they can set a timeout property, and be covered from network uncertainty regardless of how much of the response they read. By setting timeout, they are saved from the infinitely hanging bug.

What about $my_edge_case?

To be clear, I'm not saying that this timeout option handles every single need for timeouts under the sun. That's impossible.

There are concepts like "idle timeouts" that are unrelated to the uncertainty of TCP. And there are always going to be advanced use cases where you want a timeout for only the initial connection, or only for reading the headers, etc.

This proposal does not try to address those advanced cases.

It's about improving the ergonomics for the 90% of cases that should have a timeout already today, but likely don't.

The timeout option is… optional. It's opt-in. And when you do opt-in to it, it gives you the guarantee of certainty—that your calls to fetch() and res.json() will no longer hang indefinitely, no matter how much of the response you choose to read or not. That's it job. And it does it in a way that matches user expectations for the 90% use cases.

Anyone who needs more explicit/complex timeout logic can always use the existing AbortController pattern, which is designed well enough to handle any use case you can throw at it.

Why not do it in userland?

It's already possible to do in userland. But people aren't doing it correctly because right now the API doesn't have good ergonomics. Most of the popular code samples that show adding timeouts to fetch do it incorrectly, and leave open the infinite hanging failure mode. And that's not even counting the code that just doesn't add timeouts because fetch's current ergonomics don't make it easy.

It's extremely hard to debug these infinite hangs when they happen, because they can happen at any layer of the callstack above where fetch is called. Any async function that has a dependency call a buggy fetch can hang indefinitely.

It's critical that people use timeouts. And to make that happen it's critical that it be easy.

Asking people to master entirely new AbortController and AbortSignal concepts for the 99% use case of timing out a request is not a smart thing to do if you're looking to help people write stable code. (Nothing wrong with those two concepts, they just shouldn't be involved in the common case because they are super low-level.)

And the argument that "fetch is a low-level API" also misses the point. People are increasingly using fetch, isomorphic-fetch, node-fetch, etc. in non-low-level places. They are using it as their only HTTP dependency for their apps, because bundle size is important in JavaScript.

And because timeouts are not handled nicely in the spec, those polyfill libraries are incapable of adding timeout options themselves.

In summary…

  • Any call to fetch without a timeout is not guaranteed to resolve or reject.
  • This is has been brought up in many, many issues in fetch polyfills.
  • The existing ways to add timeouts to a fetch aren't ergonomic enough.
  • This leads to subtle bugs and widespread incorrect example code.
  • There's a good solution for a timeout option that meets user expectations.
  • The solution doesn't preclude future solutions for other timeout needs. (Or userland.)
  • This solution is guaranteed to eliminate network uncertainty, that's the goal.
  • It explicitly doesn't try to solve for 100% of timeouts—just for the common 99%.

Thanks for reading!

@domenic
Copy link
Member

@domenic domenic commented Oct 12, 2019

I don't think the code example comparisons are fair. Here is something more realistic:

const controller = new AbortController();
setTimeout(() => controller.abort(), 5000);

fetch('https://example.com', { signal: controller.signal });

or, if you have a tiny utility library

fetch('https://example.com', { signal: timeoutSignal(5000) });

compared to

fetch('https://example.com', { timeout: 5000 });

If anything, it might make sense to add something like AbortSignal.timeout(5000).

But if you are making a purely ergonomic argument, it's best not to exaggerate your point with unnecessary const signal = or clearTimeout() lines, or to ignore the potential for libraries to allow saving yourself 2 lines of code.

@Pauan
Copy link

@Pauan Pauan commented Oct 13, 2019

@ianstormtaylor The problem with your solution is that it only covers the common case but falls apart in even slightly uncommon cases. For example:

const res = await fetch('https://example.com', { timeout: 10000 });

const shouldFetch = await checkSomething(res);
if (shouldFetch) {
    return await res.json();
}

In this case the timeout will cover only the headers, not the body. This is very surprising and unexpected, it will lead to subtle bugs that are hard to find and debug.

That's why I said this issue is tricky, and it's why I still advocate for explicitly indicating the scope of the timeout. This makes it absolutely 100% clear what is covered by the timeout, so it prevents bugs. And the code is short, so it's not a big burden on the developer.

So if you're going to have a timeout property, it cannot have ambiguous behavior, it must have very clear behavior. For example, timeout could be specified so that it always applies to the body, and so if you don't want that then you'll need to manually use AbortController.

@ianstormtaylor
Copy link
Author

@ianstormtaylor ianstormtaylor commented Oct 13, 2019

@domenic Sorry! My goal was not to exaggerate things at all. My example was actually copied and modified from #20 (comment) and I didn't try to optimize it. You're totally right that it can be simplified a little bit though—I've just edited the initial example to simplify it! Thanks.

I feel like the gravity of the situation didn't come across though…

This is something that needs to happen for pretty much every fetch call in an application, or you leave yourself open to infinite hanging bugs.

For that level of severity, I do not think it's possible that the current AbortController API is going to get us there on its own. Again, I have nothing against it. It's extremely flexible, and seems designed well for extensibility and for low-level use. But there's no way it's going to be clear enough to be used for every request in an application. The first example you gave is confusing because abort appears to be called no matter what, since it's never cleared.

And I'm not talking about utility libraries on purpose.

Again… I'm talking about the average use of fetch resulting in a program hanging indefinitely. Having to import a library just to solve that means that fetch on its own isn't really useable as a solution for people. That's severe! It means that people directly using isomorphic-fetch in their app are likely not writing stable code.


@Pauan I think you've missed the core of what I'm saying. (Not to mention the example you gave seems kind of incomplete, and doesn't seem grounded in any real-world use cases.)

I'm talking about how the fetch library can eliminate its current limitation where it has infinite hanging uncertainty. If you look at your example again, you'll notice that the solution I proposed has actually eliminated the uncertainty. The code you written, given the timeout option, is guaranteed to execute to completion, and not hang indefinitely. When the headers only are read, it will timeout for the headers. And in the if statement when the body is read, it will timeout for both the headers and body. Guaranteeing that its caller will receive control again no matter what happens to the HTTP response.

That's the whole point.

It seems like you're not thinking in terms of the user's expectation. The timeout property in the solution I proposed does actually have clear behavior from the user's point of view. It will always ensure that, whatever the amount of data I read from the response is, it always completes in a given duration. And it never results in infinitely hanging.


Again… the core of what I'm talking about is that the current fetch API makes it extremely easy to write code that will hang indefinitely. And people are currently unaware of these bugs.

This problem is widespread. This blog post comes up as one of the first results for "fetch javascript timeout" in Google, but it actually still can hang indefinitely when you await res.json().

This accepted stackoverflow answer has the same issue, it can infinitely hang. This is widespread.

Having a simple timeout option seems critical to me to avoid these kinds of bugs. Otherwise we're going to have to start telling people to avoid using fetch directly, which is really unfortunate. To write good HTTP-related code you have to use timeouts, there's no two ways about it. And the AbortController API we have currently doesn't make that easy.

@steida
Copy link

@steida steida commented Oct 13, 2019

I believe timeout logic belongs to the userland. Any browser async callback potentially can be postponed to forever. I can't imagine handling everything feature by feature. Maybe I am overlooking something and maybe it can be useful to cancel request but as far I understand it, Promise like API cancelation is breaking the promise itself.

https://gcanti.github.io/fp-ts-contrib/modules/Task/withTimeout.ts.html

Related https://medium.com/@benlesh/promise-cancellation-is-dead-long-live-promise-cancellation-c6601f1f5082

@steida
Copy link

@steida steida commented Oct 13, 2019

Can you please help me to understand why I am wrong? Thank you.

@ianstormtaylor
Copy link
Author

@ianstormtaylor ianstormtaylor commented Oct 13, 2019

@steida (and others) please don't bring up and force us to revisit promise cancellation in this thread—it's unrelated and off topic. That problem has been solved by AbortController, and was one of the things that derailed #20 and I don't want to get into it. Thanks.

@ianstormtaylor ianstormtaylor changed the title Add a timeout option Add a timeout option, to prevent hanging Oct 13, 2019
@Pauan
Copy link

@Pauan Pauan commented Oct 13, 2019

@ianstormtaylor Not to mention the example you gave seems kind of incomplete, and doesn't seem grounded in any real-world use cases.

That's not the point, of course it's a contrived example. My point is that (with your solution) people will eventually write code similar to that, and their code will now be broken, but they won't realize that it's broken. That's a big problem.

It seems like you're not thinking in terms of the user's expectation.

I am well aware of the user's expectations. My point is that your proposal does not work, from a technical perspective.

If you look at your example again, you'll notice that the solution I proposed has actually eliminated the uncertainty. The code you written, given the timeout option, is guaranteed to execute to completion, and not hang indefinitely. When the headers only are read, it will timeout for the headers. And in the if statement when the body is read, it will timeout for both the headers and body. Guaranteeing that its caller will receive control again no matter what happens to the HTTP response.

You have to look at it from the implementation's perspective. With your proposal, if the user does not access the body, then the timeout is cancelled. If the user does access the body, then the timeout is extended to cover the body.

But in my example, it uses await checkSomething(), which can take an arbitrary amount of time. So let's say that await checkSomething() takes 2 seconds. During that time, fetch does not know that you are going to later call res.json(). So it doesn't know whether it should cancel the timeout or not.

Remember, at this point fetch has already succeeded, and for the next 2 seconds you aren't calling res.json(). So fetch would have to wait for an indefinite amount of time, just in case you later call res.json(). That's the problem. It needs to be deterministic.

@ianstormtaylor
Copy link
Author

@ianstormtaylor ianstormtaylor commented Oct 13, 2019

@Pauan the proposal absolutely works from a technical perspective. And I know it does—if you read my opening issue carefully—because it's the technique that Go already uses for the Timeout parameter to its HTTP client.

This proposal gives people a very clear contract:

If you pass in the timeout option when using fetch, it will be used to guarantee that all of the data you read from the response will be read within that timeout's duration—whether you read just the headers, or the headers and the body.

That's actually exactly the contract people want in all the most common cases of using fetch. (As evidenced by Go using it for their default timeout logic too.)

It's beautiful because no matter what you do with the response, you're guaranteed that the request will never hang indefinitely while you're working with it. Eliminating that inherent uncertainty from TCP is the goal when using a timeout in the first place.


@Pauan With your proposal, if the user does not access the body, then the timeout is cancelled. If the user does access the body, then the timeout is extended to cover the body. … So fetch would have to wait for an indefinite amount of time, just in case you later call res.json(). That's the problem. It needs to be deterministic.

That's not true. The timeout is not cancelled, it continues to tick until it finishes, or the user has fully read the body. The way the timeout works is actually pretty simple:

  • fetch is called, a timeout is started.
    • Timeout finishes first: fetch is rejected with a timeout error.
    • Timeout doesn't finish first:
      • fetch is resolved.
      • If the user never reads the body, that's it! They're done. If they do, then…
      • json is called:
        • Timeout already finished: json is rejected with a timeout error.
        • Timeout finishes first: json is rejected with a timeout error.
        • Timeout doesn't finish first:
          • json is resolved.

If you look at the different cases that people are using fetch for, you'll see that this system actually works nicely:

  • 90% case — using fetch and reading the entire body of the response, the timeout option covers everything that's being read, in this case the body.

  • 9% case — using fetch and reading just the headers of the response, the timeout option covers everything that's being read, in this case just the headers.

  • <1% case — using fetch and sometimes reading the headers only, but sometimes reading both the headers and the body, the timeout option still covers you for all of the data you're reading, in this case either just the headers or the headers and the body.

Covering everything that's being read—and explicitly not choosing just the headers, or just the body—is actually the goal of the timeout. If you don't cover both, then you still leave open the surface for hanging indefinitely. That's why my proposal is what it is.

You're describing an extreme edge case in the 1% of cases above. It's very easy to document (as Go has done) that timeout applies to reading all of the data from the response. And in that case, people should not make potentially endless await calls between reading the headers and reading the body if they have opted in to using timeout. That's a completely reasonable restriction to make, on an extreme edge case, to solve for the 99% cases well.

@sibelius
Copy link

@sibelius sibelius commented Oct 13, 2019

userland

@jimmywarting
Copy link

@jimmywarting jimmywarting commented Oct 13, 2019

Agree that it should be in the userland, it's like you said

To be clear, I'm not saying that this timeout option handles every single need for timeouts under the sun. That's impossible.

AbortController can... fetch is a lower level API

This is something that needs to happen for pretty much every fetch call in an application, or you leave yourself open to infinite hanging bugs.

If you want it to apply to every request then you should look into using service worker and modify each request (acting a bit as a MITM).
Then you can control more then just ajax calls. You can then handle styles, images and other assets too

Or make this implementation yourself in a mini-lib like @domenic mention.

most ppl basically wrap fetch in there own http-client since they need to inject some authentication in each request, then you could as well create a abort controller if they pass in a timeout option

@ianstormtaylor
Copy link
Author

@ianstormtaylor ianstormtaylor commented Oct 13, 2019

@sibelius More constructive please? Saying "do it in userland" misses the entire point of this issue.

It's already possible to do in userland—I know that. But people aren't doing it correctly because the API doesn't have good UX. Most of the popular code samples that show adding timeouts to fetch do it incorrectly, and leave open the infinite hanging failure mode. And that's not even counting the code that just doesn't add timeouts because fetch's current ergonomics don't make it easy.

It's extremely hard to debug these infinite hangs when they happen, because they can happen at any layer of the callstack above where fetch is called. Any async function that has a dependency call a buggy fetch can hang indefinitely.

It's critical that people use timeouts. And to make that happen it's critical that it be easy.

Asking people to master entirely new AbortController and AbortSignal concepts for the 99% use case of timing out a request is not a smart thing to do if you're looking to help people write stable code. (Nothing wrong with those two concepts, they just shouldn't be involved in the common case because they are super low-level.)


@jimmywarting Saying "fetch is a low-level API" also misses the point. It's sticking your head in the sand. People are increasingly using fetch, isomorphic-fetch, node-fetch, etc. in non-low-level places. They are using it as their only HTTP dependency for their apps, because bundle size is important in JavaScript.

And because timeouts are not handled nicely in the spec, those polyfill libraries are incapable of adding timeout options themselves.


The options are:

  1. Make using timeouts with fetch more ergonomic. That's what this proposal would do. (There may be alternate ways to do it, I'm all ears.)

  2. Recommend that people do not use fetch, and always use a wrapper like axios instead. And not just "those people who want timeouts", because as I've been saying that is everyone. You literally have to recommend that unless you are using AbortController and AbortSignal, you should not use fetch. That's extreme. And is not going to happen. It's already too popular, and for good reason. People want a nice standard library that helps them write good code. And it's also not good for the ecosystem, because have a common fetch primitive to use with good ergonomics goes a long way in terms of being a building block for other libraries.

  3. Do nothing and let the issue spread to fetch-using packages. As more and more people (and libraries) use fetch under the covers, the indefinite hangs will become more and more common. You'll have to be wary of calling any async function that might delegate to fetch under the covers unless you know that it handles timeouts properly, and most of them won't have. Eventually people will start standardizing on a timeout: number-like property. (See node-fetch, which already has this issue.) But they won't all implement it the same. So the abstraction will leak further, because you'll need to know which timeout wrapper they've used.

@annevk
Copy link
Member

@annevk annevk commented Oct 14, 2019

I certainly appreciate there's a demand for timeout support, but opening duplicates is not the way to go about it. #179 (comment) still applies, though we should maybe explore some alternatives as @domenic proposes.

I would also recommend that you keep your posts brief and to the point. Take the time to edit and elide unnecessary information so the 140 folks following the conversation don't have to.

@annevk annevk closed this as completed Oct 14, 2019
@ianstormtaylor
Copy link
Author

@ianstormtaylor ianstormtaylor commented Oct 14, 2019

@annevk sorry! I read a lot of the different offshoot issues from #20, but somehow missed #179, or I absolutely would have just commented there instead. Thanks.

I would also recommend that you keep your posts brief and to the point. Take the time to edit and elide unnecessary information so the 140 folks following the conversation don't have to.

Fair enough. I did do a lot of editing actually. Based on #20 (not having seen #179) I was of the impression that the general opinion was "userland only, timeouts are solved". And when you're arguing to revisit a contentious topic, if you aren't thorough people will close it saying "we've already covered this". It's a bit of a catch-22.

From #179 it seems like you were after more information, so maybe some of the initial writeup can be helpful. Looking forward to timeout support!

@GrosSacASac
Copy link

@GrosSacASac GrosSacASac commented Dec 2, 2019

But people aren't doing it correctly because the API doesn't have good UX. Most of the popular code samples that show adding timeouts to fetch do it incorrectly, and leave open the infinite hanging failure mode.

It is inevitable, the numbers of Web API's and people is too big.

@Stevemoretz
Copy link

@Stevemoretz Stevemoretz commented May 19, 2021

Not happened yet at 2021?

@GrosSacASac
Copy link

@GrosSacASac GrosSacASac commented May 19, 2021

abort signal is the way

@karlhorky
Copy link

@karlhorky karlhorky commented May 27, 2022

AbortSignal.timeout looks great, thanks to everyone who worked on speccing and implementing this! (whatwg/dom#1032, w3ctag/design-reviews#711)

// Fetch the URL, cancelling after 8 seconds
fetch(url, { signal: AbortSignal.timeout(8000) });

Apparently available in Firefox 100 and Chrome 103 beta!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

10 participants