Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache revalidation requests cause DOMException [AbortError] due to AbortController reuse #54045

Closed
1 task done
advdv opened this issue Aug 15, 2023 · 13 comments · Fixed by #54533
Closed
1 task done
Labels
bug Issue was opened via the bug report template. locked

Comments

@advdv
Copy link

advdv commented Aug 15, 2023

Verify canary release

  • I verified that the issue exists in the latest Next.js canary release

Provide environment information

Operating System:
      Platform: darwin
      Arch: arm64
      Version: Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:19 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T8103
    Binaries:
      Node: 18.17.1
      npm: 9.8.1
      Yarn: 1.22.19
      pnpm: 8.6.1
    Relevant Packages:
      next: 13.4.16-canary.1
      eslint-config-next: 13.4.12
      react: 18.2.0
      react-dom: 18.2.0
      typescript: 5.1.6
    Next.js Config:
      output: N/A

Which area(s) of Next.js are affected? (leave empty if unsure)

caching, data fetching

Link to the code that reproduces this issue or a replay of the bug

I've skipped adding this because I've spend hours in the debugger and know exactly which part of the nextjs codebase causes this error and why (see below) .

To Reproduce

I've skipped adding this because I've spend hours in the debugger and know exactly which part of the nextjs codebase causes this error and why (see below) .

Describe the Bug

With the "time-based" cache revalidation strategy: https://nextjs.org/docs/app/building-your-application/data-fetching/fetching-caching-and-revalidating#time-based-revalidation NextJS will issue an additional fetch to revalidate (refresh) the cache.

This "revalidation" fetch call needs to be identical to the original and as such the "init" option of the original fetch are cloned and used over here:

return originFetch(input, clonedInit).then(async (res) => {

the problem is that this cloning behaviour also copies over the .signal (AbortController) property from the original fetch. Since the revalidation fetch runs long after the original fetch this abort signal will almost definitely have been triggered and the cache revalidation fetch will fail immediately with:

DOMException [AbortError]: This operation was aborted
    at Object.fetch (node:internal/deps/undici/undici:11576:11)

This error will only be plainly reported at

doOriginalFetch(true).catch(console.error)
without any further information. Which makes it very frustrating to debug.

Expected Behavior

  • The .signal should not be cloned into the cache revalidation's fetch
  • Any error from revalidation fetches should be reported more extensively so debugging is not so painful

Which browser are you using? (if relevant)

Not relevant, this happens on the server, locally

How are you deploying your application? (if relevant)

Strangely, this behaviour does not happen when deployed to Vercel (I haven't tested any other platform)

@advdv advdv added the bug Issue was opened via the bug report template. label Aug 15, 2023
@icyJoseph
Copy link
Contributor

I guess when deployed to Vercel, the serverless deploy kills the JS environment.

How could I try to reproduce this though? Setup a Server Component that fetches something, using an abort controller, and call the abort method? Then later on when revalidate happens it crashes?

@icyJoseph
Copy link
Contributor

icyJoseph commented Aug 18, 2023

Alright, doing a build, with a fetch that's immediately aborted printed this:

warn - No utility classes were detected in your source files. If this is unexpected, double-check the `content` option in your Tailwind CSS configuration.
warn - https://tailwindcss.com/docs/content-configuration
- info Creating an optimized production build
- info Compiled successfully
- info Linting and checking validity of types
- info Collecting page data
[    ] - info Generating static pages (0/4)DOMException [AbortError]: This operation was aborted
    at Object.fetch (node:internal/deps/undici/undici:11457:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
DOMException [AbortError]: This operation was aborted
    at Object.fetch (node:internal/deps/undici/undici:11457:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
- info Generating static pages (4/4)
- info Finalizing page optimization

Route (app)                                Size     First Load JS
┌ ○ /                                      137 B          78.6 kB
└ ○ /favicon.ico                           0 B                0 B
+ First Load JS shared by all              78.4 kB
  ├ chunks/596-8d4c5184540d5955.js         26.1 kB
  ├ chunks/fd9d1056-a99b58d3cc150217.js    50.5 kB
  ├ chunks/main-app-c83e970ee4010e4a.js    219 B
  └ chunks/webpack-eb85a1db5fb9bd78.js     1.64 kB

Route (pages)                              Size     First Load JS
─ ○ /404                                   182 B          76.4 kB
+ First Load JS shared by all              76.3 kB
  ├ chunks/framework-8883d1e9be70c3da.js   45.1 kB
  ├ chunks/main-7125f0e2f16d2afa.js        29.4 kB
  ├ chunks/pages/_app-52924524f99094ab.js  195 B
  └ chunks/webpack-eb85a1db5fb9bd78.js     1.64 kB

○  (Static)  automatically rendered as static HTML (uses no initial props)

✨  Done in 9.89s.

With a single page with these contents:

export default async function Home() {
  const controller = new AbortController();

  const poke = await fetch("https://pokeapi.co/api/v2/pokemon/ditto", {
    signal: controller.signal,
    next: {
      revalidate: 10,
    },
  });

  controller.abort();

  return <div>Hi</div>;
}

@advdv
Copy link
Author

advdv commented Aug 18, 2023

It's not exactly the behaviour I had but it does illustrate that the AbortController is more liberally copied between fetches than it should be.

@icyJoseph
Copy link
Contributor

Right, at least I have a quick way to fail and trigger that issue. Let me poke around a bit. Great job finding the lines to analyse! That can take several hours 😓

@icyJoseph
Copy link
Contributor

icyJoseph commented Aug 18, 2023

This is a bit tricky. There is a valid reason to keep the signal in, and pass it to the original. For example if you race a bunch of fetch-promises, and one of them completes, or if you do promise all and one fails, you'd want to cancel the others, using a signal.

So, I believe that, not passing the signal to the original fetch is not an option.

My reproduction example is not really representative of your issue, other than the opaque logging.

As far as I understand, when revalidate happens, you somehow end up with a stale signal, right? Which was already aborted.

I am gonna have to ask you for a more complete code reproduction, because mine is not good enough, and it is not showing me any way forward. It seems to me like you have a subtle memory leak rather, but I want to see code, even pseudo-code would be fine.

@advdv
Copy link
Author

advdv commented Aug 18, 2023

Your reproduction example should also trigger my condition. That is: with "revalidate" set to 10 seconds, run it locally and warm up the cache. Then wait 10 seconds and refresh to trigger the "revalidate fetch" that should now fail with the aborted error.

I can't provide you with my reproduction example because our team did not go with nextjs in the end (we ran into another issue with the tag-based caching as well, and seeing the code in patch-fetch.ts didn't help either).

@icyJoseph
Copy link
Contributor

Well I can't reproduce it. I've placed several logs, setup revalidate and build and started the app, it works without issue.

I am starting to think that you had a memory leak, in a form of shared controller. Something like this:

const controller = new AbortController();

export default async function Home() {
  const pr = await fetch("https://pokeapi.co/api/v2/pokemon/ditto", {
    signal: controller.signal,
    next: {
      revalidate: 10,
    },
  });

  controller.abort();

  return <div>Hi</div>;
}

Now there's a memory leak with that controller, and I do see the issue you describe. After the first revalidate, the second rans into the issue. Is that what your team had?

@roch-numbered
Copy link

@advanderveer I'm trying to create a reproduction example but I can't figured out what's the actual cause of the issue (I mean which function/pattern triggers that). I also have this issue but on a maintained codebase and I'm struggling to understand the source of the error. If you can provide a reproduction repo or some details (for example, were you using sanity client?), it would be really helpful to correctly understand the source of the problem here 🙏

@icyJoseph as mentioned, I'm facing the same error message in the console and I'm not using AbortController at all. So maybe it comes from the sanity client (which for now is the only idea that I got about the source of the issue).
If I'm able to isolate and identify the origin of the error, I'll share a reproduction.

@icyJoseph
Copy link
Contributor

@roch-numbered you could try to move the creation of the client to the scope of the component that uses it. That way the client eventually goes out of scope.

If it is a global variable then this happens for sure. The question is why would a controller linger around, and how is it being aborted anyway?

What kind of operations do you do? I guess you query for some data?

@stipsan
Copy link
Contributor

stipsan commented Aug 18, 2023

@advanderveer I'm trying to create a reproduction example but I can't figured out what's the actual cause of the issue (I mean which function/pattern triggers that). I also have this issue but on a maintained codebase and I'm struggling to understand the source of the error. If you can provide a reproduction repo or some details (for example, were you using sanity client?), it would be really helpful to correctly understand the source of the problem here 🙏

@icyJoseph as mentioned, I'm facing the same error message in the console and I'm not using AbortController at all. So maybe it comes from the sanity client (which for now is the only idea that I got about the source of the issue).

If I'm able to isolate and identify the origin of the error, I'll share a reproduction.

If you were using @sanity/client know that we just released a fix that could help solve this: https://github.com/sanity-io/client/releases/tag/v6.4.7

@roch-numbered
Copy link

@stipsan thx I just updated next and @sanity/client and the error isn't there anymore! 🎉

@icyJoseph do you want me to keep digging over a reproduction?

@icyJoseph
Copy link
Contributor

Ah, not really, like I said, this is most likely a memory leak kind of issue. A controller that never goes out of scope, for whichever reason, and lingers around for future requests.

Feels like it is possible to run into this pitfall in several ways, and each will require careful study of how we are running into it.

Again, not passing the signal down is a non-starter. It has to be passed. Maybe doOriginalFetch(true).catch(console.error) should re-throw 🤷 - I did some testing with printing a warning when we fail because the controller signal was aborted, but that's about it.

timneutkens pushed a commit that referenced this issue Aug 25, 2023
This ensures we don't pass through the signal field when revalidating a
fetch as that can be delayed and we don't want to abort the
revalidation.

Closes: #54045
@github-actions
Copy link
Contributor

github-actions bot commented Sep 9, 2023

This closed issue has been automatically locked because it had no new activity for 2 weeks. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.

@github-actions github-actions bot added the locked label Sep 9, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Issue was opened via the bug report template. locked
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants