
Have you verified whether the abort signal is actually functioning in the edge runtime? #90

Closed
wsxiaoys opened this issue Jun 16, 2023 · 37 comments · Fixed by vercel/next.js#51594 or vercel/next.js#51944 · May be fixed by vercel/next.js#51330


@wsxiaoys

Based on my experiments so far, it appears that the AbortController doesn't function properly on Vercel's hosted edge runtime. This observation aligns with the issue discussed in detail on the following GitHub issue: vercel/next.js#50364.

@jensen

jensen commented Jun 16, 2023

Update: I have done some testing using the ai-chatbot project deployed to Vercel with some logging. Tokens are not generated on the server, in the background, after cancellation.


I am coming to the same conclusion.

I was very excited to see the release of this library today, particularly this documentation https://sdk.vercel.ai/docs/concepts/backpressure-and-cancellation.

Last month I was asked to add cancellation to a product I am working on for a client. I attempted to get this working with edge functions but could not receive the cancel callback on the server. I decided to see if I could get it working at all and was able to implement a version using Deno since it has a similar runtime API.

At this point, I sent a support request to Vercel, and about two weeks later it was confirmed that the upstream provider (Cloudflare or AWS, I assume) did not support the abort signal and that they were aware of this limitation.

I moved our OpenAI streaming endpoints to a Fastify server running Node 20 on an AWS ECS cluster. This setup has allowed me to handle the cancellations properly. I can cancel my request to OpenAI, which saves me tokens, and write whatever tokens were received to the db with a "cancel" type.

I did not expect the edge function to receive the AbortController signal, even though the examples use this as the cancellation mechanism. The example of using the pull callback in a ReadableStream was something I hadn't tried previously, so I decided to apply the approach to our codebase.

I found that the AIStream doesn't use the pull callback but instead has a TransformStream. I looked for pull and found it in the Hugging Face example. I converted my request.body to an iterator using a similar approach. This also doesn't seem to handle the cancellation.
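
For reference, the pull-based pattern from the Hugging Face example looks roughly like this (a sketch; the helper name and iterator plumbing here are my own, not the SDK's code):

```ts
// Wrap an async iterator in a pull-based ReadableStream. pull() only runs
// when the consumer requests the next chunk, so a client cancellation stops
// further reads from the upstream iterator.
function iteratorToStream(iterator: AsyncIterator<Uint8Array>): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>({
    async pull(controller) {
      const { value, done } = await iterator.next();
      if (done) {
        controller.close();
      } else {
        controller.enqueue(value);
      }
    },
    async cancel() {
      // best-effort: tell the upstream iterator to stop producing
      await iterator.return?.();
    },
  });
}
```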

I am using the pages directory in our app, assuming that the examples only use the new app directory to highlight it rather than it being a requirement.

@wsxiaoys
Author

wsxiaoys commented Jun 16, 2023

> I moved our OpenAI streaming endpoints to a Fastify server running Node 20 on an AWS ECS cluster. This setup has allowed me to handle the cancellations properly. I can cancel my request to OpenAI, which saves me tokens, and write whatever tokens were received to the db with a "cancel" type.

@jensen were you able to confirm this behavior? Based on my experiments, it seems that sending a cancel signal to OpenAI does not actually reduce the token usage of the request...

related: openai/openai-node#134

@jensen

jensen commented Jun 16, 2023

There are a lot of moving pieces, but I have successfully cancelled the request using both Deno and Fastify with Node 20.

I have my own OpenAI API account that has no traffic. I use this to confirm that a long completion that is cancelled uses a number of tokens matching what I calculate with tiktoken, rather than what it would have used had it finished. My prompt is "Write a blog post about React". When I cancel it after a few sentences, the usage on the OpenAI dashboard matches, after a delay of roughly 10 minutes.
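
For anyone reproducing this check, the tiktoken comparison might look something like this (a sketch using the js-tiktoken package; the variable contents are mine):

```ts
import { encodingForModel } from 'js-tiktoken';

const enc = encodingForModel('gpt-3.5-turbo');

// text received on the client before hitting cancel
const partialCompletion = 'React is a JavaScript library for...';

const promptTokens = enc.encode('Write a blog post about React').length;
const receivedTokens = enc.encode(partialCompletion).length;

// if cancellation really propagated, dashboard usage should be close to
// promptTokens + receivedTokens, not the token count of a full completion
console.log({ promptTokens, receivedTokens });
```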

@wsxiaoys
Author

> There are a lot of moving pieces, but I have successfully cancelled the request using both Deno and Fastify with Node 20.
>
> I have my own OpenAI API account that has no traffic. I use this to confirm that a long completion that is cancelled uses a number of tokens matching what I calculate with tiktoken, rather than what it would have used had it finished. My prompt is "Write a blog post about React". When I cancel it after a few sentences, the usage on the OpenAI dashboard matches.

It's good to know that cancellation is indeed possible with OpenAI's API. Now the next steps lie on the Vercel / Next.js side...

@jensen

jensen commented Jun 17, 2023

It looks like this was released today: https://github.com/vercel-labs/ai-chatbot

It has a stop-generating button (https://chat.vercel.ai/), which makes sense since the SDK has a stop API through the hooks. I guess my next step is to clone this, set up some logging, and deploy it to Vercel to double-check how it behaves on the server. Perhaps the development environment isn't a good one to test this in; it hasn't been in the past.

@wsxiaoys
Author

> It looks like this was released today: https://github.com/vercel-labs/ai-chatbot
>
> It has a stop-generating button (https://chat.vercel.ai/), which makes sense since the SDK has a stop API through the hooks. I guess my next step is to clone this, set up some logging, and deploy it to Vercel to double-check how it behaves on the server. Perhaps the development environment isn't a good one to test this in; it hasn't been in the past.

Thank you for taking the time to verify this. However, based on the source code, I predict that it won't work because the stop button simply calls the AbortController.abort() function.

I'm eagerly anticipating the result.
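
For context, the client side of that stop button is just the SDK hook; a minimal sketch (the markup is mine, stop() is the documented useChat API):

```tsx
'use client';
import { useChat } from 'ai/react';

// Minimal chat UI sketch: stop() aborts the in-flight request via an
// AbortController under the hood; the open question in this thread is
// whether the server ever notices that abort.
export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit, stop } = useChat();
  return (
    <form onSubmit={handleSubmit}>
      {messages.map((m) => (
        <p key={m.id}>{m.role}: {m.content}</p>
      ))}
      <input value={input} onChange={handleInputChange} />
      <button type="button" onClick={stop}>
        Stop generating
      </button>
    </form>
  );
}
```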

@enricoros

Seems like it stops new tokens from being displayed on the screen, but token production still goes on between the Vercel Edge and OpenAI, so you'll be charged for the full amount and incur rate limiting while many generations are still going on in the background.
High-priority fix, imo.

@jensen

jensen commented Jun 19, 2023

I was able to spend some time testing this tonight. I deployed a version of the ai-chatbot application with some additional logging. I am confident the cancellation works as intended when using the edge runtime. Tokens are not generated on the server, in the background, after cancellation.

This is excellent news.

I haven't done as much testing as I want to for our production application, but my next step will be to test my streaming changes using our staging environment.

I likely won't get to this in the next few days since we have already shipped our cancellation feature using AWS. I will still want to move these endpoints back to Vercel in July if I can.

@wsxiaoys
Author

wsxiaoys commented Jun 19, 2023

Seems related:

vercel/edge-runtime#396
vercel/next.js#51330

Hey @jridgewell, it seems what you're working on is related to the issue discussed here?

@jridgewell
Contributor

Hi! Yes, I'm working on getting proper streaming cancellation and back-pressure into Next.js. If you're using Next as your dev/production server, it's not currently possible to end the stream. Once vercel/next.js#51330 is merged and released, this should be fixed.

kodiakhq bot pushed a commit to vercel/next.js that referenced this issue Jun 22, 2023
### What?

This is an alternative to #51330 that only supports aborting the response (it doesn't support back-pressure). If the client cancels the request (HMR update, navigation to another page, etc.), we'll be able to detect that and stop pulling data from the dev's `ReadableStream` response.

### Why?

We want to allow API routes to stream data coming from another server (e.g., AI services). The responses from these other servers can be long-running and expensive. In case the browser aborts the connection, it's critical that we stop streaming data as soon as possible.

### How?

By checking whether `response.closed` is set during the `for await (…)` iteration, we're able to detect that the client has aborted the connection. Cleanup of the `ReadableStream` is handled implicitly by the async iterator when the loop ends.

The one catch is our use of http-proxy for worker processes. It does not properly detect a client disconnecting (but does handle back-pressure). In order to fix that, I've manually added event listeners to detect the disconnect and cancel the proxied req/res pair.

Re: [WEB-1185](https://linear.app/vercel/issue/WEB-1185) (we still need back-pressure)
Fixes #50364
Fixes vercel/ai#90
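
In code, the detection pattern that commit describes looks roughly like this (a hypothetical reduction, not the actual Next.js source):

```ts
import type { ServerResponse } from 'node:http';

// Hypothetical reduction of the pattern: stop iterating once the client is
// gone. Breaking out of a for await (…) loop invokes the iterator's return(),
// which cancels the underlying ReadableStream implicitly.
async function pipeToClient(body: AsyncIterable<Uint8Array>, res: ServerResponse) {
  for await (const chunk of body) {
    if (res.closed) break; // client disconnected (abort, navigation, HMR, …)
    res.write(chunk);
  }
  res.end();
}
```
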
@jridgewell
Contributor

Hi all! We've merged vercel/next.js#51594, which implements cancellation only. We'll work on getting back-pressure support in after verifying its impact on Next's general streaming performance (Next is mainly used for streaming React components, and we need to make sure that's not taking a hit). I don't think it's going to be an issue, but I just need some time to verify.

@enricoros

@jridgewell Anything special devs need to do for using cancellation? And what is back-pressure and what do we need to accommodate it?

I work on one of the popular open-source GPT UIs (https://github.com/enricoros/big-agi) and I'm sure devs like us appreciate your fix.

@jridgewell
Contributor

> Anything special devs need to do for using cancellation?

The Next.js team hasn't released a new version yet (I'll ping them to see if they can do a canary release), but once they have, users just need to npm install next@latest (or next@canary).

> And what is back-pressure and what do we need to accommodate it?

Back-pressure is explained in https://sdk.vercel.ai/docs/concepts/backpressure-and-cancellation. Essentially, it's the ability for the server to pause the stream because the client doesn't need more data yet. Next.js hasn't added support for it yet, but when it does, everyone will need to update their next dependency again.
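
To make the distinction concrete, here is a minimal sketch (my own paraphrase of that doc, with a placeholder upstream URL): the start()-based variant has no back-pressure, while the pull()-based one reads strictly on demand.

```ts
// placeholder upstream; in practice this is e.g. an OpenAI streaming response
const upstream = await fetch('https://api.example.com/stream');
const reader = upstream.body!.getReader();

// No back-pressure: start() eagerly drains the upstream as fast as it can,
// whether or not the client is still reading.
const eager = new ReadableStream<Uint8Array>({
  async start(controller) {
    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      controller.enqueue(value);
    }
    controller.close();
  },
});

// With back-pressure: pull() runs once per chunk the consumer actually asks
// for, so a paused (or cancelled) client pauses the upstream read too.
// (Only one of these two streams would be constructed in practice.)
const lazy = new ReadableStream<Uint8Array>({
  async pull(controller) {
    const { value, done } = await reader.read();
    if (done) controller.close();
    else controller.enqueue(value);
  },
});
```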

@enricoros

@jridgewell thanks for the explanation. Will try out the canary when available. I'm glad it doesn't require code changes (maybe some exception handling?). I was trying with AbortControllers and exceptions everywhere, but nothing worked for me wrt cancellations.

@jridgewell
Contributor

v13.4.8-canary.0 just got released. If you update your project dependency, your dev server will support cancellation, and when you deploy that change your prod server should get it too.

@enricoros

enricoros commented Jun 23, 2023

@jridgewell I tried canary.0 and .1, but somehow it is not working for me: the server continues to pull events from OpenAI and feed pieces down the ReadableStream controller, despite the client browser window being closed.

When I close the socket to the server (by physically closing the Chrome window of the edge function caller), this is what I see in the log of the dev server (next dev):

[screenshot: dev server log]

And this is the code that prints the streaming events (the error is printed by the edge server):

[screenshot: the streaming event loop]

This is within the ReadableStream.

return new ReadableStream({
    // the loop shown in the screenshot above runs inside start(), so it is
    // driven eagerly rather than by consumer demand
    start: (controller) => {
        ...loop above...
    },
});

I must be doing something wrong.

@jvandenaardweg

jvandenaardweg commented Jun 23, 2023

Maybe catch the AbortError on the server and expect it to happen by returning null or something? Like the client hooks do: https://github.com/vercel-labs/ai/blob/main/packages/core/react/use-completion.ts#L179

Haven't played around with it just yet, just watching this issue 👀

edit: nvm the above, just tried it and can't seem to catch that error

edit2: unfortunately it does not seem to abort the request to OpenAI, as also stated above. The token usage reported on the OpenAI usage page is just too high for some aborted streams I tested. It reports the usage as if I did not abort anything.

@enricoros

Tried catching the AbortError, but it doesn't catch anything. Agreed, the token usage keeps skyrocketing, a sign that the request to OpenAI's servers keeps going...

@jridgewell
Contributor

Hi @enricoros: I'm not sure where your code is coming from, can you provide a link? Just based on reading the screenshot, it looks like you are keeping the connection alive by eagerly pulling the data out of upstreamResponse.body with a for await (…) loop (this is discussed in the back-pressure and cancellation doc).

I'm assuming your eventParser is an SSE parser, but I'm not sure where you're sending the data after it's parsed. It's possible this could be fixed by switching to a TransformStream similar to https://github.com/vercel-labs/ai/blob/107e436e925f660ea9fd02ced726a02cb7831a25/packages/core/streams/ai-stream.ts#L31-L62

@enricoros

Ok, I will try it and report back.
Many of our GPT UIs have implementations that predate vercel/ai, possibly with a non-optimal code path.

Code below:

https://github.com/enricoros/big-agi/blob/main/pages/api/openai/stream-chat.ts#L112

@jridgewell
Contributor

Reading your code, it's definitely possible to switch to a TransformStream (see the sketch below):

  • Move (most of) the code out of the start handler; you can run it immediately before constructing the TransformStream
  • Delete the for await (…) code and replace it with a transform handler, as explained in the back-pressure and cancellation doc
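
A rough sketch of that conversion (modeled loosely on the AIStream source linked above; the SSE payload shape is an assumption about typical OpenAI chat deltas):

```ts
import { createParser, type ParsedEvent, type ReconnectInterval } from 'eventsource-parser';

function openAITransform(): TransformStream<Uint8Array, Uint8Array> {
  const decoder = new TextDecoder();
  const encoder = new TextEncoder();
  let parser: ReturnType<typeof createParser>;

  return new TransformStream<Uint8Array, Uint8Array>({
    start(controller) {
      parser = createParser((event: ParsedEvent | ReconnectInterval) => {
        if (event.type !== 'event' || event.data === '[DONE]') return;
        // assumes the usual chat-completion delta shape
        const text = JSON.parse(event.data).choices?.[0]?.delta?.content ?? '';
        if (text) controller.enqueue(encoder.encode(text));
      });
    },
    // transform() only runs as the consumer pulls; when the client aborts,
    // the cancellation propagates back to the upstream fetch automatically.
    transform(chunk) {
      parser.feed(decoder.decode(chunk, { stream: true }));
    },
  });
}

// usage: return new Response(upstreamResponse.body!.pipeThrough(openAITransform()));
```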

@jvandenaardweg

jvandenaardweg commented Jun 23, 2023

To hook into this conversation: my implementation looks like this: https://sdk.vercel.ai/docs/api-reference/openai-stream#chat-model-example. So with all the methods the AI SDK provides, the latest Next.js canary, and using the pages directory.

I also tried passing req.signal into the createChatCompletion options, as it is an option there, but it does not seem to help. The server just throws the "Error: aborted" mentioned by @enricoros, and then that's it: no more logging. But OpenAI is still sending data until it's done; I'm just not receiving it anymore (through the methods provided by the AI SDK).

Edit: Manually aborting using a timeout of a few seconds with a new AbortController inside the Edge Function, and then passing the signal into the options of createChatCompletion, does work. It cancels the stream and does not add token usage. Just to verify that cancellation is supported for that method.
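
For reference, that timeout experiment might look like this (a sketch assuming the openai-edge client, whose second argument is passed through to fetch):

```ts
import { Configuration, OpenAIApi } from 'openai-edge';

const openai = new OpenAIApi(
  new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

export default async function handler(): Promise<Response> {
  const ac = new AbortController();
  setTimeout(() => ac.abort(), 3_000); // simulate a cancel after a few seconds

  const upstream = await openai.createChatCompletion(
    {
      model: 'gpt-3.5-turbo',
      stream: true,
      messages: [{ role: 'user', content: 'Write a blog post about React' }],
    },
    // aborting this signal tears down the upstream request, which is what
    // stops OpenAI from generating (and billing) further tokens
    { signal: ac.signal }
  );

  return new Response(upstream.body);
}

export const config = { runtime: 'edge' };
```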

@jridgewell
Contributor

@jvandenaardweg: It seems the AbortSignal isn't actually hooked up to the Node response that Next.js receives. I've opened vercel/next.js#51727 to address this.

Also, I've discovered that the cancel handler in a ReadableStream/TransformStream isn't called either. That'll be fixed once vercel/edge-runtime#428 is merged and we update the edge-runtime internal dependency.

@jvandenaardweg

jvandenaardweg commented Jun 24, 2023

@jridgewell awesome! Appreciate the quick response on this! Looking forward to try it out 👍

Also, could you re-open this issue until there's verification that it works?

@jridgewell jridgewell reopened this Jun 26, 2023
@StringKe

Hi @jridgewell, your commit also solves vercel/next.js#50804 at the same time.

I could no longer detect the problem in v13.4.8-canary.5.

enricoros added a commit to enricoros/big-AGI that referenced this issue Jun 29, 2023
…s for real cancellations

This implementation has been largely inspired by the Vercel AI (stream) SDK,
available at https://github.com/vercel-labs/ai/, and in particular by the work
of @jridgewell on vercel/ai#90 and related
issues.

As soon as some pending changes land in edge-runtime and nextjs, we'll have
full stream cancellation and token savings #57
@enricoros

enricoros commented Jun 29, 2023

> Reading your code, it's definitely possible to switch to a TransformStream:
>
> • Move (most of) the code out of the start handler; you can run it immediately before constructing the TransformStream
> • Delete the for await (…) code and replace it with a transform handler, as explained in the back-pressure and cancellation doc

Thanks for your help @jridgewell. Our app is now ported to use backpressure and cancellation, as you suggested: https://github.com/enricoros/big-agi/blob/490f8bdac30267662bee6b853ec8a3a303d2ab13/pages/api/llms/stream.ts#L141

I looked at the vercel-labs/ai implementation and adapted ours. Due to a couple of changes to the transformation functions, I couldn't use vercel-labs/ai as-is, but it's been an enormous help.

Test results (for when the client closes the connection):

  • without the canary (13.4.7), the TransformStream keeps going
  • with 13.4.8-canary.8, the TransformStream stops, BUT data is still sent from the node process to the OpenAI servers

Great progress - thanks!

@jridgewell
Contributor

jridgewell commented Jul 3, 2023

13.4.8 is out now, which fixes both issues from #90 (comment).

> with 13.4.8-canary.8, the TransformStream stops, BUT data is still sent from the node process to the OpenAI servers

With vercel/next.js#51944 (released in v13.4.8-canary.12) and OpenAIStream, this should be fixed. The transform stream will receive the cancel() event from Next's server, and that should be propagated to the fetch you're maintaining to OpenAI's server.

@jensen

jensen commented Jul 3, 2023

Thank you for all of your work on this @jridgewell. This closes a long-standing support ticket I opened in April and allows me to give my client some options.

@enricoros

@jridgewell: tested with 13.4.8 and it works!

In our implementation (which is inspired by AIStream, using a TransformStream), I can see the connection from the Node process to the OpenAI servers stopping! GREAT!

There's still an error message on the console (- error uncaughtException: Error: aborted), and maybe others won't see that, but apart from the scare effect, all the new changes seem to be working well! Our abort on the (browser) client side stops the TransformStream on the (edge) server side, and the fetch to the OpenAI servers stops transmitting bytes too!

Well done @jridgewell!

@jvandenaardweg

jvandenaardweg commented Jul 4, 2023

Thanks @jridgewell, confirmed it works! The token usage reported on the OpenAI website matches what you would expect when you cancel the stream.

One future improvement could be to catch the abort in the Edge Function so we don't have an uncaught error, and to allow handling the abort, if that's even possible.

The abort error:

error uncaughtException: Error: aborted
  at connResetException (node:internal/errors:717:14)
  at abortIncoming (node:_http_server:754:17)
  at socketOnClose (node:_http_server:748:3)
  at Socket.emit (node:events:525:35)
  at Socket.emit (node:domain:489:12)
  at TCP.<anonymous> (node:net:322:12)
  at TCP.callbackTrampoline (node:internal/async_hooks:130:17) {
  digest: undefined
}

In my use case I need to keep track of how many tokens are used. I already do this when the stream starts (onStart) for the prompt, and when it completes (onCompletion) for the generated output tokens. So handling the abort would allow me to report token usage at the moment it was aborted.

But I think that would fit better in a new issue here on GitHub.
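
For context, that tracking hangs off the SDK's stream callbacks, roughly like so (a sketch; recordUsage and countTokens are hypothetical helpers, not SDK APIs):

```ts
import { OpenAIStream, StreamingTextResponse } from 'ai';

// hypothetical helpers, not part of the SDK
declare function recordUsage(u: { kind: 'prompt' | 'completion'; tokens: number }): Promise<void>;
declare function countTokens(text: string): number;

function streamWithUsageTracking(upstreamResponse: Response, prompt: string) {
  const stream = OpenAIStream(upstreamResponse, {
    onStart: async () => {
      await recordUsage({ kind: 'prompt', tokens: countTokens(prompt) });
    },
    onCompletion: async (completion) => {
      await recordUsage({ kind: 'completion', tokens: countTokens(completion) });
    },
    // there is no abort callback here; that gap is exactly what this comment
    // proposes filing as a new issue
  });
  return new StreamingTextResponse(stream);
}
```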

Many thanks all!

@enricoros

> One future improvement could be to catch the abort in the Edge Function so we don't have an uncaught error, and to allow handling the abort

We have the same use case, @jvandenaardweg 👍 good request!!

enricoros added a commit to enricoros/big-AGI that referenced this issue Jul 5, 2023
Thanks to the Vercel team (@jridgewell), an interruption of the stream on the client
side will lead to the cancellation of the TransformStream on the server side, which
in turn cancels the open fetch() to the upstream. This was a long-needed change and
we are happy to report it works well.

Related: #114
 - vercel/ai#90
 - vercel/edge-runtime#428
 - trpc/trpc#4586 (enormous thanks to the tRPC team for issuing
   a quick release as well)
@wsxiaoys
Author

wsxiaoys commented Jul 9, 2023

If you find the abort signal is not fired on Node 16, it's being discussed in vercel/next.js#51727 (comment)

@jridgewell
Contributor

I'm working on Node 16 support in vercel/next.js#52281; it's currently blocked on a test that only fails in CI.

kodiakhq bot pushed a commit to vercel/vercel that referenced this issue Aug 10, 2023
# Summary

Edge-runtime 2.4.4 has many bug fixes, but most importantly it adds support for stream cancellation to the edge runtime. This is extremely important since a lot of projects are using `streams` related to `ai`. They currently have no way of handling a cancellation coming from the client.

This was introduced to `next` as described by this comment: vercel/ai#90 (comment)

You can find the PR for that here: vercel/next.js#51727

It also has a good description of what we're trying to do here for people not using `next`.

# Problem

When a client sends an abort signal, it is currently not being handled by edge functions. This was fixed in edge-runtime@2.4.4

# Solution

Update the package
@lassegit

@jvandenaardweg Same here. Any resolution yet?

@jvandenaardweg

jvandenaardweg commented Aug 31, 2023

> @jvandenaardweg Same here. Any resolution yet?

Unfortunately no. Just two possible workarounds I can think of:

  1. When a user initiates a cancel from your UI, send the AI-generated text to an API that tracks the token/character/word usage of that text. A probable downside: when a user navigates away from your site (or closes the browser), the API call will not be fired.

  2. Track each generated token using onToken on the server, and after a certain timeout just assume it's cancelled. Also track the completion, so you can cancel them out. But for this you need to keep a record in a Redis store or something, and send each token to that store (that's a lot of requests, depending on the size of the text being generated), because once the stream is done, the Edge Function is terminated and your timeout code will not run. Then I guess you need to poll Redis every minute or so and pull that usage info into your main database. (A rough sketch follows below.)
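
A rough sketch of what option 2 could look like (assuming an Upstash-style Redis client; all names here are hypothetical):

```ts
import { Redis } from '@upstash/redis';
import { OpenAIStream } from 'ai';

const redis = Redis.fromEnv();

function trackedStream(upstream: Response, chatId: string) {
  return OpenAIStream(upstream, {
    onToken: async (token) => {
      // record every token plus a heartbeat; a cancelled stream simply stops
      await redis.append(`usage:${chatId}:text`, token);
      await redis.set(`usage:${chatId}:updatedAt`, Date.now());
    },
    onCompletion: async () => {
      await redis.set(`usage:${chatId}:completed`, 1);
    },
  });
}

// A separate poller (e.g. a cron job) treats entries with a stale updatedAt
// and no completed flag as aborted, and reconciles their usage into the
// main database.
```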

I currently have option 1 in place in my app. Not optimal, but in terms of dev time the easiest, I think.

@lassegit

I have something similar to option 1, but I need to move it to the server. Guess I will remove the possibility to abort for now.

@OlegGulevskyy

Hm, how did you guys manage to get the stop function working? What it does for me is cancel the stream, but then it resumes again by itself, so it never really cancels.
Next.js version is ^14.1.0 and the ai package version is ^2.2.31.
I would appreciate any pointers if something else needs to be implemented on the server side; the plain example does not seem to work.
