
🐛 BUG: vercel/ai streaming broken with 3.50.0 in dev | unexpected gzip compression #5614

Closed
flexchar opened this issue Apr 14, 2024 · 11 comments
Labels
bug Something that isn't working

Comments

@flexchar

flexchar commented Apr 14, 2024

Which Cloudflare product(s) does this pertain to?

Workers Runtime

What version(s) of the tool(s) are you using?

3.50.0

What version of Node are you using?

v21.7.1

What operating system and version are you using?

macOS 14.4.1

Describe the Bug

The vercel/ai library provides easy-to-implement streaming for OpenAI. After upgrading to the latest version, nothing is output until the response has finished buffering. This is in development; I cannot deploy, as that would break clients. Once I can, I will test in production.

I am also using HonoJS, but that has not changed. I am not using any compression middleware.
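For illustration, the affected pattern looks roughly like the following: a Worker that writes chunks to a stream incrementally. This is a hedged minimal sketch, not the actual app code; the chunk contents and timings are made up. Each chunk should reach the client as soon as it is written, but under the buggy wrangler version nothing arrives until the stream closes.

```javascript
// Minimal sketch of a streaming Worker (hypothetical, not the actual app).
// Each chunk should reach the client as it is written; with the regression,
// the client sees nothing until writer.close().
const worker = {
    async fetch(request) {
        const { readable, writable } = new TransformStream();
        const writer = writable.getWriter();
        const encoder = new TextEncoder();

        // Write chunks in the background (deliberately not awaited here):
        // the Response must be returned while chunks are still in flight.
        (async () => {
            for (let i = 0; i < 3; i++) {
                await writer.write(encoder.encode(`data: chunk ${i}\n\n`));
                await new Promise((resolve) => setTimeout(resolve, 50));
            }
            await writer.close();
        })();

        return new Response(readable, {
            headers: { 'Content-Type': 'text/event-stream' },
        });
    },
};

export default worker;
```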

Observed behavior

The stream is buffered until the AI provider has finished.

Expected behavior

The response should be streamed to the client without buffering.

Steps to reproduce

Reproduction: https://github.com/flexchar/workers-sdk-issue-5614

Please provide a link to a minimal reproduction

No response

Please provide any relevant error logs

I noticed that the broken response includes a Content-Encoding: gzip header. Compression would explain why the output is buffered instead of streamed.

As such, appending a header to disable compression alleviates the issue:

return new StreamingTextResponse(stream, {
    headers: {
        // Disable any compression to prevent buffering
        'Content-Encoding': 'identity',
    },
});

Reference https://sdk.vercel.ai/docs/api-reference/streaming-text-response#example

Might be related:

@flexchar flexchar added the bug Something that isn't working label Apr 14, 2024
@flexchar flexchar changed the title 🐛 BUG: Vercel/AI streaming broken with 3.50.0 🐛 BUG: vercel/ai streaming broken with 3.50.0 in dev | unexpected gzip compression Apr 14, 2024
@RamIdeas
Contributor

Hi @flexchar! Can you please provide a minimal reproduction of this bug where running with npx wrangler@3.50.0 dev would not stream the response and where npx wrangler@3.49.0 dev will stream the response?

@zwily

zwily commented Apr 18, 2024

I'm seeing the same thing with my AI-based app. On 3.49.0, I get responses streamed back (with no Content-Encoding: gzip header), and with 3.50, Content-Encoding: gzip is added and the entire response is buffered and sent at once.

Unfortunately my code is not public, and getting all the pieces in place for a minimal reproduction would take quite a while, so I'm going to pin to 3.49.0 for now.

@flexchar
Author

Hey, yes! I am insanely busy; I've put this on my to-do list for this weekend. Sorry for the delay!

@zwily

zwily commented Apr 18, 2024

A couple of notes: if I add a Content-Encoding header to my response from the worker, like Content-Encoding: identity, then it works. (That's not a valid value for Content-Encoding, but it skips the logic that adds the gzipping.)

Also, in my app in production, there is no Content-Encoding header set on the response from the worker. The 3.50 release notes mention adding the Accept-Encoding: gzip, br header to all worker requests to match production, but production does not seem to always gzip the response the way wrangler 3.50 appears to.

Thanks for taking a look!

Here's an example of the code I added to work around this in dev:

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
      // this is a hack to fix wrangler buffering responses in > 3.50.0. Remove
      // once it's fixed.
      // https://github.com/cloudflare/workers-sdk/issues/5614
      ...(process.env.NODE_ENV === "development"
        ? {
            "Content-Encoding": "identity",
          }
        : {}),
    },
  });

@flexchar
Author

@teak

teak commented Apr 27, 2024

Similar issue with the simplest of proxy workers: streaming requests are fully buffered before responding with the latest wrangler 3.52.0, while 3.49.0 works as expected.

export default {
  async fetch(request, environment, context) {
    // Proxy the incoming request to a local upstream; the streamed body
    // should be forwarded to the client as it arrives.
    return fetch(new Request('http://localhost:7777/', request));
  },
};

@bruceharrison1984

bruceharrison1984 commented Apr 29, 2024

I'm seeing the exact same behavior as @teak. I investigated further and discovered the same gzip header as the culprit.

I had previously filed a bug, but hadn't yet collected this additional piece of info.
#5630

This can be worked around by manually gzipping and chunking the response that wrangler sends, and forwarding the altered response to the client.
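The gzip half of that workaround could be sketched roughly as follows, assuming you pre-compress the body yourself with the standard CompressionStream API so there is nothing left for the dev server's own gzip step to buffer. The helper name is made up, and this does not show the chunking part.

```javascript
// Hypothetical helper illustrating the workaround described above:
// gzip the upstream body ourselves and label it explicitly, rather than
// letting the dev server buffer and compress the whole response.
// CompressionStream is available in the Workers runtime and Node 18+.
function forwardGzipped(upstream) {
  const compressed = upstream.body.pipeThrough(new CompressionStream('gzip'));
  const headers = new Headers(upstream.headers);
  headers.set('Content-Encoding', 'gzip');
  return new Response(compressed, { status: upstream.status, headers });
}
```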

@RamIdeas
Contributor

This should be fixed by #5798.

Please try with wrangler@next or wait for tomorrow's release. If the problem persists, please open another issue.

@flexchar
Author

flexchar commented Jun 9, 2024

Is it possible this is still present in Wrangler 3.60.0? @RamIdeas @bruceharrison1984 @teak

@teak

teak commented Jun 9, 2024

I'm not seeing any issue with Wrangler 3.60.0.

@ahmadbilaldev

ahmadbilaldev commented Jun 11, 2024

Tested on 3.53 and 3.60, but I'm still getting the gzip header, and streaming through vercel/ai is not working. It works with "Content-Encoding": "identity" or "Content-Type": "text/event-stream". Any pointers?

Screenshot 2024-06-11 at 5:18:54 PM


6 participants