Example using Streaming Response for FastAPI. #161
Lots of people write their LangChain APIs in Python, not using RSC. A common tech stack is FastAPI on the backend with Next.js/React on the frontend. It would be great to show an example of this using FastAPI's StreamingResponse. This would really help us build Quivr.

Comments
@mattzcarey I'm thinking of using a similar tech stack, but it seems that Vercel doesn't support Python runtime streaming. Could you please share your stack in more detail?
@jasan-s I have managed to do this with LangChain callbacks and StreamingResponse from FastAPI. You can check out the 'stream' route in the Quivr codebase.
Did you deploy Quivr to Vercel?
Yes, it can be.
I created a gist example (demo video: 2023-10-20.22-57-29.mp4).
Native support for converting streaming responses from FastAPI (or any other HTTP server) in Next.js API routes, with the help of the SDK, would be helpful in my use case. I don't want to call the FastAPI endpoint directly with the useChat hook, since I manage the authentication layer in Next.js.
I came across this thread looking for the same thing but wanted to call a FastAPI backend directly from the `useChat` hook. Here's a minimal example:
```python
from openai import AsyncOpenAI
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse

app = FastAPI()

# Added because the frontend and this backend run on separate ports.
# Adjust for your setup; a wildcard origin is not a good idea in prod.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

client = AsyncOpenAI()


@app.post("/ask")
async def ask(req: dict):
    stream = await client.chat.completions.create(
        messages=req["messages"],
        model="gpt-3.5-turbo",
        stream=True,
    )

    async def generator():
        async for chunk in stream:
            yield chunk.choices[0].delta.content or ""

    response_messages = generator()
    return StreamingResponse(response_messages, media_type="text/event-stream")
```

Run with uvicorn, e.g. `uvicorn main:app --reload` (assuming the file is saved as `main.py`).
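To sanity-check the endpoint without the frontend, you can stream from it with a small script. Here's a sketch using the httpx library (not part of the original thread; the port and payload shape are assumptions matching the examples here):

```python
# Quick manual test of the /ask endpoint, independent of the Next.js frontend.
# Assumes the server is running locally on port 8000, as in the frontend example.
import httpx

payload = {"messages": [{"role": "user", "content": "Say hello"}]}

with httpx.stream("POST", "http://127.0.0.1:8000/ask", json=payload, timeout=None) as resp:
    for text in resp.iter_text():
        print(text, end="", flush=True)
```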
Example frontend:

```tsx
"use client";

import { useChat } from "ai/react";

export default function Home() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "http://127.0.0.1:8000/ask",
  });

  return (
    <main className="flex min-h-screen flex-col items-center justify-between p-24">
      <div>
        {messages.map((m) => (
          <div key={m.id}>
            {m.role === "user" ? "User: " : "AI: "}
            {m.content}
          </div>
        ))}
        <form onSubmit={handleSubmit}>
          <label>
            Say something...
            <input value={input} onChange={handleInputChange} />
          </label>
          <button type="submit">Send</button>
        </form>
      </div>
    </main>
  );
}
```
I think this issue should be marked as complete.
Building off the above answers, here's an example using the experimental stream data protocol (`experimental_StreamData`):
```python
from openai import AsyncOpenAI
from utils import stream_chunk  # formats chunks for use with experimental_StreamData
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse

app = FastAPI()

# Added because the frontend and this backend run on separate ports.
# Adjust for your setup; a wildcard origin is not a good idea in prod.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
    # Needed so the streaming data header can be read by the client.
    expose_headers=["X-Experimental-Stream-Data"],
)

client = AsyncOpenAI()


@app.post("/ask")
async def ask(req: dict):
    stream = await client.chat.completions.create(
        messages=req["messages"],
        model="gpt-3.5-turbo",
        stream=True,
    )

    async def generator():
        async for chunk in stream:
            yield stream_chunk(chunk.choices[0].delta.content or "", "text")
        yield stream_chunk([{"foo": "bar"}], "data")  # send streaming data after the text

    response_messages = generator()
    return StreamingResponse(
        response_messages,
        media_type="text/event-stream",
        headers={"X-Experimental-Stream-Data": "true"},
    )
```

Where `stream_chunk` is defined in a `utils.py` like this:
```python
import json


# Transforms the chunk into a stream part compatible with the vercel/ai protocol.
def stream_chunk(chunk, type: str = "text"):
    code = get_stream_part_code(type)
    formatted_stream_part = f"{code}:{json.dumps(chunk, separators=(',', ':'))}\n"
    return formatted_stream_part


# Given a type, returns the code for the stream part.
def get_stream_part_code(stream_part_type: str) -> str:
    stream_part_types = {
        "text": "0",
        "function_call": "1",
        "data": "2",
        "error": "3",
        "assistant_message": "4",
        "assistant_data_stream_part": "5",
        "data_stream_part": "6",
        "message_annotations_stream_part": "7",
    }
    return stream_part_types[stream_part_type]
```
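For reference, this is the wire format those helpers produce (a small sketch; expected output shown in comments):

```python
# Example output of stream_chunk for the two part types used above.
print(stream_chunk("Hello", "text"), end="")           # 0:"Hello"
print(stream_chunk([{"foo": "bar"}], "data"), end="")  # 2:[{"foo":"bar"}]
```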
I'm having the same issue. @danielcorin @DanLeininger, it would be great to have some help.
We still need a useful example that includes tool calling and streaming data.
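In the meantime, here is a rough sketch of what that could look like with the `stream_chunk` helper above. The function-call payload shape and delta handling are assumptions, not a confirmed protocol:

```python
# Hypothetical generator that streams text, a function/tool call, and data
# parts using the stream_chunk helper defined above. The payload shapes are
# assumptions; check them against the client-side parser you're targeting.
async def generator(stream):
    async for chunk in stream:
        delta = chunk.choices[0].delta
        if delta.function_call is not None:
            # "1" = function_call stream part
            yield stream_chunk(
                {
                    "function_call": {
                        "name": delta.function_call.name or "",
                        "arguments": delta.function_call.arguments or "",
                    }
                },
                "function_call",
            )
        elif delta.content:
            # "0" = text stream part
            yield stream_chunk(delta.content, "text")
    # "2" = data stream part, sent after the model output
    yield stream_chunk([{"sources": ["example.md"]}], "data")
```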
@szymonzmyslony @Udbhav8 In our use case we're bypassing Next.js API routes / route handlers and streaming from FastAPI directly to the client / useChat(), so we haven't attempted passing anything through AIStream.
@szymonzmyslony @Udbhav8 @satyamdalai Have you found out how to add custom onCompletion handlers to the stream in the route handler, maybe using AIStream?
If your endpoint sends a chunked text stream, you can point useChat at it directly, as in the examples above.
@DanLeininger Your answer worked for me. My use case was a FastAPI backend that used a LangGraph agent and had to do the streaming as you mentioned. It worked properly, thank you!