504 error with langchain chat api (edge function timeout) #487
EDIT: It seems to have worked once I removed the model specification, although it's still not streaming.
EDIT: The edge function times out once I specify the default gpt-3.5/4 models; you can imagine what might happen with a custom-trained gpt-4 model.
Same issue... also using langchain and ai from Vercel. No issue on localhost or with other providers.
Just to add a data point: encountering the same issue not using langchain, but specifying
So I found a method to reduce the load time, but I'm not sure this is on Vercel. It seems like Cloudflare, where I host my service, can take between 5 and 35 seconds using the edge runtime, and much worse on serverless...
I believe this is from the APIs you're using not returning anything to stream, especially with gpt-4-vision-preview, which can take a long time to analyze an image. If you can provide a reproduction, I can reopen the issue.
In Next.js, Edge Functions "must begin sending a response within 25 seconds" or they return a 504 timeout error. Since you are calling models like gpt-4 and gpt-4-vision-preview, the time to first token probably exceeds those 25 seconds, which causes your connection to time out. As the docs explain, Edge Functions have no maximum streaming duration once they have started streaming. So a possible workaround is to stream empty strings at a 2-second interval, keeping the connection alive while the LLM API call loads. Note that while Edge Functions have no maximum streaming time, Node.js Serverless Functions do, so this workaround will probably not be enough in those cases.

Here is a possible approach using the stream handlers:

```ts
import { NextRequest, NextResponse } from "next/server";
import {
  Message as VercelChatMessage,
  StreamingTextResponse,
  LangChainStream,
} from "ai";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { PromptTemplate } from "langchain/prompts";
import { BytesOutputParser } from "langchain/schema/output_parser";

export const runtime = "edge";

const formatMessage = (message: VercelChatMessage) => {
  return `${message.role}: ${message.content}`;
};

const TEMPLATE = `
You are a CHRO (Chief Human Resources Officer) with over 20 years of industry experience called Suitable.
You are based out of India.
You know nothing but recruitment and talent acquisition; basically, you're the king of all things that have anything to do with people.
All responses must be extremely clear, localized, and professional; remember you exist to make that person understand their problems.
Respond with markdown to emphasize your comprehensiveness.

Current conversation:
{chat_history}

User: {input}
AI:`;

export async function POST(req: NextRequest) {
  try {
    const body = await req.json();
    const messages = body.messages ?? [];

    // Use LangChainStream from the ai package to create a stream that is
    // fed by the model's callbacks.
    const { stream, handlers } = LangChainStream();

    const heartbeatInterval = setInterval(() => {
      // Stream an empty string to keep the connection alive while the model loads.
      handlers.handleLLMNewToken("");
    }, 2000);

    const formattedPreviousMessages = messages.slice(0, -1).map(formatMessage);
    const currentMessageContent = messages[messages.length - 1].content;

    const prompt = PromptTemplate.fromTemplate<{
      chat_history: string;
      input: string;
    }>(TEMPLATE);

    const model = new ChatOpenAI({
      modelName: "gpt-4",
      temperature: 0.5,
      streaming: true,
      // Connect the handlers to the model callbacks.
      callbacks: [handlers],
      verbose: true,
    });

    const outputParser = new BytesOutputParser();
    const chain = prompt.pipe(model).pipe(outputParser);

    chain
      // Use invoke instead of stream, as the streaming is done by the handlers.
      .invoke({
        chat_history: formattedPreviousMessages.join("\n"),
        input: currentMessageContent,
      })
      // Clear the heartbeat once the chain resolves or fails.
      .then(() => {
        clearInterval(heartbeatInterval);
      })
      .catch((err) => {
        clearInterval(heartbeatInterval);
        console.error(err);
      });

    return new StreamingTextResponse(stream);
  } catch (e: any) {
    console.error(e);
    return NextResponse.json({ error: e.message }, { status: 500 });
  }
}
```

Here are some docs about the max duration of Edge Functions and serverless timeouts in general. I stumbled upon a similar issue with timeouts, and this workaround helped me solve it. Let me know if this works for you.
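The keep-alive trick can also be sketched in isolation, outside of Next.js and langchain. This is a minimal standalone sketch (`keepAliveStream` is a hypothetical helper name, not part of the ai package), assuming a Web-standard `ReadableStream` as available in edge runtimes and Node 18+:

```typescript
// Hypothetical helper illustrating the keep-alive pattern: enqueue empty
// chunks on an interval until a slow async task produces its real payload,
// then emit the payload and close the stream.
function keepAliveStream(
  task: Promise<string>,
  intervalMs = 2000,
): ReadableStream<string> {
  return new ReadableStream<string>({
    start(controller) {
      const heartbeat = setInterval(() => controller.enqueue(""), intervalMs);
      task
        .then((result) => {
          clearInterval(heartbeat);
          controller.enqueue(result);
          controller.close();
        })
        .catch((err) => {
          clearInterval(heartbeat);
          controller.error(err);
        });
    },
  });
}

// Simulate a slow LLM call that takes 300 ms to return its first token.
const slowCall = new Promise<string>((resolve) =>
  setTimeout(() => resolve("hello"), 300),
);

// The consumer sees a few "" heartbeats (one per 100 ms) before the payload.
(async () => {
  const reader = keepAliveStream(slowCall, 100).getReader();
  const chunks: string[] = [];
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value!);
  }
  console.log(chunks); // e.g. ["", "", "hello"]
})();
```

From the proxy's point of view, the response has "begun" as soon as the first empty chunk is sent, which is what defeats the 25-second first-byte deadline.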
@sambhav2612 @ElectricCodeGuy I was getting a similar error with langchain and was able to fix it by setting a long timeout. Not sure how to set an indefinite one. Try:

```ts
const model = new ChatOpenAI(
  {
    modelName: "gpt-4",
    temperature: 0.5,
    streaming: true,
  },
  { timeout: 10000000 }
);
```
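The timeout here is just an upper bound on how long the client waits before abandoning the request. The underlying pattern, sketched generically (`withTimeout` is a hypothetical helper, not the langchain API), is a `Promise.race` against a timer:

```typescript
// Hypothetical helper: reject if the wrapped promise takes longer than `ms`.
// A very large `ms` (like the 10000000 above) effectively disables the limit.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

So raising the client timeout only helps if the platform itself (the Edge Function's 25-second first-byte deadline) isn't the one cutting the connection.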
I keep getting a 405 error while executing any complex query on https://gpt.suitable.ai
The Vercel log shows this:
[POST] /api/chat reason=EDGE_FUNCTION_INVOCATION_TIMEOUT, status=504, user_error=true
It doesn't seem to stream anything; it sends the response as a single block, if at all.
Failing prompts:
Route.ts: