
504 error with langchain chat api (edge function timeout) #487

Closed
sambhav2612 opened this issue Aug 21, 2023 · 8 comments
Comments

@sambhav2612

sambhav2612 commented Aug 21, 2023

I keep getting a 405 error while executing any complex query on https://gpt.suitable.ai

Vercel log shows this: [POST] /api/chat reason=EDGE_FUNCTION_INVOCATION_TIMEOUT, status=504, user_error=true

It doesn't seem to stream anything; at best it returns a single block response.

Failing prompts:

  • how to source on linkedin?
  • how to find candidate details on naukri?

Route.ts:

import { NextRequest, NextResponse } from "next/server";
import { Message as VercelChatMessage, StreamingTextResponse } from "ai";

import { ChatOpenAI } from "langchain/chat_models/openai";
import { AIMessage, ChatMessage, HumanMessage } from "langchain/schema";
import { Calculator } from "langchain/tools/calculator";
import { PromptTemplate } from "langchain/prompts";
import { BytesOutputParser } from "langchain/schema/output_parser";

export const runtime = "edge";

const formatMessage = (message: VercelChatMessage) => {
  return `${message.role}: ${message.content}`;
};

const TEMPLATE = `
You are a CHRO (Chief Human Resources Officer) with over 20 years of industry experience called Suitable.
You are based out of India.
You know nothing but recruitment and talent acquisition; basically, you're the king of all things that have anything to do with people.
All responses must be extremely clear, localized, and professional; remember, you exist to make that person understand their problems.
Respond with markdown to emphasize your comprehensiveness.
 
Current conversation:
{chat_history}
 
User: {input}
AI:`;

export async function POST(req: NextRequest) {
  try {
    const body = await req.json();
    const messages = body.messages ?? [];
    const formattedPreviousMessages = messages
      .slice(0, -1)
      .map(formatMessage);
    const currentMessageContent = messages[messages.length - 1].content;
    const prompt = PromptTemplate.fromTemplate<{
      chat_history: string;
      input: string;
    }>(TEMPLATE);

    const model = new ChatOpenAI({
      modelName: "gpt-4",
      temperature: 0.5,
      streaming: true,
      // maxRetries: 10,
      // maxConcurrency: 5,
    });
   
    const outputParser = new BytesOutputParser();
    const chain = prompt.pipe(model).pipe(outputParser);

    const stream = await chain.stream({
      chat_history: formattedPreviousMessages.join("\n"),
      input: currentMessageContent,
    });

    return new StreamingTextResponse(stream);
  } catch (e: any) {
    return NextResponse.json({ error: e.message }, { status: 500 });
  }
}
@sambhav2612
Author

EDIT: it seems to have worked once I removed the model specification, although it's still not streaming.

@MaxLeiter MaxLeiter changed the title from "405 error with langchain openai chatbot" to "504 error with langchain openai chatbot" Aug 21, 2023
@vercel vercel deleted a comment from sambhav2612 Aug 21, 2023
@sambhav2612
Author

EDIT: the edge function times out once I specify the default gpt-3.5/4 models; you can imagine what might happen with a custom-trained gpt-4 model.

@sambhav2612 sambhav2612 changed the title from "504 error with langchain openai chatbot" to "504 error with langchain chat api (edge function timeout)" Aug 23, 2023
@ElectricCodeGuy

Same issue... also using langchain and ai from Vercel. No issue on localhost or with other providers.

@jaschaephraim

Just to add a data point: I'm encountering the same issue without langchain, but specifying model: 'gpt-4-vision-preview'.
The edge function doesn't always time out, but it seems to happen when sending a lot of image URLs as input. I'd guess there's just an error from OpenAI that isn't being caught.

@ElectricCodeGuy

So I found a method to reduce the load time, but I'm not sure this is on Vercel. It seems like Cloudflare, where I host my service, can take between 5 and 35 seconds using the edge runtime, and much, much worse on serverless...
Localhost is always a few seconds, though.

@MaxLeiter
Member

MaxLeiter commented Nov 20, 2023

I believe this is caused by the APIs you're using not returning anything to stream, especially gpt-4-vision-preview, which can take a long time to analyze the image. If you can provide a reproduction, I can re-open the issue.

@lucasquinteiro

@ElectricCodeGuy @sambhav2612

In Next.js, Edge Functions "must begin sending a response within 25 seconds" or they will return a 504 timeout error. Since you are calling models like gpt-4 and gpt-4-vision-preview, the cold start probably takes longer than those 25 seconds, which causes your connection to time out.

As the docs explain, Edge Functions don't have a maximum streaming time once they have started streaming. So a possible workaround for this issue is to stream empty strings at a 2-second interval, keeping the connection alive while the LLM API call loads.

Note that while Edge Functions don't have a maximum streaming time, Node.js Serverless Functions do have a maximum duration, so this solution will probably not be enough in those cases.
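For the Node.js serverless case, one hedged option (assuming the App Router and a Vercel plan whose limits allow it) is to raise the function's maximum duration via the route segment config; the 60-second value below is illustrative:

```typescript
// app/api/chat/route.ts — route segment config.
// 60 is an example value; the actual ceiling depends on your Vercel plan.
export const maxDuration = 60;
```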

Here is a possible approach using the stream handlers:

import { NextRequest, NextResponse } from "next/server";
import { Message as VercelChatMessage, StreamingTextResponse } from "ai";

import { ChatOpenAI } from "langchain/chat_models/openai";
import { PromptTemplate } from "langchain/prompts";
import { BytesOutputParser } from "langchain/schema/output_parser";
import { LangChainStream } from "ai";

export const runtime = "edge";

const formatMessage = (message: VercelChatMessage) => {
  return `${message.role}: ${message.content}`;
};

const TEMPLATE = `
You are a CHRO (Chief Human Resources Officer) with over 20 years of industry experience called Suitable.
You are based out of India.
You know nothing but recruitment and talent acquisition; basically, you're the king of all things that have anything to do with people.
All responses must be extremely clear, localized, and professional; remember, you exist to make that person understand their problems.
Respond with markdown to emphasize your comprehensiveness.
 
Current conversation:
{chat_history}
 
User: {input}
AI:`;

export async function POST(req: NextRequest) {
  try {
    const body = await req.json();
    const messages = body.messages ?? [];

    // Use LangChainStream from the ai package to create a stream that can be handled by the model
    const { stream, handlers } = LangChainStream();

    const heartbeatInterval = setInterval(() => {
      // stream an empty string to keep the connection alive while the model loads
      handlers.handleLLMNewToken("");
    }, 2000);

    const formattedPreviousMessages = messages.slice(0, -1).map(formatMessage);
    const currentMessageContent = messages[messages.length - 1].content;
    const prompt = PromptTemplate.fromTemplate<{
      chat_history: string;
      input: string;
    }>(TEMPLATE);

    const model = new ChatOpenAI({
      modelName: "gpt-4",
      temperature: 0.5,
      streaming: true,
      // connect the handlers to the model callbacks
      callbacks: [handlers],
      verbose: true,
    });

    const outputParser = new BytesOutputParser();
    const chain = prompt.pipe(model).pipe(outputParser);

    chain
      // use invoke instead of stream as the streaming will be done by the handlers
      .invoke({
        chat_history: formattedPreviousMessages.join("\n"),
        input: currentMessageContent,
      })
      // clear the interval once the chain resolves
      .then(() => {
        clearInterval(heartbeatInterval);
      })
      .catch((err) => {
        // log the error so failures aren't silently swallowed
        console.error(err);
        clearInterval(heartbeatInterval);
      });

    return new StreamingTextResponse(stream);
  } catch (e: any) {
    console.error(e);
    return NextResponse.json({ error: e.message }, { status: 500 });
  }
}

Here are some docs about the max duration of Edge Functions and serverless timeouts in general.

I stumbled upon a similar issue with timeouts and this workaround helped me solve it. Let me know if this works for you.
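The keep-alive idea above can also be sketched independently of LangChain using the Web Streams API (available in Node 18+ and edge runtimes); the names `streamWithHeartbeat`, `slowTask`, and `heartbeatMs` are illustrative, not from the thread:

```typescript
// Sketch: wrap a slow promise in a ReadableStream that emits empty
// heartbeat chunks until the real payload is ready, so the connection
// stays open past the initial-response timeout.
function streamWithHeartbeat(
  slowTask: Promise<string>,
  heartbeatMs: number
): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    start(controller) {
      // Periodically enqueue an empty chunk to keep the connection alive.
      const timer = setInterval(() => {
        controller.enqueue(encoder.encode(""));
      }, heartbeatMs);
      slowTask
        .then((text) => {
          clearInterval(timer);
          controller.enqueue(encoder.encode(text));
          controller.close();
        })
        .catch((err) => {
          clearInterval(timer);
          controller.error(err);
        });
    },
  });
}
```

Because the heartbeat chunks are empty strings, the client-visible payload is unchanged; only the connection timing differs.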

@jpherrerap

@sambhav2612 @ElectricCodeGuy I was getting a similar error with langchain and was able to fix it by setting a long timeout. Not sure how to set an indefinite one. Try

const model = new ChatOpenAI(
  {
    modelName: "gpt-4",
    temperature: 0.5,
    streaming: true,
  },
  { timeout: 10000000 }
);


6 participants