504 error with langchain chat api (edge function timeout) #487
EDIT: It seems to have worked once I removed the model specification, although it's still not streaming.
EDIT: The edge function times out once I specify the default gpt-3.5/4 models; you can imagine what might happen with a custom-trained gpt-4 model.
Same issue... also using langchain and ai from Vercel. No issue on localhost or with other providers.
Just to add a data point: encountering the same issue not using langchain, but specifying
So I found a method to reduce the load time, but I'm not sure this is on Vercel. It seems like Cloudflare, where I host my service, can take between 5 and 35 seconds using the edge runtime, and much worse on serverless...
I believe this is from the APIs you're using not returning anything to stream, especially with gpt-4-vision-preview, which can take a long time to analyze an image. If you can provide a reproduction, I can reopen the issue.
In Next.js, Edge Functions "must begin sending a response within 25 seconds" or they return a 504 timeout error. Since you are calling models like gpt-4 and gpt-4-vision-preview, the time to first token probably exceeds those 25 seconds, which causes your connection to time out. As the docs explain, Edge Functions have no maximum streaming duration once they have started streaming. So a possible workaround is to stream empty strings at a 2-second interval, keeping the connection alive while the LLM API call loads. Note that while Edge Functions have no maximum streaming time, Node.js Serverless Functions do, so this workaround will probably not be enough in those cases.

Here is a possible approach using the stream handlers:

```ts
import { NextRequest, NextResponse } from "next/server";
import {
  Message as VercelChatMessage,
  StreamingTextResponse,
  LangChainStream,
} from "ai";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { PromptTemplate } from "langchain/prompts";
import { BytesOutputParser } from "langchain/schema/output_parser";

export const runtime = "edge";

const formatMessage = (message: VercelChatMessage) => {
  return `${message.role}: ${message.content}`;
};

const TEMPLATE = `
You are a CHRO (Chief Human Resources Officer) with over 20 years of industry experience called Suitable.
You are based out of India.
You know nothing but recruitment and talent acquisition; basically, you're the king of all things that have anything to do with people.
All responses must be extremely clear, localized, and professional; remember you exist to make that person understand their problems.
Respond with markdown to emphasize your comprehensiveness.

Current conversation:
{chat_history}

User: {input}
AI:`;

export async function POST(req: NextRequest) {
  try {
    const body = await req.json();
    const messages = body.messages ?? [];

    // Use LangChainStream from the ai package to create a stream that is
    // fed by the model's callbacks.
    const { stream, handlers } = LangChainStream();

    const heartbeatInterval = setInterval(() => {
      // Stream an empty string to keep the connection alive while the model loads.
      handlers.handleLLMNewToken("");
    }, 2000);

    const formattedPreviousMessages = messages.slice(0, -1).map(formatMessage);
    const currentMessageContent = messages[messages.length - 1].content;

    const prompt = PromptTemplate.fromTemplate<{
      chat_history: string;
      input: string;
    }>(TEMPLATE);

    const model = new ChatOpenAI({
      modelName: "gpt-4",
      temperature: 0.5,
      streaming: true,
      // Connect the handlers to the model callbacks.
      callbacks: [handlers],
      verbose: true,
    });

    const outputParser = new BytesOutputParser();
    const chain = prompt.pipe(model).pipe(outputParser);

    chain
      // Use invoke instead of stream, as the streaming is done by the handlers.
      .invoke({
        chat_history: formattedPreviousMessages.join("\n"),
        input: currentMessageContent,
      })
      // Clear the heartbeat once the chain resolves or fails.
      .then(() => {
        clearInterval(heartbeatInterval);
      })
      .catch((err) => {
        clearInterval(heartbeatInterval);
        console.error(err);
      });

    return new StreamingTextResponse(stream);
  } catch (e: any) {
    console.error(e);
    return NextResponse.json({ error: e.message }, { status: 500 });
  }
}
```

Here are some docs about the max duration of Edge Functions and serverless timeouts in general. I stumbled upon a similar issue with timeouts, and this workaround helped me solve it. Let me know if this works for you.
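The keep-alive trick can also be sketched in isolation, outside of Next.js and langchain. This is a minimal standalone sketch (`keepAliveStream` is a hypothetical helper name, not part of the ai package), assuming a Web-standard `ReadableStream` as available in edge runtimes and Node 18+:

```typescript
// Hypothetical helper illustrating the keep-alive pattern: enqueue empty
// chunks on an interval until a slow async task produces its real payload,
// then emit the payload and close the stream.
function keepAliveStream(
  task: Promise<string>,
  intervalMs = 2000,
): ReadableStream<string> {
  return new ReadableStream<string>({
    start(controller) {
      const heartbeat = setInterval(() => controller.enqueue(""), intervalMs);
      task
        .then((result) => {
          clearInterval(heartbeat);
          controller.enqueue(result);
          controller.close();
        })
        .catch((err) => {
          clearInterval(heartbeat);
          controller.error(err);
        });
    },
  });
}

// Simulate a slow LLM call that takes 300 ms to return its first token.
const slowCall = new Promise<string>((resolve) =>
  setTimeout(() => resolve("hello"), 300),
);

// The consumer sees a few "" heartbeats (one per 100 ms) before the payload.
(async () => {
  const reader = keepAliveStream(slowCall, 100).getReader();
  const chunks: string[] = [];
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value!);
  }
  console.log(chunks); // e.g. ["", "", "hello"]
})();
```

From the proxy's point of view, the response has "begun" as soon as the first empty chunk is sent, which is what defeats the 25-second first-byte deadline.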
@sambhav2612 @ElectricCodeGuy I was getting a similar error with langchain and was able to fix it by setting a long timeout. Not sure how to set an indefinite one. Try:

```ts
const model = new ChatOpenAI(
  {
    modelName: "gpt-4",
    temperature: 0.5,
    streaming: true,
  },
  { timeout: 10000000 }
);
```
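The timeout here is just an upper bound on how long the client waits before abandoning the request. The underlying pattern, sketched generically (`withTimeout` is a hypothetical helper, not the langchain API), is a `Promise.race` against a timer:

```typescript
// Hypothetical helper: reject if the wrapped promise takes longer than `ms`.
// A very large `ms` (like the 10000000 above) effectively disables the limit.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

So raising the client timeout only helps if the platform itself (the Edge Function's 25-second first-byte deadline) isn't the one cutting the connection.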
I keep getting a 405 error while executing any complex query on https://gpt.suitable.ai
The Vercel log shows this:
[POST] /api/chat reason=EDGE_FUNCTION_INVOCATION_TIMEOUT, status=504, user_error=true
It doesn't seem to stream anything; it sends the response as a single block, if at all.
Failing prompts:
Route.ts: