Add maxToolRoundtrips option to streamText settings #1943
Comments
it's planned. the integration with
Can I help? Or is there any known workaround to handle that on the server?
I'd also be curious to hear if you have a workaround in mind, @lgrammel / others. Recursion is what I spent yesterday on, and I couldn't work out from the source what I needed to do without feeling like I was just applying band-aids that will get washed away by the next update once you push this change. I essentially want to use streamUI itself as an async generator; any thoughts?
Workaround for now (simplified example), wrapping the readable stream in another one:

// ToolCallPart / ToolResultPart come from 'ai'; Props, model,
// parseChatCompletionMessages and parseTools are app-specific (omitted here).
const maxToolCalls = 5

export const runOpenAIChatCompletionStream = async ({
  credentials: { apiKey },
  options,
  variables,
  config: openAIConfig,
  compatibility,
  totalToolCalls = 0,
  toolResults,
  toolCalls,
}: Props) => {
  const response = await streamText({
    model,
    temperature: options.temperature ? Number(options.temperature) : undefined,
    messages: await parseChatCompletionMessages({
      options,
      variables,
      toolCalls,
      toolResults,
    }),
    tools: parseTools({ tools: options.tools, variables }),
  })

  return new ReadableStream({
    async start(controller) {
      const reader = response.toAIStream().getReader()

      // Forward every chunk of the inner stream; once it is done, capture the
      // tool calls / results so the next roundtrip can include them.
      async function pump(reader: ReadableStreamDefaultReader<Uint8Array>) {
        const { done, value } = await reader.read()
        if (done) {
          toolCalls = (await response.toolCalls) as ToolCallPart[]
          toolResults = (await response.toolResults) as
            | ToolResultPart[]
            | undefined
          return
        }
        controller.enqueue(value)
        return pump(reader)
      }

      await pump(reader)

      // If the model requested tools, run another completion with the tool
      // results appended, up to maxToolCalls roundtrips.
      if (toolCalls && toolCalls.length > 0 && totalToolCalls < maxToolCalls) {
        totalToolCalls += 1
        const newStream = await runOpenAIChatCompletionStream({
          credentials: { apiKey },
          options,
          variables,
          config: openAIConfig,
          compatibility,
          totalToolCalls, // pass the count down so the limit is actually enforced
          toolCalls,
          toolResults,
        })
        if (newStream) await pump(newStream.getReader())
      }

      controller.close()
    },
  })
}

Am I doing this correctly? I'm not super familiar with streams. I tested it briefly and it seems to work as expected.
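For completeness, a minimal sketch of how a wrapped stream like the one above could be returned from a Next.js route handler. Only runOpenAIChatCompletionStream comes from the snippet above; the endpoint, the placeholder option values, the environment variable, and the response headers are assumptions:

```ts
// Hypothetical route handler consuming the workaround above.
export async function POST(req: Request) {
  await req.json() // assumed request payload, unused in this sketch

  const stream = await runOpenAIChatCompletionStream({
    credentials: { apiKey: process.env.OPENAI_API_KEY! }, // assumed env var
    options: { temperature: "0.7", tools: [] }, // placeholder values matching how `options` is used above
    variables: {},
    config: {},
    compatibility: "strict",
  })

  // The wrapped ReadableStream can be sent back directly as the response body
  // (content type here is just a plain-text placeholder).
  return new Response(stream, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  })
}
```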
Do we have any timeline for this? Also, when the client automatically makes a request, is it possible to control what is sent to the backend, and if so, how?
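For what it's worth, a minimal sketch of one way to influence the payload useChat sends, assuming the hook-level body option is available in the version you're on; the endpoint and the sessionId field are made up:

```tsx
"use client";

import { useChat } from "ai/react";

export function Chat() {
  // Fields passed via `body` should be merged into the JSON payload of the
  // requests the hook makes to the endpoint.
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "/api/chat", // assumed endpoint
    body: { sessionId: "abc123" }, // hypothetical extra field for the backend
  });

  return (
    <form onSubmit={handleSubmit}>
      {messages.map((m) => (
        <div key={m.id}>{m.content}</div>
      ))}
      <input value={input} onChange={handleInputChange} />
    </form>
  );
}
```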
My current solution:
@wong2 how do you then convert the data to get it to work with the useChat hook on the frontend? The only option I see is to copy the logic from the core package?
I'm not using useChat on the frontend.
I can't use useChat personally, because I use Tauri, which is client-side Next.js only (unless I reimplement things in Rust). My hack:

await generateText({
model: provider,
tools: {
suggest_queries: {
description: `Suggest queries for the user's question and ask for confirmation. Example:
{
suggested_queries: [
{ content_type: "audio", start_time: "2024-03-01T00:00:00Z", end_time: "2024-03-01T23:59:59Z", q: "screenpipe" },
{ content_type: "ocr", app_name: "arc", start_time: "2024-03-01T00:00:00Z", end_time: "2024-03-01T23:59:59Z", q: "screenpipe" },
]
}
- q contains a single query, again, for example instead of "life plan" just use "life"
- When using the query_screenpipe tool, respond with only the updated JSON object
- If you return something else than JSON the universe will come to an end
- DO NOT add \`\`\`json at the beginning or end of your response
- Do not use '"' around your response
- Date & time now is ${new Date().toISOString()}. Adjust start_date and end_date to properly match the user intent time range.
`,
parameters: z.object({
suggested_queries: screenpipeMultiQuery,
queries_results: z
.array(z.string())
.optional()
.describe(
"The results of the queries if called after the tool query_screenpipe"
),
}),
execute: async ({ suggested_queries }) => {
console.log("Suggested queries:", suggested_queries);
const confirmation = await askQuestion(
"Are these queries good? (yes/no): "
);
if (confirmation.toLowerCase() === "yes") {
return { confirmed: true, queries: suggested_queries };
} else {
const feedback = await askQuestion(
"Please provide feedback or adjustments: "
);
return { confirmed: false, feedback };
}
},
},
query_screenpipe: {
description:
"Query the local screenpipe instance for relevant information.",
parameters: screenpipeMultiQuery,
execute: queryScreenpipeNtimes,
},
stream_response: {
description:
"Stream the final response to the user. ALWAYS FINISH WITH THIS TOOL",
parameters: z.object({
response: z
.string()
.describe("The final response to stream to the user"),
}),
execute: async ({ response }) => {
const { textStream } = await streamText({
model: provider,
messages: [{ role: "user", content: response }],
});
for await (const chunk of textStream) {
process.stdout.write(chunk);
}
console.log("\n");
throw new Error("STREAM_COMPLETE");
},
},
},
toolChoice: "required",
messages: [
{
role: "system",
content: `You are a helpful assistant that uses Screenpipe to answer user questions.
First, suggest queries to the user and ask for confirmation. If confirmed, proceed with the search.
If not confirmed, adjust based on user feedback. Use the query_screenpipe tool to search for information,
and then use the stream_response tool to provide the final answer to the user.
Rules:
- User's today's date is ${new Date().toISOString().split("T")[0]}
- Use multiple queries to get more relevant results
- If the results of the queries are not relevant, adjust the query and ask for confirmation again. Minimize user's effort.
- ALWAYS END WITH the stream_response tool to stream the final answer to the user
- In the suggest_queries tool, always tell the user the parameters available to you (e.g. types, etc. Zod given to you) so the user can adjust the query if needed. Suggest few other changes on the arg you used so the user has some ideas.
- Make sure to use enough data but not too much. Usually 50k+ rows a day.
`,
},
{
role: "user",
content: input,
},
],
maxToolRoundtrips: 10,
});

But I suspect this uses more tokens than it should (on the final answer of generateText?). PS: I hope you won't call the LLM police regarding my prompt engineering techniques...
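One small consequence of the termination trick above: since the stream_response tool throws to end the run, the caller has to treat that specific error as a normal exit. A sketch, where streamAnswerViaTools is a hypothetical wrapper around the generateText call shown above:

```ts
// Hypothetical wrapper around the generateText call above, whose
// stream_response tool throws new Error("STREAM_COMPLETE") when done.
declare function streamAnswerViaTools(input: string): Promise<void>;

async function runAgent(input: string): Promise<void> {
  try {
    await streamAnswerViaTools(input);
  } catch (err) {
    // The sentinel thrown by stream_response just means "streaming finished";
    // anything else is a real failure and should propagate.
    if (err instanceof Error && err.message === "STREAM_COMPLETE") return;
    throw err;
  }
}
```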
@lgrammel Any updates on adding this feature, or is it too complex?
I might add that I'd prefer a sensible default, so the behaviour matches 1:1 what I'm getting with the OpenAI SDK; otherwise this might feel like a "downgrade".
WIP PR: #2836
Available in
Feature Description
Would be similar to how this setting works with generateText.

Use Case
In the examples, when streaming with a tool, it's the client (useChat) that makes the request again after a tool is called. It would be great if that were handled automatically by the server.
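For illustration, a sketch of what the requested option might look like on streamText, mirroring the existing generateText setting. The option name is taken from this issue's title, and the model, tool, and messages are made up:

```ts
import { openai } from "@ai-sdk/openai";
import { streamText, tool } from "ai";
import { z } from "zod";

const result = await streamText({
  model: openai("gpt-4o"),
  messages: [{ role: "user", content: "What's the weather in Berlin?" }],
  tools: {
    getWeather: tool({
      description: "Get the weather for a city",
      parameters: z.object({ city: z.string() }),
      execute: async ({ city }) => ({ city, temperatureC: 21 }), // stubbed result
    }),
  },
  // Requested behaviour: the server keeps feeding tool results back to the
  // model for up to this many roundtrips, instead of useChat re-requesting.
  maxToolRoundtrips: 5,
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```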
Additional context
No response