feat: For models that can use tools, allow them to choose when to use the tool, possibly multiple times, and expose search as a tool instead of hard coded to happen at the beginning of the completion #8918

kaytwo · 2025-01-25T16:54:29Z

kaytwo
Jan 25, 2025

The current app implementation hard-codes search to happen at the beginning of a completion. This is helpful for models that aren't tool-aware, but if the task I'm asking the model to do doesn't entail doing a search right away as the very beginning of the response, it confuses the model by injecting the search prompt and search results into the context as the very beginning of the response, basically making the reply "get off on the wrong foot" and is very unlikely to be useful.

What would be a nice feature is that, if tool use is available, the toggle should simply enable a built in search tool that the model can use whenever it deems it necessary to do a web search.

A reasonable workaround currently is to keep search turned off unless a search is needed first-thing in a given point in a chat, but that is sub optimal especially now that we have chain of thought models that might want to use search, code running, etc within their internal monologue.

I'm using Brave search - another thing that will probably be worthwhile is to simply code up a Tool that uses the Brave Search along with the provided API key to make results available as a tool I imagine it wouldn't take too long to code that up, and such a tool could quite possibly be popular.

kaytwo · 2025-01-25T18:38:38Z

kaytwo
Jan 25, 2025
Author

Closely related - reading through the code, tool use can only happen once and at the beginning of a response, but should not be limited. Based off of the way that tool calling works in Ollama and the underlying way that llama handles tool use, one could imagine a proper approach to tool usage would look more like:

tool_iteration_limit = 5
iterations = 0
next_response = chat(messages=messages, tools="...")
while next_response.message.tool_calls and iterations < tool_iteration_limit:
  # ignore the fact that there might be multiple tool calls
  output = call_tool(...)
  messages.append({role: tool, content: output, ...})
  next_response = chat(messages = messages, tools="...")
  iterations += 1
return next_response

within the open-webui codebase, I think this would look more like refactoring this call:

open-webui/backend/open_webui/utils/middleware.py

Line 681 in b72150c

async def process_chat_payload(request, form_data, metadata, user, model):

to implement the above loop, and only call

open-webui/backend/open_webui/utils/middleware.py

Line 752 in b72150c

form_data = await chat_web_search_handler(

and

open-webui/backend/open_webui/utils/middleware.py

Line 175 in b72150c

async def chat_completion_tools_handler(

as requested by the model's (potentially intermediate) response.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: For models that can use tools, allow them to choose when to use the tool, possibly multiple times, and expose search as a tool instead of hard coded to happen at the beginning of the completion #8918

Uh oh!

{{title}}

Uh oh!

Replies: 1 comment

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Select a reply

Uh oh!

Uh oh!

feat: For models that can use tools, allow them to choose when to use the tool, possibly multiple times, and expose search as a tool instead of hard coded to happen at the beginning of the completion #8918

Uh oh!

kaytwo Jan 25, 2025

Replies: 1 comment

Uh oh!

Uh oh!

kaytwo Jan 25, 2025 Author

kaytwo
Jan 25, 2025

kaytwo
Jan 25, 2025
Author