Tool Use: automatic tool execution #163

@jingyun19

Goal

Revamp the initial tool use design so that JS tool execution is integrated into the prompt() call itself, and spell out the detailed API behaviors.

Background & Motivation

For tool use, we have already aligned on an open loop design, which lets clients handle tool calls themselves. It would be convenient if the API could also execute tools automatically in some cases, so we propose an API that is compatible with the open loop design, letting clients switch between the two modes depending on the use case.

The initial design omits many details of the API shape and behavior, so this issue tries to address them.

That said, there are still many use cases where open-loop API behavior is preferred. Here are some examples:

Proposal

API Spec

Everything added for the open loop API spec can be reused, i.e., closed loop mode still supports appending tool-call and tool-response objects (more on this below).

Extra spec for automatic execution mode:

dictionary LanguageModelCreateCoreOptions {
  // ... existing fields

  // Each tool declaration now also contains an `execute` function.
  sequence<LanguageModelToolDeclaration> tools;
  AutomaticToolUseConfig tool_use_config;
};

dictionary AutomaticToolUseConfig {
  boolean enabled;
  // Maximum number of tool executions within a single prompt()/promptStreaming() call.
  // When this number is reached, throw the error "Max number of tool call reached".
  // Batch/parallel tool executions are also counted as-is.
  long max_tool_calls;
};

Return types of prompt() and promptStreaming() will always be DOMString.
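
The max_tool_calls limit could be enforced by a small counter inside the planner loop. The sketch below is illustrative only: the ToolCallBudget class is hypothetical and not part of the proposal. It assumes the error is thrown when a call would exceed the limit, and that every call in a parallel batch counts individually, which is one possible reading of the comment in the spec above.

```javascript
// Hypothetical helper, not part of the proposed API: tracks how many tool
// calls a single prompt()/promptStreaming() invocation has executed.
class ToolCallBudget {
  constructor(maxToolCalls) {
    this.maxToolCalls = maxToolCalls;
    this.used = 0;
  }

  // Reserve `count` executions (a parallel batch reserves one per call).
  // Assumption: the error fires when the reservation would exceed the limit.
  consume(count = 1) {
    if (this.used + count > this.maxToolCalls) {
      throw new Error("Max number of tool call reached");
    }
    this.used += count;
  }
}

const budget = new ToolCallBudget(3);
budget.consume();  // one sequential tool call
budget.consume(2); // a batch of two parallel tool calls
// A further budget.consume() would now throw.
```

Whether the limit is checked before or after the call that reaches it is left open by the proposal and would need to be pinned down in the spec.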

Various API Behaviors

Tool Execution

  • Sequential tool execution will always be blocking, i.e., the model waits for all tool results before running the next decode.
  • Batch tool calls will run in parallel, i.e., if the model decodes “tool-call-1” and “tool-call-2”, their execute functions will run in parallel. The planner loop will wait for all Promises to resolve before starting the next decode.
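
The blocking and parallel behavior above can be sketched as a simple planner loop. This is a minimal sketch, not the actual implementation: runPlannerLoop and decodeStep are hypothetical names, and decodeStep stands in for one model decode that returns `{ text, calls }`, where `calls` is the (possibly empty) batch of tool calls produced by that decode.

```javascript
// Hypothetical planner loop. `tools` maps a tool name to an object with an
// async `execute(args)` function, as in the proposed tool declaration.
async function runPlannerLoop(decodeStep, tools) {
  const transcript = [];
  for (;;) {
    const { text = "", calls = [] } = await decodeStep(transcript);
    if (calls.length === 0) return text; // no tool calls: generation is done

    // Start every execute() in the batch at once, then block until all of
    // the returned Promises resolve before running the next decode.
    const results = await Promise.all(
      calls.map((call) => tools[call.name].execute(call.args))
    );
    transcript.push({ text, calls, results });
  }
}
```

The sketch returns only the text of the final decode; what prompt() ultimately resolves with when earlier decodes also produced text is not settled here.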

Append & Generate

  • Appending tool calls and tool responses is still supported, but "tool-call" and "tool-result" must be appended together in pairs.
  • At the time of prompt() / promptStreaming(), the API throws an error if there’s a "tool-call" not followed by a "tool-response" in the argument. It’s okay to prompt() with [tool-call, tool-result]. The model will continue decoding, with any future tool calls handled by the API automatically.
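
The pairing rule could be checked with a validator along these lines. This is a sketch under assumptions: the function name and the `{ type }` message shape are hypothetical, the entry type is spelled "tool-response" (the text above also uses "tool-result", apparently for the same thing), and "in pairs" is read as "each tool-call is immediately followed by its tool-response".

```javascript
// Hypothetical validator for the messages passed to prompt()/promptStreaming().
function assertToolCallsPaired(messages) {
  for (let i = 0; i < messages.length; i++) {
    const { type } = messages[i];
    if (type === "tool-call" && messages[i + 1]?.type !== "tool-response") {
      throw new Error(`tool-call at index ${i} is not followed by a tool-response`);
    }
    if (type === "tool-response" && messages[i - 1]?.type !== "tool-call") {
      throw new Error(`tool-response at index ${i} is not preceded by a tool-call`);
    }
  }
}

// A complete pair is accepted; a dangling tool-call would throw.
assertToolCallsPaired([
  { type: "text" },
  { type: "tool-call" },
  { type: "tool-response" },
]);
```

How pairing should work for a batch of parallel calls (for example, two calls followed by two responses) is not covered by this sketch.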

Streaming vs. Unary

  • Consider this example model generation sequence: [response1, tool1, tool2, response2]
  • For prompt(), the promise does not resolve until response2 has been generated.
  • For promptStreaming(), the client first gets parts of response1 from the ReadableStream, then waits for tool execution to complete, and finally gets parts of response2.
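
The streaming order described above can be illustrated with an async generator. This is only a sketch: the real promptStreaming() returns a ReadableStream, and promptStreamingSketch and the pre-scripted `steps` array are hypothetical stand-ins for the model's generation sequence.

```javascript
// `steps` mimics a sequence such as [response1, tool1 + tool2, response2].
// Text steps carry `chunks`; tool steps carry `calls`.
async function* promptStreamingSketch(steps, tools) {
  for (const step of steps) {
    if (step.type === "text") {
      yield* step.chunks; // text chunks are delivered as they are generated
    } else {
      // Nothing is delivered to the client while the tools run.
      await Promise.all(
        step.calls.map((call) => tools[call.name].execute(call.args))
      );
    }
  }
}
```

A client consuming this with `for await` would see the chunks of response1, a pause while the tools execute, and then the chunks of response2.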

Tool’s Error handling

  • Issue: If a tool implementation throws an error, should the planner loop stop (leaving the session unusable because it is in an undefined state), or continue with the error message?
  • Proposal: Continue the planner loop, using error.message as the tool output.
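
The proposed error handling amounts to wrapping each execute() call so that a failure becomes ordinary tool output. A minimal sketch, assuming a hypothetical executeToolSafely helper; the handling of non-Error throwables is an assumption, since the proposal only mentions error.message.

```javascript
// Hypothetical wrapper around a tool's execute(): a thrown error (or rejected
// Promise) becomes the tool output instead of aborting the planner loop.
async function executeToolSafely(tool, args) {
  try {
    return await tool.execute(args);
  } catch (error) {
    // Assumption: values that are not Error instances are stringified.
    return error instanceof Error ? error.message : String(error);
  }
}
```

With this shape the model sees the failure text on the next decode and can decide whether to retry, pick another tool, or report the problem to the user.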

Constraint Decoding

  • Issue: It is unclear whether the constraint is applied at each decoding step in the planner loop or only at the first step, which makes it very easy to misuse.
  • Proposal: Do not support setting responseConstraint and prefix when using closed loop mode. Suggest that callers use the open loop API if they need them.
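
One way to surface this restriction is to reject the unsupported options up front. The guard below is a hypothetical sketch: the function name and the choice of TypeError are assumptions (the proposal only says the options are unsupported), and it assumes responseConstraint arrives in the prompt options while prefix is set on a message.

```javascript
// Hypothetical guard run at the start of prompt()/promptStreaming().
function assertNoConstraintsInClosedLoop(toolUseConfig, messages, options = {}) {
  if (!toolUseConfig?.enabled) return; // open loop mode: nothing to check

  if (options.responseConstraint !== undefined) {
    throw new TypeError(
      "responseConstraint is not supported when automatic tool execution is enabled"
    );
  }
  if (messages.some((message) => message.prefix)) {
    throw new TypeError(
      "prefix is not supported when automatic tool execution is enabled"
    );
  }
}

// Passes: closed loop mode with no constraint and no prefix.
assertNoConstraintsInClosedLoop(
  { enabled: true, max_tool_calls: 5 },
  [{ role: "user", content: "What is the weather in Paris?" }]
);
```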

Other Aspects

Observability

  • Issue: The closed loop API needs to provide some observability, e.g., after-the-fact traces.
  • Proposal: Expose a new session.history() function that returns Promise<sequence<LanguageModelMessageContent>>. It returns all messages for all roles (including initial prompts) appended and generated in the session so far.
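
The proposed history() behavior can be sketched with a small in-memory session. SessionSketch and its record() method are hypothetical and exist only to illustrate the idea; returning copies rather than the live message objects is also an assumption.

```javascript
// Hypothetical stand-in for a session that records every message it sees.
class SessionSketch {
  #messages = [];

  constructor(initialPrompts = []) {
    this.#messages.push(...initialPrompts);
  }

  // Called for anything appended by the client or generated in the loop,
  // including automatically executed tool calls and their results.
  record(message) {
    this.#messages.push(message);
  }

  // Resolves with a snapshot of all messages so far, for all roles.
  async history() {
    return this.#messages.map((message) => ({ ...message }));
  }
}
```

A client could call history() after prompt() resolves to inspect which tools ran and what they returned, which is the kind of after-the-fact trace the issue asks for.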

Much credit to @FrankLi-MSFT for the initial closed loop implementation design, prototyping, and alignment on the open loop design.
