Goal
Revamp the initial tool use design so that JS tool execution is integrated within the prompt() call itself, and spell out detailed API behaviors.
Background & Motivation
For tool use, we already aligned on an open-loop tool use design which lets clients handle a tool call themselves. It would be convenient if the API could sometimes handle execution automatically, so we'd like to propose an API that's compatible with the open-loop design, letting clients switch between the two modes depending on the use case.
The initial design omitted many details of the API shape and behavior; this issue tries to address them.
That said, there are still many use cases where the open-loop API behavior is preferred. Here are some examples:
Proposal
API Spec
Everything added for the open-loop API spec can be reused; i.e., closed-loop mode still supports appending tool-call and tool-response objects (more on this below).
Extra spec for automatic execution mode:
```webidl
dictionary LanguageModelCreateCoreOptions {
  // ..... existing fields

  // Each tool now also carries an `execute` function.
  sequence<LanguageModelToolDeclaration> tools;
  AutomaticToolUseConfig tool_use_config;
};

dictionary AutomaticToolUseConfig {
  boolean enabled;
  // Maximum number of tool executions within a single prompt() /
  // promptStreaming() call. When the limit is reached, throw an error:
  // "Max number of tool calls reached". Batched/parallel tool executions
  // each count toward the limit.
  unsigned long max_tool_calls;
};
```
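To make the shape concrete, here is a minimal usage sketch, assuming the existing LanguageModel.create() entry point and a tool declaration shape carried over from the open-loop design. The tool name, schema, and endpoint below are hypothetical:

```js
// Sketch of creating a session with automatic (closed-loop) tool use.
const session = await LanguageModel.create({
  tools: [{
    name: "get_weather", // hypothetical tool
    description: "Returns the current weather for a city",
    inputSchema: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
    // In closed-loop mode, the API calls execute() itself during prompt().
    async execute({ city }) {
      const res = await fetch(`https://weather.example/api?city=${encodeURIComponent(city)}`);
      return res.text();
    },
  }],
  tool_use_config: { enabled: true, max_tool_calls: 5 },
});

// The tool loop runs inside prompt(); the promise resolves with plain text.
const answer = await session.prompt("What's the weather in Tokyo?");
```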
The results of prompt() and promptStreaming() will always be DOMString text: prompt() resolves to a DOMString, and promptStreaming() yields DOMString chunks.
Various API Behaviors
Tool Execution
- Sequential tool execution is always blocking; i.e., the model waits for all tool results before running the next decode.
- Batched tool calls run in parallel; i.e., if the model decodes “tool-call-1” and “tool-call-2”, their execute functions run in parallel, and the planner loop waits for all of the Promises to resolve before starting the next decode (sketched below).
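A rough sketch of the intended batching semantics, assuming a hypothetical internal planner step that receives all tool calls decoded in one step:

```js
// Hypothetical planner-loop step: run one decoded batch of tool calls
// concurrently, and block the loop until every Promise settles.
async function runToolBatch(toolCalls, toolsByName) {
  // All execute() functions start immediately and run in parallel.
  const results = await Promise.all(
    toolCalls.map((call) => toolsByName.get(call.name).execute(call.arguments))
  );
  // Only after every result is in does the next decode start.
  return results;
}
```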
Append & Generate
- Appending tool calls and tool responses is still supported, but "tool-call" and "tool-response" objects must be appended together in pairs.
- At prompt() / promptStreaming() time, the API throws an error if a "tool-call" in the arguments is not followed by its "tool-response". It's fine to prompt() with [tool-call, tool-response]; the model will continue decoding, and any future tool calls are handled automatically by the API (sketched below).
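A sketch of the pairing rule, using the "tool-call" / "tool-response" content types from the open-loop design; the value fields and roles here are assumptions:

```js
// OK: every tool-call is immediately paired with its tool-response.
await session.append([
  {
    role: "assistant",
    content: [{ type: "tool-call", value: { id: "call-1", name: "get_weather", arguments: { city: "Tokyo" } } }],
  },
  {
    role: "user",
    content: [{ type: "tool-response", value: { id: "call-1", response: "22°C, sunny" } }],
  },
]);

// Any tool calls the model makes from here on run automatically.
const answer = await session.prompt("Summarize the weather.");

// NOT OK: appending a tool-call with no matching tool-response would
// make the next prompt() / promptStreaming() throw.
```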
Streaming vs. Unary
- Consider this example model generation sequence: [response1, tool1, tool2, response2]
- For prompt(), the returned promise does not resolve until response2 has been generated.
- For promptStreaming(), the client first receives chunks of response1 from the ReadableStream, then waits while the tools execute, and finally receives chunks of response2 (sketched below).
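For the sequence above, consuming the stream might look like this (ReadableStream is async-iterable in supporting browsers):

```js
// Chunks of response1 arrive first; the stream then goes quiet while
// tool1 and tool2 execute; chunks of response2 arrive last.
const stream = session.promptStreaming("Plan my day around the weather.");
for await (const chunk of stream) {
  console.log(chunk);
}
// The stream closes only once response2 is fully generated.
```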
Tool Error Handling
- Issue: if a tool implementation throws, should the planner loop stop (leaving the session unusable because it is in an undefined state), or continue with the error message?
- Proposal: continue the planner loop, using error.message as the tool output (sketched below).
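A sketch of the proposed policy, as it might look inside a hypothetical planner-loop step:

```js
// A throwing tool does not kill the session; its error message is fed
// back to the model as the tool's output, and the loop continues.
let output;
try {
  output = await tool.execute(call.arguments);
} catch (err) {
  output = err.message; // continue decoding with the error text as the result
}
```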
Constraint Decoding
- Issue: it's unclear whether the constraint applies to every decoding step in the planner loop or only to the first step, which makes it very easy to misuse.
- Proposal: do not support setting `responseConstraint` and `prefix` when using closed-loop mode; suggest the open-loop API to callers who need them.
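For concreteness, a hypothetical example of the combination this proposal would reject, using the existing responseConstraint option name:

```js
// With automatic tool use enabled, a constraint is rejected up front
// rather than being silently applied to only one decode step.
await session.prompt("What's the weather in Tokyo?", {
  responseConstraint: { type: "object", properties: { temp: { type: "number" } } },
}); // expected to throw when tool_use_config.enabled is true
```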
Other Aspects
Observability
- Issue: Closed loop API needs to provide some observability. E.g, after-the-fact traces.
- Proposal: expose a new `session.history()` function which returns `Promise<sequence<LanguageModelMessageContent>>`. It returns all messages for all roles (incl. initial prompts) appended & generated in the session so far (usage sketched below).
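A usage sketch of the proposed accessor:

```js
// After-the-fact trace of everything appended & generated in the
// session, including initial prompts, tool calls, and tool responses.
const history = await session.history();
for (const item of history) {
  console.log(item); // each entry: one appended or generated message part
}
```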
Much credit to @FrankLi-MSFT's initial closed-loop implementation design + prototyping + alignment on the open-loop design.