
Support tool call round trips in streamUI #1895

Open

mfclarke-cnx opened this issue Jun 9, 2024 · 14 comments
Labels: ai/rsc, enhancement (New feature or request)

Comments

mfclarke-cnx commented Jun 9, 2024

Feature Description

Tool call round trips are currently only supported in generateText: https://sdk.vercel.ai/docs/ai-sdk-core/tools-and-tool-calling#tool-roundtrips
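For reference, a tool call round trip with generateText looks roughly like this (a sketch; the option name has changed across SDK versions, e.g. maxToolRoundtrips in 3.x and maxSteps later, and fetchWeather is a placeholder):

import { generateText, tool } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

const { text } = await generateText({
  model: openai('gpt-4o'),
  maxToolRoundtrips: 2, // feed tool results back to the model for another turn
  tools: {
    getWeather: tool({
      description: 'Get the weather for a city',
      parameters: z.object({ city: z.string() }),
      execute: async ({ city }) => fetchWeather(city), // placeholder data fetch
    }),
  },
  prompt: 'What is the weather in Berlin?',
})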

Support in streamUI would be great. Currently I'm working around this by returning a hidden component at the completion of the tool call that makes a new user request asking the model to continue (sketched below). This isn't great because I have to account for the workaround in other areas (restoring saved chats, etc.).
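For context, a rough sketch of that hidden-continue workaround (the component and the continueConversation action are hypothetical, app-defined names, not part of the SDK):

'use client'

import { useEffect } from 'react'
import { useActions } from 'ai/rsc'

// Rendered at the end of a tool call; it displays nothing and immediately
// sends a follow-up request so the model keeps going.
export function HiddenContinue({ toolResult }: { toolResult: string }) {
  const { continueConversation } = useActions()

  useEffect(() => {
    continueConversation(`Tool result: ${toolResult}. Please continue.`)
  }, [toolResult, continueConversation])

  return null
}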

Use Case

Any tool call that doesn't require the user to confirm. For example, looking up information to use in a response.

Additional context

No response

lgrammel added the enhancement and ai/rsc labels on Jun 11, 2024
@Floriferous

This is crucial if you want to have a real conversation with an AI that can call tools. Right now I get the toolCall result back, and then I can't chat anymore.

Here's my conversation:

Me: Hi, what can you do?
AI: Hi, I can do X
Me: Ok please do X
AI: toolcall.result = Y // I would expect some commentary to come with the result here
Me: Ok thanks
AI: toolcall.result = Y
Me: What do you think about the result?
AI: toolcall.result = Y

How do you work around this behavior?

@benjaminalgreen

This is possible using frontend tool calls, but tool calls for streamed responses are supposed to be coming soon. #1574 (comment)
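For context, the frontend tool-call approach looks roughly like this with useChat (a sketch; getLocation is a made-up tool and maxToolRoundtrips has been renamed in newer SDK versions):

'use client'

import { useChat } from 'ai/react'

export function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    maxToolRoundtrips: 2, // send client-side tool results back for another model turn
    onToolCall: async ({ toolCall }) => {
      // The returned value becomes the tool result for this call.
      if (toolCall.toolName === 'getLocation') {
        return 'Berlin'
      }
    },
  })

  return (
    <form onSubmit={handleSubmit}>
      {messages.map(m => (
        <div key={m.id}>
          {m.role}: {m.content}
        </div>
      ))}
      <input value={input} onChange={handleInputChange} />
    </form>
  )
}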

@nsenthilkumar

While a QoL update to automate round trips would be great, for right now you can recursively pipe tool call responses back into your submitMessage() yourself:

// inside the tool: do the work, then pipe the result back through your own
// submitMessage action so the model takes another turn
async function* myTool({ query }) {
  const toolResult = await doStuff(query) // doStuff stands in for the actual work
  return submitMessage(JSON.stringify(toolResult))
}

kevinschaich commented Jun 26, 2024

Also interested in this. One thought: maybe the streamUI API would benefit from keeping structured tool output separate from the desired UI (the generate function in most of the examples).

Manually piping tool output back in via submitMessage as mentioned by @nsenthilkumar is interesting, but wouldn't that only provide additional context to the following user message in the chat history? It doesn't direct the assistant to take another step after the current tool run completes in order to make incremental progress using multiple tool invocations against the original prompt. If I'm missing something I'd love to see a more complete example.

Open to ideas, but something like this would provide a history of previous tool invocations, similar to generateText:

tools: {
    getWeather: {
        description: 'Get Weather',
        parameters: z.object({
            city: z.string().describe('The city to get weather for'),
        }),
        generate: async function* ({ city }) {
            const weatherForCity = getWeather(city)
            yield (
                {
                    output: weatherForCity,
                    widget: <Weather weather={weatherForCity} />,
                }
            )
        },
    }
}

bastotec commented Jul 1, 2024

I'm also curious how to persist tool result messages when using streamUI. Right now OpenAI requires that every tool call have a corresponding tool result message in the chat history array. Since tools with streamUI return a ReactNode, are we supposed to save that serialized node as a ToolResultPart and feed it back into the history?
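One pattern (a sketch that assumes the AI state is an array of core messages; getWeather and the Weather component are placeholders): keep the raw result in the AI state as tool-call / tool-result messages, and treat the ReactNode purely as display output.

import { getMutableAIState } from 'ai/rsc'
import { nanoid } from 'nanoid'

// Inside a tool's generate function: persist the raw data, not the node.
async function* generate({ city }: { city: string }) {
  const aiState = getMutableAIState()
  const toolCallId = nanoid()
  const weather = await getWeather(city)

  aiState.done([
    ...aiState.get(),
    {
      role: 'assistant',
      content: [
        { type: 'tool-call', toolCallId, toolName: 'getWeather', args: { city } }
      ]
    },
    {
      role: 'tool',
      content: [
        { type: 'tool-result', toolCallId, toolName: 'getWeather', result: weather }
      ]
    }
  ])

  return <Weather weather={weather} />
}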

@cryptoKevinL

I've made a support request to Vercel for this issue as well. I was very surprised there was no existing support or other suggested workaround/solution.

@karam-khanna

Big +1 on this

yamz8 commented Aug 3, 2024

> I've made a support request to Vercel for this issue as well. I was very surprised there was no existing support or other suggested workaround/solution.

Any updates from Vercel?

@cryptoKevinL

Would be nice. Is there a technical reason why it's not possible? For example, does streamUI want to start returning data to the user before it has the full response, and would it need to wait for and process the full result in case it needs to call another tool (thus defeating the purpose of streaming in the first place)? Just curious; knowing which path we need to take one way or the other would help development.

yamz8 commented Aug 4, 2024

> Would be nice. Is there a technical reason why it's not possible? For example, does streamUI want to start returning data to the user before it has the full response, and would it need to wait for and process the full result in case it needs to call another tool (thus defeating the purpose of streaming in the first place)? Just curious; knowing which path we need to take one way or the other would help development.

In the latest version they added support for this with text streaming, but not with streamUI: https://vercel.com/blog/introducing-vercel-ai-sdk-3-2
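For reference, multi-step tool calls with streamText look roughly like this in newer SDK versions (a sketch; older versions used maxToolRoundtrips instead of maxSteps, and fetchWeather is a placeholder):

import { streamText, tool } from 'ai'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod'

const result = await streamText({
  model: openai('gpt-4o'),
  maxSteps: 3, // keep feeding tool results back until the model answers in text
  tools: {
    getWeather: tool({
      description: 'Get the weather for a city',
      parameters: z.object({ city: z.string() }),
      execute: async ({ city }) => fetchWeather(city),
    }),
  },
  prompt: 'How warm is it in Berlin right now?',
})

for await (const delta of result.textStream) {
  process.stdout.write(delta)
}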

KeKs0r commented Oct 8, 2024

Did anyone find a good workaround for this yet? I also do some tool calling to fetch data and have generic rendering for that data, but I would love to get some kind of summary or answer based on the actual user question.

zzh8829 commented Oct 10, 2024

I was able to get this working by creating a custom streamUI implementation powered by streamText. Here is the general idea of how it works.

import { ReactNode } from 'react'
import { nanoid } from 'nanoid'
import { createStreamableUI } from 'ai/rsc'
import {
  streamText,
  type CallWarning,
  type CoreTool,
  type FinishReason,
  type LanguageModelUsage
} from 'ai'

// Stand-ins for internal streamUI helpers that the SDK does not export.
function createResolvablePromise<T = void>() {
  let resolve!: (value: T) => void
  let reject!: (error: unknown) => void
  const promise = new Promise<T>((res, rej) => {
    resolve = res
    reject = rej
  })
  return { promise, resolve, reject }
}
const isGenerator = (value: any): value is Generator =>
  value != null && typeof value.next === 'function' && Symbol.iterator in value
const isAsyncGenerator = (value: any): value is AsyncGenerator =>
  value != null && typeof value.next === 'function' && Symbol.asyncIterator in value
// Mirrors the internal renderer callback type used by streamUI.
type Renderer<TArgs extends any[]> = (...args: TArgs) => any

export async function streamUIV2<TOOLS extends Record<string, CoreTool> = {}>({
  model,
  tools,
  toolChoice,
  system,
  prompt,
  messages,
  maxRetries,
  abortSignal,
  headers,
  initial,
  text,
  experimental_providerMetadata: providerMetadata,
  onFinish,
  ...settings
}: any) { // options mirror streamUI's; full parameter typing omitted for brevity
  const ui = createStreamableUI(initial)
  const nodes: [string, ReactNode][] = []
  // map of tool call id to step number
  const stepsMap = new Map<string, number>()

  const updateNodes = (id: string, node: ReactNode): ReactNode => {
    if (stepsMap.has(id)) {
      const step = stepsMap.get(id)
      if (step !== undefined) {
        nodes[step] = [id, node]
      }
    } else {
      nodes.push([id, node])
      stepsMap.set(id, nodes.length - 1)
    }

    return (
      <>
        {nodes.map(([id, node], _) => (
          <div key={id}>{node}</div>
        ))}
      </>
    )
  }

  // The default text renderer (from the original streamUI source) just
  // returns the content as a string.
  const textRender = text || defaultTextRenderer

  let finished: Promise<void> | undefined

  let finishEvent: {
    finishReason: FinishReason
    usage: LanguageModelUsage
    warnings?: CallWarning[]
    rawResponse?: {
      headers?: Record<string, string>
    }
  } | null = null

  async function render({
    args,
    renderer,
    streamableUI,
    isLastCall,
    stepId
  }: {
    renderer: undefined | Renderer<any>
    args: [payload: any] | [payload: any, options: any]
    streamableUI: ReturnType<typeof createStreamableUI>
    isLastCall?: boolean
    stepId: string
  }): Promise<string> {
    if (!renderer) return ''

    // create a promise that will be resolved when the render call is finished.
    // it is appended to the `finished` promise chain to ensure the render call
    // is finished before the next render call starts.
    const renderFinished = createResolvablePromise<void>()
    finished = finished
      ? finished.then(() => renderFinished.promise)
      : renderFinished.promise

    const rendererResult = await renderer(...args)
    let data = ''

    if (isAsyncGenerator(rendererResult) || isGenerator(rendererResult)) {
      while (true) {
        const { done, value } = await rendererResult.next()
        const node = await value.node

        if (isLastCall) {
          streamableUI.done(updateNodes(stepId, node))
        } else {
          streamableUI.update(updateNodes(stepId, node))
        }

        if (done) {
          data = await value.data
          break
        }
      }
    } else {
      const node = await rendererResult.node
      data = await rendererResult.data

      if (isLastCall) {
        streamableUI.done(updateNodes(stepId, node))
      } else {
        streamableUI.update(updateNodes(stepId, node))
      }
    }

    // resolve the promise to signal that the render call is finished
    renderFinished.resolve()

    return data
  }

  /* for each tool, wrap its 'generate' renderer in an 'execute' function for streamText */
  const reformattedTools =
    tools &&
    (Object.entries(tools).reduce((acc: Record<string, any>, [name, tool]) => {
      acc[name] = {
        ...tool,
        execute: (args: any, options: any) => {
          return render({
            renderer: tool.generate,
            args: [args],
            streamableUI: ui,
            stepId: `${name}-${nanoid()}`
          })
        }
      }
      return acc
    }, {}) as {
      [name in keyof TOOLS]: TOOLS[name]
    })

  let textSection: number = 0

  const result = await streamText({
    model,
    tools: reformattedTools,
    toolChoice,
    system,
    prompt,
    messages,
    maxRetries,
    abortSignal,
    headers,
    maxSteps: 3, // allow multiple steps so tool results are fed back to the model
    onChunk: event => {
      if (event.chunk.type === 'text-delta') {
        render({
          renderer: textRender,
          args: [{ done: false, delta: event.chunk.textDelta }],
          streamableUI: ui,
          stepId: `text-${textSection}`
        })
      } else {
        textSection++
      }
    },
    onFinish: event => {
      onFinish &&
        onFinish({
          ...event,
          value: ui.value
        })

      finished
        ? finished.then(() => {
            ui.done()
          })
        : ui.done()
    }
  })

  ;(async () => {
    for await (const _ of result.textStream) {
      // drain the stream: if it isn't consumed, onChunk/onFinish never fire
    }
  })()

  return {
    ...result,
    value: ui.value
  }
}

Then in the caller you can do this:

const result = await streamUIV2({
...
   tools: {
      fetchData: {
        description: 'Fetch data',
        parameters: z.object({
          description: z.string()
        }),
        generate: async function* ({ description }) {
          yield {
            data: 'loading',
            node: <SystemMessage>Fetching data...</SystemMessage>
          }

          const toolCallId = nanoid()
          const data = await fetchData()
   
          return {
            data: JSON.stringify(data),
            node: (
              <BotCard>
                <SystemMessage>
                  Data Fetched: {JSON.stringify(data, null, 2)}
                </SystemMessage>
              </BotCard>
            )
          }
        }
      },
...

zzh8829 commented Oct 10, 2024

With this strategy the UI can properly display multiple round-trip / parallel tool calls, with text before and after. The end result looks pretty similar to what ChatGPT gives you these days.

zzh8829 commented Oct 10, 2024

A lot of the boilerplate code is not needed; it's only there because I forked streamUI with very minimal modification. I imagine an official implementation of this could be a lot cleaner, with a better interface.
