Update Flows reference docs and guide#281
Conversation
512fadd to
f4eab5c
Compare
| LLMs decide when to run each function, via their function calling (or tool calling) mechanism. | ||
| - You require precise control over how a conversation progresses | ||
| - Your bot needs to handle a complex task that can be decomposed into discrete steps | ||
| - You're trying to improve the accuracy of your LLM's response or tool use |
There was a problem hiding this comment.
Not sure I grok this 3rd point. Can you explain?
There was a problem hiding this comment.
Ah, you're referring to cases where you might otherwise have a huge prompt and many tools registered at once. I think I understand. The follow paragraph makes it clearer.
There was a problem hiding this comment.
I reworded this section. Hopefully this helps to avoid this point.
| - `name`: The name of the node; used as a reference to transition to the node | ||
| - `role_messages`: A list of message `dicts` defining the bot's role/personality | ||
| - `task_messages`: A list of message `dicts` defining the current node's objectives | ||
| - `functions`: A list of function definitions in either a provider-specific format or a FlowsFunctionSchema |
There was a problem hiding this comment.
Or the function itself directly, if it's defined as a "direct" function...
Should we maybe keep it vaguer here, since this is just a high-level intro? (Unless we're about to go into the nitty-gritty; haven't read ahead yet)
There was a problem hiding this comment.
Let's keep it high level. I'll mention the function call and its corresponding handler.
| } | ||
| } | ||
| ``` | ||
| Some handlers may not need to return a next node, in which case you can return `None`. Also, some handlers may not need to return a result, in which case you can return `None` for the result as well. |
There was a problem hiding this comment.
Is it the case that you'd only return None for the result if you're only switching conversation nodes? (As opposed to doing some work where the only purpose of a result would be to say "OK, I'm done")?
If so, would suggest changing this to emphasize why you'd use None for either value:
| Some handlers may not need to return a next node, in which case you can return `None`. Also, some handlers may not need to return a result, in which case you can return `None` for the result as well. | |
| Some handlers may not want to transition conversational state, in which case you can return `None` for the next node. Also, some handlers may only serve to transition to a next node, in which case you can return `None` for the result. |
There was a problem hiding this comment.
Maybe we should encourage using None only for the case where you want to do work in the node and not transfer. I think you pretty much always want to return a result. I'll make that update.
There was a problem hiding this comment.
I think you pretty much always want to return a result
I don't think that's true. We would want to return only a next node and no result in all the cases where we'd previously use a transition_* by itself without a handler. Fixing.
| ## State Management | ||
|
|
||
| ### Example Implementation | ||
| Pipecat Flows includes built-in state management through the `flow_manager.state` dictionary. This persistent storage lets you share data across nodes throughout the entire conversation. |
There was a problem hiding this comment.
Never noticed this before, but: not sure I would call the flow_manager.state persistent storage "state management" (which I think of more as managing transitions between state). Maybe it's more "Conversation-wide state" or maybe "Cross-node state"?
| Pipecat Flows includes built-in state management through the `flow_manager.state` dictionary. This persistent storage lets you share data across nodes throughout the entire conversation. | |
| ## Cross-Node State | |
| Pipecat Flows supports cross-node state through the `flow_manager.state` dictionary. This persistent storage lets you share data across nodes throughout the entire conversation. |
There was a problem hiding this comment.
Yeah! Cross-Node State is a great concept. The idea is to make it easier to have state to pass around, avoiding the need for globals.
| Pipecat Flows supports three ways to define function calls: | ||
|
|
||
| # Flow Editor | ||
| #### FlowsFunctionSchema (Recommended) |
There was a problem hiding this comment.
As I've been reading this doc and looking at our examples, I'm wondering if/when we should flip to recommending direct functions as the main recommended way of doing things, with function schemas as an alternative if you happen to need more fine-grained control (mainly over the properties object). Using functions directly really does cut down boilerplate and opportunities for making mistakes...
There was a problem hiding this comment.
(This can be a question for later, after this PR)
There was a problem hiding this comment.
Once we get a better sense of how well this works, I think we can make this recommendation. For now, we can stick with FlowsFunctionSchema.
| Uses the provider's native format (OpenAI, Anthropic, etc.). While supported, we recommend using FlowsFunctionSchema for better portability. | ||
|
|
||
| ## Naming Conventions | ||
| ### Actions |
There was a problem hiding this comment.
Wait...weren't we talking about actions earlier in this doc? Why did we circle back to it? (A little lost on the structure)
There was a problem hiding this comment.
I set this up as:
- Technical overview (all sections)
- Usage (all sections)
Though, the actions examples are trivial, so I'll integrate them into the technical overview section.
There was a problem hiding this comment.
Actually, I opted to restructure this and move the examples into the overview section. That leaves the example to look at a single dynamic flows example.
| ``` | ||
|
|
||
| When using the Flow Editor, function handlers can be specified using the `__function__:` token: | ||
| #### Custom Actions |
There was a problem hiding this comment.
Wonder if we can find a way to de-emphasize custom actions. I would think a user would almost always be better served by a "function" action for custom needs. This sections implies that if you're not using one of our pre-canned actions you should reach for custom actions, which isn't quite right.
There was a problem hiding this comment.
Also: I had recently renamed "built-in" to "pre-canned" in the reference docs to distinguish between 3 types:
- pre-canned
- function
- custom (to be used rarely)
There was a problem hiding this comment.
In deleting this section, I have one less mention of custom actions, which is probably a good thing.
Maybe we even deprecate custom actions in favor of using functions. WDYT? It should be functionally equivalent. But "functions" offer the guarantee of timing.
There was a problem hiding this comment.
Maybe we even deprecate custom actions in favor of using functions. WDYT? It should be functionally equivalent.
That might be a good idea. Could minimize confusion, for sure.
They're technically not completely functionally equivalent, as a custom action as the first action in the post_actions list would run immediately after the LLM completion whereas a function action in the same position would run after the bot finished speaking.
Regardless, let's revisit this question after this PR.
| guide](/guides/features/pipecat-flows) first. | ||
| </Tip> | ||
|
|
||
| Pipecat Flows is a structured conversation framework for building AI applications with Pipecat. It enables both predefined conversation paths (static flows) and runtime-determined flows (dynamic flows) with comprehensive state management, function calling, and cross-provider LLM compatibility. |
There was a problem hiding this comment.
Similar comment as above: maybe we can do without alluding to static v dynamic in this kind of compact high-level summary
There was a problem hiding this comment.
I'll just use the same overview. I think that's fine.
Rewrite of both docs pages: