Runtime SDK for executing Voiceflow projects and managing conversational state across different platforms.
⚠️ This repository is still undergoing active development: major breaking changes may be pushed periodically and the documentation may become outdated - a stable version has not been released.
```bash
yarn add @voiceflow/runtime
# or
npm i @voiceflow/runtime
```
At a very high level, you can think of the whole Voiceflow system like code:

- the frontend creator system allows conversation designers to "write" in a visual programming language, like an IDE. Each of their diagrams is like a function/instruction.
- when they press Upload or Export, all their flow diagrams are "compiled" into programs (see the sketch below)
- the runtime is like the CPU that reads these programs and executes them based on end user input
- these programs can be read through Local Project Files or through a Voiceflow API, where they are hosted with Voiceflow databases
- the runtime handles state management and tracks the user's state at each turn of conversation
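For a concrete picture, here is a rough sketch of what a compiled program might look like - the field names here are illustrative assumptions, not the actual schema:

```ts
// Hypothetical shape of a compiled program
// (illustrative assumptions, not the actual Voiceflow schema)
interface ProgramNode {
  id: string;
  type: string; // e.g. 'speak', 'CUSTOM_NODE'
  nextId?: string | null; // pointer to the next node, if any
}

interface Program {
  id: string; // program (flow) identifier
  startId: string; // node where execution begins
  nodes: Record<string, ProgramNode>; // every step/block in the flow
  variables: string[]; // variables local to this flow
}
```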
Where conversational state management differs from traditional code execution is that it is heavily I/O-based and always blocked while awaiting user input:
It is important to understand the Conversation Request/Response Webhook Model.
The end user session storage is determined by the implementation of the runtime, but it stores the most sensitive data in this architecture (what the user has said, where they are in the conversation, variables, etc.)
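As a rough sketch, a stored user state record might look something like this - the field names are assumptions for illustration, not the actual schema:

```ts
// Hypothetical user state record kept in end user session storage
// (illustrative assumptions, not the actual schema)
interface UserState {
  stack: { programID: string; nodeID: string | null }[]; // where the user is in the conversation
  variables: Record<string, unknown>; // what has been collected so far
  storage: Record<string, unknown>; // arbitrary data persisted between turns
}
```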
As a practical example, here's what happens when a user speaks to a custom skill (app) on their Alexa device. Most other conversational platforms function in this manner - if it is a text or chat platform, it would be exactly the same, just without a voice-to-text/text-to-speech layer.
- user says something to Alexa; Alexa uses natural language processing to transcribe the user's intent, then sends it via webhook (along with other metadata, e.g. userID) to `alexa-runtime`, which implements `@voiceflow/runtime`
- fetch the user state (JSON format) from end user session storage, based on a userID identifier
- fetch the current program (flow) that the user is on from the Voiceflow API/Local Project File
- go through the nodes in the flow and update the user state
- save the final user state to end user session storage
- generate a webhook response based on the final user state and send it back to Alexa
- Alexa interprets the response and speaks to the user

Repeat all steps each time a user speaks to the Alexa skill to carry out a conversation. (Here's Amazon Alexa's documentation.)
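To make the exchange concrete, a minimal sketch of the webhook payloads involved in one turn might look like this - these shapes are assumptions for illustration, not Alexa's actual request format:

```ts
// Hypothetical webhook payloads for one turn of conversation
// (illustrative assumptions, not Alexa's actual format)
interface WebhookRequest {
  userID: string; // identifies the end user session
  versionID: string; // identifies the project version to run
  payload: { intent?: string; slots?: Record<string, string> }; // NLP output
}

interface WebhookResponse {
  speak: string; // text for the platform's text-to-speech layer
  card?: unknown; // optional visual payload
}
```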
```ts
import Client, { LocalDataApi, EventType } from '@voiceflow/runtime';
import CustomHandlers from './handlers';

// DefaultHandlers, PROJECT_SOURCE, and logger are assumed to be defined elsewhere
const client = new Client({
  handlers: {
    ...DefaultHandlers,
    ...CustomHandlers,
  },
  api: new LocalDataApi({ projectSource: PROJECT_SOURCE /* local project file */ }),
});

// if you want to inject custom handlers during lifecycle events
client.setEvent(EventType.handlerDidCatch, (err, runtime) => {
  logger.log(err);
  throw err;
});
```
`Client` can be initialized with either `LocalDataApi`, which reads a local file for the project and program source, or `ServerDataApi`, which requires an API key (`adminToken`) and endpoint (`dataEndpoint`) and fetches the data remotely.
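For example, a remote setup might look like the following - the option names `adminToken` and `dataEndpoint` come from the description above, but the exact constructor shape is an assumption:

```ts
import Client, { ServerDataApi } from '@voiceflow/runtime';

// sketch only - the option shape is an assumption based on the description above
const remoteClient = new Client({
  handlers: { /* ...DefaultHandlers, ...CustomHandlers */ },
  api: new ServerDataApi({
    adminToken: process.env.ADMIN_TOKEN, // API key
    dataEndpoint: process.env.DATA_ENDPOINT, // remote data endpoint
  }),
});
```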
```ts
// incoming webhook request
const handleRequest = async (userID, versionID, payload) => {
  // retrieve the previous user state (DB is assumed to be defined elsewhere)
  const rawState = await DB.fetchState(userID);

  const runtime = client.createRuntime(versionID, rawState, payload);

  // update the state and generate a new runtime (step through the program)
  await runtime.update();

  // save the new user state
  await DB.saveState(userID, runtime.getFinalState());

  // generate a response based on the trace
  const response = { speak: '' };
  runtime.trace.forEach((trace) => {
    if (trace.type === 'card') response.card = trace.payload;
    if (trace.type === 'speak') response.speak += trace.payload;
  });

  return response; // the SDK usually handles this
};
```
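To wire this into an actual service, you might expose `handleRequest` behind an HTTP endpoint - a minimal sketch with Express (any HTTP framework works; Express here is just an assumption for illustration):

```ts
import express from 'express';

const app = express();
app.use(express.json());

// each POST is one turn of conversation
app.post('/webhook/:versionID', async (req, res) => {
  const { userID, payload } = req.body;
  const response = await handleRequest(userID, req.params.versionID, payload);
  res.json(response);
});

app.listen(4000);
```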
```ts
import { HandlerFactory } from '@voiceflow/runtime';

// create a handler for a particular node type
export const CustomNodeHandler: HandlerFactory<Node, typeof utilsObj> = (utils) => ({
  // return a boolean indicating whether this handler can handle a particular node
  canHandle: (node) => {
    return node.type === 'CUSTOM_NODE' && typeof node.custom === 'string';
  },
  // perform side effects and return the next node to go to
  handle: (node, runtime, variables) => {
    if (runtime.storage.get('customSetting')) {
      // add something to the trace to help generate the response
      runtime.trace.addTrace({
        type: 'card',
        payload: node.custom,
      });
      // maybe add something to analytics when this block is triggered
      metrics.log(node.custom);
    }
    // if you return null it will immediately end the current flow
    return node.nextId;
  },
});
```
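The factory is called with its utility dependencies to produce the actual handler, which can then be included in the client's `handlers` option - a minimal sketch, where `utilsObj` and the export shape are illustrative assumptions:

```ts
// inject dependencies into the factory to get a concrete handler
const utilsObj = {
  sanitize: (text: string) => text.trim(), // hypothetical helper
};

// matches the `import CustomHandlers from './handlers'` usage above
export default {
  customNode: CustomNodeHandler(utilsObj),
};
```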
Return Types

- return `null`: ends the current flow and pops it off the stack
- return a different `nodeID`: as long as that nodeID is present in the current flow program, it will be handled next - if it is not found, same behavior as returning `null`
- return its own `nodeID`: if the handler's `handle()` returns the same nodeID as the node it is handling, then the execution of this interaction (`runtime.update()`) will end in the exact same state and wait for the next user interaction/webhook. The next request will begin on this same node. You would do this to await user input (see the sketch below). Example: Choice Node
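A minimal sketch of the self-return pattern, roughly what a Choice Node does - the node fields and the `getRequest()` accessor used here are assumptions for illustration:

```ts
import { HandlerFactory } from '@voiceflow/runtime';

// hypothetical node shape for illustration
type ChoiceNode = {
  id: string;
  type: string;
  prompt: string;
  choices: { intent: string; nextId: string }[];
  elseId: string | null;
};

export const ChoiceNodeHandler: HandlerFactory<ChoiceNode> = () => ({
  canHandle: (node) => node.type === 'CHOICE',
  handle: (node, runtime) => {
    const request = runtime.getRequest(); // assumed accessor for the incoming request
    if (!request) {
      // first pass: prompt the user and halt on this node
      runtime.trace.addTrace({ type: 'speak', payload: node.prompt });
      return node.id; // same nodeID => this turn ends here, awaiting input
    }
    // next turn: branch on the matched intent
    const match = node.choices.find((choice) => choice.intent === request.payload?.intent);
    return match ? match.nextId : node.elseId;
  },
});
```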
- `request`: what the end user has done (intent, button press, etc.)
- `runtime`: general purpose object that provides an API to manipulate state
- `platform`: Alexa, Google, IVR, Messenger, Slack, Twilio, etc.
- `frame`: program frame - contains local variables, pointers, etc.
- `stack`: stack of program frames (like a code execution stack)
- `program`: flow program object (equivalent to text/code in the memory model) (READ ONLY)
- `node`: node metadata - each individual step/block in the program
- `variables`: global variables
- `storage`: object full of things you want to persist between requests
- `turn`: object full of things that only matter during this turn
- `handlers`: array of handlers that handle specific node functionalities
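As a quick illustration of how the three state scopes differ inside a handler - the `.set()` accessors and key names here are assumptions for illustration:

```ts
import { HandlerFactory } from '@voiceflow/runtime';

// sketch: contrasting variables, storage, and turn inside handle()
const ScopesExampleHandler: HandlerFactory<any> = () => ({
  canHandle: (node) => node.type === 'SCOPES_EXAMPLE',
  handle: (node, runtime, variables) => {
    variables.set('name', 'Ada'); // global variable, visible across the project
    runtime.storage.set('visitCount', 1); // persists between requests
    runtime.turn.set('stopTurn', true); // only lives for the current turn
    return node.nextId;
  },
});
```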