Azure OpenAI /realtime: an interactive chat with multi-assistants

This repo contains a Node.js sample application that uses the AOAI Realtime Audio endpoint. For more detail about the SDK, see the AOAI Realtime Audio SDK.

This sample switches seamlessly between multiple assistants (each a system prompt plus a set of tools) depending on your intent.

Scenario

You can ask about mobile services and related topics, such as:

Weather
Mobile phone billing
Mobile phone current plan
Mobile phone options
Consultation on usage
Mobile phone store related questions, etc.

You can find the assistant definitions in assistants.ts. Review the tool set of each assistant to understand what it can do, and modify the definitions as needed.

Prereqs

Node.js installation (https://nodejs.org)
Azure OpenAI account
GPT-4o realtime model
Bing Search Resource

Using the sample

Navigate to this folder
Run npm install to download a small number of dependency packages (see package.json)
Rename .env_sample to .env and update variables
Run npm run dev to start the web server, accepting any firewall permission prompts
Use any of the provided URIs from the console output, e.g. http://localhost:5173/, in a browser
To debug the application, press F5, which launches the browser for debugging.
Check Chat Only if you prefer text input only; otherwise you can use both speech and text.
Click the "Start" button to start the session; accept any microphone permissions dialog
You should see a << Session Started >> message in the left-side output, after which you can speak to the app
You can interrupt the chat at any time by speaking and completely stop the chat by using the "Stop" button
Optionally, you can type in the chat area instead of speaking to the bot.
The assistant name is displayed in the assistant name text input whenever an assistant is loaded.
To delete a specific message, enter the Id of the message (shown in the chat history) into Delete Item and click Delete; the item is then shown with a strikethrough.

Known issues

Connection errors are not yet handled gracefully, and looping error output may appear in the script debug output. If an error appears, refresh the web page.
Voice selection is not yet supported.
More authentication mechanisms, including keyless support via Entra, will come in a future service update.

Code description

This sample uses a custom client to simplify the usage of the realtime API. The client package is included in this repo in the rt-client-0.4.7.tgz file. If you need the latest version of the SDK, check the AOAI Realtime Audio SDK for a newer package.

The primary file demonstrating /realtime use is src/main.ts; the first few functions demonstrate connecting to /realtime using the client, sending an inference configuration message, and then processing the send/receive of messages on the connection.

Assistants

In this repo, an assistant consists of:

a system prompt
tools (function calling definitions)

We use the function calling feature to switch to another assistant.

For example, the generic assistant has the following function calling definition:

{
    name: 'Assistant_MobileAssistant',
    description: 'Help user to answer mobile related question, such as billing, contract, etc.',
    parameters: {
        type: 'object',
        properties: {}
    },
    returns: async (arg: string) => "Assistant_MobileAssistant"
}

This function is called whenever you ask a mobile phone related question. When we execute the function, instead of returning the function calling result to the LLM, we send:

a SessionUpdateMessage to switch the assistant.
a response.create message to let the model continue the conversation.
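
The switching step described above could be sketched roughly as follows. This is a minimal illustration, not the sample's actual code: the Assistant, ToolDefinition, ToolSpec, and handleFunctionCall names are hypothetical, the outgoing messages are modeled as plain objects that loosely follow the realtime event names rather than the rt-client types, and the sketch assumes a switch is detected when a function's result matches a registered assistant id.

```typescript
// Hypothetical shapes for illustration; the sample's real definitions live in assistants.ts.
interface ToolDefinition {
    name: string;
    description: string;
    parameters: { type: 'object'; properties: Record<string, unknown> };
    returns: (arg: string) => Promise<string>;
}

interface Assistant {
    id: string;
    systemPrompt: string;
    tools: ToolDefinition[];
}

// Tool metadata as sent to the model (everything except the local handler).
interface ToolSpec {
    type: 'function';
    name: string;
    description: string;
    parameters: ToolDefinition['parameters'];
}

// Outgoing messages as plain objects (assumed shapes, not the rt-client types).
type OutgoingMessage =
    | { type: 'session.update'; session: { instructions: string; tools: ToolSpec[] } }
    | { type: 'conversation.item.create'; item: { type: 'function_call_output'; call_id: string; output: string } }
    | { type: 'response.create' };

const toToolSpec = ({ name, description, parameters }: ToolDefinition): ToolSpec =>
    ({ type: 'function', name, description, parameters });

// Example assistants; the ids and prompts are made up for this sketch.
const mobileAssistant: Assistant = {
    id: 'Assistant_MobileAssistant',
    systemPrompt: 'You answer questions about mobile phone services.',
    tools: [],
};

const genericAssistant: Assistant = {
    id: 'Assistant_GenericAssistant',
    systemPrompt: 'You are a general-purpose assistant.',
    tools: [{
        name: 'Assistant_MobileAssistant',
        description: 'Help user to answer mobile related question, such as billing, contract, etc.',
        parameters: { type: 'object', properties: {} },
        returns: async () => 'Assistant_MobileAssistant',
    }],
};

const assistants = new Map<string, Assistant>(
    [genericAssistant, mobileAssistant].map((a): [string, Assistant] => [a.id, a]),
);

// Runs the requested function, then decides which messages to send next.
async function handleFunctionCall(
    current: Assistant,
    call: { name: string; call_id: string; arguments: string },
): Promise<{ active: Assistant; messages: OutgoingMessage[] }> {
    const tool = current.tools.find((t) => t.name === call.name);
    if (!tool) throw new Error(`Unknown function: ${call.name}`);
    const output = await tool.returns(call.arguments);

    const next = assistants.get(output);
    if (next) {
        // Switch: load the new prompt and tools instead of returning the result.
        return {
            active: next,
            messages: [
                { type: 'session.update', session: { instructions: next.systemPrompt, tools: next.tools.map(toToolSpec) } },
                { type: 'response.create' },
            ],
        };
    }
    // Regular function: hand the result back to the model and let it continue.
    return {
        active: current,
        messages: [
            { type: 'conversation.item.create', item: { type: 'function_call_output', call_id: call.call_id, output } },
            { type: 'response.create' },
        ],
    };
}
```

In this sketch the caller would send the returned messages over the realtime connection in order and keep the returned active assistant for resolving the next function call.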

Function Calling

To simplify the demo, we combine the function calling metadata and the function definition in one object. The returns property contains the anonymous function that returns the function calling result.

The example below is the get_weather function, which always reports the weather as 40F and rainy for the given location name.

{
    name: 'get_weather',
        description: 'get the weather of the location',
    parameters: {
        type: 'object',
        properties: {
            location: { type: 'string', description: 'location for the weather' }
        }
    },
    returns: async (arg: string) => `the weather of ${JSON.parse(arg).location} is 40F and rainy`
}
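
As a rough sketch of how such a combined object might be used at runtime, the snippet below looks up a tool by name and calls its returns function with the raw JSON argument string. The ToolDefinition interface, the tools array, and the invokeTool helper are hypothetical names introduced for illustration, and the choice to turn failures into a text result (so the model can react to them) is an assumption rather than the sample's documented behavior.

```typescript
// Hypothetical shape mirroring the objects shown above.
interface ToolDefinition {
    name: string;
    description: string;
    parameters: { type: 'object'; properties: Record<string, unknown> };
    returns: (arg: string) => Promise<string>;
}

const tools: ToolDefinition[] = [
    {
        name: 'get_weather',
        description: 'get the weather of the location',
        parameters: {
            type: 'object',
            properties: {
                location: { type: 'string', description: 'location for the weather' },
            },
        },
        returns: async (arg: string) =>
            `the weather of ${JSON.parse(arg).location} is 40F and rainy`,
    },
];

// Looks up a tool by name and runs it with the raw JSON argument string
// produced by the model. Unknown tools and thrown errors are converted
// into a text result instead of propagating.
async function invokeTool(
    registry: ToolDefinition[],
    name: string,
    rawArguments: string,
): Promise<string> {
    const tool = registry.find((t) => t.name === name);
    if (!tool) return `error: unknown tool ${name}`;
    try {
        return await tool.returns(rawArguments);
    } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        return `error: ${name} failed (${reason})`;
    }
}
```

Keeping the handler next to its metadata means a new tool can be added in one place; the cost is that the returns property has to be stripped before the definition is sent to the model.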
