
Windows Voice Assistant Client

The Windows Voice Assistant Client is a Windows Presentation Foundation (WPF) application written in C# that makes it easy to test interactions with your bot before you build a custom client application. It demonstrates how to use the Azure Speech Services Speech SDK to manage communication with your Bot Framework bot. To use this client, you need to register your bot with the Direct Line Speech channel. The Windows Voice Assistant Client is used in the tutorial Voice-enable your bot using the Speech SDK.

Following the introduction of Custom Commands in Speech SDK 1.8, this tool was updated to accept a custom command application ID. This allows you to test your task completion or command-and-control scenario hosted on the Custom Command service.

Note: This sample replaces the "Direct Line Speech Client" that was hosted in the now-deprecated GitHub repo Azure-Samples/Cognitive-Services-Direct-Line-Speech-Client. Functionality is the same. The name was changed to Voice Assistant Client (or "Windows Voice Assistant Client") to reflect support for both Custom Commands applications and bots registered with the Direct Line Speech channel. It moved to this repo to live alongside other voice assistant client sample code.
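For orientation, here is a minimal sketch of the Speech SDK dialog API the client is built around. This is not code from this repo; the key, region, and language values are placeholders you would replace with your own:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Dialog;

class Program
{
    static async Task Main()
    {
        // Speech service key and region placeholders -- use your own values.
        var config = BotFrameworkConfig.FromSubscription("YourSpeechServiceKey", "westus");
        config.Language = "en-us";

        // Capture audio from the default microphone.
        using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        using var connector = new DialogServiceConnector(config, audioConfig);

        // Print speech recognition results and bot activities as they arrive.
        connector.Recognized += (s, e) => Console.WriteLine($"Recognized: {e.Result.Text}");
        connector.ActivityReceived += (s, e) => Console.WriteLine($"Activity: {e.Activity}");

        await connector.ConnectAsync();

        // One turn: listen once, send the recognized speech to the bot, get its reply.
        await connector.ListenOnceAsync();
    }
}
```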

Features

  • Fully configurable to support any bot registered with the Direct Line Speech channel or Custom Commands application
  • Accepts typed text and speech captured by a microphone as inputs for your bot
  • Supports playback of audio response
  • Supports use of custom wake-words
  • Supports sending custom Bot-Framework Activities as JSON to the bot
  • Displays Adaptive Cards sent from your bot, using the .NET WPF version of the Adaptive Cards SDK
  • Exports the transcript and activity logs to a file

Getting Started

Prerequisites

Let's review the hardware, software, and subscriptions that you'll need to use this client application.

Quickstart

  1. To download and run a pre-built executable:

    • Go to the Releases section of this GitHub repo
    • Look for the latest release named "Windows Voice Assistant Client". Each release has a tag in the form of YYYYMMDD.# indicating the build date
    • Download the latest ZIP package named WindowsVoiceAssistantClient-YYYYMMDD.#.zip, and unpack it to a local drive.
    • Run the executable VoiceAssistantClient.exe
  2. Alternatively, to build the executable from source code:

    • The first step is to clone the repository:
    git clone https://github.com/Azure-Samples/Cognitive-Services-Voice-Assistants.git
    • Then change directories:
    cd Cognitive-Services-Voice-Assistants\clients\csharp-wpf
    • Launch Visual Studio 2017 or newer by opening the solution VoiceAssistantClient.sln. Build the solution (the default build flavor is Debug x64)
    • Run the executable. For example, for Debug x64 build, this will be the executable: VoiceAssistantClient\bin\x64\Debug\VoiceAssistantClient.exe.
  3. When you first run the application, the Settings page will open. On later runs, click the gear icon at the top right to open it. The first three fields are required (all others are optional).

    • Enter Connection Profile. A name of your choice to identify this connection. The tool remembers multiple connection profiles so you can easily switch between them.
    • Enter Speech service key. This is your Azure Speech Services Key.
    • Enter Speech service region. This is the Azure region of your key in the format specified by the "Speech SDK Parameter" column in this table (for example "westus").
    • Leave the field Custom commands app Id empty (unless you plan to use Custom Commands).
    • The default input language is "en-us" (US English). Update the Language field as needed to select a different language code from the "Speech-to-text" list.
    • Press Save and Apply Profile when you're done.
    • Your entries will be saved under this profile name and will be available the next time you launch the app. (Screenshot: Settings page)
  4. Press Reconnect. The application will try to connect to your bot via the Direct Line Speech channel, and your connection profile name will be shown at the top. If the connection succeeds, the message New conversation started -- type or press the microphone button will appear below the text bar. (Screenshot: main page)

  5. You'll be prompted to allow microphone access. If you want to use the microphone, allow access.

  6. Press the microphone icon to begin recording. While speaking, intermediate recognition results will be shown in the application. The microphone icon will turn red while recording is in progress. It will automatically detect end of speech and stop recording.

  7. If everything works, you should see your bot's response on the screen and hear it speak the response. You can click on lines in the Activity Log window to see the full activity payload from the bot in JSON. Note: You'll only hear the bot's voice response if the Speak field in the bot's output activity is set. (Screenshot: main page with activity)

Troubleshooting

If an error message is shown in red in the main application window, use this table to troubleshoot:

| Error | What should you do? |
| --- | --- |
| Error (AuthenticationFailure): WebSocket Upgrade failed with an authentication error (401). Please check for correct speech service key (or authorization token) and service region | In the Settings page of the application, make sure you entered the Speech service key and its region correctly. |
| Error (ConnectionFailure): Connection was closed by the remote host. Error code: 1011. Error details: We could not connect to the bot before sending a message | Make sure you checked the "Enable Streaming Endpoint" box and/or toggled "Web sockets" to On. Also make sure your Azure App Service is running; if it is, try restarting it. |
| Error (ConnectionFailure): Connection was closed by the remote host. Error code: 1002. Error details: The server returned status code '503' when status code '101' was expected | Make sure you checked the "Enable Streaming Endpoint" box and/or toggled "Web sockets" to On. Also make sure your Azure App Service is running; if it is, try restarting it. |
| Error (ConnectionFailure): Connection was closed by the remote host. Error code: 1011. Error details: Response status code does not indicate success: 500 (InternalServerError) | Your bot specified a Neural Voice in its output Activity Speak field, but the Azure service region associated with your Speech service key does not support Neural Voices. See Standard and neural voices. |

See also the Debugging section in Voice-first virtual assistants Preview: Frequently asked questions.

A note on connection timeout

If you are connected to a bot or Custom Commands application and there has been no activity in the last 5 minutes, the service will automatically close the websocket connection with the client and with the bot. This is by design. A message will appear in the bottom bar: "Active connection timed out but ready to reconnect on demand". You do not need to press the "Reconnect" button -- simply press the microphone button and start talking, type a text message, or say the keyword (if one is enabled). The connection will be reestablished automatically.

Sending custom activities to your bot

Windows Voice Assistant Client allows you to author and send a custom JSON activity to your bot. This is done using the "Custom Activity" bar at the bottom of the main window and the "New", "Edit" and "Send" buttons. Enter valid JSON that conforms to the Bot Framework Activity schema. An example is given in the file example.json.
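For illustration, a minimal event activity might look like the following. The name and value here are hypothetical; see example.json in the repo for the actual sample:

```json
{
  "type": "event",
  "name": "MyCustomEvent",
  "value": { "payload": "anything your bot expects" }
}
```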

Add custom keyword activation

See section Add custom keyword activation in the tutorial.

Use custom speech-recognition (SR) endpoint

The Speech Studio portal allows you to create Custom Speech projects in order to build, analyze, and deploy custom speech recognition models. The portal provides an "Endpoint ID" (a GUID). Enter this GUID in the "Endpoint Id" field in the settings page (under "Custom SR settings"), and check the "Enabled" box below it to activate it. The Direct Line Speech channel will then use your custom SR endpoint to transcribe voice.
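If you are wiring this up in your own code rather than through the client's settings page, the Speech SDK exposes the endpoint ID as a connection property. A minimal sketch, building on the BotFrameworkConfig example earlier (key, region, and endpoint ID are placeholders):

```csharp
var config = BotFrameworkConfig.FromSubscription("YourSpeechServiceKey", "westus");

// Route speech recognition to your Custom Speech endpoint (the GUID from Speech Studio).
config.SetProperty(PropertyId.SpeechServiceConnection_EndpointId, "YourCustomSpeechEndpointId");
```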

Use custom text-to-speech (TTS) voice

The Speech Studio portal allows you to create a Custom Voice, where you record and upload training data to create a unique voice font for your applications. The portal provides a deployment ID (a GUID). Enter this GUID in the "Voice deployment Ids" field in the settings page (under "Custom TTS settings"), and check the "Enabled" box below it to activate it. The Direct Line Speech channel will then use your custom TTS voice to create the bot's voice response. Note that the "speak" field in the bot's reply Activity must contain the name of the voice you created in the portal.
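In code, the equivalent setting is the Speech SDK's custom-voice deployment IDs property. A minimal sketch, continuing from the config object above (the deployment ID is a placeholder):

```csharp
// Use the custom voice deployment (the GUID from Speech Studio) for TTS responses.
config.SetProperty(PropertyId.Conversation_Custom_Voice_Deployment_Ids, "YourVoiceDeploymentId");
```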

Use custom commands

If you built your dialog using the Custom Commands service (instead of a Bot Framework bot registered with the Direct Line Speech channel), enter your Custom Commands application ID in the Settings page ("Custom commands app Id"). The client application will then connect to the Custom Commands service that hosts your dialog.
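In SDK terms, the only difference from the bot case is the configuration object. A minimal sketch (app ID, key, and region are placeholders):

```csharp
// CustomCommandsConfig takes the Custom Commands application ID in addition to the
// speech key and region; connector usage is otherwise the same as the bot case.
var config = CustomCommandsConfig.FromSubscription(
    "YourCustomCommandsAppId", "YourSpeechServiceKey", "westus");
using var connector = new DialogServiceConnector(config, AudioConfig.FromDefaultMicrophoneInput());
```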

Use adaptive cards

If your bot sends down Adaptive Cards, the client will display them; when you click on a card, the data from its action is shown in a message box. If you would like to send responses back to the bot instead, you will have to change the code to call the SendActivityAsync() API. This is done by implementing the Action.Submit feature of Adaptive Cards, defined here: Action.Submit

The code to override is in the RenderedCard_OnAction method. Here is an example of how to send a message back:

```csharp
// Build a Bot Framework message activity from the card's Action.Submit payload.
var botFrameworkActivity = Activity.CreateMessageActivity();
botFrameworkActivity.Text = submitAction.Data.ToString();

// Attach the sender ID from the current connection profile, if one is configured.
var fromId = this.settings.RuntimeSettings.Profile.FromId;
if (!string.IsNullOrEmpty(fromId))
{
    botFrameworkActivity.From = new ChannelAccount(fromId);
}

// Serialize the activity to JSON and echo it in the transcript and activity log views.
var jsonConnectorActivity = JsonConvert.SerializeObject(botFrameworkActivity);
this.Messages.Add(new MessageDisplay(botFrameworkActivity.Text, Sender.User));
this.Activities.Add(new ActivityDisplay(jsonConnectorActivity, botFrameworkActivity, Sender.User));

// Send the JSON activity to the bot over the existing connection.
string id = this.connector.SendActivityAsync(jsonConnectorActivity).Result;
Debug.WriteLine($"SendActivityAsync called, id = {id}");
```
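Note that calling .Result blocks the UI thread until the send completes; in your own handler you may prefer to make the method async and await SendActivityAsync instead.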

Resources