Rename speech-related subsystems and add a sample scene for dictation #11348

MaxWang-MS · 2023-01-05T23:29:47Z

Overview

Rename *SpeechRecognitionSubsystem to *DictationSubsystem and rename *PhraseRecognitionSubsystem to *KeywordRecognitionSubsystem based on internal feedback.
Add a sample scene for DictationSubsystem.
Add default profiles for the recognition subsystems.

Zee2 · 2023-01-06T21:54:06Z

The sample scene is missing a SampleSceneHandMenu. Please add one so that folks can navigate between the scenes!

UnityProjects/MRTKDevTemplate/Assets/Scripts/SpeechRecognitionHandler.cs

Zee2 · 2023-01-06T21:58:39Z

The naming is starting to confuse me. If "SpeechRecognition" is actually speech-to-text, or dictation/transcription (instead of event-based recognition like see-it-say-it), can we call it Dictation or something?

Zee2 · 2023-01-06T21:59:22Z

UnityProjects/MRTKDevTemplate/Assets/Scripts/SpeechRecognitionHandler.cs

+    /// events fired by SpeechRecognitionSubsystem.
+    /// </summary>
+    [AddComponentMenu("MRTK/Examples/Speech Recognition Handler")]
+    public class SpeechRecognitionHandler : MonoBehaviour


Can we make it internal just so folks don't end up taking a dependency on it?

MaxWang-MS (Jan 6, 2023): This script is a sample script (i.e., it lives in Assembly-CSharp), so I'm not sure making it internal would do much.

Zee2 · 2023-01-06T22:27:00Z

Mysterious threading error when exiting Play mode

Zee2 · 2023-01-06T22:28:35Z

Huge lagspike when starting recognition

The StartRecognition method takes 230 ms, most of it in DictationRecognizer.Create().

Zee2 · 2023-01-06T22:38:21Z

Clicking the button to stop transcribing doesn't work; it just keeps going! It doesn't seem to have any effect at all. I can take a video if you want.

Zee2

Hopefully we can get the threading errors to go away. The only other thing I consider blocking is the fact that the button doesn't seem to stop transcription.

The perf spike is very unfortunate. Hopefully we can offload to a background thread.

MaxWang-MS · 2023-01-10T00:34:36Z

@Zee2 I looked into the spike and tried to move the instantiation of the Unity DictationRecognizer to another thread, but unfortunately discovered that this is not possible.

MaxWang-MS · 2023-01-10T00:37:42Z

Fortunately, the spike only appears the first time the Unity recognizer gets created. I will be calling the recognizer constructor once in the subsystem constructor / Start() so that hopefully the hit is only taken at app launch if the user enables the subsystem via the profile.

MaxWang-MS · 2023-01-11T05:18:18Z

> Fortunately, the spike only appears the first time the Unity recognizer gets created. I will be calling the recognizer constructor once in the subsystem constructor / Start() so that hopefully the hit is only taken at app launch if the user enables the subsystem via the profile.

Upon further investigation, dummy initialization at startup does not work. I will file a ticket with Unity regarding the time spike.

MaxWang-MS · 2023-01-11T05:19:05Z

> Clicking the button to stop transcribing doesn't work; it just keeps going! It doesn't seem to have any effect at all. I can take a video if you want.

Now fixed!

MaxWang-MS · 2023-01-11T05:19:52Z

> Mysterious threading error when exiting Play mode

Fixed as well.

keveleigh · 2023-01-11T21:06:44Z

com.microsoft.mrtk.input/Utilities/SpeechUtils.cs

        {
-            return XRSubsystemHelpers.GetFirstRunningSubsystem<PhraseRecognitionSubsystem>() as PhraseRecognitionSubsystem;
+            return XRSubsystemHelpers.GetFirstRunningSubsystem<KeywordRecognitionSubsystem>() as KeywordRecognitionSubsystem;


nit: is the as KeywordRecognitionSubsystem part needed? I thought this helper returned the same type as the passed-in T

I also wonder if we want to move this to the new pattern @Zee2 is introducing for the hands aggregator in #11333: https://github.com/microsoft/MixedRealityToolkit-Unity/pull/11333/files#diff-6bb0070ef5fe1b4e52210a345ab2f2c2f693e0b9cff69e18e05456aef8969376R160

Basically, deprecating the old helper from HandsUtils and putting it into XRSubsystemHelpers directly
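
The nit about the redundant cast can be illustrated with a minimal, Unity-free sketch. Everything here (`ISubsystemLike`, `KeywordRecognitionSubsystemStub`, `SubsystemLookup`) is a hypothetical stand-in, not the MRTK `XRSubsystemHelpers` implementation; it only assumes, as the comment above suggests, that the helper is generic and returns the same type as the passed-in T.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a subsystem base type (not an MRTK or Unity type).
public interface ISubsystemLike
{
    bool Running { get; }
}

// Hypothetical stub standing in for KeywordRecognitionSubsystem.
public sealed class KeywordRecognitionSubsystemStub : ISubsystemLike
{
    public bool Running { get; set; }
}

// Hypothetical lookup helper; assumes a generic "return T" signature.
public static class SubsystemLookup
{
    private static readonly List<ISubsystemLike> registered = new List<ISubsystemLike>();

    public static void Register(ISubsystemLike subsystem) => registered.Add(subsystem);

    // Returns the first registered subsystem of type T that is running, or null.
    public static T GetFirstRunning<T>() where T : class, ISubsystemLike
    {
        foreach (ISubsystemLike candidate in registered)
        {
            if (candidate is T typed && typed.Running)
            {
                return typed;
            }
        }
        return null;
    }
}
```

Under that assumption, a call such as `SubsystemLookup.GetFirstRunning<KeywordRecognitionSubsystemStub>()` is already typed as `KeywordRecognitionSubsystemStub`, so a trailing `as KeywordRecognitionSubsystemStub` would compile but add nothing.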

keveleigh · 2023-01-11T21:12:04Z

com.microsoft.mrtk.core/Subsystems/Speech/DictationEventArgs.cs

+    /// <summary>
+    /// Event data associated with the result of dictation.
+    /// </summary>
+    public class DictationResultEventArgs


are these event args reused or one-time-use? if the latter, could they be rewritten as readonly structs to more clearly express the design intent?

MaxWang-MS (Jan 12, 2023): Now rewritten as readonly structs!
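
As a rough illustration of the readonly-struct suggestion, here is a minimal sketch. The member names (`Result`, `Confidence`) and the `FakeDictationSource` class are assumptions made for the example and may not match the actual MRTK definitions; the point is only that a readonly struct cannot be mutated after construction, which makes the one-time-use intent explicit.

```csharp
using System;

// Hypothetical shape; the real DictationResultEventArgs may carry different members.
public readonly struct DictationResultEventArgs
{
    // Recognized text (assumed member name).
    public string Result { get; }

    // Optional confidence value (assumed member name).
    public float? Confidence { get; }

    public DictationResultEventArgs(string result, float? confidence)
    {
        Result = result;
        Confidence = confidence;
    }
}

// Hypothetical event source used only to show the struct being raised once per result.
public sealed class FakeDictationSource
{
    public event Action<DictationResultEventArgs> Recognized;

    public void Emit(string text, float? confidence)
    {
        // A fresh value is created for every event; handlers cannot modify it.
        Recognized?.Invoke(new DictationResultEventArgs(text, confidence));
    }
}
```

Because the struct is readonly, a handler that tried to assign `args.Result = "..."` would fail to compile, so each subscriber sees the same immutable data.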

Issues fixed

UnityProjects/MRTKDevTemplate/Assets/Scripts/DictationHandler.cs

…microsoft#11348)
* Add a sample scene for SpeechRecognitionSubsystem
* Feedback
* Update WindowsSpeechRecognitionSubsystem.cs
* Interface and subsystem rename part 1
* Interface and subsystem rename part 2
* Interface and subsystem rename part 3
* Rename configs
* More renaming
* Rename sample scene
* Update DictationExample.unity
* Feedback
* Add private setters to expose properties in the inspector
* Add sample scene hand menu
* Better display of error message

Co-authored-by: Finn Sinclair <finnnorth@gmail.com>

Add a sample scene for SpeechRecognitionSubsystem

80f3146

MaxWang-MS requested review from david-c-kline, keveleigh, maluoi, RogPodge and Zee2 as code owners January 5, 2023 23:29

github-actions bot added the MRTK3 label Jan 5, 2023

Merge branch 'mrtk3' into add-speech-scene

071979c

keveleigh reviewed Jan 6, 2023

UnityProjects/MRTKDevTemplate/Assets/Scripts/SpeechRecognitionHandler.cs (outdated)

Zee2 reviewed Jan 6, 2023

Zee2 previously requested changes Jan 6, 2023

MaxWang-MS added 8 commits January 10, 2023 15:43

Feedback

d6636ce

Update WindowsSpeechRecognitionSubsystem.cs

69b3fe7

Interface and subsystem rename part 1

35a59c5

Interface and subsystem rename part 2

3cc6633

Interface and subsystem rename part 3

5b6c3f9

Rename configs

e5e4d6f

More renaming

e5e6e01

Merge branch 'mrtk3' into add-speech-scene

9d1a265

MaxWang-MS changed the title ~~Add a sample scene for SpeechRecognitionSubsystem~~ Rename speech-related subsystems and add a sample scene for dictation Jan 11, 2023

MaxWang-MS requested review from Zee2 and keveleigh January 11, 2023 05:16

MaxWang-MS mentioned this pull request Jan 11, 2023

Error retrieving configuration when starting WindowsPhraseRecognitionSubsystem (MRTK3) #11366

Closed

keveleigh reviewed Jan 11, 2023

Merge branch 'mrtk3' into add-speech-scene

6604012

MaxWang-MS added 4 commits January 11, 2023 13:17

Rename sample scene

fb9e905

Update DictationExample.unity

a3b9cb7

Merge branch 'add-speech-scene' of https://github.com/MaxWang-MS/Mixe…

69fe5b7

…dRealityToolkit-Unity into add-speech-scene

Feedback

ea93d9a

MaxWang-MS requested a review from keveleigh January 12, 2023 01:43

MaxWang-MS added 3 commits January 11, 2023 20:21

Add private setters to expose properties in the inspector

c3769fc

Add sample scene hand menu

95e4b40

Better display of error message

5443832

MaxWang-MS enabled auto-merge (squash) January 12, 2023 23:52

Merge branch 'mrtk3' into add-speech-scene

bb3a894

keveleigh reviewed Jan 13, 2023

UnityProjects/MRTKDevTemplate/Assets/Scripts/DictationHandler.cs (resolved)

keveleigh approved these changes Jan 13, 2023

MaxWang-MS merged commit 0421c57 into microsoft:mrtk3 Jan 13, 2023

MaxWang-MS deleted the add-speech-scene branch January 13, 2023 01:24
