-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added SpeechInputSource for integration of the KeywordRecognizer in the InputManager #354
Added SpeechInputSource for integration of the KeywordRecognizer in the InputManager #354
Conversation
…object and also a global listener.
…object and also a global listener.
…ada/HoloToolkit-Unity into KeywordManagerAsInputSource # Conflicts: # Assets/HoloToolkit/Input/Tests/Scenes/KeywordManager.unity
…HoloToolkit-Unity into KeywordManagerAsInputSource
Hi @aalmada, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution! The agreement was validated by Microsoft and real humans are currently evaluating your PR. TTYL, MSBOT; |
@maxouellet This is the way I found to extend InputManager although I expressed before some reservations on having to change IInputSource and InputManager every time we integrate a new form of input. Is this what you had in mind? |
SemanticMeanings = semanticMeanings; | ||
Text = text; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PhraseRecognizedEventArgs class should be in its own file.
@@ -19,17 +21,15 @@ namespace HoloToolkit.Unity.InputModule | |||
/// Edit -> Project Settings -> Player -> Settings for Windows Store -> Publishing Settings -> Capabilities | |||
/// or in your Visual Studio Package.appxmanifest capabilities. | |||
/// </summary> | |||
public partial class KeywordManager : MonoBehaviour | |||
public partial class KeywordManager : BaseInputSource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we create a new class (SpeechInputSource) instead of modifying the existing Keyword manager? Then, if it works well, we could simply eventually delete the old KeywordManager.
// This helps easily link the keyword recognized to the UnityEvent to be invoked. | ||
responses = KeywordsAndResponses.ToDictionary(keywordAndResponse => keywordAndResponse.Keyword, | ||
keywordAndResponse => keywordAndResponse.Response); | ||
return SupportedInputEvents.SourceUpAndDown; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably need to add a new type of SupportedInputEvents (SpeechKeyword?) and change this accordingly
@@ -130,5 +132,31 @@ public void StopKeywordRecognizer() | |||
keywordRecognizer.Stop(); | |||
} | |||
} | |||
|
|||
private void OnPhraseRecognized(UnityEngine.Windows.Speech.ConfidenceLevel confidence, | |||
TimeSpan phraseDuration, DateTime phraseStartTime, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: I suggest putting all input params on the same line, or all on separate line.
Also, why I'd suggest calling this method from KeywordRecognizer_OnPhraseRecognized, so that we only have one entry point to RaisePhraseRecognizedEvent
@@ -119,6 +119,11 @@ public interface IInputSource | |||
event EventHandler<NavigationEventArgs> NavigationCanceled; | |||
|
|||
/// <summary> | |||
/// Event triggered when a speech phrase is recognized. | |||
/// </summary> | |||
event EventHandler<PhraseRecognizedEventArgs> PhraseRecognized; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a naming perspective, I wonder if this should be something like SpeechKeywordRecognizedEventArgs, or something else. I don't have a problem with Phrase, just bringing up the discussion in case somebody has a strong opinion on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the UnityEngine.Windows.Speech naming but I also don't like it much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm OK with leaving like that for now unless someone comes up with a better name. Makes sense to align with existing name for things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maxouellet I just noticed that changing to SpeechKeywordRecognizedEventArgs avoids having to use explicit namespaces because of name clashing. I would then change the IInputSource event to SpeechKeywordRecognized. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works for me, I agree that having to use explicit namespaces is annoying.
@aalmada Yeah, this is similar to what I had in mind. This looks pretty good to me, other than the small comments I added. The main thing from my perspective is that I think we should make a clean break from KeywordManager, and create a new SpeechInputSource. That will prevent breaking people that might be using KeywordManager (and we can phase it out eventually, as we "fix" the various test scenes). I understand the concern around modifying the interface + InputManager every time you add something new, but then again, I don't think that adding something new is something that will happen very frequently. Once we've nailed most types of inputs, we might add information in the event data classes, but that wouldn't require a change to IInputSource or InputManager. I also don't have a better, cleaner and easy-to-understand solution to the problem. |
This should be resubmitted on master, now that we've merged the branches. |
/// </summary> | ||
public string Text { get; private set; } | ||
|
||
public PhraseRecognizedEventArgs(IInputSource inputSource, uint sourceId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: formatting
/// <summary> | ||
/// The text that was recognized. | ||
/// </summary> | ||
public string Text { get; private set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: more descriptive name than just Text. Could be ambiguous if used improperly. suggested name: RecognizedText
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replicated the names from PhraseRecognizedEventArgs but I can change it.
@@ -6,6 +6,8 @@ | |||
using UnityEngine; | |||
using UnityEngine.Events; | |||
using UnityEngine.Windows.Speech; | |||
using UnityEngine.VR.WSA.Input; | |||
using System; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: System should be first.
@@ -20,7 +20,7 @@ void Start() | |||
} | |||
|
|||
textPanel.text = "Try saying:\n"; | |||
foreach (KeywordManager.KeywordAndResponse k in keywordManager.KeywordsAndResponses) | |||
foreach (KeywordManager.KeywordAndKeyCode k in keywordManager.KeywordsAndKeys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I know this is like this in the old version but I think we should rename k
to be key
|
||
public void OnPhraseRecognized(PhraseRecognizedEventData eventData) | ||
{ | ||
switch (eventData.Text.ToLower()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use switch instead of if statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
readability. under the covers, switch on a string generates the equivalent IL that if/else-if blocks would up to a "magic number" of cases at which time it uses a hashtable for better performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great explanation Tim!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the existing SphereKeyword.cs as a reference and it already uses a switch.
{ | ||
[Tooltip("The keyword to recognize.")] | ||
public string Keyword; | ||
[Tooltip("The KeyCode to recognize.")] | ||
public KeyCode KeyCode; | ||
[Tooltip("The UnityEvent to be invoked when the keyword is recognized.")] | ||
public UnityEvent Response; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This UnityEvent is basically the original point of the KeywordManager. It allows a developer to assign an action in the Inspector, instead of having to worry about the KeywordRecognizer's implementation in code.
There's another comment in here about leaving KeywordManager as is and creating a new class, which I support as well. If you're able to migrate this without losing functionality, then I'd support using KeywordManager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The propose of this PR is to use the InputManager handling mechanisms so I thought the UnityEvent wouldn't make sense anymore.
Anyway, I'm going to follow the suggestion of creating a new SpeechInputSource and leave the KeywordManager like it was.
…ada/HoloToolkit-Unity into KeywordManagerAsInputSource # Conflicts: # Assets/HoloToolkit/Input/Tests/Scenes/SpeechInputSource.unity
I'll work on this tomorrow. |
No worries, great work so far! |
@HodgsonSDAS I did what you suggested to the shader and it works great for this case. Note that when using [PerRendererData] the Color property isn’t visible anymore in the material editor – per renderer properties can only be changed by script. |
@aalmada, Sorry, I didn't realize doing that would cause the color property to disappear from the inspector window, but it is the best way to change colors per renderer without the memory leaks discussed earlier. This may also require a custom material inspector editor script or a Custom property drawer as well to get the color property to show in the inspector again. It's also important to note that even though all objects share the same material, MaterialPropertyBlocks break dynamic batching. If you feel like this color stuff is too much or doesn't exactly relate to this PR I'm down for just skipping it for now. |
@@ -154,7 +154,7 @@ Material: | |||
m_Colors: | |||
- first: | |||
name: _Color | |||
second: {r: 0.8897059, g: 0.8897059, b: 0.8897059, a: 1} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably revert this back.
switch (eventData.RecognizedText.ToLower()) | ||
{ | ||
case "reset all": | ||
foreach (Renderer renderer in GetComponentsInChildren<Renderer>()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably best to cache all the renderers here, and be sure to check/update the cache if children are added/removed. (forgive the brevity)
private MaterialPropertyBlock propertyBlock;
private Renderer[] childRenderCache;
private void Start()
{
propertyBlock = new MaterialPropertyBlock();
childrenRenderCache = GetComponentsInChildren<Renderer>()
}
public void OnSpeechKeywordRecognized(SpeechKeywordRecognixedEventData eventData)
{
switch(keywords)
{
case "reset all":
for(int i=0; i < childrenRenderCache.Length; i++)
{
ChangeColor(childrenRenderCache[i], Color.white);
{
break;
}
}
private void ChangeColor(Renderer, Color color)
{
// Do Color stuff
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this performance tweak be left for after this massive PR is reviewed and merged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely
/// Edit -> Project Settings -> Player -> Settings for Windows Store -> Publishing Settings -> Capabilities | ||
/// or in your Visual Studio Package.appxmanifest capabilities. | ||
/// </summary> | ||
public partial class SpeechInputSource : BaseInputSource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this be replacing KeywordManager.cs? (If so should we be removing it/Depreciating it to give others time to migrate over?)
Or does this only exist on the object that will be executing the logic when the keyword is spoken?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@HodgsonSDAS In the new InputManager architecture, event sources register with it. It then looks for a message handler in the object with the focus. If not there, it goes up the object hierarchy. It also supports global handlers and fallback handlers.
The SpeechInputSource is only the source that could replace the KeywordManager. I currently only support handlers for when the keyword is spoken as scripts. I'm prototyping support for inspector.
If this PR results in a better solution than the KeywordsManager, it would be nice to turn it obsolete as having both can be confusing to users.
It's also my fault but I've been trying to clear all reviews and this PR is loosing some focus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, yeah I agree we should skip the other review content and get back to it later.
I don't mind trying to take a look at updating the material color changes in a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't consider Inspector compatibility to be blocking at this point, since that's one of the reasons we requested these changes be added to something new instead of the KeywordManager. I also don't think these material changes (adding custom Inspector components and new scripts just for a test scene, etc) are necessary in this case, as it's a one off test script and, if we have issues because of one extra material instance, well...haha. or, if there's only one sphere in the scene, just use sharedMaterial. |
While I agree with skipping the material issue, the inspector compatibility might be more of a sticking point for some. If the |
Definitely agreed on that (it's a sticking point for me as far as replacing KeywordManager completely #354 (comment) 😃)! If we want to add that functionality in now, I'm all for it. I'm just not against adding this side-by-side without it for now, with it potentially being added later and completing the KeywordManager replacement. |
Ahh, my apologies, I must have missed those comments somehow. Well I'm down for keeping the KeywordManager for now and marking it obsolete, then removing it altogether once we implement the inspector window assignments in the SpeechInputSource. |
All good! I felt like we were more or less on the same page; just wanted to be sure :) |
@HodgsonSDAS @keveleigh Please note that all the InputManager handlers are still scripts. I started working on inspector support for the handler but I'd rather do it in a future PR. It raises more questions than the source itself. |
@aalmada Sorry I haven't been very responsive lately, we just had a baby and that's taken most of my time. I haven't had a chance to look at this coe much, but overall, my opinion is that the input source itself should only be registering the keywords it needs and routing those to the appropriate destination through the InputManager. If we want to emulate the behavior of the old KeywordManager, we should create a global listener script (KeywordHandler.cs or something like that?) that implements ISpeechHandler and essentially adds itself as a modal listener to the InputManager. That script could then have the same kind of editor logic that KeywordManager made available (associating keywords to actions). The one thing I never quite figured out is how to make all of that logic type-safe... I'm annoyed at having to check for strings in order to know which word was said by the user. Ideally, we'd have some kind of enum or equivalent that defines that instead, but not sure how to implement that cleanly without requiring every user to modify an enum when they want to add new keywords. It would sure be nice to be able to dynamically add/remove new keywords dynamically, or at least turn them on/off dynamically at app runtime. |
First congratulations @maxouellet! @aalmada, I want to apologize for making this more difficult than anticipated. I didn't mean to complicate the PR. I think we can revert all changes to the shader and material assignments. I've went ahead and pulled your code so I have a copy as is and can start working on the suggested changes after new year's. I think it's best to only submit the relevant changes right now. You've done fantastic work. Thanks. |
@maxouellet Congrats! That's a real lifetime project. ;) |
{ | ||
private void Start() | ||
{ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably remove if empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing it makes the script enable checkbox disappear but we aren't handling it anyway and this is just test code...
@@ -8,7 +8,7 @@ Shader "HoloToolkit/BlinnPhong Configurable" | |||
{ | |||
[Header(Main Color)] | |||
[Toggle] _UseColor("Enabled?", Float) = 0 | |||
_Color("Main Color", Color) = (1,1,1,1) | |||
_Color("Main Color", Color) = (1,1,1,1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tab offset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Damn tabs ;)
Besides the two little things I think this PR is ready. |
Agreed! Thanks so much for all the work that went into this, @aalmada! |
KeywordManager implementation as an input source. This makes it coherent with the new InputManager infrastructure.
The KeywordManager test scene was changed to show 3 spheres. Only the sphere with focus changes color. A global "reset all" voice command changes all spheres to grey even with no sphere in focus.