Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added SpeechInputSource for integration of the KeywordRecognizer in the InputManager #354

Merged
merged 42 commits into from
Dec 20, 2016
Merged

Added SpeechInputSource for integration of the KeywordRecognizer in the InputManager #354

merged 42 commits into from
Dec 20, 2016

Conversation

aalmada
Copy link

@aalmada aalmada commented Nov 30, 2016

KeywordManager implementation as an input source. This makes it coherent with the new InputManager infrastructure.

The KeywordManager test scene was changed to show 3 spheres. Only the sphere with focus changes color. A global "reset all" voice command changes all spheres to grey even with no sphere in focus.

@msftclas
Copy link

Hi @aalmada, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution!
You've already signed the contribution license agreement. Thanks!

The agreement was validated by Microsoft and real humans are currently evaluating your PR.

TTYL, MSBOT;

@aalmada
Copy link
Author

aalmada commented Dec 1, 2016

@maxouellet This is the way I found to extend InputManager although I expressed before some reservations on having to change IInputSource and InputManager every time we integrate a new form of input. Is this what you had in mind?

SemanticMeanings = semanticMeanings;
Text = text;
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PhraseRecognizedEventArgs class should be in its own file.

@@ -19,17 +21,15 @@ namespace HoloToolkit.Unity.InputModule
/// Edit -> Project Settings -> Player -> Settings for Windows Store -> Publishing Settings -> Capabilities
/// or in your Visual Studio Package.appxmanifest capabilities.
/// </summary>
public partial class KeywordManager : MonoBehaviour
public partial class KeywordManager : BaseInputSource

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create a new class (SpeechInputSource) instead of modifying the existing Keyword manager? Then, if it works well, we could simply eventually delete the old KeywordManager.

// This helps easily link the keyword recognized to the UnityEvent to be invoked.
responses = KeywordsAndResponses.ToDictionary(keywordAndResponse => keywordAndResponse.Keyword,
keywordAndResponse => keywordAndResponse.Response);
return SupportedInputEvents.SourceUpAndDown;
Copy link

@maxouellet maxouellet Dec 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably need to add a new type of SupportedInputEvents (SpeechKeyword?) and change this accordingly

@@ -130,5 +132,31 @@ public void StopKeywordRecognizer()
keywordRecognizer.Stop();
}
}

private void OnPhraseRecognized(UnityEngine.Windows.Speech.ConfidenceLevel confidence,
TimeSpan phraseDuration, DateTime phraseStartTime,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: I suggest putting all input params on the same line, or all on separate line.

Also, why I'd suggest calling this method from KeywordRecognizer_OnPhraseRecognized, so that we only have one entry point to RaisePhraseRecognizedEvent

@@ -119,6 +119,11 @@ public interface IInputSource
event EventHandler<NavigationEventArgs> NavigationCanceled;

/// <summary>
/// Event triggered when a speech phrase is recognized.
/// </summary>
event EventHandler<PhraseRecognizedEventArgs> PhraseRecognized;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a naming perspective, I wonder if this should be something like SpeechKeywordRecognizedEventArgs, or something else. I don't have a problem with Phrase, just bringing up the discussion in case somebody has a strong opinion on it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the UnityEngine.Windows.Speech naming but I also don't like it much.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm OK with leaving like that for now unless someone comes up with a better name. Makes sense to align with existing name for things.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxouellet I just noticed that changing to SpeechKeywordRecognizedEventArgs avoids having to use explicit namespaces because of name clashing. I would then change the IInputSource event to SpeechKeywordRecognized. What do you think?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works for me, I agree that having to use explicit namespaces is annoying.

@maxouellet
Copy link

@aalmada Yeah, this is similar to what I had in mind. This looks pretty good to me, other than the small comments I added. The main thing from my perspective is that I think we should make a clean break from KeywordManager, and create a new SpeechInputSource. That will prevent breaking people that might be using KeywordManager (and we can phase it out eventually, as we "fix" the various test scenes).

I understand the concern around modifying the interface + InputManager every time you add something new, but then again, I don't think that adding something new is something that will happen very frequently. Once we've nailed most types of inputs, we might add information in the event data classes, but that wouldn't require a change to IInputSource or InputManager. I also don't have a better, cleaner and easy-to-understand solution to the problem.

@StephenHodgson
Copy link
Contributor

StephenHodgson commented Dec 1, 2016

This should be resubmitted on master, now that we've merged the branches.

/// </summary>
public string Text { get; private set; }

public PhraseRecognizedEventArgs(IInputSource inputSource, uint sourceId,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: formatting

/// <summary>
/// The text that was recognized.
/// </summary>
public string Text { get; private set; }
Copy link
Contributor

@StephenHodgson StephenHodgson Dec 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: more descriptive name than just Text. Could be ambiguous if used improperly. suggested name: RecognizedText

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replicated the names from PhraseRecognizedEventArgs but I can change it.

@@ -6,6 +6,8 @@
using UnityEngine;
using UnityEngine.Events;
using UnityEngine.Windows.Speech;
using UnityEngine.VR.WSA.Input;
using System;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: System should be first.

@@ -20,7 +20,7 @@ void Start()
}

textPanel.text = "Try saying:\n";
foreach (KeywordManager.KeywordAndResponse k in keywordManager.KeywordsAndResponses)
foreach (KeywordManager.KeywordAndKeyCode k in keywordManager.KeywordsAndKeys)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I know this is like this in the old version but I think we should rename k to be key


public void OnPhraseRecognized(PhraseRecognizedEventData eventData)
{
switch (eventData.Text.ToLower())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use switch instead of if statement?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readability. under the covers, switch on a string generates the equivalent IL that if/else-if blocks would up to a "magic number" of cases at which time it uses a hashtable for better performance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great explanation Tim!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the existing SphereKeyword.cs as a reference and it already uses a switch.

{
[Tooltip("The keyword to recognize.")]
public string Keyword;
[Tooltip("The KeyCode to recognize.")]
public KeyCode KeyCode;
[Tooltip("The UnityEvent to be invoked when the keyword is recognized.")]
public UnityEvent Response;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This UnityEvent is basically the original point of the KeywordManager. It allows a developer to assign an action in the Inspector, instead of having to worry about the KeywordRecognizer's implementation in code.

There's another comment in here about leaving KeywordManager as is and creating a new class, which I support as well. If you're able to migrate this without losing functionality, then I'd support using KeywordManager.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The propose of this PR is to use the InputManager handling mechanisms so I thought the UnityEvent wouldn't make sense anymore.
Anyway, I'm going to follow the suggestion of creating a new SpeechInputSource and leave the KeywordManager like it was.

@aalmada aalmada changed the base branch from ExtensibleInputModule to master December 2, 2016 13:06
@aalmada
Copy link
Author

aalmada commented Dec 15, 2016

I'll work on this tomorrow.

@StephenHodgson
Copy link
Contributor

No worries, great work so far!

@aalmada
Copy link
Author

aalmada commented Dec 16, 2016

@HodgsonSDAS I did what you suggested to the shader and it works great for this case. Note that when using [PerRendererData] the Color property isn’t visible anymore in the material editor – per renderer properties can only be changed by script.
I also looked into the other scripts reported in #236 but there's an issue. SelectedObjectMessageReceiver.cs keeps the original color on Start() but MaterialPropertyBlock doesn't have a GetColor().

@StephenHodgson
Copy link
Contributor

StephenHodgson commented Dec 16, 2016

@aalmada, Sorry, I didn't realize doing that would cause the color property to disappear from the inspector window, but it is the best way to change colors per renderer without the memory leaks discussed earlier. This may also require a custom material inspector editor script or a Custom property drawer as well to get the color property to show in the inspector again.

It's also important to note that even though all objects share the same material, MaterialPropertyBlocks break dynamic batching.

If you feel like this color stuff is too much or doesn't exactly relate to this PR I'm down for just skipping it for now.

@@ -154,7 +154,7 @@ Material:
m_Colors:
- first:
name: _Color
second: {r: 0.8897059, g: 0.8897059, b: 0.8897059, a: 1}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably revert this back.

switch (eventData.RecognizedText.ToLower())
{
case "reset all":
foreach (Renderer renderer in GetComponentsInChildren<Renderer>())
Copy link
Contributor

@StephenHodgson StephenHodgson Dec 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably best to cache all the renderers here, and be sure to check/update the cache if children are added/removed. (forgive the brevity)

private MaterialPropertyBlock propertyBlock;
private Renderer[] childRenderCache;

private void Start()
{
    propertyBlock = new MaterialPropertyBlock();
    childrenRenderCache = GetComponentsInChildren<Renderer>()
}

public void OnSpeechKeywordRecognized(SpeechKeywordRecognixedEventData eventData)
{
    switch(keywords)
    {
        case "reset all":
            for(int i=0; i < childrenRenderCache.Length; i++)
            {
                ChangeColor(childrenRenderCache[i], Color.white);
            {
            break;
    }
}

private void ChangeColor(Renderer, Color color)
{
    // Do Color stuff
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this performance tweak be left for after this massive PR is reviewed and merged?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely

/// Edit -> Project Settings -> Player -> Settings for Windows Store -> Publishing Settings -> Capabilities
/// or in your Visual Studio Package.appxmanifest capabilities.
/// </summary>
public partial class SpeechInputSource : BaseInputSource
Copy link
Contributor

@StephenHodgson StephenHodgson Dec 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be replacing KeywordManager.cs? (If so should we be removing it/Depreciating it to give others time to migrate over?)

Or does this only exist on the object that will be executing the logic when the keyword is spoken?

Copy link
Author

@aalmada aalmada Dec 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HodgsonSDAS In the new InputManager architecture, event sources register with it. It then looks for a message handler in the object with the focus. If not there, it goes up the object hierarchy. It also supports global handlers and fallback handlers.
The SpeechInputSource is only the source that could replace the KeywordManager. I currently only support handlers for when the keyword is spoken as scripts. I'm prototyping support for inspector.
If this PR results in a better solution than the KeywordsManager, it would be nice to turn it obsolete as having both can be confusing to users.
It's also my fault but I've been trying to clear all reviews and this PR is loosing some focus.

Copy link
Contributor

@StephenHodgson StephenHodgson Dec 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, yeah I agree we should skip the other review content and get back to it later.
I don't mind trying to take a look at updating the material color changes in a separate PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only thing blocking this PR is the ability to assign the GameObject and script method to trigger with each of the keywords without hard coding it in a class.
image

I think the only way to really do this is through a UnityEvent.

@keveleigh
Copy link
Contributor

I don't consider Inspector compatibility to be blocking at this point, since that's one of the reasons we requested these changes be added to something new instead of the KeywordManager. I also don't think these material changes (adding custom Inspector components and new scripts just for a test scene, etc) are necessary in this case, as it's a one off test script and, if we have issues because of one extra material instance, well...haha. or, if there's only one sphere in the scene, just use sharedMaterial.

@StephenHodgson
Copy link
Contributor

StephenHodgson commented Dec 16, 2016

While I agree with skipping the material issue, the inspector compatibility might be more of a sticking point for some. If the SpeechInputSource class is intended to replace the KeywordManger, then we should keep similar functionality, like the assignment of GameObjects and functions they call when the keyword is triggered. This will also help with decoupling code from the toolkit.

@keveleigh
Copy link
Contributor

Definitely agreed on that (it's a sticking point for me as far as replacing KeywordManager completely #354 (comment) 😃)! If we want to add that functionality in now, I'm all for it. I'm just not against adding this side-by-side without it for now, with it potentially being added later and completing the KeywordManager replacement.

@StephenHodgson
Copy link
Contributor

Ahh, my apologies, I must have missed those comments somehow.

Well I'm down for keeping the KeywordManager for now and marking it obsolete, then removing it altogether once we implement the inspector window assignments in the SpeechInputSource.

@keveleigh
Copy link
Contributor

All good! I felt like we were more or less on the same page; just wanted to be sure :)

@aalmada
Copy link
Author

aalmada commented Dec 17, 2016

@HodgsonSDAS @keveleigh Please note that all the InputManager handlers are still scripts. I started working on inspector support for the handler but I'd rather do it in a future PR. It raises more questions than the source itself.
@maxouellet I added a SetGlobalListener.cs. It's a one liner script that registers the GameObject with the InputManager as a global listener. This adds editor support for this operation. It makes sense to be at this level as the GameObject could have multiple handler attached that would try to register it multiple times. It could support registration and unregistration at runtime but maybe in a future PR?
@HodgsonSDAS Should I revert all shader work or should I keep it and you work from there once PR merged?

@maxouellet
Copy link

@aalmada Sorry I haven't been very responsive lately, we just had a baby and that's taken most of my time. I haven't had a chance to look at this coe much, but overall, my opinion is that the input source itself should only be registering the keywords it needs and routing those to the appropriate destination through the InputManager.

If we want to emulate the behavior of the old KeywordManager, we should create a global listener script (KeywordHandler.cs or something like that?) that implements ISpeechHandler and essentially adds itself as a modal listener to the InputManager. That script could then have the same kind of editor logic that KeywordManager made available (associating keywords to actions).

The one thing I never quite figured out is how to make all of that logic type-safe... I'm annoyed at having to check for strings in order to know which word was said by the user. Ideally, we'd have some kind of enum or equivalent that defines that instead, but not sure how to implement that cleanly without requiring every user to modify an enum when they want to add new keywords. It would sure be nice to be able to dynamically add/remove new keywords dynamically, or at least turn them on/off dynamically at app runtime.

@StephenHodgson
Copy link
Contributor

StephenHodgson commented Dec 19, 2016

First congratulations @maxouellet!

@aalmada, I want to apologize for making this more difficult than anticipated. I didn't mean to complicate the PR. I think we can revert all changes to the shader and material assignments. I've went ahead and pulled your code so I have a copy as is and can start working on the suggested changes after new year's. I think it's best to only submit the relevant changes right now. You've done fantastic work. Thanks.

@aalmada
Copy link
Author

aalmada commented Dec 19, 2016

@maxouellet Congrats! That's a real lifetime project. ;)
I came up with an idea after reading your comment. We can have a custom editor for the handler that lists on a dropdown, all the registered keywords from all SpeechInputSources (as this is not a singleton). It would look just like as an enum. I still have an issue on how to map the keywords to actions but I'll leave it all for another PR, once this one is approved and merged.
@HodgsonSDAS No problem. I'll revert the shader related code.

{
private void Start()
{
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably remove if empty.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing it makes the script enable checkbox disappear but we aren't handling it anyway and this is just test code...

@@ -8,7 +8,7 @@ Shader "HoloToolkit/BlinnPhong Configurable"
{
[Header(Main Color)]
[Toggle] _UseColor("Enabled?", Float) = 0
_Color("Main Color", Color) = (1,1,1,1)
_Color("Main Color", Color) = (1,1,1,1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tab offset.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn tabs ;)

@StephenHodgson
Copy link
Contributor

Besides the two little things I think this PR is ready.

@keveleigh
Copy link
Contributor

Agreed! Thanks so much for all the work that went into this, @aalmada!

@keveleigh keveleigh merged commit b8e74e1 into microsoft:master Dec 20, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants