HoloToolkit Input Module: design discussion #277

Closed
maxouellet opened this issue Oct 12, 2016 · 81 comments

Comments

@maxouellet

This is the initial suggested design for the new input module. Feel free to ask questions / request changes.

I already have a version of this implemented, to which I will make improvements as needed. Goal is to add it to HoloToolkit so that new features can be added by the community as needed. It has been designed to be as extensible as possible.

Why design a new Input Module?

A lot of applications need an efficient way to handle input. In an ideal world, game objects can handle the input events they are interested in without having to write the same code in multiple components. The current solution in HoloToolkit relies on Unity's messaging system, which is not very performant and doesn't make it obvious that behaviours are handling input events. It also doesn't extend well to support multiple different input mechanisms.

Unity provides an event system that it currently leverages to send input to game objects. It contains a HoloLensInputModule that interprets HoloLens input and sends out the standard Unity events. The solution I propose does not use this input module, for the following reasons:

  • It does not easily support gaze stabilization: instead, it always assumes that you are targeting the center point of your camera, using a raycaster attached to your camera.
  • It only supports sending out the standard Unity events, which are 2D events. This lacks some of the key information that is needed to manipulate 3D objects with input devices that have more than 2 degrees of freedom.

The proposed design leverages Unity's event system, but implements its own version of an input module (called InputManager in this case). This currently gives us more flexibility in exploring various interaction mechanisms, which is very helpful when working with novel input devices such as those the HoloLens provides.

Class diagram

Overall view of the main classes and interfaces that make up the initial version of this input module. Minor changes could still be done before submission (for example, HandsInput will likely be merged with GesturesInput, as their functionalities are very similar).

[Image: inputmodule class diagram]

Input Module Design

The input module is designed to be extensible: it could support various input mechanisms and various types of gazers.

Each input source (hands, gestures, others) implements the IInputSource interface. The interface defines various events that the input sources can trigger. The input sources register themselves with the InputManager, whose role is to forward input to the appropriate game objects. Input sources can be dynamically enabled / disabled as necessary, and new input sources can be created to support different input devices.

Game objects that want to consume input events can implement one or more input interfaces (a sketch follows this list), such as:

  • IFocusable for focus enter and exit. The focus can be triggered by the user's gaze or any other gaze source.
  • IHoldHandler for the Windows hold gesture.
  • IInputHandler for source up, down and clicked. The source can be a hand that tapped, a clicker that was pressed, etc.
  • IManipulationHandler for the Windows manipulation gesture.
  • ISourceStateHandler to handle when an input source is detected or lost. This can be triggered when a hand comes into view of the HoloLens, for example.
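
As a rough illustration, a component that reacts to focus and press events might look like the sketch below. The namespace and exact method signatures are assumptions based on the descriptions above, not the final API:

    using HoloToolkit.Unity.InputModule; // assumed namespace
    using UnityEngine;

    // Hypothetical sketch: a behaviour that consumes focus and input events.
    // Method names are inferred from the interface descriptions above.
    public class FocusResponder : MonoBehaviour, IFocusable, IInputHandler
    {
        public void OnFocusEnter()
        {
            // The gaze (or another focus source) entered this object.
        }

        public void OnFocusExit()
        {
            // Focus left this object.
        }

        public void OnInputDown(InputEventData eventData)
        {
            // A source (hand press, clicker, ...) went down while focused.
        }

        public void OnInputUp(InputEventData eventData)
        {
            // The source was released.
        }
    }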

The input manager listens to the various events coming from the input sources, and also takes into account the gaze. Currently, that gaze is always coming from the GazeManager class, but this could be extended to support multiple gaze sources if the need arises.

By default, input events are sent to the currently focused game object, if that object implements the appropriate interface. Modal input handlers can also be added to the input manager: these modal handlers take priority over the currently focused object. Fallback handlers can also be defined, so that the application can react to global inputs that aren't targeting a specific element. Any event sent by the input manager bubbles up from the object to its ancestors.

To recap, the input manager forwards events from the various input sources to the appropriate game object, using the following order (a usage sketch follows the list):

  1. The registered modal input handlers, in LIFO order of registration
  2. The currently focused object
  3. The fallback input handlers, in LIFO order of registration
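
For example, a dialog that needs to capture all input while it is open could register itself as a modal handler. A minimal sketch, assuming the registration methods are named PushModalInputHandler / PopModalInputHandler:

    using UnityEngine;

    // Hypothetical usage sketch: capture all input while this dialog is active.
    public class ModalDialog : MonoBehaviour
    {
        private void OnEnable()
        {
            // While registered, this object receives input before the focused object.
            InputManager.Instance.PushModalInputHandler(gameObject);
        }

        private void OnDisable()
        {
            InputManager.Instance.PopModalInputHandler();
        }
    }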

The input manager also holds a reference to the currently active Cursor, allowing the cursor to be accessed from there. The cursor currently also depends on the gaze coming from the GazeManager class.

@StephenHodgson
Contributor

LIFO?

@maxouellet
Author

Last In, First Out. It's a stack.

@genereddick

When will this be ready for testing?

@maxouellet
Author

Good question. I'm aiming for the week of the 24th. Current plan is to initially put this in a separate branch, give it a few weeks to communicate the breaking changes + stabilize as needed, and then RI into master.

I have this working locally on a few projects, but there's some cleanup I need to do beforehand to cleanly integrate it. Notably, this comes with some other utility fixes and additions that might have ripple effects, such as a modified Singleton that doesn't call FindByType.

@StephenHodgson
Contributor

@maxouellet Any news?

@aalmada

aalmada commented Oct 25, 2016

Have you considered using Reactive Extensions (UniRx)? It allows compact and robust code, especially when handling asynchronous inputs from multiple sources. Here is a reactive version of the GazeGestureManager for the Origami sample: https://github.com/aalmada/OrigamiRx/blob/master/Assets/Scripts/GazeGestureManager.cs

@StephenHodgson
Contributor

StephenHodgson commented Oct 25, 2016

I was considering UniRx as well, but I didn't like that it was essentially using Linq.

Were you able to record definitive performance tests?

@aalmada

aalmada commented Oct 25, 2016

The Linq performance issue was already raised in the HoloLens forum, and here is the clarification from the UniRx developer: https://forums.hololens.com/discussion/comment/7938/#Comment_7938

@maxouellet
Author

@HodgsonSDAS I'll likely get started on this today or tomorrow. I have all the code written for a first version; it's just a matter of cleaning it up so that I can publish it externally. My schedule is clearing up this afternoon, so I'm hoping to have a first pull request out by the end of the week.

@aalmada I have not considered UniRx. My understanding is that we want to keep HoloToolkit as light and standalone as possible, and also make it simple to use for new developers. While I have no doubt that UniRx is great, I'm not convinced that we should add it as a mandatory dependency to the core HoloToolkit. That being said, this can definitely be brought up in a separate discussion if someone is willing to do the work. Maybe it could be an optional extension to HoloToolkit, or something like that.

@NeerajW

NeerajW commented Oct 25, 2016

I agree with Max. We should keep HoloToolkit strongly typed, easy to read, and extensible.
Personally, I felt UniRx did not meet the readable-code criteria.
I don't doubt its power, but I'd rather not take a dependency on a less-known design pattern. Interface/Singleton based patterns are understood by all; if not, they are easy to ramp up on. Hope this helps!

@darax
Member

darax commented Oct 25, 2016

@maxouellet are you planning on switching back to the singleton that requires setting the instance in Awake?

@NeerajW

NeerajW commented Oct 25, 2016

Max, here are some questions/thoughts based on the interface diagram:

Gaze:
Since we have named all classes related to gaze so far as Gaze*something, does it make sense to rename IFocusable to IGazeable, with OnGazeEnter and OnGazeExit? This might also make the transition easier for current code bases.

Gesture:

  • Would it make sense to define interfaces for all gestures, and not just Hold/Manipulation?
  • ISourceStateHandler vs. IInputHandler: it looks like these split out the UnityEngine.VR.WSA.Input.InteractionManager functionality. I think the names are confusing; perhaps keep the interface names closer to the actual API? It can be confusing to know which interface to use for what in this design. Unless I misunderstood something?

Voice:

  • I don't see much around that. Any thoughts?

@aalmada

aalmada commented Oct 25, 2016

I totally agree with @maxouellet's reasoning, but not with @NeerajW's. Rx is strongly typed, easier to read, and easier to extend. For these same reasons, so many people are now adopting functional languages.
I already started working on a reactive version of HoloToolkit, but I'm still trying to figure out the best way to take advantage of your work without replicating it. If it were natively reactive it would be much simpler, and that's why I raised the issue.
Thanks a lot for the feedback and for all the amazing work.

@maxouellet
Author

@darax Yes, but we have improvements to it so that it automatically sets itself up in Awake. There are two drawbacks to this (a sketch of a derived class follows the list), but at least it's better coding practice than doing FindByType across your whole scene.

  1. If your singleton class needs to declare an Awake, it has to call base.Awake() to set the instance.
  2. Singletons are only guaranteed to be set in Start(), so you should never call into a singleton from Awake. This is consistent with the expectation that Awake is for setting up your class, and Start is for setting up its dependencies and behaviors.
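
A minimal sketch of a derived singleton (MyManager is a hypothetical example class):

    using UnityEngine;

    // Hypothetical example built on the Singleton<T> base class shared below.
    public class MyManager : Singleton<MyManager>
    {
        public override void Awake()
        {
            base.Awake(); // required: sets the static Instance
            // Set up this class here; don't touch other singletons yet.
        }

        private void Start()
        {
            // Other singletons are guaranteed to be initialized by now.
        }
    }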

@darax
Member

darax commented Oct 25, 2016

var is the opposite of strongly typed. :)

Having a function like .RefCount is a hint that people working on it will need to understand Rx object lifetimes. This makes it harder to extend without introducing leaks or crashes.

I don't consider Linq to be more readable. The script you posted is hard for me to follow. What does .Publish() do? How does the focused object get set? I'd expect to see a raycast somewhere to set the focused object, but I don't see that.

I like where it looks like Max is going with the interfaces, and I'm sure that Rx can be wrapped around that once he's done.

@darax
Member

darax commented Oct 25, 2016

FindByType is only called once per singleton. I'm not sure why that's a problem.

@StephenHodgson
Contributor

StephenHodgson commented Oct 25, 2016

I agree, Linq isn't explicitly typed and can be confusing to read at first, but the definitive answer lies in the hard data of performance testing.

@StephenHodgson
Contributor

StephenHodgson commented Oct 25, 2016

@maxouellet how different is this new singleton generic?
What was wrong with the current implementation?

@maxouellet
Author

I'll make it simpler by just sharing what we're using right now.

    /// <summary>
    /// Singleton behaviour class, used for components that should only have one instance
    /// </summary>
    /// <typeparam name="T"></typeparam>
    public class Singleton<T> : MonoBehaviour where T : Singleton<T>
    {
        private static T instance;
        public static T Instance
        {
            get
            {
                return instance;
            }
        }

        /// <summary>
        /// Returns whether the instance has been initialized or not.
        /// </summary>
        public static bool IsInitialized
        {
            get
            {
                return instance != null;
            }
        }

        /// <summary>
        /// Base awake method that sets the singleton's unique instance.
        /// </summary>
        public virtual void Awake()
        {
            if (instance != null)
            {
                Debug.LogErrorFormat("Trying to instantiate a second instance of singleton class {0}", GetType().Name);
            }
            else
            {
                instance = (T) this;
            }
        }

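        /// <summary>
        /// Base OnDestroy method that clears the singleton's unique instance.
        /// </summary>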
        public virtual void OnDestroy()
        {
            if (instance == this)
            {
                instance = null;
            }
        }
    }

@StephenHodgson
Contributor

StephenHodgson commented Oct 25, 2016

I like it, but we will need to make sure everyone calls the base classes when utilizing it.
I wish there was a way to make calling the base class required.

@maxouellet
Author

You get a warning in the editor if you override Awake without specifying "override". Also, if you don't call base.Awake(), you will get a bunch of null reference exceptions every time you try to access the instance.

@timGerken
Contributor

Should Awake and OnDestroy be marked as "protected" instead of "public"?

@StephenHodgson
Contributor

StephenHodgson commented Oct 25, 2016

What happens if someone doesn't call base.OnDestroy()?

@timGerken
Contributor

timGerken commented Oct 25, 2016

I've also seen this pattern: in the (instance != null) case in Awake(), call DestroyImmediate(this) just to be sure.

@maxouellet
Author

@timGerken Yes, it should be protected. The reason it is public is that this was written at a time when a bug in Unity caused protected inheritance to throw a runtime exception. I'd fix it before submitting :)

If you don't call base.OnDestroy, you have a dangling static reference, which I think will eventually be cleaned up by the garbage collector. Realistically, this is only useful in cases where you want to destroy a Singleton before your app closes. That's already the case with the existing Singleton.

I'm not sure there is a need to call DestroyImmediate in this case, but not strongly opposed to it either.

@timGerken
Contributor

Should this Singleton pattern survive Unity scene transitions?
If so, in Awake, it should also call DontDestroyOnLoad(this);

@StephenHodgson
Contributor

StephenHodgson commented Oct 25, 2016

@timGerken I think that should be optional.
Generally this is a separate script you just attach to your parent object anyway.
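
A minimal sketch of that separate script (KeepAlive is a hypothetical name):

    using UnityEngine;

    // Hypothetical helper: attach to a root object that should survive scene loads.
    public class KeepAlive : MonoBehaviour
    {
        private void Awake()
        {
            DontDestroyOnLoad(gameObject);
        }
    }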

@StephenHodgson
Contributor

@maxouellet I think documenting the required call to the base class should be good enough.

@aalmada

aalmada commented Nov 11, 2016

I have a 3D model designed in CAD that is structured as a hierarchy, with many leaf objects that each have a mesh and a collider. I want to be able to hand-drag entire branches of this hierarchy instead of individual objects.
InputManager sends the messages to the focused object. I can make this work if I add a HandDraggable to each leaf object and set HostTransform to the parent branch node.
The issue is that adding HandDraggable to each leaf object is very tedious. Is there an alternative?

@maxouellet
Author

maxouellet commented Nov 11, 2016

The events bubble up according to your game object hierarchy, so you don't have to have the HandDraggable script (or any other script that implements the input interfaces) on the game object that has the collider. In your case, assuming you have something like the following:

Root
   Branch1
      Leaf1
      Leaf2
   Branch2
      Leaf3
      Leaf4

You can put the HandDraggable script on Branch1 and Branch2, and that will work with the individual colliders that are in their children.

The one exception to this is if the leaves are already handling the same input event. Each input callback is passed an *EventData object that has a Use() function. If something calls that Use() function, the event is considered handled, and thus it will stop bubbling up the game object hierarchy.
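
For example, a leaf could consume the press itself so it never reaches the HandDraggable on its branch. A minimal sketch, assuming the IInputHandler signatures described earlier:

    using HoloToolkit.Unity.InputModule; // assumed namespace
    using UnityEngine;

    // Hypothetical sketch: a leaf that handles presses itself, preventing
    // them from bubbling up to scripts on its parent branch.
    public class LeafPressBlocker : MonoBehaviour, IInputHandler
    {
        public void OnInputDown(InputEventData eventData)
        {
            // Mark the event as used; it stops bubbling up the hierarchy.
            eventData.Use();
        }

        public void OnInputUp(InputEventData eventData) { }
    }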

Let me know if that doesn't handle your scenario!

@aalmada

aalmada commented Nov 14, 2016

@maxouellet Works just like you explained. Thanks!

@aalmada

aalmada commented Nov 16, 2016

@maxouellet KeywordManager doesn't have the same bubbling mechanism. It calls a specific method in a specific object. I have a workaround but I think it would be interesting to make all these managers consistent with how InputManager invokes events.

@maxouellet
Author

@aalmada I agree that would be great; that work just hasn't been done yet, so I left the original classes in there. There are a few things that could be added when the need arises, notably:

  • Keyboard input routing
  • Controller input routing
  • Voice commands input routing (that one is particularly tricky to get right...)

I do not intend to address those soon due to lack of time, but I will be happy to review changes from anyone who tries to address them. I think keyboard and controller should be relatively straightforward to do.

@aalmada

aalmada commented Nov 17, 2016

@maxouellet fair enough

@aalmada

aalmada commented Nov 18, 2016

I do not intend to address those soon due to lack of time, but I will be happy to review changes from anyone who tries to address them.

@maxouellet Challenge accepted! ;)
While researching how to extend the InputManager, I noticed there are two different patterns: GazeManager is explicitly registered with the InputManager, while all the others, like GesturesInput, derive from BaseInputSource. Why doesn't GazeManager follow the more generic second pattern?

@StephenHodgson
Contributor

StephenHodgson commented Nov 18, 2016

GazeManager is the one that handles most of the logic about which object we're currently focused on, and many of these other classes utilize that information. If you look at GazeManager.RaycastUnityUI, you'll see we actually utilize the built-in EventSystem too.

It seems that there are two ways input events have to be handled in order to get things to work correctly.

@aalmada

aalmada commented Nov 18, 2016

@HodgsonSDAS Yes, I do understand that. My question is more like: is GazeManager really a special case? If not, the architecture would be somewhat simpler. Just food for thought. ;)

@StephenHodgson
Contributor

Oh I agree simpler is better, I'll have to take a closer look at it later. nom nom nom.

@maxouellet
Author

@aalmada Yeah, I didn't have enough time to figure out a great way to implement the GazeManager without breaking everything that depends on it being a singleton...

The long-term solution I was thinking of is that we should have one or more FocusSources that register themselves with the InputManager. GazeManager would be one implementation of a FocusSource (a very important one that we might want to keep as a Singleton), but you could have others (a 6DOF Vive controller, for example). Then every input event would include not only the input source data, but also the FocusSource that triggered the event on the object.
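
As a rough sketch of that idea (no such type exists yet; the names are hypothetical):

    using UnityEngine;

    // Hypothetical sketch of the FocusSource idea described above.
    public interface IFocusSource
    {
        // The object this focus source currently points at, if any.
        GameObject CurrentFocus { get; }
    }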

@paseb

paseb commented Nov 18, 2016

Well, there goes my reply :) @maxouellet beat me to it; I was going to say what Max wrote. Because we should support multiple input focus sources, having them independent and registering with the InputManager is a good choice. We internally separated out the user's GazeFocus from the other FocusSources, since where the head is looking tends to be complementary to other focus sources. Examples would be gaze-contextual voice commands, gaze-based rendering and culling, etc.

@aalmada

aalmada commented Nov 23, 2016

@maxouellet Is IInputSource.SupportedEvents used somewhere?

@maxouellet
Author

@aalmada Not right now; it's there as a convenience in case someone ever needs it. Eventually, as need arises, we might also want to add an InputType enum that tells you whether the input is Hands, Controller, or whatever else the Unity APIs currently support.

@aalmada

aalmada commented Dec 7, 2016

There are a few things that could be added when the need arises, notably:

Keyboard input routing
Controller input routing
Voice commands input routing (that one is particularly tricky to get right...)

@maxouellet I was revisiting your post and thinking that the KeywordManager and the SpeechInputSource handle voice commands, but also keyboard input. The issue is that they assume there is always a voice command associated with a key. To make this fully agnostic and reusable, maybe there should be the notion of a command that can be mapped to triggers from input sources.
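
A rough sketch of that command notion (all names hypothetical):

    using System;

    // Hypothetical sketch: one logical command that any number of input
    // sources (keyword, key press, controller button, ...) can trigger.
    public class InputCommand
    {
        public string Name; // e.g. "Select"

        public event Action Triggered;

        // Called by whichever input source recognized one of its mapped triggers.
        public void Fire()
        {
            if (Triggered != null)
            {
                Triggered();
            }
        }
    }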

@yacuzo

yacuzo commented Dec 8, 2016

The diagrams in this discussion are great for learning the new system. Could they be put somewhere intuitive, and have the Input/readme.md link to them?

Edit: Good work on this new system! Also, upgrading from the old system was a lot less work than I thought it would be. I ended up with less code than before, yay.

@StephenHodgson
Contributor

@yacuzo #376

@StephenHodgson
Contributor

StephenHodgson commented Dec 14, 2016

@maxouellet Does OnInputClicked happen before or after OnInputUp?

And is there a way to change the speed for clicking? I noticed that if I click and hold for a certain amount of time, OnInputClicked doesn't fire.

@maxouellet
Author

@HodgsonSDAS OnInputClicked is not guaranteed to happen before or after OnInputUp. This is by design, because OnInputClicked is an OS-level gesture. The clicking speed is defined by the OS, so you can't modify it. It is also by design from the OS that OnInputClicked doesn't fire if you click and hold for a certain amount of time. This is all defined by the OS gestures: when you hold for a certain time, it ends up triggering a Hold or a Manipulation/Navigation event.

If you need to customize a click somehow (which I would not recommend, as this would go against the OS definition of a click and might break some user-specific accessibility settings down the line), you could write your own input source that has your own implementation of a click.

@StephenHodgson
Contributor

StephenHodgson commented Dec 15, 2016

No worries, I just wanted to make sure I understand the timing. Thanks for the info.

@maxouellet
Author

Closing this issue, as this has been integrated into HoloToolkit for a while now.

@basuarya

basuarya commented Feb 7, 2017

@maxouellet Hello, I would like to know how to rotate a GameObject using ManipulationHandler. Please help! Can I use it to rotate an object with a pinch drag?

@vbielkin

vbielkin commented Apr 5, 2017

@maxouellet, InputManager supports some of the Unity UI events (like ExecuteEvents.pointerClickHandler, ExecuteEvents.pointerEnterHandler, etc.), but there is nothing for ExecuteEvents.beginDragHandler and ExecuteEvents.dragHandler. As a result, using Unity ScrollBars and ScrollRect with the HoloToolkit InputManager becomes impossible.
Are these Unity event handlers going to be supported in the future?

@maxouellet
Author

@vbielkin Support could be added by anyone who needs that functionality. The clean way to implement it would be to have InputManager interpret the gesture navigation events and send the appropriate drag events (like the ExecuteEvents.beginDragHandler you pointed out).

Alternatively, you could simply attach a script to your Unity scroll bars that has a reference to your Unity UI Scrollbar and implements the INavigationHandler interface. From that script, you could then manipulate the value of the scrollbar to scroll it appropriately based on the navigation input you receive (see the sketch below).

I don't think I'll get to add native support for drag events to InputManager any time soon, but if someone wants to try it, they are welcome to do so! Otherwise, the alternate solution I proposed should be relatively simple to implement in a project.
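
A minimal sketch of that alternate approach, assuming the INavigationHandler interface and a NormalizedOffset field on NavigationEventData (treat this as an illustration, not a drop-in script):

    using HoloToolkit.Unity.InputModule; // assumed namespace
    using UnityEngine;
    using UnityEngine.UI;

    // Hypothetical sketch: drive a Unity UI Scrollbar from the navigation gesture.
    public class NavigationScroller : MonoBehaviour, INavigationHandler
    {
        public Scrollbar Scrollbar;      // the scrollbar to drive
        public float Sensitivity = 0.5f; // scroll amount per unit of hand offset

        private float startValue;

        public void OnNavigationStarted(NavigationEventData eventData)
        {
            startValue = Scrollbar.value;
        }

        public void OnNavigationUpdated(NavigationEventData eventData)
        {
            // NormalizedOffset is roughly -1..1 relative to the gesture start.
            Scrollbar.value = Mathf.Clamp01(startValue + eventData.NormalizedOffset.y * Sensitivity);
        }

        public void OnNavigationCompleted(NavigationEventData eventData) { }

        public void OnNavigationCanceled(NavigationEventData eventData) { }
    }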
