Advanced WebXR Controller Input #392

Closed
toji opened this issue Sep 12, 2018 · 15 comments

@toji
Member

toji commented Sep 12, 2018

From @toji on July 11, 2018 21:53

Given feedback from multiple developers citing concerns about our initial plans for a limited input system, the WebXR community group wanted to re-open a discussion about how to handle more advanced input. We agreed on a recent call to put together a proposal for how such a system might work so that we can iterate on the design in public and gather feedback from relevant parties (such as the OpenXR working group and the W3C Gamepad CG).

With WebVR we had exposed VR controller state with extensions to the gamepad API. We feel that building this new advanced input on top of the Gamepad API as it exists today is problematic, though, for a couple of reasons:

  • Developers frequently complained about the difficulty mapping multiple controllers using that method, and as a result we saw a lot of experiences that excluded all controllers that didn't have a specific name string.
  • OpenXR's proposed input model makes it effectively impossible to enumerate all available inputs on a controller (or, for that matter, all available controllers)
  • XR controllers appear to be converging on a small set of common layouts relatively quickly, which should allow us to expose something more limited in scope than the generic gamepad API with higher confidence than usual that it'll be relevant for multiple years to come.

That said, we don't want to re-invent wheels that we don't have to, so we're open to further discussing this proposal with the individuals maintaining the Gamepad API at the moment to see if there's a common ground that can be reached that isn't detrimental to this use case.

Proposed IDL

interface XRInputState {
  readonly attribute boolean pressed;
  readonly attribute boolean touched;
  readonly attribute FrozenArray<double> value;
};

interface XRControllerState {
  readonly attribute XRInputState? trigger;
  readonly attribute XRInputState? joystick;
  readonly attribute XRInputState? touchpad;
  readonly attribute XRInputState? grip;
  readonly attribute FrozenArray<XRInputState> buttons;
};

partial interface XRInputSource {
  readonly attribute XRControllerState? controllerState;
};

And some really brief sample code:

let inputSource = xrSession.getInputSources()[0];

if (inputSource.controllerState) {
  // Is a controller with buttons and stuff!

  let joystick = inputSource.controllerState.joystick;
  if (joystick && joystick.value.length == 2) {
    // Has a 2-axis joystick!
    PlayerMove(joystick.value[0], joystick.value[1]);
  }

  if (inputSource.controllerState.buttons.length > 0) {
    let button = inputSource.controllerState.buttons[0];
    if (button.pressed) {
      PlayerJump();
    }
  }
  
  // etc.
}

These snippets are not intended to be taken verbatim, but are meant to serve as a concrete starting point for further conversation.

Mapping to native APIs

One of our primary concerns when structuring this API is ensuring that it can work successfully on top of OpenXR, which we expect to power a non-trivial amount of the XR ecosystem at some future date. As explained in the Khronos group's GDC session, that API is currently planning on exposing an input system that revolves around binding actions to recommended paths (ie: "/user/hand/left/input/trigger/click"), which the system can then re-map as needed.

We would expect the above interface to be implemented on top of an OpenXR-like system by creating actions that are a fairly literal description of the expected input type and mapping them to the "default" paths for that input type when available. (ie: controllerState.trigger.value[0] is backed by an action binding called "triggerValue" with a default path of "/user/hand//input/trigger/value") If the binding fails, that particular input is set to null to indicate it's not present. This is a fairly rigid use of the binding system, but it does allow users to do some basic, browser-wide remapping when needed.

For any other native API, the inputs are usually delivered as a simple struct of values, which are trivial to map to this type of interface.

Random thoughts/comments

  • This does not replace the existing "select" events and its variants, but instead is additive on top of that. We still expect many apps will want to use the simple "select"-based input method for the broadest possible compatibility, whereas this proposal is only applicable to systems that use controller devices.
  • The value array maps well to OpenXR's idea of actions as vectors. I like that the length makes it easy to test what kind of value you're dealing with: 0 == boolean, 1 == scalar, 2+ == vector (see the sketch after this list)
  • value elements should all be normalized to a [-1, 1] range where 0 is the neutral value
  • You could feasibly map something like Knuckles finger tracking as a vector of 5 values in the grip input (1st value gives overall grip value for backwards compat.)
  • controllerState name is very intentionally exclusive of things that are not controller-shaped. I would expect alternative tracked inputs like hands to omit the controller state altogether and rely solely on select events for now. I think if we want more than that we'll need hand-centric input state.
  • Following the previous item, and as we've discussed before, I think explicit hand skeletal poses are something that could exist as a separate entity on the XRInputSource. I like the idea of keeping those kinds of states bundled together in discrete, easily testable interfaces that live side-by-side under the input source.
  • For OpenXR action mappings in relation to the button array we'd probably just end up speculatively binding to a primary and secondary button (or A and B or whatever the path ends up being for the known controllers) and see if it takes. Not great, but functional. We would want to declare a way to determine the ordering in the spec, by the way. (Closest to the user's palm first or something)
  • I'm not advocating adding state change events to this model at this time, but we may want to do so in the future.
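
As a concrete illustration of the length-based detection above (a sketch only; classifyInput is an invented name, not part of the proposal):

function classifyInput(inputState) {
  if (!inputState) return "absent";
  switch (inputState.value.length) {
    case 0: return "boolean";  // only pressed/touched carry information
    case 1: return "scalar";   // e.g. an analog trigger
    default: return "vector";  // e.g. a 2-axis joystick or touchpad
  }
}

let state = inputSource.controllerState;
if (state && classifyInput(state.trigger) === "scalar") {
  // value[0] is assumed normalized to [-1, 1], with 0 as the neutral position.
  console.log("Analog trigger at", state.trigger.value[0]);
}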

Questions

  • Can you detect that a joystick is clickable? Do you need to?
  • Some devices can detect how far a finger is off the input rather than just touched/not. Is that something we care about exposing here?
  • Do we need to declare if a value has a range of [-1, 1] or [0, 1]? Is it implied by the number of values? By the input name?
  • Do we care about haptics in this first pass? I'm leaning towards no for simplicity, but could easily be convinced otherwise.

Copied from original issue: immersive-web/proposals#17

@toji
Member Author

toji commented Sep 12, 2018

From @fernandojsg on July 12, 2018 11:29

It's looking great Brandon, thanks for sharing it. Just a few comments:

readonly attribute XRInputState? trigger;
readonly attribute XRInputState? joystick;
readonly attribute XRInputState? touchpad;

I know that all the controllers I have seen so far have just one of these, but do you think it could be useful to define them as an array, as you do with buttons, so we could support new fancy controllers in the future?
triggers[0] would still be the default trigger, but who knows, maybe you could have an additional trigger for the middle finger or so.


Can you detect that a joystick is clickable? Do you need to?

I believe it's valuable to have that feature, both for joystick and touchpad.
I expect that it will work as it does right now, am I correct? I mean, if you are moving the joystick or using the touchpad without pressing, you'll get the axis values in the array of values and touched will be true but pressed false, and once you click on it, pressed will also be set to true.
Regarding detecting if it's clickable or not, what about adding a third value to the array? If we assume all joysticks and touchpads have 2 axes (for 1-axis buttons like LT or RT on the Xbox controller we will use the buttons array anyway), we could add a third one indicating that it's clickable; it would contain 0 or 1 depending on the state.
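
As a rough illustration only (not part of the proposal), reading such a third element might look like:

let touchpad = inputSource.controllerState.touchpad;
if (touchpad && touchpad.value.length >= 3) {
  // A third element would mean the control is clickable; per the suggestion above,
  // it would carry 0 or 1 depending on the click state.
  let isClicked = touchpad.value[2] === 1;
}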


Do we need to declare if a value has a range of [-1, 1] or [0, 1]? Is it implied by the number of values? By the input name?

Currently we're using touchpad and joystick values in [-1, 1] and analog buttons like triggers in [0, 1].
If we have boolean buttons with values 0 or 1, I'd expect analog buttons to be in the same range too.
Otherwise we should provide a way to detect whether a button is analog or digital so you can map the [-1, 1] range back to [0, 1]; if not, you could end up mapping a digital 0 and 1 to 0.5 and 1.
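
To illustrate that last point (a sketch only): remapping a [-1, 1] reading into [0, 1] is just (v + 1) / 2, so a digital button reporting 0 and 1 in that range comes out as 0.5 and 1:

function toZeroOne(v) {
  return (v + 1) / 2;  // map [-1, 1] onto [0, 1]
}

toZeroOne(0);  // 0.5: a digital "unpressed" 0 no longer reads as 0
toZeroOne(1);  // 1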


Do we care about haptics in this first pass? I'm leaning towards no for simplicity, but could easily be convinced otherwise.

I'm happy to skip that too on the first iteration.


The value array maps well to OpenXR's idea of actions as vectors. I like that the length makes it easy to test what kind of value you're dealing with: 0 == boolean, 1 == scalar, 2+ == vector

I know that we always try to define APIs as simply as possible and remove any extra fancy feature that we can infer from them. But I would like to hear opinions about adding explicit definitions for things like this, e.g. detecting the kind of value, or my previous proposal for the "is clickable" value. If we end up realizing that everyone using the API needs to do that type of test themselves, maybe it would be worth including a specific param for it?

Thanks again for kicking this off!

@toji
Member Author

toji commented Sep 12, 2018

From @johnshaughnessy on July 13, 2018 18:44

Link for the lazy: in the video @toji linked, Nick Whiting talks about input handling in OpenXR at https://youtu.be/U-CpA5d9MjI?t=28m15s until ~36:40.

We would expect the above interface to be implemented on top of an OpenXR-like system by creating actions that are a fairly literal description of the expected input type and mapping them to the "default" paths for that input type when available.

I am concerned that this approach removes the benefits of the OpenXR-like input system.
As an application developer, the device-abstraction provided by OpenXR allows me to write an application against named, typed actions.
I want to replace code written like this:

let button = inputSource.controllerState.buttons[0];
if (button.pressed) {
  PlayerJump();
}

with code written like this:

// In initialization code
const jumpActionHandle = Input.idForAction("/foo_platforming/in/jump");

// In game loop
if (Input.getBoolean(jumpActionHandle)){
  PlayerJump();
}

I would love to write my web application as if it were an OpenXR application.
However, there are problems with this, stemming from the fact that the browser is the OpenXR application, not the web app.
Here are some of those problems and potential solutions. I'd love to hear feedback on these.

  • The user wants to customize bindings on a per-(web)application basis.
  • The web application developer wants to define action sets and write logic against actions.
  • The browser must interface with OpenXR to define all possible actions.

One way I imagine these all to be satisfied is as follows:

When the browser starts, it gives the runtime an action manifest for all of its own actions:

{
  name: "/browser_default/in/pointer_position",
  type: "vector3"
},
{
  name: "/browser_default/in/pointer_orientation",
  type: "quaternion"
},
{
  name: "/browser_default/in/pointer_click",
  type: "boolean"
},
{
  name: "/browser_menu/in/up",
  type: "boolean"
},
{
  name: "/browser_menu/in/down",
  type: "boolean"
},
{
  name: "/browser_menu/in/select",
  type: "boolean"
},
{
  name: "/browser_menu/in/cancel",
  type: "boolean"
},

This action manifest would allow the browser (being an OpenXR application) to let the user click on things in the browser chrome, open menus, and so on. A real example would include more than this; here I wanted to show that the browser would specify the actions it needs to operate in two different action sets ("browser_default" and "browser_menu").

When the browser navigates to a web application ("Foo"), the web application signals to the browser that it wants to use OpenXR-like input. The web app supplies its action manifest, which the browser composes with its own. The browser then reregisters its actions with OpenXR, potentially appearing as a different application (It goes from being called "Browser" to "Browser-Foo" in OpenXR). I don't know if this is possible, or if there's some way to make this possible! The browser submits a new action manifest to OpenXR:

{
  name: "/browser_default/in/pointer_position",
  type: "vector3"
},
...
{
  name: "/browser_menu/in/cancel",
  type: "boolean"
},
{
  name: "/foo_menu/in/up",
  type: "boolean"
},
{
  name: "/foo_menu/in/down",
  type: "boolean"
},
{
  name: "/foo_menu/in/select",
  type: "boolean"
},
{
  name: "/foo_menu/in/cancel",
  type: "boolean"
},
{
  name: "/foo_platforming/in/jump",
  type: "boolean"
}
...

What I don't know is whether the browser can appear as a different application to the runtime for each web application that is running inside / on top of it. If there's a way to do that, then users can store/share their favorite bindings for web applications the same way they do for native applications. There are some obvious hurdles to this approach, assuming it's possible. For example, if switching tabs back and forth unregisters and reregisters the browser with OpenXR, then OpenXR handles to actions held by the web application may be invalidated, and there may be significant overhead in doing the switch.

If there's not a way to do that, that is unfortunate. Perhaps the browser can store this information on behalf of the user and suggest bindings to OpenXR whenever the user navigates to a new page, but this seems like a MUCH more difficult arrangement to pull off.

In summary, I'm concerned that writing against inputSource.controllerState.buttons[0] will

  1. leave users in a place where they store N custom bindings for their browser, one for each web application they use.
  2. not allow developers to write "input agnostic" code, leaving them to figure out and implement a pattern of using actions / action sets on their own.

If we can find a way to share the benefits of the OpenXR runtime input system (or an OpenXR-like input system), that would be ideal.

From that perspective, potential answers to your questions:

Can you detect that a joystick is clickable? Do you need to?

The application specifies the need for a boolean action. The user can bind that action to the joystick click via the runtime.

Some devices can detect how far a finger is off the input rather than just touched/not. Is that something we care about exposing here?

Handled by the runtime as an analog value the user can bind to an analog action (or some filter).

Do we need to declare if a value has a range of [-1, 1] or [0, 1]? Is it implied by the number of values? By the input name?

The application declares an analog action with range [-1, 1] or [0, 1]. The runtime fulfills this need.

Do we care about haptics in this first pass? I'm leaning towards no for simplicity, but could easily be convinced otherwise.

It would be nice to have. Replace "/in/" with "/out/" in those action manifests above to have a named haptic action the application can use to signal to the runtime when a vibration should occur.
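
For example, extending the hypothetical manifest format above, a haptic output action could be declared as (the name and type here are illustrative):

{
  name: "/foo_platforming/out/haptic_bump",
  type: "vibration"
},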

Thank you @toji for opening up this discussion. I think this is a difficult problem and it would benefit a lot of people (myself included) if solved well.

@toji
Member Author

toji commented Sep 12, 2018

This is excellent feedback, @johnshaughnessy! Within the CG we've discussed a few of the items you've brought up, so I want to surface some of the discussion here to give more context. Please don't take it as a dismissal of your points, though!

First off, I completely agree that a browser which you interact with in AR/VR would provide its own OpenXR bindings for interactions such as clicking, scrolling, back, etc. The fact that OpenXR would make those remappable for the user is great for accessibility and forward compatibility, so wins all around!

As for the page mappings themselves, OpenXR provides what it calls "Action sets" (See Page 29 of the GDC slides), which would typically be used in something like a game to provide a different set of mappings between, say, gameplay and menu modes. Here we could create different action sets for traditional browsing and WebXR content, which generally covers the concerns you had about making the browser appear as a different application.

Given that, there are two approaches to how those action mappings could be applied. One is that we provide a single, static mapping for all WebXR content, which is basically the approach that my first post advocated for. We'd get a mapping that says something like:

[{
  actionName: "triggerClick",
  defaultBinding: "/user/hand/left/input/trigger/click",
  valueType: "boolean",
},
{
  actionName: "triggerValue",
  defaultBinding: "/user/hand/left/input/trigger/value",
  valueType: "scalar",
},
/* etc... */
]

And every WebXR page would use it. This does allow some remapping, in that you can set browser-wide changes for the various inputs, but you couldn't set a custom mapping for just, say, threejs.org.

The second approach, which is what you've implicitly suggested, is to have a separate action mapping per page. This feels attractive at first glance, but it leads to some tough technical and privacy issues.

  • Is the action set managed per-origin or per-url?
    • Allowing pages to set their own native action set name is a non-starter because of namespace conflicts, so it has to be tied to the domain somehow.
    • If it's per-origin then you're going to make life very hard on portal sites that collect a lot of experiences in a single location.
    • If it's per-url, you now have an issue differentiating between pages that use query args. Is example.com/xr-stuff.html?state=foo the same app as example.com/xr-stuff.html?state=bar? Usually yes, sometimes no.
  • In both cases you have an issue of privacy, in that the remapping utility has now become a secondary browser history that captures every WebXR-enabled page you visit. (Or at least every domain you visit, which isn't a whole lot better)
  • Or, alternately, we somehow prevent those action sets from being captured by the remapping utility, and now you've lost one of the big benefits of using action mapping in the first place.
  • Or, let's say that you obfuscate the URLs. Now there are new problems: You've made it significantly harder to figure out what action set you want to remap, and you've filled your remapping utility with lots of junk entries from sites you only intended to visit once. Plus you still have an issue with the action names themselves being potentially privacy-sensitive. "Teleport" is reasonably generic, but "igniteLightsaber" is highly suggestive that you're playing a Star Wars game. I'll allow you to extrapolate how this applies to adult sites.

Beyond those issues, however, we'd also run into the problem that in order to expose mapping of this type to Javascript you'd likely have to either expose the OpenXR semantic path syntax directly or do something that is trivially and directly transformable into it. That carries an implication that we view OpenXR as the canonical backend for WebXR which, name similarities aside, is simply not the case. In order to avoid that and be more platform agnostic we'd probably end up creating a syntax that would require browser developer intervention to enable new devices/paths as they became available, and that's not an improvement in any meaningful way over simply having some static input types pre-mapped.

I haven't addressed every point you raised, but I'm going to have to leave it at that for now due to schedule constraints on my end. I'll try to leave additional comments soon to respond to anything I missed. Still, hopefully that gives some insight into some of the reasons we've been reluctant to fully embrace the OpenXR input model on the web so far. (I'm overall pretty positive about that model for more traditional apps. This is just one of those areas, as with so many other areas of computing, where the web is weird).

@toji
Member Author

toji commented Sep 12, 2018

From @AlbertoElias on July 14, 2018 20:39

Thanks for kicking this off @toji. I like the initial proposal, and I like how a hand API would go side by side with this so controllers like Touch and Knuckles could have both.

I do think it's important to have access to all possible information, such as how far off the fingers are from the controller, and haptics, which I think should have at least the same support as the current Gamepad API does, since some WebVR sites like Space Rocks are using it.

I also think it's important not to use the same API that OpenXR offers, even though it's very versatile and a nice abstraction. It doesn't feel very webby to me, and it seems hard for people to get into. Maybe in the future it becomes the common way to interact with all possible XR input sources, and at that point we could create an API like that; since you're already looking into how this API works on top of the OpenXR one, it could be brought in more easily.

@toji
Member Author

toji commented Sep 12, 2018

From @thetuvix on August 14, 2018 17:14

Thanks for writing this up, @toji! While it would be nice to expose some action-like system in the future, I believe this more literal mapping is the right path for now, given the issues you discussed above.

Some comments on the details of the proposal:

readonly attribute FrozenArray<XRInputState> buttons;

One of the key advantages of this approach vs. the current Gamepad API is to give strong names to the axes: trigger, joystick, etc. We should explore doing the same for buttons such as menu as well, especially since what is an "axis" and what is a "button" is a squishy line, given that controls like the "grip" can be touch-sensitive or not, or have analog triggers or not, depending on the particular controller. If we just define a flat array, I worry that various UAs and XR platforms will diverge in the index they give to equivalent controls, which will get us back to where we are today with the Gamepad API in WebVR.

If we believe the buttons array was giving us an escape hatch for more exotic XR controllers (e.g. an XR belt with 10 buttons), we should define that escape hatch more explicitly. For example, perhaps next to the well-known attributes like trigger, we define a controls dictionary, which has the same well-known attributes as keys, along with an ability for UAs to expose other keys with some UA-specific prefix to avoid collisions. If we don't like this due to the possibility for divergence across UAs, we should then think carefully about exposing unspecified buttons values, which will likely lead to the same divergence, but with numbers instead of string keys.
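
A sketch of what that controls dictionary might look like in use (illustrative only; the controls attribute, the key names, and the UA prefix are all invented for this example):

let controls = inputSource.controllerState.controls;

// Well-known controls looked up by name rather than by array index:
if (controls["trigger"] && controls["trigger"].pressed) {
  PlayerJump();
}

// An exotic control exposed under a UA-specific prefix to avoid collisions:
let beltButton = controls["vendorx-belt-button-3"];
if (beltButton && beltButton.pressed) {
  // ... app-specific handling
}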

For example, the sample code currently checks for a 2-axis touchpad by seeing if the touchpad's value array has two axes. However, if a touchpad had only 1 axis, it's not clear if that would imply a horizontal x touchpad or a vertical y touchpad. By testing for well-known attributes, we can make this explicit and make the code easier to read:

interface XRInputStateDouble {
  readonly attribute boolean pressed;
  readonly attribute boolean touched;
  readonly attribute double value;
};

interface XRInputStateVector2 {
  readonly attribute boolean pressed;
  readonly attribute boolean touched;
  readonly attribute double x;
  readonly attribute double y;
};

interface XRControllerState {
  readonly attribute XRInputStateDouble? trigger;
  readonly attribute XRInputStateVector2? joystick;
  readonly attribute XRInputStateVector2? touchpad;
  readonly attribute XRInputStateDouble? grip;
  readonly attribute XRInputStateDouble? menu;
  readonly attribute XRInputStateDouble? a;
  readonly attribute XRInputStateDouble? b;
};

partial interface XRInputSource {
  readonly attribute XRControllerState? controllerState;
};

Combining explicit button attributes and the stronger state interfaces, this results in more readable WebXR input code:

let inputSource = xrSession.getInputSources()[0];

if (inputSource.controllerState) {
  // Is a controller with buttons and stuff!

  let joystick = inputSource.controllerState.joystick;
  if (joystick) {
    // A non-null XRInputStateVector2 means we have a 2-axis joystick!
    PlayerMove(joystick.x, joystick.y);
  }

  let jumpButton;
  if (inputSource.controllerState.a) {
    jumpButton = inputSource.controllerState.a;
  } else if (inputSource.controllerState.menu) {
    jumpButton = inputSource.controllerState.menu;
  }
  
  if (jumpButton && jumpButton.pressed) {
    PlayerJump();
  }
  
  // etc.
}

Also, while doing the feature tests inline with the values feels very webby, we should decide what it means for inputSource.controllerState to be available or not. If a controller is paired but is not sending data at the moment, can I still inspect the truthiness of its attributes to see what controls will be available? Or should a UA just simplify things for apps and give some default values for any controller that is enumerating?

@toji
Member Author

toji commented Sep 12, 2018

From @AoiGhost on September 12, 2018 8:03

I think we will need hand-centric options for things like Leap Motion. That said, it would have to track a hand skeleton, and I'm not sure how to handle things like the SteamVR Knuckles.

That said, the possibility of other biometric data being read, such as heartbeat tracking, should be considered as well; maybe that should be its own thing? If that were to be implemented, it would need a privacy/security perspective and permissions, for obvious reasons.

@NellWaliczek
Member

At our last F2F it was agreed that this issue will need to be addressed in the timeline of the first version of WebXR to reach Recommendation status. As such, it has been moved into the WebXR repo. The tool used to migrate this issue has some unfortunate side-effects (issues show as filed by the person doing the move and add a From foo on date/time at the top).
Thanks for your patience!

@dmarcos
Contributor

dmarcos commented Sep 13, 2018

Thanks for the proposal and reconsidering the previous API. Super appreciated. After a quick review:

  • Will there be any way to identify the controller? Oculus Touch vs. Vive vs. Go... similar to the Gamepad API id?

In A-Painter and Supercraft, for instance, we rely on the Gamepad id to load custom models, show instructions, position UI, and configure raycaster origins and angles according to each controller.

The number of buttons, joysticks, and touchpads is not a reliable way to identify a controller, e.g. today's 3DOF controllers all have the same inputs (trigger, touchpad, button).

An id attribute on the XRInputSource object would work, but I'm not a fan of the Gamepad id string.

Many applications make different input choices for each hand; for instance, A-Painter uses one hand for the color palette and the other for the brush. Each hand has a different button mapping. Is it possible to add a hand attribute to the XRInputSource object?
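
For illustration only (the id and hand attributes, their values, and the helper functions are hypothetical, not a proposal), something along these lines would cover both the identification and the per-hand cases:

let source = xrSession.getInputSources()[0];

// Hypothetical id attribute for loading custom models, instructions, etc.
if (source.id === "Oculus Touch") {
  LoadControllerModel("oculus-touch");
}

// Hypothetical hand attribute for per-hand mappings, as in A-Painter:
if (source.hand === "left") {
  AttachColorPalette(source);
} else {
  AttachBrush(source);
}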

Can you detect that a joystick is clickable? Do you need to?

This is an interesting question. An application might want to map input based on button capabilities. In Supermedium for instance we map the browser menu to a long press on the Oculus Touch joystick. The logic now relies on the Gamepad id but a way to check for capabilities would be simpler and more elegant.

Some devices can detect how far a finger is off the input rather than just touched/not. Is that something we care about exposing here?

Adding another [0, 1] pair somewhere as @fernandojsg suggested will work.

Do we need to declare if a value has a range of [-1, 1] or [0, 1]? Is it implied by the number of values? By the input name?

I expect 0 to be the resting position. Joysticks and touchpads move from [-1,1] and buttons [0, 1]. I think it would be implied by the input name if I'm not missing anything.

Do we care about haptics in this first pass? I'm leaning towards no for simplicity, but could easily be convinced otherwise.

We use controller vibration extensively in Supercraft / Supermedium. It makes a big difference in increasing immersion and is very useful for guiding the user in non-visual ways. We would love to help find a way to keep the functionality.
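
For reference, in WebVR today this kind of vibration goes through the Gamepad extensions' haptic actuators, roughly like this (sketch):

let gamepad = navigator.getGamepads()[0];
if (gamepad && gamepad.hapticActuators && gamepad.hapticActuators.length > 0) {
  // pulse(intensity, duration): intensity in [0, 1], duration in milliseconds.
  gamepad.hapticActuators[0].pulse(0.8, 100);
}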

@daoshengmu

daoshengmu commented Oct 18, 2018

I would like to propose this action-based IDL for XRInput:

interface XRInputBinding {
  readonly attribute DOMString binding;
  readonly attribute FrozenArray<DOMString> actions;
  readonly attribute DOMString valueType; // "boolean", "double", "Float32Array", "short"
};

partial interface XRInputSource {
  readonly attribute DOMString id;  // Like "OpenVR Controller", "Oculus Touch". We provide this info so users can show matching 3D models.
  readonly attribute FrozenArray<XRInputBinding> bindings;
  object getBindings();  // Output the binding information in JSON format.

  // If it fails to bind the action, for example when this binding doesn't exist, it returns false.
  boolean registerActionBinding(DOMString action, DOMString binding);

  boolean getAction(DOMString action);
  double getAction(DOMString action);
  short getAction(DOMString action);
  Float32Array getAction(DOMString action);
  
  void setAction(DOMString action, double value);
  void setAction(DOMString action, boolean value);
  void setAction(DOMString action, Float32Array value);
};

I was thinking back to my previous experience implementing the Gamepad API to expose events from VR controllers. It took a lot of effort to handle the button/axis mapping for all possible controllers. So, I hope we can make the Gamepad API go back to its job as a low-level hardware API, meaning it would only expose the raw data from the hardware directly. Then, we can propose a more flexible high-level API for VR inputs, and I think an action-based API is a good way to go. We would let users register their actions against the bindings that browsers generate from the hardware at the beginning.

function init() {
  let input = xrSession.getInputSources()[0];
  let bindings = input.bindings;  // Getting all available bindings from the current input.

  // The binding format would be
  // "bodyPart: {hand, foot, etc}"/
  // "side: {left or right}"/
  // "from: {input or output}"/
  // "item: {btn, axis, joystick, trackpad}"/
  // "eventType: {click, touch, value}"
  input.registerActionBinding("menu_select", "hand/left/input/btn_A/click");
  input.registerActionBinding("trigger_press", "hand/left/input/trigger/value");
  input.registerActionBinding("shoot", "hand/left/input/trigger/value");
  input.registerActionBinding("pose_value", "hand/left/input/pose/value");
  input.registerActionBinding("bone_count", "hand/left/input/skeleton/count");
  input.registerActionBinding("bone_matrix", "hand/left/input/skeleton/transform");
  input.registerActionBinding("haptics_value", "hand/left/output/haptics/value");

  console.log(input.getBindings());
}

After the registration, the bindings would look like this:

[
  {
    "binding": "hand/left/input/btn_A/click",
    "actions": ["menu_select"],
    "valueType": "boolean"
  },
  {
    "binding": "hand/left/input/btn_A/touched",  // We didn't register it, so it shows no action.
    "valueType": "boolean"
  },
  {
    "binding": "hand/left/input/trigger/value",
    "actions": ["trigger_press", "shoot"],  // `actions` is an array, so a binding can be bound to multiple actions.
    "valueType": "double"
  },
  {
    "binding": "hand/left/input/pose/value",
    "actions": ["pose_value"],
    "valueType": "Float32Array"
  },
  {
    "binding": "hand/left/input/skeleton/count",
    "actions": ["bone_count"],
    "valueType": "short"
  },
  {
    "binding": "hand/left/input/skeleton/transform",
    "actions": ["bone_matrix"],
    "valueType": "Float32Array"
  },
  {
    "binding": "hand/left/output/haptics/value",
    "actions": ["haptics_value"],
    "valueType": "Float32Array"
  }
]

Then, we can operate these actions in game_loop()

function game_loop() {
  if (input.getAction("menu_select") == true) {
    // ...
  }

  let trigger_value = input.getAction("trigger_press");
  let pose = input.getAction("pose_value");  // Getting the pose position from the input.

  // If we are using a hand-like controller, e.g. Knuckles, we also have a chance to get the skeletal matrices from the controller.
  let bone_count = input.getAction("bone_count");
  let bone_matrix = input.getAction("bone_matrix");
  
  input.setAction("haptics_value", new Float32Array([0.1, 3000]));
}

My idea is that I would like to make the Web API as flexible as possible, and we can take time to work out the binding path rules as new hardware joins.

@johnshaughnessy

johnshaughnessy commented Oct 25, 2018

Given the upcoming discussion of this issue at TPAC 2018 (which I'll try to attend remotely), I wanted to share an upcoming change in the way we handle cross-device input in hubs (a webvr app) as a case study in how some of these issues are playing out currently. Sorry to submit this last minute -- I didn't realize this issue was scheduled to be readdressed in the near future.

While this example is not all-encompassing, I'm hoping that in capabilities and requirements it is representative of a variety of web apps we expect to see in the webvr space. Perhaps this example can help draw out some things that can make the next input-related Web API most desirable for users and developers.

Not everything is covered here (e.g. we do not handle haptic feedback), and the pattern is admittedly intertwined with some application-specific code. I hope the signal can be made out clearly through the noise.

One concern we have is that our pattern differs from others we've seen proposed in that actions don't belong to sets; instead, different bindings allow actions to be bound however the user wishes. We know that it's very possible to make a mistake in customizing the text of the binding definitions "by hand" (and not through a configuration tool or wizard, as is commonly done on other platforms). We don't yet understand the implications of the subtle differences between our notions of bindings, actions, and sets and those implemented by e.g. the SteamVR API and others.

So, below is what I wanted to share as a kind of case study. I hope it aids in the ongoing discussions and I am excited to learn more from everyone's experiences with this stuff!

@johnshaughnessy

johnshaughnessy commented Oct 25, 2018

Thanks @toji for the responses to my first chunk of feedback a few months ago. It helped me gain better awareness of the problems you posed around privacy for bindings and staying flexible enough to allow different WebVR backends to power the user experience.

As far as I can tell, nothing in the "case study" I referenced in the comment above would make it fundamentally incompatible with your initial proposal, namely

we provide a single, static mapping for all WebXR content

I still wonder about the two issues @dmarcos brought up: wanting to show controller models, and wanting to give specific haptic feedback when those capabilities are present in the devices.

@NellWaliczek NellWaliczek modified the milestones: CR for 1.0, Jan '19 F2F Nov 7, 2018
@dmarcos
Contributor

dmarcos commented Nov 13, 2018

Thanks everybody for the feedback. Just to make sure the info does not get lost, the two outstanding requirements we care about most are:

  • Ability to identify a specific controller (Oculus Touch vs Quest vs Vive...). Needed to load custom models, rearrange UI elements according to controller geometry, display instructions, tutorials... Now we use the Gamepad API id for that.
  • Haptics.

@NellWaliczek NellWaliczek modified the milestones: Jan '19 F2F, FPWD Jan 4, 2019
@NellWaliczek NellWaliczek self-assigned this Jan 4, 2019
@cwilso cwilso modified the milestones: FPWD, Next Working Draft Jan 9, 2019
@AdaRoseCannon AdaRoseCannon added the agenda Request discussion in the next telecon/FTF label Jan 9, 2019
@NellWaliczek NellWaliczek removed their assignment Jan 17, 2019
@kearwood
Contributor

One particular use case that needs this functionality is allowing expression with gestures in a multiuser networked environment. In this scenario, even if the UI only needs a “select” gesture, the other axes would be needed in order to render the representation of the user's hands on their avatar, visible to other users on the network.

In #336 I propose that a JSON file stored with the controller models on the CDN could reference the input values in this interface while mapping them to bone transforms within a skinned mesh model.

Perhaps this could be made more explicit in the WebIDL for these axes, buttons, etc. with an easily associated id value.
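
A very rough sketch of what such a JSON file might contain (all field names here are invented for illustration, not a proposed format):

{
  "model": "some-controller.glb",
  "inputMappings": [
    {
      "input": "trigger",
      "component": "value",
      "bone": "trigger_bone",
      "rotation": { "axis": [1, 0, 0], "minDegrees": 0, "maxDegrees": 20 }
    },
    {
      "input": "touchpad",
      "component": "touched",
      "node": "touchpoint_dot",
      "visibleWhen": true
    }
  ]
}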

@NellWaliczek NellWaliczek removed the agenda Request discussion in the next telecon/FTF label Feb 13, 2019
@toji
Member Author

toji commented Apr 1, 2019

This functionality has been added to the explainer, modulo a couple of questions that now have their own issues, and the spec changes are pending review (#553). Closing this in favor of the more granular issues.

@toji toji closed this as completed Apr 1, 2019
@toji
Member Author

toji commented Apr 1, 2019

Ah, yes. @ddorwin helpfully suggested that I should note that the solution we ultimately went with is not what was described above but instead a Gamepad-API centric approach that's detailed in #499
