Advanced WebXR Controller Input #392
From @fernandojsg on July 12, 2018 11:29 It's looking great Brandon, thanks for sharing it. Just a few comments:
I know that all the controllers I have seen so far have just one of these, but do you think it could be useful to define them as an array as you do with
I believe it's valuable to have that feature, both for joystick and touchpad.
Currently we're using touchpad and joystick values in [-1, 1] and analog buttons like triggers in [0, 1].
I'm happy to skip that too on the first iteration.
I know that we always try to define APIs as simply as possible and to remove any extra fancy feature that can be inferred from them. But I would like to hear opinions about adding explicit definitions for things like this when detecting the kind of value, or, as in my previous proposal, for the "is clickable" value. If we end up realizing that everyone using the API will need to do that type of test themselves, maybe it would be worth including a specific param for it? Thanks again for kicking this off!
From @johnshaughnessy on July 13, 2018 18:44 Link for the lazy: in the video @toji linked, Nick Whiting talks about input handling in OpenXR from https://youtu.be/U-CpA5d9MjI?t=28m15s until ~36:40.
I am concerned that this approach removes the benefits of the OpenXR-like input system.
with code written like this:
I would love to write my web application as if it were an OpenXR application.
One way I imagine these all to be satisfied is as follows: When the browser starts, it gives the runtime an action manifest for all of its own actions:
This action manifest would allow the browser (being an OpenXR application), to allow the user to click on things in the browser chrome and open menus and such. A real example would include more than this. Here I wanted to show that the browser would specify the actions it needs to operate in two different action sets ( When the browser navigates to a web application ("Foo"), the web application signals to the browser that it wants to use OpenXR-like input. The web app supplies its action manifest, which the browser composes with its own. The browser then reregisters its actions with OpenXR, potentially appearing as a different application (It goes from being called "Browser" to "Browser-Foo" in OpenXR). I don't know if this is possible, or if there's some way to make this possible! The browser submits a new action manifest to OpenXR:
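A purely hypothetical sketch of what that composed manifest could look like (the set names, action names, and manifest shape are all invented for illustration, loosely modeled on SteamVR-style action manifests; nothing here is a real API):

```javascript
// Invented sketch: the browser's own manifest plus the web app's manifest.
const browserManifest = {
  action_sets: [{ name: "/actions/browser_chrome", usage: "single" }],
  actions: [
    { name: "/actions/browser_chrome/in/click", type: "boolean" },
    { name: "/actions/browser_chrome/in/scroll", type: "vector2" }
  ]
};

const fooManifest = {
  action_sets: [{ name: "/actions/foo_gameplay", usage: "single" }],
  actions: [{ name: "/actions/foo_gameplay/in/jump", type: "boolean" }]
};

// The browser could compose the two before re-registering with the runtime
// (e.g. appearing as "Browser-Foo", per the description above).
const composedManifest = {
  action_sets: [...browserManifest.action_sets, ...fooManifest.action_sets],
  actions: [...browserManifest.actions, ...fooManifest.actions]
};
```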
What I don't know is whether the browser can appear as a different application to the runtime for each web application running inside / on top of it. If there's a way to do that, then users can store and share their favorite bindings for web applications the same way they do for native applications. There are some obvious hurdles to this approach, assuming it's possible. For example, if switching tabs back and forth unregisters and reregisters the browser with OpenXR, then OpenXR handles to actions held by the web application may be invalidated, and there may be significant overhead in doing the switch. If there's not a way to do that, that is unfortunate. Perhaps the browser could store this information on behalf of the user and suggest bindings to OpenXR whenever the user navigates to a new page, but this seems like a MUCH more difficult arrangement to pull off. In summary, I'm concerned that writing against
If we can find a way to share the benefits of the OpenXR runtime input system (_or an OpenXR-like input system), that would be ideal. From that perspective, potential answers to your questions:
The application specifies the need for a boolean action. The user can bind that action to the joystick click via the runtime.
Handled by the runtime as an analog value the user can bind to an analog action (or some filter).
The application declares an analog action with range [-1, 1] or [0, 1]. The runtime fulfills this need.
It would be nice to have. Replace Thank you @toji for opening up this discussion. I think this is a difficult problem, and it would benefit a lot of people (myself included) if solved well.
This is excellent feedback, @johnshaughnessy! Within the CG we've discussed a few of the items you've brought up, so I want to surface some of that discussion here to give more context. Please don't take it as a dismissal of your points, though!

First off, I completely agree that a browser which you interact with in AR/VR would provide its own OpenXR bindings for interactions such as clicking, scrolling, back, etc. The fact that OpenXR would make those remappable for the user is great for accessibility and forward compatibility, so wins all around!

As for the page mappings themselves, OpenXR provides what it calls "action sets" (see page 29 of the GDC slides), which would typically be used in something like a game to provide a different set of mappings between, say, gameplay and menu modes. Here we could create different action sets for traditional browsing and WebXR content, which generally covers the concerns you had about making the browser appear as a different application.

Given that, there are two approaches to how those action mappings could be applied. One is that we provide a single, static mapping for all WebXR content, which is basically the approach that my first post advocated for. We'd get a mapping that says something like:
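For illustration, such a static browser-wide mapping might amount to no more than a fixed table like this (the paths follow the OpenXR-style path syntax quoted later in this thread; the exact names and shape are invented):

```javascript
// Invented sketch: one fixed set of action-to-path bindings shared by every
// WebXR page, remappable only browser-wide.
const staticWebXRMapping = {
  triggerValue: "/user/hand/*/input/trigger/value",
  joystickValue: "/user/hand/*/input/joystick/value",
  touchpadValue: "/user/hand/*/input/trackpad/value",
  gripValue: "/user/hand/*/input/grip/value",
  menuClick: "/user/hand/*/input/menu/click"
};
```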
And every WebXR page would use it. This does allow some remapping, in that you can set browser-wide changes for the various inputs, but you couldn't set a custom mapping for just, say, threejs.org. The second approach, which is what you've implicitly suggested, is to have a separate action mapping per page. This feels attractive at first glance, but it leads to some tough technical and privacy issues.
Beyond those issues, however, we'd also run into the problem that in order to expose mapping of this type to JavaScript you'd likely have to either expose the OpenXR semantic path syntax directly or do something that is trivially and directly transformable into it. That carries an implication that we view OpenXR as the canonical backend for WebXR which, name similarities aside, is simply not the case. In order to avoid that and be more platform-agnostic, we'd probably end up creating a syntax that would require browser-developer intervention to enable new devices/paths as they became available, and that's not a meaningful improvement over simply having some static input types pre-mapped.

I haven't addressed every point you raised, but I'm going to have to leave it at that for now due to schedule constraints on my end. I'll try to leave additional comments soon to respond to anything I missed. Still, hopefully that gives some insight into why we've been reluctant to fully embrace the OpenXR input model on the web so far. (I'm overall pretty positive about that model for more traditional apps. This is just one of those areas, as with so many others in computing, where the web is weird.)
From @AlbertoElias on July 14, 2018 20:39 Thanks for kicking this off @toji. I like the initial proposal, and I like how a hand API would go side by side with this; controllers like Touch and Knuckles could have both. I do think it's important to have access to all possible information, such as how far the fingers are from the controller, and to haptics, which I think should have at least the same support as the current Gamepad API does, since some WebVR sites like Space Rocks are using it. I also think it's important not to use the same API that OpenXR offers, even though it's very versatile and a nice abstraction. It doesn't feel very webby to me, and it seems hard for people to get into. Maybe in the future it becomes the common way to interact with all possible XR input sources, and at that point we could create an API like that; since you're already looking into how this API works on top of the OpenXR one, it could be brought in more easily.
From @thetuvix on August 14, 2018 17:14 Thanks for writing this up, @toji! While it would be nice to expose some action-like system in the future, I believe this more literal mapping is the right path for now, given the issues you discussed above. Some comments on the details of the proposal:
One of the key advantages of this approach vs. the current Gamepad API is to give strong names to the axes: If we believe the

For example, the sample code currently checks for a 2-axis touchpad by seeing if the touchpad's value array has two axes. However, if a touchpad had only 1 axis, it's not clear whether that would imply a horizontal x touchpad or a vertical y touchpad. By testing for well-known attributes, we can make this explicit and make the code easier to read:

```webidl
interface XRInputStateDouble {
  readonly attribute boolean pressed;
  readonly attribute boolean touched;
  readonly attribute double value;
};

interface XRInputStateVector2 {
  readonly attribute boolean pressed;
  readonly attribute boolean touched;
  readonly attribute double x;
  readonly attribute double y;
};

interface XRControllerState {
  readonly attribute XRInputStateDouble? trigger;
  readonly attribute XRInputStateVector2? joystick;
  readonly attribute XRInputStateVector2? touchpad;
  readonly attribute XRInputStateDouble? grip;
  readonly attribute XRInputStateDouble? menu;
  readonly attribute XRInputStateDouble? a;
  readonly attribute XRInputStateDouble? b;
};

partial interface XRInputSource {
  readonly attribute XRControllerState? controllerState;
};
```

Combining explicit button attributes with the stronger state interfaces results in more readable WebXR input code:

```js
let inputSource = xrSession.getInputSources()[0];

if (inputSource.controllerState) {
  // Is a controller with buttons and stuff!
  let joystick = inputSource.controllerState.joystick;
  if (joystick) {
    // Has a 2-axis joystick!
    PlayerMove(joystick.x, joystick.y);
  }

  let jumpButton;
  if (inputSource.controllerState.a) {
    jumpButton = inputSource.controllerState.a;
  } else if (inputSource.controllerState.menu) {
    jumpButton = inputSource.controllerState.menu;
  }
  if (jumpButton && jumpButton.pressed) {
    PlayerJump();
  }

  // etc.
}
```

Also, while doing the feature tests inline with the values feels very webby, we should decide what it means for
From @AoiGhost on September 12, 2018 8:3 I think we will need hand-centric options for things like Leap Motion. That said, it would have to track a hand skeleton, and I'm not sure how to handle things like the SteamVR Knuckles. Also, the possibility of other biometric data being read, such as heartbeat tracking, should be considered as well; maybe that should be its own thing? If that were implemented, it would need a privacy/security review and permissions, for obvious reasons.
At our last F2F it was agreed that this issue will need to be addressed in the timeline of the first version of WebXR to reach Recommendation status. As such, it has been moved into the WebXR repo. The tool used to migrate this issue has some unfortunate side effects (issues show as filed by the person doing the move, and a "From foo on date/time" line is added at the top).
Thanks for the proposal and for reconsidering the previous API. Super appreciated. After a quick review:
In A-Painter and Supercraft, for instance, we rely on the Gamepad id to load custom models, show instructions, position UI, and configure raycaster origins and angles according to each controller. The number of buttons, joysticks, and touchpads is not a reliable way to identify a controller. E.g., today's 3DOF controllers all have the same inputs (trigger, touchpad, button). An
Many applications make different input choices for each hand; for instance, A-Painter uses one hand for the color palette and the other for the brush, so each hand has a different button mapping. Is it possible to add a
This is an interesting question. An application might want to map input based on button capabilities. In Supermedium, for instance, we map the browser menu to a long press on the Oculus Touch joystick. The logic now relies on the Gamepad id, but a way to check for capabilities would be simpler and more elegant.
Adding another [0, 1] pair somewhere as @fernandojsg suggested will work.
I expect 0 to be the resting position. Joysticks and touchpads range over [-1, 1] and buttons over [0, 1]. I think it would be implied by the input name, if I'm not missing anything.
We use controller vibration extensively in Supercraft / Supermedium. It makes a big difference in increasing immersion and is very useful for guiding the user in non-visual ways. We would love to help find a way to keep the functionality.
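Under @toji's proposed shape, capability checks like the ones described above could reduce to simple null tests on `controllerState` members, since the proposal exposes an unavailable input as null. A sketch (the `describeController` helper is invented for illustration):

```javascript
// Sketch: detect controller capabilities by testing the proposed
// controllerState members, instead of parsing a Gamepad id string.
// Per the proposal, a member is null when its binding is unavailable.
function describeController(inputSource) {
  const state = inputSource.controllerState;
  if (!state) return { isController: false };
  return {
    isController: true,
    hasJoystick: state.joystick !== null,
    hasTouchpad: state.touchpad !== null,
    hasAnalogTrigger: state.trigger !== null
  };
}
```

This wouldn't replace the id-based model loading dmarcos describes, but it would cover the "map input based on button capabilities" case without string matching.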
I would like to propose this action-based IDL:

```webidl
interface XRInputBinding {
  readonly attribute DOMString binding;
  readonly attribute DOMString[] actions;
  readonly attribute DOMString valueType; // "boolean", "double", "Float32Array", "short"
};

partial interface XRInputSource {
  readonly attribute DOMString id; // Like "OpenVR Controller" or "Oculus Touch"; lets pages show matching 3D models.
  readonly attribute sequence<XRInputBinding> bindings;
  object getBindings(); // Outputs the binding information in JSON format.
  // Returns false if it fails to bind the action, for example when the
  // requested binding does not exist.
  boolean registerActionBinding(DOMString action, DOMString binding);
  boolean getAction(DOMString action);
  double getAction(DOMString action);
  short getAction(DOMString action);
  Float32Array getAction(DOMString action);
  void setAction(DOMString action, double value);
  void setAction(DOMString action, boolean value);
  void setAction(DOMString action, Float32Array value);
};
```

I was thinking of my previous experience implementing the Gamepad API to expose events from VR controllers. We took a lot of effort to handle the button/axis mapping for all possible controllers. So, I hope we can make

```js
function init() {
  let input = xrSession.getInputSources()[0];
  let bindings = input.bindings; // Get all available bindings from the current input.
  // The binding format would be:
  //   "bodyPart: {hand, foot, etc}" /
  //   "side: {left or right}" /
  //   "from: {input or output}" /
  //   "item: {btn, axis, joystick, trackpad}" /
  //   "eventType: {click, touch, value}"
  input.registerActionBinding("menu_select", "hand/left/input/btn_A/click");
  input.registerActionBinding("trigger_press", "hand/left/input/trigger/value");
  input.registerActionBinding("shoot", "hand/left/input/trigger/value");
  input.registerActionBinding("pose_value", "hand/left/input/pose/value");
  input.registerActionBinding("bone_count", "hand/left/input/skeleton/count");
  input.registerActionBinding("bone_matrix", "hand/left/input/skeleton/transform");
  input.registerActionBinding("haptics_value", "hand/left/output/haptics/value");
  console.log(input.getBindings());
}
```

After the registration, the bindings would look as below:
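As a guess at the JSON `getBindings()` might emit after the registrations above (the exact shape is invented; the binding paths come from the snippet, and the value types are inferred from the IDL comment):

```javascript
// Invented sketch of getBindings() output; paths match the
// registerActionBinding() calls, value types are inferred guesses.
const bindings = {
  menu_select:   { binding: "hand/left/input/btn_A/click",        valueType: "boolean" },
  trigger_press: { binding: "hand/left/input/trigger/value",      valueType: "double" },
  shoot:         { binding: "hand/left/input/trigger/value",      valueType: "double" },
  pose_value:    { binding: "hand/left/input/pose/value",         valueType: "Float32Array" },
  bone_count:    { binding: "hand/left/input/skeleton/count",     valueType: "short" },
  bone_matrix:   { binding: "hand/left/input/skeleton/transform", valueType: "Float32Array" },
  haptics_value: { binding: "hand/left/output/haptics/value",     valueType: "Float32Array" }
};
```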
Then, we can operate these actions in the game loop:

```js
function game_loop() {
  if (input.getAction("menu_select") === true) {
    // ...
  }
  let trigger_value = input.getAction("trigger_press");
  let pose = input.getAction("pose_value"); // Get the pose position from the input.
  // If we are using a hand-like controller (e.g. Knuckles), we also have a
  // chance to get the skeletal matrices from the controller.
  let bone_count = input.getAction("bone_count");
  let bone_matrix = input.getAction("bone_matrix");
  input.setAction("haptics_value", new Float32Array([0.1, 3000]));
}
```

My idea is to make the Web API as flexible as possible; we can then take our time evolving the binding path rules as new hardware arrives.
Given the upcoming discussion of this issue at TPAC 2018 (which I'll try to attend remotely), I wanted to share an upcoming change in the way we handle cross-device input in Hubs (a WebVR app) as a case study in how some of these issues are playing out currently. Sorry to submit this last minute -- I didn't realize this issue was scheduled to be readdressed in the near future.

While this example is not all-encompassing, I'm hoping that in capabilities and requirements it is representative of a variety of web apps we expect to see in the WebVR space. Perhaps this example can help draw out some things that would make the next input-related Web API most desirable for users and developers. Not everything is covered here (e.g. we do not handle haptic feedback), and the pattern is admittedly intertwined with some application-specific code; I hope the signal can be made out clearly through the noise.

One concern is that our pattern differs from others we've seen proposed in that actions don't belong to sets; instead, different bindings allow actions to be bound however the user wishes. We know it's very possible to make a mistake when customizing the text of the binding definitions "by hand" (and not through a configuration tool or wizard, as is commonly done on other platforms). We don't yet understand what the implications of the subtle differences between our notions of bindings, actions, and sets are relative to those implemented by e.g. the SteamVR API and others.

So, below is what I wanted to share as a kind of case study. I hope it aids the ongoing discussions, and I'm excited to learn more from everyone's experiences with this stuff!
Thanks @toji for the responses to my first chunk of feedback a few months ago. It helped me gain better awareness of the problems you posed around privacy for bindings and around staying flexible enough to allow different backends to power the user experience. As far as I can tell, nothing in the "case study" I referenced in the comment above would make it fundamentally incompatible with your initial proposal, namely
I still wonder about the two issues @dmarcos brought up: wanting to show controller models, and wanting to give specific haptic feedback when those capabilities are present in the devices.
Thanks everybody for the feedback. Just to make sure the info does not get lost, the two outstanding requirements we care the most about are:
One particular use case that needs this functionality is allowing expression with gestures in a multiuser networked environment. In this scenario, even if the UI only needs a "select" gesture, the other axes would be needed in order to render the representation of the user's hands on their avatar, visible to other users on the network. In

Perhaps this could be made more explicit in the WebIDL for these axes, buttons, etc. with an easily associated id value.
This functionality has been added to the explainer, modulo a couple of questions that now have their own issues, and the spec changes are pending review (#553). Closing this in favor of the more granular issues. |
From @toji on July 11, 2018 21:53
Given feedback from multiple developers citing concerns about our initial plans for a limited input system, the WebXR community group wanted to re-open a discussion about how to handle more advanced input. We agreed on a recent call to put together a proposal for how such a system might work so that we can iterate on the design in public and gather feedback from relevant parties (such as the OpenXR working group and the W3C Gamepad CG).
With WebVR, we exposed VR controller state through extensions to the Gamepad API. We feel that building this new advanced input on top of the Gamepad API as it exists today is problematic, though, for a couple of reasons:
That said, we don't want to re-invent wheels that we don't have to, so we're open to further discussing this proposal with the individuals maintaining the Gamepad API at the moment to see if there's common ground that can be reached that isn't detrimental to this use case.
Proposed IDL
And some really brief sample code:
These snippets are not intended to be taken verbatim, but are meant to serve as a concrete starting point for further conversation.
Mapping to native APIs
One of our primary concerns when structuring this API is ensuring that it can work successfully on top of OpenXR, which we expect to power a non-trivial amount of the XR ecosystem at some future date. As explained in the Khronos Group's GDC session, that API is currently planning on exposing an input system that revolves around binding actions to recommended paths (i.e. `/user/hand/left/input/trigger/click`), which the system can then re-map as needed.
We would expect the above interface to be implemented on top of an OpenXR-like system by creating actions that are a fairly literal description of the expected input type, mapped to the "default" paths for that input type when available (i.e. `controllerState.trigger.value[0]` is backed by an action binding called "triggerValue" with a default path of `/user/hand/*/input/trigger/value`). If the binding fails, that particular input is set to null to indicate it's not present. This is a fairly rigid use of the binding system, but it does allow users to do some basic, browser-wide remapping when needed.

For any other native API, the inputs are usually delivered as a simple struct of values, which are trivial to map to this type of interface.
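In browser-implementation pseudocode, that backing strategy might look roughly like this (a sketch only; `openxrSuggestBinding` is a hypothetical stand-in for whatever binding call the real runtime exposes, not an actual OpenXR API):

```javascript
// Sketch of how a browser might back a controllerState member on an
// OpenXR-like runtime. openxrSuggestBinding() is hypothetical, not a real API.
function bindTriggerValue(openxrSuggestBinding) {
  const bound = openxrSuggestBinding("triggerValue",
                                     "/user/hand/*/input/trigger/value");
  // Per the proposal: if the binding fails, expose null so pages can
  // feature-test the input with a simple null check.
  return bound ? { value: [0] } : null;
}
```

Pages would then see `controllerState.trigger === null` on hardware where the binding could not be satisfied.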
Random thoughts/comments

- The `value` array maps well to OpenXR's idea of actions as vectors. I like that the length makes it easy to test what kind of value you're dealing with: 0 == boolean, 1 == scalar, 2+ == vector.
- `value` elements should all be normalized to a [-1, 1] range where 0 is the neutral value.
- The `controllerState` name is very intentionally exclusive of things that are not controller-shaped. I would expect alternative tracked inputs like hands to omit the controller state altogether and rely solely on select events for now. I think if we want more than that we'll need hand-centric input state.
- As for `XRInputSource`, I like the idea of keeping those kinds of states bundled together in discrete, easily testable interfaces that live side-by-side under the input source.

Questions

- Does `value` have a range of [-1, 1] or [0, 1]? Is it implied by the number of values? By the input name?

Copied from original issue: immersive-web/proposals#17
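The length convention described above (0 == boolean, 1 == scalar, 2+ == vector) could be tested with a tiny helper (a sketch; `classifyValue` is an invented name):

```javascript
// Sketch: classify an input's value array by length, per the convention
// 0 == boolean-only, 1 == scalar, 2+ == vector.
function classifyValue(value) {
  if (value.length === 0) return "boolean";
  if (value.length === 1) return "scalar";
  return "vector";
}
```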