Indicating how to render an input source model #336

Closed
toji opened this issue Mar 25, 2018 · 5 comments

@toji
Member

toji commented Mar 25, 2018

The topic of how/if we indicate to developers what to render for handheld input sources has come up repeatedly. (For example, see comments from @Artyom17 here and @leweaver here.)

It was previously my feeling that we should omit any information about controller models for V1 of the API and try to address it more robustly in a followup. But Lewis' prototype revealed an interesting quirk: AR headsets like HoloLens may need to expose a 'hand' input source that has a valid gripMatrix, and yet not want a generic controller model rendered over the top of the user's actual hand. This makes a lot of sense, and it's probably best to address that scenario early.

We could easily handle that specific case with a renderable boolean on the XRInputSource. However, since we know that the request to give some sort of indication of the controller type is a popular one, it may make more sense to address both cases with a single mechanism.

It would be nice to expose a renderable mesh directly to the page, but there are significant logistical issues with doing so. Where do the meshes come from? What format are they delivered in? How should they be animated? Etc. It seems more realistic to instead expose some form of well-formatted ID to indicate what type of device, if any, should be rendered.

In WebVR the Gamepad id served this purpose, but there were several issues that made it difficult to use:

  • The name needed to serve double duty as a human readable string and mesh ID.
  • Different browsers exposed different strings for the same controllers.
  • Hands would have had to be given a name as well, making it more error-prone to filter them out of rendering.

If we're going to expose something similar we can and should do better.

For the sake of discussion I'm going to assume that we'd expose a 'renderModelName' string attribute on XRInputSource. (Name subject to bikeshedding) I'd propose that it have the following properties:

  • If the input source should not be rendered (hands in AR) it should be empty string.
  • If the input source should be rendered, but for whatever reason the device type isn't known, a predetermined string ('unknown device'?) should be used.
  • Ideally we want to be able to use strings surfaced by the underlying APIs when available, but we'd want to do so in a way that at least attempts to normalize it. Things like specifying that the names should always be lower case, and that they should prefer a format of 'vendor, model' would be a start.
  • The string shouldn't contain information that would be duplicated in the XRInputSource or XRInputPose. Specifically, something like handedness should be omitted. (So 'oculus, touch' rather than 'oculus, left touch' even though the left and right controllers are different devices with different meshes.)
  • If needed we may have to maintain a table of canonical strings for various devices.

The end goal being to make it as practical as possible for developers to maintain a CDN of controller meshes that work cross-browser.
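
As a rough illustration of the properties listed above, here is a minimal TypeScript sketch of how a page might interpret such a string. Everything in it is an assumption made for discussion: `renderModelName` is only the proposed attribute name, `"unknown device"` is the placeholder sentinel floated above, and `normalizeRenderModelName` and `decideRendering` are hypothetical helpers, not part of any API.

```typescript
// Hypothetical sentinel for "render something, but the device type is unknown".
const UNKNOWN_DEVICE = "unknown device";

type RenderDecision =
  | { kind: "none" } // e.g. hands in AR: draw nothing
  | { kind: "generic" } // draw a placeholder controller
  | { kind: "specific"; vendor: string; model: string };

// Normalize a raw vendor/model pair into the proposed lower-case 'vendor, model' form.
function normalizeRenderModelName(vendor: string, model: string): string {
  const clean = (s: string): string => s.trim().toLowerCase().replace(/\s+/g, " ");
  return `${clean(vendor)}, ${clean(model)}`;
}

// Decide what a page should draw for a given (hypothetical) renderModelName.
function decideRendering(renderModelName: string): RenderDecision {
  if (renderModelName === "") return { kind: "none" };
  if (renderModelName === UNKNOWN_DEVICE) return { kind: "generic" };

  const comma = renderModelName.indexOf(",");
  if (comma < 0) return { kind: "generic" }; // not in 'vendor, model' form

  const vendor = renderModelName.slice(0, comma).trim();
  const model = renderModelName.slice(comma + 1).trim();
  if (vendor === "" || model === "") return { kind: "generic" };
  return { kind: "specific", vendor, model };
}
```

Handedness is deliberately absent from the string; a page would combine the parsed vendor and model with the handedness reported elsewhere on the input source to pick the left or right mesh.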

Also, this is prime fingerprinting material, so we'd want to take steps to mitigate that. To that end I think we should enforce that XRInputSources are only allowed to have a renderModelName of empty string or unknown in a non-exclusive session. That way there's at least a requirement that the page have been given a user gesture before it can start delivering fingerprintable information. And if there was a particular browser that was still concerned about exposing that information they could easily report empty string or 'unknown' in all cases.
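
The gating rule described here could be sketched on the user agent side roughly as follows. This is an illustrative sketch only: `SessionInfo`, `UaPolicy`, `exposedRenderModelName`, and the `"unknown device"` sentinel are hypothetical names, not anything defined by the API.

```typescript
// Hypothetical sentinel string; the actual value was left open in the proposal.
const UNKNOWN_MODEL_NAME = "unknown device";

interface SessionInfo {
  exclusive: boolean; // true once the page holds an exclusive session
}

interface UaPolicy {
  hideDeviceIdentity: boolean; // a browser may choose to never expose the device type
}

// Return the string the page is allowed to see for an input source.
function exposedRenderModelName(
  actualName: string,
  session: SessionInfo,
  policy: UaPolicy = { hideDeviceIdentity: false },
): string {
  // Sources that should not be rendered stay empty in every kind of session.
  if (actualName === "") return "";
  // Non-exclusive sessions, or a cautious UA, only ever reveal the sentinel.
  if (!session.exclusive || policy.hideDeviceIdentity) return UNKNOWN_MODEL_NAME;
  return actualName;
}
```

With this shape, a page can still tell "render nothing" apart from "render something generic" before an exclusive session starts, without learning which device is attached.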

Those are my initial thoughts on the matter. I'm curious what others think and whether anyone has more concrete suggestions for ensuring stable controller names across implementations.

@thetuvix
Contributor

thetuvix commented Mar 27, 2018

You mentioned that one tension with Gamepad's id was that it tried to serve as both a human-readable name and as a machine-readable key for lookup purposes.

To break out of that here and help with normalization, perhaps we just double down on the purpose of this attribute being specifically for render model lookup in a CDN or other asset library. Especially if we intend to return null or empty string when the controller can be seen, apps can't count on this name to be a human-readable source name to show in a list, since on AR headsets you'd get no string for the list. Therefore, we can go all the way in picking a canonical naming pattern here that optimizes for lookup without being human-readable.

Strawman suggestion:

  • For AR sources where the UA discourages rendering: empty string
  • For VR controllers: <vendor-id>-<product-id> (e.g. 045E-065D)
  • For VR hands (e.g. Leap Motion Orion): hand (app can use any preferred hand model to render a hand at gripMatrix)

If we can agree across vendors that a controller with a given VID+PID+handedness will always have the same physical form (we've committed to this for Windows MR controllers), this could provide a reliable id by which to index into a CDN to find a left and right render model.
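
To make the lookup concrete, here is a hedged TypeScript sketch of indexing into an asset library by VID+PID and handedness. The CDN layout (`<base>/<VID>-<PID>/<hand>.glb`), the `.glb` extension, and the `renderModelUrl` helper are all assumptions for illustration; no such layout has been agreed on.

```typescript
type Handedness = "left" | "right";

// Four hex digits, a dash, four hex digits, e.g. "045E-065D".
const VID_PID_PATTERN = /^[0-9a-f]{4}-[0-9a-f]{4}$/i;

// Map the strawman identifier to a model URL, or null when nothing should be fetched.
function renderModelUrl(cdnBase: string, id: string, hand: Handedness): string | null {
  const base = cdnBase.replace(/\/+$/, ""); // drop trailing slashes
  if (id === "") return null; // AR source: the UA discourages rendering
  if (id === "hand") return `${base}/hand/${hand}.glb`; // app-chosen hand model
  if (!VID_PID_PATTERN.test(id)) return null; // unrecognized format
  // Upper-case the key so "045e-065d" and "045E-065D" resolve to the same asset.
  return `${base}/${id.toUpperCase()}/${hand}.glb`;
}
```

For example, `renderModelUrl("https://cdn.example/models/", "045e-065d", "left")` would resolve to `https://cdn.example/models/045E-065D/left.glb` under this assumed layout.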

For today's WebVR input, Gamepad.id serves one other hacky purpose: it also provides a prefix like "Spatial Controller" that gives a hint to the app what the button/axis mapping is for that controller. That hint is important even just for controller rendering, so that the WebVR middleware knows how to articulate the parts of the controller based on the underlying VR SDK's mapping for Gamepad buttons and axes. That's why the Babylon.js CDN path for Microsoft's 045E-065D controller, for example, sits within a "microsoft" directory next to "oculus" and "vive" directories: the internal node hierarchy of the glTF model is particular to the well-known mapping for a Windows MR controller.

Since the new simplified input model pares all of this back to just select events, apps won't be able to fully articulate the render model yet anyway, and so a CDN would just need to establish a common format for a model with its origin at gripMatrix that can perhaps articulate its select control. Whatever full input system we introduce down the line will then take as a key design goal full controller representation without the fragmentation we have in WebVR today, which can hopefully then buy middleware out of having to separate "microsoft" models from "oculus" models, relying instead just on the VID/PID lookup.

@Artyom17

Artyom17 commented Apr 3, 2018

Thanks, Brandon and Alex, happy to see movement in the right direction ;) I like the idea with "" identifier (without a dash, case insensitive too). It will not reduce fingerprinting, but it will eliminate the issue we recently had, when accidentally the name of the controller was changed from "Gear VR controller" to "GearVR controller" and some frameworks stopped working with it in our browser.
Another question: are we going to include the "known-to-actual name" directory in the standard, and how is it going to be updated?
How exactly are we (or who else?) going to maintain the CDN with the glTF (?) models of the controllers?

@NellWaliczek
Member

Related to issue #392

@NellWaliczek NellWaliczek added this to the TPAC 2018 milestone Sep 12, 2018
@NellWaliczek NellWaliczek modified the milestones: TPAC 2018, FPWD for 1.0 Oct 5, 2018
@NellWaliczek NellWaliczek modified the milestones: FPWD for 1.0, CR for 1.0 Oct 12, 2018
@NellWaliczek NellWaliczek modified the milestones: CR for 1.0, Jan '19 F2F Nov 7, 2018
@NellWaliczek NellWaliczek modified the milestones: Jan '19 F2F, FPWD Jan 4, 2019
@cwilso cwilso modified the milestones: FPWD, Next Working Draft Jan 9, 2019
@cwilso cwilso added the agenda Request discussion in the next telecon/FTF label Jan 9, 2019
@NellWaliczek NellWaliczek removed their assignment Jan 17, 2019
@kearwood
Contributor

May I add that there may be multiple permutations of meshes representing a single physical device. In particular, there may be a representation of a user's hand (e.g. with Oculus Touch), another representing just the controller, and perhaps one with both combined.

Perhaps various LOD levels may be needed in the case where a multi user environment includes representation of the controllers or hands from other users on the network.

Expressing the whole taxonomy in a single string could lead to something like the X logical font description string, which IMHO is a bit confusing:

https://wiki.archlinux.org/index.php/X_Logical_Font_Description

To ensure that this is extensible, could the Gamepad.id instead reference a JSON-formatted file unique to each vendor+device combo?

In addition to providing the metadata informing the selection of a model to display, perhaps the json file could also assist in mapping input axis/button indices to bone transforms / slerps for articulated skinned mesh models.
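
One possible shape for such a file, sketched in TypeScript, is shown below. The schema (`DeviceProfile`, `ButtonArticulation`) and the `articulate` helper are purely hypothetical, and for brevity the sketch interpolates node translations linearly rather than slerping bone rotations for a skinned mesh.

```typescript
type Vec3 = [number, number, number];

// Hypothetical schema for a per-device JSON descriptor.
interface ButtonArticulation {
  buttonIndex: number; // index into the gamepad's button array
  nodeName: string; // node in the mesh to move
  min: Vec3; // node translation when the button value is 0
  max: Vec3; // node translation when the button value is 1
}

interface DeviceProfile {
  id: string; // vendor+device key
  models: { [variant: string]: string }; // e.g. "controller", "hand", "combined" -> mesh path
  articulations: ButtonArticulation[];
}

// Linear interpolation between two translations, with t clamped to [0, 1].
function lerp3(a: Vec3, b: Vec3, t: number): Vec3 {
  const c = Math.min(1, Math.max(0, t));
  return [a[0] + (b[0] - a[0]) * c, a[1] + (b[1] - a[1]) * c, a[2] + (b[2] - a[2]) * c];
}

// Compute a translation for each articulated node from the current button values.
function articulate(profile: DeviceProfile, buttonValues: readonly number[]): Map<string, Vec3> {
  const poses = new Map<string, Vec3>();
  for (const a of profile.articulations) {
    const value = buttonValues[a.buttonIndex] ?? 0; // missing buttons count as released
    poses.set(a.nodeName, lerp3(a.min, a.max, value));
  }
  return poses;
}
```

The `models` map is where the mesh permutations mentioned above (hand only, controller only, combined, or different levels of detail) could be listed without encoding them into a single identifier string.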

@NellWaliczek NellWaliczek removed the agenda Request discussion in the next telecon/FTF label Feb 13, 2019
@cwilso cwilso modified the milestones: Next Working Draft, CR for 1.0 Feb 27, 2019
@toji toji modified the milestones: CR for 1.0, Next Working Draft Feb 27, 2019
@toji
Member Author

toji commented Apr 1, 2019

Issue of HOW the models are selected is resolved: It will be communicated via the Gamepad.id. The exact formatting of that string is still in question, and covered in #550. Closing.
