AR extensions and modifications #254

Closed
dmarcos opened this Issue Jul 5, 2017 · 26 comments

dmarcos commented Jul 5, 2017

I would like to start a conversation on what new APIs or modifications to the existing WebVR spec we would need to cover AR use cases. Some work has been done to make coordinate systems more flexible, and AR tracking systems calculate a head pose similar to VR headsets, so I think we are covered by the current API there. What other things do we need that are specific to AR?

  • A video frame synced with each pose?
  • Information about the environment: light estimation, detected planes, environment mesh, depth information...

I'm CCing some people who have been thinking about this: @toji @judax @fernandojsg @blairmacintyre @adaroseedwards

@dmarcos dmarcos added the enhancement label Jul 5, 2017

huningxin commented Jul 5, 2017

Please add me to the CC list; I am very interested in this topic. I am also editing mediacapture-depth, which exposes depth information through getUserMedia. Providing pose-synced depth video frames might be useful for AR.
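For readers unfamiliar with mediacapture-depth, here is a minimal sketch of what requesting a depth stream might look like, assuming the draft's proposed `videoKind` constraint (the exact constraint surface has shifted between draft revisions, so treat this as illustrative rather than normative):

```js
// Sketch only: request a depth video track per the mediacapture-depth draft.
// `videoKind: 'depth'` follows one revision of the draft; it is an assumption here.
async function getDepthStream() {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { videoKind: { exact: 'depth' } }
  });
  const [track] = stream.getVideoTracks();
  console.log('depth track settings:', track.getSettings());
  return stream;
}
```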

dmarcos commented Jul 5, 2017

@huningxin you found the issue and commented, so you will get the notifications :) The CC is to give people a mention notification and make them aware of the issue.

dmarcos commented Jul 5, 2017

I see getUserMedia as an avenue to provide some of the environment information.

jsantell commented Jul 10, 2017

I'm unfamiliar with the mediacapture-depth spec, so great info, @huningxin!
With the work we've done in chromium-webar, we've been prototyping AR additions to the spec, and we have some thoughts on the camera with getUserMedia.

  • In our prototypes with Chromium and WebKit, we couldn't run Tango/ARKit on a mobile device while the browser is using getUserMedia, due to competing camera access between the two -- how would mediacapture-depth work in this scenario?

  • The chromium-webar prototype has a workaround to pass in a reference to texImage2D so that the data always lives on the GPU and doesn't need to be pushed from the CPU every frame. Does texImage2D with a video stream via getUserMedia benefit from the GPU-to-GPU flow? I'm not familiar with this part of the implementation in browsers, or with whether we need to buffer the data to the GPU on every frame -- looking at some bugs, it looks like this may already be implemented for most formats, at least on Chrome (@judax is more familiar with this work).
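For reference, the standard-web path under discussion looks roughly like the sketch below (it assumes a `<canvas>` element on the page); the GPU-to-GPU question above is about what happens inside the `texImage2D` call, which is implementation-defined:

```js
// Upload the latest getUserMedia video frame into a WebGL texture each frame.
// Whether this stays on the GPU or bounces through the CPU depends on the
// browser and the pixel format -- that is the open question above.
const video = document.createElement('video');
navigator.mediaDevices.getUserMedia({ video: true }).then((stream) => {
  video.srcObject = stream;
  return video.play();
});

const gl = document.querySelector('canvas').getContext('webgl');
const texture = gl.createTexture();

function onFrame() {
  if (video.readyState >= video.HAVE_CURRENT_DATA) {
    gl.bindTexture(gl.TEXTURE_2D, texture);
    // NPOT video textures in WebGL 1 need clamped wrap and no mipmaps.
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);
  }
  requestAnimationFrame(onFrame);
}
requestAnimationFrame(onFrame);
```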

blairmacintyre commented Jul 17, 2017

I personally don't think leveraging mediacapture-depth directly is the right path, if we're talking about doing webar in a similar way to webvr (or as an extension). Some of the features of mediacapture-depth may very well be useful; it may be that we want to leverage some of the ideas in the extension proposal (camera specs, ways of presenting RGBD data, etc).

My assumption has been that to integrate vision / sensor processing in webvr/webar, we would want to follow a path similar to the one you took with chromium-webar (as I've talked with @judax about in the past): expose the underlying data, with synchronized timestamps, as efficiently as possible in the rAF callback loop (either to the rAF method, as you did, or before the rAF is called, in a worker or other callback).

Beyond just dealing with underlying conflict (i.e., multiple "parts of the system" wanting to control the camera), we will want to have efficient ways of managing data flow, integration of platform APIs (Tango, ARKit, Windows Holographic, etc) with custom user code, etc.
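A purely illustrative sketch of the kind of per-frame delivery described above, assuming a `vrDisplay` and an app-level `render` function. `getARFrameData`, `cameraTexture`, and `pointCloud` are hypothetical names invented for this sketch; only `VRFrameData`, `getFrameData`, and the display's `requestAnimationFrame` come from WebVR 1.1:

```js
// Hypothetical sketch: pose, camera frame, and depth data delivered together,
// sharing one timestamp, inside the rAF loop.
const frameData = new VRFrameData(); // WebVR 1.1

function onVRFrame() {
  vrDisplay.getFrameData(frameData);          // pose (WebVR 1.1)
  const arFrame = vrDisplay.getARFrameData(); // hypothetical AR extension
  // Everything in arFrame would share arFrame.timestamp with the pose, so
  // application code never has to correlate camera pixels with a pose itself.
  render(frameData.pose, arFrame.cameraTexture, arFrame.pointCloud); // app code
  vrDisplay.requestAnimationFrame(onVRFrame);
}
vrDisplay.requestAnimationFrame(onVRFrame);
```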

judax commented Jul 25, 2017

I agree with @blairmacintyre. My approach to the problem is to expose the MVP and make things as easy as possible. IMHO there are 3 elements that are needed:

  1. Pose estimation: provided by WebVR.
  2. See-through camera feed rendering: it should be as simple as possible, both for the developer and for the user (requesting permission on every AR page might not be the best option). The synchronization of the timestamp with the pose could be done internally by the user agent, so the feed composition could also be done internally in the rAF call.
  3. Placement of virtual objects in the real world: position and normal of the hit point, via a hit-test call similar to what both Tango and ARKit provide (sketched below).

Of course, other very important issues are left aside, like plane detection and advanced features such as depth, markers, mesh reconstruction, etc., but my approach is to start simple and slow, with a set of features/capabilities that I do not foresee changing or being removed in the future, just improved.

We are working on an explainer, soon to be released, so we can all comment on the proposal.
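To make item 3 concrete, here is a sketch of what such a hit-test call might look like. `vrSession.hitTest`, the result shape, and `placeVirtualObject` are hypothetical, loosely modeled on the Tango and ARKit calls mentioned above rather than on any published spec:

```js
// Hypothetical hit test: cast a ray from a normalized screen point into the
// real world and place content at the first hit.
const canvas = document.querySelector('canvas');

async function placeAtTap(normalizedX, normalizedY) {
  const hits = await vrSession.hitTest(normalizedX, normalizedY); // hypothetical
  if (hits.length === 0) return;
  const { position, normal } = hits[0]; // point and surface normal of the hit
  placeVirtualObject(position, normal); // app-level helper
}

canvas.addEventListener('click', (e) => {
  placeAtTap(e.clientX / canvas.clientWidth, e.clientY / canvas.clientHeight);
});
```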

blairmacintyre commented Jul 25, 2017

It's good to see you adopting the "render and compose reality inside the user agent" approach; I think that's necessary for protecting user privacy (and enabling other performance ideas) down the road.

I would really like to see folks adopt a concept of Anchors (or reference points, or whatever) that are abstract and could change: the trivial version of this is easy to implement, but it forces software created for the web to work with this concept. In AR, everything should be attached to an anchor that the underlying system has some control over, so that as the sensing systems change their understanding of the world, the content is moved with it. Windows Holographic and ARKit both do this; argon.js has done this for years. We want to encourage libraries for AR to adopt this constraint.

Also, I think we need to address the pesky idea of "the real world", namely how the coordinate frames used in WebAR align with the real world. In argon.js, we try as best we can to align coordinates with the real world -- our baseline is "y is aligned with our best guess for up in the real world", but when possible we try to align xyz with East-Up-South. ARKit is the first commercial library to support this, which is great. Windows Holographic should (but, inexplicably, doesn't). Beyond that, in argon.js we try (as best as possible) to align the local coordinate frame with geolocation (using the location APIs). Obviously, this alignment is approximate (especially indoors), but by doing it in one central place, developers who want to include geo-referenced content aren't forced to do this themselves (and, as with other things, the underlying system can leverage all the knowledge it has to do these alignments).

There are other things I'd like to see in WebAR; these are very specific (and easy to do) things that we need to get in now, so that higher-level libraries are created with them in mind.
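An illustrative sketch of the anchor constraint argued for above. `session.addAnchor`, the anchor's `modelMatrix`, its `update` event, and the app-level `scene`, `makeModel`, `hitPosition`, and `hitOrientation` are all hypothetical names, not drawn from any of the systems mentioned:

```js
// Hypothetical anchors: content is attached to an anchor, and the system may
// move the anchor as its understanding of the world improves.
const anchor = session.addAnchor(hitPosition, hitOrientation); // hypothetical
const node = scene.add(makeModel());                           // app-level scene graph
node.setMatrix(anchor.modelMatrix);

anchor.addEventListener('update', () => {
  // The tracker refined its map; re-place the content from the anchor rather
  // than keeping a fixed world pose.
  node.setMatrix(anchor.modelMatrix);
});
```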

bferns commented Jul 26, 2017

Microsoft's documentation has some details on how they handle concepts around AR-specific features in coordinate systems (and how they scale from local to world scale).

https://developer.microsoft.com/en-us/windows/mixed-reality/coordinate_systems
https://developer.microsoft.com/en-us/windows/mixed-reality/spatial_anchors

Not endorsing their approach, just thought it's interesting/on-topic reading for anyone who hasn't dug into the Win MR docs.

jeromeetienne commented Jul 29, 2017

adding myself to be notified

toji commented Jul 31, 2017

Dropping in to add a note that I've generally been avoiding diving into this issue for the sake of staying focused on more immediate deliverables, but after catching up on the thread I do agree with the broad trends being discussed. In particular, as far as interaction with the WebVR ("2.0") API goes:

  • Opaquely compositing AR video feeds into the frame is good for privacy, API simplicity, and should be able to transparently handle transparent displays (Pun intended!) so I'm all for it. It's worth noting that it prevents certain effects like glass refraction, so there may be a good reason to expose the texture at some point to enable more complicated effects with the tradeoff being that it would incur a permissions prompt. If a texture is exposed it should be accessed via the VRPresentationFrame so that it can be guaranteed to be synced with the frame poses. Ultimately having both routes seems like the right choice. Priority should be opaque compositing.
  • Support for an "anchors" concept is in the plans for WebVR. Initially we have the VRFrameOfReference object which is used to simplify support for things like room scale, but you'll note in the explainer that it inherits from a VRCoordinateSystem which is more generic and is the type that's actually accepted for things like the VRPresentationFrame.getPose function. The reason for that inheritance hierarchy is specifically because Microsoft requested it to allow future support for what their system calls Spatial Anchors. (Tango has them too, called Points of Interest, I believe.) So eventually we'll want to have an anchor concept that also inherits from VRCoordinateSystem and allows for that kind of AR-centric tracking.
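A sketch of the inheritance point being made, following the names used in this comment (the explainer's exact method names shifted between drafts, so treat `requestFrameOfReference`, `requestFrame`, and `getPose` here as illustrative; `drawScene` is an assumed app function): anything that is a VRCoordinateSystem, whether today's VRFrameOfReference or a future anchor type, could be handed to the same pose query.

```js
// Names follow this comment and the explainer of the time; spellings varied
// across drafts. The point: getPose accepts any VRCoordinateSystem.
vrSession.requestFrameOfReference('stage').then((frameOfRef) => {
  vrSession.requestFrame(function onFrame(vrFrame) {
    // frameOfRef is a VRFrameOfReference, which inherits VRCoordinateSystem...
    const pose = vrFrame.getPose(frameOfRef);
    // ...so a future anchor type that also inherits VRCoordinateSystem could be
    // passed here the same way (hypothetical):
    // const anchoredPose = vrFrame.getPose(someAnchor);
    drawScene(pose); // app code
    vrSession.requestFrame(onFrame);
  });
});
```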

blairmacintyre commented Jul 31, 2017

Thanks @toji. When I've suggested internal compositing of video in presentations, I've always assumed there'd be a way to ask for the video frame (or other associated data, like depth, surfaces, or SLAM reconstructions). I'd expect that in all these cases the user agent would have some mechanism to get permission from the user, as all of this data is sensitive.

For video, I'd expect it to be made available on the CPU and/or the GPU, but which would be available is up to the implementation (e.g., if the OS only provides it on the GPU, perhaps WebAR only provides it on the GPU). But if the user says "no", then it wouldn't be available.

Utopiah commented Aug 19, 2017

As discussed in google-ar/WebARonTango#9, it would also be important to establish a consensus on features specific to AR proper, like localization (ADFs, maps, whatever each platform is using) and their availability, not just rendering of the result.

machenmusik commented Aug 19, 2017

Re: camera passthrough - at some point folks may need it presented in dual eye views to match VR presentation, so however the implementation works, it may need synchronized access to that data as well.

Along the same lines, the permission prompt(s) may need to be presented in the VR context; currently, for example, WebRTC usage in VR in Firefox causes permission prompts that are not visible in VR, which may bring up browser presentation trust/security considerations.

blairmacintyre commented Aug 21, 2017

@Utopiah yes, I think that will be needed. The problem is that there are no standard capabilities; even for a simple thing like an ADF, the two platforms that build and reuse models of space right now (Windows MR and Tango) do it differently and use incompatible files.

I think what may be needed is an agreed-upon mechanism for platform-specific capabilities, coupled with some defined capabilities we can all agree on (a rough sketch follows this comment). Some examples:

  • if a platform supports "tracking things" (e.g., planes, markers, objects) then a user-agent should implement a "default trackable" or a way to "select from known trackables". One way is "shoot a ray from x/y on the screen and return the trackable objects hit" (ARKit does this, for example, and both Tango and WindowsMR could if they wanted). Platforms that don't track "things" could say so.
  • ditto for "shoot ray and intersect world" (not trackable things) (using mesh or depth cloud or something else). Platforms that don't build models of space but still track stuff could return intersections with static stuff, or just nothing.
  • have the user-agent compose the "augmentation graphics" with the view of reality. This might mean a depth cloud, it might be the mesh, doesn't really matter. Future sorts of displays might do other things. Displays that do it in hardware can leverage that.

Another problem is platform-specific tracking: if I build something like Vuforia or another visual tracking system into a user-agent (e.g., Tango and iOS ARKit+Vision both can track markers of some form, Argon4 can track with Vuforia, etc.), then we need to initialize and manage it. That process will not be cross-platform, but it needs to be exposed if we're going to make the web a contender for AR.
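One illustrative shape for the "agreed-upon defaults plus platform-specific capabilities" idea above. Every name here (`vrDisplay.capabilities`, `hasTrackables`, `hasWorldGeometry`, `hitTestTrackables`, `hitTestWorld`, `usePlacement`) is hypothetical:

```js
// Hypothetical capability negotiation: common operations are standardized,
// and platforms advertise which ones they actually support.
async function place(x, y) {
  if (vrDisplay.capabilities.hasTrackables) {
    // Platform tracks "things" (planes, markers, objects): ray-pick among them.
    usePlacement(await vrDisplay.hitTestTrackables(x, y));
  } else if (vrDisplay.capabilities.hasWorldGeometry) {
    // Platform has a mesh or depth cloud: intersect the world instead.
    usePlacement(await vrDisplay.hitTestWorld(x, y));
  } else {
    // Pose-only platform: fall back to placing content in front of the user.
    usePlacement([]);
  }
}
```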

blairmacintyre commented Aug 21, 2017

@machenmusik I doubt that WebRTC (video, at least) will play any role in AR extensions to WebVR. It's used by various toolkits right now because that's all the "standard web" can do, but it's generally not capable of doing what we need (latency, no camera specs, no tight sync with sensors, etc.).

Worse is the issue of mono cameras: doing stereo video pass-through from one camera is "cute but not super useful." It's super easy to support; you just need to decide whether the augmentations remain stereo (and thus don't line up with the video) or are mono (render the same thing for both eyes, so everything lines up, but you lose true stereo). We opted for the latter in Argon4 (you can see what it looks like by entering "Viewer Mode"); both are non-ideal.

I suspect HMD-based AR/MR will be limited to actual MR HMDs, except for a limited subset of content.

delapuente commented Aug 22, 2017

> if a platform supports "tracking things" (e.g., planes, markers, objects) then a user-agent should implement a "default trackable" or a way to "select from known trackables". One way is "shoot a ray from x/y on the screen and return the trackable objects hit" (ARKit does this, for example, and both Tango and WindowsMR could if they wanted). Platforms that don't track "things" could say so.

I think the best approach in terms of versatility/performance is to offer a set of basic interactions with tracked point clouds/meshes and provide an AR camera matrix, then let client libraries compose on top of it.

machenmusik commented Aug 22, 2017

@blairmacintyre for WebRTC, I meant audio and the data channel, not video (at least in its current form).

blairmacintyre commented Aug 22, 2017

@machenmusik ah yes. The problem is, without a well-defined part of the display that is owned and controlled by the UA, prompts can be hidden or spoofed, so presenting anything permission-related in VR is problematic right now.

afvc commented Aug 23, 2017

I think it'd be really cool if we could have a component to facilitate the use of GPS coordinates. Like Apple has CoreLocation, we could have a component that used, say, Google Maps, so something like AR GPS could be created with A-Frame.
I'm not very good at dev, but I found this; I don't know if it helps or can be "connected" with A-Frame:
https://github.com/timfpark/react-native-location

Utopiah commented Aug 23, 2017

@afvc you might want to check https://location.services.mozilla.com/ or https://developers.google.com/tango/overview/concepts#visual_positioning_service_overview . The question then becomes whether an abstraction on top of all of those should be required as part of the AR specs.

machenmusik commented Aug 28, 2017

@blairmacintyre yup, I am aware that privacy/security of browser interaction is still unsolved / unimplemented.

That being said, current mobile platforms require a user gesture for many useful things one would like to do with AR -- user media, media playback, WebRTC, etc. -- so unless there is some way to either disable those requirements or present a user challenge for permissions, experiences constructed with web tech will be limited.

Perhaps as an interim step, the permissions can be configured (similar to geolocation) for a given site/subdomain outside of VR/AR, and persisted so they do not require prompting on subsequent attempts?

TrevorFSmith commented Aug 29, 2017

After playing around with a variety of APIs for AR, we found that there are three aspects of usable, basic AR that are missing from the WebVR "2.0" spec: anchors, geospatial coordinate systems, and the ability to match a display (flat screen, HMD, glasses, ...) with a reality (camera, passthrough, or virtual).

We put together a scratch API (https://github.com/mozilla/webxr-api/blob/master/WebXR%20API.md) and have started a polyfill to test the API on a variety of platforms and for a variety of apps. It definitely has holes and pieces that we know need to change, but it's interesting as a talking piece.

TrevorFSmith commented Sep 26, 2017

Just as an update, at Mozilla we've started an experimental WebXR polyfill (https://github.com/mozilla/webxr-polyfill/) that is starting to run on multiple AR capable browsers, including our iOS+ARKit test browser (https://github.com/mozilla/webxr-ios/) and Google's WebARonARCore.

The goal of this work is to explore what the WebVR 2.0 spec would be like if it supported the current era of AR capabilities like those exposed by ARKit and ARCore. There are several aspects of even basic AR that aren't supported by the current WebVR 2.0 spec, and this experiment has proven fruitful for exposing those issues and will inform the ongoing effort to write an AR explainer and plan for the future.

Eyes and hands are welcome! We're making this effort to foster exploration and discussion about how to make sure that WebVR 2.0 is ready for AR and VR.

TrevorFSmith commented Sep 28, 2017

Here's a doc for easy comparison between WebVR 2.0 and WebXR: https://github.com/mozilla/webxr-api/blob/master/design%20docs/From%20WebVR%202.0%20to%20WebXR%202.1.md

You'll need to read the full WebXR API draft (https://github.com/mozilla/webxr-api/blob/master/WebXR%20API.md) to get the nuance, but this doc is an easy skim to see what we're proposing.

TrevorFSmith commented Sep 29, 2017

And a doc on how to code against WebXR, including how anchors and the coordinate systems work: https://github.com/mozilla/webxr-polyfill/blob/master/CODING.md

NellWaliczek commented Jul 17, 2018

Closing this issue now that we have officially switched the API to be an umbrella for XR features. Please file specific issues for topics that are not already being discussed!
