First draft of explainer, for early feedback #4

Merged 4 commits on Aug 5, 2019.

lighting-estimation-explainer.md (91 additions, 0 deletions)

# WebXR Device API - Lighting Estimation
This document explains the portion of the WebXR APIs that enables developers to render augmented reality content that reacts to real-world lighting.

## Introduction

"Lighting Estimation" is implemented by AR platforms using a combination of sensors, cameras, algorithms, and machine learning. Lighting estimation provides input to rendering algorithms and shaders to ensure that the shading, shadows, and reflections of objects appear natural when presented in a diverse range of settings.

The XRLightProbe and XRReflectionProbe interfaces expose the values that the platform offers to WebXR rendering engines. Their corresponding accessor functions, XRFrame.getGlobalLightEstimate() and XRFrame.getGlobalReflectionProbe(), are only accessible once the first frame of an AR session has started. The promises may be resolved on the same frame or multiple frames later, depending on the platform's capabilities. In some cases, the promises may fail, indicating that the lighting values are not yet available and should be requested again later.

> **Reviewer:** Can the page tell the difference between failure that might succeed later or failures that indicate that this platform simply doesn't support this feature for say, reflections?

> **Reviewer:** Instead of the promises failing if the lighting has not been calculated yet, could we just specify a standard set of values to return instead? That would save every app from having to create their own fallback lighting, and it would aid with consistency.

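To illustrate the promise-based flow described above, here is a minimal sketch of how a page might request the light estimate from within a requestAnimationFrame callback. The fallback behavior and the `drawScene` helper are hypothetical, and the exact timing and failure semantics are still under discussion (see the comments above).

```js
// Minimal sketch only. Assumes an active immersive-ar XRSession in
// `xrSession` and the proposed XRFrame.getGlobalLightEstimate() accessor.
// `drawScene` is a hypothetical application helper.
let latestLightProbe = null;

function onXRFrame(time, frame) {
  xrSession.requestAnimationFrame(onXRFrame);

  frame.getGlobalLightEstimate()
    .then((lightProbe) => {
      // May resolve on this frame or several frames later.
      latestLightProbe = lightProbe;
    })
    .catch(() => {
      // Estimate not available yet; keep using application-defined
      // fallback lighting and try again on a later frame.
    });

  // Render with the most recent estimate (null until the first one arrives).
  drawScene(frame, latestLightProbe);
}

xrSession.requestAnimationFrame(onXRFrame);
```
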
Although modern render engines support multiple "Light Probes" and "Reflection Probes" in a scene, the WebXR API returns only a single corresponding XRLightProbe and XRReflectionProbe, representing the approximate global lighting values to be used in the area in close proximity to the viewer. When future platforms become capable of reporting multiple probes with precise locations away from the viewer, such support could be implemented additively without breaking changes.

The orientation of the lighting information is relative to the XRViewerPose for the XRFrame that getGlobalLightEstimate() or getGlobalReflectionProbe() was requested on. As it may be computationally expensive to rotate SH and texture cubes, XRLightProbe.sphericalHarmonicsOrientation and XRReflectionProbe.orientation enable the same SH coefficients and texture cubes to be used in multiple orientations.
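
As one possible approach to the above, a renderer can rotate its lookup direction into the probe's frame instead of rotating the SH data or cube map itself. The sketch below assumes the orientation is a unit quaternion (x, y, z, w) relative to the XRViewerPose; that convention is an assumption, not something the draft defines yet.

```js
// Sketch: rotate a world-space direction by the inverse of the probe
// orientation, so SH coefficients and cube maps can stay un-rotated.
// The (x, y, z, w) unit-quaternion convention is an assumption.
function rotateByInverseQuaternion(q, v) {
  // The conjugate of a unit quaternion inverts its rotation.
  const ix = -q.x, iy = -q.y, iz = -q.z, iw = q.w;
  // t = 2 * cross(qinv.xyz, v)
  const tx = 2 * (iy * v[2] - iz * v[1]);
  const ty = 2 * (iz * v[0] - ix * v[2]);
  const tz = 2 * (ix * v[1] - iy * v[0]);
  // v' = v + w * t + cross(qinv.xyz, t)
  return [
    v[0] + iw * tx + (iy * tz - iz * ty),
    v[1] + iw * ty + (iz * tx - ix * tz),
    v[2] + iw * tz + (ix * ty - iy * tx),
  ];
}

// Usage: evaluate SH with the rotated normal rather than rotating the SH.
// const probeSpaceNormal = rotateByInverseQuaternion(
//     lightProbe.sphericalHarmonicsOrientation, worldSpaceNormal);
```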

It is possible to treat a synthetic VR scene as the environment that AR content will be mixed into. In this case, the platform can report the lighting estimation using the geometry of the VR scene. As the WebXR API does not specifically express whether the world is synthetic or real, AR content is written the same way, without such knowledge. Such "AR in VR" techniques do not affect the WebXR specification directly and are beyond the scope of this text.

> **Reviewer:** Is this proposing that we allow the UA to support this in VR mode? It wouldn't be a requirement, right?

> **Author:** This would be optional for UA implementations. In this case, the session would be an immersive-ar session. It would be driven by a UA-specific UI, and the content's implementation requires no awareness of the "AR in VR" technique.


## Physically Based Units

The lighting estimation values represent luminance and colors that may be outside the gamut of the output device. Direct sunlight can project 5000 nits at full power, while a typical display may emit only 250-500 nits. The objects in a scene attenuate the power of the sun and reflect a smaller portion towards the viewer. Even if the display can only represent a limited gamut (such as sRGB, P3, or Rec. 2020), intermediate lighting calculations used by shaders involve scaling up small values and attenuating large values outside of the displayed gamut. When a lighting calculation results in a color that cannot be displayed, the resulting value is altered by a variety of post-processing effects to match the rendering intent and aesthetic chosen by the content authors.

Luminance values are expressed in nits (cd/m^2). Nits are used by some native platform lighting estimation APIs and by the Media Capabilities API. User agents will translate the values returned by native platforms to nits for consistency.

As lighting is scene-relative as opposed to display-relative, the luminance values are encoded linearly with no gamma curve. Most modern render engines perform intermediate calculations in linear space and can accept such values directly. If an engine performs intermediate calculations in a color space encoded with gamma, such as sRGB, care must be taken when converting the values. After scaling the values, the result may include components above 1.0 or below 0.0. Naive implementations that clamp RGB components independently will produce erroneous hue and saturation for out-of-gamut colors.
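
As a concrete illustration of the conversion concern above, the sketch below maps a linear-light RGB value in nits to an 8-bit sRGB-style output. The exposure constant and the simple 1/2.2 gamma are assumptions standing in for a real renderer's tone mapping; note the per-channel clamp, which is exactly the naive step that distorts hue and saturation for out-of-gamut colors.

```js
// Sketch: linear nits -> displayable 8-bit value. Exposure and gamma are
// illustrative assumptions, not part of the WebXR proposal.
function linearNitsToDisplay(rgbNits, exposure = 1 / 250) {
  return rgbNits.map((component) => {
    const scaled = component * exposure;      // scene-relative -> display-relative
    // Naive independent clamp: cheap, but shifts hue/saturation when any
    // channel is out of gamut. A real pipeline would tone map first.
    const clamped = Math.min(Math.max(scaled, 0), 1);
    return Math.round(255 * Math.pow(clamped, 1 / 2.2));
  });
}
```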

## Global Illumination

Rendering algorithms take into consideration not only the light received by a surface from the light source but also light that has bounced around the scene multiple times before reaching the eye.

Traditional real-time engines use a simple global "ambient" constant value that is added to the real-time shading result. Engines using such a simple technique can use XRLightProbe.indirectIrradiance, scaled to achieve the desired effect. It may also be necessary to apply a gamma curve if the shading is done in sRGB space.

> **Reviewer:** Is indirectIrradiance a scalar value? It's listed as a Float32Array below.

> **Author:** indirectIrradiance is intended to hold 3 values in the units described in "Physically Based Units". These values would represent the red, green, and blue components of the light. It seems that we are missing text to describe these components. When scaling indirectIrradiance, one would scale each of the components.
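
A minimal sketch of the ambient-term approach described above, assuming indirectIrradiance holds three linear RGB components in nits (as clarified in the thread). The exposure scale and the optional gamma step are application choices, not part of the proposal.

```js
// Sketch: derive a global ambient color from indirectIrradiance.
// The exposure constant is an application-chosen assumption.
function globalAmbient(lightProbe, exposure = 1 / 1000, shadeInSRGB = false) {
  const irradiance = lightProbe.indirectIrradiance; // [r, g, b] in nits, linear
  const ambient = [
    irradiance[0] * exposure,
    irradiance[1] * exposure,
    irradiance[2] * exposure,
  ];
  // Engines that shade in sRGB space need a gamma step before adding the term.
  return shadeInSRGB ? ambient.map((c) => Math.pow(c, 1 / 2.2)) : ambient;
}
```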


Global illumination describes the collective techniques used to more accurately estimate the light received from indirect reflections.

## Image Based Lighting

> **Reviewer:** Would "Cube Map textures" be a better title for this section?

> **Author:** "Image based lighting" implies not only the presence of cube map textures but also how they should be interpreted by a renderer. That said, the next section is labeled "Spherical Harmonics", which by itself does not imply how they should be interpreted by the renderer. At the least, we should have a glossary of these terms. Perhaps this could be discussed in the CG call to get consensus.


HDR Cube Map textures, as created by the XRReflectionProbe, provide all the information about light sources and indirect bounces needed to accurately render PBR materials that are diffuse, glossy, and visibly reflective. Image based lighting effects utilizing such textures are simple to implement and perform well for VR and AR rendering. Unfortunately, such cube map textures require a lot of video memory and often represent the environment only from the limited range of locations where the map was captured.

HDR Cube Map textures are commonly used to implement "Reflection Probes" in modern rendering engines.
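
Below is a sketch of how a renderer might consume the reflection probe's cube map, assuming `gl` is the WebGL context backing the session's XRWebGLLayer. Whether mip levels are pre-filtered by the UA or generated by the application (as done here) is an assumption, not something the proposal specifies.

```js
// Sketch: obtain and prepare the environment cube for mip-based sampling.
frame.getGlobalReflectionProbe().then((reflectionProbe) => {
  const envCube = reflectionProbe.createWebGLEnvironmentCube();
  if (!envCube) {
    return; // the texture may be unavailable on this platform or frame
  }
  gl.bindTexture(gl.TEXTURE_CUBE_MAP, envCube);
  // Lower mips act as pre-blurred versions of the environment for rough materials.
  gl.generateMipmap(gl.TEXTURE_CUBE_MAP);
  gl.texParameteri(gl.TEXTURE_CUBE_MAP, gl.TEXTURE_MIN_FILTER, gl.LINEAR_MIPMAP_LINEAR);
});
```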

## Spherical Harmonics

SH (Spherical Harmonics) are used as a more compact alternative to HDR cube maps by storing a small number of coefficient values describing a Fourier series over the surface of a sphere. SH can effectively compress cube maps while retaining multiple lights and directionality. Due to their lightweight nature, many SH probes can be used within a scene, be interpolated, or be calculated for locations nearer to the lit objects.

The WebXR API supports up to 9 SH coefficients per RGB color component, for a total of 27 floating-point scalar values. This enables spherical harmonics up to and including the 2nd degree (3rd order). If a platform cannot supply all 9 coefficients, it can pass 0 for the higher-order coefficients, resulting in an effectively lower-frequency reproduction.
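
For reference, here is a sketch of evaluating diffuse irradiance from the 27 values using the standard 3-band formulation (Ramamoorthi and Hanrahan's constants). The per-channel coefficient ordering assumed here (L00, L1-1, L10, L11, L2-2, L2-1, L20, L21, L22 for each of red, green, and blue) is an assumption, since the proposal does not yet define the layout.

```js
// Sketch: degree-2 SH irradiance for a normalized direction. The coefficient
// layout (9 per channel, channel-major) is an assumption.
function shIrradiance(coefficients, normal) {
  const [x, y, z] = normal;
  const c1 = 0.429043, c2 = 0.511664, c3 = 0.743125, c4 = 0.886227, c5 = 0.247708;
  const rgb = [0, 0, 0];
  for (let channel = 0; channel < 3; channel++) {
    const L = coefficients.subarray(channel * 9, channel * 9 + 9);
    rgb[channel] =
      c4 * L[0] +
      2 * c2 * (L[3] * x + L[1] * y + L[2] * z) +
      2 * c1 * (L[4] * x * y + L[5] * y * z + L[7] * x * z) +
      (c3 * z * z - c5) * L[6] +
      c1 * L[8] * (x * x - y * y);
  }
  return rgb; // linear-light irradiance, same units as the input coefficients
}
```
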
> **Reviewer:** Any reason we specify a maximum number of coefficients here? Can that be platform dependent, since in the future we might be able to estimate more of them?


This "SH probe" format is used by most modern rendering engines, including Unity, Unreal, and Threejs.

## Shadows

When an HDR Cube Map texture is available, shadows only need to account for occlusion by other rendered objects in the scene.

> **Reviewer:** Not quite sure what this sentence is saying... I'm pretty sure this is hard.

> **Author:** Doing this 100% correctly is very hard; however, there are many simple approximations that are commonly used. In particular, it may be sufficient in some cases to combine a baked ambient occlusion map with an IBR shader that needs no real-time dynamic lights. A simple, non-physically based implementation may simply index the HDR cube map using a surface normal, and blend it with an albedo term representing the color of the surface using operators representing the artist's intent.


When an HDR Cube Map texture is not available, or the typical soft shadow effects of image based lighting are too costly to implement, XRLightProbe.primaryLightDirection and XRLightProbe.primaryLightIntensity can be used to render shadows cast by the most prominent light source.

> **Reviewer:** Should primaryLightIntensity be primaryLightColor? A linear RGB triple would include intensity implicitly.

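A sketch of driving a conventional shadow-casting directional light from these two values, here using a Three.js DirectionalLight. The exposure constant and the assumption that primaryLightDirection points from the scene toward the light are illustrative, not defined by the proposal.

```js
// Sketch: map the primary light estimate onto a renderer's directional light.
// `directionalLight` is assumed to be a THREE.DirectionalLight.
function updatePrimaryLight(lightProbe, directionalLight, exposure = 1 / 1000) {
  const dir = lightProbe.primaryLightDirection;
  const intensity = lightProbe.primaryLightIntensity;
  if (!dir || !intensity) {
    return; // both members are optional and may be absent
  }
  // Assumed convention: the vector points toward the light; target stays at origin.
  directionalLight.position.set(dir[0], dir[1], dir[2]);

  // Split the linear RGB intensity (in nits) into a normalized color and a
  // scalar intensity mapped through an application-chosen exposure.
  const maxComponent = Math.max(intensity[0], intensity[1], intensity[2], 1e-6);
  directionalLight.color.setRGB(
    intensity[0] / maxComponent,
    intensity[1] / maxComponent,
    intensity[2] / maxComponent);
  directionalLight.intensity = maxComponent * exposure;
  directionalLight.castShadow = true;
}
```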

## Security Implications

### XRLightProbe

Only XRLightProbe.indirectIrradiance is guaranteed to be available; the other attributes may be missing due to user privacy settings or the capabilities of the platform.

XRLightProbe returns sufficient information to render objects that appear to fit into their environment, with highly diffuse surfaces or high-frequency normal maps, which would result in a wide NDF (normal distribution function). Highly polished objects may be represented with a non-physically based illusion of glossiness, using a specular highlight effect sensitive only to the primary light direction. Reflections will be unable to reproduce detailed images of the environment without an XRReflectionProbe.

The lighting estimation returned by the WebXR API explicitly describes the real world environment in proximity to the user. By default, only low spatial frequency and low temporal frequency information should be returned by the WebXR API. Even when a platform can directly produce higher spatial and temporal frequency information, the browser must apply a low pass filter with an aim to mitigate the risk of untrusted content identifying the geolocation of the user or of profiling their environment.

Combined with other factors, such as the user's IP address, even the low frequency information returned with XRLightProbe increases the fingerprinting risk. The XRLightProbe should only be accessible during an active WebXR session.

### XRReflectionProbe

XRReflectionProbe should only be accessible with a permissions prompt equivalent to requesting access to the camera and microphone. XRReflectionProbe enables efficient, simple-to-implement image based lighting. PBR shaders can index the mip map chain of the environment cube to reduce the memory bandwidth required while integrating multiple samples to match wider NDFs.
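
As an illustration of the mip-chain indexing mentioned above, here is a fragment-shader snippet (written for WebGL 2 / GLSL ES 3.00 and embedded as a JS string) that selects a pre-filtered mip level from surface roughness. The uniform names and the linear roughness-to-LOD mapping are assumptions.

```js
// Sketch: roughness-driven LOD selection in the environment cube. The
// uniform names and the mapping from roughness to LOD are assumptions.
const environmentSamplingGLSL = /* glsl */ `
  uniform samplerCube uEnvironmentCube; // from createWebGLEnvironmentCube()
  uniform float uEnvironmentMipCount;   // number of mip levels in the cube

  vec3 sampleEnvironment(vec3 reflectedDir, float roughness) {
    // Wider NDFs (rougher surfaces) read from smaller, pre-filtered mips,
    // integrating many samples' worth of lighting in a single fetch.
    float lod = roughness * (uEnvironmentMipCount - 1.0);
    return textureLod(uEnvironmentCube, reflectedDir, lod).rgb;
  }
`;
```
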
> **Reviewer:** If we don't get permissions, can we still return a low-res version created from the output of XRLightProbe? That way we don't force anyone to implement spherical harmonics and directional lights if they just do IBL.

> **Author:** I like this idea. I would like to break this out into its own issue for discussion.


> **Reviewer:** I don't understand the reference to "and microphone" here. Why not just camera?

> **Author:** I was imagining a similar UX as requesting microphone permission, not necessarily relating to the kind of sensor. It seems that this analogy is adding more confusion than helping, so perhaps I should remove the "microphone" reference.

## Appendix A: Proposed partial IDL
This is a partial IDL and is considered additive to the core IDL found in the main [explainer](explainer.md).

```webidl
partial interface XRFrame {
  Promise<XRLightProbe> getGlobalLightEstimate();
  Promise<XRReflectionProbe> getGlobalReflectionProbe();
};

[SecureContext, Exposed=Window]
partial interface XRLightProbe {
  readonly attribute Float32Array indirectIrradiance;
  readonly attribute Float32Array? primaryLightDirection;
  readonly attribute Float32Array? primaryLightIntensity;
  readonly attribute Float32Array? sphericalHarmonicsCoefficients;
  [SameObject] readonly attribute DOMPointReadOnly? sphericalHarmonicsOrientation;
};

[SecureContext, Exposed=Window]
partial interface XRReflectionProbe {
  [SameObject] readonly attribute DOMPointReadOnly orientation;
  WebGLTexture? createWebGLEnvironmentCube();
};
```

> **Reviewer:** Do we want this API to be promise-based? If the lighting is estimated for a specific frame and we're returning promises, then the application would not be able to get the result of getGlobalLightEstimate() / getGlobalReflectionProbe() during the requestAnimationFrame callback relevant for the frame for which the lighting was estimated. It would be able to act on the data (at the earliest) in a subsequent frame's rAF, and by that time, the estimate might be outdated.

> **Reviewer:** One alternative would be to make this API subscription-based (similar to what's described in the existing hit test explainer).