
Reducing the precision of the DOMHighResTimeStamp resolution #56

Closed
plehegar opened this issue Jan 9, 2018 · 17 comments
plehegar (Member) commented Jan 9, 2018

@plehegar plehegar added the security-tracker Group bringing to attention of security, or tracked by the security Group but not needing response. label Jan 9, 2018
@plehegar plehegar added this to the Level 2 milestone Jan 9, 2018
siusin (Contributor) commented Jan 17, 2018

toddreifsteck (Member) commented:

Per discussion on the 2/22 call, this issue is still in heavy flux. We will address this update when UAs come to consensus. When consensus occurs, we will update the CR.

@plehegar Could you add a note to the spec indicating that the current 5 us is likely to increase then follow it with a re-publish?

blurbusters commented Feb 24, 2018

I have grave concerns about permanent precision reductions.

Firefox 60 reduced precision even further, to 2ms.

2ms precision now makes WebVR useless.

And a lot of 3D game programming techniques become impractical.

For a legitimate reason, I may have to begin automatically showing users step-by-step instructions for overriding the precision reductions via advanced browser flags/config. This reduces security, since I would essentially be telling users to reduce security globally just for one site - I do not like this approach.

A better remedy is a prompt/permissions-based API, like fullscreen, camera, VR, etc.

Please urgently read:
https://bugzilla.mozilla.org/show_bug.cgi?id=1427918#c9

blurbusters commented Feb 25, 2018

Longer write-up:

Firstly, Spectre and Meltdown are indeed extremely serious security issues.
Many vendors are, rightfully, doing emergency fixes, including temporarily reducing timer precision.

However, I have grave concerns about Mozilla/Firefox's decision to reduce performance.now() precision to 2ms -- there are major unanticipated side effects, including one that will reduce security.

MAJOR SIDE EFFECT 1:

The precision degradation is so massive (even at 60Hz) -- majorly affecting fullscreen WebGL and WebVR -- that I have heard from a friend of a site that plans to automatically detect the 2ms imprecision and prompt the user with step-by-step instructions to fix it for their site. Since the fix is a global Firefox setting, this is problematic from a security perspective.
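For context, a site can detect the clamp with nothing more than performance.now(); a rough sketch of such a detector (illustrative only) looks like this:

```js
// Sample performance.now() in a tight loop and take the smallest non-zero
// step observed; this roughly recovers the clamp size.
function estimateTimerGranularityMs(samples = 100000) {
  let smallest = Infinity;
  let prev = performance.now();
  for (let i = 0; i < samples; i++) {
    const t = performance.now();
    if (t > prev) {
      smallest = Math.min(smallest, t - prev);
      prev = t;
    }
  }
  return smallest; // e.g. ~2 on a 2ms-clamped clock (jitter will blur this)
}
```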

I think the safer solution is:

"In an absolute emergency: Personally if there's a duress to reduce timer precision, then a temporary permissions API (cameras and fullscreen have more security/privacy fears) to prompt the user for permission to stay at 5us precision. At least during the duration of security studying, until being reverted to maximized precision."

MAJOR SIDE EFFECT 2:

As a past game developer (most standards writers aren't game developers), let me tell you:

Accurate gametimes are necessary for accurate calculations of 3D object positions, especially in full-screen WebGL games.

Even at sub-refresh-cycle scales, animations should use sub-millisecond accuracy to prevent jank when calculating time-based animations (e.g. object positions computed from time). That is how 3D graphics compensate for fluctuating frame rates: they calculate object world positions based on gametime.

That means a ball moving at 1000 inches per second (~60mph) can be mispositioned by 2 inches if the gametime is off by 2 milliseconds. That can mean goal/no-goal. Fastballs, baseballs, and hockey pucks can go much faster than this. Low-precision gametimes will ruin a lot of HTML5 and WebGL game programming.

Gametimes need sub-millisecond accuracy, especially in full-screen WebGL games, because they calculate real-world 3D object positions based on gametime.

That way everything looks correct regardless of framerate. Framerates fluctuate all over the place, so 3D games use gametime to calculate the position of everything. Wrong gametime means everything janks like crazy in full-screen WebGL games.
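For readers unfamiliar with the pattern, here is a minimal sketch of gametime-based positioning in a rAF loop, mirroring the 1000-inch-per-second example above; drawBallAt() is a hypothetical renderer call.

```js
// Position is derived from elapsed time, not from frame count, so a 2ms
// error in the timestamp becomes a ~2 inch positional error at this speed.
const speedInchesPerSec = 1000;  // ~60mph ball from the example above
let startTime = null;

function frame(timestamp) {      // DOMHighResTimeStamp supplied by rAF
  if (startTime === null) startTime = timestamp;
  const gametimeSec = (timestamp - startTime) / 1000;
  const positionInches = speedInchesPerSec * gametimeSec;
  drawBallAt(positionInches);    // hypothetical renderer call
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
```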

Again, this is for 60 Hz. I'm not even talking about higher refresh rates. 1ms gametime errors create noticeable motion-fluidity flaws in 60Hz full-screen WebGL games!

Things jank crazily with 5ms and 8ms gametime errors on a 60Hz display. Imagine you're in the driver's seat of a 300mph car in a racing game -- a 2ms gametime error becomes large even at 60 Hz. Jank/jerkiness/stutter can get huge even with sub-refresh-cycle errors on a 60 Hz display.

Also, don't forget that there are advantages to frame rates above refresh rates in reducing latency: https://www.blurbusters.com/faq/benefits-of-frame-rate-above-refresh-rate/

My prediction is that Firefox may get a tsunami of complaints from game designers suddenly hit by a major reduction in their ability to calculate gameworld positions.

MAJOR SIDE EFFECT 3:

Depending on how timer precision is degraded, it potentially eliminates educational motion tests in web browsers. Many scientific tests, such as www.testufo.com/ghosting and www.testufo.com/frameskipping (display overclocking), are heavily dependent on perfect frame rate synchronization, and the site is able to detect whenever Chrome misses a frame.

Peer-reviewed papers have already been written based on browser motion tests (including www.blurbusters.com/motion-tests/pursuit-camera-paper ...), thanks to browsers' ability to achieve perfect refresh rate synchronization.

Even one of my own sites, TestUFO -- depending on how it is degraded in upcoming Firefox A/B tests (new versus old browser) -- may be forced to do something similar to the popup in "MAJOR SIDE EFFECT 1" for specific kinds of peer-reviewed scientific motion testing. Is there a way for me to tell users how to whitelist only one site (TestUFO.com) for high-precision timers, without doing it as a global setting? (Thought exercise for the W3C.)

The TestUFO website already automatically displays a message telling users to switch away from IE/Edge for 120Hz testing, due to IE/Edge's continued violation of Section 7.1.4.2 of HTML 5.2. More than 50% of TestUFO visitors are using a refresh rate other than 60 Hz.

MAJOR SIDE EFFECT 4:

It makes WebVR useless.

WebVR's premise (not creating nausea/headaches) is heavily dependent on accurate gametimes for accurate 3D object positions (see MAJOR SIDE EFFECT 2 above).

MAJOR SIDE EFFECTS 5-100

I have a much longer list, but for brevity I am shortening this post to point out the gravity of Firefox's decision.

CONCLUSION

The approach taken by the Firefox team should be refined.

If continued short-term emergency mitigation is absolutely critical, it should include a Permissions API (much like Full Screen mode and WebRTC camera permissions) to ask the user, case by case, for permission to use high-precision timers (allowing 5us or 1us precision).

If absolutely necessary, this could even be limited to Organization Validation HTTPS sites, combined with exclusive same-origin use, triggered only by a click, and granted only after confirmation via a popup Permissions API (as for WebRTC) -- so that 5us/1us precision is granted only to that particular site.

Long-term, these restrictions should be removed once the underlying causes of Spectre/Meltdown are resolved. In the meantime, massive precision reductions that force web developers to give visitors instructions on reconfiguring their browsers are a less secure solution.

Thanks,
Mark Rejhon
Founder, Blur Busters / TestUFO
(Past Invited Expert, W3C Web Platform Working Group for HTML 5.2)

blurbusters commented Feb 26, 2018

Firefox responded at https://bugzilla.mozilla.org/show_bug.cgi?id=1435296

Firefox uses a hard edge (performance.now() incrementing by 2ms exactly every 2ms). That is easier to generate an artificial timer from. From what I understand, a microsecond-accurate 2ms hard edge is far less secure than 100us granularity + 100us pseudorandom jitter.

I have a new suggestion: double-pseudorandom jitter, where the pseudorandom jitter increment in the time counter is completely unrelated to the actual (separately pseudorandom) real-world time elapsed between increments of performance.now() -- basically two separate pseudorandom jitterings going on concurrently. This potentially makes 10us-20us safe from Meltdown/Spectre, but validation will be needed.

Hi Mark, thanks for this feedback. I am curious, is the situation as dire for Chrome and Edge as it is for Firefox? I am trying to understand if there is something different about our implementation compared to theirs that makes it significantly worse for web application programmers while browsers work to address high resolution timing attacks (of which Spectre is just one). I believe both of their implementations are less expansive than ours, as we adjust all timers (not just performance.now()) and we clamp to 2ms instead of 1ms as they do. Is 1ms an acceptable resolution while 2ms is untenable for the types of applications you describe?

1ms is still too imprecise.

Currently browsers are:
-- Chrome using 100us + 100us pseudorandom jitter (M64)
-- Edge using 20us + 20us pseudorandom jitter
-- Safari using 1ms (same precision as Date.now())

I have not been able to find information about any further precision reductions (did they come up in your meetings)?

From what I understand (from other developers), 100us + 100us pseudorandom jitter reportedly makes a Spectre/Meltdown attack much harder than a truncated 2ms hard edge does. So that may be a better (interim) compromise -- use 100us instead of 2000us, and add 100us of jitter. One can create a simulated high-precision timer via precision busywaits from known 2000us hard edges (truncated timers), but adding jitter really messes that ability up.
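To illustrate why the hard edge matters, here is a rough sketch of the busywait interpolation mentioned above (illustrative only); with a calibrated spin rate, the spin count recovers timing far finer than the 2ms step itself, which is exactly what jitter is meant to break:

```js
// Busy-wait from an arbitrary call point to the next hard edge of the clock.
function phaseWithinTick() {
  const edge = performance.now();
  let spins = 0;
  while (performance.now() === edge) spins++; // spin until the value changes
  // With a known spin rate, 'spins' is proportional to the time remaining in
  // the current 2ms tick, i.e. resolution far finer than the tick itself.
  return spins;
}
```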

Even that is something I am really reluctant to tolerate, as there are some game calculations that benefit from sub-50us accuracy, so in the long term there should eventually be 1us precision again (in a post-Spectre/Meltdown mitigation world, or behind security permissions).

100us (granularity) + 100us (pseudorandom) is more accurate (less jank) for games that calculate positions from gametime, yet from what I am reading it should be harder to execute a Spectre/Meltdown attack against than a non-jittered 2000us granularity. The hard edges of a 2-millisecond truncation provide exact-interval reference points (just monitor for timer value changes). By adding (well-seeded) pseudorandom jitter, you add several orders of magnitude of difficulty to Spectre/Meltdown attacks, making 2000us imprecision (as you are doing) unnecessary. More research is probably needed, but I implore you to find a way to go sub-1ms for games.

An improved version of this concept is a double pseudorandom:

Basically, every A microseconds (where A is a 0-100us pseudorandom value), update an internal timer by incrementing it by B microseconds (where B is a different, independent 0-100us pseudorandom value). That way it never ticks backwards, despite the pseudorandom jitter. For example, 33.1782us may have actually passed before the timer incremented by, say, 59.197us; then 89.3665us later, the timer incremented by 23.2165us; and so on. Use all the decimal places allowed by the web spec (performance.now supports them) -- the more pseudorandom digits the better.

The pseudorandom timer increment can be completely independent of the actual real-world time interval between the sudden pseudorandom timer increments. This double-pseudorandom jitter approach allows you to massively tighten the accuracy.

By combining these two independent pseudorandoms, you have a much more fine-grained timer, more suitable for games, yet one that is much harder to mount Spectre/Meltdown workarounds against than a simple hard-edged 2ms increment every 2ms.
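To make the mechanics concrete, here is a minimal sketch of the double-pseudorandom idea described above. It is illustrative only: Math.random() stands in for a well-seeded PRNG, performance.now() stands in for the engine's true clock, and a real implementation would live inside the browser engine rather than in page script.

```js
function createDoubleJitteredClock(maxJitterUs = 100) {
  const realNowUs = () => performance.now() * 1000; // stand-in for the real clock

  let reportedUs = 0;                                // value handed to callers
  let lastUpdateRealUs = realNowUs();
  let nextIntervalUs = Math.random() * maxJitterUs;  // pseudorandom interval "A"

  return function now() {
    if (realNowUs() - lastUpdateRealUs >= nextIntervalUs) {
      // Increment "B" is drawn independently of interval "A", so the size of
      // a tick says nothing about the real time that elapsed between ticks.
      reportedUs += Math.random() * maxJitterUs;
      lastUpdateRealUs = realNowUs();
      nextIntervalUs = Math.random() * maxJitterUs;
    }
    return reportedUs / 1000;                        // milliseconds, never decreasing
  };
}
```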

I think 20us can be made safe by turning it from a non-jittered/single-pseudorandom-jittered approach to a double-pseudorandom-jittered approach (as described above) -- validation will be needed but 2ms is extremely untenable for WebVR / fullscreen 3D FPS browser games.

Longer term, I'd ultimately prefer 1us or even 0.1us precision to be restored at some point in the future -- e.g. via a strong permissions mechanism, or a special sandbox flag that indicates full Spectre/Meltdown safety -- but for emergency mitigation, 20us double-pseudorandom-jittered is a great compromise.

tdresser commented:

Just want to make sure I understand the concerns here. I don't think any browser vendors are reducing accuracy of timestamps passed to requestAnimationFrame, so we still have accurate frame timestamps.

I was under the impression that games would generally execute physics simulation using fixed time intervals: this comes to mind, though it may be quite out of date.
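For reference, a minimal sketch of the fixed-timestep pattern being referred to (in the spirit of mainloop.js); simulatePhysics() and render() are hypothetical hooks:

```js
// Physics advances in constant steps; rendering interpolates by the leftover
// fraction of a step, using the rAF timestamp only for frame pacing.
const STEP_MS = 1000 / 120;  // fixed simulation step (~8.33ms)
let accumulator = 0;
let lastFrameTime = null;

function loop(timestamp) {
  if (lastFrameTime !== null) {
    accumulator += timestamp - lastFrameTime;
    while (accumulator >= STEP_MS) {
      simulatePhysics(STEP_MS);     // hypothetical fixed-step update
      accumulator -= STEP_MS;
    }
    render(accumulator / STEP_MS);  // hypothetical draw, interpolated
  }
  lastFrameTime = timestamp;
  requestAnimationFrame(loop);
}
requestAnimationFrame(loop);
```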

If this is the case, I don't see why you'd ever need a high accuracy timestamp other than for display time, which rAF gives you.

I suppose lack of accuracy in event timestamps could cause some issues, but that doesn't seem to be what you're referring to. Can you provide a bit more detail on why you need these timestamps?

blurbusters commented Feb 27, 2018

If this is the case, I don't see why you'd ever need a high accuracy timestamp other than for display time, which rAF gives you.

OK, that potentially solves one of the use cases. However, it doesn't solve other use cases (e.g. WebVR).

The timestamps of input reads are important for calculating gametimes. For example, input reads must be as accurate as possible relative to gametimes. And framerates fluctuate, so doing a performance.now() concurrently with the input read often results in better motion feel for full-screen 3D graphics (OpenGL or WebVR).

There are use cases where we definitely need an accurate VSYNC time with rAF(), but there are also use cases where the timestamp of an input read needs to be as accurate as possible (preferably 100us rather than 1us).

Also, VSYNC OFF development means gametimes are synced to frame-generation times rather than to the timing interval of refresh cycles. When games are designed this way and then ported to HTML5, all kinds of side effects can happen when there is no access to microsecond-accurate gametimes. Gametimes are not always linked to the VSYNC interval (refresh cycle timestamps) -- they can be, and frequently are -- but it is not always the case.

There are many complex nuances, but the easiest one to explain is likely this: imagine full-screen WebGL graphics at 4K resolution, with 8000 pixels per second of screen panning (e.g. mouse left/right, or head-turning in WebVR). A 1ms error at 8000 pixels per second equals an 8-pixel misalignment (which manifests itself as stutter, inaccurately calculated debris trajectories, etc.).

If PointerEvents keeps the full precision of input-read timestamps (which can go to 1000Hz or 2000Hz for many gaming devices -- all modern gaming mice now run at ~1000 Hz, including the cheap $39 Logitech gaming mice), then that helps hugely, provided those timestamps are left untouched and not rounded off to the nearest Hz (the sensors inside sometimes run at odd rates, like 1500Hz, so the timestamps become rough when rounded off to USB poll rates -- you really don't want to clamp the sensor timestamps). Microsoft Research found that 1000 Hz input had benefits noticeable to the human eye: https://www.youtube.com/watch?v=vOvQCPLkPt4 -- and it's now the standard rate in modern gaming mice. Anyway, other games also set "gametime" from performance.now() for various logistical reasons.
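As a rough illustration of why those event timestamps matter, here is a small sketch that derives pointer velocity from consecutive PointerEvent timestamps (updateAimFromVelocity() is a hypothetical game hook); clamp the timestamps to 1-2ms and the computed velocity becomes meaningless for a ~1000 Hz mouse:

```js
let last = null;
window.addEventListener('pointermove', (e) => {
  if (last) {
    const dt = e.timeStamp - last.t;        // ms between sensor reads
    if (dt > 0) {
      const vx = (e.clientX - last.x) / dt; // px per ms
      updateAimFromVelocity(vx);            // hypothetical game hook
    }
  }
  last = { t: e.timeStamp, x: e.clientX };
});
```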

The "gametime" concept is a fundamental 3D game development concept, first-person and virtual reality. It is taught in game design in classrooms -- google "gametime OpenGL" -- 200,000 hits in Google. Or "gametime Direct3D" -- 1,000,000 hits in Google. Are you familiar with the concept of gametime?

This stuff is messing with gametime, full stop.

In modern 3D development, especially with the ever-increasing precision demands of virtual reality, gametimes need to be microsecond-accurate. We'll tolerate 20us and 100us, but 1ms is garbage.

What if the developer needs to link gametimes to a keypress read? Or to rendertimes (rather than refresh cycle times)? Or to PointerEvents timestamps? There are many paths game development may take, depending on the developer's goal, and performance.now() is the only choice in some situations. Sometimes the priority is matching positions to a headtracker, or to refresh cycles, or lowering input lag (e.g. a 1000 Hz mouse), or keeping object trajectories correct under a fluctuating framerate, or porting a variable-refresh-rate-compatible game to HTML5 via direct recompilation, etc. Killing performance.now()'s ability to be an accurate stand-in for existing platform APIs should not be done on a permanent basis.

If none of you have game development experience, you need an Invited Expert to advise you on the importance of precision clocks for game development -- and to help avoid misguided and damaging assumptions like "1ms doesn't matter in game development", when even Microsoft Research disagrees with that.

The game designer should have that choice, and web standards creators should not dictate that choice -- if it's at all possible to avoid being forced to limit those choices. Every effort should be made.

I realize you are all under duress, and I realize this is all a temporary situation until a permanent solution arrives, but it's probably a good idea to have Invited Experts (with at least some game design experience) participate in this discussion.

plehegar (Member, Author) commented Mar 1, 2018

For alignment, see also the comment from Tobin.

toddreifsteck (Member) commented:

Per discussion on 5/17, the notes in PR #57 should resolve this issue. @plehegar

toddreifsteck (Member) commented:

Let's send a note to the privacy group after this is complete.

yoavweiss (Contributor) commented:

Closing since the note has landed, but feel free to reopen if I'm missing something

igrigorik (Member) commented Apr 22, 2019

FYI, recapping latest status:

dakom commented Sep 16, 2019

So... does this mean traditional game loops will be janky, with no way around it?

I made a simple test based on mainloop.js here: https://github.com/dakom/mainloop-test

It gets a bit off in Chrome; in Firefox it's unbearable.

dakom commented Sep 17, 2019

So after learning a bit more about the issue it seems like maybe it's not an issue of the timing precision but rather some funkiness with the browser scheduling rAF callbacks?

I made a simpler straight delta+rAF test here: https://github.com/dakom/plainloop-test

It seems some people don't experience much jank, but it's clearer on Firefox, and I see lots of it (when running for 30 seconds or so).

yoavweiss (Contributor) commented:

@dakom - Thanks for your comments. It seems your examples use either Date.now()-based timers or rAF. AFAICT, any jank you may experience in different implementations is not directly related to this issue.

dakom commented Sep 18, 2019

Thanks for taking a look @yoavweiss - both are using rAF, and the second is just a straight delta using the DOMHighResTimeStamp that comes into the rAF callback.

But yeah - I guess it's an issue of vsync, frame-skipping, etc.? I'll open an issue on the Firefox list.

asint commented Sep 16, 2022

Further to the excellent posts by blurbusters, I would like to add another situation, which is becoming more common.

I don't see why you'd ever need a high accuracy timestamp other than for display time

Please consider the ever-increasing use of workers. It is now not uncommon to perform time-dependent calculations in a separate thread where an rAF timestamp is not available.
I am currently working on a WebVR project where calculating and compensating for latencies is, quite honestly, a bit of a nightmare. These problems are easy to envision once workers are considered.
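As a sketch of that worker situation (the message shape and extrapolatePose() are hypothetical), latency compensation has to fall back on clock readings; performance.timeOrigin + performance.now() puts both threads on a shared epoch:

```js
// Inside a Worker: no rAF timestamp, so estimate cross-thread latency from
// clock readings. sentAt is assumed to be performance.timeOrigin +
// performance.now() captured on the main thread before posting.
self.onmessage = (msg) => {
  const nowAbs = performance.timeOrigin + performance.now();
  const latencyMs = nowAbs - msg.data.sentAt;
  const pose = extrapolatePose(msg.data.pose, latencyMs); // hypothetical prediction step
  self.postMessage({ pose });
};
```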
