Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High-resolution timer gets tripped up by floating point arithmetic #207

Closed
novemberborn opened this issue Sep 13, 2018 · 26 comments · Fixed by #214
Closed

High-resolution timer gets tripped up by floating point arithmetic #207

novemberborn opened this issue Sep 13, 2018 · 26 comments · Fixed by #214
Assignees

Comments

@novemberborn
Copy link
Contributor

  • Lolex version : 2.7.4

What did you expect to happen?

Advancing the clock with fractional milliseconds should give me an integer nanosecond value when I call process.hrtime(prev).

What actually happens

The nanosecond value is a float, not an integer.

How to reproduce

const clock = lolex.install()
clock.tick(1022.7791)
clock.hrtime([0, 20000000]) // [ 1, 2779099.999999881 ]

The reproduction advanced the clock by a fraction of a nanosecond. I suppose it'd be fine to round the computed nanosecond value, but I'm not sure if it should be rounded up or down. Alternatively something like https://www.npmjs.com/package/decimal.js may help to avoid the floating point arithmetic. Perhaps tick() should round down to the closest nanosecond, but I'm not sure if that'll help or not.

@benjamingr
Copy link
Member

An alternative would be to have clock.tick accept a BigInt optionally or expose a tickPrecise that takes a BigInt of nanoseconds.

I can look into this when I get back from holiday at the beginning of October

@novemberborn
Copy link
Contributor Author

BigInt support isn't widespread yet. I think tick could take fractions, but the calculations have to be done differently.

@SimenB
Copy link
Member

SimenB commented Sep 18, 2018

@fatso83 is it acceptable to add a dependency here to handle the maths of it? floating point arithmetics sucks :(

EDIT: I do doubt it (seeing as lolex is dep free now), but is just a Math.round good enough? Any edge cases? The following diff works

diff --git c/src/lolex-src.js w/src/lolex-src.js
index 14f55ca..4300371 100644
--- c/src/lolex-src.js
+++ w/src/lolex-src.js
@@ -101,7 +101,7 @@ function withGlobal(_global) {
      * % operator that also works for negative numbers
      */
     function fixedModulo(n, m) {
-        return ((n % m) + m) % m;
+        return Math.round(((n % m) + m) % m);
     }
 
     /**
diff --git c/test/lolex-test.js w/test/lolex-test.js
index 6684870..9fe56b2 100644
--- c/test/lolex-test.js
+++ w/test/lolex-test.js
@@ -2361,6 +2361,14 @@ describe("lolex", function () {
                 assert.same(result[0], 1);
                 assert.same(result[1], 0);
             });
+
+            it("should handle floating point", function () {
+                var clock = lolex.createClock();
+                clock.tick(1022.7791);
+                var result = clock.hrtime([0, 20000000]);
+
+                assert.equals(result, [1, 2779100]);
+            });
         });
     }
     if (nextTickPresent) {

@fatso83
Copy link
Contributor

fatso83 commented Sep 18, 2018

I'd say lets go with the simplest thing that seems to work and if people find edge cases we haven't thought of, we will fix it in time. Math.round seems fine to me and it will probably fix the 99% immediately. I am not an expert or authority in this matter, so no strong feelings.

SimenB added a commit to SimenB/lolex that referenced this issue Sep 18, 2018
fatso83 pushed a commit that referenced this issue Sep 19, 2018
@novemberborn
Copy link
Contributor Author

This is still a problem. From #210:

If someone ever gets a problem with round-off errors ... we'll tackle it then.

👋

Starting from 742 milliseconds, ticking by 123.493 milliseconds ends up rounding to 865492920 nanoseconds.

Try in your REPL:

node -pe "742 + 123.493"                                                                                                           8.12.0
865.4929999999999

A different approach may be to do the computations in nanoseconds directly. I think that should be fine, integer wise, unless you start doing ticks larger than a 100 days worth of nanoseconds.

@SimenB SimenB reopened this Sep 19, 2018
@SimenB
Copy link
Member

SimenB commented Sep 19, 2018

$ node -p '(742 * 1e9 + 123.493 * 1e9) / 1e9'
865.493

maybe?

I think that should be fine, integer wise, unless you start doing ticks larger than a 100 days worth of nanoseconds.

We can check the size before multiplying. And either do 1e8 or whatever, or just accept that if you tick by that much, sub milli precision doesn't really matter

@fatso83
Copy link
Contributor

fatso83 commented Sep 19, 2018

@novemberborn I am not sure I see the problem wrt lolex. We round off the numbers, meaning it should round up in your case. Can you demonstrate that the problem mentioned in this issue at the top isn't solved?

Your demo only shows how floating points are representing. Since we don't truncate that value, rather rounding it, how does the problem manifest?

@novemberborn
Copy link
Contributor Author

We round off the numbers, meaning it should round up in your case. Can you demonstrate that the problem mentioned in this issue at the top isn't solved?

My example came from using lolex:

Starting from 742 milliseconds, ticking by 123.493 milliseconds ends up rounding to 865492920 nanoseconds.

Note the value is 865492920 not 865493000.

@fatso83
Copy link
Contributor

fatso83 commented Sep 19, 2018

That's not a verification step. I made a runnable example using the 2.7.5 release and it shows the correct result: 845493000.

So, no, I am not convinced we have a bug 😸 Can you make sure you are running 2.7.5, not 2.7.4? I don't like to have bugs in our code, so of course, prove me wrong if I am!

@fatso83
Copy link
Contributor

fatso83 commented Sep 20, 2018

Closing, but please reopen if you can demonstrate a bug using the latest release.

@fatso83 fatso83 closed this as completed Sep 20, 2018
@novemberborn
Copy link
Contributor Author

Here you go: https://runkit.com/novemberborn/5ba363ba0f73fb001265cb0f

The difference should be 123493000 but I get 123492920.

@fatso83 fatso83 reopened this Sep 20, 2018
@fatso83
Copy link
Contributor

fatso83 commented Sep 20, 2018

Verified. The good thing is that we know have verifiable test cases 😁 One thing I noticed is that now returns floating point values. That's incorrect and needs to be addressed regardless of path using Math.floor.

I see two paths forwards:

  1. The "correct" one: change all internal internal representation of time to nanoseconds (anyone remembers seeing this before?).
  2. Try to hold all nanosecond logic in hrtime and deal with round off as best we can.

The first version should "obviously" be correct and most round off cases will fall away by removing all floating point at tick after converting to nanoseconds. The downside is that the changes spread all over library. It's pretty small though!

The second approach is what we currently do and require fewer changes, but it's hard to get exact when dealing with inexact numbers. There's some minor issues relating to keeping the high resolution and "normal" time in sync.

Personally I like the first best, and think the changes can be done in less than an hour.

What do you think?

@fatso83 fatso83 closed this as completed Sep 20, 2018
@fatso83 fatso83 reopened this Sep 20, 2018
@novemberborn
Copy link
Contributor Author

You'd have to separate the seconds from the nanoseconds in your internal representation though, right?

@fatso83
Copy link
Contributor

fatso83 commented Sep 21, 2018

@novemberborn I was thinking why, but then after doing some math I realized you are right. I am not sure if this is what you meant, but it has to do with the maximum safe integer that can be represented.

The MAX_SAFE_INTEGER constant has a value of 9007199254740991. This is 9e15. The current value for the number of milliseconds is about 1.5e12. If we should store that as nanos, then that would mean a number around 1.5e18 (as it is), which is a thousand times higher than the max safe integer representable using Javascript.

So yeah, that would mean storing seconds and nanos separately, which is not the elegant solution I thought of. Argh.

@novemberborn
Copy link
Contributor Author

novemberborn commented Sep 21, 2018

It's probably OK to keep the storage in milliseconds, and then have separate storage for the remainder down to the nanosecond.

If ticks are presented as integers, then you don't need to touch the remainder. If it's a float then you adjust the remainder, and if it ticks over (positively or negatively) adjust the milliseconds.

The remainder should be truncated to always be an integer. E.g. if I tick by 0.0000009 milliseconds, then just treat that as a 0, since it'd be 0.9 nanoseconds.

@fatso83
Copy link
Contributor

fatso83 commented Sep 21, 2018

@novemberborn So, in general, you propose we go ahead with (a modified of) the first approach? Store integers internally (even if it means a tad more logic for two numbers).

@novemberborn
Copy link
Contributor Author

@fatso83 yes, but you only need to use the second value if the tick amount is fractional. If we're ticking by seconds or milliseconds then it doesn't impact the nanosecond values at all.

@mroderick
Copy link
Member

I think this would be a valuable upgrade to lolex.

Reading the discussion is interesting, but it's easy to overlook things. @novemberborn @fatso83 could you write up a little proposal (perhaps just in the original entry above) and specify exactly what changes need to be made, how we verify that the work is correct and complete?

@fatso83
Copy link
Contributor

fatso83 commented Oct 4, 2018

I think we want this:

We should change the internal storage of time to support representing time at the nanosecond granularity as integers. This could be done using a tuple of [millis,nanos] or similar. When time is ticked a non-integer amount of milliseconds this number should be split into its integer (millis) and fractional part, and the fractional part should be converted to the remaining number of nanoseconds. We don't support sub-nanosecond precision, so the remainder should be truncated.

Example: clock.tick(100.123456789) should internally be represented as [100, 123456] (or some equivalent representation).

When the nanosecond part of the internal clock as a result of a tick becomes a integer larger than (10^6-1) it should result in in the clock increasing the millisecond part by 1 and adjusting the nanosecond part accordingly (nanos=nanos%1e6).

@mroderick
Copy link
Member

Awesome, now we just need someone to do the work :)

@fatso83 fatso83 self-assigned this Oct 5, 2018
fatso83 added a commit to fatso83/lolex that referenced this issue Oct 5, 2018
@fatso83
Copy link
Contributor

fatso83 commented Oct 5, 2018

Argh, I started this in a fork of mine (see above ref), but I have used three hours trying to make my initial test cases pass. I tried several variations after finding there are so many corner cases when we support negative ticks and negative clock values! The essence of the trouble is this: I am not totally sure what the "right" integer value for millisecond is at times, as it depends on whether we are using fixedFloor or Math.floor() and what happens around 0.


Cases that needs to be agreed upon

These all revolve around the integer value of Date.now(). hrtime can be ignored for the moment.

now=0.1, tick=-1

Pre: Date.now() == 0?
Post: Date.now() == 0 or -1?

now=1.1, tick=-1

Pre: Date.now() == 1
Post: Date.now() == 0

now=-0.3, tick=0.4

Pre: Date.now() == 0 or -1?
Post: Date.now() == 0?

now=-1.3, tick=0.4

Pre: Date.now() == -1?
Post: Date.now() == 0 or -1?


I think the best solution is simply going for Math.floor() every time.
So -0.1 -> -1, -1.1 -> -2, 1.1 -> 1, 0.1 -> 0 etc.

fatso83 added a commit to fatso83/lolex that referenced this issue Oct 5, 2018
fatso83 added a commit to fatso83/lolex that referenced this issue Oct 5, 2018
@fatso83
Copy link
Contributor

fatso83 commented Oct 6, 2018

In a conversation with Morgan we agreed on

  • always flooring the values (using Math.floor())
  • not allowing negative ticks

That made an implementation much simpler and easier to reason about, but I hit a snag with regards to handling the one test that deals with hrtime not being affected by calls to setSystemTime(), which I had not thought of (tgf regression tests!). This requires keeping two representations of time (which is why we had both hrNow and now: one linear and one civil time, both keeping up with nanosecond remainders. I have tried dealing with it in an alternate branch, but it turned out into a hard-to-reason mess, as the logic that was locally contained started spreading around. Suddenly I needed to understand every case of updateHrTime, and I didn't ... I think I need to let this rest for a while, but feel free to check it out and/or play with/build upon the two branches.

@mroderick
Copy link
Member

Perhaps we should spend some time refactoring the affected parts of the code, write better tests for the existing code ... all to make it easier to work with?

@benjamingr
Copy link
Member

I'm absolutely fine with even throwing on negative ticks, I had no idea it was supported to begin with.

@fatso83
Copy link
Contributor

fatso83 commented Oct 6, 2018

I managed to fix the remaining test in a simple manner, so I'll submit a PR shortly. I'll add a test for throwing on negative ticks as well.

fatso83 added a commit to fatso83/lolex that referenced this issue Oct 6, 2018
hrNow was an internal detail that was exposed for no
apparent reason. Its content is available through clock.hrtime()
and/or clock.performance.now().
@fatso83
Copy link
Contributor

fatso83 commented Oct 6, 2018

@novemberborn I think #214 should itch your scratch. Would you mind checking it out and seeing if it solves your issues? See the "Verifying" section for how to test this version locally.

fatso83 added a commit to fatso83/lolex that referenced this issue Oct 6, 2018
fatso83 added a commit to fatso83/lolex that referenced this issue Oct 6, 2018
hrNow was an internal detail that was exposed for no
apparent reason. Its content is available through clock.hrtime()
and/or clock.performance.now().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants