-
Notifications
You must be signed in to change notification settings - Fork 24.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use System.nanoTime for elapsed time #11058
Conversation
There's a bug for this http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6458294 also, so we should make sure we don't rely on it monotonically increasing anywhere in the code. |
@@ -400,16 +402,18 @@ public synchronized long time() { | |||
if (time >= 0) { | |||
return time; | |||
} | |||
return Math.max(0, System.currentTimeMillis() - startTime); | |||
long t = TimeValue.nsecToMSec(System.nanoTime()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the assignment to t
here? All of the other replacements didn't use any temporary variables (just curious)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woops, leftover from debugging; I'll put it back (no assignment to t)
Left some minor comments, but this looks good. That |
Yeah, timestamps are really messy :) In general we should never rely on them for anything "important". Since I'm replacing System.currentTimeMillis() here, ES should already not be relying on a monotonic clock... nanoTime should only be an improvement since it "tries harder" to be monotonic (JVM falls back to currentTimeMillis if monotonic clock isn't available). |
The TODO is really scary ... if clock jumps backwards by an hour suddenly writing legacy checksums can spin in a busy loop for an hour before finishing! But: I think this code may be dead in master? Do we really ever call LegacyChecksums.writeChecksums after opening an ancient index? |
Really I'd love to ban the System.currentTimeMillis API (Rob's idea) but in this PR I've only addressed maybe 25% of our usage because we expose the timestamps through public APIs in the other ~75% cases. |
Thanks for the feedback @dakrone, I pushed another commit folding it in. |
LGTM About whether we write the checksums, it looks like we do call it when restoring an older version of a shard from a snapshot as well as when we clean files at the end of phase1 recovery. Maybe we can just use |
OK, I just pushed commit to use Math.max(lastVersion+1, System.currentTimeMillis()) |
System.currentTimeMillis is more vulnerable when the clock shifts. Closes #11058
I'm running a "power loss" ES test and noticed this strange log line from ES after reboot:
The [4h] is completely wrong ... the recovery only took a minute or two. Digging into it, I noticed that for some reason, on my Fedora Linux box, System.currentTimeMillis suddenly jumps 4 hours into the future, thus messing up this log line. This is reproducible...
I think when we want to measure elapsed time we should use System.nanoTime, unless it's in some crazy hot spot (nanoTime is somewhat more costly). currentTimeMillis pulls from the local clock and timezone, which can easily change... whereas nanoTime, on modern JVM/OS/hardware, should only monotonically increase.
When I changed RecoveryState.time to use nanoTime instead, the wrong [4h] output is fixed.
We use System.currentTimeMillis in a number of places in ES, and unfortunately I was only able to fix a subset of them in this PR because in the other places the timestamp is part of the API and I'm not sure if someone "relies" on that timestamp having come from currentTimeMillis...