Fix Instant/Duration math precision & associativity on Windows #57380
tl;dr Addition and subtraction on Duration/Instant are not associative on Windows because we use the perfcounter frequency in every calculation instead of only when we measure time.
This is my first contrib (PR or Issue) to Rust, so please lmk if I've done this wrong. I followed CONTRIBUTING to the extent I could given my system doesn't seem to be able to build the compiler with changes in the source tree. I also asked about this issue in #rust-beginners a week or so ago, before digging through libstd -- I'm unsure if there's a good way to follow up on that, but I'd be happy to update the docs on the timing structs if this fixes the problem.
On Windows (and I think macOS too), however, our math gets messy because the Instant type stores the current point in time in units of HPET Performance Counter counts, not nanoseconds, and does unit conversions on every math operation rather than only when we measure the time from the system clock.
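To make the failure mode concrete, here is a minimal sketch of a tick-backed instant. It is a toy model, not the libstd implementation: `TickInstant`, `FREQ_HZ`, and the 10 MHz frequency are all assumptions chosen for illustration. It shows how converting a Duration to ticks on every operation truncates, so the grouping of additions changes the result.

```rust
use std::time::Duration;

// Hypothetical counter frequency: 10 MHz, i.e. one tick every 100 ns.
const FREQ_HZ: u64 = 10_000_000;
const NANOS_PER_SEC: u64 = 1_000_000_000;

/// Toy stand-in for an instant stored in counter ticks (not the real libstd type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct TickInstant {
    ticks: u64,
}

impl TickInstant {
    /// Adds a duration by converting it to ticks first; the division truncates.
    /// (Fine for the small durations used here; large ones would overflow.)
    fn add(self, d: Duration) -> TickInstant {
        let ticks = (d.as_nanos() as u64) * FREQ_HZ / NANOS_PER_SEC;
        TickInstant { ticks: self.ticks + ticks }
    }

    /// Elapsed time since `earlier`, converted back from ticks to nanoseconds.
    fn since(self, earlier: TickInstant) -> Duration {
        Duration::from_nanos((self.ticks - earlier.ticks) * NANOS_PER_SEC / FREQ_HZ)
    }
}

fn main() {
    let start = TickInstant { ticks: 1_000 };

    // 150 ns becomes 1 tick, which reads back as 100 ns.
    let d = Duration::from_nanos(150);
    assert_eq!(start.add(d).since(start), Duration::from_nanos(100));

    // Each 50 ns step truncates to 0 ticks, but their 100 ns sum is 1 tick.
    let half = Duration::from_nanos(50);
    assert_ne!(start.add(half).add(half), start.add(half + half));
}
```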
I tried this code:
On UNIX machines (Linux and macOS) it behaves as you'd expect -- no crash.
On Windows hosts, however, it blows up because of a precision problem in the Instant +/- Duration math:
On Windows I think this is a consequence of doing the HPET-PerfCounter-Unit conversion on each math operation. I suspect the reason it works on Macs is that (from what I could find online) most Apple machines report timing in nanoseconds anyway. For anyone interested, the equivalent functions on macOS, with a little work to fish out the numerator/denominator from a timebase struct:
If this PR ends up working as I expect it to when CI runs the tests, I can make the same changes to the macos implementation.
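As a rough illustration of the numerator/denominator conversion mentioned above, the sketch below scales raw ticks to nanoseconds. `Timebase` and `ticks_to_nanos` are hypothetical names for this example, not the macOS or libstd definitions, and the 125/3 ratio is an arbitrary illustrative value.

```rust
/// Hypothetical mirror of the numerator/denominator pair in a timebase struct.
struct Timebase {
    numer: u32,
    denom: u32,
}

/// Scales raw ticks to nanoseconds as ticks * numer / denom (truncating).
fn ticks_to_nanos(ticks: u64, tb: &Timebase) -> u64 {
    // Widen to u128 so the multiplication cannot overflow.
    (ticks as u128 * tb.numer as u128 / tb.denom as u128) as u64
}

fn main() {
    // A 1/1 timebase means ticks already are nanoseconds, so nothing is lost.
    let one_to_one = Timebase { numer: 1, denom: 1 };
    assert_eq!(ticks_to_nanos(12_345, &one_to_one), 12_345);

    // With a non-trivial ratio, a single tick is truncated to a whole nanosecond.
    let ratio = Timebase { numer: 125, denom: 3 };
    assert_eq!(ticks_to_nanos(3, &ratio), 125);
    assert_eq!(ticks_to_nanos(1, &ratio), 41);
}
```

With a 1/1 timebase the conversion is the identity, which would be consistent with the observation that the problem does not show up on those machines.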
We ought to be able to sort this out by storing nanoseconds rather than PerfCounter units, so that intermediate math is done in the most precise units we support and we only do unit conversions when we actually measure the system clock (it might even translate to a small perf gain for people doing tons of Instant/Duration math).
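A minimal sketch of that idea, under stated assumptions: `NanoInstant` and `from_counter` are hypothetical names for this example and are not the code in this patch. The counter frequency is used exactly once, when a raw reading is converted; after that, all math is plain Duration arithmetic.

```rust
use std::time::Duration;

const NANOS_PER_SEC: u64 = 1_000_000_000;

/// Toy instant stored as time since an arbitrary origin (not the real libstd type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct NanoInstant {
    since_origin: Duration,
}

impl NanoInstant {
    /// The only place counter units appear: a raw reading is converted once.
    fn from_counter(ticks: u64, freq_hz: u64) -> NanoInstant {
        // Split into whole seconds and a remainder to keep the intermediate
        // product small for realistic counter frequencies.
        let secs = ticks / freq_hz;
        let rem = ticks % freq_hz;
        let nanos = rem * NANOS_PER_SEC / freq_hz;
        NanoInstant { since_origin: Duration::new(secs, nanos as u32) }
    }

    /// Exact addition: no unit conversion is involved.
    fn add(self, d: Duration) -> NanoInstant {
        NanoInstant { since_origin: self.since_origin + d }
    }

    /// Exact difference; panics if `earlier` is actually later than `self`.
    fn since(self, earlier: NanoInstant) -> Duration {
        self.since_origin - earlier.since_origin
    }
}

fn main() {
    // 25,000,000 ticks at a hypothetical 10 MHz is 2.5 seconds.
    let start = NanoInstant::from_counter(25_000_000, 10_000_000);
    assert_eq!(start.since_origin, Duration::from_millis(2_500));

    // Round trips and regrouped additions now agree exactly.
    let d = Duration::from_nanos(150);
    assert_eq!(start.add(d).since(start), d);
    let half = Duration::from_nanos(50);
    assert_eq!(start.add(half).add(half), start.add(half + half));
}
```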
I believe this will address the underlying cause of #56034, and make the guessed epsilon constant from #56059 unnecessary. If it's of interest, I can write up how these timing types work on the tier 1 platforms to address #32626 as well, since I'm already in here figuring it out.
To that end, I've got this patch, which I think should fix it on Windows, but I'm having trouble testing it -- any time I change anything in libstd I start getting this error, which no amount of clean building seems to resolve:
Since you wrote the linked PRs and might remember looking at related problems:
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.
If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.
Please see the contribution instructions for more information.
Thanks so much for the thorough PR and description! I've been feeling for some time now that we probably want to just switch to a bland
One comment I'd have is that we may want to consider the esoteric use case of moving very far back in time. If you take a huge duration and subtract it from the current
It would definitely be of interest! We'd be more than happy to help take a look at some docs here.
cc @sfackler, I think you've had thoughts on the implementations of these APIs in the past, just wanted to make sure you didn't have any objections to this either!
Thanks for the quick turnaround!
I got a fresh shiny Windows VM going after work today, and I can finally get the test suite to run, albeit a bit slowly.
I believe you're correct about the potential for a loss of range with this change as it is today, at least on some machines. Given a pair of Windows hosts, if one is giving us PerfCounter increments in nanoseconds and another is giving us microseconds, the precision of our math, and the extent of time we can represent, has to vary accordingly.
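A back-of-the-envelope sketch of how range depends on the counter unit, assuming a signed 64-bit count and 365-day years. `years_in_i64_ticks` is a hypothetical helper for this arithmetic, not part of the patch.

```rust
const SECS_PER_YEAR: u64 = 365 * 24 * 60 * 60;

/// Years representable by a signed 64-bit tick count at `freq_hz` ticks per second.
fn years_in_i64_ticks(freq_hz: u64) -> u64 {
    (i64::MAX as u64) / freq_hz / SECS_PER_YEAR
}

fn main() {
    // Nanosecond ticks: roughly 292 years in either direction.
    assert_eq!(years_in_i64_ticks(1_000_000_000), 292);

    // Microsecond ticks: a thousand times coarser, a thousand times more range.
    assert_eq!(years_in_i64_ticks(1_000_000), 292_471);
}
```

So a single 64-bit count of nanoseconds covers under three centuries, while the same count in microseconds covers hundreds of millennia at a thousandth of the resolution.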
If you'd like to see one, I can come up with a concrete example of a
Personally I'm leaning toward moving the underlying type to a
I think this probably isn't much of a breaking change since we already don't promise anything about the
I think I personally feel that this is the best way to go now. Using a second/nanosecond counter (even if it's not precisely
To be clear I would be worried about something like representing the date 1500 in
I'm not really sure how well this is supported today, however, nor whether it's really a case that comes up that often. I believe for
In any case starting with Windows sounds like a good idea, and we can probably just copy all the code into
Checking in -- @sfackler did you get a chance to peek at this PR / do you have an opinion about using a Duration as the backing type for Instant?
@retep998 - it's certainly possible, although since 1/ns = 1 GHz, it seems unlikely to me that we'd have a clock much more precise, given the speeds CPUs operate at... food for thought.
Sorry that took me a little while to get to -- my usual Windows machine is still failing to build the test suite (I suspect I've misconfigured Visual Studio at some point along the way).
Regardless, the last commit on here reworks this PR to use a Duration as the backing type, which simplifies the code a fair bit. If this looks good overall, I'll stack on another commit to reduce the perf_counter module interface to a couple querying functions, and cut another PR for applying the same change to macos.
I'm not sure what's up with that Travis failure -- looking at the logs, it's showing a failure referring to code that doesn't exist in the current HEAD:
Any chance it's got some wonky caching going on?
Looks great to me!
It's probably worthwhile having a small comment in