time: use monotonic clock to measure elapsed time #12914

Closed
tsuna opened this Issue Oct 13, 2015 · 154 comments

Comments

Projects
None yet
Contributor

tsuna commented Oct 13, 2015

Go's standard library doesn't provide any API to access a monotonic clock source, which means one can't reliably time how long an operation takes without copy-pasting or importing some low-level platform-dependent code such as the clock package at https://github.com/davecheney/junk.

This API doesn't necessarily need to return a time.Time, just returning a number of nanoseconds as a uint64 would be enough, kinda like Java's System.nanoTime().

Contributor

adg commented Oct 13, 2015

Do we have a monotonic clock source we can export?

If not, why should this be in the standard library and not just an
importable package?

On 13 October 2015 at 12:42, Benoit Sigoure notifications@github.com
wrote:

Go's standard library doesn't provide any API to access a monotonic clock
source, which means one can't reliably time how long an operation takes
without copy-pasting or importing some low-level platform-dependent code
such as the clock package at https://github.com/davecheney/junk.

This API doesn't necessarily need to return a time.Time, just returning a
number of nanoseconds as a uint64 would be enough, kinda like Java's
System.nanoTime()
http://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime--
.


Reply to this email directly or view it on GitHub
#12914.

Contributor

tsuna commented Oct 13, 2015

Why should this be in the standard library: because it's a pretty fundamental thing, especially in a language geared towards systems programming. I see a lot of people timing how long operations take using time.Now() and subtracting two time.Time objects, and I always end up saying "this is not a correct way of measuring how much time elapsed" but don't have a clear alternative to propose.

Come on, even Java has it.

Contributor

adg commented Oct 13, 2015

I see a lot of people timing how long operations take using time.Now() and subtracting two time.Time objects, and I always end up saying "this is not a correct way of measuring how much time elapsed"

Got any concrete examples?

Contributor

tsuna commented Oct 13, 2015

Any concrete examples of ..? People using time.Now() or it being wrong because it doesn't take into account changes made to the system time?

Contributor

adg commented Oct 13, 2015

I suppose what I'm getting at is this: If there's a broad need for such a package, why isn't there one that people are already using?

This API doesn't necessarily need to return a time.Time, just returning a number of nanoseconds as a uint64 would be enough, kinda like Java's System.nanoTime().

time.Duration happens to represent a 64-bit quantity nanoseconds, and, unless there's some large problem with it, should be favored as the return type for any monotonic duration handling functions.

rakyll added this to the Unplanned milestone Oct 13, 2015

Contributor

taruti commented Oct 21, 2015

I think there are two things:

A) A proper monotonic clock package (there are multiple 3rd party ones)

B) Measuring elapsed time on systems where system time may leap to either direction

B is simpler. Either ensure and document that time.Since works
even with clock shifts or define new API like:

type MeasureStart struct { ... }
func (*MeasureStart)Start()
func (*MeasureStart)Elapsed() time.Duration
Member

crawshaw commented Oct 23, 2015

On what platforms is time.Now() not monotonic? For linux/amd64, time.Now is implemented by runtime·nanotime, which is calling clock_gettime (via VDSO) with CLOCK_MONOTONIC. I see similar calls being used on linux/arm, openbsd, freebsd, etc.

Instead of a new interface, can we document time.Now as monotonic?

Contributor

ianlancetaylor commented Oct 23, 2015

@crawshaw time.Now is not monotonic on any platform that supports settimeofday. On amd64 GNU/Linux time.Now does not actually use CLOCK_MONOTONIC, it uses CLOCK_REALTIME. It's runtime.nanotime that uses CLOCK_MONOTONIC.

Contributor

ianlancetaylor commented Oct 23, 2015

Sorry, I somehow missed that you mentioned nanotime yourself. That's not what implements time.Now, though, that is implemented by time·now.

Member

crawshaw commented Oct 23, 2015

Ah right, thanks. I got confused by //go:linkname time_runtimeNano time.runtimeNano, which I misread as implementing time.Now

Member

alexbrainman commented Oct 23, 2015

time.Now() is monotonic on windows. We use undocumented way to fetch time, but it is monotonic as far as I know. @dvyukov to confirm.

Alex

Contributor

tsuna commented Oct 23, 2015

On Mon, Oct 12, 2015 at 9:07 PM, Andrew Gerrand wrote:

I suppose what I'm getting at is this: If there's a broad need for such a package, why isn't there one that people are already using?

Because people incorrectly assume that time only moves forward? It's an extremely common mistake that even at Google we were trying to hammer in people's head. When you see the large impact that things such as leap seconds have (including at Google), it's easy to see this is a well known problem but also a recurring problem.

How is it that a language geared towards systems programming that comes with such a rich standard library offers no standard way of measuring time correctly?

Look at gRPC, every single use of time.Now() there is buggy because it implicitly assumes that time only moves forward and at a constant pace, neither of which is true for CLOCK_REALTIME. Pretty much any distributed systems code that uses time.Now() other than for printing the current time is subtly buggy. And there is a lot of it. And it's there chiefly because there is no alternative in the standard library to get monotonic time so people don't even think about it.

Contributor

rsc commented Oct 24, 2015

How is it that a language geared towards systems programming that comes with such a rich standard library offers no standard way of measuring time correctly?

I expect that if you care that much about time you will run your system clocks correctly. On a well-run system, time does only move forward and at a constant (enough) pace.

When you see the large impact that things such as leap seconds have (including at Google), it's easy to see this is a well known problem but also a recurring problem.

I don't understand this comment. Leap seconds don't exist at Google.

It is true that if you use time.Now to measure time during a leap smear then your measurements may be off by approximately 0.001%. But time will still be monotonic.

rsc changed the title from runtime: Add a new API to access a monotonic clock source to runtime: time: expose monotonic clock source Oct 24, 2015

Member

dvyukov commented Oct 24, 2015

I expect that if you care that much about time you will run your system clocks correctly. On a well-run system, time does only move forward and at a constant (enough) pace.

How about programs that run on client machines? Or server software that you sells to clients?

Member

dvyukov commented Oct 24, 2015

time.Now() is monotonic on windows. We use undocumented way to fetch time, but it is monotonic as far as I know. @dvyukov to confirm.

I don't think we return monotonic time from time.Now on windows. This would be a bug, because time.Now should return system time which can be changed by user.
The shared page contains InterruptTime, this time is monotonic and we use it for runtime.nanotime; and SystemTime, this time is not monotonic and we use for time.Now.

Member

alexbrainman commented Oct 24, 2015

What Dmitry said.

Alex

Contributor

tsuna commented Oct 25, 2015 edited

On Fri, Oct 23, 2015 at 9:50 PM, Russ Cox wrote:

I don't understand this comment. Leap seconds don't exist at Google.

I was working just down the hall from where you were in B43 in 2008, when the leap second smearing scheme was devised, because everybody remembered all too well how all hell broke loose for the previous leap second.

I'm not quite understanding, why the time and net package can get access to a monotonic clock, through runtime package, but nothing else can. Surely it would be beneficial to allow others to write packages that have similar capabilities to the time and net standard libraries? It seems the protections of the monotonic clock are currently limited to the standard library and inaccessible to all other libraries.

I can understand the arguments about clocks should never go backwards but the fact that time and net is already using monotonic time, in my opinion, means that discussion was had a long time ago and the argument for monotonic time won out. It's unfortunate it's been kept inaccessible to everyone else.

Is it feasible and would it be acceptable to expose a runtime.NanoTime? That name is likely misleading though but it seems reasonable to me. (https://github.com/golang/go/blob/master/src/runtime/time.go#L296)

@Arista-Jenkins Arista-Jenkins pushed a commit to aristanetworks/goarista that referenced this issue May 2, 2016

@tsuna tsuna atime: New package to provide access to a fast monotonic clock source.
This package fills in the gap of the Go standard library discussed in
golang/go#12914.

Change-Id: Icefcb17a11a7061be369318b50ca0cd55c53e7da
46272bf
Contributor

tsuna commented May 2, 2016

For those who need a solution to this problem, here's a way to expose Go's own runtime.nanotime(), which avoids the hassle of reimplementing an Nth solution that is fast & cross-platform: aristanetworks/goarista@46272bf

@bradfitz bradfitz modified the milestone: Go1.8, Unplanned Aug 10, 2016

quentinmit added the NeedsFix label Oct 11, 2016

Contributor

rsc commented Nov 8, 2016

Too late for new API in Go 1.8.

@rsc rsc modified the milestone: Go1.9Early, Go1.8 Nov 8, 2016

Contributor

rsc commented Nov 8, 2016

Note that we don't actually have monotonic time on all systems. Right now it is best effort only (on systems where we have it, the runtime uses it).

tv42 commented Dec 20, 2016

For what it's worth, here's a package that exposes a monotonic clock as time.Duration: https://github.com/gavv/monotime

amluto commented Jan 1, 2017

As someone who has made this exact mistake before, returning Duration is wrong. The function should return a MonotonicTime (or whatever you want to call it, but the key is that it should be a distinct type) and subtracting two MonotonicTimes should give a Duration.

With my Linux hat on, CLOCK_MONOTONIC doesn't return elapsed time since any particular epoch. It returns an arbitrary number that only has any significance when you subtract two such values.

liaoarden commented Jan 1, 2017 edited

time.Duration happens to represent a 64-bit quantity nanoseconds, and, unless there's some large problem with it, should be favored as the return type for any monotonic duration handling functions.

The largest problem with that is a time.Duration is not meant to just represent a fixed instant timepoint (which is what time.Now() returns), but a 64-bit quantity of nanoseconds that is the interval between two instant time points. Returning a Duration muddles that it isn't actually a duration but a fixed instant point in time.

There was a report at https://blog.cloudflare.com/how-and-why-the-leap-second-affected-cloudflare-dns/ that time.Now() ran backwards during the recent leap second, and that using Monotonic Time would have avoided a bug. While it is true that Monotonic Time would have been a good solution, there is still no reason for time.Now() to run backwards. I suspect the implementation of time.Now() does not handle leap seconds correctly. See https://commons.wikimedia.org/wiki/File:Avoid_Using_POSIX_time_t_for_Telling_Time.pdf for my paper on how handle time correctly by not using time_t.

amluto commented Jan 2, 2017

I agree that the current APIs suck. If someone proposes a nice API for asking "what time is it" that is efficiently implementable in a vDSO, is efficiently usable, and can safely return "I don't know what time it is", I'd be all for it. Sadly, adjtimex() is none of the above.

I think such an API would have to return "I don't know" unless new userspace is in use. Otherwise it'll just lie if smeared leap seconds are in use.

I would personally favor returning TAI and the UTC-TAI offset.

But this is totally off topic. Go should expose monotonic time.

Contributor

tsuna commented Jan 2, 2017 edited

The fact that an organization like CloudFlare, a prominent Go user and contributor, had a publicly visible outage due to this, should be a wake up call for the Go community.

While there is always a possibility that the engineers at CloudFlare would have made that mistake anyway, the truth is that this mistake is incredibly easy to make in Go because there is no standard alternative, and because of this it's a mistake that is widespread in a lot of prominent open-source Go projects (I mentioned gRPC as another example earlier), which further exacerbates the problem.

Go itself doesn't use time.Now() for its own bookkeeping needs, why force everybody to re-implement the platform-specific details behind runtime.nanotime()? For a systems programming language, isn't this need fundamental enough to warrant an API in the standard library?

edit: or using a hack like the one we published to expose runtime.nanotime() at https://github.com/aristanetworks/goarista/tree/master/monotime

kevinburke referenced this issue in Shyp/rickover Jan 2, 2017

Open

Library can measure negative times #9

Just exposing runtime.nanotime() doesn't fix the problem with SetDeadline(). Maybe time.MonotonicNow() could return a time.Time with a dedicated location to enable people using SetDeadline() safely.

sandys commented Jan 2, 2017

These lines from scala are particularly interesting - https://github.com/scala/scala/blob/2.12.x/src/library/scala/concurrent/SyncVar.scala#L42-L43

// nanoTime should be monotonic, but it's not possible to rely on that.
// See http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6458294.
Member

dsnet commented Jan 2, 2017 edited

I find it dirty for any solution to create a new type (e.g., MonotonicTime). The API for this could easily just create an epoch by grabbing the "monotonic time" at the start of the Go program and henceforth just return values relative to that epoch.

In fact, the API for this could just be a function func Elapsed() time.Duration that just returns the number of nanoseconds since the program start and will guaranteed to be monotonic and constant-rate.

An API like this has the benefits:

  • Elapsed has a well-defined and useful meaning (nanoseconds since program start) as opposed to other suggestions that just return an arbitrary value.
  • Use of Elapsed is a better alternative to time.Now().Sub(before), which is buggy and non-performance. In fact, a large number of CPU cycles is wasted in calling Sub. See #17858. Computing the difference in time with Elapsed requires no method calls.
  • Having Elapsed return a time.Duration just makes other time-related arithmetic easier.

amluto commented Jan 2, 2017

@dsnet: I think you're missing the point of a new type. Suppose I write a function that takes a "timeout" parameter. With a good typing model, it cannot be misused: a Duration would be a relative timeout and a Time or TAITime or MonotonicTime or similar would be an absolute timeout. Having the duration since the start of the program is barely better than using a plain integer -- the type gives no protection against misuse or incorrect refactoring.

C++11 gets this right.

mj1856 commented Jan 3, 2017 edited

Hi. Just to chime in here, any API that recommends using the computer's realtime clock for measuring elapsed durations is flawed. For example in the Go docs, it suggests here to subtract two local time values acquired via time.Now(). Those are clock values. The computer's clock is simply not precise enough, nor stable. The time can be literally altered by the user directly on most platforms, and can be synchronized via NTP and other mechanisms. Plus, it can drift on its own, and can be affected by drift-correction algorithms in the OS on some platforms. It's just not a good choice.

In other languages, such as .NET (as an example), there is a distinctly separate API for this, which is the Stopwatch class in the System.Diagnostics namespace. One starts a stopwatch, stops it, and evaluates the elapsed time. Under the hood, this uses the QueryPerformanceCounter (QPC) API on Windows. QPC is an abstraction over the processor's Time Stamp Counter (TSC) hardware that takes into account the challenges of multi-core machines. It does NOT use the same hardware as reading the time of day from the realtime clock.

For Windows, there is an excellent article with lots of supporting details here. IIRC, Linux and OSX platforms also have this functionality, which is provided through the CLOCK_MONOTONIC value of the clock_gettime function.

As far as I can tell, Go doesn't appear to expose this functionality in any way.

Contributor

tsuna commented Jan 3, 2017

Heh, I find it pretty telling that the example in the standard library is wrong. If anything, at least this example should be removed, or specifically called out as an example to not follow, even if the Go team doesn't agree to exposing something like runtime.nanotime() as part of the standard library.

tsuna referenced this issue in aristanetworks/goarista Jan 4, 2017

Closed

How does the 'monotime' package work? #13

mvpmvh commented Jan 4, 2017

this is closed? i read all of this but missed the resolution. what was the resolution??

Member

dsnet commented Jan 4, 2017

An issue in the aristanetworks/goarista repo was closed. This issue on the golang/go repo remains open.

joshlf commented Jan 6, 2017

As for the API issue, I think it should be quite straightforward. Simply add something like a NowMonotonic function that returns a time.Time that has the same guarantees as Linux's CLOCK_MONOTONIC - it progresses at the same rate as normal time, but does not necessarily have any particular starting value.

As an example of this, in a personal project of mine, I needed monotonic time to implement a timer feature. Here's the code where I implement it. Using the resulting time.Time values is very straightforward - you just treat them as you would normal time.Time values. The only complication is that you have to remember not to treat Now and NowMonotonic as the same source, but that's a perfectly normal sort of responsibility to put on the user of a library.

For example, one of the consumers of that timeout package needs to take in time.Time values which are in the standard reference frame used by Now, figure out what duration is left before that time will occur, and then figure out the equivalent time in the NowMonotonic reference frame for use with the timeout package. This code is very straightforward, and essentially boils down to this function:

// Converts a time.Time obtained using time.Now to a roughly-equivalent time
// in the space used by timeout.NowMonotonic.
func timeToMonotonic(t time.Time) time.Time {
	diff := t.Sub(time.Now())
	return timeout.NowMonotonic().Add(diff)
}

This doesn't address the concerns about other elements of the time package, but it does address the question of what the API should look like - the same basic time.Time type should be used, and functions should be marked with which reference frame they accept or produce.

Contributor

jayschwa commented Jan 6, 2017

I like @dsnet's proposal for func Elapsed() time.Duration. It seems like the lightest solution for both API and implementation.

Just exposing runtime.nanotime() doesn't fix the problem with SetDeadline(). Maybe time.MonotonicNow() could return a time.Time with a dedicated location to enable people using SetDeadline() safely.

SetDeadline() consumes time.Time, but only uses it to calculate a delta with respect to the monotonic clock: https://github.com/golang/go/blob/master/src/net/fd_poll_runtime.go#L125-L126

Overloading time.Time to contain both wall or monotonic time could be confusing and will probably require more changes. Code that uses time.Time will need additional checks for the special "monotonic" location. And many of time.Time's methods (e.g. formatting a date string) wouldn't make sense with a monotonic value.

Contributor

cespare commented Jan 6, 2017

@jayschwa

SetDeadline() consumes time.Time, but only uses it to calculate a delta with respect to the monotonic clock:

If I compute a deadline (using, say, time.Now().Add(time.Second)) and then the clock changes before SetDeadline calls time.Until, doesn't that still cause an issue?

joshlf commented Jan 6, 2017

If I compute a deadline (using, say, time.Now().Add(time.Second)) and then the clock changes before SetDeadline calls time.Until, doesn't that still cause an issue?

Yes, but I think it should be considered OK - once the monotonic clock delta is computed, everything is fine. Before that point, you're still working with a non-monotonic clock, so if the clock changes under your feet, that's OK because that's how the clock is defined to behave. The point is that the call to SetDeadline will always be fine because, at worst, the deadline will be in the past and so the deadline will be considered to already have fired.

If a user changes their clock one hour back, the deadline will be fired one hour too late. Moreover, firing a deadline too early can also be problematic if this is translated to an error.

joshlf commented Jan 6, 2017

I like @dsnet's proposal for func Elapsed() time.Duration. It seems like the lightest solution for both API and implementation.

I like this in theory, but I think a big question here is: what does this do to the net.Conn interface? Currently the deadline methods take time.Time values, but if we want to add the ability for them to take monotonic time values as well, we'd need to add extra methods such as SetDeadlineMonotonic, which accepts a time.Duration from Elapsed. This in turn could break third-party code that implements the net.Conn interface but does not provide SetDeadlineMonotonic and any other added methods.

Another concern with this approach is that there is lots of code that currently handles time.Time values, and would work just fine if, as in a different proposal, we instead added a MonotonicNow function that returned a time.Time with a "monotonic" location. However, that code would have to be duplicated or rewritten if we now also represent "the current time" as a time.Duration value. This, to me, is probably the biggest concern.

Owner

bradfitz commented Jan 6, 2017

Breaking the go1compat promise is out of the question, so changing net.Conn is not an option. That does seem to vote in favor of a magic time.Time location, of the options proposed so far.

joshlf commented Jan 6, 2017

@bradfitz To play devil's advocate for a second, I seem to remember that the SetDeadline* methods were not in net.Conn in go1.0 and were added later? Am I misremembering?

Owner

bradfitz commented Jan 6, 2017

@joshlf, you misremember. They were changed prior to Go 1. See $GOROOT/api/*.txt:

https://raw.githubusercontent.com/golang/go/master/api/go1.txt

That was what was in Go 1.

joshlf commented Jan 6, 2017

@bradfitz Gotcha.

Contributor

jayschwa commented Jan 6, 2017

If a user changes their clock one hour back, the deadline will be fired one hour too late.

@vincentbernat, that shouldn't be the case. When SetDeadline is called, the implementation immediately computes the deadline in terms of the monotonic clock. As others have pointed out, the main risk is if the clock changes between the input parameter being calculated and the call.

joshlf commented Jan 6, 2017

If a user changes their clock one hour back, the deadline will be fired one hour too late.

@vincentbernat, that shouldn't be the case. When SetDeadline is called, the implementation immediately computes the deadline in terms of the monotonic clock. As others have pointed out, the main risk is if the clock changes between the input parameter being calculated and the call.

Also, that's already the case. Adding a monotonic clock just makes it possible for users to avoid this situation. This isn't a drawback of any monotonic clock proposal.

Contributor

rsc commented Jan 6, 2017

CLOCK_MONOTONIC represents time since some unspecified epoch. There is no canonical conversion function for converting an arbitrary monotonic time to civil time. The monotonic time "5 seconds ago" might have been two hours ago if your laptop was closed. It doesn't make sense to reuse time.Time for this. On the contrary, we should keep time.Time as far away as possible.

There needs to be:

  • a function time.F1 to get the current monotonic time of type T1
  • a way to subtract two T1 to produce a time.Duration
  • a way to add a T1 and a time.Duration to produce a new T1
  • a way to compare two T1s for ordering

Anything I am missing?

One option is T1 is time.Duration, but you lose a bit of type safety in add and subtract that can catch logic bugs.

A likely better option is T1 is an opaque value type, with

type T1 struct { unexported fields }
func F1() T1
func (T1) Add(Duration) T1
func (T1) Sub(T1) Duration
func (T1) Before(T1) bool
// T1 can be compared for equality
func (T1) After(T1) bool

amluto commented Jan 6, 2017

Owner

bradfitz commented Jan 6, 2017

@rsc, you didn't address the net.Conn SetDeadline design question. That's why people are discussing shoehorning it into time.Time.

joshlf commented Jan 6, 2017

@rsc, you didn't address the net.Conn SetDeadline design question. That's why people are discussing shoehorning it into time.Time.

That's part of it, but also I think that having to rewrite lots of code is another downside. There's a lot of code that handles time.Time values in a way that doesn't assume anything about the clock that those values came from, and all of that code can be left as-is if we use time.Time to represent time in a monotonic clock.

Contributor

ianlancetaylor commented Jan 6, 2017

@joshlf It does not make sense to speak of a monotonic clock as being associated with a specific time, so can you give an example of what kind of code you are thinking of? A monotonic clock is really only meaningful for measuring the interval between a start time and an end time without getting confused by changes to the system's notion of the current time.

Since we now like the Context type, I think the right way to handle deadlines would be to change context.WithTimeout to use monotonic time rather than system time. That would keep the same API and shouldn't break any real uses. Then make sure that everything that uses a deadline can use a Context instead.

joshlf commented Jan 6, 2017

@joshlf It does not make sense to speak of a monotonic clock as being associated with a specific time, so can you give an example of what kind of code you are thinking of? A monotonic clock is really only meaningful for measuring the interval between a start time and an end time without getting confused by changes to the system's notion of the current time.

I disagree - I think it makes sense so long as you accept that the time you're given is not guaranteed to have a particular starting time. Just as "it's currently 1:59 pm on Jan 6, 2017" really means "it's x seconds since this arbitrary time we've deemed as the year 0", a similar statement about a monotonic clock - "it's currently time t" really means "it's t seconds since some arbitrary time" (which might be the beginning of the OS booting, the process starting, etc). A monotonic time behaves in all of the same ways as a normal time, only the "0" time is different. But we do that already - each different time zone has its own separate notion of when "0" happened.

By analogy, let's say that I write some code that takes a time and does something with it depending on what time zone it's in, and doing so requires some real-world knowledge about where that physical time zone is located, or why it was created, etc. Then, later, somebody comes along and adds a new time zone. Now my code has to say, "well, I don't know where that time is, so all I know about it is that it obeys certain rules that are common to all time zones, but other than that, I can't say anything about it." That's pretty much the exact same situation as with a monotonic clock, except that you have the added bonus of monotonicity.

Contributor

ianlancetaylor commented Jan 6, 2017

@joshlf It does not make sense to speak of a monotonic clock as being associated with a specific time, so can you give an example of what kind of code you are thinking of? A monotonic clock is really only meaningful for measuring the interval between a start time and an end time without getting confused by changes to the system's notion of the current time.

I disagree - I think it makes sense so long as you accept that the time you're given is not guaranteed to have a particular starting time. Just as "it's currently 1:59 pm on Jan 6, 2017" really means "it's x seconds since this arbitrary time we've deemed as the year 0", a similar statement about a monotonic clock - "it's currently time t" really means "it's t seconds since some arbitrary time" (which might be the beginning of the OS booting, the process starting, etc). A monotonic time behaves in all of the same ways as a normal time, only the "0" time is different. But we do that already - each different time zone has its own separate notion of when "0" happened.

I guess I don't understand the difference between what I said and what you said. A time.Time corresponds to an instant in time that everyone agrees on, and has useful features like conversion to year-month-day-hour-minute-second in a specific time zone. A monotonic time does not have that correspondence. So all you can do with it is measure intervals.

By analogy, let's say that I write some code that takes a time and does something with it depending on what time zone it's in, and doing so requires some real-world knowledge about where that physical time zone is located, or why it was created, etc. Then, later, somebody comes along and adds a new time zone. Now my code has to say, "well, I don't know where that time is, so all I know about it is that it obeys certain rules that are common to all time zones, but other than that, I can't say anything about it." That's pretty much the exact same situation as with a monotonic clock, except that you have the added bonus of monotonicity.

You are describing what a possible example might look like, but unfortunately that doesn't help me understand the concern. Can you give an actual example?

joshlf commented Jan 6, 2017

I guess I don't understand the difference between what I said and what you said. A time.Time corresponds to an instant in time that everyone agrees on, and has useful features like conversion to year-month-day-hour-minute-second in a specific time zone. A monotonic time does not have that correspondence. So all you can do with it is measure intervals.

Can't you do the same with a monotonic clock? It would instead be year-month-day-hour-minute-second since the start of the clock rather than from 0 CE, but it's the same idea, no?

You are describing what a possible example might look like, but unfortunately that doesn't help me understand the concern. Can you give an actual example?

I didn't mean that to be an example, but here's one. In general, I'm thinking of any code that does not rely on the absoluteness of a given time, but instead only relies on times to compute durations, to have a particular order, etc. An example would be benchmarking code (either the standard library's testing.B mechanism, or custom benchmarking code) - it only cares about using times to compute durations, and so a monotonic clock would be a fine drop-in. In fact, this might be desirable if the code being benchmarked modified the system time.

Contributor

ianlancetaylor commented Jan 6, 2017

Can't you do the same with a monotonic clock? It would instead be year-month-day-hour-minute-second since the start of the clock rather than from 0 CE, but it's the same idea, no?

But since the start of the clock is effectively unknown, there's no point. You may as well simply use time.Duration--nobody is proposing that monotonic clocks would not support time.Duration--which can already be displayed in a meaningful way.

As far as uses go, benchmark code is an interesting example. I think we can agree that it must change in some way. There is no way we can implement monotonic time source such that no change is required at all. And the changes required if we use a different type are trivial: you literally just change the type from time.Time to whatever type represents a monotonic time. You also use monotonictime.Now instead of time.Now. Everything else stays the same. And there is no reason to write the code to support either clock time or monotonic time--for benchmarking, monotonic time is the right choice. So I don't see that as a good example of why code might want to keep using time.Time.

joshlf commented Jan 6, 2017

As far as uses go, benchmark code is an interesting example. I think we can agree that it must change in some way. There is no way we can implement monotonic time source such that no change is required at all. And the changes required if we use a different type are trivial: you literally just change the type from time.Time to whatever type represents a monotonic time. You also use monotonictime.Now instead of time.Now. Everything else stays the same. And there is no reason to write the code to support either clock time or monotonic time--for benchmarking, monotonic time is the right choice. So I don't see that as a good example of why code might want to keep using time.Time.

That's fair. I know that POSIX does it this way - the clock methods are all the same, with the only difference being the use of the constant CLOCK_MONOTONIC and friends. Do you know if there's any C code that is intentionally agnostic so as to allow multiple different clocks?

Contributor

rsc commented Jan 7, 2017

@bradfitz I'm not worried about SetReadDeadline/SetWriteDeadline. Those use monotonic time internally already. If people use the idiom of SetDeadline(time.Now().Add(delta)), the window for problems is vanishingly small even today. It's a non-issue compared to the rest, and if that's the idiom then we can recognize it and fix it when it's time for breaking changes at some point.

joshlf commented Jan 7, 2017

@bradfitz I'm not worried about SetReadDeadline/SetWriteDeadline. Those use monotonic time internally already. If people use the idiom of SetDeadline(time.Now().Add(delta)), the window for problems is vanishingly small even today. It's a non-issue compared to the rest, and if that's the idiom then we can recognize it and fix it when it's time for breaking changes at some point.

Doing things that work 99% of the time strikes me as a really really bad idea... If the whole point of Go is to work at really large scales, then doing things that only break rarely is just not acceptable. That's especially true of things that will break more frequently if you don't use a particular idiom. In my mind, the complexity of an API that "works almost all the time so long as you use it in this particular way" goes in the face of Go's simplicity guarantees, especially compared to the ideal (and, in this case, completely attainable) API that "just works."

eswierk commented Jan 7, 2017

@joshlf Doing things that work 99% of the time strikes me as a really really bad idea... If the whole point of Go is to work at really large scales, then doing things that only break rarely is just not acceptable. That's especially true of things that will break more frequently if you don't use a particular idiom.

Exactly right. The "fixed" SetDeadline is in some ways worse than the broken one: now we have a vanishingly small probability of reproducing some mysterious hang due to a race with a system clock jump, virtually guaranteeing that the bug will surface only when the code is running in production at large scale.

The implementation of SetDeadline isn't the problem--the API is. It needs to take either a monotonic clock time or a delta/duration, not a wall-clock time.

joshlf commented Jan 7, 2017

Exactly right. The "fixed" SetDeadline is in some ways worse than the broken one: now we have a vanishingly small probability of reproducing some mysterious hang due to a race with a system clock jump, virtually guaranteeing that the bug will surface only when the code is running in production at large scale.

The implementation of SetDeadline isn't the problem--the API is. It needs to take either a monotonic clock time or a delta/duration, not a wall-clock time.

To be clear, this is already how it works - SetDeadline takes a wall clock time and converts it to a monotonic clock time under the hood. I think @rsc was simply suggesting that what we have already is good enough.

driskell commented Jan 8, 2017

My preference here is a time duration in seconds, if that is what is decided is the definition of the monotonic clock. It does not seem to me to make any practical sense to allow a point in time on the clock as during rendering to string you will have leap seconds to deal with and a whole array of complexities. So a duration seems to me to fit better with a monotonic clock.

With regards to SetDeadline etc. Could it be the case that under the hood they are at some point setting a monotonic deadline anyway (after a conversion.) Maybe this just needs to be exposed in the API too. It keeps the guarantees but provides new functionality for those that need/want it. Maybe SetExpiry(), SetReadTimeout() or... I'm not good at naming... SetWriteDoom()

In summary, I do think we have the pieces. They just need exposing in the API in a clean and concise way. By all means though I don't see why it can't be incremental approach as it's been a long time with the issue open and there's many more uses to an exposed monotonic clock than SetDeadline and benchmarking.

joshlf commented Jan 10, 2017

Could it be the case that under the hood they are at some point setting a monotonic deadline anyway (after a conversion.)

Yes, this is what they do.

Contributor

rsc commented Jan 10, 2017

We had a SetTimeout(time.Duration) and removed it because it wasn't clear whether the duration started at the call to SetTimeout or the call to each Read/Write operation: if you say SetTimeout(5*time.Seconds) does that mean each Read/Write waits for at most 5 seconds, or does it mean that all I/O must complete by 5 seconds after the call to SetTimeout? SetDeadline, by being clearly about a fixed point in time and not a duration relative to an unknown start, avoids this ambiguity. We will not go back to SetTimeout.

The APIs also aren't changeable today. As I wrote in #12914 (comment), use:

SetDeadline(time.Now().Add(delta))

and then if/when the APIs do change, we can find and update that idiom easily. (And again, this converts back to monotonic time inside, nearly immediately.)

I disagree with use of that idiom "virtually guaranteeing that the bug will surface only when the code is running in production at large scale". The only reason your clock should be moving at all is for strict adherence to leap seconds, and an extra or missed second in a deadline should not be a big deal. If your production clocks are jumping around by minutes or hours or days, you have other problems.

joshlf commented Jan 10, 2017

So it seems that we can't add a SetMonotonicDeadline(time.Duration) or similar because it wouldn't be backwards-compatible. This, to me, implies two options: adding a NowMonotonic() time.Time (that is, making time.Time also represent monotonic time somehow), or your suggestion: use the SetDeadline(time.Now().Add(delta)) idiom.

I disagree with use of that idiom "virtually guaranteeing that the bug will surface only when the code is running in production at large scale". The only reason your clock should be moving at all is for strict adherence to leap seconds, and an extra or missed second in a deadline should not be a big deal. If your production clocks are jumping around by minutes or hours or days, you have other problems.

I really think that this is the wrong philosophy to take. When a given property is not guaranteed (in this case, time is not guaranteed to be monotonic), we should make systems that work in the face of any behavior that is allowed by whatever properties still are guaranteed. This includes time jumping around wildly.

From a less philosophical perspective, if you have a weird server issue where time is flaky, I think it's better not to add more problems on top of that. Also, while the majority of Go's uses are in datacenters, Go is really used in a lot of places, including embedded devices, consumer desktops/laptops, etc - namely, places where your assertion that "[t]he only reason your clock should be moving at all is for strict adherence to leap seconds" doesn't hold. In these cases, the SetDeadline(time.Now().Add(delta)) idiom wouldn't guarantee correctness.

Contributor

tsuna commented Jan 10, 2017

Yeah I know I've been guilty of making similar statements, @rsc, but really, Google is Google, the rest of the world doesn't run on infrastructure that is nearly as well managed. And even if they did, "an extra or missed second in a deadline should not be a big deal" is not necessarily true either.

I'm the first one to blame people for not running NTP properly and all, but the reality is that this is such a widely spread problem that even the public NTP pool has serious issues with leap seconds (see https://twitter.com/tsunanet/status/816205447686299648 & https://community.ntppool.org/t/leap-second-2017-status/59). Go needs to be robust and correct out of the box.

driskell commented Jan 10, 2017 edited

So it seems that we can't add a SetMonotonicDeadline(time.Duration) or similar because it wouldn't be backwards-compatible.

What would it break? Adding a new API should be fine, should it not?

joshlf commented Jan 10, 2017

So it seems that we can't add a SetMonotonicDeadline(time.Duration) or similar because it wouldn't be backwards-compatible.

What would it break? Adding a new API should be fine, should it not?

The issue is that SetDeadline, SetReadDeadline, and SetWriteDeadline are all part of the net.Conn interface. If we add any methods to that interface (such as SetMonotonicDeadline), then any types which previously implemented the interface may no longer implement it, which could cause code to break.

Owner

bradfitz commented Jan 10, 2017

@driskell, we can't add a method to an interface, and net.Conn is an interface. See http://blog.merovius.de/2015/07/29/backwards-compatibility-in-go.html for some background reading.

@rsc, insisting that the wall clock time must never jump is an unreasonable position in general. Sure, for the specific case of a production data center where you have control over the servers and complete software stack then the time should be stable. But a developer trying to write robust applications does not necessarily have control over the environment in which it is run. As others have pointed out, embedded environments, applications running on consumer devices, and applications running in improperly-managed customer environments may not have access to a high-quality clock. Blaming the end user (especially if the end user is a paying customer) does not go very far.

Narrowing the window for the race condition in SetDeadline is extremely dangerous. Race conditions are bad enough, but when they are very rare they are virtually impossible to debug.

The old SetTimeout was much safer. The ambiguity of the start time can be addressed in the API documentation. Even if a developer makes the wrong assumption the code will fail in a dramatic fashion (e.g. after 5 seconds every I/O call fails), so even the most basic testing will expose the problem. Not to mention, SetTimeout is a lot more intuitive to me. I have yet to encounter a situation where I actually want to set a deadline time, so converting a duration to a deadline always seems like an unnecessary and awkward conversion.

Avoiding breaking the API is a harder problem. One possibility is to add a new interface containing the SetTimeout function. Although this function wouldn’t be part of the net.Conn interface, applications could test a net.Conn value to see if it supports the enhanced interface.

@bradfitz @JoshiF Aha, thanks, I should have picked that up.

With the explanation of reasoning for SetTimeout taking a fixed time from @rsc I wonder if some of the thoughts of simply using time.Time but with a monotonic time zone of sorts are the best route forward, maybe at least initially.

joshlf commented Jan 11, 2017

With the explanation of reasoning for SetTimeout taking a fixed time from @rsc I wonder if some of the thoughts of simply using time.Time but with a monotonic time zone of sorts are the best route forward, maybe at least initially.

Unfortunately, there are also some pretty compelling arguments against that. I think that, on balance, it's the best option available, but it's not without its issues. You can look back through the thread to see some of the discussion about it.

Contributor

ianlancetaylor commented Jan 11, 2017

Although we can't add a method to net.Conn we can add

func SetDeadlineMonotonic(c net.Conn, t monotonic.Time)

which can check for a known method implemented by all the net implementations of Conn and call that if available. If not--if called on a net.Conn implemented by some other package--it can fall back to calling SetDeadline and hoping for the best.

Contributor

tamird commented Jan 11, 2017

@ianlancetaylor how do you call SetDeadline if you have a monotonic.Time? I thought the point of that type was that it was not convertible to time.Time.

Contributor

cespare commented Jan 18, 2017

@rsc Your sketch is careful to fit into the existing time package. Does that mean you wish to rule out putting this type in a new package (such as hinted at by @ianlancetaylor's comment) or is that also part of the naming details to be decided on later?

The API outline looks good to me. The restrictions on encoding and zero values seem correct.

I still hope that at some point there's a way to set network deadlines to NOW().Add(d) without ever going through a wall clock. I wonder if we can figure out some way to smuggle a monotonic time into a deadline after locking ourselves into this API.

Contributor

rsc commented Jan 18, 2017

@cespare, yes, package is included in naming.

Contributor

rsc commented Jan 18, 2017

I'll write up a design doc (or maybe two) based on the above.

@rsc rsc modified the milestone: Proposal, Go1.9Early Jan 18, 2017

rsc self-assigned this Jan 18, 2017

orivej commented Jan 18, 2017

Is anything missing?

It may be desirable to document the basis of preference for e.g. CLOCK_MONOTONIC vs CLOCK_BOOTTIME (i.e. suspendable vs actually monotonic clock) to justify the choice of the latter on systems where it is available and switch to the latter on systems that add support for it in the future.

Contributor

ChrisHines commented Jan 19, 2017

// SinceSUFFIX returns the duration elapsed since time m.

// UntilSUFFIX returns the duration until monotonic time m.

What is time m?

joshlf commented Jan 20, 2017

@rsc how does this interact with SetDeadline and friends?

Contributor

rsc commented Jan 21, 2017

Still thinking. Design doc to follow.

Contributor

rsc commented Jan 24, 2017

I started writing a design doc based on my sketch above but I was not particularly happy with the rest. To test the sketch, I started looking at real code, particlarly the Go Corpus v0.01, which has roughly the top 100 Go language GitHub projects by both fork count and the top 100 by star count, provided 'go get -d' worked to download them.

The main problem with any new API is that it doesn't fix all the existing code, and by my count about ~30% of time.Now calls should - if you believe there is a significant difference - be converted to monotonic time. (The other ~70% are basically all immediately doing wall-clock operations like time.Now().Format(...) etc.) That's a lot of code to have to update, a lot of code that won't build on older versions of Go when you update it, and a lot of future code - 1 of 3 calls to time.Now - that you have to remember to use monotonic time instead. It seems hard to believe that we'll get all this right, both retroactively and for future code. That's not going to happen, and code like Cloudflare's will continue to crop up and cause problems.

I thought some more about the suggestion above to reuse time.Time with a special location. The special location still seems wrong, but what if we reuse time.Time by storing inside it both a wall time and a monotonic time, fetched one after the other?

Then there are two kinds of time.Times: those with wall and monotonic stored inside (let's call those “wall+monotonic Times”) and those with only wall stored inside (let's call those “wall-only Times”).

Suppose further that:

  • time.Now returns a wall+monotonic Time.
  • for t.Add(d), if t is a wall+monotonic Time, so is the result; if t is wall-only, so is the result.
  • all other functions that return Times return wall-only Times. These include: time.Date, time.Unix, t.AddDate, t.In, t.Local, t.Round, t.Truncate, t.UTC
  • for t.Sub(u), if t and u are both wall+monotonic, the result is computed by subtracting monotonics; otherwise the result is computed by subtracting wall times.
  • t.After(u), t.Before(u), t.Equal(u) compare monotonics if available (just like t.Sub(u)), otherwise walls.
  • all the other functions that operate on time.Times use the wall time only. These include: t.Day, t.Format, t.Month, t.Unix, t.UnixNano, t.Year, and so on

Doing this returns a kind of hybrid time from time.Now: it works as a wall time but also works as a monotonic time, and future operations use the right one. This automatically corrects the common idioms:

t := time.Now()
... operation ...
elapsed := time.Since(t)

and also the use of time.Times as deadlines in net.Conn and in context.Context, specifically this idiom:

t := time.Now().Add(delta)
SetDeadline(t)

where SetDeadline does things like:

remaining := deadline.Sub(time.Now())

or:

for time.Now().Before(deadline) {
	... work ...
}

These are, for presentation, kind of trivial. The survey of existing code that I did showed that there are all manner of non-trivial variations of uses of time.Now and basically all the ones that don't do a calendar operation (like formatting the time or converting to Unix seconds) should be using monotonic operations.

If we adjust the API implementation as sketched above, all this code just starts working correctly, with no changes needed.

Thoughts?

I'll post a second comment with details of the usage I found in my survey.

Contributor

rsc commented Jan 24, 2017 edited

time.Now usage notes

Analyzed uses of time.Now in Go Corpus v0.01.

Overall estimates:

  • 71% unaffected
  • 29% fixed in event of wall clock time warps (subtractions or comparisons)

Basic counts:

$ cg -f $(pwd)'.*\.go$' 'time\.Now\(\)' | sed 's;//.*;;' |grep time.Now >alltimenow
$ wc -l alltimenow
   16569 alltimenow
$ egrep -c 'time\.Now\(\).*time\.Now\(\)' alltimenow
63

$ 9 sed -n 's/.*(time\.Now\(\)(\.[A-Za-z0-9]+)?).*/\1/p' alltimenow | sort | uniq -c
4910 time.Now()
1511 time.Now().Add
  45 time.Now().AddDate
  69 time.Now().After
  77 time.Now().Before
   4 time.Now().Date
   5 time.Now().Day
   1 time.Now().Equal
 130 time.Now().Format
  23 time.Now().In
   8 time.Now().Local
   4 time.Now().Location
   1 time.Now().MarshalBinary
   2 time.Now().MarshalText
   2 time.Now().Minute
  68 time.Now().Nanosecond
  14 time.Now().Round
  22 time.Now().Second
  37 time.Now().String
 370 time.Now().Sub
  28 time.Now().Truncate
 570 time.Now().UTC
 582 time.Now().Unix
8067 time.Now().UnixNano
  17 time.Now().Year
   2 time.Now().Zone

That splits into completely unaffected:

  45 time.Now().AddDate
   4 time.Now().Date
   5 time.Now().Day
 130 time.Now().Format
  23 time.Now().In
   8 time.Now().Local
   4 time.Now().Location
   1 time.Now().MarshalBinary
   2 time.Now().MarshalText
   2 time.Now().Minute
  68 time.Now().Nanosecond
  14 time.Now().Round
  22 time.Now().Second
  37 time.Now().String
  28 time.Now().Truncate
 570 time.Now().UTC
 582 time.Now().Unix
8067 time.Now().UnixNano
  17 time.Now().Year
   2 time.Now().Zone
9631 TOTAL

and possibly affected:

4910 time.Now()
1511 time.Now().Add
  69 time.Now().After
  77 time.Now().Before
   1 time.Now().Equal
 370 time.Now().Sub
6938 TOTAL

If we pull out the possibly affected lines, the overall count is slightly higher because of the 63 lines with more than one time.Now call:

$ egrep 'time\.Now\(\)([^.]|\.(Add|After|Before|Equal|Sub)|$)' alltimenow >checktimenow
$ wc -l checktimenow
    6982 checktimenow

From the start, then, 58% of time.Now uses immediately flip to wall time and are unaffected.
The remaining 42% may be affected.

Randomly sampling 100 of the 42%, we find:

  • 32 unaffected (23 use wall time once; 9 use wall time multiple times)
  • 68 fixed

We estimate therefore that the 42% is made up of 13% additional unaffected and 29% fixed, giving an overall total of 71% unaffected, 29% fixed.

Unaffected

github.com/mitchellh/packer/vendor/google.golang.org/appengine/demos/guestbook/guestbook.go:97

func handleSign(w http.ResponseWriter, r *http.Request) {
	...
	g := &Greeting{
		Content: r.FormValue("content"),
		Date:    time.Now(),
	}
	... datastore.Put(ctx, key, g) ...
}

Unaffected.
The time will be used exactly once, during the serialization of g.Date in datastore.Put.

github.com/aws/aws-sdk-go/service/databasemigrationservice/examples_test.go:887

func ExampleDatabaseMigrationService_ModifyReplicationTask() {
	...
	params := &databasemigrationservice.ModifyReplicationTaskInput{
		...
		CdcStartTime:              aws.Time(time.Now()),
		...
	}
	... svc.ModifyReplicationTask(params) ...
}

Unaffected.
The time will be used exactly once, during the serialization of params.CdcStartTime in svc.ModifyReplicationTask.

github.com/influxdata/telegraf/plugins/inputs/mongodb/mongodb_data_test.go:94

d := NewMongodbData(
	&StatLine{
		...
		Time:          time.Now(),
		...
	},
	...
)

StatLine.Time is commented as "the time at which this StatLine was generated'' and is only used
by passing to acc.AddFields, where acc is a telegraf.Accumulator.

// AddFields adds a metric to the accumulator with the given measurement
// name, fields, and tags (and timestamp). If a timestamp is not provided,
// then the accumulator sets it to "now".
// Create a point with a value, decorating it with tags
// NOTE: tags is expected to be owned by the caller, don't mutate
// it after passing to Add.
AddFields(measurement string,
	fields map[string]interface{},
	tags map[string]string,
	t ...time.Time)

The non-test implementation of Accumulator calls t.Round, which will convert to wall time.

Unaffected.

github.com/spf13/fsync/fsync_test.go:23

// set times in the past to make sure times are synced, not accidentally
// the same
tt := time.Now().Add(-1 * time.Hour)
check(os.Chtimes("src/a/b", tt, tt))
check(os.Chtimes("src/a", tt, tt))
check(os.Chtimes("src/c", tt, tt))
check(os.Chtimes("src", tt, tt))

Unaffected.

github.com/flynn/flynn/vendor/github.com/gorilla/handlers/handlers.go:66

t := time.Now()
...
writeLog(h.writer, req, url, t, logger.Status(), logger.Size())

writeLog calls buildCommonLogLine, which eventually calls t.Format.

Unaffected.

github.com/ncw/rclone/vendor/google.golang.org/grpc/server.go:586

if err == nil && outPayload != nil {
	outPayload.SentTime = time.Now()
	stats.HandleRPC(stream.Context(), outPayload)
}

SentTime seems to never be used. Client code could call stats.RegisterRPCHandler to do stats processing and look at SentTime.
Any use of time.Since(SentTime) would be improved by having SentTime be monotonic here.

There are no calls to stats.RegisterRPCHandler in the entire corpus.

Unaffected.

github.com/openshift/origin/vendor/github.com/influxdata/influxdb/models/points.go:1316

func (p *point) UnmarshalBinary(b []byte) error {	
	...
	p.time = time.Now()
	p.time.UnmarshalBinary(b[i:])
	...
}

That's weird. It looks like it is setting p.time in case of an error in UnmarshalBinary, instead of checking for and propagating an error. All the other ways that a p.time is initalized end up using non-monotonic times, because they came from time.Unix or t.Round. Assuming that bad decodings are rare, going to call it unaffected.

Unaffected (but not completely sure).

github.com/zyedidia/micro/cmd/micro/util.go

// GetModTime returns the last modification time for a given file
// It also returns a boolean if there was a problem accessing the file
func GetModTime(path string) (time.Time, bool) {
	info, err := os.Stat(path)
	if err != nil {
		return time.Now(), false
	}
	return info.ModTime(), true
}

The result is recorded in the field Buffer.ModTime and then checked against future calls to GetModTime to see if the file changed:

// We should only use last time's eventhandler if the file wasn't by someone else in the meantime
if b.ModTime == buffer.ModTime {
	b.EventHandler = buffer.EventHandler
	b.EventHandler.buf = b
}

and

if modTime != b.ModTime {
	choice, canceled := messenger.YesNoPrompt("The file has changed since it was last read. Reload file? (y,n)")
	...
}

Normally Buffer.ModTime will be a wall time, but if the file doesn't exist Buffer.ModTime will be a monotonic time that will not compare == to any file time. That's the desired behavior here.

Unaffected (or maybe fixed).

github.com/gravitational/teleport/lib/auth/init_test.go:59

// test TTL by converting the generated cert to text -> back and making sure ExpireAfter is valid
ttl := time.Second * 10
expiryDate := time.Now().Add(ttl)
bytes, err := t.GenerateHostCert(priv, pub, "id1", "example.com", teleport.Roles{teleport.RoleNode}, ttl)
c.Assert(err, IsNil)
pk, _, _, _, err := ssh.ParseAuthorizedKey(bytes)
c.Assert(err, IsNil)
copy, ok := pk.(*ssh.Certificate)
c.Assert(ok, Equals, true)
c.Assert(uint64(expiryDate.Unix()), Equals, copy.ValidBefore)

This is jittery, in the sense that the computed expiryDate may not exactly match the cert generation that—one must assume—grabs the current time and adds the passed ttl to it to compute ValidBefore. It's unclear without digging exactly how the cert gets generated (there seems to be an RPC, but I don't know if it's to a test server in the same process). Either way, the two times are only possibly equal because of the rounding to second granularity. Even today, if the call expiryDate := time.Now().Add(ttl) happens 1 nanosecond before a wall time second boundary, this test will fail. Moving to monotonic time will not change the fact that it's jittery.

Unaffected.

github.com/aws/aws-sdk-go/private/model/api/operation.go:420

case "timestamp":
	str = `aws.Time(time.Now())`

This is the example generator for the AWS documentation. An aws.Time is always just being put into a structure to send over the wire in JSON format to AWS, so these remain OK.

Unaffected.

github.com/influxdata/telegraf/plugins/inputs/mongodb/mongodb_data_test.go:17

d := NewMongodbData(
	&StatLine{
		...
		Time:             time.Now(),
		...
	},
	...
}

Unaffected (see above from same file).

github.com/aws/aws-sdk-go/service/datapipeline/examples_test.go:36

params := &datapipeline.ActivatePipelineInput{
	...
	StartTimestamp: aws.Time(time.Now()),
}
resp, err := svc.ActivatePipeline(params)

The svc.ActivatePipeline call serializes StartTimestamp to JSON (just once).

Unaffected.

github.com/jessevdk/go-flags/man.go:177

t := time.Now()
fmt.Fprintf(wr, ".TH %s 1 \"%s\"\n", manQuote(p.Name), t.Format("2 January 2006"))

Unaffected.

k8s.io/heapster/events/manager/manager_test.go:28

batch := &core.EventBatch{
	Timestamp: time.Now(),
	Events:    []*kube_api.Event{},
}

Later used as:

buffer.WriteString(fmt.Sprintf("EventBatch     Timestamp: %s\n", batch.Timestamp))

Unaffected.

k8s.io/heapster/metrics/storage/podmetrics/reststorage.go:121

CreationTimestamp: unversioned.NewTime(time.Now())

But CreationTimestamp is only ever checked for being the zero time or not.

Unaffected.

github.com/revel/revel/server.go:46

start := time.Now()
...
// Revel request access log format
// RequestStartTime ClientIP ResponseStatus RequestLatency HTTPMethod URLPath
// Sample format:
// 2016/05/25 17:46:37.112 127.0.0.1 200  270.157µs GET /
requestLog.Printf("%v %v %v %10v %v %v",
	start.Format(requestLogTimeFormat),
	ClientIP(r),
	c.Response.Status,
	time.Since(start),
	r.Method,
	r.URL.Path,
)

Unaffected.

github.com/hashicorp/consul/command/agent/agent.go:1426

Expires: time.Now().Add(check.TTL).Unix(),

Unaffected.

github.com/drone/drone/server/login.go:143

exp := time.Now().Add(time.Hour * 72).Unix()

Unaffected.

github.com/openshift/origin/vendor/github.com/coreos/etcd/pkg/transport/listener.go:113:

tmpl := x509.Certificate{
	NotBefore:    time.Now(),
	NotAfter:     time.Now().Add(365 * (24 * time.Hour)),
	...
}
...
derBytes, err := x509.CreateCertificate(rand.Reader, &tmpl, &tmpl, &priv.PublicKey, priv)

Unaffected.

github.com/ethereum/go-ethereum/swarm/api/http/server.go:189

http.ServeContent(w, r, "", time.Now(), bytes.NewReader([]byte(newKey)))

eventually uses the passed time in formatting:

w.Header().Set("Last-Modified", modtime.UTC().Format(TimeFormat))

Unaffected.

github.com/hashicorp/consul/vendor/google.golang.org/grpc/call.go:187

if sh != nil {
	ctx = sh.TagRPC(ctx, &stats.RPCTagInfo{FullMethodName: method})
	begin := &stats.Begin{
		Client:    true,
		BeginTime: time.Now(),
		FailFast:  c.failFast,
	}
	sh.HandleRPC(ctx, begin)
}
defer func() {
	if sh != nil {
		end := &stats.End{
			Client:  true,
			EndTime: time.Now(),
			Error:   e,
		}
		sh.HandleRPC(ctx, end)
	}
}()

If something subtracted BeginTime and EndTime, that would be fixed by monotonic times.
I don't see any implementations of StatsHandler in the tree, though, so sh must be nil.

Unaffected.

github.com/hashicorp/vault/builtin/logical/pki/backend_test.go:396

if !cert.NotBefore.Before(time.Now().Add(-10 * time.Second)) {
	return nil, fmt.Errorf("Validity period not far enough in the past")
}

cert.NotBefore is usually the result of decoding an wire format certificate,
so it's not monotonic, so the time will collapse to wall time during the Before check.

Unaffected.

github.com/openshift/origin/vendor/k8s.io/kubernetes/plugin/pkg/admission/namespace/lifecycle/admission_test.go:194

fakeClock := clock.NewFakeClock(time.Now())

The clock being implemented does Since, After, and other relative manipulation only.

Unaffected.

Unaffected (but uses time.Time as wall time multiple times)

These are split out because an obvious optimization would be to store just the monotonic time
and rederive the wall time using the current wall-vs-monotonic correspondence from the
operating system. Using a wall form multiple times in this case could show up as jitter.
The proposal does not suggest this optimization, precisely because of cases like these.

github.com/docker/distribution/registry/storage/driver/inmemory/mfs.go:195

// mkdir creates a child directory under d with the given name.
func (d *dir) mkdir(name string) (*dir, error) {
	... d.mod = time.Now() ...
}

ends up being used by

fi := storagedriver.FileInfoFields{
	Path:    path,
	IsDir:   found.isdir(),
	ModTime: found.modtime(),
}

which will result in that time being returned by an os.FileInfo implementation's ModTime method.

Unaffected (but uses time multiple times).

github.com/minio/minio/cmd/server-startup-msg_test.go:52

// given
var expiredDate = time.Now().Add(time.Hour * 24 * (30 - 1)) // 29 days.
var fakeCerts = []*x509.Certificate{
	... NotAfter: expiredDate ...
}

expectedMsg := colorBlue("\nCertificate expiry info:\n") +
	colorBold(fmt.Sprintf("#1 Test cert will expire on %s\n", expiredDate))

msg := getCertificateChainMsg(fakeCerts)
if msg != expectedMsg {
	t.Fatalf("Expected message was: %s, got: %s", expectedMsg, msg)
}

Unaffected (but uses time multiple times).

github.com/pingcap/tidb/expression/builtin_string_test.go:42

{types.Time{Time: types.FromGoTime(time.Now()), Fsp: 6, Type: mysql.TypeDatetime}, 26},

The call to FromGoTime does:

func FromGoTime(t gotime.Time) TimeInternal {
	year, month, day := t.Date()
	hour, minute, second := t.Clock()
	microsecond := t.Nanosecond() / 1000
	return newMysqlTime(year, int(month), day, hour, minute, second, microsecond)
}

Unaffected (but uses time multiple times).

github.com/docker/docker/vendor/github.com/docker/distribution/registry/client/repository.go:750

func (bs *blobs) Create(ctx context.Context, options ...distribution.BlobCreateOption) (distribution.BlobWriter, error) {
	...
	return &httpBlobUpload{
		statter:   bs.statter,
		client:    bs.client,
		uuid:      uuid,
		startedAt: time.Now(),
		location:  location,
	}, nil
}

That field is used to implement distribution.BlobWriter interface's StartedAt method, which is eventually copied into a handlers.blobUploadState, which is sometimes serialized to JSON and reconstructed. The serialization seems to be the single use.

Unaffected (but not completely sure about use count).

github.com/pingcap/pd/_vendor/vendor/golang.org/x/net/internal/timeseries/timeseries.go:83

// A Clock tells the current time.
type Clock interface {
	Time() time.Time
}

type defaultClock int
var defaultClockInstance defaultClock
func (defaultClock) Time() time.Time { return time.Now() }

Let's look at how that gets used.

The main use is to get a now time and then check whether

if ts.levels[0].end.Before(now) {
	ts.advance(now)
}

but levels[0].end was rounded, meaning its a wall time. advance then does:

if !t.After(ts.levels[0].end) {
	return
}
for i := 0; i < len(ts.levels); i++ {
	level := ts.levels[i]
	if !level.end.Before(t) {
		break
	}

	// If the time is sufficiently far, just clear the level and advance
	// directly.
	if !t.Before(level.end.Add(level.size * time.Duration(ts.numBuckets))) {
		for _, b := range level.buckets {
			ts.resetObservation(b)
		}
		level.end = time.Unix(0, (t.UnixNano()/level.size.Nanoseconds())*level.size.Nanoseconds())
	}

	for t.After(level.end) {
		level.end = level.end.Add(level.size)
		level.newest = level.oldest
		level.oldest = (level.oldest + 1) % ts.numBuckets
		ts.resetObservation(level.buckets[level.newest])
	}

	t = level.end
}

Unaffected (but uses time multiple times).

github.com/astaxie/beego/logs/logger_test.go:24

func TestFormatHeader_0(t *testing.T) {
	tm := time.Now()
	if tm.Year() >= 2100 {
		t.FailNow()
	}
	dur := time.Second
	for {
		if tm.Year() >= 2100 {
			break
		}
		h, _ := formatTimeHeader(tm)
		if tm.Format("2006/01/02 15:04:05 ") != string(h) {
			t.Log(tm)
			t.FailNow()
		}
		tm = tm.Add(dur)
		dur *= 2
	}
}

Unaffected (but uses time multiple times).

github.com/attic-labs/noms/vendor/github.com/aws/aws-sdk-go/aws/signer/v4/v4_test.go:418

ctx := &signingCtx{
	...
	Time:        time.Now(),
	ExpireTime:  5 * time.Second,
}

ctx.buildCanonicalString()
expected := "https://example.org/bucket/key-._~,!@#$%^&*()?Foo=z&Foo=o&Foo=m&Foo=a"
assert.Equal(t, expected, ctx.Request.URL.String())

ctx is used as:

ctx.formattedTime = ctx.Time.UTC().Format(timeFormat)
ctx.formattedShortTime = ctx.Time.UTC().Format(shortTimeFormat)

and then ctx.formattedTime is used sometimes and ctx.formattedShortTime is used other times.

Unaffected (but uses time multiple times).

github.com/zenazn/goji/example/models.go:21

var Greets = []Greet{
	{"carl", "Welcome to Gritter!", time.Now()},
	{"alice", "Wanna know a secret?", time.Now()},
	{"bob", "Okay!", time.Now()},
	{"eve", "I'm listening...", time.Now()},
}

used by:

// Write out a representation of the greet
func (g Greet) Write(w io.Writer) {
	fmt.Fprintf(w, "%s\n@%s at %s\n---\n", g.Message, g.User,
		g.Time.Format(time.UnixDate))
}

Unaffected (but may use wall representation multiple times).

github.com/afex/hystrix-go/hystrix/rolling/rolling_timing.go:77

r.Mutex.RLock()
now := time.Now()
bucket, exists := r.Buckets[now.Unix()]
r.Mutex.RUnlock()

if !exists {
	r.Mutex.Lock()
	defer r.Mutex.Unlock()

	r.Buckets[now.Unix()] = &timingBucket{}
	bucket = r.Buckets[now.Unix()]
}

Unaffected (but uses wall representation multiple times).

Fixed

github.com/hashicorp/vault/vendor/golang.org/x/net/http2/transport.go:721

func (cc *ClientConn) RoundTrip(req *http.Request) (*http.Response, error) {
	...
	cc.lastActive = time.Now()
	...
}

matches against:

func traceGotConn(req *http.Request, cc *ClientConn) {
	... ci.IdleTime = time.Now().Sub(cc.lastActive) ...
}

Fixed.
Only for debugging, though.

github.com/docker/docker/vendor/github.com/hashicorp/serf/serf/serf.go:1417

// reap is called with a list of old members and a timeout, and removes
// members that have exceeded the timeout. The members are removed from
// both the old list and the members itself. Locking is left to the caller.
func (s *Serf) reap(old []*memberState, timeout time.Duration) []*memberState {
	now := time.Now()
	...
	for i := 0; i < n; i++ {
		...
		// Skip if the timeout is not yet reached
		if now.Sub(m.leaveTime) <= timeout {
			continue
		}
		...
	}
	...
}

and m.leaveTime is always initialized by calling time.Now.

Fixed.

github.com/hashicorp/consul/consul/acl_replication.go:173

defer metrics.MeasureSince([]string{"consul", "leader", "updateLocalACLs"}, time.Now())

This is the canonical way to use the github.com/armon/go-metrics package.

func MeasureSince(key []string, start time.Time) {
	globalMetrics.MeasureSince(key, start)
}

func (m *Metrics) MeasureSince(key []string, start time.Time) {
	...
	now := time.Now()
	elapsed := now.Sub(start)
	msec := float32(elapsed.Nanoseconds()) / float32(m.TimerGranularity)
	m.sink.AddSample(key, msec)
}

Fixed.

github.com/flynn/flynn/vendor/gopkg.in/mgo.v2/session.go:3598

if iter.timeout >= 0 {
	if timeout.IsZero() {
		timeout = time.Now().Add(iter.timeout)
	}
	if time.Now().After(timeout) {
		iter.timedout = true
		...
	}
}

Fixed.

github.com/huichen/wukong/examples/benchmark.go:173

t4 := time.Now()
done := make(chan bool)
recordResponse := recordResponseLock{}
recordResponse.count = make(map[string]int)
for iThread := 0; iThread < numQueryThreads; iThread++ {
	go search(done, &recordResponse)
}
for iThread := 0; iThread < numQueryThreads; iThread++ {
	<-done
}

// 记录时间并计算分词速度
t5 := time.Now()
log.Printf("搜索平均响应时间 %v 毫秒",
	t5.Sub(t4).Seconds()*1000/float64(numRepeatQuery*len(searchQueries)))
log.Printf("搜索吞吐量每秒 %v 次查询",
	float64(numRepeatQuery*numQueryThreads*len(searchQueries))/
		t5.Sub(t4).Seconds())

The first print is "Search average response time %v milliseconds" and the second is "Search Throughput %v queries per second."

Fixed.

github.com/ncw/rclone/vendor/google.golang.org/grpc/call.go:171

if EnableTracing {
	...
	if deadline, ok := ctx.Deadline(); ok {
		c.traceInfo.firstLine.deadline = deadline.Sub(time.Now())
	}
	...
}

Here ctx is a context.Context. We should probably arrange for ctx.Deadline to return monotonic times.
If it does, then this code is fixed.
If it does not, then this code is unaffected.

Fixed.

github.com/hashicorp/consul/consul/fsm.go:281

defer metrics.MeasureSince([]string{"consul", "fsm", "prepared-query", string(req.Op)}, time.Now())

See MeasureSince above.

Fixed.

github.com/docker/libnetwork/vendor/github.com/Sirupsen/logrus/text_formatter.go:27

var (
	baseTimestamp time.Time
	isTerminal    bool
)

func init() {
	baseTimestamp = time.Now()
	isTerminal = IsTerminal()
}

func miniTS() int {
	return int(time.Since(baseTimestamp) / time.Second)
}

Fixed.

github.com/flynn/flynn/vendor/golang.org/x/net/http2/go17.go:54

if ci.WasIdle && !cc.lastActive.IsZero() {
	ci.IdleTime = time.Now().Sub(cc.lastActive)
}

See above.

Fixed.

github.com/zyedidia/micro/cmd/micro/eventhandler.go:102

// Remove creates a remove text event and executes it
func (eh *EventHandler) Remove(start, end Loc) {
	e := &TextEvent{
		C:         eh.buf.Cursor,
		EventType: TextEventRemove,
		Start:     start,
		End:       end,
		Time:      time.Now(),
	}
	eh.Execute(e)
}

The time here is used by

// Undo the first event in the undo stack
func (eh *EventHandler) Undo() {
	t := eh.UndoStack.Peek()
	...
	startTime := t.Time.UnixNano() / int64(time.Millisecond)
	...
	for {
		t = eh.UndoStack.Peek()
		...
		if startTime-(t.Time.UnixNano()/int64(time.Millisecond)) > undoThreshold {
			return
		}
		startTime = t.Time.UnixNano() / int64(time.Millisecond)
		...
	}
}

If this avoided the call to UnixNano (used t.Sub instead), then all the times involved would be monotonic and the elapsed time computation would be independent of wall time. As written, a wall time adjustment during Undo will still break the code. Without any monotonic times, a wall time adjustment before Undo also breaks the code; that no longer happens.

*Fixed.

github.com/ethereum/go-ethereum/cmd/geth/chaincmd.go:186

start = time.Now()
fmt.Println("Compacting entire database...")
if err = db.LDB().CompactRange(util.Range{}); err != nil {
	utils.Fatalf("Compaction failed: %v", err)
}
fmt.Printf("Compaction done in %v.\n\n", time.Since(start))

Fixed.

github.com/drone/drone/shared/oauth2/oauth2.go:176

// Expired reports whether the token has expired or is invalid.
func (t *Token) Expired() bool {
	if t.AccessToken == "" {
		return true
	}
	if t.Expiry.IsZero() {
		return false
	}
	return t.Expiry.Before(time.Now())
}

t.Expiry is set with:

if b.ExpiresIn == 0 {
	tok.Expiry = time.Time{}
} else {
	tok.Expiry = time.Now().Add(time.Duration(b.ExpiresIn) * time.Second)
}

Fixed.

github.com/coreos/etcd/auth/simple_token.go:88

for {
	select {
	case t := <-tm.addSimpleTokenCh:
		tm.tokens[t] = time.Now().Add(simpleTokenTTL)
	case t := <-tm.resetSimpleTokenCh:
		if _, ok := tm.tokens[t]; ok {
			tm.tokens[t] = time.Now().Add(simpleTokenTTL)
		}
	case t := <-tm.deleteSimpleTokenCh:
		delete(tm.tokens, t)
	case <-tokenTicker.C:
		nowtime := time.Now()
		for t, tokenendtime := range tm.tokens {
			if nowtime.After(tokenendtime) {
				tm.deleteTokenFunc(t)
				delete(tm.tokens, t)
			}
		}
	case waitCh := <-tm.stopCh:
		tm.tokens = make(map[string]time.Time)
		waitCh <- struct{}{}
		return
	}
}

Fixed.

github.com/docker/docker/cli/command/node/ps_test.go:105

return []swarm.Task{
	*Task(TaskID("taskID1"), ServiceID("failure"),
		WithStatus(Timestamp(time.Now().Add(-2*time.Hour)), StatusErr("a task error"))),
	*Task(TaskID("taskID2"), ServiceID("failure"),
		WithStatus(Timestamp(time.Now().Add(-3*time.Hour)), StatusErr("a task error"))),
	*Task(TaskID("taskID3"), ServiceID("failure"),
		WithStatus(Timestamp(time.Now().Add(-4*time.Hour)), StatusErr("a task error"))),
}, nil

It's just a test, but Timestamp sets the Timestamp field in the swarm.TaskStatus used eventually in docker/cli/command/task/print.go:

strings.ToLower(units.HumanDuration(time.Since(task.Status.Timestamp))),

Having a monotonic time in the swam.TaskStatus makes time.Since more accurate.

Fixed.

github.com/docker/docker/integration-cli/docker_api_attach_test.go:130

conn.SetReadDeadline(time.Now().Add(time.Second))

Fixed.

github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/util.go:1696

timeout := 2 * time.Minute
for start := time.Now(); time.Since(start) < timeout; time.Sleep(5 * time.Second) {
	...
}

Fixed.

github.com/onsi/gomega/internal/asyncassertion/async_assertion_test.go:318

t := time.Now()
failures := InterceptGomegaFailures(func() {
	Eventually(c, 0.1).Should(Receive())
})
Ω(time.Since(t)).Should(BeNumerically("<", 90*time.Millisecond))

Fixed.

github.com/hashicorp/vault/physical/consul.go:344

defer metrics.MeasureSince([]string{"consul", "list"}, time.Now())

Fixed.

github.com/hyperledger/fabric/vendor/golang.org/x/net/context/go17.go:62

// WithTimeout returns WithDeadline(parent, time.Now().Add(timeout)).
// ...
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc) {
	return WithDeadline(parent, time.Now().Add(timeout))
}

Fixed.

github.com/hashicorp/consul/consul/state/tombstone_gc.go:134

// nextExpires is used to calculate the next expiration time
func (t *TombstoneGC) nextExpires() time.Time {
	expires := time.Now().Add(t.ttl)
	remain := expires.UnixNano() % int64(t.granularity)
	adj := expires.Add(t.granularity - time.Duration(remain))
	return adj
}

used by:

func (t *TombstoneGC) Hint(index uint64) {
	expires := t.nextExpires()
	...
	// Check for an existing expiration timer
	exp, ok := t.expires[expires]
	if ok {
		...
		return
	}

	// Create new expiration time
	t.expires[expires] = &expireInterval{
		maxIndex: index,
		timer: time.AfterFunc(expires.Sub(time.Now()), func() {
			t.expireTime(expires)
		}),
	}
}

The granularity rounding will usually reuslt in something that can be used in a map key but not always.
The code is using the rounding only as an optimization, so it doesn't actually matter if a few extra keys get generated.
More importantly, the time passd to time.AfterFunc ends up monotonic, so that timers fire correctly.

Fixed.

github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/storage/etcd/etcd_helper.go:310

startTime := time.Now()
...
metrics.RecordEtcdRequestLatency("get", getTypeName(listPtr), startTime)

which ends up in:

func RecordEtcdRequestLatency(verb, resource string, startTime time.Time) {
	etcdRequestLatenciesSummary.WithLabelValues(verb, resource).Observe(float64(time.Since(startTime) / time.Microsecond))
}

Fixed.

github.com/pingcap/pd/server/util.go:215

start := time.Now()
ctx, cancel := context.WithTimeout(c.Ctx(), requestTimeout)
resp, err := m.Status(ctx, endpoint)
cancel()

if cost := time.Now().Sub(start); cost > slowRequestTime {
	log.Warnf("check etcd %s status, resp: %v, err: %v, cost: %s", endpoint, resp, err, cost)
}

Fixed.

github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kuberuntime/instrumented_services.go:235

func (in instrumentedImageManagerService) ImageStatus(image *runtimeApi.ImageSpec) (*runtimeApi.Image, error) {
	...
	defer recordOperation(operation, time.Now())
	...
}

// recordOperation records the duration of the operation.
func recordOperation(operation string, start time.Time) {
	metrics.RuntimeOperations.WithLabelValues(operation).Inc()
	metrics.RuntimeOperationsLatency.WithLabelValues(operation).Observe(metrics.SinceInMicroseconds(start))
}

Fixed.

github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/dockertools/instrumented_docker.go:58

defer recordOperation(operation, time.Now())

Fixed. (see previous)

github.com/coreos/etcd/tools/functional-tester/etcd-runner/command/global.go:103

start := time.Now()
for i := 1; i < len(rcs)*rounds+1; i++ {
	select {
	case <-finished:
		if i%100 == 0 {
			fmt.Printf("finished %d, took %v\n", i, time.Since(start))
			start = time.Now()
		}
	case <-time.After(time.Minute):
		log.Panic("no progress after 1 minute!")
	}
}

Fixed.

github.com/reducedb/encoding/benchtools/benchtools.go:98

now := time.Now()
...
if err = codec.Compress(in, inpos, len(in), out, outpos); err != nil {
	return 0, nil, err
}
since := time.Since(now).Nanoseconds()

Fixed.

github.com/docker/swarm/vendor/github.com/hashicorp/consul/api/semaphore.go:200

	start := time.Now()
	attempts := 0
WAIT:
	// Check if we should quit
	select {
	case <-stopCh:
		return nil, nil
	default:
	}

	// Handle the one-shot mode.
	if s.opts.SemaphoreTryOnce && attempts > 0 {
		elapsed := time.Now().Sub(start)
		if elapsed > qOpts.WaitTime {
			return nil, nil
		}

		qOpts.WaitTime -= elapsed
	}
	attempts++
	... goto WAIT ...

Fixed.

github.com/gravitational/teleport/lib/reversetunnel/localsite.go:83

func (s *localSite) GetLastConnected() time.Time {
	return time.Now()
}

This gets recorded in a services.Site's LastConnected field, the only use of which is:

c.Assert(time.Since(sites[0].LastConnected).Seconds() < 5, Equals, true)

Fixed.

github.com/coreos/etcd/tools/benchmark/cmd/watch.go:201

st := time.Now()
for range r.Events {
	results <- report.Result{Start: st, End: time.Now()}
	bar.Increment()
	atomic.AddInt32(&nrRecvCompleted, 1)
}

Those fields get used by

func (res *Result) Duration() time.Duration { return res.End.Sub(res.Start) }

func (r *report) processResult(res *Result) {
	if res.Err != nil {
		r.errorDist[res.Err.Error()]++
		return
	}
	dur := res.Duration()
	r.lats = append(r.lats, dur.Seconds())
	r.avgTotal += dur.Seconds()
	if r.sps != nil {
		r.sps.Add(res.Start, dur)
	}
}

The duration computation is fixed by use of monotonic time. The call tp r.sps.Add buckets the start time by converting to Unix seconds and is therefore unaffected (start time only used once other than the duration calculation, so no visible jitter).

Fixed.

github.com/flynn/flynn/vendor/github.com/flynn/oauth2/internal/token.go:191

token.Expiry = time.Now().Add(time.Duration(expires) * time.Second)

used by:

func (t *Token) expired() bool {
	if t.Expiry.IsZero() {
		return false
	}
	return t.Expiry.Add(-expiryDelta).Before(time.Now())
}

Only partly fixed because sometimes token.Expiry has been loaded from a JSON serialization of a fixed time. But in the case where the expiry was set from a duration, the duration is now correctly enforced.

Fixed.

github.com/hashicorp/consul/consul/fsm.go:266

defer metrics.MeasureSince([]string{"consul", "fsm", "coordinate", "batch-update"}, time.Now())

Fixed.

github.com/openshift/origin/vendor/github.com/coreos/etcd/clientv3/lease.go:437

now := time.Now()
l.mu.Lock()
for id, ka := range l.keepAlives {
	if ka.nextKeepAlive.Before(now) {
		tosend = append(tosend, id)
	}
}
l.mu.Unlock()

ka.nextKeepAlive is set to either time.Now() or

nextKeepAlive := time.Now().Add(1 + time.Duration(karesp.TTL/3)*time.Second)

Fixed.

github.com/eBay/fabio/cert/source_test.go:567

func waitFor(timeout time.Duration, up func() bool) bool {
	until := time.Now().Add(timeout)
	for {
		if time.Now().After(until) {
			return false
		}
		if up() {
			return true
		}
		time.Sleep(100 * time.Millisecond)
	}
}

Fixed.

github.com/lucas-clemente/quic-go/ackhandler/sent_packet_handler_test.go:524

err := handler.ReceivedAck(&frames.AckFrame{LargestAcked: 1}, 1, time.Now())
Expect(err).NotTo(HaveOccurred())
Expect(handler.rttStats.LatestRTT()).To(BeNumerically("~", 10*time.Minute, 1*time.Second))
err = handler.ReceivedAck(&frames.AckFrame{LargestAcked: 2}, 2, time.Now())
Expect(err).NotTo(HaveOccurred())
Expect(handler.rttStats.LatestRTT()).To(BeNumerically("~", 5*time.Minute, 1*time.Second))
err = handler.ReceivedAck(&frames.AckFrame{LargestAcked: 6}, 3, time.Now())
Expect(err).NotTo(HaveOccurred())
Expect(handler.rttStats.LatestRTT()).To(BeNumerically("~", 1*time.Minute, 1*time.Second))

where:

func (h *sentPacketHandler) ReceivedAck(ackFrame *frames.AckFrame, withPacketNumber protocol.PacketNumber, rcvTime time.Time) error {
	...
	timeDelta := rcvTime.Sub(packet.SendTime)
	h.rttStats.UpdateRTT(timeDelta, ackFrame.DelayTime, rcvTime)
	...
}

and packet.SendTime is initialized (earlier) with time.Now.

Fixed.

github.com/CodisLabs/codis/pkg/proxy/redis/conn.go:140

func (w *connWriter) Write(b []byte) (int, error) {
	...
	w.LastWrite = time.Now()
	...
}

used by:

func (p *FlushEncoder) NeedFlush() bool {
	...
	if p.MaxInterval < time.Since(p.Conn.LastWrite) {
		return true
	}
	...
}

Fixed.

github.com/docker/docker/vendor/github.com/docker/swarmkit/manager/scheduler/scheduler.go:173

func (s *Scheduler) Run(ctx context.Context) error {
	...
	var (
		debouncingStarted     time.Time
		commitDebounceTimer   *time.Timer
	)
	...

	// Watch for changes.
	for {
		select {
		case event := <-updates:
			switch v := event.(type) {
			case state.EventCommit:
				if commitDebounceTimer != nil {
					if time.Since(debouncingStarted) > maxLatency {
						...
					}
				} else {
					commitDebounceTimer = time.NewTimer(commitDebounceGap)
					debouncingStarted = time.Now()
					...
				}
			}
		...
	}
}

Fixed.

golang.org/x/net/nettest/conntest.go:361

c1.SetDeadline(time.Now().Add(10 * time.Millisecond))

Fixed.

github.com/minio/minio/vendor/github.com/eapache/go-resiliency/breaker/breaker.go:120

expiry := b.lastError.Add(b.timeout)
if time.Now().After(expiry) {
	b.errors = 0
}

where b.lastError is set using time.Now.

Fixed.

github.com/pingcap/tidb/store/tikv/client.go:65

start := time.Now()
defer func() { sendReqHistogram.WithLabelValues("cop").Observe(time.Since(start).Seconds()) }()

Fixed.

github.com/coreos/etcd/cmd/vendor/golang.org/x/net/context/go17.go:62

return WithDeadline(parent, time.Now().Add(timeout))

Fixed (see above).

github.com/coreos/rkt/rkt/image/common_test.go:161

maxAge := 10
for _, tt := range tests {
	age := time.Now().Add(time.Duration(tt.age) * time.Second)
	got := useCached(age, maxAge)
	if got != tt.use {
		t.Errorf("expected useCached(%v, %v) to return %v, but it returned %v", age, maxAge, tt.use, got)
	}
}

where:

func useCached(downloadTime time.Time, maxAge int) bool {
	freshnessLifetime := int(time.Now().Sub(downloadTime).Seconds())
	if maxAge > 0 && freshnessLifetime < maxAge {
		return true
	}
	return false
}

Fixed.

github.com/lucas-clemente/quic-go/flowcontrol/flow_controller.go:131

c.lastWindowUpdateTime = time.Now()

used as:

if c.lastWindowUpdateTime.IsZero() {
	return
}
...
timeSinceLastWindowUpdate := time.Now().Sub(c.lastWindowUpdateTime)

Fixed.

github.com/hashicorp/serf/serf/snapshot.go:327

now := time.Now()
if now.Sub(s.lastFlush) > flushInterval {
	s.lastFlush = now
	if err := s.buffered.Flush(); err != nil {
		return err
	}
}

Fixed.

github.com/junegunn/fzf/src/matcher.go:210

startedAt := time.Now()
...
for matchesInChunk := range countChan {
	...
	if time.Now().Sub(startedAt) > progressMinDuration {
		m.eventBox.Set(EvtSearchProgress, float32(count)/float32(numChunks))
	}
}

Fixed.

github.com/mitchellh/packer/vendor/google.golang.org/appengine/demos/helloworld/helloworld.go:19

var initTime = time.Now()

func handle(w http.ResponseWriter, r *http.Request) {
	...
	tmpl.Execute(w, time.Since(initTime))
}

Fixed.

github.com/ncw/rclone/vendor/google.golang.org/appengine/internal/api.go:549

func (c *context) logFlusher(stop <-chan int) {
	lastFlush := time.Now()
	tick := time.NewTicker(flushInterval)
	for {
		select {
		case <-stop:
			// Request finished.
			tick.Stop()
			return
		case <-tick.C:
			force := time.Now().Sub(lastFlush) > forceFlushInterval
			if c.flushLog(force) {
				lastFlush = time.Now()
			}
		}
	}
}

Fixed.

github.com/ethereum/go-ethereum/cmd/geth/chaincmd.go:159

start := time.Now()
...
fmt.Printf("Import done in %v.\n\n", time.Since(start))

Fixed.

github.com/nats-io/nats/test/conn_test.go:652

if firstDisconnect {
	firstDisconnect = false
	dtime1 = time.Now()
} else {
	dtime2 = time.Now()
}

and later:

if (dtime1 == time.Time{}) || (dtime2 == time.Time{}) || (rtime == time.Time{}) || (atime1 == time.Time{}) || (atime2 == time.Time{}) || (ctime == time.Time{}) {
	t.Fatalf("Some callbacks did not fire:\n%v\n%v\n%v\n%v\n%v\n%v", dtime1, rtime, atime1, atime2, dtime2, ctime)
}

if rtime.Before(dtime1) || dtime2.Before(rtime) || atime2.Before(atime1) || ctime.Before(atime2) {
	t.Fatalf("Wrong callback order:\n%v\n%v\n%v\n%v\n%v\n%v", dtime1, rtime, atime1, atime2, dtime2, ctime)
}

Fixed.

github.com/google/cadvisor/manager/container.go:456

// Schedule the next housekeeping. Sleep until that time.
if time.Now().Before(next) {
	time.Sleep(next.Sub(time.Now()))
} else {
	next = time.Now()
}
lastHousekeeping = next

Fixed.

github.com/google/cadvisor/vendor/golang.org/x/oauth2/token.go:98

return t.Expiry.Add(-expiryDelta).Before(time.Now())

Fixed (see above).

github.com/hashicorp/consul/consul/fsm.go:109

defer metrics.MeasureSince([]string{"consul", "fsm", "register"}, time.Now())

Fixed.

github.com/hashicorp/vault/vendor/github.com/hashicorp/yamux/session.go:295

// Wait for a response
start := time.Now()
...

// Compute the RTT
return time.Now().Sub(start), nil

Fixed.

github.com/go-kit/kit/examples/shipping/booking/instrumenting.go:31

defer func(begin time.Time) {
	s.requestCount.With("method", "book").Add(1)
	s.requestLatency.With("method", "book").Observe(time.Since(begin).Seconds())
}(time.Now())

Fixed.

github.com/cyfdecyf/cow/timeoutset.go:22

func (ts *TimeoutSet) add(key string) {
	now := time.Now()
	ts.Lock()
	ts.time[key] = now
	ts.Unlock()
}

used by

func (ts *TimeoutSet) has(key string) bool {
	ts.RLock()
	t, ok := ts.time[key]
	ts.RUnlock()
	if !ok {
		return false
	}
	if time.Now().Sub(t) > ts.timeout {
		ts.del(key)
		return false
	}
	return true
}

Fixed.

github.com/prometheus/prometheus/vendor/k8s.io/client-go/1.5/rest/request.go:761

//Metrics for total request latency
start := time.Now()
defer func() {
	metrics.RequestLatency.Observe(r.verb, r.finalURLTemplate(), time.Since(start))
}()

Fixed.

github.com/ethereum/go-ethereum/p2p/discover/udp.go:383

for {
	...
	select {
	...
	case p := <-t.addpending:
		p.deadline = time.Now().Add(respTimeout)
		...

	case now := <-timeout.C:
		// Notify and remove callbacks whose deadline is in the past.
		for el := plist.Front(); el != nil; el = el.Next() {
			p := el.Value.(*pending)
			if now.After(p.deadline) || now.Equal(p.deadline) {
				...
			}
		}
	}
}

Fixed assuming time channels receive monotonic times as well.

k8s.io/heapster/metrics/sinks/manager.go:150

startTime := time.Now()
...
defer exporterDuration.
	WithLabelValues(s.Name()).
	Observe(float64(time.Since(startTime)) / float64(time.Microsecond))

Fixed.

github.com/vmware/harbor/src/ui/auth/lock.go:43

func (ul *UserLock) Lock(username string) {
	...
	ul.failures[username] = time.Now()
}

used by:

func (ul *UserLock) IsLocked(username string) bool {
	...
	return time.Now().Sub(ul.failures[username]) <= ul.d
}

Fixed.

github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubectl/resource_printer_test.go:1410

{"an hour ago", translateTimestamp(unversioned.Time{Time: time.Now().Add(-6e12)}), "1h"},

where

func translateTimestamp(timestamp unversioned.Time) string {
	if timestamp.IsZero() {
		return "<unknown>"
	}
	return shortHumanDuration(time.Now().Sub(timestamp.Time))
}

Fixed.

github.com/pingcap/pd/server/kv.go:194

start := time.Now()
resp, err := clientv3.NewKV(c).Get(ctx, key, opts...)
if cost := time.Since(start); cost > kvSlowRequestTime {
	log.Warnf("kv gets too slow: key %v cost %v err %v", key, cost, err)
}

Fixed.

github.com/xtaci/kcp-go/sess.go:489

if interval > 0 && time.Now().After(lastPing.Add(interval)) {
	...
	lastPing = time.Now()
}

Fixed.

github.com/go-xorm/xorm/lru_cacher.go:202

el.Value.(*sqlNode).lastVisit = time.Now()

used as

if removedNum <= core.CacheGcMaxRemoved &&
	time.Now().Sub(e.Value.(*idNode).lastVisit) > m.Expired {
	...
}

Fixed.

github.com/openshift/origin/vendor/github.com/samuel/go-zookeeper/zk/conn.go:510

conn.SetWriteDeadline(time.Now().Add(c.recvTimeout))

Fixed.

github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/client/leaderelection/leaderelection.go:236

le.observedTime = time.Now()

used as:

if le.observedTime.Add(le.config.LeaseDuration).After(now.Time) && ...

Fixed.

k8s.io/heapster/events/sinks/manager.go:139

startTime := time.Now()
defer exporterDuration.
	WithLabelValues(s.Name()).
	Observe(float64(time.Since(startTime)) / float64(time.Microsecond))

Fixed.

golang.org/x/net/ipv4/unicast_test.go:64

... p.SetReadDeadline(time.Now().Add(100 * time.Millisecond)) ...

Fixed.

github.com/kelseyhightower/confd/vendor/github.com/Sirupsen/logrus/text_formatter.go:27

func init() {
	baseTimestamp = time.Now()
	isTerminal = IsTerminal()
}

func miniTS() int {
	return int(time.Since(baseTimestamp) / time.Second)
}

Fixed (same as above, vendored in docker/libnetwork).

github.com/openshift/origin/vendor/github.com/coreos/etcd/etcdserver/v3_server.go:693

start := time.Now()
...
return nil, s.parseProposeCtxErr(cctx.Err(), start)

where

curLeadElected := s.r.leadElectedTime()
prevLeadLost := curLeadElected.Add(-2 * time.Duration(s.Cfg.ElectionTicks) * time.Duration(s.Cfg.TickMs) * time.Millisecond)
if start.After(prevLeadLost) && start.Before(curLeadElected) {
	return ErrTimeoutDueToLeaderFail
}

All the times involved end up being monotonic, making the After/Before checks more accurate.

Fixed.

Owner

bradfitz commented Jan 24, 2017

@rsc, SGTM. I was thinking along the same lines the other day but got distracting thinking about efficient representation and didn't finish thinking through which operations would use & return which time types. Your list looks good.

Contributor

cespare commented Jan 24, 2017

@rsc Your proposal is a very clever idea. It's appealing to fix most existing code transparently.

My initial reaction, though, is that the end situation is too subtle to explain, document, and use. It's strange to be able to have (at least in principle) two Times that are Equal but Format to different values (or vice versa).

That's a lot of code to have to update, a lot of code that won't build on older versions of Go when you update it, and a lot of future code - 1 of 3 calls to time.Now - that you have to remember to use monotonic time instead.

I'm not sure I buy that this is a significant problem. As you demonstrated from your code survey, it's mostly pretty simple to tell whether code needs to be changed.

Another small issue is that time.Now calls will become (presumably) 2x slower and time.Time values will become (presumably) 8 bytes larger. These are used in plenty of performance-critical code and tight loops, so I'm sure some users will notice.

Contributor

rsc commented Jan 24, 2017

@bradfitz, @cespare, it's easy to encode the time.Time values in the existing 24-byte footprint on 64-bit platforms (times outside something like 1970+/-146 years would be wall-only, never wall+monotonic). On 32-bit platforms, the footprint would bump from 16 to 20 bytes (two uint64+pointer, instead of uint64+uint32+pointer).

Reading the time is very efficient on most systems. I am not convinced that two reads will be noticed.

There is an unresolved question about explainability. Having a clear explanation in the docs is very important.

I don't completely understand the argument that this is "too subtle to use". The evidence shows that people already use these APIs just fine. But on systems where wall time can diverge from monotonic time, the code people want to write is subtly incorrect. The changes suggested above make it correct instead, without people needing to think through whether to use monotonic vs wall time at each use.

Most developers don't know about monotonic time or don't care. They are going to reach for the wall time API. There are always going to be surprises in that case when wall and monotonic case diverge. The surprises in the current implementation are things like finding out that your operation took negative 910ms. The surprises in the implementation sketched above are that if you go out of your way to compute time differences two different ways (once using wall time, say subtracting UnixNano, and once using monotonic times, say using t.Sub(u)), you might get different answers. I think that's actually a little easier to explain than negative elapsed times from trivial code.

It's already the case that times that are Equal may t.Format differently and vice versa, whether because they are in different time zones or because Format is omitting details (say, printing second resolution when the times differ by nanoseconds).

However, t.String does attempt to print a complete accounting of the time.Time (in particular, it prints the full nanosecond precision, and it redundantly prints both the time zone name and time zone offset). I would probably also make it print the monotonic time if present.

If we were going to add monotonic times as sketched above, I would suggest doing it as an experiment at the start of a cycle (say, Go 1.9) so we have lots of time to find out what is affected and possibly roll back.

@rsc rsc added Proposal and removed NeedsFix labels Jan 24, 2017

This is an interesting idea, but it does have the potential to break existing code. A problem can occur for code that stores a time.Time value (e.g. in serialized form) in a persistent file or database and compares times from different boot cycles. Imagine an application that uses time.Now().Add() to compute a future event time, stores that in a database, and then on a subsequent boot compares the value in the database to the current time using time.Sub(). According to this proposal the result would subtract two monotonic times from different boot cycles, resulting in garbage. To protect against that you would need to include a boot id as well, and only use the monotonic time if the boot id is the same.

This concern may be only theoretical since I don't know of any existing examples. Perhaps the more common idiom is to convert a time.Time to a Unix time before storing it persistently.

Contributor

rsc commented Jan 24, 2017 edited

@lacroute-sps Because monotonic time has no guaranteed meaning outside the current process, there can be no serialized form of a wall+monotonic time.Time. Serializing one necessarily downgrades it to a wall-only time.Time, no matter what the serialization. In that case, the round trip to the database will behave exactly as it does today.

@theckman theckman added a commit to theckman/cronner that referenced this issue Jan 24, 2017

@theckman theckman time keeping: switch to a monotonic time source
One of `cronner`'s core features is the ability to calculate how long a command
took to run on the system. Up until now, the only time source we had available
was the Wall-Clock Time of the system where `cronner` was running.
Unfortunately, there was always the known limitation that this was subject to
the value being wrong if NTPd were to adjust the time, or if a leap second were
to occur.

CloudFlare recently had issues with a leap second causing problems in their
software stack, and in [their blog post](https://blog.cloudflare.com/how-and-why-the-leap-second-affected-cloudflare-dns/)
brought some attention to the lack of a monotonic time source in Go. There has
been an issue open in Go for awhile to add a monotonic time source to the API,
for time keeping tasks like this:

* golang/go#12914

In this bug report above, the `github.com/aristanetworks/goarista/monotime`
package is called out as being able to provide access to the monotonic time
source.

So in short, switch our runner time keeping to use the monotonic time source
from the package above.

Signed-off-by: Tim Heckman <t@heckman.io>
a6189cd

@rsc I don't really agree that monotonic time has no guaranteed meaning outside the current process, as long as the system's monotonic clock doesn't reset (due to a reboot). However, I think adding the restriction that serializing a time.Time must downgrade to wall-only is a reasonable solution.

Contributor

jayschwa commented Jan 24, 2017

I don't really agree that monotonic time has no guaranteed meaning outside the current process, as long as the system's monotonic clock doesn't reset (due to a reboot).

Keep in mind that Go runs on many platforms. The underlying system monotonic clock that Go uses may not have the same characteristics on all of those platforms. It's safest for the new API to make as few guarantees as possible beyond the immediate problem it's trying to solve.

Contributor

tsuna commented Jan 24, 2017

Really nifty idea @rsc, if the performance impact is negligible/reasonable, that sounds like an ideal solution. I'd be happy to test any experimental CL you have – our code base spend much more time in time.Now() than I would like to admit.

Contributor

rogpeppe commented Jan 24, 2017

I like this idea. I'd suggest that we don't change the output of Time.String, as too many things may depend on it, but add another method that will print all the information and/or add a Monotonic method:

// Monotonic returns the monotonic time associated with t and reports
// whether there is one. The monotonic time is measured in nanoseconds
// since an arbitrary epoch.
func (t Time) Monotonic() (time.Duration, bool)

I'm not entirely sure whether time.UTC and friends should discard the monotonic time information. It's common do do time.Now().UTC() so that the time is comparable with DeepEquals, but that doesn't mean that it then be used for duration calculations. Given that changing the time zone doesn't actually change the time instant, why not make those methods retain the monotonic time info?

A drop in replacement for time.time would be nice, as it would allow for very quick fixes.

While this suggestion sounds magical, could we demonstrate it in golang.org/x first, and get a minimal function into the library itself for 1.9? We can then see if users are easily able to convert their code to use the monotonic now function directly - which will presumably be (marginally) more efficient even if time.Time becomes monotonic-aware. Or perhaps an alternative abstraction becomes more popular (e.g. Guava has Stopwatch).

I recognize this does not solve the SetDeadline problem though.

For cross-platform differences: on which platforms would uint64 MonotonicNanos() not be an appropriate signature?

Contributor

cespare commented Jan 24, 2017

I took a quick look through the standard library to see what effect @rsc's proposal would have. It looks like it will improve timing code in the following packages:

  • runtime/pprof tests
  • runtime tests
  • net/http and its tests
  • net tests
  • os/exec tests
  • crypto/tls tests
  • cmd/go (go test)
  • testing (fixes #18021 for free) and its tests
  • reflect tests
  • go/types and its tests

There are some other minor ones I didn't list and more I probably missed.

Member

myitcv commented Jan 25, 2017

Breaking the go1compat promise is out of the question...

The main problem with any new API is that it doesn't fix all the existing code, and by my count about ~30% of time.Now calls should - if you believe there is a significant difference - be converted to monotonic time. (The other ~70% are basically all immediately doing wall-clock operations like time.Now().Format(...) etc.)

@rsc @bradfitz doesn't this suggestion introduce a breaking change into the API?

As of now (no pun intended) time.Now() is currently defined to return a local time.

Currently, given the code:

package main

import (
	"fmt"
	"time"
)

const (
	format = "02 Jan 2006 15:04:05 MST"
)

func main() {
	// assume instead of time.Parse, t1 and t2 were instead initialised
	// via time.Now() at exactly the wall times show in the parse value 
	
	t1, err := time.Parse(format, "31 Dec 2016 23:59:59 GMT")
	if err != nil {
		panic(err)
	}
	
	// ...
	
	t2, err := time.Parse(format, "01 Jan 2017 00:00:01 GMT")
	if err != nil {
		panic(err)
	}

	fmt.Printf("Delta: %v", t2.Sub(t1))
}

we get the output:

Delta: 2s

Under your proposal we would instead get the output:

Delta: 3s

This problem is most obviously seen in timezones that observe daylight savings, but UTC is also subject to "problems" because of the leap second.

Is this not a breaking change? I can't seem to shift from my mind that there is probably a percentage of the 70% of cases you refer to that are relying on this behaviour, and that would be much harder to detect and therefore measure?

Owner

bradfitz commented Jan 25, 2017

@myitcv, no, that's not true. Under the proposal, time.Parse would only return a time.Time containing a wall time, not a wall+monotonic, so t2.Sub would only operate on the wall times, as it does today. No change to that code.

Member

myitcv commented Jan 25, 2017

@bradfitz I should have drawn more attention to the comment in the code:

	// assume instead of time.Parse, t1 and t2 were instead initialised
	// via time.Now() at exactly the wall times show in the parse value
Owner

bradfitz commented Jan 25, 2017

Oh, yeah, I missed that sorry.

Is this not a breaking change?

Some would argue it's a "fixing" change.

Member

myitcv commented Jan 25, 2017

Some would argue it's a "fixing" change.

Which is exactly the point I was leading into (assuming it is a breaking change, others might still correct me)

Let's assume for one second we don't find a solution that doesn't break the API. Then we're into the territory of debating which breaking change is "best", where "best" is a term which is defined along multiple dimensions:

  1. Number of changes required in core
  2. Number of changes required in user code
  3. The known side effects of the breaking change
  4. Ease of understanding of breaking change and when/where action is required
  5. ...

For @rsc's proposal it's points 3 and 4 where I have some doubt/uncertainty given the 70% category I described above.

I guess what I'd be keen to understand is how the various proposals (all breaking by the assumption above) compare along the lines of "best". For example, a proposal that involves defining time.Monotonic and changing the API on things like SetDeadline etc to more appropriate types (or indeed any other proposal): lots of changes required (but the compiler would tell you where exactly), but a clear means of working out how and where to make those changes (and indeed some tools could be provided to help)... clearly an incomplete analysis of this proposal, but you hopefully that illustrates my point.

Contributor

rsc commented Jan 25, 2017 edited

Ignoring all the implementation details of monotonic time, this change and this issue are really about making t.Sub return more accurate results when comparing values obtained by time.Now. I looked very hard, and I have yet to find a single piece of code that breaks due to reporting a more accurate result.

If it's just before a leap second and you do:

t1 := time.Now()
... 10 ms of work
t2 := time.Now()
... 10 ms of work
t3 := time.Now()
... 10 ms of work
const f = "15:04:05.000"
fmt.Println(t1.Format(f), t2.Sub(t1), t2.Format(f), t3.Sub(t2), t3.Format(f))

then I would argue that the new output

23:59:59.985 10ms 23:59:59.995 10ms 23:59:59.005

is more correct than

23:59:59.985 10ms 23:59:59.995 -990ms 23:59:59.005

The fact is that t2 and t3 correspond to calls of time.Now 10ms apart, not -990ms apart.

Putting leap seconds aside, suppose that instead it was just before noon and we did the same experiment but didn't print sub-second resolution:

t1 := time.Now()
... 10 ms of work
t2 := time.Now()
... 10 ms of work
t3 := time.Now()
... 10 ms of work
const f = "15:04:05"
fmt.Println(t1.Format(f), t2.Sub(t1), t2.Format(f), t3.Sub(t2), t3.Format(f))

This could print (even now):

11:59:59 10ms 11:59:59 10ms 12:00:00

No one is arguing that instead this should print:

11:59:59 0s 11:59:59 1s 12:00:00

That is, we all agree that reporting 10ms elapsed twice is correct even though printing hh:mm:ss times seems to say that first 0 seconds are elapsed and then 1 second is elapsed (or, if we were printing hh:mm, 0 minutes and 1 minute). Printed time is not always completely precise.

The same is true if the wall time changes during the execution of the program. Going back to the original example, t2 happened at 23:59:59.995, t3 happened at 23:59:59.905 (but the second 23:59:59). Just like in the noon example, the printed times don't tell the whole story. Recording the monotonic times lets t.Sub compute a more accurate result, again just like in the noon example.

Let's assume for one second we don't find a solution that doesn't break the API.

I'm sorry, but let's not. That hypothetical is off topic here. I spent a long time looking (see above) and I have yet to find a single example of code that breaks by making the proposed change. That is, all the evidence I have found is that this change does not break any API. We don't need to weigh the merits of different breaking changes because - as far as we can tell - this one is not a breaking change.

Edit: Fixed typos in noon example.

Member

myitcv commented Jan 25, 2017

That hypothetical is off topic here.

Definitely not trying to take things off topic here so apologies if that was the sentiment; my thought experiment was too hasty, not least because you don't believe this is a breaking change.

this change and this issue are really about making t.Sub return more accurate results when comparing values obtained by time.Now

t.Sub is currently not defined nor implemented to return elapsed time; it returns (rightly or wrongly) the "difference" between two time values, 2s in my example above. If we make it return elapsed time when performed between two walltime+monotonic values (obtained via the proposed time.Now), is that not a change? Not only that, does it not becomes more difficult to explain in the docs that if you don't have a walltime+monotonic value then you will get the "old behaviour" (maybe this is the point @cespare was getting at earlier)?

For example, if anyone has written some code against the current time package that, based on a calendar of leap seconds, years etc looks to calculate the elapsed time between two time.Time values obtained from time.Now (because t.Sub currently does not give this answer, as we've demonstrated), they will need to make a number of invocations of t.Sub around the times of the leap seconds, years etc. Under this proposal their calculation is going to break because they're going to add/subtract seconds needlessly and ultimately derive the wrong answer. Or have I got this wrong?

theckman commented Jan 25, 2017 edited

I think this looks good. With this proposed change, assuming I didn't want a monotonic time for an operation, what would be the canonical way to get the localtime time.Time value without the monotonic time value being included?

Contributor

rsc commented Jan 25, 2017

@theckman The eventual design and documentation will have to specify some canonical way, either by reusing an existing method (which would be nice for keeping new code compatible with old Go distributions) or by adding a new method. There are multiple options; two are time.Now().AddDate(0,0,0) or time.Now().Truncate(1).

A method like func (t Time) Wall() Time would be more readable (modulo that terrible name) for code moving forward.

If backwards compatibility is an issue, someone could still use one of the options listed.

Contributor

rsc commented Jan 25, 2017

@myitcv Time.Sub would continue to return the difference between two time values; the way I think about this is that the Time values returned by time.Now would become more precise, which improves the accuracy of subtracting them. (Specifically, in the case where the computer changes introduces discontinuities in "formatted time", the Times store enough information to compute an accurate difference even though the formatted times do not.)

Leap years are not relevant here: they're entirely regular, they've always been supported, they'll continue to be supported, and everyone agrees how they work.

Leap seconds are something different: essentially all time representations ignore them, for a variety of good reasons. The effect is that when a leap second does happen, some systems implement this by making wall time (what I called "formatted time" a few lines earlier) appear to jump backward by 1 second, which breaks all kinds of code (see above) written with the implicit assumption that time does not move backward. The change described above makes subtraction across that boundary work out properly, whereas today it does not. This is a little bit like subtraction of times across daylight savings conversions working out properly: sometimes 3:01 minus 12:59 is 2h2m, sometimes it is 1h2m, and sometimes it is 3h2m. Go gets the answer right by storing more than just the printed form.

Any time we fix a bug or even make an improvement, code depending on the buggy (or just old) behavior might break. That's not a breaking API change, but it certainly is something we take into account. For example, in Go 1.6 we changed compress/flate to return io.ErrUnexpectedEOF when decompressing a truncated input stream, instead of suggesting all was fine by returning io.EOF. This broke programs, mostly bad tests, but we judged it was more important to fix the bug than not to break programs depending on the buggy behavior. Similarly Go 1.6 changed sort to run significantly faster, but at the cost of changing the final order of elements that compared "equal" during sorting. Again we judged the performance win was worth breaking programs that were sensitive to which of the multiple correct results sort returned. We called both of those out in the Go 1.6 release notes so that users would be able to understand failures and adjust accordingly.

In this case, it absolutely matters what code might be affected by the time.Time change. I looked, and I couldn't find any. The suggestion to make the change at the start of a release cycle gives us maximal exposure to find any we've missed.

It's true that, assuming t and u are obtained by calls to time.Now, code written to use t.Sub(u) + leapSecondsBetween(t, u) instead of t.Sub(u) will now be wrong. So far we haven't found any such code. Note that even that workaround does not work correctly during a leap second: if it's 23:59:59.5, are you in the middle of the first 23:59:59 or the second 23:59:59? The new t.Sub(u) will handle that properly; the workaround fundamentally cannot.

However, in the specific workaround you posited, where the replacement for t.Sub(u) is implemented not by adding a fixup duration but by checking against a list of reference discontinuities, that will keep working just fine, since that list will contain wall-only times (not returned by time.Now), and comparisons against it will be unaffected. Even though I doubt such code exists, it also wouldn't break.

Contributor

tsuna commented Jan 25, 2017

@rsc, thanks a lot for taking the time to propose a thoughtful fix, survey a lot of Go code to assess the impact of the fix, and follow up on concerns being raised on this issue. I was a bit disappointed with how this issue was handled originally but now I'm like, +💯 to you guys for taking this seriously and doing something about it.

Contributor

rsc commented Jan 25, 2017

Thanks very much. Glad to see this work out nicely.

Contributor

taruti commented Jan 25, 2017

Just to note leap seconds are not the only issue:

  • Uncontrolled (Windows) client machines with users setting date/time manually exist in wild.
  • Systems lacking proper time on boot and jumping from 1970 to 2017 at some point during boot are quite common (e.g. some arm linux systems).
Contributor

rsc commented Jan 25, 2017

@taruti Absolutely, and the proposal fixes those too.

Member

myitcv commented Jan 25, 2017

@rsc

Leap years are not relevant here: they're entirely regular, they've always been supported, they'll continue to be supported, and everyone agrees how they work

You're right; that was an incorrect example.

This is a little bit like subtraction of times across daylight savings conversions working out properly: sometimes 3:01 minus 12:59 is 2h2m, sometimes it is 1h2m, and sometimes it is 3h2m. Go gets the answer right by storing more than just the printed form.

Your example has I think helped to explain what I've been missing here. Just to confirm, does this mean that in the case where the clocks go back in autumn/fall (i.e. in a timezone where the local time goes backwards):

   01:00      01:05                01:00      01:06            02:00
_____|__________|____________________|__________|________________|________
        |       |                               |            |
        |       |                               |            |
        | t1 := time.Now()                t2 := time.Now()   |
        |                                                    |
  Process starts                                 fmt.Printf("%v", t2.Sub(t1))

then t2.Sub(t1) is guaranteed to be 1h 1min in the current implementation?

I had been assuming that the answer would be 1 min and hence that there could be more situations than leap seconds where the current behaviour might be relied upon, on purpose or by accident. But thinking about it, if the timezone at the instant when t1 is initialised is BST (for example) but then adjusted to GMT when t2 is initialised (i.e. once the clocks have gone back), then the answer can unambiguously be calculated as 1h 1min?

Any time we fix a bug or even make an improvement, code depending on the buggy (or just old) behavior might break. That's not a breaking API change, but it certainly is something we take into account.

This is the key statement; I had, probably unreasonably, been sticking to the term "breaking API change" as a motivation for evaluating this proposal vs. others.

I'm 100% behind getting a fix in place here, not least because I agree it's simpler to reason about monotonic time (that the user/system clock cannot influence) when it comes to things like deadlines etc.

To echo @tsuna and others; thanks for taking the time to explain things further.

Contributor

rsc commented Jan 26, 2017

@myitcv, yes, in your example, all versions of Go agree that t2.Sub(t1) is 1h1m. The reason is that the time is represented internally as a delta from some fixed instant (in Go's case, the zero time Jan 1 year 1 00:00:00.000000000 UTC), and the presentation as 01:05 or 01:06 is computed only when asked to print the time. The two representations in that example are 61 minutes apart. Even if t1 and t2 print as 01:05 and 01:06, t1.UTC() and t2.UTC() would print as 00:05 and 01:06.

Wouldn't it be simpler to just include leap seconds in the count of seconds since the epoch? You would need a table of leap seconds, and you could use it to serialize the time as 23:59:60 during a leap second. Doing it that way would make the calculation of durations correct even when the wall clock time is based on a value read from a log file or transmitted between computers on a network link.

Contributor

tsuna commented Jan 26, 2017

We don't know when leap seconds are going to be introduced, so how would the table get updated when the IERS (International Earth Rotation and Reference Systems Service) decides to introduce a leap second because of irregularities in the Earth's rate of rotation?

Update the software when the IERS announces a leap second. That will require approximately one update every 18 months.

Contributor

tamird commented Jan 26, 2017

This is not a general solution. As mentioned by many people in this thread, leap seconds are not the only cause of wall time discontinuity.

If a computer starts up thinking it is 1970, and writes the time to a log file or transmits it to another computer, there is no way to recover from that error. Including monotonic time in the time structure solves the problem of a computer starting up thinking it is 1970, but not writing the time to a log file or another computer. Doing interval arithmetic correctly, by taking leap seconds into account, solves the problem of computing the interval between accurate wall clock times written to a log file or transmitted to another computer.

Contributor

rsc commented Jan 26, 2017

@JohnSauter, re:

Wouldn't it be simpler to just include leap seconds in the count of seconds since the epoch? You would need a table of leap seconds, and you could use it to serialize the time as 23:59:60 during a leap second.

This works except that the operating system provides time of day without leap seconds. What happens during a leap second when the operating system rolls the clock backwards 1 second because it can't otherwise express that a leap second is happening? Right now we compute bad elapsed times. It seems like we still would.

Contributor

rsc commented Jan 26, 2017 edited

Also, the nice thing about the current proposal is that it has no visible API change. If the underlying systems provide better ways to implement it, we can revise the implementation again.

Contributor

tsuna commented Jan 26, 2017

I'm sorry @JohnSauter but it's wishful thinking to imagine that all the potentially affected software would be updated in time when a new leap second is introduced. Even though the announcements are 6 months in advance, it's impossible to update the billions of applications running all over the world in that time frame, let alone in 18 months. This is a non-starter.

Contributor

rsc commented Jan 26, 2017

If everything along the way passed leap second information through (in particular if the operating system could be relied upon to give us an accurate information), we might be having a different conversation. But there's still the problem that a time stamp one year into the future cannot be calculated because we don't know how many leap seconds will happen between now and then.

@gopherbot gopherbot pushed a commit to golang/proposal that referenced this issue Jan 26, 2017

@rsc rsc design: add monotonic clock design
For golang/go#12914.

Change-Id: I946b48034be1e0526a2c15c4f0ecf0cbe32bb2ba
Reviewed-on: https://go-review.googlesource.com/35871
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
ae5dab6
Contributor

rsc commented Jan 26, 2017 edited

Design doc at https://golang.org/design/12914-monotonic.

Reminder: grammar and presentation nits at https://go-review.googlesource.com/#/c/35871/, substantive discussion here on the GitHub issue.

The design doc proposes a packed internal representation to save space, but it makes some additional assumptions, namely that reasonable systems will not have a wallclock time outside a range from years 1950 to 2222. I have worked with some embedded systems that powered on with a time outside this range (due to quirks of the hardware RTC in certain failure modes). Is this limitation worth saving a few bytes?

Contributor

rsc commented Jan 26, 2017

@lacroute-sps I don't know if it is. Can you tell me more about those systems? What time did they power on with? How common is that?

rsc changed the title from runtime: time: expose monotonic clock source to proposal: time: use monotonic clock source for computing elapsed time Jan 26, 2017

rsc changed the title from proposal: time: use monotonic clock source for computing elapsed time to proposal: time: use monotonic clock to measure elapsed time Jan 26, 2017

@rsc real-time clock chips typically represent times well outside of that range (using binary-coded decimal). If the value is incorrectly initialized or the RTC battery fails then the hardware can report a bad time. I won't claim this is a common case but I have seen it happen.

A big attraction of your proposed change is that it potentially decouples time-measurement calculations from the wall time. However, the proposed internal representation couples the two in certain cases and it is hard to predict in what situations that is going to cause problems. I would suggest it's better to store the monotonic time in an independent field.

Contributor

rsc commented Jan 27, 2017

@lacroute-sps, can you give specific examples of what dates those chips power on with?

I have seen years greater than 2100, which happens to be within the range the proposal can represent, but is close enough to the edge to be worrisome. From the sample I have observed the uninitialized power-on values on some devices are essentially random.

Contributor

nhooyr commented Jan 27, 2017

@rogpeppe doesn't https://golang.org/pkg/time/#Time.Equal solve that? Why use reflect.DeepEqual?

I don't have a solution for the problem of updating billions of applications every six months, though that also means that the Daylight Saving Time information can never be accurate, since it is updated even more frequently. However, I would like to point out that GNU/Linux does return a special indicator if you ask for the current time during a leap second. That makes it possible to do all of your time calculations using the tm data structure, which supports leap seconds.

Contributor

jayschwa commented Jan 27, 2017

How much performance is gained by using the packed (and harder to understand) format versus adding a new, independent field for monotonic time?

Owner

bradfitz commented Jan 27, 2017

@jayschwa, there's no new API, so users don't need to understand it, just as users don't need to understand the internals of the garbage collector. The main advantage of @rsc's proposed representation is that it doesn't change the memory footprint on 64-bit systems because there's already an unused 32-bit padding hole today.

doherty commented Jan 27, 2017

What happens when you wanted to do math with monotonic times but unbeknownst to you one of your time.Times lacks a monotonic time, so you get wall clock semantics? Are there any such cases? I'm thinking of situations where one of your time.Times is obtained from outside the process, and you're doing math on it: perhaps your operation's start time is stored in a database, and now you add a timeout to obtain a deadline -- but the time in the DB won't have a monotonic time (or, at least, it won't make sense).

Member

minux commented Jan 28, 2017

@rsc,
Just read the design doc. Looks good. I like that the API is staying simple. I noticed a small spelling mistake in the new documentation section 'Monotonic Clocks':

(If either t or u or contains no monotonic clock reading, these operations use the wall clock readings.)

There's one too many 'or' in there.

Contributor

rsc commented Feb 3, 2017

@alexmullins Thanks, fixed two other typos as well.

CL https://golang.org/cl/36255 mentions this issue.

Owner

bradfitz commented Feb 3, 2017

NTP's epoch is 1900. That might be a safer option than 1950.

gopherbot closed this in 0e33559 Feb 3, 2017

rsc changed the title from proposal: time: use monotonic clock to measure elapsed time to time: use monotonic clock to measure elapsed time Feb 3, 2017

@rsc rsc modified the milestone: Go1.9, Proposal Feb 3, 2017

Contributor

cespare commented Feb 3, 2017

Looks like the current code on master uses 1885-2157 as the range of wall times when monotonic time is present.

Contributor

rsc commented Feb 3, 2017

OK, this is checked in for Go 1.9. I moved the internal epoch to 1885 (max year 2157) to avoid any possible problem with NTP-epoch-derived times. I also adjusted the design doc to reflect this change and some minor encoding changes.

Thanks for the constructive conversation everyone.

orivej commented Feb 3, 2017

  1. The merged implementation changes the meaning of time.Now().Sub(old) when now and old are separated by system suspend and resume: it used to calculate real time difference and now that difference is reduced by the duration the system was suspended. Shouldn't time use monotonic real time source when available, and if so, can runtime.nanotime be redefined to read from that source, or should there be a new clock specifically for time.Now() to read from?

  2. Is there an explanation for + 1 in nanotime() - startNano + 1 in the design doc? If it is needed, why startNano is not reduced by 1 instead?

Contributor

rsc commented Feb 3, 2017

@orivej,

  1. Whether time.Since(start) includes "system suspended" time depends on the underlying OS's monotonic clock, but in general it's tricky to add that time into the clock since you can't take it back if you later find out you are wrong. My preference here would be to do the same thing on all operating systems, whatever that is. My guess is that all operating systems can provide the current semantics but that not all can provide the "include suspended time" semantics.

    It seems to me that if you want to include time while the computer was off, you're really talking about wall times, so the fix would be to flip into wall mode by recording 'start := time.Now().AddDate(0, 0, 0)', putting up with the fact that time.Since(start) might then return negative elapsed times.

  2. The +1 is just to make sure that tests never see a 0 monotonic time, since I used GetMono(&t) == 0 to say that there's no monotonic time. It's only for testing, and even in that context it's probably unnecessary, but also harmless.

CL https://golang.org/cl/36334 mentions this issue.

@gopherbot gopherbot pushed a commit that referenced this issue Feb 5, 2017

@AlekSi @bradfitz AlekSi + bradfitz time: Fix typo in Time.String() description.
Refs #12914.

Change-Id: Iadac4cbef70db6a95b47f86eaffcfc63bfdb8e90
Reviewed-on: https://go-review.googlesource.com/36334
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
172311c
Contributor

neild commented Mar 1, 2017

I'm looking at some code which assumes that times truncated to the same time are equal. Simplified:

func main() {
        t1 := time.Now().Truncate(1 * time.Second)
        t2 := time.Now().Truncate(1 * time.Second)
        h1, m1, s1 := t1.Clock()
        h2, m2, s2 := t2.Clock()
        if h1 == h2 && m1 == m2 && s1 == s2 {
                if !t1.Equal(t2) {
                        fmt.Printf("BUG: times truncated to the same second, but are not equal:\n%v\n%v\n", t1, t2)
                }
        }
}

The actual code is mapping times to windows; it assumes that all times within a given window will truncate to the same time.

This doesn't seem unreasonable. Should Truncate discard the monotonic time component?

Member

shurcooL commented Mar 1, 2017

@neild A relevant open issue is #18991.

@shurcooL shurcooL added a commit to google/go-github that referenced this issue Mar 1, 2017

@shurcooL shurcooL Modify time first, round after (and strip additional precision).
I've noticed that TestDo_rateLimit_noNetworkCall test fails on Go
master with the following error:

	--- FAIL: TestDo_rateLimit_noNetworkCall (0.00s)
		github_test.go:549: rateLimitErr rate reset = 2017-02-25 01:49:01 +0000 UTC, want 2017-02-25 01:49:01 +0000 UTC m=+60.206993003

(Source: https://travis-ci.org/google/go-github/jobs/205180544.)

This is due to changes in Go 1.9 to perform monotonic elapsed time
measurements. See golang/go#12914 and
https://golang.org/design/12914-monotonic for full details.

Make the test pass by modifying time first, round after, and strip
the monotonic clock reading by using AddDate(0, 0, 0). Doing that
eliminates the unwanted additional monotonic precision from wanted
time, making the expected time equality true.
4dfe00d

@borkmann borkmann added a commit to cilium/cilium that referenced this issue May 27, 2017

@borkmann borkmann pkg, bpf: add api to return monotonic clock
This adds an API that we can use to compare values against bpf_ktime_get_ns()
BPF helper for the timeouts. Looks like golang doesn't offer such an interface,
so we need to add a cgo version right here. Probably also better to not rely
on go internals should they once change their clock source internally, etc.

Related: golang/go#12914

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
b0fc46d

@tgraf tgraf added a commit to cilium/cilium that referenced this issue May 27, 2017

@borkmann @tgraf borkmann + tgraf pkg, bpf: add api to return monotonic clock
This adds an API that we can use to compare values against bpf_ktime_get_ns()
BPF helper for the timeouts. Looks like golang doesn't offer such an interface,
so we need to add a cgo version right here. Probably also better to not rely
on go internals should they once change their clock source internally, etc.

Related: golang/go#12914

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
1e59d06

Yawning referenced this issue in Katzenpost/core Jul 11, 2017

Closed

Implement a monotonic time source. #6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment