Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARTEMIS-2213 don't expire critical component in the case of clock back drift #2481

Closed

Conversation

Projects
None yet
3 participants
@wy96f
Copy link
Contributor

wy96f commented Dec 25, 2018

In our production cluster some brokers crashed. There is nothing unusual in the dump stack. After digging into code, we found component was incorrectly expired. When clock drifted back, left time was less than enter time. If the component was not entered in default 120000ms, it would be expired and server was halted.

@franz1981

This comment has been minimized.

Copy link
Contributor

franz1981 commented Dec 25, 2018

I know that System::nanoTime is based on monothonic clock of the CPU that's not wall clock time ie can't go back.
System::currentTimeMillis ie UTC wall-clock can go backwards instead due to ntp alignment.
Can you please explain better what's happened?

@michaelandrepearce

This comment has been minimized.

Copy link
Contributor

michaelandrepearce commented Dec 25, 2018

I dont see how this handles clock drift if im honest. The test case doesnt actually cause a drift in nano.

Also Nanoseconds also are from an arbitary offset on the cpu, and afaik not from wall clock. So isnt subject to drift like millis are

@michaelandrepearce

This comment has been minimized.

Copy link
Contributor

michaelandrepearce commented Dec 25, 2018

@franz1981 haha snap. I had same thinking. Oh and happy xmas!

@franz1981

This comment has been minimized.

Copy link
Contributor

franz1981 commented Dec 25, 2018

@michaelandrepearce thanks buddy you too!!!

@wy96f

This comment has been minimized.

Copy link
Contributor Author

wy96f commented Dec 26, 2018

@michaelandrepearce @franz1981
Here is the code of System::nanoTime from openjdk:
jlong os::javaTimeNanos() {
if (Linux::supports_monotonic_clock()) {
struct timespec tp;
int status = Linux::clock_gettime(CLOCK_MONOTONIC, &tp);
assert(status == 0, "gettime error");
jlong result = jlong(tp.tv_sec) * (1000 * 1000 * 1000) + jlong(tp.tv_nsec);
return result;
} else {
timeval time;
int status = gettimeofday(&time, NULL);
assert(status != -1, "linux error");
jlong usecs = jlong(time.tv_sec) * (1000 * 1000) + jlong(time.tv_usec);
return 1000 * usecs;
}
}

If monotonic clock is not supported, nanoTime is implemented like currentTimeMillis.
If clock_gettime is used, as man page says: CLOCK_MONOTONIC_RAW (since Linux 2.6.28; Linux-specific) similar to CLOCK_MONOTONIC, but provides access to a raw hardware-based time that is not subject to NTP adjustments. That's to say CLOCK_MONOTONIC maybe affected by ntp. See https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8006942, rust-lang/rust#37902 for reference. Not sure CLOCK_MONOTONIC won't jump backward, please correct me if anything wrong:)

@wy96f

This comment has been minimized.

Copy link
Contributor Author

wy96f commented Dec 26, 2018

We add a pending counter to count enterCritical calls. If left time is less than enter time and pending counter equals to 0, we don't expire component as all threads have called leaveCritical and it maybe caused by clock drift.
When a clock jumps backwards, to avoid the edge case that enterCritical is called during isExpired, we need to compare current time to latest timeEnter.

@franz1981

This comment has been minimized.

Copy link
Contributor

franz1981 commented Dec 26, 2018

Nice investigation! Anyway, CLOCK_MONOTONIC is not subject to NTP backward adjustment, but just incremental ones ie clock acceleration of any kind, but this would happen mostly for systems that shares a single common clock and I'm sure that is not the common case.
Re the openjdk code, I was already aware of cases where nanoTime was implemented using currentTimeMillis, but not that there was modern systems with such issues given that not just this critical analyser, but probably anything that measure time in a Java program would be broken if it happens.
My concern is to add a XADD instruction in the hot path to support this fix given its cost.
Do you have verified that your system is not supporting nanoTime correctly?
Probably updating the kernel or just using something with hardware clock dedicated would help.

@franz1981

This comment has been minimized.

Copy link
Contributor

franz1981 commented Dec 26, 2018

Ah, another consideration: nanoTime can actually go backward due to "starting in an unspecified point of time", but usually this need to be handled by not comparing directly values and just perform the difference, making use of numerical overflow to have correct values back, as reported by https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime--.

@wy96f

This comment has been minimized.

Copy link
Contributor Author

wy96f commented Dec 26, 2018

@franz1981 thanks for the explanation. We have verified nanoTime is correctly supported. We observed the queue delivered messages and received no messages for abount 2mins, then broker is killed. Three brokers had been killed this week. That was weird. The dumped stacktrace is attached. Could you shed us some light? :)
please see https://gist.github.com/wy96f/1a5c2f481e70d8dabde70f3a07eb57b7 for dump file.

@franz1981

This comment has been minimized.

Copy link
Contributor

franz1981 commented Dec 26, 2018

I will be able to look into that in the next days :)
I can already ask you to collect time to safepoints/GC pauses if possible ie -XX:+PrintGCApplicationStoppedTime.
2 minutes seems a too long period TBH, but worth taking a look if you rely on MAPPED journal and/or paging a lot, given that major page faulting can causes long stall similar to very long full GC, but TBH nothing so long (2 minutes is a lot!).
As an additional suggestion you could run continuosly a Java program that just use one core, measuring the elapsed time between 2 consecutive nanoTime calls, recording in which wall-clock time a back-ward "drift" has happened to check if a broker shutdown has happened near the same time, makes sense?

@wy96f

This comment has been minimized.

Copy link
Contributor Author

wy96f commented Dec 30, 2018

@franz1981 thanks for your advice. We've enabled gc log and run the nano program. Any news will be reported:)

@michaelandrepearce

This comment has been minimized.

Copy link
Contributor

michaelandrepearce commented Jan 3, 2019

Can we close this for the interim?

@wy96f

This comment has been minimized.

Copy link
Contributor Author

wy96f commented Jan 4, 2019

interim

Yes. The problem did't occur these days. We'll give more investigation.

@michaelandrepearce

This comment has been minimized.

Copy link
Contributor

michaelandrepearce commented Jan 4, 2019

@wy96f if you could close, only the person opening can close, or when we merge.

@wy96f wy96f closed this Jan 4, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.