Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NNG_USE_CLOCKID Bug? #1526

Closed
mbush00 opened this issue Nov 17, 2021 · 6 comments
Closed

NNG_USE_CLOCKID Bug? #1526

mbush00 opened this issue Nov 17, 2021 · 6 comments

Comments

@mbush00
Copy link
Contributor

mbush00 commented Nov 17, 2021

I've been trying to track down a bug w/ my NNG usage where my nng_sockets get disconnected when the system clock changes due to a NTP adjustment.

Digging in the NNG code, it appears the default clock differs from the comments in the code. It says the default clock will be the monotonic clock, but in fact it is the realtime clock.

in src/platform/posix/posix_config.h it states:

// #define NNG_USE_CLOCKID
// This macro may be defined to a different clock id (see
// clock_gettime()). By default we use CLOCK_MONOTONIC if it exists,
// or CLOCK_REALTIME otherwise. This is ignored if NNG_USE_GETTIMEOFDAY
// is defined. Platforms that don't use POSIX clocks will probably
// ignore any setting here.

but when you look how NNG_USE_CLOCKID is established it is the opposite:

#define NNG_USE_CLOCKID CLOCK_REALTIME
#ifndef CLOCK_REALTIME
#define NNG_USE_GETTIMEOFDAY
#elif !defined(NNG_USE_CLOCKID)
#define NNG_USE_CLOCKID CLOCK_MONOTONIC
#else
#define NNG_USE_CLOCKID CLOCK_REALTIME
#endif // CLOCK_REALTIME

This if-statement makes it impossible (to my knowledge) to ever hit the CLOCK_MONOTONIC line as the first line is defining NNG_USE_CLOCKID. There is no way the #elif !defined line can ever succeed. So what happens is it always goes into the #else statement.

I'm not sure I follow what the logic is supposed to be doing so I don't have an immediate suggested fix.

@gdamore
Copy link
Contributor

gdamore commented Nov 26, 2021

Yeah that seems like a bug.

I'm not 100% certain that monotonic time is what we want, but if the condition variables we rely on don't wake when the clock changes, that's bad. Of course if NTP adjusts your clock to the past, some delays can take longer than they should.

I need to think about this, but clearly there is no way to get to the monotonic case which is a bug.

@mbush00
Copy link
Contributor Author

mbush00 commented Nov 26, 2021

I would have thought using the monotonic clock would be the best choice (if available) as it is not susceptible to changes in the wall clock. It only counts in one direction -- forward.

If that is not an acceptable choice, then perhaps a configurable option (eg. via cmake) might be the best compromise.

I'm using NNG in an embedded system that doesn't haven a battery backed-up RTC. It uses NTP to get time from a local machine and then when GPS is available it will sync with that time. It wasn't until the local machine was out by nng_setopt_ms(socket, NNG_OPT_RECVTIMEO, timeout_ms) seconds compared to the GPS time that we noticed that there were issues.

Once we figured it out, when using CLOCK_REALTIME simply changing your time while running NNG will impact your timeouts.

@gdamore
Copy link
Contributor

gdamore commented Nov 28, 2021

I've been thinking about this, and I believe it was my intent to use CLOCK_MONOTONIC whenever possible. The fact that this is is not happening is a bug.

@mbush00
Copy link
Contributor Author

mbush00 commented Nov 28, 2021

Makes sense :)

Perhaps the fix is just to remove the #define NNG_USE_CLOCKID CLOCK_REALTIME and remove the #else block. If NNG_USE_CLOCKID was defined elsewhere then we should assume whomever set it knows what they were doing.

I don't know if there are implementations without CLOCK_MONOTONIC, but the following would implement your initital intentions based on the comment:

#ifndef CLOCK_REALTIME
#define NNG_USE_GETTIMEOFDAY
#elif !defined(NNG_USE_CLOCKID)
#if defined(CLOCK_MONOTONIC)
#define NNG_USE_CLOCKID CLOCK_MONOTONIC
#else
#define NNG_USE_CLOCKID CLOCK_REALTIME
#endif

@gdamore
Copy link
Contributor

gdamore commented Nov 28, 2021

It turns that is not sufficient. Because Apple.

I will be posting a fix shortly.

@mbush00
Copy link
Contributor Author

mbush00 commented Nov 28, 2021

ahh... good ol'Apple.

Thanks for the fix.

JaylinYu pushed a commit to nanomq/nng that referenced this issue Nov 29, 2021
This makes CLOCK_MONOTONIC the default (as it should have been)
for platforms that have it defined, except for Apple platforms which lack
support for using anything other than the real time clock with condition
variables.  (And unfortunately silently ignore attempts to do otherwise.)
JaylinYu pushed a commit to nanomq/nng that referenced this issue Nov 29, 2021
This makes CLOCK_MONOTONIC the default (as it should have been)
for platforms that have it defined, except for Apple platforms which lack
support for using anything other than the real time clock with condition
variables.  (And unfortunately silently ignore attempts to do otherwise.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants