Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

set-system-time ignores chrony #12

Closed
sailoog opened this issue Feb 12, 2023 · 15 comments
Closed

set-system-time ignores chrony #12

sailoog opened this issue Feb 12, 2023 · 15 comments

Comments

@sailoog
Copy link

sailoog commented Feb 12, 2023

chrony installed and working OK synchronizing time at boot in Debian bullseye.

set-system-time plugin enabled:

imatge

even though chrony is present and working the plugin is setting time:

imatge

sources:

pi@openplotter:~ $ chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^- 192.92.6.30                   1   6   377    61    +62ms[  +62ms] +/-  196ms
^* ntp61.kashra-server.com       2   6   377    66  +7419us[+6273us] +/-   54ms
^- ntp1.torix.ca                 1   6   377    62  +8854us[+8854us] +/-   78ms
^- ntp68.kashra-server.com       2   6     7    63  +4830us[+4830us] +/-   61ms

chrony log:

pi@openplotter:~ $ journalctl -b -u chrony
-- Journal begins at Mon 2022-07-18 20:04:04 CEST, ends at Sun 2023-03-12 14:40:52 CET. --
Feb 12 14:29:17 openplotter systemd[1]: Starting chrony, an NTP client/server...
Feb 12 14:29:18 openplotter chronyd[624]: chronyd version 4.0 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGN>
Feb 12 14:29:18 openplotter chronyd[624]: Frequency 8.309 +/- 12.134 ppm read from /var/lib/chrony/chrony.drift
Feb 12 14:29:19 openplotter chronyd[624]: Using right/UTC timezone to obtain leap second data
Feb 12 14:29:19 openplotter chronyd[624]: Loaded seccomp filter
Feb 12 14:29:19 openplotter systemd[1]: Started chrony, an NTP client/server.
Feb 12 14:29:53 openplotter chronyd[624]: Selected source 91.236.251.13 (2.debian.pool.ntp.org)
Feb 12 14:29:53 openplotter chronyd[624]: System clock wrong by 20.297028 seconds
Feb 12 14:30:13 openplotter chronyd[624]: System clock was stepped by 20.297028 seconds
Feb 12 14:30:13 openplotter chronyd[624]: System clock TAI offset set to 37 seconds
Feb 12 14:31:19 openplotter chronyd[624]: System clock wrong by 1.699830 seconds
Feb 12 14:31:21 openplotter chronyd[624]: System clock was stepped by 1.699825 seconds
Feb 12 14:32:26 openplotter chronyd[624]: Can't synchronise: no majority
Feb 12 14:32:27 openplotter chronyd[624]: Selected source 51.89.85.70 (2.debian.pool.ntp.org)
Mar 12 14:33:29 openplotter chronyd[624]: Forward time jump detected!
Mar 12 14:33:29 openplotter chronyd[624]: Can't synchronise: no selectable sources
Mar 12 14:35:39 openplotter chronyd[624]: Selected source 51.89.85.70 (2.debian.pool.ntp.org)
Mar 12 14:35:39 openplotter chronyd[624]: System clock wrong by -2419198.880293 seconds
Mar 12 14:35:42 openplotter chronyd[624]: Selected source 206.108.0.131 (2.debian.pool.ntp.org)
Mar 12 14:36:43 openplotter chronyd[624]: Selected source 51.89.85.70 (2.debian.pool.ntp.org)
Mar 12 14:36:45 openplotter chronyd[624]: Source 91.236.251.13 replaced with 51.195.26.117 (2.debian.pool.ntp.org)

This has critical side effects as failing handshakes when updating repositories:

Err:5 https://deb.nodesource.com/node_16.x bullseye InRelease
  Certificate verification failed: The certificate is NOT trusted. The revocation or OCSP data are old and have been superseded.  Could not handshake: Error in the certificate verification. [IP: 212.230.237.170 443]

disabling set-system-time and restarting chrony everything works OK:

pi@openplotter:~ $ journalctl -b -u chrony
-- Journal begins at Sat 2023-02-11 13:24:33 CET, ends at Sun 2023-03-12 14:50:43 CET. --
Feb 12 14:29:17 openplotter systemd[1]: Starting chrony, an NTP client/server...
Feb 12 14:29:18 openplotter chronyd[624]: chronyd version 4.0 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGN>
Feb 12 14:29:18 openplotter chronyd[624]: Frequency 8.309 +/- 12.134 ppm read from /var/lib/chrony/chrony.drift
Feb 12 14:29:19 openplotter chronyd[624]: Using right/UTC timezone to obtain leap second data
Feb 12 14:29:19 openplotter chronyd[624]: Loaded seccomp filter
Feb 12 14:29:19 openplotter systemd[1]: Started chrony, an NTP client/server.
Feb 12 14:29:53 openplotter chronyd[624]: Selected source 91.236.251.13 (2.debian.pool.ntp.org)
Feb 12 14:29:53 openplotter chronyd[624]: System clock wrong by 20.297028 seconds
Feb 12 14:30:13 openplotter chronyd[624]: System clock was stepped by 20.297028 seconds
Feb 12 14:30:13 openplotter chronyd[624]: System clock TAI offset set to 37 seconds
Feb 12 14:31:19 openplotter chronyd[624]: System clock wrong by 1.699830 seconds
Feb 12 14:31:21 openplotter chronyd[624]: System clock was stepped by 1.699825 seconds
Feb 12 14:32:26 openplotter chronyd[624]: Can't synchronise: no majority
Feb 12 14:32:27 openplotter chronyd[624]: Selected source 51.89.85.70 (2.debian.pool.ntp.org)
Mar 12 14:33:29 openplotter chronyd[624]: Forward time jump detected!
Mar 12 14:33:29 openplotter chronyd[624]: Can't synchronise: no selectable sources
Mar 12 14:35:39 openplotter chronyd[624]: Selected source 51.89.85.70 (2.debian.pool.ntp.org)
Mar 12 14:35:39 openplotter chronyd[624]: System clock wrong by -2419198.880293 seconds
Mar 12 14:35:42 openplotter chronyd[624]: Selected source 206.108.0.131 (2.debian.pool.ntp.org)
Mar 12 14:36:43 openplotter chronyd[624]: Selected source 51.89.85.70 (2.debian.pool.ntp.org)
Mar 12 14:36:45 openplotter chronyd[624]: Source 91.236.251.13 replaced with 51.195.26.117 (2.debian.pool.ntp.org)
Mar 12 14:50:43 openplotter chronyd[624]: chronyd exiting
Mar 12 14:50:43 openplotter systemd[1]: Stopping chrony, an NTP client/server...
Mar 12 14:50:43 openplotter systemd[1]: chrony.service: Succeeded.
Mar 12 14:50:43 openplotter systemd[1]: Stopped chrony, an NTP client/server.
Mar 12 14:50:43 openplotter systemd[1]: Starting chrony, an NTP client/server...
Mar 12 14:50:43 openplotter chronyd[7781]: chronyd version 4.0 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIG>
Mar 12 14:50:43 openplotter chronyd[7781]: Frequency 20.494 +/- 35.339 ppm read from /var/lib/chrony/chrony.drift
Mar 12 14:50:43 openplotter chronyd[7781]: Using right/UTC timezone to obtain leap second data
Mar 12 14:50:43 openplotter chronyd[7781]: Loaded seccomp filter
Mar 12 14:50:43 openplotter systemd[1]: Started chrony, an NTP client/server.
Mar 12 14:50:48 openplotter chronyd[7781]: Selected source 162.159.200.123 (2.debian.pool.ntp.org)
Mar 12 14:50:48 openplotter chronyd[7781]: System clock wrong by -2419116.741169 seconds
Feb 12 14:52:12 openplotter chronyd[7781]: System clock was stepped by -2419116.741169 seconds
Feb 12 14:52:12 openplotter chronyd[7781]: System clock TAI offset set to 37 seconds
Feb 12 14:52:13 openplotter chronyd[7781]: Selected source 176.9.157.155 (2.debian.pool.ntp.org)
@sailoog
Copy link
Author

sailoog commented Feb 12, 2023

This the output of the command set-system-time uses:

pi@openplotter:~ $ chronyc sources 2> /dev/null | cut -c2 | grep -ce '-\|*'
1

@tkurki
Copy link
Member

tkurki commented Feb 12, 2023

The log lines Can't synchronise: no majority and Can't synchronise: no selectable sources look very suspicious to me. I see no evidence that set-system-time is not working as expected: there is trouble with ntp so let's use GPS time.

The real problem here seems to be that something is producing bad datetime data, time jumps forward to March 12. Note that this would happen with any loss of internet connectivity, so imho that is the real problem to solve, otherwise it makes no sense to keep set-system-time active in the first place.

If this is easy to reproduce I suggest you add logging of what chrony is actually outputting for example with chronyc sources 2> /dev/null | tee -a /tmp/set-system-time.log | cut -c2 | grep -ce '-\|*'. Looking at the command output afterwards tells us nothing, as clearly the grep did not find > 0 lines. Enabling debug logging for set-system-time would also log when time is set.

@tkurki
Copy link
Member

tkurki commented Feb 12, 2023

We could change set-system-time behavior so that a single zero sources count does not activate it but ntp needs to be unavailable for a longer period.

@tkurki
Copy link
Member

tkurki commented Feb 12, 2023

Oh I missed the message that set-system-time has set time four times. But there's still something odd going on with the sources: your first chronyc sources output has 4 grep hits, while your second has 1.

@sailoog
Copy link
Author

sailoog commented Feb 12, 2023

Thanks for the tips. I can reproduce this every time in a fresh installation. I think I have find out why set-system-time is ignoring chrony:

pi@openplotter:~ $ chronyc sources -v

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current best, '+' = combined, '-' = not combined,
| /             'x' = may be in error, '~' = too variable, '?' = unusable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^? time.cloudflare.com           3   6   377     5   -328ms[+40320m] +/-   29ms
^* sarah.vandalsweb.com          3   6    37     4   -347ms[  -51ms] +/-   95ms
^- 45.66.52.71                   2   6   377     5   -335ms[  -40ms] +/-   66ms
^- 185.179.104.7                 2   6   377     4   -338ms[  -43ms] +/-   86ms
pi@openplotter:~ $ chronyc sources 2> /dev/null | cut -c2 | grep -ce '-\|*'
3
pi@openplotter:~ $ chronyc sources -v

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current best, '+' = combined, '-' = not combined,
| /             'x' = may be in error, '~' = too variable, '?' = unusable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^? time.cloudflare.com           3   6   377     8  +40320m[+40320m] +/-   53ms
^? sarah.vandalsweb.com          3   6     7     7  +40320m[+40320m] +/-   92ms
^? 45.66.52.71                   2   6   377     8  +40320m[+40320m] +/-   60ms
^? 185.179.104.7                 2   6   377     8  +40320m[+40320m] +/-   78ms
pi@openplotter:~ $ chronyc sources 2> /dev/null | cut -c2 | grep -ce '-\|*'
0

It seems that sources marked as '?' are ignored and that makes sense.
Now I have to find out why it fails when set-system-time sets the time. Time coming from GPS is ok but the system time seems to jump forward 2419199.265061 seconds and that is a lot of seconds, aprox 27 days and this is why the repos complain:

E: Release file for http://security.debian.org/debian-security/dists/bullseye-security/InRelease is expired (invalid since 21d 6h 31min 55s). Updates for this repository will not be applied.
E: Release file for http://deb.debian.org/debian/dists/bullseye-updates/InRelease is expired (invalid since 21d 4h 28min 47s). Updates for this repository will not be applied.
E: Release file for http://deb.debian.org/debian/dists/bullseye-backports/InRelease is expired (invalid since 21d 4h 28min 47s). Updates for this repository will not be applied.

I will try more GPSs

@FredericGuilbault
Copy link
Contributor

I don't have an RPI with me ATM, so im guessing things out of my head, but I'm wondering if there would be some kind of race condition between fake-hwclock and chrony that push time forward ... maybe

@tkurki
Copy link
Member

tkurki commented Feb 13, 2023

@FredericGuilbault check the screenshot, where set-system-time says it has set the time.

From the log it looks like it jumps forward exactly one month. Now that it is February that is 2419200 seconds. I suspect your source just produces navigation.datetime off by one month. Earlier you said Time coming from GPS is ok - was that literally just time or date and time?

@sailoog
Copy link
Author

sailoog commented Feb 13, 2023

This is really weird. Time and date from GPS are OK and navigation.datetime key is also correct in the server. It is only after set-system-time sets the system time when there is the 1 month jump but the navigation.datetime key keeps the right date/time. I will make more tests this afternoon. Data comes from a Raymarine E80 display by Seatalk1 connection. I will switch to NMEA 0183 and CAN connection and try another GPS.

@tkurki
Copy link
Member

tkurki commented Feb 13, 2023

Rather log the input and the set tume command, another source may just work and we won’t be any wiser.

@sailoog
Copy link
Author

sailoog commented Feb 13, 2023

I am getting more and more suspicious of the plugin. This is my navigation.datetime key values before enabling set-system-time plugin:

{
  "meta": {
    "description": "Time and Date from the GNSS Positioning System"
  },
  "value": "2023-02-13T18:01:01.000Z",
  "$source": "seatalk1.ST",
  "timestamp": "2023-02-13T18:01:00.859Z",
  "sentence": "ALK"
}

after enabling the plugin this is the SK log, the entry"[object Object]" is in red:

Feb 13 19:01:03 GET /signalk/v1/api/vessels/self/navigation/datetime 304 6.123 ms - -
Feb 13 19:01:04 GET /signalk/v1/api/vessels/self/navigation/datetime 304 4.916 ms - -
Feb 13 19:01:10 [object Object]
Mar 13 19:01:11 2023-03-13T18:01:11.010Z @signalk/set-system-time System time set to 2023-03-13T18:01:11.000Z 

this is the service log:

Mar 13 19:01:11 openplotter sudo[1391]: pam_unix(sudo:session): session closed for user root
Mar 13 19:01:11 openplotter signalk-server[430]: 2023-03-13T18:01:11.010Z @signalk/set-system-time System time set to 2023-03-13T18:01:11.000Z
Mar 13 19:02:11 openplotter sudo[1457]:       pi : PWD=/home/pi/.signalk ; USER=root ; COMMAND=/usr/bin/date
Mar 13 19:02:11 openplotter sudo[1458]:       pi : PWD=/home/pi/.signalk ; USER=root ; COMMAND=/usr/bin/date --iso-8601 -u -s 2023-03-13T18:02:11.000Z
Mar 13 19:02:11 openplotter sudo[1458]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)
Mar 13 19:02:11 openplotter sudo[1457]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)

and this is how the navigation.datetime key looks now:

{
  "meta": {
    "description": "Time and Date from the GNSS Positioning System"
  },
  "value": "2023-02-13T18:02:21.000Z",
  "$source": "seatalk1.ST",
  "timestamp": "2023-03-13T18:02:20.911Z",
  "sentence": "ALK"
}

as you can see the values is still ok. If I use that date/time with date command everything works:

pi@openplotter:~ $ sudo date --iso-8601 -u -s "2023-02-13T18:02:21.000Z"
2023-02-13

It seems that is the plugin who adds a unit to the month. Is it working for you?

@sailoog
Copy link
Author

sailoog commented Feb 13, 2023

I got it!

Changed to CAN connection and after 8 iterations set-system-time keeps the right date and everything works.

this is the key value:

{
  "meta": {
    "description": "Time and Date from the GNSS Positioning System"
  },
  "value": "2023-02-13T18:56:43.51200Z",
  "$source": "can0.160",
  "timestamp": "2023-02-13T18:56:43.989Z",
  "pgn": 126992
}

as you can see the difference with the seatalk1 source is the amount of decimals in seconds. Any idea how to fix this?

@tkurki
Copy link
Member

tkurki commented Feb 13, 2023

I sincerely doubt it is up to the number of decimals.

Looking at full data before and after does not work, if seatalk produces always two datagrams in succession and the first one is bad and the second one is good. You’ll just see the second, good one’s data. At least that is one possibility.

There are two st datagrams with datetime, I think the bug is at

https://github.com/SignalK/nmea0183-signalk/blob/6e2602c0570c7121f493a1c6ab9b29f8aa2b29c1/hooks/seatalk/0x54.js#L55

compare to

https://github.com/SignalK/nmea0183-signalk/blob/6e2602c0570c7121f493a1c6ab9b29f8aa2b29c1/hooks/seatalk/0x56.js#L44

but if they always come in succession you never see the off by 1 month alue! But the first one is what triggers this plugin.

@sailoog
Copy link
Author

sailoog commented Feb 14, 2023

Agree, probably that is the bug not the decimals. I will try later adding -1 to the datagrama 54 but I am afraid this is beyond my knowledge of signal k code. Let me know if you want me to make any test.

tkurki added a commit to SignalK/nmea0183-signalk that referenced this issue Feb 14, 2023
tkurki added a commit to SignalK/nmea0183-signalk that referenced this issue Feb 14, 2023
@tkurki
Copy link
Member

tkurki commented Feb 14, 2023

@sailoog
Copy link
Author

sailoog commented Feb 14, 2023

tested and confirmed!
I have just PR a typo correction here: #13
thanks!

@sailoog sailoog closed this as completed Feb 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants