ntpd: NTP + DNSSEC chicken-and-egg problem at boot. #10409

Open
R-Adrian opened this issue Oct 31, 2019 · 1 comment


@R-Adrian R-Adrian commented Oct 31, 2019

Maintainer: @tripolar
Environment:

# cat /etc/openwrt_release
DISTRIB_ID='SuperWRT'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r11362-4bf9bec361'
DISTRIB_TARGET='ar71xx/generic'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='SuperWRT SNAPSHOT r11362-4bf9bec361'
DISTRIB_TAINTS='no-all'
# uname -a
Linux MyRouter-v2 4.14.150 #0 Wed Oct 30 10:16:25 2019 mips GNU/Linux

Problem description:
Most routers do not have a built-in hardware clock and try to obtain the time through NTP at boot.
If DNSSEC is also configured and NSEC validation is enforced, this becomes impossible: the router cannot obtain a usable time reference because it cannot do proper secure DNS resolution of the time servers' hostnames.

The relevant bits from /etc/config/dhcp that turn NTPD into a dead duck at boot, which in turn makes all DNSSEC resolution fail because of the time difference, are:

config dnsmasq
	option dnssec '1'
	option dnsseccheckunsigned '1'

Tentative solution:
Would it be possible to adjust the startup script of NTPD so that it first tries to obtain a rough time reference from somewhere, without relying on the time server hostnames configured in /etc/config/system?
/etc/init.d/sysfixtime is useless when the router doesn't have a built-in hardware clock.

Maybe query, a couple of times at boot, one time server that has a static IP address, to obtain a usable time reference so that DNSSEC validation can be bootstrapped later on?

These could probably be used here:
Google time servers https://time.google.com
Cloudflare time servers https://time.cloudflare.com

These servers are members of the NTP pool project and they have fixed IP addresses published in DNS for worldwide use:
Google:
216.239.35.0
216.239.35.4
216.239.35.8
216.239.35.12
2001:4860:4806:0:0:0:0:0
2001:4860:4806:4:0:0:0:0
2001:4860:4806:8:0:0:0:0
2001:4860:4806:c:0:0:0:0

Cloudflare:
162.159.200.1
162.159.200.123
2606:4700:f1:0:0:0:0:1
2606:4700:f1:0:0:0:0:123
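
The idea of querying servers by IP address before DNS is usable could be sketched roughly as below. This is only an illustration, not the NTPD init script: `bootstrap_time`, `BOOTSTRAP_SERVERS`, `SET_TIME_CMD` and `BOOTSTRAP_ROUNDS` are made-up names, and the default command assumes the BusyBox ntpd applet is compiled in (`-n` stay in foreground, `-q` quit once the clock is set, `-p` peer). It is called as `busybox ntpd` because the `-p` option of the ntpd from the ntp package means something else (a pid file).

```shell
#!/bin/sh
# Sketch: obtain a rough clock from NTP servers addressed by IP, so that no
# DNS lookup (and no DNSSEC validation) is needed before the time is set.

# IPv4 addresses taken from the list above.
BOOTSTRAP_SERVERS="${BOOTSTRAP_SERVERS:-216.239.35.0 216.239.35.4 162.159.200.1 162.159.200.123}"

# Assumption: a one-shot client that takes the server as its last argument.
# Replace with whatever one-shot NTP client the image actually provides.
SET_TIME_CMD="${SET_TIME_CMD:-busybox ntpd -n -q -p}"

# Number of passes over the server list ("a couple of times at boot").
BOOTSTRAP_ROUNDS="${BOOTSTRAP_ROUNDS:-2}"

# Try each server in turn. Print the first server that worked and return 0,
# or return 1 if every attempt in every round failed.
bootstrap_time() {
    round=0
    while [ "$round" -lt "$BOOTSTRAP_ROUNDS" ]; do
        for ip in $BOOTSTRAP_SERVERS; do
            if $SET_TIME_CMD "$ip" >/dev/null 2>&1; then
                echo "$ip"
                return 0
            fi
        done
        round=$((round + 1))
    done
    return 1
}
```

A caller would run `bootstrap_time` once the WAN interface is up and before the regular NTP daemon starts. One caveat: the one-shot client may block for a long time on an unreachable server, so a real script would also need some form of timeout around each attempt.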

Or maybe somehow implement the secure RoughTime protocol to obtain a reliable, rough time at boot?
https://roughtime.googlesource.com/roughtime
https://blog.cloudflare.com/roughtime/

Note: I also opened a related bug for Busybox NTPD (base system package), since it is affected by a similar issue.
https://bugs.openwrt.org/index.php?do=details&task_id=2574


@darrentinghc darrentinghc commented Nov 13, 2019

This excessive time skew at boot also seems to corrupt collectd RRA files. The start sequence for ntp in OpenWRT is way too late, much later than collectd itself, so collectd always starts before ntp has initialized the system time. In my test case the system date was set to the year 2033 when collectd started, so all RRA files stopped updating after the system rebooted.

My current workaround is to disable collectd autostart and start it from rc.local after a 120-second sleep, which seems to work fine (far from perfect: if the WAN link takes more than 120 seconds to come up, the RRA files are toast again).

Add to rc.local:
(sleep 120; /etc/init.d/collectd start) &

Ideally, the collectd startup script in OpenWRT should include ntp service detection: if ntp is configured and enabled, collectd should wait in the background for ntp sync to complete before proceeding.
This NTP issue typically should not cause much trouble if the default time is skewed backward, e.g. to the year 1990, but strangely the OpenWRT default time is set forward to the year 2033. Is this a bug?
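
The "wait in the background for ntp sync" idea could look something like the sketch below, replacing the fixed 120-second sleep with a poll. It is not an existing OpenWRT mechanism: `wait_until`, `time_is_synced` and the marker file `/tmp/ntp-synced` are hypothetical, and something else (for example a script run by the NTP client once it has set the clock) would have to create that file.

```shell
#!/bin/sh
# Sketch: delay a service until the clock has been synchronised.

MAX_WAIT="${MAX_WAIT:-600}"            # give up after this many seconds
POLL_INTERVAL="${POLL_INTERVAL:-5}"    # seconds between checks

# Hypothetical marker file: nothing creates it by default. It stands in for
# whatever "ntp has synced" signal is available on the system.
SYNC_MARKER="${SYNC_MARKER:-/tmp/ntp-synced}"

time_is_synced() {
    [ -e "$SYNC_MARKER" ]
}

# Run the given check command repeatedly until it succeeds (return 0)
# or until MAX_WAIT seconds of sleeping have accumulated (return 1).
wait_until() {
    waited=0
    until "$@" >/dev/null 2>&1; do
        [ "$waited" -ge "$MAX_WAIT" ] && return 1
        sleep "$POLL_INTERVAL"
        waited=$((waited + POLL_INTERVAL))
    done
    return 0
}

# Possible use in rc.local, instead of the fixed sleep:
# (wait_until time_is_synced && /etc/init.d/collectd start) &
```

The elapsed time is counted from the sleep intervals rather than read from `date`, because the system clock is exactly what may jump while the loop is waiting.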
