FCOS PXE install fails to fetch image with wrong system BIOS time #1158

Open
HuijingHei opened this issue Apr 4, 2023 · 5 comments

Comments

@HuijingHei
Member

See console logs:

[  103.080254] systemd-resolved[2869]: System hostname changed to 'ampere-mtsnow-altramax-13.khw4.lab.eng.bos.redhat.com'. 
[  103.140581] systemd[1]: Starting NetworkManager-dispatcher.service - Network Manager Script Dispatcher Service... 
[  103.170913] coreos-installer-service[2952]: coreos-installer install /dev/nvme0n1 --ignition-url http://file.nay.redhat.com/~hhei/beaker-install.ign --insecure-ignition --firstboot-args rd.neednet=1 ip=dhcp  --image-url https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1. 
[  103.220301] coreos-installer-service[2952]: 0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz --insecure --fetch-retries infinite 
[  103.260377] NetworkManager[2881]: <info>  [1677024077.4527] device (eno1): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed') 
[  103.300233] NetworkManager[2881]: <info>  [1677024077.4530] device (eno1): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed') 
[  103.340206] NetworkManager[2881]: <info>  [1677024077.4537] manager: NetworkManager state is now CONNECTED_SITE 
[  103.370441] NetworkManager[2881]: <info>  [1677024077.4542] device (eno1): Activation: successful, device activated. 
[  103.400186] NetworkManager[2881]: <info>  [1677024077.4550] manager: NetworkManager state is now CONNECTED_GLOBAL 
[  103.430176] NetworkManager[2881]: <info>  [1677024077.4557] manager: startup complete 
[  103.460168] systemd[1]: Finished NetworkManager-wait-online.service - Network Manager Wait Online. 
[  103.490154] systemd[1]: Started NetworkManager-dispatcher.service - Network Manager Script Dispatcher Service. 
[  103.520243] systemd[1]: Reached target network-online.target - Network is Online. 
[  103.550191] systemd[1]: Starting coreos-installer.service - CoreOS Installer... 
[  103.580178] systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive. 
[  103.610200] systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive. 
[  104.376439] coreos-installer-service[2963]: Downloading image from https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz 
[  104.430262] coreos-installer-service[2963]: Downloading signature from https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz.sig 
[  104.523221] coreos-installer-service[2963]: Error fetching 'https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz.sig': error sending request for url (https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz.sig): error trying to connect: error:0A000086:SSL routines:tls_post_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1889: (certificate is not yet valid) 
[  104.600210] coreos-installer-service[2963]: Sleeping 1s and retrying... 
@jlebon changed the title from "Get error about SSL when install FCOS in beaker using pxe" to "FCOS PXE install fails to fetch image with "certificate is not yet valid"" on Apr 4, 2023
@jlebon
Member

jlebon commented Apr 4, 2023

This comes from an internal chat. From Colin there:

My guess here is that the system BIOS time is incorrect. Probably need to block on chrony/timesyncd before running coreos-installer. This is something we should make more ergonomic.
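A minimal sketch of what "blocking on time sync" could look like, written as a small Python wrapper. This is not how coreos-installer or its service unit behaves today; the function names `clock_is_synchronized` and `wait_for_clock` are hypothetical, and the sketch assumes that `timedatectl show --property=NTPSynchronized --value` prints `yes` once chronyd has set the clock.

```python
import subprocess
import time


def clock_is_synchronized() -> bool:
    """Ask systemd whether the kernel reports the clock as NTP-synchronized."""
    result = subprocess.run(
        ["timedatectl", "show", "--property=NTPSynchronized", "--value"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip() == "yes"


def wait_for_clock(check=clock_is_synchronized, timeout=300.0, interval=2.0,
                   sleep=time.sleep, now=time.monotonic) -> bool:
    """Poll `check` until it returns True or `timeout` seconds have elapsed.

    The deadline uses a monotonic clock on purpose: the wall clock is
    exactly what is expected to jump while we wait.
    """
    deadline = now() + timeout
    while True:
        if check():
            return True
        if now() >= deadline:
            return False
        sleep(interval)
```

A caller would run the install command only after `wait_for_clock()` returns `True`, and could fall back to the existing retry behaviour if it returns `False`.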

@HuijingHei
Member Author

@cgwalters I cannot access the console of the machine. Is there any way to update or sync the system BIOS time?

@jlebon
Member

jlebon commented Apr 5, 2023

Note that coreos-installer should keep retrying. If the chrony theory is correct, then eventually it should succeed once chrony has stepped the clock (IIRC, it does allow stepping once initially). Do you see any output from chrony?

@HuijingHei
Member Author

This comes from an internal chat. From Jonathan there:

instead of doing an automated install, try passing an Ignition config that e.g. adds your SSH key and then SSH into the live environment to poke around and manually try the installation. the output of timedatectl and systemctl status chronyd would be interesting

@HuijingHei
Member Author

Installing manually in the live environment succeeds.

It looks like @cgwalters is right. The chronyd log line "System clock was stepped by 3895209.489901 seconds" shows that chronyd synced the time, and after that the signature file xx.sig could be downloaded.

[core@ampere-mtsnow-altramax-13 ~]$ systemctl status chronyd
● chronyd.service - NTP client/server
     Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
             /usr/lib/systemd/system/chronyd.service.d
             └─platform-chrony.conf
     Active: active (running) since Wed 2023-02-22 00:01:07 UTC; 1 month 14 days ago
       Docs: man:chronyd(8)
             man:chrony.conf(5)
    Process: 2920 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
   Main PID: 2923 (chronyd)
      Tasks: 1 (limit: 76276)
     Memory: 2.2M
        CPU: 74ms
     CGroup: /system.slice/chronyd.service
             └─2923 /usr/sbin/chronyd -F 2

Feb 22 00:01:06 localhost.localdomain chronyd[2923]: chronyd version 4.3 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +NTS +SEC>
Feb 22 00:01:06 localhost.localdomain chronyd[2923]: Using right/UTC timezone to obtain leap second data
Feb 22 00:01:06 localhost.localdomain chronyd[2923]: Loaded seccomp filter (level 2)
Feb 22 00:01:06 localhost.localdomain systemd[1]: Starting chronyd.service - NTP client/server...
Feb 22 00:01:07 localhost.localdomain systemd[1]: Started chronyd.service - NTP client/server.
Feb 22 00:01:26 ampere-mtsnow-altramax-13.khw4.lab.eng.bos.redhat.com chronyd[2923]: Selected source 10.2.32.38
Feb 22 00:01:26 ampere-mtsnow-altramax-13.khw4.lab.eng.bos.redhat.com chronyd[2923]: System clock wrong by 3895209.489901 seconds
Apr 08 02:01:35 ampere-mtsnow-altramax-13.khw4.lab.eng.bos.redhat.com chronyd[2923]: System clock was stepped by 3895209.489901 seconds
Apr 08 02:01:35 ampere-mtsnow-altramax-13.khw4.lab.eng.bos.redhat.com chronyd[2923]: System clock TAI offset set to 37 seconds
Apr 08 02:03:56 ampere-mtsnow-altramax-13.khw4.lab.eng.bos.redhat.com chronyd[2923]: Source 2600:3c03:e002:1300::10 replaced with 68.233.45.146 (2.fedora.poo>
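As a sanity check on the log above, the step size can be added to the pre-step timestamp to see whether it lands on the post-step timestamp. The year 2023 is an assumption taken from the "Active: ... since Wed 2023-02-22" line, and the journal timestamps only have one-second precision, so this is an approximate cross-check rather than an exact one.

```python
from datetime import datetime, timedelta

# Timestamp of "System clock wrong by ..." (year assumed from the unit status).
before_step = datetime(2023, 2, 22, 0, 1, 26)

# Offset reported by chronyd, in seconds.
offset = timedelta(seconds=3895209.489901)

after_step = before_step + offset

print("offset in whole days:", offset.days)
print("clock after step:    ", after_step.replace(microsecond=0))
```

The offset is about 45 days, which is consistent with the "1 month 14 days ago" shown for the unit's start time and with the first post-step journal entry at Apr 08 02:01:35.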

[core@ampere-mtsnow-altramax-13 ~]$ timedatectl
               Local time: Sat 2023-04-08 02:06:19 UTC
           Universal time: Sat 2023-04-08 02:06:19 UTC
                 RTC time: n/a
                Time zone: UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

[core@ampere-mtsnow-altramax-13 ~]$ sudo coreos-installer install /dev/nvme0n1 --ignition-url http://xxx/~hhei/beaker-install.ign --insecure-ignition --image-url https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz --insecure --fetch-retries infinite
Downloading image from https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz
Downloading signature from https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz.sig
Partitions in use on /dev/nvme0n1:
    /dev/nvme0n1p3 in use by /dev/dm-1
    /dev/nvme0n1p3 in use by /dev/dm-2
    /dev/nvme0n1p3 in use by /dev/dm-0
Error: checking for exclusive access to /dev/nvme0n1

Caused by:
    found busy partitions

[core@ampere-mtsnow-altramax-13 ~]$ sudo vgremove rhel_ampere-mtsnow-altramax-13

[core@ampere-mtsnow-altramax-13 ~]$ sudo coreos-installer install /dev/nvme0n1 --ignition-url http://xxx/~hhei/beaker-install.ign --insecure-ignition --image-url https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz --insecure --fetch-retries infinite
Downloading image from https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz
Downloading signature from https://builds.coreos.fedoraproject.org/prod/streams/next/builds/38.20230322.1.0/aarch64/fedora-coreos-38.20230322.1.0-metal.aarch64.raw.xz.sig
> Read disk 608.2 MiB/608.2 MiB (100%)   
gpg: Signature made Wed Mar 22 23:23:04 2023 UTC
gpg:                using RSA key 6A51BBABBA3D5467B6171221809A8D7CEB10B464
gpg: checking the trustdb
gpg: marginals needed: 3  completes needed: 1  trust model: pgp
gpg: depth: 0  valid:   4  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 4u
gpg: Good signature from "Fedora (38) <fedora-38-primary@fedoraproject.org>" [ultimate]
Writing Ignition config
Install complete.

@HuijingHei changed the title from "FCOS PXE install fails to fetch image with "certificate is not yet valid"" to "FCOS PXE install fails to fetch image with wrong system BIOS time" on Apr 10, 2023
HuijingHei added a commit to HuijingHei/coreos-installer that referenced this issue Apr 10, 2023