Zincati fails to update nodes: Too many open files #1608

fifofonix · 2023-11-06T12:40:28Z

Describe the bug

Several next nodes running 39.20231022.1.0 did not update as part of the latest update cycle.

Zincati reported errors.

Restarting zincati fixed the problem.

Reproduction steps

Haven't spent time trying to reproduce.

Examining the fleet I observe:

Nodes that updated to 39.20231022.1.0 on/around 2023-10-25 18:35:58 UTC and have remained up seem to be affected.
Nodes that have since rebooted for whatever reason, and have therefore restarted zincati, have auto-updated.

Expected behavior

Node updates like it always has.

Actual behavior

Node does not update. Zincati reports errors.

System details

vSphere
39.20231022.1.0

Butane or Ignition config

No response

Additional information

The node updated on Oct 25th, and by Oct 29th a "Too many open files" error was observed; it has repeated up until recently. No other system functionality was impacted during this time.

Oct 25 18:40:14 d-node zincati[2266]: [INFO  zincati::update_agent::actor] found 1 other finalized deployment
Oct 25 18:40:14 d-node zincati[2266]: [INFO  zincati::update_agent::actor] deployment 39.20231016.1.0 (0b877da0ce6bbbcefa0806d9dbf1aad3c8c0187ae2b431a2b7ce8646cd4fc602) will be excluded from being a future update target
Oct 25 18:40:14 d-node zincati[2266]: [INFO  zincati::update_agent::actor] initialization complete, auto-updates logic enabled
Oct 25 18:40:14 d-node zincati[2266]: [INFO  zincati::strategy] update strategy: immediate
Oct 25 18:40:14 d-node zincati[2266]: [INFO  zincati::update_agent::actor] reached steady state, periodically polling for updates
Oct 25 18:40:14 d-node systemd[1]: Started zincati.service - Zincati Update Agent.
Oct 25 18:40:14 d-node audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=zincati comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 25 18:40:14 d-node zincati[2266]: [INFO  zincati::cincinnati] current release detected as not a dead-end
Oct 25 18:40:15 d-node pkexec[2658]: zincati: Executing command [USER=root] [TTY=unknown] [CWD=/] [COMMAND=/usr/libexec/zincati deadend-motd unset]
Oct 26 22:33:16 d-node zincati[2266]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: server-side error, code 502: (unknown/generic server error)
Oct 26 22:38:39 d-node zincati[2266]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: server-side error, code 502: (unknown/generic server error)
Oct 26 22:43:56 d-node zincati[2266]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: server-side error, code 502: (unknown/generic server error)
Oct 26 22:52:53 d-node zincati[2266]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: server-side error, code 502: (unknown/generic server error)
Oct 28 05:09:46 d-node zincati[2266]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: client-side error: error sending request for url (https://updates.coreos.fedoraproject.org/v1/graph?node_uuid=4849744564b14d1980785ea216f46fb9&os_checksum=8b1abc2d92a9928145c3e3c43d0319b93770b6e6687ef1f05dbd06b71c5cd0db&rollout_wariness=0.500000&stream=next&group=default&platform=vmware&os_version=39.20231022.1.0&basearch=x86_64): error trying to connect: Connection reset by peer (os error 104)
Oct 28 18:15:34 d-node zincati[2266]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: server-side error, code 502: (unknown/generic server error)
Oct 29 10:27:30 d-node zincati[2266]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: client-side error: error sending request for url (https://updates.coreos.fedoraproject.org/v1/graph?node_uuid=4849744564b14d1980785ea216f46fb9&platform=vmware&group=default&os_version=39.20231022.1.0&stream=next&rollout_wariness=0.500000&basearch=x86_64&os_checksum=8b1abc2d92a9928145c3e3c43d0319b93770b6e6687ef1f05dbd06b71c5cd0db): error trying to connect: dns error: Too many open files (os error 24)
Oct 29 10:32:36 d-node zincati[2266]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Oct 29 10:32:36 d-node zincati[2266]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: client-side error: error sending request for url (https://updates.coreos.fedoraproject.org/v1/graph?basearch=x86_64&os_version=39.20231022.1.0&node_uuid=4849744564b14d1980785ea216f46fb9&stream=next&os_checksum=8b1abc2d92a9928145c3e3c43d0319b93770b6e6687ef1f05dbd06b71c5cd0db&group=default&rollout_wariness=0.500000&platform=vmware): error trying to connect: dns error: Too many open files (os error 24)
Oct 29 10:37:36 d-node zincati[2266]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
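
The `EMFILE` errors above mean the process has hit its per-process limit on open file descriptors. A minimal sketch for checking how close a running Zincati is to that limit, assuming a Linux `/proc` filesystem; the function names are hypothetical helpers, not part of Zincati or systemd:

```shell
#!/bin/bash
# Hypothetical helpers; not part of Zincati or systemd.

# Print the soft "Max open files" limit, given /proc/<pid>/limits content on stdin.
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }'
}

# Print "<open fds> <soft limit>" for the given PID.
# Reading another user's /proc/<pid>/fd requires root.
fd_usage() {
    local pid="$1"
    local open limit
    open=$(ls "/proc/${pid}/fd" | wc -l)
    limit=$(soft_nofile_limit < "/proc/${pid}/limits")
    echo "${open} ${limit}"
}
```

For example, running `fd_usage "$(systemctl show --property MainPID --value zincati.service)"` as root prints the current descriptor count next to the soft limit. Unlike `lsof -u zincati | wc -l`, this counts only descriptors, not memory-mapped files or the header line.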

jlebon · 2023-11-07T22:09:09Z

That release has the new Zincati release: https://github.com/coreos/zincati/releases/tag/v0.0.26. Possibly related to one of those dependency bumps?

jlebon · 2023-11-07T22:25:33Z

OK and Zincati v0.0.26 just got promoted in today's stable release. Hmm, I think we need to dig into this more to understand the impact here. It might be worth pausing the rollout until we do that.

It includes Zincati v0.0.26 which may be a culprit in coreos/fedora-coreos-tracker#1608 where updates appear to break in some situations. Let's pause until we know more.

jlebon · 2023-11-07T22:35:07Z

It's an old issue, but this looks similar in that it's also about async reqwest: seanmonstar/reqwest#386

I didn't see anything obvious though in the reqwest release notes (for the range we went through in that Zincati release).

dustymabe · 2023-11-07T23:54:31Z

This problem is real:

[core@dustymabe ~]$ rpm-ostree status 
State: idle
AutomaticUpdatesDriver: Zincati
  DriverState: active; periodically polling for updates (last checked Mon 2023-11-06 18:59:21 UTC)
Deployments:
● fedora:fedora/aarch64/coreos/next
                  Version: 39.20231101.1.0 (2023-11-02T23:35:56Z)
                   Commit: 751bdb39581e6b11621af5296b973c2524e9befc1ff4c89fedbb9d2b3f3386d1
             GPGSignature: Valid signature by E8F23996F23218640CB44CBE75CF5AC418B8E74C

  fedora:fedora/aarch64/coreos/testing
                  Version: 38.20231027.2.0 (2023-10-30T16:09:34Z)
                   Commit: 05360a2543aa96bb3e8597f08b8aaa3ea14292c52cf773f804a2cfa45b919a89
             GPGSignature: Valid signature by 6A51BBABBA3D5467B6171221809A8D7CEB10B464

[core@dustymabe ~]$ uptime
 23:53:01 up 4 days, 20:24,  2 users,  load average: 0.00, 0.00, 0.00
[core@dustymabe ~]$ sudo lsof -u zincati | wc -l
1039


[core@dustymabe ~]$ systemctl status zincati  | cat
● zincati.service - Zincati Update Agent
     Loaded: loaded (/usr/lib/systemd/system/zincati.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: active (running) since Fri 2023-11-03 03:29:34 UTC; 4 days ago
       Docs: https://github.com/coreos/zincati
   Main PID: 1159 (zincati)
     Status: "periodically polling for updates (last checked Mon 2023-11-06 18:59:21 UTC)"
      Tasks: 6 (limit: 9325)
     Memory: 22.2M
        CPU: 18.941s
     CGroup: /system.slice/zincati.service
             └─1159 /usr/libexec/zincati agent -v

Nov 07 23:30:40 dustymabe.com zincati[1159]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Nov 07 23:30:40 dustymabe.com zincati[1159]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: client-side error: error sending request for url (https://updates.coreos.fedoraproject.org/v1/graph?group=default&node_uuid=631b7b2d1424438aba2b3c7755225957&platform=gcp&stream=next&os_checksum=751bdb39581e6b11621af5296b973c2524e9befc1ff4c89fedbb9d2b3f3386d1&basearch=aarch64&os_version=39.20231101.1.0): error trying to connect: dns error: Too many open files (os error 24)
Nov 07 23:36:04 dustymabe.com zincati[1159]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Nov 07 23:36:04 dustymabe.com zincati[1159]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: client-side error: error sending request for url (https://updates.coreos.fedoraproject.org/v1/graph?os_checksum=751bdb39581e6b11621af5296b973c2524e9befc1ff4c89fedbb9d2b3f3386d1&platform=gcp&os_version=39.20231101.1.0&node_uuid=631b7b2d1424438aba2b3c7755225957&basearch=aarch64&group=default&stream=next): error trying to connect: dns error: Too many open files (os error 24)
Nov 07 23:41:19 dustymabe.com zincati[1159]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Nov 07 23:41:19 dustymabe.com zincati[1159]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: client-side error: error sending request for url (https://updates.coreos.fedoraproject.org/v1/graph?node_uuid=631b7b2d1424438aba2b3c7755225957&basearch=aarch64&os_checksum=751bdb39581e6b11621af5296b973c2524e9befc1ff4c89fedbb9d2b3f3386d1&os_version=39.20231101.1.0&group=default&platform=gcp&stream=next): error trying to connect: dns error: Too many open files (os error 24)
Nov 07 23:46:34 dustymabe.com zincati[1159]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Nov 07 23:46:34 dustymabe.com zincati[1159]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: client-side error: error sending request for url (https://updates.coreos.fedoraproject.org/v1/graph?basearch=aarch64&os_checksum=751bdb39581e6b11621af5296b973c2524e9befc1ff4c89fedbb9d2b3f3386d1&os_version=39.20231101.1.0&node_uuid=631b7b2d1424438aba2b3c7755225957&stream=next&platform=gcp&group=default): error trying to connect: dns error: Too many open files (os error 24)
Nov 07 23:51:40 dustymabe.com zincati[1159]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Nov 07 23:51:40 dustymabe.com zincati[1159]: [ERROR zincati::cincinnati] failed to check Cincinnati for updates: client-side error: error sending request for url (https://updates.coreos.fedoraproject.org/v1/graph?basearch=aarch64&os_checksum=751bdb39581e6b11621af5296b973c2524e9befc1ff4c89fedbb9d2b3f3386d1&platform=gcp&os_version=39.20231101.1.0&node_uuid=631b7b2d1424438aba2b3c7755225957&stream=next&group=default): error trying to connect: dns error: Too many open files (os error 24)

cgwalters · 2023-11-08T00:42:47Z

In a quick skim of things I think this is fixed by lucab/libsystemd-rs@ee15505

cc lucab/libsystemd-rs#147

dustymabe · 2023-11-08T01:11:16Z

Looks like we have a few-day window where zincati hasn't yet run out of its allotment of open files:

[core@apu2 ~]$ sudo lsof -u zincati | wc -l
1015
[core@apu2 ~]$ uptime
 01:08:15 up 3 days, 13:52,  1 user,  load average: 0.15, 0.12, 0.05

I imagine for all testing and next nodes out there the user will have to restart zincati to unstick the system. For stable we can do a release tomorrow with a downgraded zincati and hopefully no systems get stuck: coreos/fedora-coreos-config#2720
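
The length of that window can be estimated with some rough arithmetic. This is only a sketch under stated assumptions: one descriptor leaked per poll, a poll roughly every 300 seconds (the spacing of the error lines in the logs), a soft limit of 1024 (a common default, not confirmed in this thread), and a caller-supplied number of descriptors already open at startup. The function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: rough hours until a steady one-descriptor-per-poll leak
# reaches the soft limit. All inputs are assumptions supplied by the caller.
# $1 = soft limit, $2 = descriptors open at startup, $3 = seconds between polls
hours_until_exhaustion() {
    local limit="$1" baseline="$2" interval_secs="$3"
    # Remaining descriptors times seconds per poll, converted to whole hours.
    echo $(( (limit - baseline) * interval_secs / 3600 ))
}

hours_until_exhaustion 1024 0 300   # prints 85
```

Eighty-five hours is about three and a half days, which is in the same range as the gap between the service start on Oct 25 18:40 and the first `Too many open files` error on Oct 29 10:27 in the original report.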

For lucab/libsystemd-rs@ee155056e54e for coreos/fedora-coreos-tracker#1608.

jlebon · 2023-11-08T14:17:42Z

Zincati bump in coreos/zincati#1118.

dustymabe · 2023-11-08T14:49:06Z

I think there have been other cases in the past where zincati has got into a state where a simple restart would have allowed the system to progress updating. I wonder if we shouldn't do something like this to just force restart the service periodically:

diff --git a/dist/systemd/system/zincati.service b/dist/systemd/system/zincati.service
index 1837a09..ae6fdda 100644
--- a/dist/systemd/system/zincati.service
+++ b/dist/systemd/system/zincati.service
@@ -22,6 +22,7 @@ Type=notify
 ExecStart=/usr/libexec/zincati agent ${ZINCATI_VERBOSITY}
 Restart=on-failure
 RestartSec=10s
+RuntimeMaxSec=2w
 
 [Install]
 WantedBy=multi-user.target
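
To try the same idea on a single node without changing the packaged unit, a systemd drop-in can carry the setting. This is a sketch, not something the thread prescribes: the drop-in file name `50-runtime-max.conf` and the helper function are hypothetical, and the `2w` value is simply the one from the diff above.

```shell
#!/bin/bash
# Hypothetical helper: write a drop-in that sets RuntimeMaxSec for a unit.
# $1 = drop-in directory (e.g. /etc/systemd/system/zincati.service.d)
# $2 = maximum runtime in systemd time-span syntax (e.g. 2w)
write_runtime_max_dropin() {
    local dir="$1" max="$2"
    mkdir -p "$dir" || return 1
    # The file name is an arbitrary choice for this example.
    printf '[Service]\nRuntimeMaxSec=%s\n' "$max" > "${dir}/50-runtime-max.conf"
}
```

Run as root, `write_runtime_max_dropin /etc/systemd/system/zincati.service.d 2w` followed by `systemctl daemon-reload` and `systemctl restart zincati.service` would apply it. systemd puts a service that exceeds `RuntimeMaxSec=` into a failure state, so the existing `Restart=on-failure` should bring it back after `RestartSec`.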

dustymabe · 2023-11-08T14:50:36Z

If we set the time to something shorter, it could possibly be a roundabout way to fix coreos/zincati#928

dustymabe · 2023-11-08T14:51:52Z

I wonder if we shouldn't do something like this to just force restart the service periodically

I know this would feel gross/icky. The truth of the matter is that we want code that is bug free and doesn't have any problems, but the reality is we can't foresee/catch everything and in this component of the system it's important to have an escape hatch.

dustymabe · 2023-11-08T14:57:33Z

The stable candidate we built last night seems promising:

Fedora CoreOS 38.20231027.3.2
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/tag/coreos

Last login: Wed Nov  8 14:35:30 2023
[core@cosa-devsh ~]$ 
[core@cosa-devsh ~]$ 
[core@cosa-devsh ~]$ 
[core@cosa-devsh ~]$ rpm -q zincati
zincati-0.0.25-4.fc38.x86_64
[core@cosa-devsh ~]$ while true; do date; sudo lsof -u zincati | wc -l; sleep 300; done
Wed Nov  8 14:36:08 UTC 2023
34
Wed Nov  8 14:41:08 UTC 2023
34
Wed Nov  8 14:46:08 UTC 2023
34
Wed Nov  8 14:51:08 UTC 2023
34
Wed Nov  8 14:56:08 UTC 2023
34

jlebon · 2023-11-08T15:30:38Z

I can confirm that coreos/zincati#1118 fixes this issue:

$ cosa run --qemu-image fedora-coreos-38.20231027.3.1-qemu.x86_64.qcow2
[root@cosa-devsh ~]# mkdir -p /etc/zincati/config.d
[root@cosa-devsh ~]# cat > /etc/zincati/config.d/99-agent-speedup.toml
[agent.timing]
steady_interval_secs = 1
[root@cosa-devsh ~]# systemctl restart zincati
[root@cosa-devsh ~]# lsof -aUu zincati | wc -l
15
[root@cosa-devsh ~]# lsof -aUu zincati | wc -l
16
[root@cosa-devsh ~]# lsof -aUu zincati | wc -l
17
[root@cosa-devsh ~]# lsof -aUu zincati | wc -l
18
[root@cosa-devsh ~]# systemctl stop zincati
[root@cosa-devsh ~]# rpm-ostree usroverlay
Development mode enabled.  A writable overlayfs is now mounted on /usr.
All changes there will be discarded on reboot.
[root@cosa-devsh ~]# cp /mnt/workdir-tmp/zincati /usr/libexec/zincati
[root@cosa-devsh ~]# systemctl start zincati
[root@cosa-devsh ~]# lsof -aUu zincati | wc -l
10
[root@cosa-devsh ~]# lsof -aUu zincati | wc -l
10
[root@cosa-devsh ~]# lsof -aUu zincati | wc -l
10
[root@cosa-devsh ~]# lsof -aUu zincati | wc -l
10
[root@cosa-devsh ~]#
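
The check above amounts to sampling the descriptor count a few times and seeing whether it keeps climbing. A small sketch of that comparison, with a hypothetical function name; it only inspects numbers passed as arguments, so the sampling command is left to the caller:

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) if every sample is larger than the one
# before it, which is the pattern a steady descriptor leak produces.
strictly_increasing() {
    [ "$#" -ge 2 ] || return 1   # need at least two samples to compare
    local prev="$1"
    shift
    local cur
    for cur in "$@"; do
        [ "$cur" -gt "$prev" ] || return 1
        prev="$cur"
    done
    return 0
}

strictly_increasing 15 16 17 18 && echo "leaking"   # samples with the old binary
strictly_increasing 10 10 10 10 || echo "stable"    # samples with the patched binary
```

With the sample values from the session above, the first call prints `leaking` and the second prints `stable`.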

jlebon · 2023-11-08T20:04:29Z

I think there have been other cases in the past where zincati has got into a state where a simple restart would have allowed the system to progress updating. I wonder if we shouldn't do something like this to just force restart the service periodically:
diff --git a/dist/systemd/system/zincati.service b/dist/systemd/system/zincati.service
index 1837a09..ae6fdda 100644
--- a/dist/systemd/system/zincati.service
+++ b/dist/systemd/system/zincati.service
@@ -22,6 +22,7 @@ Type=notify
 ExecStart=/usr/libexec/zincati agent ${ZINCATI_VERBOSITY}
 Restart=on-failure
 RestartSec=10s
+RuntimeMaxSec=2w
 
 [Install]
 WantedBy=multi-user.target

Seems reasonable to me, though I would bump it to e.g. 3 weeks so it's more safely larger than our usual release frequency. IOW, we usually shouldn't be running for longer than 3 weeks, and if we are, that might indicate something is wrong and restarting Zincati is worth a try.

We've seen some issues in the past where a simple restart of the zincati daemon would have allowed systems to continue updating. Let's periodically restart the zincati daemon to handle cases like this in the future, which we can't always foresee. The most recent example being: coreos/fedora-coreos-tracker#1608

dustymabe · 2023-11-08T21:16:07Z

Seems reasonable to me, though I would bump it to e.g. 3 weeks so it's more safely larger than our usual release frequency. IOW, we usually shouldn't be running for longer than 3 weeks, and if we are, that might indicate something is wrong and restarting Zincati is worth a try.

coreos/zincati#1121

dustymabe · 2023-11-08T21:18:09Z

The fix for this went into stable stream release 38.20231027.3.2.

This was a fast-track to get the stable update out within a day of the affected release shipping to end users. With the fix out this quickly, the Zincati client on each node should not yet have run out of its open files allotment and should be able to update the system from 38.20231027.3.1 to 38.20231027.3.2 (with downgraded zincati) without issue.

dustymabe · 2023-11-13T14:57:58Z

Started a hackmd for the coreos-status communication: https://hackmd.io/RcLX0wjNTE-BheouqnYt_Q?edit

jlebon · 2023-11-13T21:56:16Z

Message sent to coreos-status: https://lists.fedoraproject.org/archives/list/coreos-status@lists.fedoraproject.org/thread/SLRXXLQOGNYAO56EIZ45VPMUWNEB3ZCM/ and posted in forum: https://discussion.fedoraproject.org/t/fedora-coreos-testing-38-20231027-2-0-and-next-39-20231022-1-0-may-not-receive-updates/95836

dustymabe · 2023-11-14T14:36:11Z

The fix for this went into next stream release 39.20231106.1.1. Please try out the new release and report issues.

dustymabe · 2023-11-14T14:36:23Z

The fix for this went into testing stream release 39.20231101.2.1. Please try out the new release and report issues.

dustymabe · 2023-11-14T14:57:04Z

If you were on stable and have a periodic update strategy that delays finalization/reboot for a few days, you'll end up in a position where the system won't be able to finalize the update.

The logs look like this (this example is from a testing stream node, but you'll see similar logs):

Nov 06 07:03:36 coreos-x86-64-builder systemd[1]: Starting zincati.service - Zincati Update Agent...
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::cli::agent] starting update agent (zincati 0.0.26)
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::cincinnati] Cincinnati service: https://updates.coreos.fedoraproject.org
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::cli::agent] agent running on node 'c6cee14455e8473597134c814284fc69', in update group 'default'
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] registering as the update driver for rpm-ostree
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] found 1 other finalized deployment
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] deployment 38.20231014.2.0 (febe79b06fc4a72804911f68b7e22aca1e7abfd277446aef4ab6d226e5fc1e28) will be excluded from being a future update target
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] initialization complete, auto-updates logic enabled
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::strategy] update strategy: periodic, total schedule length 60 minutes; next window at 7:3 on Mon (UTC), subject to time zone caveats.
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] reached steady state, periodically polling for updates
Nov 06 07:03:36 coreos-x86-64-builder zincati[1297]: [INFO  zincati::cincinnati] current release detected as not a dead-end
Nov 06 07:03:36 coreos-x86-64-builder systemd[1]: Started zincati.service - Zincati Update Agent.
Nov 07 18:28:01 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] target release '39.20231101.2.0' selected, proceeding to stage it
Nov 07 18:28:20 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] update staged: 39.20231101.2.0
Nov 09 22:27:06 coreos-x86-64-builder zincati[1297]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Nov 09 22:32:27 coreos-x86-64-builder zincati[1297]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Nov 09 22:37:39 coreos-x86-64-builder zincati[1297]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
...
...
...
...
Nov 13 06:52:20 coreos-x86-64-builder zincati[1297]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Nov 13 06:57:32 coreos-x86-64-builder zincati[1297]: [ERROR zincati::utils] failed to notify service manager of service status change: libsystemd error: failed to open Unix datagram socket: EMFILE: Too many open files
Nov 13 07:02:41 coreos-x86-64-builder zincati[1297]: [ERROR zincati::update_agent] failed to check for interactive sessions: failed to run `loginctl` binary
Nov 13 07:02:41 coreos-x86-64-builder zincati[1297]: [WARN  zincati::update_agent] assuming no active sessions and proceeding anyway
Nov 13 07:02:41 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] staged deployment '39.20231101.2.0' available, proceeding to finalize it
Nov 13 07:02:41 coreos-x86-64-builder zincati[1297]: [ERROR zincati::update_agent::actor] failed to finalize deployment: failed to run 'rpm-ostree' binary
Nov 13 07:08:05 coreos-x86-64-builder zincati[1297]: [ERROR zincati::update_agent] failed to check for interactive sessions: failed to run `loginctl` binary
Nov 13 07:08:05 coreos-x86-64-builder zincati[1297]: [WARN  zincati::update_agent] assuming no active sessions and proceeding anyway
Nov 13 07:08:05 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] staged deployment '39.20231101.2.0' available, proceeding to finalize it
Nov 13 07:08:05 coreos-x86-64-builder zincati[1297]: [ERROR zincati::update_agent::actor] failed to finalize deployment: failed to run 'rpm-ostree' binary
Nov 13 07:13:17 coreos-x86-64-builder zincati[1297]: [ERROR zincati::update_agent] failed to check for interactive sessions: failed to run `loginctl` binary
Nov 13 07:13:17 coreos-x86-64-builder zincati[1297]: [WARN  zincati::update_agent] assuming no active sessions and proceeding anyway
Nov 13 07:13:17 coreos-x86-64-builder zincati[1297]: [INFO  zincati::update_agent::actor] staged deployment '39.20231101.2.0' available, proceeding to finalize it
Nov 13 07:13:17 coreos-x86-64-builder zincati[1297]: [ERROR zincati::update_agent::actor] failed to finalize deployment: failed to run 'rpm-ostree' binary
Nov 13 07:18:35 coreos-x86-64-builder zincati[1297]: [ERROR zincati::update_agent] failed to check for interactive sessions: failed to run `loginctl` binary
Nov 13 07:18:35 coreos-x86-64-builder zincati[1297]: [WARN  zincati::update_agent] assuming no active sessions and proceeding anyway

So you'll also need to apply the workaround.
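
The workaround referred to here is the one from earlier in the thread: restart Zincati. A sketch of detecting an affected node, assuming the error text matches the log lines above; the function name is hypothetical, and the restart is left as a comment so nothing is restarted by accident:

```shell
#!/bin/bash
# Hypothetical helper: read log text on stdin and succeed (exit 0) if it
# contains the "Too many open files" error shown in the logs above.
has_emfile_error() {
    grep -q 'Too many open files'
}

# Example use on a node (run as root):
#   journalctl -u zincati.service --since "1 hour ago" | has_emfile_error \
#       && systemctl restart zincati.service
```

After the restart, Zincati starts with a fresh set of descriptors and should be able to finalize the staged deployment in its next update window.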

`38.20231027.2.0` was the last 38 release of `testing`. It also happens to be the first release with the zincati problem [1]. To avoid this problem we'll make the 38->39 update barrier (the one that satisfies https://docs.fedoraproject.org/en-US/fedora-coreos/update-barrier-signing-keys/) be `38.20231014.2.0` rather than `38.20231027.2.0`.

[1] coreos/fedora-coreos-tracker#1608

fifofonix added the kind/bug label Nov 6, 2023

jlebon mentioned this issue Nov 7, 2023

updates/stable: pause stable rollout coreos/fedora-coreos-streams#813

Merged

cgwalters mentioned this issue Nov 8, 2023

new release? lucab/libsystemd-rs#147

Closed

jlebon mentioned this issue Nov 8, 2023

[stable] overrides: pin rust-zincati-0.0.25-4.fc38 coreos/fedora-coreos-config#2720

Merged

dustymabe mentioned this issue Nov 8, 2023

stable: new release on 2023-11-08 (38.20231027.3.2) coreos/fedora-coreos-streams#814

Closed

43 tasks

jlebon added a commit to jlebon/zincati that referenced this issue Nov 8, 2023

build(deps): bump libsystemd from 0.6.0 to 0.7.0

5467001

For lucab/libsystemd-rs@ee155056e54e for coreos/fedora-coreos-tracker#1608.

jlebon mentioned this issue Nov 8, 2023

build(deps): bump libsystemd from 0.6.0 to 0.7.0 coreos/zincati#1118

Merged

dustymabe mentioned this issue Nov 8, 2023

zincati.service: periodically restart zincati daemon coreos/zincati#1121

Closed

dustymabe changed the title ~~Nodes Fail To Update (Zincati Reports libsystemd errors regarding EMFILE: Too many open files)~~ Zincati fails to update nodes: Too many open files Nov 8, 2023

dustymabe added status/pending-testing-release Fixed upstream. Waiting on a testing release. status/pending-next-release Fixed upstream. Waiting on a next release. status/pending-stable-release Fixed upstream and in testing. Waiting on stable release. labels Nov 8, 2023

dustymabe removed the status/pending-stable-release Fixed upstream and in testing. Waiting on stable release. label Nov 8, 2023

jlebon mentioned this issue Nov 8, 2023

[testing] overrides: pin rust-zincati-0.0.25-6.fc39 coreos/fedora-coreos-config#2721

Merged

This was referenced Nov 8, 2023

testing: new ad-hoc release on 2023-11-13 (39.20231101.2.1) coreos/fedora-coreos-streams#817

Closed

next: new ad-hoc release on 2023-11-13 (39.20231106.1.1) coreos/fedora-coreos-streams#818

Closed

dustymabe removed status/pending-testing-release Fixed upstream. Waiting on a testing release. status/pending-next-release Fixed upstream. Waiting on a next release. labels Nov 14, 2023

dustymabe closed this as completed Nov 14, 2023

dustymabe mentioned this issue Nov 14, 2023

tracker: Rebase onto Fedora 39 #1490

Closed

47 tasks

dustymabe mentioned this issue Nov 14, 2023

Change barrier for last F38 testing stream release coreos/fedora-coreos-streams#821

Merged
