
fix(alpine): fix Alpine 3.23 compatibility issues causing post-reboot crashes#132

Merged
stevensbkang merged 10 commits into portainer:develop from samdulam:fix/alpine-3.23-compatibility
Mar 31, 2026

Conversation

@samdulam
Contributor

Summary

  • cgroupDriver mismatch: kubelet had cgroupDriver: systemd hardcoded, and containerd set SystemdCgroup=true whenever cgroupv2 was detected. Alpine 3.23 uses cgroupv2 but runs OpenRC, not systemd, so both kubelet and containerd now detect systemd presence via /run/systemd/private before selecting the systemd cgroup driver. Without this fix, kubesolo works interactively but crashes on every reboot, because at boot OpenRC starts the service with the systemd cgroup driver selected even though systemd does not exist.
  • CoreDNS OOM killed: Alpine 3.23's cgroupv2 kernel accounts for more memory types (socket buffers, slab objects, page tables), causing CoreDNS to exceed its 20Mi limit and be OOM-killed silently (exit 255 / "Unknown" rather than 137 / "OOMKilled"). Increased limit to 64Mi and request to 32Mi.
  • kube-proxy nftables binary missing: /proc/net/ip_tables_names is absent on Alpine 3.23, so kube-proxy selects nftables mode and requires the nft binary which isn't installed by default. Added Alpine-specific nftables prerequisite check to install.sh with a --install-prereqs flag for automatic installation.
  • install.sh self-kill: stop_running_processes used pgrep -f "kubesolo" which matched the install script's own process when invoked with --offline-install=/tmp/kubesolo. Replaced with a /proc/$pid/exe-based lookup that only matches processes whose actual executable is the kubesolo binary.
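The /proc/$pid/exe-based lookup can be sketched as follows. This is a minimal illustration, not the actual install.sh code: the function name and the KUBESOLO_BIN path are assumptions.

```shell
#!/bin/sh
# Hypothetical install path; install.sh's real variable name may differ.
KUBESOLO_BIN="/usr/local/bin/kubesolo"

stop_kubesolo() {
  for proc in /proc/[0-9]*; do
    pid="${proc#/proc/}"
    # Guard: never touch init or this script itself.
    [ "$pid" -le 1 ] && continue
    [ "$pid" -eq "$$" ] && continue
    # /proc/$pid/exe resolves to the process's real executable, so a shell
    # running "install.sh --offline-install=/tmp/kubesolo" never matches,
    # unlike pgrep -f "kubesolo", which greps the full command line.
    exe="$(readlink -f "$proc/exe" 2>/dev/null)" || continue
    [ "$exe" = "$KUBESOLO_BIN" ] && kill "$pid"
  done
  return 0
}
```

Only processes whose resolved executable equals $KUBESOLO_BIN are signalled; when the binary is not installed, the sweep is a no-op.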

Test plan

  • Install kubesolo on Alpine 3.23 with --install-prereqs flag and verify nftables is installed automatically
  • Verify kubesolo starts successfully after a clean reboot on Alpine 3.23
  • Verify CoreDNS pod stays running (no OOM kills in dmesg)
  • Verify kube-proxy starts without "nft not found" error
  • Verify kubesolo still works correctly on systemd-based distros (cgroupDriver should still be "systemd")
  • Run install.sh --offline-install=/tmp/kubesolo-binary and confirm the script does not kill itself

… crashes

Three root causes identified on Alpine 3.23 that work fine on 3.22:

1. cgroupDriver mismatch: kubelet hardcoded "systemd" as cgroup driver but
   Alpine uses OpenRC (no systemd), causing kubelet to fail immediately on
   boot. Similarly, containerd set SystemdCgroup=true whenever cgroupv2 was
   detected, regardless of whether systemd was running. Both now detect
   systemd presence via /run/systemd/private before selecting the driver.

2. CoreDNS OOM killed: Alpine 3.23's cgroupv2 kernel accounts for more memory
   types (socket buffers, slab objects, page tables), causing CoreDNS to
   exceed its 20Mi limit. Increased limit to 64Mi and request to 32Mi.

3. kube-proxy nftables mode: /proc/net/ip_tables_names is absent on Alpine
   3.23, causing kube-proxy to select nftables mode which requires the nft
   binary. Added nftables as an Alpine-specific prerequisite check in
   install.sh, with --install-prereqs flag for automatic installation.
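The detection described in item 1 can be sketched like this. The real check lives in kubesolo's own code, so the function name here is illustrative only.

```shell
#!/bin/sh
# /run/systemd/private is the control socket of a running systemd manager.
# It exists only when systemd is actually the service manager, which makes
# it a better signal than "is cgroupv2 mounted" on OpenRC systems such as
# Alpine 3.23.
detect_cgroup_driver() {
  if [ -S /run/systemd/private ]; then
    echo "systemd"
  else
    echo "cgroupfs"
  fi
}
```

On a systemd-based distro this yields "systemd"; on Alpine/OpenRC it falls back to "cgroupfs" even though cgroupv2 is in use.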

Also fixed install.sh stop_running_processes self-kill: replaced pgrep -f
"kubesolo" (which matches the install script's own cmdline when
--offline-install path contains "kubesolo") with a /proc/$pid/exe-based
check that only matches processes whose actual executable is the kubesolo
binary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@samdulam samdulam requested a review from stevensbkang as a code owner March 30, 2026 11:33
Sam and others added 9 commits March 30, 2026 12:02
…elf-kill

stop_port_processes was detecting kubesolo processes by grepping the cmdline
for "kubesolo", which could match the install script itself or its path
argument. Switch to checking /proc/$pid/exe against the known binary path,
consistent with the fix already applied to stop_running_processes.
Mirrors the matrix from release.yaml so musl artifacts (required for
Alpine and other musl-based distros) are available from CI runs, not
only from manual release triggers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lsof -t on Alpine returns PID 0/1 for files that were previously mapped
by the init system. The previous check used a cmdline grep, which failed to
identify these as non-kubesolo processes, and a logic bug caused unrelated
processes to be killed when kubesolo was not running. Fix:

- Add explicit guard: never kill PID <= 1
- Replace cmdline grep with /proc/$pid/exe check (consistent with the
  other stop_* functions) so only the actual kubesolo binary is targeted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…es on reboot

containerd's state directory was under /var/lib/kubesolo (persistent),
so after a reboot containerd recovered existing pod sandboxes from its
database without re-running CNI ADD. This left the nftables masquerade
rules (in-memory, wiped on reboot) empty, breaking pod-to-external
routing until pods were manually deleted and rescheduled.

Moving state to /run/kubesolo/containerd/state (tmpfs) forces containerd
to treat all pods as new on each boot, re-running CNI ADD for every pod
and re-establishing the masquerade rules. Persistent image/snapshot data
stays under basePath/containerd/root as before.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
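The resulting directory split can be sketched as a containerd config fragment. `root` and `state` are containerd's standard top-level config keys; the basePath is assumed to be /var/lib/kubesolo, as in the description above.

```toml
# Persistent: image and snapshot data survives reboots.
root = "/var/lib/kubesolo/containerd/root"

# Ephemeral: on tmpfs, wiped at boot, so every pod sandbox is treated as
# new and CNI ADD re-runs, restoring the masquerade rules.
state = "/run/kubesolo/containerd/state"
```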
On a default Alpine install the OpenRC cgroups service is not enabled,
leaving /sys/fs/cgroup/cgroup.controllers empty. Without it kubesolo
fails the cgroups pre-flight check. Added ensure_alpine_cgroups_service()
which detects this condition on Alpine/OpenRC and either enables+starts
the service automatically (--install-prereqs) or exits with clear
instructions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
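A sketch of what ensure_alpine_cgroups_service() might look like. The controllers-file parameter is added here purely for illustration; the real function presumably hardcodes /sys/fs/cgroup/cgroup.controllers, and the messages are paraphrased.

```shell
#!/bin/sh
# INSTALL_PREREQS would be set by the --install-prereqs flag.
ensure_alpine_cgroups_service() {
  controllers="${1:-/sys/fs/cgroup/cgroup.controllers}"
  # A non-empty controllers file means the unified hierarchy is usable.
  if [ -s "$controllers" ]; then
    return 0
  fi
  if [ "$INSTALL_PREREQS" = "true" ]; then
    # Enable and start OpenRC's cgroups service automatically.
    rc-update add cgroups boot && rc-service cgroups start
  else
    echo "cgroups not mounted; enable with:" >&2
    echo "  rc-update add cgroups boot && rc-service cgroups start" >&2
    return 1
  fi
}
```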
flushNftablesNat() was unconditionally flushing table ip nat before
kube-proxy started. In nftables mode kube-proxy uses its own
table ip kube-proxy and never writes to table ip nat, so the flush
was wiping CNI masquerade rules set up by the bridge plugin during
pod scheduling. This caused pod-to-external traffic to break after
every reboot (rules were added during kubelet startup, then cleared
when kube-proxy started moments later).

The flush was originally added to avoid conflicts with Podman/netavark's
native nftables entries, a concern that only applies in iptables proxy
mode.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This change is being handled in a separate PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Only the limit needs to increase to prevent OOM kills on Alpine 3.23
cgroupv2. The request can stay at the original 20Mi.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
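The CoreDNS resource stanza implied by this commit would look roughly like the fragment below (standard Kubernetes resources fields; only the memory limit changes relative to the original manifest):

```yaml
resources:
  requests:
    memory: 20Mi   # kept at the original value
  limits:
    memory: 64Mi   # raised from 20Mi for cgroupv2's broader memory accounting
```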
arm32 and riscv64 builds are not needed for CI validation.
Full arch matrix is still built in the release pipeline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@stevensbkang stevensbkang merged commit a5cf461 into portainer:develop Mar 31, 2026
4 checks passed