Summary
The Sysbox systemd unit files (sysbox-fs.service and sysbox-mgr.service) shipped with Sysbox 0.6.5 contain deprecated systemd configuration that causes issues on Amazon Linux 2023 (AL2023), preventing reliable Envbox deployments on EC2 instances.
The same scenario works with Envbox built on the Sysbox 0.6.4 base.
Environment
- OS: Amazon Linux 2023 (EC2 instances)
- Sysbox Version: 0.6.5
- Use Case: Supporting Envbox container runtime on AL2023
Issues Identified
After reproducing and troubleshooting the problem, the log shows two patterns. In failing starts, sysbox-fs starts before sysbox-mgr and no sysbox-fs.sock file is created. In about half of the starts, sysbox-mgr starts first, followed by a successful sysbox-fs startup and a sysbox-fs.sock file.
This points to problems in the systemd unit files for the two services.
Deprecated StartLimitInterval Parameter
AL2023 ships systemd 252, so the old parameter names are a concern, and the Sysbox systemd unit files have not been updated for about two years.
Beyond the deprecation itself, the systemd versions differ widely: AL2 is on 219 and AL2023 is on 252, and crossing version 230 can change behavior. This may be a time bomb for other operating systems if the upstream unit files are not fixed. Ubuntu:latest uses systemd 252 as well.
The issue occurs when Envbox runs directly on VMs, bare-metal hardware, or EC2 instances, that is, outside Kubernetes, which is its intended target.
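To see which side of the 230 threshold a given host is on, the systemd major version can be read from `systemctl --version`. The following is a minimal sketch rather than anything shipped with Sysbox or Envbox: the function names `systemd_major` and `uses_new_start_limit_names` are hypothetical, and the version string in the comment is only an illustrative sample.

```shell
#!/bin/sh
# Extract the major version from the first line of `systemctl --version`.
# Illustrative input: "systemd 252 (252.23-3.amzn2023)" yields 252.
systemd_major() {
    printf '%s\n' "$1" | sed -n '1s/^systemd \([0-9][0-9]*\).*/\1/p'
}

# Succeed when the major version is at or past the 230 threshold
# discussed above (assumption: 230 is the relevant cut-over).
uses_new_start_limit_names() {
    [ -n "$1" ] && [ "$1" -ge 230 ]
}

# Only query the host when systemctl is actually available.
if command -v systemctl >/dev/null 2>&1; then
    major=$(systemd_major "$(systemctl --version 2>/dev/null)")
    if uses_new_start_limit_names "$major"; then
        echo "systemd $major: prefer StartLimitIntervalSec="
    else
        echo "systemd ${major:-unknown}: older than 230 or not detected"
    fi
fi
```

Running this inside the Envbox image and on the EC2 host separately may help confirm which systemd actually parses the unit files.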
Duplicate Type= Declarations
Both unit files contain:
Type=simple
Type=notify
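These declarations are redundant. For a single-valued setting such as Type=, systemd uses the last assignment, so only Type=notify takes effect and the Type=simple line should be removed.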
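The duplicate can be checked mechanically by printing the last value assigned to a key in a unit file. This is a rough sketch that mimics last-assignment-wins for simple `KEY=value` lines only; it ignores sections, drop-ins, and list-valued settings, and the helper name `effective_value` is hypothetical.

```shell
#!/bin/sh
# Print the last value assigned to KEY in FILE (nothing if KEY is absent).
effective_value() {
    key=$1
    file=$2
    awk -F= -v k="$key" '
        $1 == k { v = substr($0, length(k) + 2) }   # keep text after "KEY="
        END     { if (v != "") print v }
    ' "$file"
}

# Sample fragment copied from the [Service] section quoted in this issue.
unit=$(mktemp)
trap 'rm -f "$unit"' EXIT
cat > "$unit" <<'EOF'
[Service]
Type=simple
Type=notify
ExecStart=/usr/bin/sysbox-fs
EOF

effective_value Type "$unit"    # prints: notify
```

On a live system, `systemctl show -p Type sysbox-fs.service` reports the value systemd itself settled on.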
Missing Service Dependencies and Ordering
Based on testing, the current configuration lacks proper service dependencies, leading to race conditions where sysbox-fs may start before sysbox-mgr is ready.
Missing Restart Policies
No restart policies are defined, making the services less resilient to failures.
Current Unit Files (Sysbox 0.6.5)
sysbox-fs.service
[Unit]
Description=sysbox-fs (part of the Sysbox container runtime)
PartOf=sysbox.service
After=sysbox-mgr.service
[Service]
Type=simple
Type=notify
ExecStart=/usr/bin/sysbox-fs
TimeoutStartSec=10
TimeoutStopSec=10
StartLimitInterval=0
NotifyAccess=main
OOMScoreAdjust=-500
LimitNOFILE=infinity
LimitNPROC=infinity
[Install]
WantedBy=sysbox.service
sysbox-mgr.service
[Unit]
Description=sysbox-mgr (part of the Sysbox container runtime)
PartOf=sysbox.service
[Service]
Type=simple
Type=notify
ExecStart=/usr/bin/sysbox-mgr
TimeoutStartSec=45
TimeoutStopSec=90
StartLimitInterval=0
NotifyAccess=main
OOMScoreAdjust=-500
LimitNOFILE=infinity
LimitNPROC=infinity
[Install]
WantedBy=sysbox.service
Proposed Fix
This fix can be applied directly to the Envbox image, unblocking users immediately.
Simplified Patch Command
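# Patch both units to match the expected result below: drop Type=simple,
# rename StartLimitInterval, order sysbox-fs before sysbox-mgr, add restarts.
RUN sed -i \
-e '/^Type=simple$/d' \
-e 's/^StartLimitInterval=0$/StartLimitIntervalSec=0/' \
-e '/^\[Unit\]/a After=sysbox-fs.service\nRequires=sysbox-fs.service' \
-e '/^\[Service\]/a Restart=on-failure\nRestartSec=2s\nStartLimitBurst=5\nStartLimitIntervalSec=30' \
/usr/lib/systemd/system/sysbox-mgr.service && \
sed -i \
-e '/^Type=simple$/d' \
-e '/^After=sysbox-mgr\.service$/d' \
-e 's/^StartLimitInterval=0$/StartLimitIntervalSec=0/' \
-e '/^\[Unit\]/a Before=sysbox-mgr.service' \
-e '/^\[Service\]/a Restart=on-failure\nRestartSec=2s\nStartLimitBurst=5\nStartLimitIntervalSec=30' \
/usr/lib/systemd/system/sysbox-fs.service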
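Before baking the patch into an image, the sed expressions can be previewed without touching the real unit files, since sed without `-i` writes the result to standard output. The sketch below applies only a subset of the expressions above (the Type=simple removal and the StartLimitInterval rename) to a temporary sample; the helper name `preview_patch` and the sample file are hypothetical.

```shell
#!/bin/sh
# Print FILE as it would look after the given sed expressions,
# leaving FILE itself unmodified (no -i flag).
preview_patch() {
    file=$1
    shift
    sed "$@" "$file"
}

# Sample fragment copied from the sysbox-mgr.service quoted in this issue.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
[Service]
Type=simple
Type=notify
ExecStart=/usr/bin/sysbox-mgr
StartLimitInterval=0
EOF

patched=$(preview_patch "$sample" \
    -e '/^Type=simple$/d' \
    -e 's/^StartLimitInterval=0$/StartLimitIntervalSec=0/')
printf '%s\n' "$patched"
```

On a host with systemd installed, running `systemd-analyze verify` against the patched unit files is a further check worth considering, since it reports settings systemd does not accept.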
Expected Result After Patch
sysbox-fs.service
[Unit]
Before=sysbox-mgr.service
Description=sysbox-fs (part of the Sysbox container runtime)
PartOf=sysbox.service
[Service]
Restart=on-failure
RestartSec=2s
StartLimitBurst=5
StartLimitIntervalSec=30
Type=notify
ExecStart=/usr/bin/sysbox-fs
TimeoutStartSec=10
TimeoutStopSec=10
StartLimitIntervalSec=0
NotifyAccess=main
OOMScoreAdjust=-500
LimitNOFILE=infinity
LimitNPROC=infinity
[Install]
WantedBy=sysbox.service
sysbox-mgr.service
[Unit]
After=sysbox-fs.service
Requires=sysbox-fs.service
Description=sysbox-mgr (part of the Sysbox container runtime)
PartOf=sysbox.service
[Service]
Restart=on-failure
RestartSec=2s
StartLimitBurst=5
StartLimitIntervalSec=30
Type=notify
ExecStart=/usr/bin/sysbox-mgr
TimeoutStartSec=45
TimeoutStopSec=90
StartLimitIntervalSec=0
NotifyAccess=main
OOMScoreAdjust=-500
LimitNOFILE=infinity
LimitNPROC=infinity
[Install]
WantedBy=sysbox.service
Alternative Timeout/StartLimit Values for Testing
For environments experiencing timing issues, consider these alternative configurations:
Conservative (Slower but More Reliable)
# sysbox-fs.service [Service] section additions
RestartSec=5s
StartLimitBurst=3
StartLimitIntervalSec=60
TimeoutStartSec=20
# sysbox-mgr.service [Service] section additions
RestartSec=5s
StartLimitBurst=3
StartLimitIntervalSec=60
TimeoutStartSec=60
Aggressive (Faster but Less Tolerant)
# sysbox-fs.service [Service] section additions
RestartSec=1s
StartLimitBurst=10
StartLimitIntervalSec=20
TimeoutStartSec=5
# sysbox-mgr.service [Service] section additions
RestartSec=1s
StartLimitBurst=10
StartLimitIntervalSec=20
TimeoutStartSec=30
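When comparing these profiles, it can help to estimate how quickly a crash-looping service would exhaust its start burst. The arithmetic below is a simplification under stated assumptions: every start fails immediately, the only delay between attempts is RestartSec, and an interval of 0 is treated as rate limiting disabled. The function names are hypothetical.

```shell
#!/bin/sh
# Shortest time in seconds in which BURST starts can happen when each
# attempt fails at once and is retried after RESTART_SEC.
burst_window() {
    echo $(( $1 * $2 ))
}

# Succeed when BURST starts fit inside INTERVAL, i.e. the start limit
# can actually be hit. An interval of 0 means no rate limiting.
start_limit_reachable() {
    burst=$1
    restart_sec=$2
    interval=$3
    [ "$interval" -gt 0 ] &&
        [ "$(burst_window "$burst" "$restart_sec")" -lt "$interval" ]
}

# name, StartLimitBurst, RestartSec, StartLimitIntervalSec from this issue.
for profile in "proposed 5 2 30" "conservative 3 5 60" "aggressive 10 1 20"; do
    set -- $profile
    if start_limit_reachable "$2" "$3" "$4"; then
        verdict="limit reachable"
    else
        verdict="limit not reachable"
    fi
    echo "$1: $(burst_window "$2" "$3")s of instant failures vs ${4}s interval: $verdict"
done
```

Under these assumptions the three profiles need 10, 15, and 10 seconds of back-to-back failures respectively, all shorter than their intervals. Attempts that fail only after TimeoutStartSec expires take longer per cycle, so real behavior may differ.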
Impact
- Without fix: Unreliable Sysbox startup on AL2023, preventing consistent Envbox deployments
- With fix: Proper service ordering, restart policies, and AL2023 compatibility