Detect Ubuntu 24.04+ and disable apt-daily auto-updates at VC startup#698
Open
Detect Ubuntu 24.04+ and disable apt-daily auto-updates at VC startup#698
Conversation
Follow-up to #694. Two issues reported by the CRC QoS team after the initial change merged: 1. Quoting bug: the Command parameter used single quotes to group the bash subcommand: bash -c 'systemctl mask ...' VC's ExecuteCommand splits the first whitespace-delimited token (bash) as the executable and passes the remainder to .NET Process.Start as the Arguments string. .NET's argument tokenizer (on both Linux and Windows) treats double quotes as the grouping character and does NOT strip single quotes. As a result bash received argv[2] = 'systemctl literally, which caused it to fail with a command not found style error. Replaced with escaped double quotes: bash -c "systemctl mask ..." The CRC team validated this form locally and has a Juno experiment running against a profile with this corrected syntax. 2. Scope: the mid-run VC restart on Ubuntu 24.04 is not SPEC-specific. The team explicitly reported reproducing it with FIO in experiment SYSAUTO-CHPI_MWH25PrdApp17_Fio_PV2_all_sizes_ga_version_Standard_M176bs_v3- 20260421135821. Added the same MaskAptDailyTimers dependency step to PERF-IO-FIO.json and PERF-IO-FIO-OLTP.json. Broader coverage across all Linux perf profiles (or a code-level startup hook in ExecuteProfileCommand so no profile opt-in is needed) can follow after further review discussion. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem ------- On Ubuntu 24.04+, the default installation of `needrestart` combined with the `apt-daily-upgrade.timer` (fires daily 06:00-07:00 UTC with RandomizedDelaySec=60min) automatically restarts any service whose shared libraries are updated by unattended upgrades. For long-running Virtual Client workloads this manifests as VC being SIGKILL'd mid-run - observed at ~29% of VMs in CRC SYSAUTO experiments, concentrated at hour 6 UTC (uniform 0-59 minute distribution, matching the apt timer signature). Web / distro research confirmed this behavior is specific to Ubuntu 24.04+: - Ubuntu 24.04+: needrestart installed AND auto-restart-on-unattended-upgrade is the default (this is the regression). - Ubuntu 22.04/22.10/23.04: needrestart installed but list-only in non-interactive mode; not known to cause the issue in the field. - Debian 11/12: needrestart not installed by default. Fix --- Add a best-effort startup hook in ExecuteProfileCommand that: - runs exactly once per VC invocation, immediately after Platform.Initialize - is a no-op on non-Unix platforms - parses the Ubuntu major version out of PRETTY_NAME and only runs on >=24 - masks + stops the four apt-daily units via `bash -c "..."` (double-quoted because .NET Process argument tokenization follows Windows CommandLineToArgvW rules - single quotes do not group) - swallows any exception so VC startup is never blocked by this mitigation - logs telemetry (DisabledLinuxAutoUpdates / DisableLinuxAutoUpdatesFailed) Because the mitigation now runs unconditionally for every profile on the affected OS, the per-profile MaskAptDailyTimers step added to the SPEC CPU and FIO profiles in PR #694 is redundant and has been removed. Bumped VERSION to 3.1.3. Tests ----- Added parameterized coverage for TryGetUbuntuMajorVersion (9 cases, all passing). ExecuteProfileCommandTests: 24/24 pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Defends Virtual Client against mid-run SIGKILL caused by
apt-daily-upgrade.timer+needrestarton Ubuntu 24.04+. The mitigation is applied at VC startup for every profile automatically, superseding the per-profileMaskAptDailyTimersstep added in #694.Background
On Ubuntu 24.04+,
needrestart(installed by default) is configured to auto-restart services whose shared libraries are updated by unattended upgrades. Combined withapt-daily-upgrade.timerfiring between 06:00-07:00 UTC (RandomizedDelaySec=60min), any long-running Virtual Client process gets SIGKILL'd mid-run.CRC telemetry showed this affecting ~29% of Ubuntu 24.04 VMs, with the restart distribution concentrated at hour 6 UTC and uniform across the 0-59 minute window — the exact apt timer signature. The issue is profile-agnostic (observed on SPEC CPU AND FIO workloads) which is why a per-profile mitigation is insufficient.
Distro matrix (from documented defaults):
needrestartdefaultapt-daily-upgrade.timerOnly Ubuntu 24.04+ is strictly affected, so the gate is scoped to that.
What changed
ExecuteProfileCommand.csDisableLinuxAutoUpdatesAsynchook called once per VC invocation, right afterPlatform.Initialize.PRETTY_NAME(via the newTryGetUbuntuMajorVersionhelper) and only runs when>= 24.bash -c "..."one-liner. Uses double-quote grouping —.NET Processtokenization followsCommandLineToArgvWrules where single quotes pass through literally (this is the same bug the CRC team reported against Mask apt-daily timers in SPEC CPU profiles to prevent VC restarts on Ubuntu 24.04 #694 — the quoting is now fixed structurally and covered by a code-level helper).DisabledLinuxAutoUpdates(success) /DisableLinuxAutoUpdatesFailed(warning).Profile cleanup
Removed the now-redundant per-profile
MaskAptDailyTimersstep from:PERF-SPECCPU-FPRATE.jsonPERF-SPECCPU-FPSPEED.jsonPERF-SPECCPU-INTRATE.jsonPERF-SPECCPU-INTSPEED.jsonPERF-IO-FIO.jsonPERF-IO-FIO-OLTP.jsonVersion
Bumped
VERSIONto3.1.3.Testing
TryGetUbuntuMajorVersioncases covering Ubuntu 24.04.1 LTS, 22.04.3 LTS, 25.10, lowercase variants, Debian names, codename-only Ubuntu (development), empty, and null.dotnet test VirtualClient.UnitTests --filter ExecuteProfileCommandTests: 24/24 pass.dotnet build VirtualClient.Main -c Releasesucceeds with 0 warnings / 0 errors.Risk
Low. The mitigation is gated on Ubuntu ≥ 24.04 and wrapped in a single try/catch that swallows everything. Worst case on an unexpected system: a single warning-level telemetry event, then normal startup proceeds.
Follow-ups (not in this PR)
--apt-daily-mask=falseopt-out flag if any user objects; deliberately omitted for now since the default-on is the desired behavior.