test: add coverage for removeComments CSE stripping logic #8489
Merged
Add unit tests and integration test for the removeComments function that
strips comments from shell scripts during CSE assembly. The existing tests
only covered simple comment patterns and did not exercise realistic CSE
script content.
New coverage includes:
- disableVulnerableKernelModule function body survives stripping
- Function call sites are preserved after stripping
- grep patterns with hash characters are not mistaken for comments
- Hash in variable expansions (${#array[@]}) is preserved
- Documents the known limitation where comment-like lines inside string
literals get stripped (root cause of DirtyFrag DOA bug PR #8475)
- Integration test reads actual cse_main.sh and validates critical
patterns survive removeComments intact
AB#37912827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds additional Go test coverage around removeComments (used during Linux CSE assembly) to prevent regressions where comment-stripping heuristics accidentally remove functionally significant shell content (notably around the DirtyFrag/CVE kernel module blacklist mitigation added recently in cse_main.sh).
Changes:
- Adds table-driven unit tests covering several shell-script patterns that removeComments must handle safely.
- Adds an integration-style test that loads parts/linux/cloud-init/artifacts/cse_main.sh, runs removeComments, and validates that key disableVulnerableKernelModule patterns and call sites survive stripping.
- Introduces a repoRoot() helper to locate cse_main.sh without hardcoded absolute paths.
Replace the removeComments-only integration test with comprehensive full-pipeline round-trip tests that exercise the exact same path used in production: removeComments → gzip → base64 → decode → gunzip.
TestCSEScriptRoundTrip validates all 6 embedded CSE shell scripts:
- Byte-for-byte round-trip integrity (decoded == stripped input)
- bash -n syntax check on decoded output (skipped for cse_cmd.sh, which contains Go template directives pre-execution)
TestCSEScriptRoundTrip_CSEMainShCriticalContent validates that cse_main.sh critical content survives the full pipeline (function declaration, call sites, printf patterns, modprobe commands, grep patterns, and no orphaned install/blacklist lines).
This would have caught the PR #8475 DOA bug, where removeComments broke a printf format string, producing invalid bash that only failed at runtime on the VM.
AB#37912827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Dynamically discover all .sh files in artifacts/ (55+ scripts, including subdirs)
- Full pipeline: removeComments → gzip → base64 → decode → gunzip → bash -n
- Remove DirtyFrag-specific content assertions; the bash -n check is sufficient
- Enable extglob for syntax check (scripts use @() patterns)
- Document known removeComments limitation for aks-localdns-hosts-setup.sh
AB#37912827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Extract cseRoundTrip and cseValidateBashSyntax helpers to reduce cognitive complexity (gocognit: 38 → under 20)
- Fix err shadow (govet) by using writeErr variable
- Fix goimports formatting
- Rename 'grep with hash pattern' test to clarify intent
AB#37912827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The aks-localdns-hosts-setup.sh fix is being addressed in a separate PR. This test should have zero skip exceptions.
AB#37912827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use the authoritative script list from getBase64EncodedGzippedCustomScript() calls in variables.go instead of walking all .sh files. Only scripts that flow through removeComments in production are tested, so no skip lists are needed.
AB#37912827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add test documenting known limitation: '# ' inside string literals gets stripped by removeComments (the PR #8475 DOA root cause)
- Fix grep test to actually include '#' in the pattern
AB#37912827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace hardcoded script list with discoverCSEScripts() that parses variables.go for getBase64EncodedGzippedCustomScript() calls, resolves constant names from const.go, and filters to .sh files. New scripts added to the CSE pipeline are automatically covered.
AB#37912827
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
cameronmeissner
approved these changes
May 11, 2026
Summary
Add Go unit tests that validate all CSE shell scripts survive the removeComments → gzip → base64 → decode → gunzip pipeline without breaking bash syntax.
Background
PR #8475 (DirtyFrag kernel module blacklist) was dead on arrival because removeComments in pkg/agent/utils.go stripped a # description line from inside a printf format string, breaking the script. PR #8486 fixed the immediate issue, but there was no test to prevent this class of regression.
What this PR adds
TestRemoveComments_ShellPatterns: 6 table-driven subtests
- … # in input)
- … ${#array[@]})
- # inside string literals IS stripped (the root cause)
TestCSEScriptRoundTrip: 34 CSE pipeline scripts
- … getBase64EncodedGzippedCustomScript() calls in variables.go
- … removeComments → gzip → base64 → decode → gunzip
- bash -n syntax validation on each decoded script (with extglob enabled)
- … {{}}) skip bash -n (not valid bash pre-execution)
Related
- … # %s from printf)