feat: auto-extract shell injection expressions from run: steps into env vars#28998
feat: auto-extract shell injection expressions from run: steps into env vars#28998
Conversation
…nv vars
When the compiler encounters ${{ ... }} expressions directly in the run:
field of a step (outside heredoc blocks), it now automatically:
1. Generates a deterministic GH_AW_<EXPR> environment variable name
2. Moves the expression to the step's env: block
3. Replaces the inline ${{ }} reference in the shell script with $VAR
4. Emits a compiler warning explaining what was changed and why
This prevents shell injection attacks (template injection) where
user-controlled data in expressions like github.event.issue.title or
steps.*.outputs.* can be used to execute arbitrary commands.
The sanitizer is applied:
- In renderStepFromMap() for struct-based step rendering
- In addCustomStepsAsIs() for raw-YAML user-authored steps
- In addCustomStepsWithRuntimeInsertion() for steps with runtime injection
A new run_step_sanitizer.go file provides the core logic, and
run_step_sanitizer_test.go provides comprehensive test coverage.
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/6f0951d1-ca98-48dd-9f00-6596d812879b
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
- Rename renderStepFromMap parameter from 'yamlBuilder' to 'out' to remove variable aliasing (all internal uses updated to 'out') - Clarify comment about heredoc replacement: expressions found exclusively inside heredocs are never added to 'ordered', so they're never replaced - Add debug log via runStepSanitizerLog when sanitizeCustomStepsYAML skips due to YAML parse failure - Rename test case to 'expression in heredoc not extracted' (expectChanged: false) Agent-Logs-Url: https://github.com/github/gh-aw/sessions/6f0951d1-ca98-48dd-9f00-6596d812879b Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
@copilot add more tests, add fuzz tests |
Additional unit tests for sanitizeRunStepExpressions:
- PR/comment body expressions extracted
- Expression with extra whitespace
- Complex expressions → hash-based env var names
- Mixed heredoc + non-heredoc expressions
- Injection attempt pattern extracted safely
- Empty run string and no expression marker
- Three expressions in a single command
- Existing env map immutability check
- Broad property test: non-heredoc run portion is ${{ }}-free after sanitization
New unit tests for sanitizeCustomStepsYAML:
- Single/multiple steps, unchanged vs changed scenarios
- Expression already in env: block not double-extracted
- Malformed YAML returned unchanged without error
- Warning content verified
Fuzz tests (run_step_sanitizer_fuzz_test.go):
- FuzzSanitizeRunStepExpressions: 30+ seed entries covering safe, unsafe,
heredoc, complex, duplicate, unicode, and injection-attempt patterns;
verifies no panic, correct postconditions (non-heredoc portion is clean,
env: populated, warnings reference "shell injection", input not mutated)
- FuzzSanitizeCustomStepsYAML: verifies no panic, correct behaviour on
malformed YAML, warning text
The fuzz test found a real bug in the initial test logic: expressions inside
heredoc blocks are intentionally left in place (correct behavior), so the
postcondition check now uses removeHeredocContent() before checking for ${{.
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3b0b6b26-d5cb-495e-9ada-888b9960a5b9
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done in 6de63ba. More unit tests added (
Fuzz tests added (
The fuzz test found an actual bug in the initial test logic: expressions inside heredoc blocks are intentionally left in place, so the postcondition now correctly uses |
There was a problem hiding this comment.
Pull request overview
Adds an automatic mitigation for GitHub Actions template/script injection by rewriting ${{ ... }} expressions found inline in run: scripts into deterministic GH_AW_... env vars, and emitting compiler warnings describing the rewrite.
Changes:
- Introduces
sanitizeRunStepExpressions/sanitizeCustomStepsYAMLto extract${{ ... }}fromrun:intoenv:and replace inline occurrences with$GH_AW_.... - Hooks the sanitizer into both the struct/map-based step rendering path and the custom-steps YAML insertion path, emitting formatted warnings and incrementing compiler warning counts.
- Adds extensive unit + fuzz coverage for sanitizer behavior (including heredoc handling and immutability), and updates a repo workflow to the new safe pattern.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/run_step_sanitizer.go | Implements run-step and custom-steps YAML sanitization + warning message generation. |
| pkg/workflow/run_step_sanitizer_test.go | Unit tests covering extraction behavior, heredoc cases, env immutability, and warning content. |
| pkg/workflow/run_step_sanitizer_fuzz_test.go | Fuzz tests for sanitizer robustness and key safety properties. |
| pkg/workflow/compiler_yaml_step_conversion.go | Emits warnings and applies sanitization before rendering map-based steps. |
| pkg/workflow/compiler_yaml_main_job.go | Sanitizes custom-steps YAML before writing and emits warnings via a helper. |
| .github/workflows/design-decision-gate.lock.yml | Updates an existing workflow step to use extracted env vars instead of inline expressions in run:. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 6/6 changed files
- Comments generated: 5
| existingEnv, _ := step["env"].(map[string]any) | ||
| newEnv := make(map[string]any, len(existingEnv)+len(ordered)) | ||
| maps.Copy(newEnv, existingEnv) | ||
| for _, s := range ordered { | ||
| newEnv[s.EnvVar] = s.Original |
| // The returned string is a replacement for the input customSteps that is safe to | ||
| // write to the generated workflow YAML. The returned warnings slice contains one | ||
| // entry per unique expression that was extracted across all steps. | ||
| // | ||
| // When no expressions are found the original customSteps string is returned |
| case map[string]any: | ||
| // For complex fields like "with" or "env" | ||
| fmt.Fprintf(yaml, "%s:\n", field) | ||
| fmt.Fprintf(out, "%s:\n", field) | ||
| for key, val := range v { | ||
| fmt.Fprintf(yaml, "%s %s: %v\n", indent, key, val) | ||
| fmt.Fprintf(out, "%s %s: %v\n", indent, key, val) |
| // err is only non-nil for YAML re-serialisation failures; the function | ||
| // always returns a non-empty string (either original or sanitized). |
| // Replace every occurrence of each expression in the full run: string. | ||
| // This includes content inside heredoc blocks: the runner evaluates the | ||
| // env: block assignment before passing heredoc content to the shell, so | ||
| // the variable reference is resolved correctly even inside a heredoc. | ||
| // Note: only expressions detected in the non-heredoc portion of the run: |
…pressions
Documents the architectural decision to automatically extract ${{ ... }}
expressions from run: step scripts into env: variables at compile time
to prevent shell injection attacks.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Commit pushed:
|
🧪 Test Quality Sentinel ReportTest Quality Score: 77/100
Test Classification DetailsView all 9 test classifications
Flagged Items — Minor Suggestions
|
| Component | Weight | Raw | Score |
|---|---|---|---|
| Behavioral Coverage | 40 pts | 9/9 = 100% design tests | 40 |
| Error/Edge Case Coverage | 30 pts | 5/9 = 56% have edge cases | 17 |
| Low Duplication | 20 pts | 0 duplicate clusters | 20 |
| Proportional Growth | 10 pts | run_step_sanitizer_test.go 625 vs 246 lines = 2.54:1 |
0 |
| Total | 100 | 77 |
Note on inflation: The test/production ratio exceeds 2:1, but
run_step_sanitizer.gois a security-critical shell-injection sanitizer with complex heredoc and expression parsing. The comprehensive coverage — 21 table rows plus 6 standalone tests plus 2 fuzz tests — is intentionally thorough and genuinely valuable here. The inflation flag is a mechanical threshold; the tests themselves are high quality.
Language Support
Tests analyzed:
- 🐹 Go (
*_test.go): 9Test*functions, 2Fuzz*functions — unit (//go:build !integration) - 🟨 JavaScript (
*.test.cjs,*.test.js): 0 tests
Build tags: ✅ Both new test files correctly declare //go:build !integration on line 1.
Mock libraries: ✅ No gomock, testify/mock, .EXPECT(), or .On() usage detected.
Verdict
✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 9 analyzed tests verify behavioral contracts of the new
run_step_sanitizer.gowith strong coverage of heredoc edge cases, mutation safety, env-var naming contracts, and shell-injection property invariants. The 4 barerequire.Truecalls without descriptive messages are a minor style nit worth fixing but do not affect test value.
📖 Understanding Test Classifications
Design Tests (High Value) verify what the system does:
- Assert on observable outputs, return values, or state changes
- Cover error paths and boundary conditions
- Would catch a behavioral regression if deleted
- Remain valid even after internal refactoring
Implementation Tests (Low Value) verify how the system does it:
- Assert on internal function calls (mocking internals)
- Only test the happy path with typical inputs
- Break during legitimate refactoring even when behavior is correct
- Give false assurance: they pass even when the system is wrong
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.
References: §25082700826
🧪 Test quality analysis by Test Quality Sentinel · ● 912.8K · ◷
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (1,208 new lines in AI has analyzed the PR diff and generated a draft ADR to help you get started: 📄 Draft ADR: What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
References: §25082700747
|
There was a problem hiding this comment.
✅ Test Quality Sentinel: 77/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). All 9 analyzed tests verify behavioral contracts with strong coverage of the new shell-injection sanitizer. Minor: 4 bare require.True calls lack descriptive messages (lines 381, 396, 402, 429).
|
@copilot review all comments |
1. Collision handling (run_step_sanitizer.go:134-138)
- When the generated GH_AW_ key already exists in env: with the same value,
reuse it silently (no overwrite, no duplicate entry).
- When the key exists with a different value, pick GH_AW_..._2 (then _3, …)
rather than silently overwriting the user's env var.
- Bound the collision search loop to 100 iterations to prevent any
pathological infinite loop; skip extraction and log a debug message if
all suffixes are exhausted.
2. Quoted-heredoc replacement (run_step_sanitizer.go:141-145)
- Replace strings.ReplaceAll with new replaceOutsideQuotedHeredocs() helper
(template_injection_utils.go) that skips quoted heredoc regions.
- Quoted heredocs suppress shell variable expansion, so replacing
${{ expr }} with $GH_AW_VAR inside them would write the literal var name
to the output file instead of the expression value.
- Unquoted heredocs are still replaced (shell expands vars there).
3. Docstring (run_step_sanitizer.go:186-190)
- Corrected: warnings are per-step (one per unique expression within each
step); no global deduplication across steps.
4. Non-deterministic map ordering (compiler_yaml_step_conversion.go:149-153)
- Sorted map keys with slices.Sorted(maps.Keys(v)) before writing with/env
blocks to produce stable YAML output.
5. Inaccurate fuzz test comment (run_step_sanitizer_fuzz_test.go:251-252)
- Corrected: the function may return an empty string when input is empty;
it returns the original string unchanged only on re-serialisation error.
New tests:
- TestSanitizeRunStepExpressions_CollisionReusesSameValue
- TestSanitizeRunStepExpressions_CollisionDifferentValueGetsAlternateName
- TestSanitizeRunStepExpressions_ExpressionInQuotedHeredocNotReplaced
- TestReplaceOutsideQuotedHeredocs_NoHeredoc
- TestReplaceOutsideQuotedHeredocs_QuotedHeredocPreserved
- TestReplaceOutsideQuotedHeredocs_UnquotedHeredocIsReplaced
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/cfcea2c1-c8ba-4931-9e12-00a57eab22db
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
All five reviewer comments addressed in a7db751:
|
Summary
When a workflow step's
run:field contains GitHub Actions expressions like${{ github.event.issue.title }}or${{ steps.foo.outputs.bar }}, an attacker who controls those values can inject arbitrary shell commands — a technique known as template injection or script injection.This PR updates the compiler to automatically fix this pattern instead of failing with a hard error. For every
${{ ... }}expression found directly in arun:script (outside heredoc blocks), the compiler now:GH_AW_<EXPR>environment variable nameenv:block${{ }}occurrence in the shell script with$GH_AW_...Example transformation:
Before (unsafe — shell injection risk):
After (safe — expression passed as env var):
Warning emitted:
Changes
pkg/workflow/run_step_sanitizer.gosanitizeRunStepExpressions()(step map → sanitized map),sanitizeCustomStepsYAML()(raw YAML string → sanitized string)pkg/workflow/template_injection_utils.goreplaceOutsideQuotedHeredocs()helper that restricts expression replacement to non-quoted-heredoc regionspkg/workflow/run_step_sanitizer_test.gosanitizeCustomStepsYAML, and property testspkg/workflow/run_step_sanitizer_fuzz_test.goFuzzSanitizeRunStepExpressionsandFuzzSanitizeCustomStepsYAMLfuzz tests with 30+ seed corpus entriespkg/workflow/compiler_yaml_step_conversion.gorenderStepFromMap()— struct-based step rendering path; map keys sorted for stable outputpkg/workflow/compiler_yaml_main_job.goaddCustomStepsAsIs()andaddCustomStepsWithRuntimeInsertion()via newsanitizeAndWarnCustomSteps()helperSecurity Rationale
The safe pattern requires the GitHub Actions runner to evaluate the expression value and store it as data in the environment before the shell sees it. When
${{ ... }}appears inline in arun:script, the runner substitutes the raw string value directly into the shell command before execution — allowing an attacker who controls that value (e.g., via an issue title or PR description) to break out of quoted strings and execute arbitrary commands.Collision Handling
If a step's
env:block already defines aGH_AW_*key with the same expression value, the sanitizer reuses it without adding a duplicate entry. If the key exists with a different value, the sanitizer picks an alternate name (GH_AW_..._2,_3, …, capped at 100 attempts) rather than silently overwriting the user's variable.Quoted Heredoc Safety
Replacements are performed only outside quoted heredoc regions (
<< 'EOF'…EOF). Quoted heredocs suppress shell variable expansion, so writing$GH_AW_VARinside them would produce the literal string in the output file instead of the expression value. Unquoted heredocs (<< EOF) are still rewritten because the shell expands variables there.Test Coverage
All unit tests pass. 203/203 workflow files recompile cleanly with no new warnings (existing workflows already use the safe
env:pattern).Fuzz Tests
Two fuzz tests are included in
run_step_sanitizer_fuzz_test.go:FuzzSanitizeRunStepExpressions— feeds arbitraryrun:script content and verifies: no panic, non-heredoc portion of the sanitized script contains no${{ }}, env block is populated withGH_AW_keys, warnings mention "shell injection", and the input map is never mutated.FuzzSanitizeCustomStepsYAML— feeds arbitrary YAML strings and verifies: no panic, malformed YAML returns the input unchanged, and warning messages contain "shell injection".