fix: pass each semgrep_rules entry as its own --config flag #98
semgrep ci only accepts a single --config value; passing a space-separated ruleset string (e.g. "auto p/security-audit p/owasp-top-ten p/ci p/nodejs p/typescript") as one --config= argument made the remaining words parse as unknown positional arguments, so semgrep ci exited 2 and never wrote semgrep.sarif. The downstream findings check then silently treated the missing/empty SARIF as "0 findings" and reported success, so every caller of this reusable workflow has been getting a green Semgrep check without an actual scan running since at least 2026-06-19. Fix: split semgrep_rules on whitespace and repeat --config per rule (the flag semgrep ci actually supports for multiple rulesets), and make the findings check fail closed if the scan step itself failed instead of only trusting the SARIF file's presence.
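The argument-splitting problem described above can be reproduced without Semgrep. The sketch below is illustrative only: `show_args` is a hypothetical stand-in for `semgrep ci` that prints the arguments it receives, and the "old" invocation is an assumption about how the string reached the command line, not a copy of the previous workflow.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `semgrep ci`: prints one received argument per line.
show_args() { printf '%s\n' "$@"; }

rules="auto p/security-audit p/ci"

# Assumed old form: the whole string follows a single --config=, so word
# splitting turns everything after the first word into positional arguments.
old_args=$(show_args --config=$rules)

# New form: split on whitespace and repeat --config once per rule.
config_args=()
for rule in $rules; do
  config_args+=(--config "$rule")
done
new_args=$(show_args "${config_args[@]}")

printf 'old:\n%s\n\nnew:\n%s\n' "$old_args" "$new_args"
```

In the old form only `auto` is attached to `--config`; the remaining words arrive as bare arguments, which matches the "unknown positional arguments" failure described in the PR.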
📝 Walkthrough
The Semgrep security scan workflow was updated to build a …
Changes: Semgrep Scan Workflow Update
Estimated code review effort: 1 (Trivial) | ~5 minutes
🚥 Pre-merge checks | ✅ Passed checks (5 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
.github/workflows/security-scan-source.yml (1)
116-116: 🔒 Security & Privacy | 🔵 Trivial | 💤 Low value
Minor: same template-in-shell pattern, lower risk.
`steps.semgrep.outcome` is a fixed GitHub-controlled enum, so this expansion is not attacker-influenced, but moving it to `env:` matches the fix used elsewhere and avoids further zizmor noise.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/security-scan-source.yml at line 116, the semgrep outcome check in the workflow still uses a direct GitHub expression inside the shell condition; move the `steps.semgrep.outcome` value into `env:` for the job or step and reference that environment variable in the `if` test instead. Update the existing shell guard around `steps.semgrep.outcome` so it follows the same pattern used elsewhere in the workflow and removes the template-in-shell warning.
Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/security-scan-source.yml:
- Around line 104-109: The shell loop in the Semgrep step is expanding
inputs.semgrep_rules directly inside the run script, creating a
template-to-shell injection risk. Move semgrep_rules into an env value for the
step and have the script read that variable instead, keeping the configuration
assembly in the same semgrep ci invocation logic but removing direct GitHub
expression interpolation from the shell body.
- Around line 112-120: The new semgrep guard in the semgrep-check step is
treating normal non-zero exits from semgrep ci as a tool failure, which skips
the existing findings summary logic. Update the check around
steps.semgrep.outcome in the security-scan-source workflow so it only fails on
true execution problems such as missing or invalid semgrep.sarif, and let the
total/suppressed/unsuppressed counting path run for real findings. Keep the
semgrep-check step and the semgrep.sarif validation logic aligned so the
detailed breakdown still reports unsuppressed findings correctly.
📒 Files selected for processing (1)
.github/workflows/security-scan-source.yml
    run: |
      config_args=()
      for rule in ${{ inputs.semgrep_rules }}; do
        config_args+=(--config "$rule")
      done
      semgrep ci "${config_args[@]}" --sarif > semgrep.sarif
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Workflow input spliced directly into shell (template injection).
inputs.semgrep_rules is expanded directly inside the run: script body. If this reusable workflow can be triggered with an untrusted/attacker-influenced value for semgrep_rules (e.g., from a fork PR context), this allows arbitrary shell command injection into a step with security-events: write / actions: read permissions. zizmor also flags this at line 106.
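To make the difference concrete, here is a small local simulation, not the workflow itself. It assumes a hypothetical hostile value for `semgrep_rules` and imitates `${{ ... }}` expansion by pasting that value into the script text before bash parses it, then compares that with reading the same value from an environment variable (`RULES` is an illustrative name).

```shell
#!/usr/bin/env bash
# Hypothetical attacker-influenced input value.
payload='auto $(echo INJECTED)'

# Simulated template expansion: the value becomes part of the script text,
# so the inner bash parses and runs the command substitution.
templated_output=$(bash -c "for rule in ${payload}; do echo \"rule=\$rule\"; done")

# Simulated env: approach: the script text is fixed and the value is only
# data, so it is word-split but never parsed as shell syntax.
env_output=$(RULES="$payload" bash -c 'for rule in $RULES; do echo "rule=$rule"; done')

printf 'templated:\n%s\n\nenv:\n%s\n' "$templated_output" "$env_output"
```

In the templated case the substitution executes and `INJECTED` shows up as a rule; in the env case the same characters stay inert words.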
🔒 Proposed fix using env to break the template-to-shell boundary
       - name: semgrep security scan
         id: semgrep
    +    env:
    +      SEMGREP_RULES_INPUT: ${{ inputs.semgrep_rules }}
         run: |
           config_args=()
    -      for rule in ${{ inputs.semgrep_rules }}; do
    +      for rule in $SEMGREP_RULES_INPUT; do
             config_args+=(--config "$rule")
           done
           semgrep ci "${config_args[@]}" --sarif > semgrep.sarif
         continue-on-error: true
🧰 Tools
🪛 zizmor (1.26.1)
[error] 106-106: code injection via template expansion (template-injection): may expand into attacker-controllable code
Source: Linters/SAST tools
    - name: check semgrep results for unsuppressed findings
      id: semgrep-check
      if: always()
      run: |
        if [ "${{ steps.semgrep.outcome }}" != "success" ]; then
          echo "❌ Semgrep scan step failed, treating as a failed check"
          exit 1
        fi
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Early-exit guard suppresses the detailed findings breakdown for the common "real findings" path.
semgrep ci exits with code 1 whenever it detects blocking findings — this is the normal/expected outcome whenever the scan actually finds something, not just when the tool crashes (exit code 2). Since steps.semgrep.outcome reflects the raw exit status even with continue-on-error: true, this new guard will short-circuit with a generic "scan step failed" message for real unsuppressed findings too, before the script ever reaches the total/suppressed/unsuppressed counting and messaging at lines 128-145. That logic was specifically designed to distinguish "all findings suppressed, pass" from "unsuppressed findings, fail" — a distinction this guard now bypasses whenever semgrep ci exits non-zero for legitimate findings.
Consider gating the early-exit on an actual tool-failure signal (e.g., missing/empty/invalid SARIF file) rather than the raw step outcome, since the existing [ ! -f semgrep.sarif ] check at Line 122 already covers the "scan never produced output" failure mode described in the PR objective.
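The distinction being drawn can be sketched as a small classifier. This is illustrative only: it assumes the exit-code convention described in this comment (1 for blocking findings, 2 for a tool error), and treating every other non-zero code as a tool failure is an additional assumption. `classify_semgrep_exit` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps an exit status to the three cases discussed above.
classify_semgrep_exit() {
  case "$1" in
    0) echo "clean" ;;          # scan ran, nothing blocking
    1) echo "findings" ;;       # scan ran and reported blocking findings
    *) echo "tool-failure" ;;   # assumed: anything else means the scan did not complete
  esac
}

for code in 0 1 2; do
  printf 'exit %s -> %s\n' "$code" "$(classify_semgrep_exit "$code")"
done
```

A guard on `steps.semgrep.outcome` alone cannot tell the second case from the third, which is why checking the SARIF file is the safer signal.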
🐛 Suggested approach: validate SARIF content instead of raw outcome
       - name: check semgrep results for unsuppressed findings
         id: semgrep-check
         if: always()
         run: |
    -      if [ "${{ steps.semgrep.outcome }}" != "success" ]; then
    -        echo "❌ Semgrep scan step failed, treating as a failed check"
    -        exit 1
    -      fi
    -
           # Check if semgrep.sarif exists and has findings
    -      if [ ! -f semgrep.sarif ]; then
    -        echo "No SARIF file generated, failing..."
    +      if [ ! -s semgrep.sarif ] || ! jq -e '.runs' semgrep.sarif >/dev/null 2>&1; then
    +        echo "❌ Semgrep scan step outcome: ${{ steps.semgrep.outcome }}; no valid SARIF produced, failing..."
             exit 1
           fi
🧰 Tools
🪛 zizmor (1.26.1)
[info] 116-116: code injection via template expansion (template-injection): may expand into attacker-controllable code
…flate findings with tool failure
- Move inputs.semgrep_rules into env (SEMGREP_RULES) instead of interpolating the GitHub expression directly into the shell script, avoiding the template-injection pattern flagged by zizmor/CodeRabbit.
- Replace the semgrep step-outcome guard with a direct SARIF validity check (file non-empty and parses as JSON via `jq empty`). semgrep ci exits non-zero both on tool/config errors and whenever it has findings, so checking `steps.semgrep.outcome != success` would have also treated a successful scan with real findings as a failed scan and skipped the findings breakdown entirely.
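A minimal sketch of the validity check described in the second bullet, assuming `jq` is on the PATH. `sarif_is_valid` is a hypothetical helper name and this is not the exact code from the commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only when the file is non-empty and parses as JSON.
sarif_is_valid() {
  local file="$1"
  [ -s "$file" ] && jq empty "$file" >/dev/null 2>&1
}

# Usage: fail closed when the scan produced no usable SARIF.
if ! sarif_is_valid semgrep.sarif; then
  echo "No valid SARIF produced, treating as a failed check"
fi
```

The `-s` test runs first so that a missing or zero-byte file is rejected before `jq` is consulted at all.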
Summary
- `semgrep ci` only accepts one `--config` value per flag; `security-scan-source.yml` was passing the whole `semgrep_rules` string (e.g. `auto p/security-audit p/owasp-top-ten p/ci p/nodejs p/typescript`) as a single `--config=` argument, so everything after the first word was parsed as unknown positional arguments and `semgrep ci` exited 2 without writing `semgrep.sarif`.
- … `semgrep_rules` value, since at least 2026-06-19 (confirmed in `tehw0lf/tehwol.fi`'s daily cron runs going back to that date).
- Fix: split `semgrep_rules` on whitespace and pass `--config <rule>` once per entry (the form `semgrep ci` actually supports for multiple rulesets), and make the findings-check step fail closed if the scan step itself failed, instead of trusting SARIF presence/emptiness alone.

Test plan
- … `python3 -c "import yaml; yaml.safe_load(...)"`)
- … `--config auto --config p/security-audit --config p/owasp-top-ten --config p/ci --config p/nodejs --config p/typescript` from the existing `semgrep_rules` default used by callers
- … `tehw0lf/tehwol.fi`'s `Security Monitoring` workflow runs (both cron and a triggering PR run): identical `semgrep ci: too many arguments` error and `Invalid SARIF` follow-on error in every run since 2026-06-19
- … `WoWQuote2-Manager`'s PR-triggered security scan, which currently exhibits this exact bug)