fix: pass each semgrep_rules entry as its own --config flag by tehw0lf · Pull Request #98 · tehw0lf/workflows

tehw0lf · 2026-07-01T20:46:45Z

Summary

semgrep ci only accepts one --config value per flag; security-scan-source.yml was passing the whole semgrep_rules string (e.g. auto p/security-audit p/owasp-top-ten p/ci p/nodejs p/typescript) as a single --config= argument, so everything after the first word was parsed as unknown positional arguments and semgrep ci exited 2 without writing semgrep.sarif.
The downstream "check semgrep results" step then treated the missing/empty SARIF as "0 findings" and exited 0, so the check has been showing green without Semgrep actually scanning anything, in every repo/run using this workflow with a multi-ruleset semgrep_rules value, since at least 2026-06-19 (confirmed in tehw0lf/tehwol.fi's daily cron runs going back to that date).
Fix: split semgrep_rules on whitespace and pass --config <rule> once per entry (the form semgrep ci actually supports for multiple rulesets), and make the findings-check step fail closed if the scan step itself failed, instead of trusting SARIF presence/emptiness alone.

Test plan

YAML parses (python3 -c "import yaml; yaml.safe_load(...)")
Verified the new arg-splitting bash logic locally produces --config auto --config p/security-audit --config p/owasp-top-ten --config p/ci --config p/nodejs --config p/typescript from the existing semgrep_rules default used by callers
Confirmed root cause by pulling the actual failing logs from tehw0lf/tehwol.fi's Security Monitoring workflow runs (both cron and a triggering PR run) — identical semgrep ci: too many arguments error and Invalid SARIF follow-on error in every run since 2026-06-19
Will re-verify against a live caller once merged (e.g. re-run WoWQuote2-Manager's PR-triggered security scan, which currently exhibits this exact bug)

Summary by CodeRabbit

Bug Fixes
- Improved the security scan process to handle multiple rules more reliably.
- Added a clearer failure path when the scan does not complete successfully, helping surface issues sooner.

semgrep ci only accepts a single --config value; passing a space-separated ruleset string (e.g. "auto p/security-audit p/owasp-top-ten p/ci p/nodejs p/typescript") as one --config= argument made the remaining words parse as unknown positional arguments, so semgrep ci exited 2 and never wrote semgrep.sarif. The downstream findings check then silently treated the missing/empty SARIF as "0 findings" and reported success, so every caller of this reusable workflow has been getting a green Semgrep check without an actual scan running since at least 2026-06-19. Fix: split semgrep_rules on whitespace and repeat --config per rule (the flag semgrep ci actually supports for multiple rulesets), and make the findings check fail closed if the scan step itself failed instead of only trusting the SARIF file's presence.

coderabbitai · 2026-07-01T20:46:59Z

📝 Walkthrough

Walkthrough

The Semgrep security scan workflow was updated to build a config_args array from iterating over inputs.semgrep_rules, passing multiple --config arguments to semgrep ci. The results-checking step now fails early with an error if the semgrep step outcome is not success.

Changes

Semgrep Scan Workflow Update

Layer / File(s)	Summary
Multi-rule config and early failure guard `.github/workflows/security-scan-source.yml`	Semgrep scan now loops over `inputs.semgrep_rules` to build multiple `--config` arguments for `semgrep ci`, and the results checker exits with an error before SARIF checks if the semgrep step did not succeed.

Estimated code review effort: 1 (Trivial) | ~5 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly describes the main workflow fix: passing each semgrep_rules entry as its own --config flag.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/semgrep-multi-config-args

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

.github/workflows/security-scan-source.yml (1)
116-116: 🔒 Security & Privacy | 🔵 Trivial | 💤 Low value

Minor: same template-in-shell pattern, lower risk.

steps.semgrep.outcome is a fixed GitHub-controlled enum, so this expansion is not attacker-influenced, but moving it to env: matches the fix used elsewhere and avoids further zizmor noise.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/security-scan-source.yml at line 116, The semgrep outcome
check in the workflow still uses a direct GitHub expression inside the shell
condition; move the `steps.semgrep.outcome` value into `env:` for the job or
step and reference that environment variable in the `if` test instead. Update
the existing shell guard around `steps.semgrep.outcome` so it follows the same
pattern used elsewhere in the workflow and removes the template-in-shell
warning.
Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/security-scan-source.yml:
- Around line 104-109: The shell loop in the Semgrep step is expanding
inputs.semgrep_rules directly inside the run script, creating a
template-to-shell injection risk. Move semgrep_rules into an env value for the
step and have the script read that variable instead, keeping the configuration
assembly in the same semgrep ci invocation logic but removing direct GitHub
expression interpolation from the shell body.
- Around line 112-120: The new semgrep guard in the semgrep-check step is
treating normal non-zero exits from semgrep ci as a tool failure, which skips
the existing findings summary logic. Update the check around
steps.semgrep.outcome in the security-scan-source workflow so it only fails on
true execution problems such as missing or invalid semgrep.sarif, and let the
total/suppressed/unsuppressed counting path run for real findings. Keep the
semgrep-check step and the semgrep.sarif validation logic aligned so the
detailed breakdown still reports unsuppressed findings correctly.

---

Nitpick comments:
In @.github/workflows/security-scan-source.yml:
- Line 116: The semgrep outcome check in the workflow still uses a direct GitHub
expression inside the shell condition; move the `steps.semgrep.outcome` value
into `env:` for the job or step and reference that environment variable in the
`if` test instead. Update the existing shell guard around
`steps.semgrep.outcome` so it follows the same pattern used elsewhere in the
workflow and removes the template-in-shell warning.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 94b658b3-8983-4fe9-af4e-3d069ffb8c69

📥 Commits

Reviewing files that changed from the base of the PR and between 685fea9 and 5df5c29.

📒 Files selected for processing (1)

.github/workflows/security-scan-source.yml

coderabbitai · 2026-07-01T20:52:51Z

+        run: |
+          config_args=()
+          for rule in ${{ inputs.semgrep_rules }}; do
+            config_args+=(--config "$rule")
+          done
+          semgrep ci "${config_args[@]}" --sarif > semgrep.sarif


🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Workflow input spliced directly into shell (template injection).

inputs.semgrep_rules is expanded directly inside the run: script body. If this reusable workflow can be triggered with an untrusted/attacker-influenced value for semgrep_rules (e.g., from a fork PR context), this allows arbitrary shell command injection into a step with security-events: write / actions: read permissions. zizmor also flags this at line 106.

🔒 Proposed fix using env to break the template-to-shell boundary

- name: semgrep security scan id: semgrep + env: + SEMGREP_RULES_INPUT: ${{ inputs.semgrep_rules }} run: | config_args=() - for rule in ${{ inputs.semgrep_rules }}; do + for rule in $SEMGREP_RULES_INPUT; do config_args+=(--config "$rule") done semgrep ci "${config_args[@]}" --sarif > semgrep.sarif continue-on-error: true

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

run: |

config_args=()

for rule in ${{ inputs.semgrep_rules }}; do

config_args+=(--config "$rule")

done

semgrep ci "${config_args[@]}" --sarif > semgrep.sarif

env:

SEMGREP_RULES_INPUT: ${{ inputs.semgrep_rules }}

run: |

config_args=()

for rule in $SEMGREP_RULES_INPUT; do

config_args+=(--config "$rule")

done

semgrep ci "${config_args[@]}" --sarif > semgrep.sarif

🧰 Tools

🪛 zizmor (1.26.1)

[error] 106-106: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/security-scan-source.yml around lines 104 - 109, The shell loop in the Semgrep step is expanding inputs.semgrep_rules directly inside the run script, creating a template-to-shell injection risk. Move semgrep_rules into an env value for the step and have the script read that variable instead, keeping the configuration assembly in the same semgrep ci invocation logic but removing direct GitHub expression interpolation from the shell body.

Source: Linters/SAST tools

coderabbitai · 2026-07-01T20:52:51Z

      - name: check semgrep results for unsuppressed findings
        id: semgrep-check
        if: always()
        run: |
+          if [ "${{ steps.semgrep.outcome }}" != "success" ]; then
+            echo "❌ Semgrep scan step failed, treating as a failed check"
+            exit 1
+          fi
+


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Early-exit guard suppresses the detailed findings breakdown for the common "real findings" path.

semgrep ci exits with code 1 whenever it detects blocking findings — this is the normal/expected outcome whenever the scan actually finds something, not just when the tool crashes (exit code 2). Since steps.semgrep.outcome reflects the raw exit status even with continue-on-error: true, this new guard will short-circuit with a generic "scan step failed" message for real unsuppressed findings too, before the script ever reaches the total/suppressed/unsuppressed counting and messaging at lines 128-145. That logic was specifically designed to distinguish "all findings suppressed, pass" from "unsuppressed findings, fail" — a distinction this guard now bypasses whenever semgrep ci exits non-zero for legitimate findings.

Consider gating the early-exit on an actual tool-failure signal (e.g., missing/empty/invalid SARIF file) rather than the raw step outcome, since the existing [ ! -f semgrep.sarif ] check at Line 122 already covers the "scan never produced output" failure mode described in the PR objective.

🐛 Suggested approach: validate SARIF content instead of raw outcome

- name: check semgrep results for unsuppressed findings id: semgrep-check if: always() run: | - if [ "${{ steps.semgrep.outcome }}" != "success" ]; then - echo "❌ Semgrep scan step failed, treating as a failed check" - exit 1 - fi - # Check if semgrep.sarif exists and has findings - if [ ! -f semgrep.sarif ]; then - echo "No SARIF file generated, failing..." + if [ ! -s semgrep.sarif ] || ! jq -e '.runs' semgrep.sarif >/dev/null 2>&1; then + echo "❌ Semgrep scan step outcome: ${{ steps.semgrep.outcome }}; no valid SARIF produced, failing..." exit 1 fi

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- name: check semgrep results for unsuppressed findings

id: semgrep-check

if: always()

run: |

if [ "${{ steps.semgrep.outcome }}" != "success" ]; then

echo "❌ Semgrep scan step failed, treating as a failed check"

exit 1

fi

- name: check semgrep results for unsuppressed findings

id: semgrep-check

if: always()

run: |

# Check if semgrep.sarif exists and has findings

if [ ! -s semgrep.sarif ] || ! jq -e '.runs' semgrep.sarif >/dev/null 2>&1; then

echo "❌ Semgrep scan step outcome: ${{ steps.semgrep.outcome }}; no valid SARIF produced, failing..."

exit 1

fi

🧰 Tools

🪛 zizmor (1.26.1)

[info] 116-116: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/security-scan-source.yml around lines 112 - 120, The new semgrep guard in the semgrep-check step is treating normal non-zero exits from semgrep ci as a tool failure, which skips the existing findings summary logic. Update the check around steps.semgrep.outcome in the security-scan-source workflow so it only fails on true execution problems such as missing or invalid semgrep.sarif, and let the total/suppressed/unsuppressed counting path run for real findings. Keep the semgrep-check step and the semgrep.sarif validation logic aligned so the detailed breakdown still reports unsuppressed findings correctly.

…flate findings with tool failure - Move inputs.semgrep_rules into env (SEMGREP_RULES) instead of interpolating the GitHub expression directly into the shell script, avoiding the template-injection pattern flagged by zizmor/CodeRabbit. - Replace the semgrep step-outcome guard with a direct SARIF validity check (file non-empty and parses as JSON via `jq empty`). semgrep ci exits non-zero both on tool/config errors and whenever it has findings, so checking `steps.semgrep.outcome != success` would have also treated a successful scan with real findings as a failed scan and skipped the findings breakdown entirely.

coderabbitai Bot reviewed Jul 1, 2026

View reviewed changes

tehw0lf merged commit ddb54a4 into main Jul 1, 2026
2 checks passed

tehw0lf deleted the fix/semgrep-multi-config-args branch July 1, 2026 21:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: pass each semgrep_rules entry as its own --config flag#98

fix: pass each semgrep_rules entry as its own --config flag#98
tehw0lf merged 2 commits into
mainfrom
fix/semgrep-multi-config-args

tehw0lf commented Jul 1, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jul 1, 2026

Uh oh!

coderabbitai Bot Jul 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tehw0lf commented Jul 1, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tehw0lf commented Jul 1, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading