Document sandbox-blocked shell expansion patterns in prompt shell-safety guide#1933
Document sandbox-blocked shell expansion patterns in prompt shell-safety guide#1933
Conversation
…ported prompt Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/b1abb222-646f-4408-b108-80d663f077d5 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…phase Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/9d9e631e-f2c5-4eb0-b92a-347d6ffcc6ee Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
… memory after every phase Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/9d9e631e-f2c5-4eb0-b92a-347d6ffcc6ee Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
…ty guide Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/1d01c554-a9df-4a64-8b4c-bdedd07cb917 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🏷️ Automatic Labeling SummaryThis PR has been automatically labeled based on the files changed and PR metadata. Applied Labels: documentation,size-m Label Categories
For more information, see |
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
There was a problem hiding this comment.
Pull request overview
This PR updates the gh-aw prompt modules to (a) document shell-expansion patterns that are rejected by the execution sandbox, and (b) add “phase checkpoint” instructions intended to persist intermediate outputs to repo-memory across pipeline phases.
Changes:
- Add a sandbox blocklist table for banned Bash expansion patterns (with rationale + safe equivalents).
- Introduce a reusable “phase checkpoint” snippet in
00-base-contract.mdand require checkpoints at key workflow phases across the analysis/article/PR modules. - Update secret-handling guidance in the shell-safety prompt to avoid embedding GitHub Actions expressions inside prompt modules.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| .github/prompts/00-base-contract.md | Adds the phase-checkpoint contract + reusable snippet for repo-memory persistence. |
| .github/prompts/01-bash-and-shell-safety.md | Documents sandbox-blocked shell-expansion patterns and updates secret-safety guidance. |
| .github/prompts/03-data-download.md | Adds an instruction to checkpoint after writing the download manifest. |
| .github/prompts/04-analysis-pipeline.md | Adds checkpoint instructions between Pass 1/Pass 2 and before the gate. |
| .github/prompts/05-analysis-gate.md | Adds a checkpoint requirement after a gate pass. |
| .github/prompts/06-article-generation.md | Adds a checkpoint requirement after article generation. |
| .github/prompts/07-commit-and-pr.md | Adds a final checkpoint requirement immediately before PR creation. |
| |----------------|------------------|-----------------| | ||
| | `${var@P}` / `${var@Q}` / `${var@E}` / `${var@A}` / `${var@a}` | Parameter transformation re-evaluates the variable's contents as shell syntax — a known prompt-injection vector. | Expand the target explicitly: `printf '%s' "$var"`, or a plain `"$var"` substitution. | | ||
| | `${!var}` | Indirect expansion uses `$var`'s *value* as another variable name — lets an attacker-controlled string pick which variable is read. | Use an associative array: `declare -A MAP; MAP[foo]=bar; echo "${MAP[$key]}"`. | | ||
| | Nested `$(…$(…)…)` | Builds a command string dynamically from inner results — the classic staged injection shape. | Split into two lines with a temporary variable: `inner=$(cmd1); outer=$(cmd2 "$inner")`. | |
There was a problem hiding this comment.
The “Nested $(…$(…)…)” row’s safe equivalent is inconsistent with the nested example shape: it assigns inner=$(cmd1); outer=$(cmd2 "$inner"), which reverses the typical outer(cmd(inner)) ordering and conflicts with the example in the PR description. Rewrite the placeholders so the safe equivalent clearly mirrors outer=$(cmd1 "$(cmd2)") split into two steps (e.g., inner=$(cmd2); outer=$(cmd1 "$inner")) to avoid teaching the wrong transformation.
| | Nested `$(…$(…)…)` | Builds a command string dynamically from inner results — the classic staged injection shape. | Split into two lines with a temporary variable: `inner=$(cmd1); outer=$(cmd2 "$inner")`. | | |
| | Nested `$(…$(…)…)` | Builds a command string dynamically from inner results — the classic staged injection shape. | Split into two lines with a temporary variable: `inner=$(cmd2); outer=$(cmd1 "$inner")`. | |
|
|
||
| | Banned pattern | Why it's blocked | Safe equivalent | | ||
| |----------------|------------------|-----------------| | ||
| | `${var@P}` / `${var@Q}` / `${var@E}` / `${var@A}` / `${var@a}` | Parameter transformation re-evaluates the variable's contents as shell syntax — a known prompt-injection vector. | Expand the target explicitly: `printf '%s' "$var"`, or a plain `"$var"` substitution. | |
There was a problem hiding this comment.
The rationale for banning ${var@P}/${var@Q}/… says these transformations “re-evaluate the variable's contents as shell syntax”, but that isn’t accurate for all of the listed operators (e.g. @Q produces a quoted form rather than evaluating the contents). Since this is a security/safety guide, please tighten the wording to describe the real risk (transformations that produce re-parsable shell fragments / can smuggle syntax) without overstating how each operator behaves.
| | `${var@P}` / `${var@Q}` / `${var@E}` / `${var@A}` / `${var@a}` | Parameter transformation re-evaluates the variable's contents as shell syntax — a known prompt-injection vector. | Expand the target explicitly: `printf '%s' "$var"`, or a plain `"$var"` substitution. | | |
| | `${var@P}` / `${var@Q}` / `${var@E}` / `${var@A}` / `${var@a}` | These parameter transformations can produce shell-reparsable fragments or syntax-bearing representations, which can smuggle attacker-controlled content into later parsing steps — a known prompt-injection vector. | Expand the target explicitly: `printf '%s' "$var"`, or a plain `"$var"` substitution. | |
| | 06 Article generated | `phase-06-article` | `$ANALYSIS_DIR` + today's `news/$YYYY/$MM/$DD/*.html` | | ||
| | 07 Immediately before `create_pull_request` | `phase-07-final` | `$ANALYSIS_DIR` + articles | | ||
| | `news-translate` per batch | `phase-translate-<lang>` | Translated `news/$YYYY/$MM/$DD/*.html` | | ||
|
|
||
| Each checkpoint is mandatory. Skipping them forfeits the only cross-run safety net for analysis work. | ||
|
|
||
| ### Reusable snippet | ||
|
|
||
| Run this bash block at the end of every phase (pass the phase label as `$1`): | ||
|
|
||
| ```bash | ||
| set -Eeuo pipefail | ||
| : "${GH_AW_MEMORY_DIR:=/tmp/gh-aw/repo-memory/default}" | ||
| : "${ARTICLE_DATE:?ARTICLE_DATE required for checkpoint}" | ||
| : "${SUBFOLDER:?SUBFOLDER required for checkpoint (use batch/<lang> for news-translate)}" |
There was a problem hiding this comment.
The checkpoint docs/snippet assume articles are stored under news/$YYYY/$MM/$DD/, but this repo currently writes articles directly under the flat news/ directory (e.g. news/2026-04-21-...-en.html). As written, the snippet will usually skip copying any HTML into repo-memory, undermining the purpose of the checkpoint. Update the snippet (and the referenced source paths) to match the real on-disk layout, e.g. copy news/${ARTICLE_DATE}-*.html (or whatever filename pattern the generator uses).
| | 06 Article generated | `phase-06-article` | `$ANALYSIS_DIR` + today's `news/$YYYY/$MM/$DD/*.html` | | |
| | 07 Immediately before `create_pull_request` | `phase-07-final` | `$ANALYSIS_DIR` + articles | | |
| | `news-translate` per batch | `phase-translate-<lang>` | Translated `news/$YYYY/$MM/$DD/*.html` | | |
| Each checkpoint is mandatory. Skipping them forfeits the only cross-run safety net for analysis work. | |
| ### Reusable snippet | |
| Run this bash block at the end of every phase (pass the phase label as `$1`): | |
| ```bash | |
| set -Eeuo pipefail | |
| : "${GH_AW_MEMORY_DIR:=/tmp/gh-aw/repo-memory/default}" | |
| : "${ARTICLE_DATE:?ARTICLE_DATE required for checkpoint}" | |
| : "${SUBFOLDER:?SUBFOLDER required for checkpoint (use batch/<lang> for news-translate)}" | |
| | 06 Article generated | `phase-06-article` | `$ANALYSIS_DIR` + today's `news/${ARTICLE_DATE}-*.html` | | |
| | 07 Immediately before `create_pull_request` | `phase-07-final` | `$ANALYSIS_DIR` + articles from `news/${ARTICLE_DATE}-*.html` | | |
| | `news-translate` per batch | `phase-translate-<lang>` | Translated `news/${ARTICLE_DATE}-*.html` | | |
| Each checkpoint is mandatory. Skipping them forfeits the only cross-run safety net for analysis work. | |
| ### Reusable snippet | |
| Run this bash block at the end of every phase (pass the phase label as `$1`). Article HTML is written directly under the flat `news/` directory, so checkpoint copies must use `news/${ARTICLE_DATE}-*.html` rather than `news/$YYYY/$MM/$DD/*.html`: | |
| ```bash | |
| set -Eeuo pipefail | |
| : "${GH_AW_MEMORY_DIR:=/tmp/gh-aw/repo-memory/default}" | |
| : "${ARTICLE_DATE:?ARTICLE_DATE required for checkpoint}" | |
| : "${SUBFOLDER:?SUBFOLDER required for checkpoint (use batch/<lang> for news-translate)}" | |
| ARTICLE_HTML_GLOB="news/${ARTICLE_DATE}-*.html" |
| # Snapshot today's produced article HTML (if any exists at this phase). | ||
| YYYY="${ARTICLE_DATE:0:4}"; MM="${ARTICLE_DATE:5:2}"; DD="${ARTICLE_DATE:8:2}" | ||
| if [ -d "news/$YYYY/$MM/$DD" ]; then | ||
| find "news/$YYYY/$MM/$DD" -maxdepth 1 -type f -name '*.html' \ |
There was a problem hiding this comment.
The checkpoint snippet only attempts to copy HTML from news/$YYYY/$MM/$DD, but articles in this repo are written into the flat news/ directory (see existing files like news/2026-04-21-...-en.html). This means the checkpoint will copy zero HTML files in normal runs. Adjust the find/path logic to target the real article location/pattern (for example, find news -maxdepth 1 -name "${ARTICLE_DATE}-*.html").
| # Snapshot today's produced article HTML (if any exists at this phase). | |
| YYYY="${ARTICLE_DATE:0:4}"; MM="${ARTICLE_DATE:5:2}"; DD="${ARTICLE_DATE:8:2}" | |
| if [ -d "news/$YYYY/$MM/$DD" ]; then | |
| find "news/$YYYY/$MM/$DD" -maxdepth 1 -type f -name '*.html' \ | |
| # Snapshot today's produced article HTML from the flat news/ directory (if any exists at this phase). | |
| if [ -d "news" ]; then | |
| find "news" -maxdepth 1 -type f -name "${ARTICLE_DATE}-*.html" \ |
| ## Phase checkpoint — persist every phase to repo memory | ||
|
|
||
| Valuable analysis must never be lost. After each pipeline phase completes, snapshot its output to the gh-aw repo-memory mount at `$GH_AW_MEMORY_DIR` (runtime default `/tmp/gh-aw/repo-memory/default`). gh-aw pushes that directory to the `memory/news-generation` branch in a **separate post-job** — so checkpoints survive even if the content PR job fails, crashes, or times out. | ||
|
|
||
| ### Mandatory checkpoint points |
There was a problem hiding this comment.
The PR title/description focuses on documenting sandbox-blocked shell expansion patterns in 01-bash-and-shell-safety.md, but this PR also introduces a new cross-module “phase checkpoint” mechanism and updates multiple prompt modules to require it (00/03/04/05/06/07). Please update the PR description (or split the PR) so reviewers can clearly track the intent and rollout scope.
|
@copilot apply changes based on the comments in this thread |
…ix nested $() example ordering, align checkpoint paths with flat news/ layout Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/b9b77a65-5237-4d61-be20-09c205c5dbf0 Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
Scope
This PR makes two preventative documentation improvements to
.github/prompts/:01-bash-and-shell-safety.md) — new "Banned expansion patterns" subsection documenting the 5 shell patterns the execution sandbox rejects (${var@P/Q/E/A/a},${!var}, nested$(...$(...)...), chained builder assignments,evalon variable contents), each with rationale + safe equivalent.00-base-contract.md+ propagation into03–07) — mandatory per-phase snapshots of$ANALYSIS_DIRand produced article HTML into$GH_AW_MEMORY_DIR, so analysis survives even if the content-PR job crashes or times out.Both changes propagate to all 12 news workflows via the existing
runtime-importof these modules — no workflow YAML/lock-file edits.Review feedback addressed
@P/@Q/@E/@A/@arationale — rewrote to describe the real risk (transformations produce shell-reparsable / syntax-bearing representations that can smuggle content into later parsing) instead of the blanket "re-evaluates as shell syntax" claim that doesn't hold for@Q.$(…$(…)…)safe-equivalent ordering — flipped placeholders so the rewrite mirrors the originalouter=$(cmd1 "$(cmd2)")shape:inner=$(cmd2); outer=$(cmd1 "$inner").00-base-contract.mdto use the repo's actual flatnews/${ARTICLE_DATE}-*.htmllayout (confirmed against existing files likenews/2026-04-21-...-en.html), instead of thenews/$YYYY/$MM/$DD/*.htmlnested layout that would have copied zero files.Verification
01-bash-and-shell-safety.md: 72 lines (cap 300) ✅00-base-contract.md: 115 lines (cap 300) ✅*.lock.ymlfiles touched ✅