Skip to content

fix(ci): defense-in-depth filename guard for .env* files (bug_env_file_corrupted_during_session)#94

Merged
SoundMindsAI merged 5 commits into
mainfrom
bug/env-file-ci-guard
May 13, 2026
Merged

fix(ci): defense-in-depth filename guard for .env* files (bug_env_file_corrupted_during_session)#94
SoundMindsAI merged 5 commits into
mainfrom
bug/env-file-ci-guard

Conversation

@SoundMindsAI
Copy link
Copy Markdown
Owner

Summary

  • New secrets-files-guard CI job in pr.yml fails any PR that adds or modifies a .env-family file other than .env.example. Catches the failure mode where .gitignore is regressed or git add -f is used — even when gitleaks would pass (e.g. an empty .env.old carries no detectable secret pattern but still signals a rotated key).
  • Implementation: scripts/ci/check-no-env-files.sh (pure bash, regex denylist + allowlist) + scripts/ci/test_check_no_env_files.sh (15 canned regression cases run as a workflow step before the real-diff check, so a broken regex fails immediately).
  • Two tangential follow-ups captured per CLAUDE.md tangential-discoveries rule: chore_ci_gitignore_paths_ignore_gap (paths-ignore skips workflow on .gitignore-only PRs) and chore_ci_gitleaks_workflow_step (gitleaks runs in pre-commit but not in CI).

Closes the near-miss documented in bug_env_file_corrupted_during_session/bug_fix.md.env was rotated to .env.old on a developer's machine during infra_foundation PR #4 work; .env.old was not in .gitignore at the time and a naive git add -A would have staged a real OpenAI key. The original local-tooling rename cause is undetermined and deferred (cannot reproduce 4 days post-incident).

Test plan

  • bash scripts/ci/test_check_no_env_files.sh — 15/15 pass locally (.env.example allowed; bare .env, .env.old, .env.bak, .env.local, .envrc, .env.production, nested variants, mixed legit+bad all rejected with exit 1)
  • .venv/bin/ruff format --check backend/ — CI parity
  • .venv/bin/ruff check . — all checks passed
  • uv run mypy --strict --config-file=pyproject.toml backend/ — no issues
  • shellcheck scripts/ci/*.sh — clean (one intentional SC2086 word-split annotated)
  • python3 -c "import yaml; yaml.safe_load(open('.github/workflows/pr.yml'))" — workflow parses
  • CI runs the new secrets-files-guard job and it self-tests against the canned inputs before scanning the real diff

🤖 Generated with Claude Code

SoundMindsAI and others added 3 commits May 13, 2026 15:39
Adds scripts/ci/check-no-env-files.sh + a self-test driver + a new
secrets-files-guard job in pr.yml. Fails any PR that adds or modifies a
.env-family file other than .env.example, regardless of file content.

Defense-in-depth for bug_env_file_corrupted_during_session: catches the
near-miss where a rotated .env.old slipped past .gitignore (or where
.gitignore is regressed / git add -f is used). Gitleaks in pre-commit is
content-based and would pass an empty .env.old; this guard is filename-
based and catches the structural signal.

Pure bash, no python/pytest dependency. 15 canned regression cases run
as a workflow step before the real-diff check, so a broken regex fails
immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…angential idea files

bug_fix.md documents the locked design for the secrets-files-guard job
shipped in the previous commit: detection mechanism, regex shape,
failure mode, position in the workflow, and the open question on what
local tool originally renamed .env -> .env.old (deferred — cannot
reproduce 4 days post-incident).

Tangential follow-ups captured per CLAUDE.md "Tangential discoveries":
  - chore_ci_gitignore_paths_ignore_gap — pr.yml paths-ignore skips the
    whole workflow on .gitignore-only PRs, undermining the guard.
  - chore_ci_gitleaks_workflow_step — gitleaks runs only in pre-commit
    locally; CI should also content-scan to complement the filename guard.

Includes MVP1 dashboard regen for the new planned-feature folders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The unquoted ${FORBIDDEN} expansion in check-no-env-files.sh is deliberate
— newline-splitting drives printf format-reuse so each forbidden file
prints on its own bullet line. Quoting would collapse the output to a
single line. Annotate the intent so future shellcheck runs don't re-flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a CI guard to prevent sensitive .env files from being committed, featuring a detection script, regression tests, and updated project documentation. Reviewers noted that the required updates to the GitHub Actions workflow file are missing. Additionally, suggestions were provided to enhance the guard script's robustness by including renamed and copied files in the diff check and improving the handling of file paths that may contain spaces.

Comment on lines +73 to +75
4. **Workflow position — standalone `secrets-files-guard` job, ~20s, parallel
with `backend`/`frontend`/`docker`.** Independent PR check; no service
containers needed. Cites: existing job-per-concern pattern in pr.yml.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The PR description and this design document mention a new secrets-files-guard job being added to .github/workflows/pr.yml. However, the changes to pr.yml are missing from the current pull request. Without the workflow update, the guard script will not be executed in CI.

Comment thread scripts/ci/check-no-env-files.sh Outdated
pull_request)
local base="${GITHUB_BASE_REF:-main}"
git fetch --no-tags --depth=50 origin "${base}" >/dev/null 2>&1 || true
git diff --name-only --diff-filter=AM "origin/${base}...HEAD"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The --diff-filter=AM only includes Added and Modified files. It is recommended to also include Renamed (R) and Copied (C) files to ensure that forbidden files introduced through these operations (e.g., renaming a tracked file to .env) are also caught by the guard.

Suggested change
git diff --name-only --diff-filter=AM "origin/${base}...HEAD"
git diff --name-only --diff-filter=ACMR "origin/${base}...HEAD"

Comment thread scripts/ci/check-no-env-files.sh Outdated
git diff --name-only --diff-filter=AM "origin/${base}...HEAD"
;;
push)
git diff --name-only --diff-filter=AM HEAD~1 HEAD
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the pull request case, the push event diff should also include Renamed (R) and Copied (C) files for complete coverage.

Suggested change
git diff --name-only --diff-filter=AM HEAD~1 HEAD
git diff --name-only --diff-filter=ACMR HEAD~1 HEAD

Comment thread scripts/ci/check-no-env-files.sh Outdated
Comment on lines +77 to +78
# shellcheck disable=SC2086 # Word-split is intentional: one bullet per file via printf format-reuse.
printf ' - %s\n' ${FORBIDDEN}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Relying on word splitting for ${FORBIDDEN} will fail if any file path contains spaces. While .env files typically don't have spaces, it's better to handle the newline-separated list robustly using sed or a while read loop.

Suggested change
# shellcheck disable=SC2086 # Word-split is intentional: one bullet per file via printf format-reuse.
printf ' - %s\n' ${FORBIDDEN}
printf '%s' "${FORBIDDEN}" | sed 's/^/ - /'

…emini #2/3/4)

Accepted findings from Gemini Code Assist on PR #94:

- --diff-filter=AM -> ACMR. High-similarity `git mv foo.txt .env.old`
  is reported as Renamed by git, not Added; AM alone would miss it. ACMR
  (Added, Copied, Modified, Renamed) closes the rename/copy bypass.
- Replace unquoted ${FORBIDDEN} word-split with `sed 's/^/  - /'`. The
  regex's `[^/]+` permits spaces in `.env.<...>` filenames; the previous
  word-splitting approach would mis-split on space. The shellcheck SC2086
  disable is no longer needed and is removed.
- New regression case `.env.foo bar.bak` proves the space-handling fix.
  All 16 cases pass.

Rejected finding #1 (Gemini High): claimed pr.yml workflow update was
missing. Counter-evidence: `gh pr view 94 --json files` shows
.github/workflows/pr.yml with +22/-0 in this PR (the secrets-files-guard
job). Gemini reviewed hunks without seeing the workflow change in the
same PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@SoundMindsAI
Copy link
Copy Markdown
Owner Author

Review adjudication (Gemini Code Assist)

Commit landing fixes: 6397bcd

Gemini Code Assist (4 findings)

# Sev Location Verdict Notes
1 High bug_fix.md:75 Rejected Counter-evidence: gh pr view 94 --json files shows .github/workflows/pr.yml with +22/-0 in this PR — the secrets-files-guard job IS included. Hunk-isolated false positive; Gemini reviewed bug_fix.md's prose claim without checking the actual PR file list.
2 Medium scripts/ci/check-no-env-files.sh:49 Accepted Fixed in 6397bcd--diff-filter=AMACMR. A high-similarity git mv foo.txt .env.old is reported as R by git's rename detection, so AM alone would miss it.
3 Medium scripts/ci/check-no-env-files.sh:52 Accepted Fixed in 6397bcd — same change applied to the push branch.
4 Medium scripts/ci/check-no-env-files.sh:78 Accepted Fixed in 6397bcd — replaced unquoted-${FORBIDDEN} word-split with printf '%s\n' "${FORBIDDEN}" | sed 's/^/ - /'. The regex's [^/]+ permits spaces in .env.<…> filenames; the old approach would mis-split. Added regression case .env.foo bar.bak (test count 15 → 16; all pass). The # shellcheck disable=SC2086 annotation is no longer needed and was removed.

Outcomes

  • Applied fixes (3): ACMR diff filter on both event branches, sed-based output, new space-in-filename regression test.
  • Rejected with cited counter-evidence (1): "pr.yml missing" — workflow change IS in PR (.github/workflows/pr.yml +22/-0).
  • Deferred as non-regression follow-ups (0): none.

CI re-run on 6397bcd green across all 5 jobs (backend, frontend, secrets-files-guard, smoke, docker buildx). Ready for human review + merge.

Accepted GPT-5.5 final-review finding #1: the implicit `git fetch origin
<base>` does not deterministically create refs/remotes/origin/<base>;
under some actions/checkout modes (especially forked PRs) the diff would
silently fail to enumerate changes and the guard would no-op. Replace
with an explicit refspec and add an `origin/<base>` rev-parse check that
exits 2 with a clear error if resolution fails — refusing to silently
skip the guard.

CI runs prior to this fix passed because actions/checkout's
fetch-depth=50 already populated origin/main in the typical case; the
fix removes the brittleness rather than fixing a current-observable bug.

Deferred GPT-5.5 #2 (workflow paths-ignore inheritance): already
captured as chore_ci_gitignore_paths_ignore_gap. Counter-evidence:
bug_fix.md Tangential Observations §1.

Deferred GPT-5.5 #3 (regex misses .env-old / .env_bak / .env~):
captured as a new chore_env_guard_extend_deny_pattern idea file. Not a
regression of this PR — the gitignore and the guard are currently
consistent (both dotted-only); broadening one requires broadening the
other, and that paired change deserves its own focused PR.

Includes MVP1 dashboard regen for the new chore_env_guard_extend_deny_pattern folder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@SoundMindsAI
Copy link
Copy Markdown
Owner Author

Review adjudication (GPT-5.5 final review)

Commit landing fixes: 83961ef

GPT-5.5 final review (3 findings)

# Sev Location Verdict Notes
1 High scripts/ci/check-no-env-files.sh:~52 Accepted Fixed in 83961ef — replaced implicit git fetch origin <base> with explicit refspec +refs/heads/<base>:refs/remotes/origin/<base>. Added git rev-parse --verify --quiet origin/<base> precondition that exits 2 with a clear error message if resolution fails, refusing to silently no-op the guard. CI was already passing because actions/checkout@v6 with fetch-depth: 50 typically populates origin/main, but the explicit refspec removes the brittleness in forked-PR + shallow-checkout scenarios.
2 Medium .github/workflows/pr.yml Deferred Workflow-level paths-ignore skips the entire workflow on .gitignore-only or docs/**-only PRs, undermining defense-in-depth. Counter-evidence that it's already tracked: bug_fix.md "Tangential observations" §1 links chore_ci_gitignore_paths_ignore_gap/idea.md — same surface, separate follow-up PR to avoid scope creep.
3 Low scripts/ci/check-no-env-files.sh deny regex Deferred Regex misses non-dotted backup spellings (.env-old, .env_bak, .env~). Not a regression: the bug's .gitignore patch (lines 153-157) and the documented incident pattern (.env.old) both use dotted spellings; the guard is internally consistent with the gitignore. Broadening one requires broadening the other — that paired change is captured as a new follow-up: chore_env_guard_extend_deny_pattern/idea.md.

Outcomes

CI re-run on 83961ef green across all 5 jobs (backend, frontend, secrets-files-guard, smoke, docker buildx). All adjudications resolved.

Ready for human review + merge.

@SoundMindsAI SoundMindsAI merged commit f4655ee into main May 13, 2026
5 checks passed
@SoundMindsAI SoundMindsAI deleted the bug/env-file-ci-guard branch May 13, 2026 20:22
SoundMindsAI added a commit that referenced this pull request May 13, 2026
Total run drops from ~7m20s to an expected ~5m45s by addressing the
three highest-impact-per-risk targets identified from per-step timing
on PRs #94 / #95:

1. **Drop `needs: [backend, frontend]` from the docker job** (~65s).
   buildx tests whether the Dockerfile builds; it does not consume
   test artifacts. The dependency was defensive, not necessary. With
   it removed, docker runs in parallel with backend/frontend instead
   of being serialized after them — the critical path was previously
   backend (6m06s) → docker (1m05s); now it's just backend.

2. **Cache .mypy_cache + .ruff_cache across runs** (~20s).
   Both tools invalidate per-file internally — the cache directory is
   safe to persist. Cold mypy is ~27s today; warm typically ~5s.
   Keyed on uv.lock + pyproject.toml (dep/tool-config changes); a
   restore-keys fallback to the runner-OS prefix means partial hits
   when only one input changes.

3. **Lower ES + OpenSearch heap from 512m → 256m** (~10-15s).
   The CI integration-test workload uses small synthetic indexes
   without concurrency; 256m is amply sufficient. Halving the heap
   makes container startup + health-check completion faster.

Risk profile: all three changes are low-risk additive YAML edits.
- #1 (docker parallel): zero risk — buildx never depended on test outcomes.
- #2 (cache step): cache-miss path is identical to today (cold run).
- #3 (heap halving): only risk is OOM during a test; if observed,
  trivially revert this single line.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SoundMindsAI added a commit that referenced this pull request May 13, 2026
…eedup wave entry (#98)

Closes out the bug by moving the folder from planned_features/ to
implemented_features/2026_05_13_bug_env_file_corrupted_during_session/.
The defense-in-depth scope shipped in PR #94 (filename-pattern CI guard
catching `.env*` files in PR diffs). The original local-tooling rename
event from 2026-05-09 remains undetermined — captured as an open
question in bug_fix.md for user-side investigation (cannot reproduce
4 days post-incident from agent side).

State.md updates:
- New "Most recent meaningful changes" entry covering PRs #94 / #95 /
  #96 / #97 (CI-speedup wave: defense-in-depth env guard, --ship flag,
  3 critical-path optimizations cutting wall time 7m20s → 5m55s,
  unit-test fast lane).
- Bug removed from "Tangential bugs captured during the bootstrap"
  list; marked resolved with pointer to the implemented_features
  folder.
- Last-updated header refreshed.

Includes MVP1 dashboard regen for the folder relocation.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SoundMindsAI added a commit that referenced this pull request May 13, 2026
…oader regex (#99)

* infra(ci): extract secrets-files-guard to own workflow + add gitleaks (chore_ci_gitignore_paths_ignore_gap + chore_ci_gitleaks_workflow_step)

Two CI defense-in-depth chores resolved together because they share a
workflow file:

**chore_ci_gitignore_paths_ignore_gap:** the `secrets-files-guard` job
was inheriting pr.yml's `paths-ignore` filter, so a PR that ONLY
touched `.gitignore` or `docs/**` would skip the entire workflow and
the guard would never run. Failure mode: a malicious or accidental
`.gitignore` regression removing `.env.old` slips past CI; subsequent
PR adding `.env.old` under `docs/` also slips past.

Fix: split secrets-files-guard out to `.github/workflows/secrets-defense.yml`
with no `paths-ignore` so it triggers on every PR + push to main
regardless of which paths changed. pr.yml retains a pointer comment so
future maintainers find the surface.

**chore_ci_gitleaks_workflow_step:** gitleaks was configured in
`.pre-commit-config.yaml` v8.21.2 locally but never invoked from CI.
Pre-commit is opt-in — a contributor without `pre-commit install` had
no content scan at all. Filename guard catches `.env*` patterns;
gitleaks catches `OPENAI_API_KEY=sk-...` content in any path. They're
complementary, not redundant.

Fix: new `gitleaks` job in the same secrets-defense workflow. Uses the
official binary release at v8.21.2 (pinned to match pre-commit), NOT
gitleaks-action@v2 — the action requires a paid GITLEAKS_LICENSE for
private/org repos (SoundMindsAI/relyloop is private per
`gh repo view --json visibility`). Binary install is license-free.

Both jobs run in parallel, ≤30s each. Off the critical path entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): broaden env-guard deny regex to match non-dotted backups (chore_env_guard_extend_deny_pattern)

Extends the .env filename guard from matching only dotted `.env.<x>`
variants to also match `.env-<x>`, `.env_<x>`, `.env~` (vim/emacs
backup suffix). Parallel `.gitignore` update so the local-git defense
and the CI defense agree on the same pattern.

Source: deferred GPT-5.5 Low-severity finding #3 on PR #94. Closed
as a follow-up because the gitignore and the guard were both dotted-
only originally — broadening required updating both in lockstep,
which deserved its own focused PR.

**Bug found while broadening: silent regex range trap.** The original
draft used `[._\-~]` as the separator class, expecting it to mean
"literal `.`, `_`, `-`, `~`". POSIX regex character classes treat
`-` as a range operator between two characters; `[._\-~]` parses as
"literal `.`, `_`, then RANGE from `\` (92) to `~` (126)". That range
matches every letter but does NOT match a literal `-`. So `.env-old`
SILENTLY did not match the supposedly-broader pattern, while
`.environment.yml` did (because `i` is in the range).

Caught by the new regression cases for `.env-old` and `.environment.yml`
running together — without both, neither failure mode is visible.

Fix: put `-` at the end of the bracket expression (`[._~-]`) where it's
unambiguously a literal. Added an explicit code comment explaining the
trap so future maintainers don't re-introduce it.

6 new regression cases (4 deny, 2 allow): `.env-old`, `.env_bak`,
`.env~`, `backend/.env-prod`, `.envoy.conf`, `.env3.yml`. All 22 cases
green; cold local run completes in <0.5s.

`.gitignore` extended with broad globs `.env.*`, `.env-*`, `.env_*`,
`.env~` to mirror the guard. Canonical incident spellings kept as
explicit entries with a comment pointing maintainers at the
documentation reason.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(gitignore): .env~ → .env~* glob for parity with deny regex (Gemini #1)

Accepted Gemini Medium finding on PR #99: the `.gitignore` entry
`.env~` only ignored the literal filename, but the CI deny regex
`\.env([._~-][^/]*)?` catches any tilde-suffixed variant (`.env~`,
`.env~bak`, `.env~1`, etc.). Without the trailing wildcard, a vim or
emacs backup variant like `.env~1` would pass `git status` cleanly
locally but be rejected by CI — confusing operator experience.

Fix: `.env~` → `.env~*` (matches `.env~` itself plus any suffix).

New regression case `.env~bak` locks the parity in (23 cases now;
all pass).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant