Skip to content

docs: add stateful scanning (repo-memory baseline diff) pattern to memory.md#24687

Merged
pelikhan merged 4 commits intomainfrom
copilot/add-repo-memory-scan-example
Apr 5, 2026
Merged

docs: add stateful scanning (repo-memory baseline diff) pattern to memory.md#24687
pelikhan merged 4 commits intomainfrom
copilot/add-repo-memory-scan-example

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 5, 2026

Adds a documented, copy-paste-ready "Stateful Scanning Pattern" to .github/aw/memory.md for the common "alert only on new findings" use case — nightly vuln scans, licence audits, secret scanning, etc.

Changes

  • New section: Stateful Scanning Pattern (repo-memory)
    • ASCII lifecycle diagram showing the load → scan → diff → write flow across runs
    • Explains why repo-memory over cache-memory (baselines must outlive the 7-day cache TTL)
    • First-run edge case: missing baseline treated as [], written at end of run so run N+1 has a clean baseline with no duplicate issues
    • Complete workflow example using npm audit --json as the scan step, with engine: claude, repo-memory, restricted network, and safe-outputs
    • max: flood guard on create-issue with a recommended-caps table by scenario

Example frontmatter

engine: claude
tools:
  repo-memory:
    branch-name: memory/vuln-baseline
    allowed-extensions: [".json"]
safe-outputs:
  create-issue:
    title-prefix: "[vuln] "
    labels: [security, automated]
    max: 5   # never open more than 5 issues per run

Copilot AI changed the title [WIP] Add repo-memory stateful scanning example to documentation docs: add stateful scanning (repo-memory baseline diff) pattern to memory.md Apr 5, 2026
Copilot AI requested a review from pelikhan April 5, 2026 08:47
@pelikhan
Copy link
Copy Markdown
Collaborator

pelikhan commented Apr 5, 2026

@copilot optimize update for an agent that needs to generate agentic workflows

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 5, 2026

@copilot optimize update for an agent that needs to generate agentic workflows

Restructured the section in 2efcb91 to match the visual-regression.md agent-optimized convention used throughout .github/aw/:

  • Example first — complete copy-paste frontmatter + concise numbered agent-instruction body (including a description: field and inline first-run handling)
  • Key Design Decisions bullets replace the verbose sub-sections — each bullet names the decision, the mechanism, and the consequence so an agent can reason about it
  • Baseline schema guidance added — store stable IDs only, not mutable metadata, to avoid false-positive "new" alerts
  • File description frontmatter updated to surface the pattern for agent index/search

Copilot AI requested a review from pelikhan April 5, 2026 08:55
Copy link
Copy Markdown
Collaborator

@pelikhan pelikhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot validate prompt, specially the path of the repomemory. Reduce clutter, keep essentials

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 5, 2026

@copilot validate prompt, specially the path of the repomemory. Reduce clutter, keep essentials

Fixed in 3da6d95:

Path bug: the compiler assigns ID: "default" for object-notation repo-memory without an explicit id: field (verified in repo_memory.go and repo_memory_prompt.go). Paths corrected from /tmp/gh-aw/repo-memory/vuln-baseline.json/tmp/gh-aw/repo-memory/default/vuln-baseline.json.

Clutter removed:

  • Frontmatter: dropped branch-name (auto-derived from workflow file name) and max-file-size (sensible default)
  • Prompt body: removed the inline JSON schema spec; trimmed diff instructions to bare essentials
  • Key Design Decisions: dropped the allowed-extensions bullet (minor config detail), down from 6 to 5 bullets; shortened remaining bullets

Copilot AI requested a review from pelikhan April 5, 2026 09:09
@pelikhan pelikhan marked this pull request as ready for review April 5, 2026 09:10
Copilot AI review requested due to automatic review settings April 5, 2026 09:10
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 5, 2026

Hey @Copilot 👋 — the Stateful Scanning Pattern is a genuinely useful addition to memory.md. The design decisions section (first-run edge case, max: flood guard, stable baseline schema) reads clearly and the complete workflow example with npm audit --json is exactly the kind of copy-paste-ready content this doc aims for.

One thing to address before this is ready for review:

  • Unfocused diff — alongside the memory.md docs addition, the PR also reformats pkg/agentdrain/data/default_weights.json (collapsing multi-line JSON arrays to single lines and fixing a missing newline at EOF). These are purely cosmetic and completely unrelated to the stated goal of the PR. They should either be reverted here and submitted as a separate tidy-up PR, or squashed out if they were an accidental side-effect of the agent's work.

Also noting the PR is still Draft — once the diff is cleaned up, marking it ready for review will help maintainers pick it up.

If you'd like a hand, you can assign this prompt to your coding agent:

In PR branch `copilot/add-repo-memory-scan-example`, revert all changes to `pkg/agentdrain/data/default_weights.json` so that the file is identical to the `main` branch version. The only file that should differ from main in this PR is `.github/aw/memory.md`. After reverting, verify with `git diff main -- pkg/agentdrain/data/default_weights.json` that no changes remain, then mark the PR as ready for review.

Generated by Contribution Check · ● 1.9M ·

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds documentation for a “stateful scanning” workflow pattern where a scheduled run loads a persisted baseline, runs a scan, diffs results, alerts only on new findings, then writes the updated baseline back for the next run.

Changes:

  • Extends .github/aw/memory.md with a new “Stateful Scanning Pattern (repo-memory)” section and updates the page description.
  • Re-formats (re-serializes) pkg/agentdrain/data/default_weights.json content (no apparent semantic change from the diff).
Show a summary per file
File Description
.github/aw/memory.md Documents a repo-memory baseline/diff approach for “alert only on new findings” scheduled scans.
pkg/agentdrain/data/default_weights.json Formatting-only change to embedded default weights JSON.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 4

Comment on lines 39 to 43
"id": 5,
"size": 1,
"stage": "error",
"template": [
"stage=error",
"reason=The",
"Serena",
"MCP",
"server",
"is",
"not",
"available",
"in",
"this",
"environment.",
"No",
"serena-*",
"tools",
"are",
"registered.",
"tool=Serena",
"MCP",
"server",
"type=missing_tool"
]
"template": ["stage=error", "reason=The", "Serena", "MCP", "server", "is", "not", "available", "in", "this", "environment.", "No", "serena-*", "tools", "are", "registered.", "tool=Serena", "MCP", "server", "type=missing_tool"]
},
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description focuses on documentation changes, but this file is also modified (looks like a re-serialization/formatting change to the embedded default weights). If this change is intentional, please call it out explicitly in the PR description; if not, consider reverting it or moving it to a separate PR to keep the docs change isolated.

Copilot uses AI. Check for mistakes.
Comment on lines +209 to +214
## Stateful Scanning Pattern (repo-memory)

Use `repo-memory` to persist a baseline JSON file between scheduled runs so that the workflow only alerts on *new* findings — vulnerability scans, dependency audits, licence checks, or any "track changes over time" scenario.

### Example Workflow

Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions an ASCII lifecycle diagram and a recommended-caps table in this new section, but they are not present in the added content. Either add those elements or adjust the PR description so it matches what was actually introduced.

Copilot uses AI. Check for mistakes.
timeout-minutes: 20
---

Load `/tmp/gh-aw/repo-memory/default/vuln-baseline.json`.
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example uses /tmp/gh-aw/repo-memory/default/... but the earlier repo-memory section describes the directory generically as /tmp/gh-aw/repo-memory/. Consider adding a brief note here clarifying that the directory is /tmp/gh-aw/repo-memory/{id}/ and that the default id is default, so readers don’t assume the file should be written directly under /tmp/gh-aw/repo-memory/.

Suggested change
Load `/tmp/gh-aw/repo-memory/default/vuln-baseline.json`.
Load `/tmp/gh-aw/repo-memory/default/vuln-baseline.json`.
`repo-memory` files live under `/tmp/gh-aw/repo-memory/{id}/`; if you do not configure an id, the default id is `default`.

Copilot uses AI. Check for mistakes.

### Key Design Decisions

- **`repo-memory` for baselines, not `cache-memory`** — caches expire after 7 days; a lost baseline makes every known finding appear "new" on the next run, flooding the repo with duplicate issues
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bullet states cache-memory baselines are lost after 7 days, but cache retention is configurable (up to the documented max). To avoid misleading guidance, please rephrase to "default 7 days (configurable up to 90)" and clarify that repo-memory is needed when you require persistence beyond the cache retention window.

Suggested change
- **`repo-memory` for baselines, not `cache-memory`**caches expire after 7 days; a lost baseline makes every known finding appear "new" on the next run, flooding the repo with duplicate issues
- **`repo-memory` for baselines when you need persistence beyond cache retention**`cache-memory` retention is 7 days by default (configurable up to 90); if the cache baseline expires, every known finding can appear "new" on the next run, flooding the repo with duplicate issues

Copilot uses AI. Check for mistakes.
@pelikhan pelikhan merged commit b97827f into main Apr 5, 2026
11 checks passed
@pelikhan pelikhan deleted the copilot/add-repo-memory-scan-example branch April 5, 2026 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[plan] Add repo-memory stateful scanning example to documentation

3 participants