Automated nightly codebase scanner by derekmisler · Pull Request #1573 · docker/cagent

derekmisler · 2026-02-03T03:10:00Z

Summary

Adds an automated nightly codebase scanner that uses a multi-agent architecture to detect code quality issues and automatically create GitHub issues for findings.

Key features:

Multi-agent system with 5 specialized agents: orchestrator (Claude Sonnet), security analyzer (OpenAI o3-mini), bug detector (Gemini Flash), documentation checker (Claude Haiku), and issue reporter (Claude Haiku)
Persistent memory via SQLite database cached between runs - learns from previous scans to avoid false positives
Automated issue creation using gh CLI with duplicate detection and a strict 2-issue-per-run limit
Multi-provider support leveraging different models' strengths (reasoning models for security, fast models for bugs)
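
The duplicate detection and the 2-issue-per-run limit listed above can be pictured with a small sketch. This is illustrative only: in the PR the filtering is done by the agents through prompts, not by Python code. The `Finding` type, the `select_findings` helper, and every severity level other than `high` are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical severity scale; the PR text only shows "high" as an example value.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
MAX_ISSUES_PER_RUN = 2  # limit stated in the PR summary


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    severity: str
    title: str


def select_findings(findings, already_reported_files, limit=MAX_ISSUES_PER_RUN):
    """Drop findings whose file already has an open issue, then keep the most severe."""
    fresh = [f for f in findings if f.file not in already_reported_files]
    # list.sort() is stable, so findings of equal severity keep their original order.
    fresh.sort(key=lambda f: SEVERITY_RANK.get(f.severity, len(SEVERITY_RANK)))
    return fresh[:limit]
```

Capping the list after sorting by severity means a run with many findings still surfaces the most serious ones first, and the rest can be picked up by later runs.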

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Root Orchestrator                        │
│                   (claude-sonnet-4-5)                       │
│  • Loads memory from previous scans                         │
│  • Delegates to sub-agents                                  │
│  • Filters/prioritizes findings                             │
│  • Adds CATEGORY field before forwarding to reporter        │
└─────────────────────────────────────────────────────────────┘
          │              │                │              │
          ▼              ▼                ▼              ▼
    ┌──────────┐  ┌──────────┐  ┌───────────────┐  ┌──────────┐
    │ Security │  │   Bugs   │  │ Documentation │  │ Reporter │
    │ (o3-mini)│  │ (gemini) │  │    (haiku)    │  │ (haiku)  │
    └──────────┘  └──────────┘  └───────────────┘  └──────────┘

Files Changed

.github/agents/nightly-scanner.yaml - Multi-agent configuration with instructions, toolsets, and permissions
.github/workflows/nightly-scan.yml - GitHub Actions workflow (runs daily at 6am UTC, supports dry-run mode)

Test plan

Trigger workflow manually with dry-run: true to verify agents run without creating issues
Verify the scanner memory database is created and cached
Trigger without dry-run: true on a branch with a known issue to verify issue creation
Verify duplicate detection prevents recreating the same issue

krissetto · 2026-02-03T13:29:10Z

Might be good to try out some alloy models in this and other actions, could potentially give deeper perspective to the agents and better overall results

derekmisler · 2026-02-03T20:17:24Z

/review

fixed important issues

derekmisler · 2026-02-03T22:17:45Z

> Might be good to try out some alloy models in this and other actions, could potentially give deeper perspective to the agents and better overall results

sure! what did you have in mind? i'm using 4 different models right now (1 root, 3 sub-agents). is that what you mean? or, like, two sub-agents with different models that both do kind of the same thing, and then the root compares their output?

edit: i just saw that you can combine models like model: claude,gpt-4o. how did i miss that?

changes have been made

Adds a read-only agent that scans the codebase daily and creates GitHub issues for security vulnerabilities, bugs, and documentation gaps.

- Runs daily at 6am UTC (or manual trigger)
- Creates max 2 issues per run to avoid flooding
- Deduplicates against existing open issues with 'nightly-scan' label
- Dry-run mode for testing
- Strong anti-hallucination rules (read-only, must verify files exist)

- Root agent (claude-sonnet) orchestrates scan across sub-agents
- Security sub-agent (claude-opus) for vulnerability detection
- Bugs sub-agent (claude-sonnet) for logic errors and resource leaks
- Documentation sub-agent (claude-haiku) for doc gap detection
- Add GitHub Actions cache for persistent scanner memory
- Memory stores skip patterns, context, and feedback across runs
- Each sub-agent has strict grounding rules to prevent hallucinations

Agent improvements:
- Documentation agent reads all markdown files before analysis
- All sub-agents explicitly accept empty results as valid outcome
- Added read_multiple_files tool to documentation agent

Workflow improvements:
- Pin actions/cache to v4.2.0 SHA for security
- Use static cache key (matches cagent-action pattern)

- Replace custom JSON memory with cagent's SQLite memory toolset
- Agent uses get_memories/add_memory tools instead of file I/O
- Memory path resolves to .github/agents/scanner-memory.db
- Removes manual JSON initialization step
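
To make the idea of a SQLite-backed memory concrete, here is a minimal sketch using Python's standard `sqlite3` module. It is not cagent's implementation: the `memories` table layout is an assumption, and the function names merely echo the `add_memory`/`get_memories` tool names from the commit message for readability.

```python
import sqlite3


def open_memory(path=":memory:"):
    """Open (or create) a memory database; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " created_at TEXT DEFAULT CURRENT_TIMESTAMP,"
        " memory TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def add_memory(conn, memory):
    """Persist one free-text memory, e.g. a skip pattern or reviewer feedback."""
    conn.execute("INSERT INTO memories (memory) VALUES (?)", (memory,))
    conn.commit()


def get_memories(conn):
    """Return all stored memories in insertion order."""
    rows = conn.execute("SELECT memory FROM memories ORDER BY id")
    return [row[0] for row in rows]
```

Passing a file path instead of `":memory:"` would give a single database file, which is the kind of artifact a CI cache step can save and restore between runs.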

Sub-agents now return findings in simple text format:

    FILE: path/to/file.go
    LINE: 123
    SEVERITY: high
    ...

Benefits over JSON:
- More natural for LLM output
- Less prone to formatting errors
- Matches cagent-action PR review pattern

Root agent still outputs JSON for workflow parsing.
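
A text format like this is also easy to read back mechanically. The sketch below is a hypothetical parser, not code from the PR. It assumes one `KEY: value` field per line, upper-case keys, and a blank line between findings; the commit message elides the remaining fields, so only `FILE`, `LINE`, and `SEVERITY` are known names.

```python
def parse_findings(text):
    """Parse blank-line-separated blocks of `KEY: value` lines into dicts."""
    findings = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the finding collected so far.
            if current:
                findings.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if not sep or not key.isupper():
            continue  # ignore lines that are not `KEY: value`
        current[key.strip()] = value.strip()
    if current:
        findings.append(current)
    for finding in findings:
        # Convert LINE to an integer when it is purely numeric.
        if finding.get("LINE", "").isdigit():
            finding["LINE"] = int(finding["LINE"])
    return findings
```

Because unknown lines are skipped rather than rejected, stray prose in a model's output does not break parsing, which is one way to read the "less prone to formatting errors" benefit claimed above.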

Move issue creation from workflow to agent:
- New `reporter` sub-agent uses `gh` CLI to create issues
- Checks for duplicates before creating
- Selects appropriate labels based on category
- Workflow reduced from 175 lines to 55 lines

Dry-run mode now passed as prompt to agent.
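
As a rough sketch of what the reporter's `gh` calls could look like, the helpers below only build argument lists and decide whether to create an issue; nothing is executed. The exact commands the agent runs are not shown in this PR page, so the search query, the `--json` fields, and the helper names are assumptions. Searching by file path follows a later commit in this PR, and the `nightly-scan` label comes from the first commit message.

```python
import json


def duplicate_search_cmd(file_path, label="nightly-scan"):
    """argv for listing open, labeled issues whose body mentions the file path."""
    return [
        "gh", "issue", "list",
        "--state", "open",
        "--label", label,
        "--search", f'"{file_path}" in:body',
        "--json", "number,title",
    ]


def create_issue_cmd(title, body, labels):
    """argv for creating an issue; --label is repeated once per label."""
    cmd = ["gh", "issue", "create", "--title", title, "--body", body]
    for label in labels:
        cmd += ["--label", label]
    return cmd


def should_create(search_output, dry_run):
    """Create only when not in dry-run mode and the search returned no matches."""
    matches = json.loads(search_output or "[]")
    return not dry_run and not matches
```

Keeping command construction separate from execution makes dry-run mode straightforward: the same commands can be logged instead of run.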

Model assignments:
- Security: openai/o3-mini (reasoning model for subtle vulnerabilities)
- Bugs: google/gemini-2.5-flash (fast, good at Go code analysis)
- Documentation: anthropic/claude-haiku (sufficient for simpler task)
- Orchestrator: anthropic/claude-sonnet (coordination)
- Reporter: anthropic/claude-haiku (formatting + gh commands)

Workflow now passes all three API keys:
- ANTHROPIC_API_KEY
- OPENAI_API_KEY
- GOOGLE_API_KEY (mapped from GEMINI_API_KEY secret)

Fixes from code review:

1. Restrict shell permissions
   - Changed `gh *` to `gh issue list *` and `gh issue create *`
   - Principle of least privilege
2. CATEGORY field mismatch
   - Root agent now adds CATEGORY when forwarding to reporter
   - Added explicit "Forwarding to reporter" section with example
3. Inconsistent array/NO_ISSUES terminology
   - Changed "Return an empty array `[]`" to "Output `NO_ISSUES`"
   - Consistent across all three analysis agents
4. Documentation trigger clarity
   - Changed to "ONLY run if BOTH security AND bugs returned `NO_ISSUES`"
   - Unambiguous trigger condition
5. Better duplicate detection
   - Changed from downloading 100 issues to `gh issue list --search`
   - Searches by file path in issue body
6. Sub-agent failure handling
   - Added explicit error handling strategy
   - Log errors and continue with other agents
   - Report partial results if some agents fail
   - Added FAILED status to reporter output
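
The control flow these fixes describe (documentation runs only when both other agents return `NO_ISSUES`, failures are logged without stopping the run, and CATEGORY is added before forwarding) can be sketched as follows. This is a simplified model under stated assumptions: in the PR the behavior lives in agent instructions, and here each agent is stood in for by a hypothetical callable that returns either `NO_ISSUES` or a list of finding dicts.

```python
NO_ISSUES = "NO_ISSUES"


def run_agent(name, agent, errors):
    """Run one agent; on failure, record the error and return None."""
    try:
        return agent()
    except Exception as exc:
        errors.append(f"{name}: {exc}")
        return None


def orchestrate(security, bugs, documentation):
    """Return (findings tagged with CATEGORY, list of error messages)."""
    errors = []
    results = {
        "security": run_agent("security", security, errors),
        "bugs": run_agent("bugs", bugs, errors),
    }
    # Documentation runs ONLY if BOTH security AND bugs returned NO_ISSUES.
    # A failed agent yields None, so a failure also keeps documentation from running.
    if results["security"] == NO_ISSUES and results["bugs"] == NO_ISSUES:
        results["documentation"] = run_agent("documentation", documentation, errors)

    forwarded = []
    for category, output in results.items():
        if output is None or output == NO_ISSUES:
            continue
        for finding in output:
            # The orchestrator adds CATEGORY before forwarding to the reporter.
            forwarded.append({**finding, "CATEGORY": category})
    return forwarded, errors
```

Returning the error list alongside the findings is one way to report partial results: the caller can still forward what succeeded while surfacing which agents failed.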

derekmisler · 2026-02-04T21:25:43Z

/review

github-actions

Review Summary

I've analyzed the PR and found 2 issues worth noting:

Logic flaw in documentation agent execution - Documentation issues will only be detected when there are NO security or bug issues, which contradicts the stated goal of comprehensive nightly scanning.
Incomplete error handling for reporter agent failures - The orchestrator has error handling for analysis agents but not for the final reporter agent, which could lead to unreported failures when creating GitHub issues.

Both issues are in the agent configuration and represent design flaws rather than critical bugs. The workflow will function but may not behave as comprehensively as intended.

.github/agents/nightly-scanner.yaml

- Add rationale comment for documentation agent prioritization (only runs when no bugs/security issues to reduce noise)
- Add error handling for reporter agent failures
- Add output format for failed issue creation

derekmisler self-assigned this Feb 3, 2026

derekmisler marked this pull request as ready for review February 3, 2026 03:13

derekmisler requested a review from a team as a code owner February 3, 2026 03:13

derekmisler force-pushed the nightly-issue-scanner branch from 0d1c6d7 to 23e46fb Compare February 3, 2026 16:55

derekmisler marked this pull request as draft February 3, 2026 17:08

derekmisler force-pushed the nightly-issue-scanner branch from 23e46fb to 65081da Compare February 3, 2026 21:11

derekmisler marked this pull request as ready for review February 3, 2026 21:12

derekmisler force-pushed the nightly-issue-scanner branch from 65081da to 5a76f90 Compare February 3, 2026 21:43

derekmisler marked this pull request as draft February 3, 2026 21:55

derekmisler force-pushed the nightly-issue-scanner branch from 5a76f90 to b147fd1 Compare February 3, 2026 22:16

derekmisler added 10 commits February 4, 2026 16:25

refactor: use cagent's built-in memory system

3e6208e

refactor: PR feedback

dc77459

feat: alloy

50a396e

derekmisler force-pushed the nightly-issue-scanner branch from 0620d32 to 50a396e Compare February 4, 2026 21:25

derekmisler added 2 commits February 4, 2026 16:39

fix: gha will not create directories

9490f68

feat: add keys for our gha app

0a5ccaa

derekmisler marked this pull request as ready for review February 4, 2026 22:07

github-actions bot reviewed Feb 4, 2026

Address review feedback for nightly scanner agent

e9ad604

dgageot approved these changes Feb 5, 2026

derekmisler merged commit 9cf6776 into docker:main Feb 5, 2026
5 checks passed

BrewTestBot mentioned this pull request Feb 7, 2026

cagent 1.20.6 Homebrew/homebrew-core#266303

Merged

derekmisler merged 13 commits into docker:main from derekmisler:nightly-issue-scanner

3 participants
