
Automated nightly codebase scanner #1573

Merged
derekmisler merged 13 commits into docker:main from derekmisler:nightly-issue-scanner on Feb 5, 2026

Conversation

@derekmisler
Contributor

@derekmisler derekmisler commented Feb 3, 2026

Summary

Adds an automated nightly codebase scanner that uses a multi-agent architecture to detect code quality issues and automatically create GitHub issues for findings.

Key features:

  • Multi-agent system with 5 specialized agents: orchestrator (Claude Sonnet), security analyzer (OpenAI o3-mini), bug detector (Gemini Flash), documentation checker (Claude
    Haiku), and issue reporter (Claude Haiku)
  • Persistent memory via SQLite database cached between runs - learns from previous scans to avoid false positives
  • Automated issue creation using gh CLI with duplicate detection and a strict 2-issue-per-run limit
  • Multi-provider support leveraging different models' strengths (reasoning models for security, fast models for bugs)
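
The filtering and the 2-issue-per-run limit described above could be sketched roughly as follows. This is only an illustration: the `Finding` type, the severity ranking, and the `(file, category)` duplicate key are assumptions made for the example, not something taken from the scanner's configuration.

```python
from dataclasses import dataclass

# Assumed severity ranking; the PR does not define one.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
MAX_ISSUES_PER_RUN = 2  # the limit stated in the PR description


@dataclass(frozen=True)
class Finding:
    """Hypothetical container for one finding from a sub-agent."""
    file: str
    line: int
    severity: str
    category: str


def select_for_reporting(findings, existing_keys):
    """Drop findings that were already reported, then keep the most severe up to the cap."""
    fresh = [f for f in findings if (f.file, f.category) not in existing_keys]
    # Unknown severities sort last; the sort is stable, so ties keep input order.
    fresh.sort(key=lambda f: SEVERITY_RANK.get(f.severity, len(SEVERITY_RANK)))
    return fresh[:MAX_ISSUES_PER_RUN]
```

Capping after sorting means a run that finds many problems still files only the two most severe, and the rest can surface on later nights.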

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Root Orchestrator                        │
│                   (claude-sonnet-4-5)                       │
│  • Loads memory from previous scans                         │
│  • Delegates to sub-agents                                  │
│  • Filters/prioritizes findings                             │
│  • Adds CATEGORY field before forwarding to reporter        │
└─────────────────────────────────────────────────────────────┘
          │              │                │              │
          ▼              ▼                ▼              ▼
    ┌──────────┐  ┌──────────┐  ┌───────────────┐  ┌──────────┐
    │ Security │  │   Bugs   │  │ Documentation │  │ Reporter │
    │ (o3-mini)│  │ (gemini) │  │    (haiku)    │  │ (haiku)  │
    └──────────┘  └──────────┘  └───────────────┘  └──────────┘
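
As a rough sketch of the "adds CATEGORY field before forwarding to reporter" step in the diagram: the function name and the dictionary shapes below are hypothetical, and in the actual PR this step is expressed as agent instructions rather than code.

```python
def forward_to_reporter(findings_by_agent):
    """Flatten sub-agent findings and tag each one with a CATEGORY field.

    findings_by_agent maps a category name (e.g. "security") to a list of
    finding dicts as returned by that sub-agent (assumed shape).
    """
    forwarded = []
    for category, findings in findings_by_agent.items():
        for finding in findings:
            # Copy so the sub-agent's original output is left untouched.
            forwarded.append({**finding, "CATEGORY": category})
    return forwarded
```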

Files Changed

  • .github/agents/nightly-scanner.yaml - Multi-agent configuration with instructions, toolsets, and permissions
  • .github/workflows/nightly-scan.yml - GitHub Actions workflow (runs daily at 6am UTC, supports dry-run mode)

Test plan

  • Trigger workflow manually with dry-run: true to verify agents run without creating issues
  • Verify the scanner memory database is created and cached
  • Trigger without dry-run: true on a branch with a known issue to verify issue creation
  • Verify duplicate detection prevents recreating the same issue

@derekmisler derekmisler self-assigned this Feb 3, 2026
@derekmisler derekmisler marked this pull request as ready for review February 3, 2026 03:13
@derekmisler derekmisler requested a review from a team as a code owner February 3, 2026 03:13
@krissetto
Contributor

Might be good to try out some alloy models in this and other actions, could potentially give deeper perspective to the agents and better overall results

@derekmisler derekmisler force-pushed the nightly-issue-scanner branch from 0d1c6d7 to 23e46fb Compare February 3, 2026 16:55
@derekmisler derekmisler marked this pull request as draft February 3, 2026 17:08
@derekmisler
Contributor Author

/review

github-actions[bot]

This comment was marked as outdated.

@derekmisler derekmisler force-pushed the nightly-issue-scanner branch from 23e46fb to 65081da Compare February 3, 2026 21:11
@derekmisler derekmisler marked this pull request as ready for review February 3, 2026 21:12
github-actions[bot]

This comment was marked as resolved.

@derekmisler derekmisler force-pushed the nightly-issue-scanner branch from 65081da to 5a76f90 Compare February 3, 2026 21:43
@derekmisler derekmisler marked this pull request as draft February 3, 2026 21:55
@derekmisler derekmisler force-pushed the nightly-issue-scanner branch from 5a76f90 to b147fd1 Compare February 3, 2026 22:16
@derekmisler
Contributor Author

derekmisler commented Feb 3, 2026

Might be good to try out some alloy models in this and other actions, could potentially give deeper perspective to the agents and better overall results

sure! what did you have in mind? i'm using 4 different models right now (1 root, 3 sub-agents). is that what you mean? or, like, two sub-agents with different models that both do kind of the same thing, and then the root compares their output?

edit: i just saw that you can combine models like model: claude,gpt-4o. how did i miss that?

Adds a read-only agent that scans the codebase daily and creates
GitHub issues for security vulnerabilities, bugs, and documentation gaps.

- Runs daily at 6am UTC (or manual trigger)
- Creates max 2 issues per run to avoid flooding
- Deduplicates against existing open issues with 'nightly-scan' label
- Dry-run mode for testing
- Strong anti-hallucination rules (read-only, must verify files exist)
- Root agent (claude-sonnet) orchestrates scan across sub-agents
- Security sub-agent (claude-opus) for vulnerability detection
- Bugs sub-agent (claude-sonnet) for logic errors and resource leaks
- Documentation sub-agent (claude-haiku) for doc gap detection
- Add GitHub Actions cache for persistent scanner memory
- Memory stores skip patterns, context, and feedback across runs
- Each sub-agent has strict grounding rules to prevent hallucinations

Agent improvements:
- Documentation agent reads all markdown files before analysis
- All sub-agents explicitly accept empty results as valid outcome
- Added read_multiple_files tool to documentation agent

Workflow improvements:
- Pin actions/cache to v4.2.0 SHA for security
- Use static cache key (matches cagent-action pattern)
- Replace custom JSON memory with cagent's SQLite memory toolset
- Agent uses get_memories/add_memory tools instead of file I/O
- Memory path resolves to .github/agents/scanner-memory.db
- Removes manual JSON initialization step
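
To illustrate what a SQLite-backed memory with get/add operations might look like, here is a minimal stand-in. The table name, schema, and function signatures are assumptions for the example. They only mirror the `get_memories`/`add_memory` tool names mentioned above and are not cagent's actual implementation.

```python
import sqlite3


def open_memory(path=":memory:"):
    """Open (or create) the memory database; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "content TEXT NOT NULL UNIQUE)"
    )
    return conn


def add_memory(conn, content):
    """Store a note such as a skip pattern; exact duplicates are ignored."""
    conn.execute("INSERT OR IGNORE INTO memories (content) VALUES (?)", (content,))
    conn.commit()


def get_memories(conn):
    """Return all stored notes in insertion order."""
    rows = conn.execute("SELECT content FROM memories ORDER BY id").fetchall()
    return [row[0] for row in rows]
```

Pointing `open_memory` at a file path that the workflow caches between runs would give the persistence described in this commit.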

Sub-agents now return findings in simple text format:
  FILE: path/to/file.go
  LINE: 123
  SEVERITY: high
  ...

Benefits over JSON:
- More natural for LLM output
- Less prone to formatting errors
- Matches cagent-action PR review pattern

Root agent still outputs JSON for workflow parsing.
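
A parser for this kind of `KEY: value` output might look like the sketch below. It assumes findings are separated by blank lines and that keys are upper-case words. Neither detail is confirmed by the commit message, and the fields elided by `...` above are not guessed at. A bare `NO_ISSUES` reply is treated as an empty result.

```python
def parse_findings(text):
    """Parse blocks of 'KEY: value' lines into a list of dicts.

    Assumes blank lines separate findings; values are kept as strings.
    """
    if text.strip() == "NO_ISSUES":
        return []
    findings, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the finding being built, if any.
            if current:
                findings.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        # Only accept upper-case keys such as FILE, LINE, SEVERITY.
        if sep and key.isupper():
            current[key] = value.strip()
    if current:
        findings.append(current)
    return findings
```

Lines that do not match the `KEY: value` shape are skipped rather than raising, which is what makes this format more forgiving of model output than strict JSON.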

Move issue creation from workflow to agent:
- New `reporter` sub-agent uses `gh` CLI to create issues
- Checks for duplicates before creating
- Selects appropriate labels based on category
- Workflow reduced from 175 lines to 55 lines

Dry-run mode now passed as prompt to agent.

Model assignments:
- Security: openai/o3-mini (reasoning model for subtle vulnerabilities)
- Bugs: google/gemini-2.5-flash (fast, good at Go code analysis)
- Documentation: anthropic/claude-haiku (sufficient for simpler task)
- Orchestrator: anthropic/claude-sonnet (coordination)
- Reporter: anthropic/claude-haiku (formatting + gh commands)

Workflow now passes all three API keys:
- ANTHROPIC_API_KEY
- OPENAI_API_KEY
- GOOGLE_API_KEY (mapped from GEMINI_API_KEY secret)

Fixes from code review:

1. Restrict shell permissions
   - Changed `gh *` to `gh issue list *` and `gh issue create *`
   - Principle of least privilege

2. CATEGORY field mismatch
   - Root agent now adds CATEGORY when forwarding to reporter
   - Added explicit "Forwarding to reporter" section with example

3. Inconsistent array/NO_ISSUES terminology
   - Changed "Return an empty array `[]`" to "Output `NO_ISSUES`"
   - Consistent across all three analysis agents

4. Documentation trigger clarity
   - Changed to "ONLY run if BOTH security AND bugs returned `NO_ISSUES`"
   - Unambiguous trigger condition

5. Better duplicate detection
   - Changed from downloading 100 issues to `gh issue list --search`
   - Searches by file path in issue body

6. Sub-agent failure handling
   - Added explicit error handling strategy
   - Log errors and continue with other agents
   - Report partial results if some agents fail
   - Added FAILED status to reporter output
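
Fixes 1 and 5 can be illustrated together. The sketch below checks a command against the two allowed patterns and builds a duplicate-search command that passes the check. Glob matching on the space-joined command is an assumption about how such patterns might be evaluated, not a description of cagent's permission system. The `nightly-scan` label comes from the earlier commit message, and the exact search query is a guess at "searches by file path in issue body".

```python
from fnmatch import fnmatchcase

# The two patterns named in the fix above.
ALLOWED_PATTERNS = ["gh issue list *", "gh issue create *"]


def is_allowed(argv):
    """Return True if the space-joined command matches an allowed glob pattern."""
    command = " ".join(argv)
    return any(fnmatchcase(command, pattern) for pattern in ALLOWED_PATTERNS)


def duplicate_search_command(file_path):
    """Build (but do not run) a gh command that searches open issues for a file path."""
    return [
        "gh", "issue", "list",
        "--state", "open",
        "--label", "nightly-scan",
        "--search", f'"{file_path}" in:body',
        "--json", "number,title",
    ]
```

Passing the command as an argument list, without a shell, matters here: a glob check on a string that is later interpreted by a shell could be bypassed with `;` or `&&`.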
@derekmisler derekmisler force-pushed the nightly-issue-scanner branch from 0620d32 to 50a396e Compare February 4, 2026 21:25
@derekmisler
Contributor Author

/review

github-actions[bot]

This comment was marked as resolved.

@derekmisler derekmisler marked this pull request as ready for review February 4, 2026 22:07
Contributor

@github-actions github-actions bot left a comment

Review Summary

I've analyzed the PR and found 2 issues worth noting:

  1. Logic flaw in documentation agent execution - Documentation issues will only be detected when there are NO security or bug issues, which contradicts the stated goal of comprehensive nightly scanning.

  2. Incomplete error handling for reporter agent failures - The orchestrator has error handling for analysis agents but not for the final reporter agent, which could lead to unreported failures when creating GitHub issues.

Both issues are in the agent configuration and represent design flaws rather than critical bugs. The workflow will function but may not behave as comprehensively as intended.

- Add rationale comment for documentation agent prioritization
  (only runs when no bugs/security issues to reduce noise)
- Add error handling for reporter agent failures
- Add output format for failed issue creation
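
The control flow discussed in this review (documentation runs only when security and bugs both come back clean, sub-agent errors are logged without aborting the scan, and reporter failures are surfaced) might be sketched as below. All names are hypothetical and the real behavior lives in the agent instructions. Treating a failed agent as "not clean" for the documentation trigger is one interpretation of the rule, not something the PR states.

```python
def run_scan(agents, report):
    """Run analysis agents, gate the documentation agent, and report findings.

    agents: dict mapping "security", "bugs", "documentation" to callables
            that return a list of findings (an empty list means NO_ISSUES).
    report: callable that files issues for a list of findings.
    """
    results, errors = {}, {}
    for name in ("security", "bugs"):
        try:
            results[name] = agents[name]()
        except Exception as exc:
            # Record the failure and continue with the other agents.
            errors[name] = str(exc)

    # Documentation runs ONLY if both agents completed and found nothing.
    if results.get("security") == [] and results.get("bugs") == []:
        try:
            results["documentation"] = agents["documentation"]()
        except Exception as exc:
            errors["documentation"] = str(exc)

    findings = [f for found in results.values() for f in found]
    status = "OK"
    if findings:
        try:
            report(findings)
        except Exception as exc:
            # Reporter failures are surfaced instead of silently dropped.
            errors["reporter"] = str(exc)
            status = "FAILED"
    return {"status": status, "findings": findings, "errors": errors}
```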
@derekmisler derekmisler merged commit 9cf6776 into docker:main Feb 5, 2026
5 checks passed

3 participants