[Pelis Agent Factory Advisor] Agentic Workflow Maturity Report - Feb 2026 #1106

📊 Executive Summary

The gh-aw-firewall repository shows impressive agentic workflow maturity (Level 4/5) with ~35 workflow files spanning security, CI monitoring, documentation, testing, and issue management. The security-first philosophy is well-reflected in the automation choices. The top opportunities are: an issue triage labeling agent, a /pr-fix slash-command for CI self-healing, a daily malicious code scanner, and a workflow health meta-agent.


🎓 Patterns Learned from Pelis Agent Factory

From crawling the full Pelis Agent Factory blog series and the githubnext/agentics repo, key patterns include:

| Pattern | Description | Relevance |
| --- | --- | --- |
| Specialization over monolith | Many focused agents beat one "do-everything" agent | This repo does this well |
| Meta-agents | Agents that monitor other agents (Metrics Collector, Audit Workflows, Workflow Health Manager) | Missing here |
| Cache-memory for state | Persistent state across runs for incremental work | Used in 1 workflow |
| Multi-phase workflows | Long-running projects (research → setup → implement) | Not used here yet |
| ChatOps slash commands | /plan, /pr-fix, /test-assist for on-demand help | Only /plan exists |
| Causal chains | Agent A creates issue → Agent B picks it up → CI validates | Partially implemented via Issue Monster |
| Domain-aware security workflows | Firewall, secrets analysis, malicious code scan | Strong here, one gap |
| Issue triage with labels | Auto-label on open for maintainer triage | Missing |
| Schema consistency checking | Detect drift between code, docs, schemas | Missing |

How this repo compares: The Pelis Factory runs 100+ workflows; this repo runs ~35. The ratio of security-focused to total workflows (~17%) is appropriate for a security tool. The causal chain (Issue Monster → Copilot coding agent) is excellent. The main gaps are in the meta-layer (observability, health monitoring) and some hygiene automations.


📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| security-guard | Reviews PRs for security regressions | PR opened/sync | ✅ Excellent: uses Claude, domain-specific |
| security-review | Daily comprehensive security + threat modeling | Daily + dispatch | ✅ Excellent: evidence-based, web-fetch enabled |
| dependency-security-monitor | Monitors CVEs, creates issues + draft PRs | Daily | ✅ Good: 30-day expiry on issues |
| secret-digger-claude/codex/copilot | Scans for exposed credentials (3 engines) | Daily | ✅ Strong: multi-engine coverage |
| ci-doctor | Investigates CI failures, creates issues | workflow_run | ✅ Good: 26 workflows monitored |
| ci-cd-gaps-assessment | Assesses CI/CD pipeline gaps | Daily | ✅ Good: discussion output |
| test-coverage-improver | Adds tests for security-critical paths | Weekly | ✅ Good: security-focused |
| smoke-claude/codex/copilot/chroot | End-to-end smoke tests of the firewall | PR + dispatch | ✅ Excellent: multi-engine smoke testing |
| build-test-* (8 langs) | Tests that PRs build in multiple language runtimes | PR opened/sync | ✅ Excellent: broad language coverage |
| doc-maintainer | Syncs docs with recent code changes | Daily | ✅ Good: PR output |
| cli-flag-consistency-checker | Checks CLI flags vs docs | Weekly | ✅ Good: discussion output |
| issue-monster | Dispatches issues to Copilot coding agent | Issues opened + hourly | ✅ Good: task dispatcher |
| issue-duplication-detector | Detects duplicate issues (cache-memory) | Issues opened | ✅ Good: uses cache-memory |
| plan | Generates project plans via /plan slash command | Slash command | ✅ Good: ChatOps |
| update-release-notes | Enhances release notes from diff | Release published | ✅ Good |
| pelis-agent-factory-advisor | This workflow: periodic advisory analysis | Scheduled | ✅ Meta-awareness |

🚀 Actionable Recommendations

P0: Implement Immediately

🏷️ Issue Triage Agent

What: Auto-label incoming issues with appropriate labels (bug, enhancement, documentation, question, security, performance, help-wanted).

Why: Currently issue-monster dispatches issues, but no labeling happens. About 15 open issues are visible with no labels, so maintainers spend time categorizing them manually. This is the Pelis Factory's "hello world" workflow, and it is straightforward to implement.

How: Add a workflow triggered on issues: [opened, reopened]. Agent reads issue title/body, searches related issues, assigns a label and leaves a brief comment explaining why. Use lockdown: false to handle issues from all contributors.

Effort: Low (~30 min)

---
description: Automatically labels and triages incoming issues
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
    lockdown: false
safe-outputs:
  add-labels:
    allowed: [bug, enhancement, documentation, question, security, performance, help-wanted, good-first-issue]
  add-comment:
    max: 1
timeout-minutes: 5
---
# Issue Triage Agent

For the newly opened issue #${{ github.event.issue.number }} in ${{ github.repository }},
analyze the title and body to assign the most appropriate label from the allowed list.

Research the issue in the context of this firewall/security codebase. After labeling,
comment to explain the label choice and briefly describe how the issue might be addressed.

Skip if the issue already has labels or is assigned to a user.
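
The `allowed` list under `add-labels` works as an allowlist over whatever labels the agent proposes. The following is a rough Python illustration of that idea only. It is not how gh-aw implements safe outputs, and the names `ALLOWED_LABELS` and `filter_labels` are hypothetical.

```python
# Hypothetical sketch of an allowlist check; not gh-aw's actual implementation.
ALLOWED_LABELS = {
    "bug", "enhancement", "documentation", "question",
    "security", "performance", "help-wanted", "good-first-issue",
}


def filter_labels(proposed, existing=()):
    """Return the allowed subset of proposed labels, in order, without duplicates.

    Returns an empty list when the issue already has labels, mirroring the
    "skip if the issue already has labels" rule in the prompt above.
    """
    if existing:
        return []
    seen = set()
    kept = []
    for label in proposed:
        normalized = label.strip().lower()
        if normalized in ALLOWED_LABELS and normalized not in seen:
            seen.add(normalized)
            kept.append(normalized)
    return kept
```

Normalizing case and whitespace before the comparison keeps a proposal such as "Bug " from slipping past or being rejected for cosmetic reasons.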

🔧 PR Fix Slash Command (/pr-fix)

What: An on-demand slash command that investigates and attempts to fix failing CI checks on a PR.

Why: Multiple open issues show CI failures that require manual investigation (e.g., #1091, #1092, #1093). The Pelis Factory /pr-fix workflow had very high adoption: developers can type /pr-fix in a PR comment and have the agent attempt repairs. The CI Doctor creates investigation issues but doesn't fix them; this fills the gap.

How: Add a slash_command: pr-fix workflow that reads the failing job logs, identifies the root cause, and attempts a fix via a create-pull-request safe output. A template already exists in githubnext/agentics: run gh aw add-wizard githubnext/agentics/workflows/pr-fix.md, then customize it for this repo's TypeScript/Node.js stack.

Effort: Low (template available)


🔍 Daily Malicious Code Scanner

What: Daily scan of recent code commits for suspicious patterns, supply chain attacks, and backdoors.

Why: This is a security tool that could itself be a target. The Pelis Factory's "Daily Malicious Code Scan" is listed as one of 5 security workflows, specifically because the gh-aw project itself processes untrusted code. For gh-aw-firewall, where the tool literally runs AI agent code, this is critical. A template already exists: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-malicious-code-scan.md.

How: Add a daily workflow scanning recent commits for suspicious patterns: exfiltration patterns, unexpected network calls, obfuscated code, unusual capability requests, backdoors in container entrypoints.

Effort: Low (template available, customize for container/iptables patterns)
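
To make "scanning recent commits for suspicious patterns" concrete, here is a minimal Python sketch that inspects only the added lines of a unified diff. The rule names and regular expressions are illustrative assumptions, not taken from the referenced template, and a real scan would need a much broader rule set plus agent judgment on context.

```python
import re

# Illustrative rules only (assumptions); not the template's actual checks.
SUSPICIOUS_PATTERNS = {
    "pipe-to-shell": re.compile(r"(curl|wget)\b[^|\n]*\|\s*(ba)?sh\b"),
    "base64-decode": re.compile(r"base64\s+(-d|--decode)\b"),
    "dynamic-eval": re.compile(r"\beval\s*\("),
}


def scan_diff(diff_text):
    """Return (line_number, rule_name, text) for each suspicious added line."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect added lines; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for name, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(added):
                findings.append((number, name, added.strip()))
    return findings
```

Restricting the scan to added lines keeps the report focused on what a commit introduces rather than on code it removes.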


P1: Plan for Near-Term

🏥 Workflow Health Manager (Meta-Agent)

What: A meta-agent that monitors the health of all other agentic workflows, detecting stalled workflows, cost anomalies, zero-output workflows, and infrastructure issues.

Why: With 35+ workflows, the CI Doctor covers traditional CI, but nothing watches the agentic workflows. Issues #1097 (Security Guard failed), #1101 (Issue Monster failed), and #1105 (Secret Digger failed) are currently sitting unresolved. A health manager would automatically investigate these agentic workflow failures. The Pelis Factory's Workflow Health Manager created 40 issues with 25 leading to PRs, making it one of the highest-ROI workflows.

How: Daily scheduled workflow that uses tools: agentic-workflows to query recent runs of all .md workflows, identifies failures, stalls, or cost anomalies, and creates investigation issues.

Effort: Medium
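
As a sketch of the failure and stall detection described above, the following Python function classifies a list of run records. The record shape (keys `workflow`, `status`, `conclusion`, `started_at`) and the two-hour stall threshold are assumptions made for illustration; the data an agent actually receives from the agentic-workflows tool may look different.

```python
from datetime import datetime, timedelta


def find_unhealthy(runs, now: datetime, stall_after=timedelta(hours=2)):
    """Group workflow names by problem type.

    Each run is a dict with hypothetical keys: "workflow", "status"
    ("completed" or "in_progress"), "conclusion", and "started_at"
    (a timezone-aware datetime).
    """
    report = {"failed": [], "stalled": []}
    for run in runs:
        if run["status"] == "completed":
            if run["conclusion"] != "success":
                report["failed"].append(run["workflow"])
        elif now - run["started_at"] > stall_after:
            # Still running long after it started: treat as stalled.
            report["stalled"].append(run["workflow"])
    return report
```

Each non-empty group could then become one investigation issue, which keeps the number of created issues bounded.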


🔄 Schema Consistency Checker

What: Detects drift between the CLI types (src/types.ts), Squid configuration generator (src/squid-config.ts), documentation (docs/), and the docs-site reference (docs-site/src/content/docs/reference/).

Why: This repo has complex configuration schemas (WrapperConfig in src/types.ts, Squid ACL patterns, Docker Compose structure, and CLI flags), all of which have corresponding documentation. Drift between these is common, and this repo's AGENTS.md names doc drift as a known problem. The Pelis Factory's Schema Consistency Checker created 55 discussion reports of schema drift.

How: Weekly workflow that reads src/types.ts, src/cli.ts, and key docs, identifies inconsistencies, and creates a discussion report. Can be extended to create PRs for clear drift.

Effort: Medium
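
One narrow slice of this check, comparing the long-form flags that appear in CLI source against those mentioned in docs, can be sketched in Python as below. This is a simplification under stated assumptions: it treats any `--name` token as a flag and does not parse TypeScript, and `find_flag_drift` is a hypothetical helper name.

```python
import re

# Assumption: long-form flags look like "--allow-domains" (lowercase, hyphenated).
FLAG_PATTERN = re.compile(r"--[a-z][a-z0-9-]*")


def find_flag_drift(cli_source, docs_text):
    """Compare flag-like tokens found in source text against those in docs."""
    code_flags = set(FLAG_PATTERN.findall(cli_source))
    doc_flags = set(FLAG_PATTERN.findall(docs_text))
    return {
        "undocumented": sorted(code_flags - doc_flags),
        "stale_in_docs": sorted(doc_flags - code_flags),
    }
```

A token-level comparison like this only surfaces candidates; an agent would still need to read the surrounding code and docs to confirm real drift, for example when a flag is mentioned in a changelog but intentionally removed.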


💥 Breaking Change Checker

What: Monitors PRs for backward-incompatible changes to the CLI interface, Docker API, or domain allowlist behavior.

Why: gh-aw-firewall is used by downstream repos and CI/CD systems that depend on stable CLI flags and behavior. A breaking change to --allow-domains semantics or container startup could silently break user workflows. The Pelis Factory's Breaking Change Checker creates alert issues before changes merge.

How: PR-triggered workflow that reads the diff, identifies changes to src/cli.ts (flag removals/renames), src/types.ts (interface changes), or container API changes, and adds a warning comment or label.

Effort: Medium


📊 Audit Workflows (Meta-Analytics)

What: A daily meta-agent that audits all agentic workflow runs for cost, error patterns, output quality, and success rates.

Why: The Pelis Factory's Audit Workflows is described as "essential for observability": the difference between a well-oiled machine and an expensive black box. With 35+ workflows running daily, having a meta-agent tracking token usage, error rates, and quality would help justify and optimize the investment.

How: Daily workflow using tools: agentic-workflows to query the last 7 days of workflow runs, aggregate metrics, and post a discussion report.

Effort: Medium
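
The aggregation step (last 7 days, per-workflow metrics) might look roughly like the Python sketch below. The record keys (`workflow`, `conclusion`, `tokens`, `finished_at`) are assumed for illustration and are not a documented output format of any tool.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def summarize_runs(runs, now: datetime, window=timedelta(days=7)):
    """Aggregate run count, successes, tokens, and success rate per workflow.

    Each run is a dict with hypothetical keys: "workflow", "conclusion",
    "tokens" (int), and "finished_at" (a timezone-aware datetime).
    """
    totals = defaultdict(lambda: {"runs": 0, "successes": 0, "tokens": 0})
    for run in runs:
        if now - run["finished_at"] > window:
            continue  # Outside the reporting window.
        entry = totals[run["workflow"]]
        entry["runs"] += 1
        entry["successes"] += int(run["conclusion"] == "success")
        entry["tokens"] += run["tokens"]
    return {
        name: {**entry, "success_rate": entry["successes"] / entry["runs"]}
        for name, entry in totals.items()
    }
```

Sorting the result by token total or by success rate would give the discussion report a natural "most expensive" and "least reliable" ordering.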


P2: Consider for Roadmap

🧹 Code Simplifier (Continuous Cleanup)

What: Daily agent that analyzes recently modified TypeScript files and creates PRs with simplifications (extract helpers, remove nesting, simplify conditions).

Why: The codebase is growing complex: src/docker-manager.ts is noted as large and complex (AGENTS.md mentions it frequently). A continuous simplifier would help keep it manageable. The Pelis Factory Code Simplifier had an 83% merge rate.

How: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/code-simplifier.md then customize for TypeScript.

Effort: Low (template available)


🔗 Link Checker

What: Daily/weekly automated check of all links in docs/ and docs-site/ to detect broken or outdated URLs.

Why: The repo has extensive docs (20+ files in docs/). External links to GitHub docs, Docker docs, and Squid documentation can go stale. Already available: gh aw add-wizard githubnext/agentics/workflows/link-checker.md.

Effort: Low (template available)
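
For orientation, the first step of any link check is collecting the external URLs from the Markdown sources. The Python sketch below handles only inline `[text](url)` links and is an assumption about how such a step could work, not a description of the referenced template.

```python
import re

# Matches inline Markdown links with an http(s) target; reference-style
# links and bare URLs are intentionally out of scope for this sketch.
INLINE_LINK = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def extract_external_links(markdown_text):
    """Return external link targets in first-seen order, without duplicates."""
    return list(dict.fromkeys(INLINE_LINK.findall(markdown_text)))
```

Deduplicating before checking avoids requesting the same URL once per page that cites it.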


🌐 Firewall Egress Test Reporter

What: A domain-specific daily report that validates that the firewall's egress filtering is working correctly, testing that known-good domains are allowed and known-blocked domains are rejected.

Why: This is unique to this repository's domain. The firewall tool's own CI already has smoke tests, but a daily report of "what domains are being tested and with what pass/fail rates" would help maintainers spot coverage gaps in the integration tests. No template covers this; it is a custom workflow leveraging the repo's domain knowledge.

How: Daily workflow that reviews tests/integration/ for domain coverage, identifies untested domains, and generates a report. Could evolve to propose new integration tests.

Effort: Medium
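
The "identify untested domains" step could be approximated as follows in Python. The list of expected domains and the way test sources are loaded are left to the caller, and both are assumptions of this sketch; `untested_domains` is a hypothetical helper, not existing repo code.

```python
import re

# Rough hostname matcher; it can also match file names such as "foo.test.ts",
# which is harmless here because matches are only subtracted from the
# expected set.
DOMAIN_PATTERN = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b")


def untested_domains(expected_domains, test_sources):
    """Return expected domains that never appear as a full hostname in tests."""
    mentioned = set()
    for source in test_sources:
        mentioned.update(DOMAIN_PATTERN.findall(source.lower()))
    return sorted(set(expected_domains) - mentioned)
```

Matching full hostnames means a test that only hits api.github.com does not count as coverage for github.com, which errs toward reporting gaps rather than hiding them.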


📈 Metrics Collector (Agent Ecosystem Analytics)

What: Daily collection of agentic workflow performance metrics: runs per workflow, token usage, output quality, issue/PR creation rates.

Why: As the workflow count grows, understanding which agents are high-value vs. low-ROI becomes important. Pelis Factory's Metrics Collector created 41 daily discussions and directly fed the Portfolio Analyst.

Effort: Medium


P3: Future Ideas

  • Mergefest: Auto-merge main into long-lived PR branches to reduce conflicts
  • Sub Issue Closer: Auto-close sub-issues when parent is resolved
  • Daily Accessibility Review: Playwright-based accessibility testing of docs-site/
  • Weekly Issue Summary: Summarize open issues for maintainers each week
  • Contribution Guidelines Checker: Verify new PRs follow CONTRIBUTING.md, commit message format, etc.
  • Container CVE Notifier: When new CVEs affect ubuntu:22.04 or ubuntu/squid base images, create an issue automatically

📈 Maturity Assessment

| Dimension | Score | Notes |
| --- | --- | --- |
| Security automation | 5/5 | Best-in-class: 6 security workflows, multi-engine |
| CI/CD quality | 5/5 | CI Doctor, smoke tests, build-test across 8 languages |
| Documentation maintenance | 4/5 | Doc maintainer + CLI checker; missing link-checker |
| Issue management | 3/5 | Monster + dedup + plan; missing triage labels |
| Code quality automation | 2/5 | Test coverage improver; missing simplifier, dedup |
| Meta-observability | 2/5 | This advisor; missing health manager, audit, metrics |
| ChatOps / slash commands | 2/5 | Only /plan; missing /pr-fix, /ask |

Current Overall Level: 4/5. Mature, security-first, production-grade automation.

Target Level: 5/5. Add meta-observability layer and fill ChatOps + code quality gaps.

Gap: The meta-layer (health monitoring, audit, metrics) is the biggest gap. Without it, it's hard to know if the existing agents are working well. The [agentics] failure issues (#1097, #1101, #1105) sitting unresolved illustrate this: a Workflow Health Manager would have caught and investigated these automatically.


🔄 Comparison with Best Practices

What This Repo Does Well (vs Pelis Factory)

  • ✅ Security specialization: The 3 secret-digger variants (claude/codex/copilot) are more sophisticated than typical Factory security workflows
  • ✅ Domain-specific smoke tests: Multi-engine firewall smoke tests are a unique strength
  • ✅ Cache-memory adoption: issue-duplication-detector uses cache-memory for cross-run state
  • ✅ CI Doctor: One of the most impactful workflows from the Factory, already implemented
  • ✅ Causal chains: Issue Monster + Copilot coding agent creates a dispatch chain
  • ✅ Multiple engines: Using claude, codex, and copilot gives redundancy and comparison

What Could Improve

  • ⚠️ Meta-layer gap: No workflow auditing other workflow health at the agentic level
  • ⚠️ No issue triage labels: Issues go unlabeled; maintainers manually categorize
  • ⚠️ No code simplification: Codebase growing in complexity without cleanup automation
  • ⚠️ ChatOps limited to /plan: /pr-fix would be very high-value for this repo

Unique Opportunities Given the Domain

  1. Self-testing the firewall daily: The tool's own security is the product; daily automated egress validation reports are uniquely valuable here
  2. Iptables/container security audit: Daily analysis of recent changes to setup-iptables.sh, entrypoint.sh, and host-iptables.ts for security regressions, going deeper than the current security-guard PR review
  3. Schema-conformance for Squid config: Validating generated Squid configs against known-good patterns is uniquely valuable for a proxy config tool

Generated by Pelis Agent Factory Advisor. Notes saved to cache memory for trend tracking across runs.


Note: This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.

Tip: Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.

Generated by Pelis Agent Factory Advisor

  • expires on Mar 7, 2026, 3:18 AM UTC
