[Pelis Agent Factory Advisor] Agentic Workflow Maturity Report — Feb 2026

## 📊 Executive Summary

The `gh-aw-firewall` repository shows **impressive agentic workflow maturity** (Level 4/5) with ~35 workflow files spanning security, CI monitoring, documentation, testing, and issue management. The security-first philosophy is well-reflected in the automation choices. The **top opportunities** are: an issue triage labeling agent, a `/pr-fix` slash-command for CI self-healing, a daily malicious code scanner, and a workflow health meta-agent.

---

## 🎓 Patterns Learned from Pelis Agent Factory

From crawling the full [Pelis Agent Factory blog series](https://github.github.io/gh-aw/blog/2026-01-12-welcome-to-pelis-agent-factory/) and the [githubnext/agentics](https://github.com/githubnext/agentics) repo, key patterns include:

| Pattern | Description | Relevance |
|---------|-------------|-----------|
| **Specialization over monolith** | Many focused agents beat one "do-everything" agent | This repo does this well |
| **Meta-agents** | Agents that monitor other agents (Metrics Collector, Audit Workflows, Workflow Health Manager) | Missing here |
| **Cache-memory for state** | Persistent state across runs for incremental work | Used in 1 workflow |
| **Multi-phase workflows** | Long-running projects (research → setup → implement) | Not used here yet |
| **ChatOps slash commands** | `/plan`, `/pr-fix`, `/test-assist` for on-demand help | Only `/plan` exists |
| **Causal chains** | Agent A creates issue → Agent B picks it up → CI validates | Partially implemented via Issue Monster |
| **Domain-aware security workflows** | Firewall, secrets analysis, malicious code scan | Strong here, one gap |
| **Issue triage with labels** | Auto-label on open for maintainer triage | Missing |
| **Schema consistency checking** | Detect drift between code, docs, schemas | Missing |

**How this repo compares**: The Pelis Factory runs 100+ workflows; this repo runs ~35. The ratio of security-focused to total workflows (~17%) is appropriate for a security tool. The causal chain (Issue Monster → Copilot coding agent) is excellent. The main gaps are in the meta-layer (observability, health monitoring) and some hygiene automations.

---

## 📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
|----------|---------|---------|------------|
| `security-guard` | Reviews PRs for security regressions | PR opened/sync | ✅ Excellent — uses Claude, domain-specific |
| `security-review` | Daily comprehensive security + threat modeling | Daily + dispatch | ✅ Excellent — evidence-based, web-fetch enabled |
| `dependency-security-monitor` | Monitors CVEs, creates issues + draft PRs | Daily | ✅ Good — 30-day expiry on issues |
| `secret-digger-claude/codex/copilot` | Scans for exposed credentials (3 engines) | Daily | ✅ Strong — multi-engine coverage |
| `ci-doctor` | Investigates CI failures, creates issues | workflow_run | ✅ Good — 26 workflows monitored |
| `ci-cd-gaps-assessment` | Assesses CI/CD pipeline gaps | Daily | ✅ Good — discussion output |
| `test-coverage-improver` | Adds tests for security-critical paths | Weekly | ✅ Good — security-focused |
| `smoke-claude/codex/copilot/chroot` | End-to-end smoke tests of the firewall | PR + dispatch | ✅ Excellent — multi-engine smoke testing |
| `build-test-*` (8 langs) | Tests PRs build in multiple language runtimes | PR opened/sync | ✅ Excellent — broad language coverage |
| `doc-maintainer` | Syncs docs with recent code changes | Daily | ✅ Good — PR output |
| `cli-flag-consistency-checker` | Checks CLI flags vs docs | Weekly | ✅ Good — discussion output |
| `issue-monster` | Dispatches issues to Copilot coding agent | Issues opened + hourly | ✅ Good — task dispatcher |
| `issue-duplication-detector` | Detects duplicate issues (cache-memory) | Issues opened | ✅ Good — uses cache-memory |
| `plan` | Generates project plans via `/plan` slash command | Slash command | ✅ Good — ChatOps |
| `update-release-notes` | Enhances release notes from diff | Release published | ✅ Good |
| `pelis-agent-factory-advisor` | This workflow — periodic advisory analysis | Scheduled | ✅ Meta-awareness |

---

## 🚀 Actionable Recommendations

### P0 — Implement Immediately

#### 🏷️ Issue Triage Agent

**What**: Auto-label incoming issues with appropriate labels (`bug`, `enhancement`, `documentation`, `question`, `security`, `performance`, `help-wanted`).

**Why**: Currently `issue-monster` dispatches issues but nothing labels them. About 15 open issues are visible with no labels, so maintainers spend time categorizing them manually. Issue triage is the Pelis Factory's "hello world" workflow, and it is straightforward to implement.

**How**: Add a workflow triggered on `issues: [opened, reopened]`. Agent reads issue title/body, searches related issues, assigns a label and leaves a brief comment explaining why. Use `lockdown: false` to handle issues from all contributors.

**Effort**: Low (~30 min)

````markdown
---
description: Automatically labels and triages incoming issues
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
    lockdown: false
safe-outputs:
  add-labels:
    allowed: [bug, enhancement, documentation, question, security, performance, help-wanted, good-first-issue]
  add-comment:
    max: 1
timeout-minutes: 5
---
# Issue Triage Agent

For the newly opened issue #${{ github.event.issue.number }} in ${{ github.repository }},
analyze the title and body to assign the most appropriate label from the allowed list.

Research the issue in the context of this firewall/security codebase. After labeling,
comment to explain the label choice and briefly describe how the issue might be addressed.

Skip if the issue already has labels or is assigned to a user.
````

---

#### 🔧 PR Fix Slash Command (`/pr-fix`)

**What**: An on-demand slash command that investigates and attempts to fix failing CI checks on a PR.

**Why**: Multiple open issues show CI failures that require manual investigation (e.g., #1091, #1092, #1093). The Pelis Factory `/pr-fix` workflow had very high adoption: developers love being able to type `/pr-fix` in a PR comment and have the agent attempt repairs. The CI Doctor creates investigation issues but doesn't fix the failures; this workflow fills that gap.

**How**: Add a `slash_command: pr-fix` workflow that reads the failing job logs, identifies the root cause, and attempts a fix via a `create-pull-request` safe output. A template already exists in `githubnext/agentics`: run `gh aw add-wizard githubnext/agentics/workflows/pr-fix.md`, then customize it for this repo's TypeScript/Node.js stack.

**Effort**: Low (template available)

---

#### 🔍 Daily Malicious Code Scanner

**What**: Daily scan of recent code commits for suspicious patterns, supply chain attacks, and backdoors.

**Why**: This is a **security tool** that could itself be a target. The Pelis Factory lists its "Daily Malicious Code Scan" as one of 5 security workflows, specifically because the gh-aw project itself processes untrusted code. For gh-aw-firewall, where the tool literally runs AI agent code, this is critical. A template already exists: `gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-malicious-code-scan.md`.

**How**: Add a daily workflow scanning recent commits for suspicious patterns: exfiltration patterns, unexpected network calls, obfuscated code, unusual capability requests, backdoors in container entrypoints.
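
As a rough illustration of the kind of cheap prefilter such a workflow could run before handing candidates to the agent, the sketch below scans only the lines a diff adds. The pattern list, the `ScanFinding` shape, and the `scanAddedLines` name are assumptions made for this example; they are not taken from the gh-aw template.

```typescript
// Hypothetical prefilter patterns; a real scanner would need a far richer set.
interface SuspiciousPattern {
  name: string;
  regex: RegExp;
}

const PATTERNS: SuspiciousPattern[] = [
  { name: "pipe-to-shell", regex: /\b(curl|wget)\b[^|\n]*\|\s*(ba)?sh\b/ },
  { name: "dynamic-eval", regex: /\beval\s*\(/ },
  { name: "base64-decode", regex: /base64\s+(-d|--decode)\b/ },
];

interface ScanFinding {
  diffLine: number; // 1-based line number within the diff text
  pattern: string;  // name of the pattern that matched
  text: string;     // the added line, without its "+" prefix
}

// Scan only lines added by the diff ("+" prefix), skipping "+++" file headers.
function scanAddedLines(diff: string): ScanFinding[] {
  const findings: ScanFinding[] = [];
  diff.split("\n").forEach((line, index) => {
    if (line.charAt(0) !== "+" || line.slice(0, 3) === "+++") return;
    const added = line.slice(1);
    for (const p of PATTERNS) {
      if (p.regex.test(added)) {
        findings.push({ diffLine: index + 1, pattern: p.name, text: added.trim() });
      }
    }
  });
  return findings;
}
```

Regex matches like these are only leads for the agent to investigate; removed and unchanged lines are ignored so that the report focuses on newly introduced code.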

**Effort**: Low (template available, customize for container/iptables patterns)

---

### P1 — Plan for Near-Term

#### 🏥 Workflow Health Manager (Meta-Agent)

**What**: A meta-agent that monitors the health of all other agentic workflows — detecting stalled workflows, cost anomalies, zero-output workflows, and infrastructure issues.

**Why**: With 35+ workflows, the CI Doctor covers traditional CI but nobody watches the _agentic_ workflows. Issues #1097 (Security Guard failed), #1101 (Issue Monster failed), #1105 (Secret Digger failed) are currently sitting unresolved. A health manager would automatically investigate these agentic workflow failures. The Pelis Factory's Workflow Health Manager created **40 issues with 25 leading to PRs** — one of the highest-ROI workflows.

**How**: Daily scheduled workflow that uses `tools: agentic-workflows` to query recent runs of all `.md` workflows, identifies failures, stalls, or cost anomalies, and creates investigation issues.
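
The classification step could look roughly like the sketch below. The `WorkflowRun` shape, the `findUnhealthy` name, and the 3x-median cost threshold are assumptions for illustration; the actual data returned by the `agentic-workflows` tool may look quite different.

```typescript
// Hypothetical shape of one agentic workflow run; not the real tool output.
interface WorkflowRun {
  workflow: string;
  conclusion: "success" | "failure" | "timed_out";
  outputsCreated: number; // issues, PRs, comments, or discussions produced
  tokens: number;         // total tokens consumed by the run
}

interface HealthFinding {
  workflow: string;
  reason: "failure" | "stall" | "zero-output" | "cost-anomaly";
}

// Assumed threshold: flag a run that costs more than 3x the workflow's median.
const COST_MULTIPLIER = 3;

function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Runs are assumed to be listed in chronological order, oldest first.
function findUnhealthy(runs: WorkflowRun[]): HealthFinding[] {
  const findings: HealthFinding[] = [];
  const names = Array.from(new Set(runs.map((r) => r.workflow)));
  for (const name of names) {
    const own = runs.filter((r) => r.workflow === name);
    const latest = own[own.length - 1];
    if (latest.conclusion === "failure") findings.push({ workflow: name, reason: "failure" });
    if (latest.conclusion === "timed_out") findings.push({ workflow: name, reason: "stall" });
    if (own.every((r) => r.outputsCreated === 0)) {
      findings.push({ workflow: name, reason: "zero-output" });
    }
    const typical = median(own.map((r) => r.tokens));
    if (typical > 0 && latest.tokens > COST_MULTIPLIER * typical) {
      findings.push({ workflow: name, reason: "cost-anomaly" });
    }
  }
  return findings;
}
```

Each finding would then become one investigation issue, which keeps the deterministic checks separate from the agent's root-cause analysis.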

**Effort**: Medium

---

#### 🔄 Schema Consistency Checker

**What**: Detects drift between the CLI types (`src/types.ts`), Squid configuration generator (`src/squid-config.ts`), documentation (`docs/`), and the docs-site reference (`docs-site/src/content/docs/reference/`).

**Why**: This repo has complex configuration schemas: `WrapperConfig` in `src/types.ts`, Squid ACL patterns, Docker Compose structure, and CLI flags, all of which have corresponding documentation. Drift between these is common, and this repo's AGENTS.md already calls out doc drift as a known problem. The Pelis Factory's Schema Consistency Checker created **55 discussion reports** of schema drift.

**How**: Weekly workflow that reads `src/types.ts`, `src/cli.ts`, and key docs, identifies inconsistencies, and creates a discussion report. Can be extended to create PRs for clear drift.
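
One possible deterministic check behind such a report is sketched below: list the fields of an interface and flag those the docs never mention. It assumes the docs refer to fields by their exact names, and the function names and the sample field names are hypothetical, not the real contents of `WrapperConfig`.

```typescript
// Extract property names from the body of a named interface in TypeScript source.
// A regex keeps this sketch dependency-free; it does not handle nested braces,
// so a real checker would more likely use the TypeScript compiler API.
function interfaceFields(source: string, name: string): string[] {
  const body = new RegExp("interface\\s+" + name + "\\s*\\{([^}]*)\\}").exec(source);
  if (body === null) return [];
  const fields: string[] = [];
  const field = /^\s*(\w+)\??\s*:/gm;
  let m: RegExpExecArray | null;
  while ((m = field.exec(body[1])) !== null) {
    fields.push(m[1]);
  }
  return fields;
}

// Fields defined in code whose names never appear in the documentation text.
function undocumentedFields(source: string, name: string, docs: string): string[] {
  return interfaceFields(source, name).filter((f) => docs.indexOf(f) === -1);
}

// Hypothetical inputs; the field names are invented for illustration.
const sampleSource = [
  "export interface WrapperConfig {",
  "  allowedDomains: string[];",
  "  logLevel?: string;",
  "  keepContainers: boolean;",
  "}",
].join("\n");
const sampleDocs = "Set `allowedDomains` to a list. `keepContainers` preserves containers.";
const missing = undocumentedFields(sampleSource, "WrapperConfig", sampleDocs);
// missing is ["logLevel"]
```

A plain substring check will miss renamed or reworded fields, so the agent would still need to review the surrounding docs before proposing a fix.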

**Effort**: Medium

---

#### 💥 Breaking Change Checker

**What**: Monitors PRs for backward-incompatible changes to the CLI interface, Docker API, or domain allowlist behavior.

**Why**: `gh-aw-firewall` is used by downstream repos and CI/CD systems that depend on stable CLI flags and behavior. A breaking change to `--allow-domains` semantics or container startup could silently break user workflows. The Pelis Factory's Breaking Change Checker creates alert issues before changes merge.

**How**: PR-triggered workflow that reads the diff, identifies changes to `src/cli.ts` (flag removals/renames), `src/types.ts` (interface changes), or container API changes, and adds a warning comment or label.
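
For the flag-removal case specifically, a minimal sketch could diff the long options found in the base and head versions of `src/cli.ts`. The helper names, the comment wording, and the `--allowed-domains` and `--log-level` flags are assumptions for illustration; only `--allow-domains` comes from this report.

```typescript
// Collect each distinct long option (e.g. --allow-domains) mentioned in a source text.
function longFlags(source: string): string[] {
  const found: string[] = [];
  const pattern = /--[a-z][a-z0-9-]*/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(source)) !== null) {
    if (found.indexOf(m[0]) === -1) found.push(m[0]);
  }
  return found;
}

interface FlagDiff {
  removed: string[]; // present in base, absent in head: potentially breaking
  added: string[];   // absent in base, present in head: may be the new name of a renamed flag
}

function diffFlags(base: string, head: string): FlagDiff {
  const before = longFlags(base);
  const after = longFlags(head);
  return {
    removed: before.filter((f) => after.indexOf(f) === -1),
    added: after.filter((f) => before.indexOf(f) === -1),
  };
}

// Build the warning comment, or return null when no flag disappeared.
function breakingChangeComment(diff: FlagDiff): string | null {
  if (diff.removed.length === 0) return null;
  let comment = "Possible breaking change: removed CLI flags: " + diff.removed.join(", ") + ".";
  if (diff.added.length > 0) {
    comment += " Newly added flags (possible renames): " + diff.added.join(", ") + ".";
  }
  return comment;
}
```

A rename shows up as one removal plus one addition, so the comment lists both and leaves the judgment about intent to the reviewer.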

**Effort**: Medium

---

#### 📊 Audit Workflows (Meta-Analytics)

**What**: A daily meta-agent that audits all agentic workflow runs for cost, error patterns, output quality, and success rates.

**Why**: The Pelis Factory's Audit Workflows is described as "essential for observability" — the difference between a well-oiled machine and an expensive black box. With 35+ workflows running daily, having a meta-agent tracking token usage, error rates, and quality would help justify and optimize the investment.

**How**: Daily workflow using `tools: agentic-workflows` to query the last 7 days of workflow runs, aggregate metrics, and post a discussion report.
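
The aggregation step might be sketched as follows, assuming run records have already been fetched. The `RunRecord` fields and the `auditRuns` name are hypothetical; the `agentic-workflows` tool may expose different data.

```typescript
// Hypothetical record for one completed agentic workflow run.
interface RunRecord {
  workflow: string;
  finishedAt: string; // ISO 8601 timestamp
  success: boolean;
  tokens: number;
}

interface WorkflowStats {
  workflow: string;
  runs: number;
  successRate: number; // between 0 and 1
  totalTokens: number;
}

// Aggregate per-workflow metrics over the trailing window, worst success rate first.
function auditRuns(runs: RunRecord[], now: Date, windowDays = 7): WorkflowStats[] {
  const cutoff = now.getTime() - windowDays * 24 * 60 * 60 * 1000;
  const totals = new Map<string, { runs: number; successes: number; tokens: number }>();
  for (const run of runs) {
    if (new Date(run.finishedAt).getTime() < cutoff) continue; // outside the window
    const t = totals.get(run.workflow) || { runs: 0, successes: 0, tokens: 0 };
    t.runs += 1;
    if (run.success) t.successes += 1;
    t.tokens += run.tokens;
    totals.set(run.workflow, t);
  }
  const stats: WorkflowStats[] = [];
  totals.forEach((t, workflow) => {
    stats.push({
      workflow,
      runs: t.runs,
      successRate: t.successes / t.runs,
      totalTokens: t.tokens,
    });
  });
  return stats.sort((a, b) => a.successRate - b.successRate);
}
```

Passing `now` explicitly keeps the function deterministic, which makes the weekly numbers reproducible when a report is regenerated.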

**Effort**: Medium

---

### P2 — Consider for Roadmap

#### 🧹 Code Simplifier (Continuous Cleanup)

**What**: Daily agent that analyzes recently modified TypeScript files and creates PRs with simplifications (extract helpers, remove nesting, simplify conditions).

**Why**: The codebase is growing complex — `src/docker-manager.ts` is noted as large and complex (AGENTS.md mentions it frequently). A continuous simplifier would help keep it manageable. Pelis Factory Code Simplifier had **83% merge rate**.

**How**: `gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/code-simplifier.md` then customize for TypeScript.

**Effort**: Low (template available)

---

#### 🔗 Link Checker

**What**: Daily/weekly automated check of all links in `docs/` and `docs-site/` to detect broken or outdated URLs.

**Why**: The repo has extensive docs (20+ files in `docs/`). External links to GitHub docs, Docker docs, and Squid documentation can go stale. A template is already available: `gh aw add-wizard githubnext/agentics/workflows/link-checker.md`.

**Effort**: Low (template available)

---

#### 🌐 Firewall Egress Test Reporter

**What**: A domain-specific daily report that validates the firewall's egress filtering is working correctly — testing that known-good domains are allowed and known-blocked domains are rejected.

**Why**: This is unique to this repository's domain. The firewall tool's own CI already has smoke tests, but a daily report of "what domains are being tested and with what pass/fail rates" would help maintainers spot coverage gaps in the integration tests. This is something no template covers — it's a custom workflow leveraging the repo's domain knowledge.

**How**: Daily workflow that reviews `tests/integration/` for domain coverage, identifies untested domains, and generates a report. Could evolve to propose new integration tests.
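
A minimal sketch of the coverage lookup is shown below, assuming the test file contents have already been read into memory. The function names, file paths, and domain list are hypothetical examples, not the repo's actual tests.

```typescript
interface DomainCoverage {
  domain: string;
  testedIn: string[]; // paths of test files that mention the domain
}

// For each domain of interest, list the test files whose text mentions it.
// This is a plain substring check, so "github.com" also matches "api.github.com".
function domainCoverage(
  testFiles: { [path: string]: string },
  domains: string[]
): DomainCoverage[] {
  const paths = Object.keys(testFiles).sort();
  return domains.map((domain) => ({
    domain,
    testedIn: paths.filter((p) => testFiles[p].indexOf(domain) !== -1),
  }));
}

// Domains that no test file references at all.
function untestedDomains(report: DomainCoverage[]): string[] {
  return report.filter((r) => r.testedIn.length === 0).map((r) => r.domain);
}

// Hypothetical inputs for illustration only.
const sampleTests = {
  "tests/integration/allow.test.ts": "curl https://github.com",
  "tests/integration/block.test.ts": "curl https://example.com",
};
const gaps = untestedDomains(
  domainCoverage(sampleTests, ["github.com", "example.com", "npmjs.org"])
);
// gaps is ["npmjs.org"]
```

The list of domains worth checking is itself a judgment call, which is where the agent's knowledge of the repo would come in.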

**Effort**: Medium

---

#### 📈 Metrics Collector (Agent Ecosystem Analytics)

**What**: Daily collection of agentic workflow performance metrics — runs per workflow, token usage, output quality, issue/PR creation rates.

**Why**: As the workflow count grows, understanding which agents are high-value vs. low-ROI becomes important. Pelis Factory's Metrics Collector created **41 daily discussions** and directly fed the Portfolio Analyst.

**Effort**: Medium

---

### P3 — Future Ideas

- **Mergefest**: Auto-merge `main` into long-lived PR branches to reduce conflicts
- **Sub Issue Closer**: Auto-close sub-issues when parent is resolved
- **Daily Accessibility Review**: Playwright-based accessibility testing of `docs-site/`
- **Weekly Issue Summary**: Summarize open issues for maintainers each week
- **Contribution Guidelines Checker**: Verify new PRs follow CONTRIBUTING.md, commit message format, etc.
- **Container CVE Notifier**: When new CVEs affect ubuntu:22.04 or ubuntu/squid base images, create an issue automatically

---

## 📈 Maturity Assessment

| Dimension | Score | Notes |
|-----------|-------|-------|
| **Security automation** | 5/5 | Best-in-class: 6 security workflows, multi-engine |
| **CI/CD quality** | 5/5 | CI Doctor, smoke tests, build-test across 8 languages |
| **Documentation maintenance** | 4/5 | Doc maintainer + CLI checker; missing link-checker |
| **Issue management** | 3/5 | Monster + dedup + plan; missing triage labels |
| **Code quality automation** | 2/5 | Test coverage improver; missing simplifier, dedup |
| **Meta-observability** | 2/5 | This advisor; missing health manager, audit, metrics |
| **ChatOps / slash commands** | 2/5 | Only `/plan`; missing `/pr-fix`, `/ask` |

**Current Overall Level: 4/5** — Mature, security-first, production-grade automation.

**Target Level: 5/5** — Add meta-observability layer and fill ChatOps + code quality gaps.

**Gap**: The meta-layer (health monitoring, audit, metrics) is the biggest gap. Without it, it's hard to know if the existing agents are working well. The `[agentics]` failure issues (#1097, #1101, #1105) sitting unresolved illustrate this — a Workflow Health Manager would have caught and investigated these automatically.

---

## 🔄 Comparison with Best Practices

### What This Repo Does Well (vs Pelis Factory)
- ✅ **Security specialization**: The 3 secret-digger variants (claude/codex/copilot) are more sophisticated than typical Factory security workflows
- ✅ **Domain-specific smoke tests**: Multi-engine firewall smoke tests are a unique strength
- ✅ **Cache-memory adoption**: `issue-duplication-detector` uses cache-memory for cross-run state
- ✅ **CI Doctor**: One of the most impactful workflows from the Factory, already implemented
- ✅ **Causal chains**: Issue Monster + Copilot coding agent creates a dispatch chain
- ✅ **Multiple engines**: Using claude, codex, and copilot gives redundancy and comparison

### What Could Improve
- ⚠️ **Meta-layer gap**: No workflow audits the health of the other agentic workflows
- ⚠️ **No issue triage labels**: Issues go unlabeled; maintainers manually categorize
- ⚠️ **No code simplification**: Codebase growing in complexity without cleanup automation
- ⚠️ **ChatOps limited to `/plan`**: `/pr-fix` would be very high-value for this repo

### Unique Opportunities Given the Domain
1. **Self-testing the firewall daily** — The tool's own security is the product; daily automated egress validation reports are uniquely valuable here
2. **Iptables/container security audit** — Daily analysis of recent changes to `setup-iptables.sh`, `entrypoint.sh`, and `host-iptables.ts` for security regressions — deeper than the current security-guard PR review
3. **Schema-conformance for Squid config** — Validating generated Squid configs against known-good patterns is uniquely valuable for a proxy config tool

---

*Generated by [Pelis Agent Factory Advisor](https://github.github.io/gh-aw/blog/2026-01-12-welcome-to-pelis-agent-factory/). Notes saved to cache memory for trend tracking across runs.*

---

> **Note:** This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.
>
> **Tip:** Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.




> Generated by [Pelis Agent Factory Advisor](https://github.com/github/gh-aw-firewall/actions/runs/22512136617)
> - [x] expires on Mar 7, 2026, 3:18 AM UTC
