Current CI/CD Pipeline Status
The repository has a mature and extensive CI/CD setup with 56 total GitHub Actions workflows (28 standard .yml + 28 compiled agentic .lock.yml workflows). The pipeline covers build, lint, type-check, unit tests, integration tests, security scanning, container scanning, and AI-assisted smoke tests.
Workflow Inventory
| Category | Count | Trigger |
|---|---|---|
| Standard CI/CD workflows (.yml) | 20 | PR / push to main / schedule |
| Agentic workflows (.md / .lock.yml) | 28 | PR / schedule / reaction |
| Total | 48 active | - |
Health at a Glance
Recent scheduled run sample (last 30 runs from the agentic scheduler):
| Workflow | Status |
|---|---|
| Secret Digger (Claude) | Success |
| Secret Digger (Codex) | Success |
| Secret Digger (Copilot) | |
| Issue Monster | Success |
| Agentic Maintenance | Success |
| CI Doctor | skipped (7/7 runs); monitoring may not be triggering correctly |
Existing Quality Gates
On Every PR (pull_request trigger to main)
| Check | Workflow | What It Verifies |
|---|---|---|
| Build Verification | build.yml | TypeScript compile on Node 20 & 22, dist output exists, API proxy unit tests |
| ESLint | lint.yml | Code style / static analysis of src/ |
| TypeScript Type Check | test-integration.yml | tsc --noEmit strict mode check |
| Test Coverage | test-coverage.yml | Jest unit tests + coverage delta vs base branch, PR comment |
| Integration Tests | test-integration-suite.yml | 4 parallel Docker-based test jobs (domain/network, protocol/security, container/ops, API proxy) |
| Chroot Integration Tests | test-chroot.yml | 4 parallel jobs: language runtimes, package managers, /proc FS, edge cases |
| Examples Test | test-examples.yml | End-to-end execution of examples/*.sh scripts |
| Test Setup Action | test-action.yml | action.yml self-test (latest version, specific version, image pull, invalid version) |
| CodeQL | codeql.yml | Static security analysis (JavaScript/TypeScript + GitHub Actions) |
| Container Security Scan | container-scan.yml | Trivy CRITICAL/HIGH CVE scan of agent and squid images (path-filtered to containers/**) |
| Dependency Audit | dependency-audit.yml | npm audit --audit-level=high for main and docs-site packages |
| PR Title Check | pr-title.yml | Conventional Commits format enforcement |
| AI Security Guard | security-guard.lock.yml | Claude reviews PR diff for security regressions |
| Build-Test Workflows | 8 agentic workflows | Real-world project builds (Go, Rust, Java, Node, Bun, C++, Deno, .NET) through the firewall |
| Smoke Tests | 4 agentic workflows | Claude/Codex/Copilot/Chroot end-to-end agent execution (reaction or PR triggered) |
Recurring (Scheduled, not PR-blocking)
- Weekly: dependency-audit.yml, container-scan.yml, CodeQL
- Daily: security-review.md, dependency-security-monitor.md, doc-maintainer.md, ci-cd-gaps-assessment.md
- Hourly: issue-monster.md, secret-digger-*.md
Identified Gaps
High Priority
1. Critically low unit test coverage on core modules
docker-manager.ts (the most complex file, with ~250 statements and 25 functions) has only 18% statement coverage and 4% function coverage. cli.ts (the main entry point, ~69 statements) has 0% coverage. These files contain the container lifecycle logic, cleanup handlers, signal processing, and exit code propagation, all of which are critical paths.
Current overall coverage: 38% statements / 31% branches with very low enforcement thresholds (38% / 30%).
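One possible way to raise the bar on the two modules without lifting the global floor all at once is Jest's per-path coverageThreshold entries. The fragment below is a sketch under stated assumptions: the ./src/ paths and the 60% per-file floors are hypothetical (60% echoes the target suggested in the recommendations table), and the report does not say whether function or line thresholds are enforced, so only statements and branches appear.

```typescript
// Hypothetical jest.config.ts fragment. In a real config this object would be
// the file's default export. Paths and per-file percentages are assumptions.
const coverageConfig = {
  collectCoverage: true,
  coverageThreshold: {
    // Global enforcement levels quoted in this report.
    global: { statements: 38, branches: 30 },
    // Assumed locations of the two under-tested modules.
    "./src/docker-manager.ts": { statements: 60 },
    "./src/cli.ts": { statements: 60 },
  },
};

console.log(Object.keys(coverageConfig.coverageThreshold));
```

Jest checks per-path entries separately from the global entry, so a drop in either file would fail the run even when the overall percentage stays above the global floor.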
2. chroot-copilot-home.test.ts not wired to any CI workflow
The file tests/integration/chroot-copilot-home.test.ts exists but is not included in any --testPathPatterns in test-integration-suite.yml or test-chroot.yml. These tests never run in CI, meaning regressions in Copilot home directory handling will go undetected.
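A gap like this could be caught automatically by comparing the integration test files against the workflow definitions. The following is a minimal sketch, not the project's tooling: the function names are hypothetical, the second test file name and the workflow text are made up, and matching on the file's base name is a heuristic that assumes the --testPathPatterns values contain that name literally.

```typescript
// Sketch: report integration test files that no workflow text mentions.
// Substring matching on the base name is a heuristic, not a full pattern match.

/** Strip directories and the ".test.ts" suffix: "a/b/foo.test.ts" -> "foo". */
function testBaseName(path: string): string {
  const file = path.split("/").pop() ?? path;
  return file.replace(/\.test\.ts$/, "");
}

/** Return the test files whose base name appears in none of the workflows. */
function findUnwiredTests(testFiles: string[], workflowTexts: string[]): string[] {
  return testFiles.filter((file) => {
    const name = testBaseName(file);
    return !workflowTexts.some((text) => text.includes(name));
  });
}

// Example with made-up workflow contents (not the real files).
const unwired = findUnwiredTests(
  [
    "tests/integration/chroot-copilot-home.test.ts",
    "tests/integration/chroot-languages.test.ts", // hypothetical file name
  ],
  ["npx jest --testPathPatterns 'chroot-languages'"],
);
console.log(unwired); // only chroot-copilot-home.test.ts is reported
```

In practice the two inputs would be read from tests/integration/ and .github/workflows/; a non-empty result could fail the check.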
3. build-test-node.md is uncompiled
agenticworkflows-status reports build-test-node with compiled: "No". This means the Node.js build-test workflow (which exercises real npm projects through the firewall) does not execute in CI. Node.js is the primary ecosystem for this project.
4. API Proxy container is not included in container-scan.yml
The Trivy scan in container-scan.yml covers the agent and squid images but not the api-proxy container. The API proxy is a Node.js HTTP server that handles authentication token injection, which makes it a high-value security target. Its base image and npm dependencies should be scanned for CVEs on every change to containers/api-proxy/**.
Medium Priority
5. Duplicate ESLint execution
Both build.yml and lint.yml run npm run lint on every PR. This is redundant and wastes ~1-2 minutes of CI time per PR. One of these should be removed or consolidated.
6. No code formatting check (Prettier not enforced)
The project has ESLint but no Prettier or --fix enforcement. Code style inconsistencies can accumulate silently. There is no formatting gate preventing unformatted code from merging.
7. No shell script linting (shellcheck)
The repository contains multiple shell scripts in containers/agent/ (setup-iptables.sh, entrypoint.sh), containers/squid/, and scripts/ci/. These scripts contain security-critical logic (iptables setup, capability drops) but are not validated by shellcheck in CI. Shell bugs in these scripts could silently weaken the firewall.
8. container-scan.yml only triggers on containers/** path changes
The container scan is path-filtered to containers/**, meaning PRs that change the container base images or packages indirectly (e.g., through apt calls in scripts referenced by Dockerfiles) won't trigger a scan. The weekly schedule catches this eventually, but a window exists.
9. No binary artifact size monitoring
The release pipeline builds standalone binaries (awf-linux-x64, awf-darwin-arm64, etc.). There is no check to detect unexpected size increases (which could indicate accidental large dependency inclusion). A simple size threshold check in the release workflow or a separate PR check would catch this.
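The size threshold check described above could look roughly like this. It is a sketch only: the function and type names are hypothetical, and both the sizes and the limits are placeholder numbers, not measurements of the real binaries.

```typescript
// Sketch: flag binaries whose size exceeds a per-artifact limit.
// All numbers below are placeholders, not measured values for this project.

interface SizeViolation {
  name: string;
  sizeBytes: number;
  limitBytes: number;
}

function findOversizedBinaries(
  sizes: Record<string, number>,
  limits: Record<string, number>,
): SizeViolation[] {
  const violations: SizeViolation[] = [];
  for (const [name, sizeBytes] of Object.entries(sizes)) {
    const limitBytes = limits[name];
    // Artifacts without a configured limit are skipped rather than failed.
    if (limitBytes !== undefined && sizeBytes > limitBytes) {
      violations.push({ name, sizeBytes, limitBytes });
    }
  }
  return violations;
}

const MiB = 1024 * 1024;
const violations = findOversizedBinaries(
  { "awf-linux-x64": 95 * MiB, "awf-darwin-arm64": 60 * MiB }, // made-up sizes
  { "awf-linux-x64": 80 * MiB, "awf-darwin-arm64": 80 * MiB }, // placeholder limits
);
console.log(violations.map((v) => v.name)); // only awf-linux-x64 exceeds its limit
```

In a workflow step the sizes would come from the built artifacts (for example via fs.statSync(path).size), and a non-empty result would fail the job.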
10. CI Doctor shows all-skipped runs
All 7 recent CI Doctor runs have conclusion skipped. The CI Doctor workflow monitors the health of other workflows via workflow_run trigger, but if the triggering workflows aren't completing as expected (or the name list is stale), the doctor never fires. This monitoring gap means workflow regressions (broken workflows that stop running entirely) may go unnoticed.
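If a stale name list is the cause, it could be detected by diffing the names in the workflow_run trigger against the names that workflows currently declare. The sketch below assumes the trigger matches on exact workflow names; the function name and all workflow names in the example are illustrative, not taken from ci-doctor.md.

```typescript
// Sketch: find names listed in a workflow_run trigger that no current
// workflow declares. Exact-name matching is an assumption.

function findStaleTriggerNames(triggerNames: string[], declaredNames: string[]): string[] {
  const declared = new Set(declaredNames);
  return triggerNames.filter((name) => !declared.has(name));
}

// Illustrative names only.
const stale = findStaleTriggerNames(
  ["Build Verification", "Lint", "Integration Tests"],
  ["Build Verification", "ESLint", "Integration Tests"],
);
console.log(stale); // "Lint" no longer matches any declared workflow name
```

An empty result would point away from the name list and toward the other possibility raised above, namely that the triggering workflows are not completing as expected.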
11. No integration test coverage for docs-site
The docs-site/ Astro/Starlight documentation site has its own package.json and its dependencies are audited, but there is no build test for it in CI (only deploy-docs.yml, which deploys but doesn't test the build on PRs that don't change docs). A broken docs build on a non-doc PR would only be caught at deploy time.
Low Priority
12. No dependency license compliance check
There is no check for license compatibility of new npm dependencies. A contributor could introduce a GPL-licensed dependency that conflicts with the project's MIT license without CI catching it. Tools like license-checker or licensee could be added.
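The core of such a check is an allow-list comparison. The sketch below uses the allow-list from the recommendations table; the input shape (package name mapped to an SPDX identifier) and the package names are assumptions for illustration, and real tools report richer data.

```typescript
// Sketch: check dependency licenses against an allow-list.
// Input shape and package names are hypothetical.

const ALLOWED_LICENSES = new Set([
  "MIT", "ISC", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "CC0-1.0",
]);

function findDisallowedLicenses(licenses: Record<string, string>): string[] {
  return Object.entries(licenses)
    .filter(([, license]) => !ALLOWED_LICENSES.has(license))
    .map(([pkg, license]) => `${pkg} (${license})`);
}

const offenders = findDisallowedLicenses({
  "example-a": "MIT", // hypothetical packages
  "example-b": "GPL-3.0-only",
});
console.log(offenders); // only example-b is reported
```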
13. No performance regression benchmarks
The firewall's container startup time and proxy latency are important UX metrics. There are no benchmarks tracking these across PRs. While this is complex to implement correctly, even a simple "time to first byte" check in the integration tests would surface major regressions.
14. No test flakiness tracking or retry mechanism
Integration tests using Docker containers can have intermittent failures (network timing, container startup races). There's no flakiness tracking or automatic retry configured in the integration test workflows. This leads to manual re-runs and reduces developer confidence in the CI signal.
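Within test files, Jest provides jest.retryTimes(n) for re-running failed tests. For setup steps outside a test body (pulling an image, waiting for a container), a small retry wrapper is one option. The helper below is a generic sketch with hypothetical names; the flaky step is simulated rather than a real Docker call.

```typescript
// Sketch: retry an async operation a fixed number of times.

async function withRetries<T>(operation: () => Promise<T>, retries: number): Promise<T> {
  let lastError: unknown;
  // One initial attempt plus `retries` additional attempts.
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Simulated flaky step: fails twice, then succeeds.
let calls = 0;
const flakyStep = async (): Promise<string> => {
  calls += 1;
  if (calls < 3) throw new Error(`transient failure #${calls}`);
  return "ok";
};

const retryResult = withRetries(flakyStep, 2);
retryResult.then((value) => console.log(value, calls)); // ok 3
```

Retries hide flakiness as well as absorb it, so logging each failed attempt would help with the tracking side of this gap.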
15. SECRET_DIGGER_COPILOT has a 60% failure rate
The scheduled Secret Digger (Copilot) workflow shows 3 failures out of 5 recent runs. This recurring failure should be investigated to determine if it's a token/quota issue or a workflow bug.
Actionable Recommendations
| # | Gap | Recommended Solution | Complexity | Impact |
|---|---|---|---|---|
| 1 | Low coverage on docker-manager.ts/cli.ts | Add unit tests using Jest mocks for execa and file system; target 60%+ coverage | High | High |
| 2 | chroot-copilot-home.test.ts not in CI | Add chroot-copilot-home to a --testPathPatterns in test-chroot.yml | Low | High |
| 3 | build-test-node.md uncompiled | Run gh aw compile .github/workflows/build-test-node.md && npx tsx scripts/ci/postprocess-smoke-workflows.ts | Low | High |
| 4 | No API proxy container scan | Add a scan-api-proxy job to container-scan.yml mirroring the existing scan-agent job | Low | High |
| 5 | Duplicate ESLint | Remove lint.yml (keep ESLint in build.yml); or remove lint from build.yml and keep lint.yml | Low | Medium |
| 6 | No Prettier enforcement | Add prettier --check step to build.yml or create a dedicated formatting workflow | Low | Medium |
| 7 | No shellcheck | Add shellcheck containers/**/*.sh scripts/ci/*.sh step to build.yml | Low | High |
| 8 | Container scan path filter too narrow | Add 'containers/api-proxy/**' to container-scan.yml paths | Low | Medium |
| 9 | No binary size monitoring | Add a step in release.yml to assert each binary is within expected size bounds | Low | Low |
| 10 | CI Doctor skipping | Audit the workflow_run trigger list in ci-doctor.md and recompile | Low | Medium |
| 11 | Docs site not built on PRs | Add a docs build step (npm run docs:build) to build.yml or a dedicated docs-check workflow | Low | Medium |
| 12 | No license check | Add npx license-checker --onlyAllow 'MIT;ISC;Apache-2.0;BSD-2-Clause;BSD-3-Clause;CC0-1.0' to dependency-audit | Low | Low |
| 14 | No test retry | Retry Docker-dependent Jest integration tests, e.g. with jest.retryTimes(2) (Jest has no --retries CLI flag) | Low | Medium |
| 15 | Secret Digger failures | Investigate Copilot token/quota issues in secret-digger-copilot.md | Medium | Medium |
Metrics Summary
| Metric | Value |
|---|---|
| Total workflows | 56 (48 active) |
| Workflows triggering on PR | ~28 |
| Unit test statement coverage | 38.39% |
| Unit test branch coverage | 31.78% |
| Integration test files | 27 |
| Integration tests not wired to CI | 1 (chroot-copilot-home.test.ts) |
| Agentic workflows uncompiled | 1 (build-test-node.md) |
| Container images scanned | 2 of 3 (api-proxy missing) |
| Recent Secret Digger Copilot failure rate | 60% (3/5 runs) |
| CI Doctor effectiveness | All 7 recent runs skipped |
Assessment generated by ci-cd-gaps-assessment workflow on 2026-03-01. Workflow run: #22553999520
Note: This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.
Tip: Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.
Generated by CI/CD Pipelines and Integration Tests Gap Assessment - expires on Mar 8, 2026, 10:20 PM UTC