[CI/CD Assessment] CI/CD Pipelines and Integration Tests Gap Assessment — March 2026

## 📊 Current CI/CD Pipeline Status

The repository has a **mature and extensive CI/CD setup** with **56 total GitHub Actions workflows** (28 standard `.yml` + 28 compiled agentic `.lock.yml` workflows), of which 48 are active. The pipeline covers build, lint, type-check, unit tests, integration tests, security scanning, container scanning, and AI-assisted smoke tests.

### Workflow Inventory

| Category | Count | Trigger |
|----------|-------|---------|
| Standard CI/CD workflows (`.yml`) | 20 | PR / push to main / schedule |
| Agentic workflows (`.md` / `.lock.yml`) | 28 | PR / schedule / reaction |
| **Total** | **48 active** | — |

### Health at a Glance

Recent scheduled run sample (last 30 runs from the agentic scheduler):

| Workflow | Status |
|----------|--------|
| Secret Digger (Claude) | ✅ Success |
| Secret Digger (Codex) | ✅ Success |
| Secret Digger (Copilot) | ⚠️ Mixed (3 failures / 5 runs) |
| Issue Monster | ✅ Success |
| Agentic Maintenance | ✅ Success |
| CI Doctor | ⚠️ All `skipped` (7/7 runs) — monitoring may not be triggering correctly |

---

## ✅ Existing Quality Gates

### On Every PR (`pull_request` trigger to `main`)

| Check | Workflow | What It Verifies |
|-------|----------|-----------------|
| Build Verification | `build.yml` | TypeScript compile on Node 20 & 22, dist output exists, API proxy unit tests |
| ESLint | `lint.yml` | Code style / static analysis of `src/` |
| TypeScript Type Check | `test-integration.yml` | `tsc --noEmit` strict mode check |
| Test Coverage | `test-coverage.yml` | Jest unit tests + coverage delta vs base branch, PR comment |
| Integration Tests | `test-integration-suite.yml` | 4 parallel Docker-based test jobs (domain/network, protocol/security, container/ops, API proxy) |
| Chroot Integration Tests | `test-chroot.yml` | 4 parallel jobs: language runtimes, package managers, `/proc` FS, edge cases |
| Examples Test | `test-examples.yml` | End-to-end execution of `examples/*.sh` scripts |
| Test Setup Action | `test-action.yml` | `action.yml` self-test (latest version, specific version, image pull, invalid version) |
| CodeQL | `codeql.yml` | Static security analysis (JavaScript/TypeScript + GitHub Actions) |
| Container Security Scan | `container-scan.yml` | Trivy CRITICAL/HIGH CVE scan of agent and squid images (path-filtered to `containers/**`) |
| Dependency Audit | `dependency-audit.yml` | `npm audit --audit-level=high` for main and docs-site packages |
| PR Title Check | `pr-title.yml` | Conventional Commits format enforcement |
| AI Security Guard | `security-guard.lock.yml` | Claude reviews PR diff for security regressions |
| Build-Test Workflows | 8 agentic workflows | Real-world project builds (Go, Rust, Java, Node, Bun, C++, Deno, .NET) through the firewall |
| Smoke Tests | 4 agentic workflows | Claude/Codex/Copilot/Chroot end-to-end agent execution (reaction or PR triggered) |

### Recurring (Scheduled, not PR-blocking)

- Weekly: `dependency-audit.yml`, `container-scan.yml`, CodeQL
- Daily: `security-review.md`, `dependency-security-monitor.md`, `doc-maintainer.md`, `ci-cd-gaps-assessment.md`
- Hourly: `issue-monster.md`, `secret-digger-*.md`

---

## 🔍 Identified Gaps

### 🔴 High Priority

#### 1. Critically low unit test coverage on core modules

`docker-manager.ts` (the most complex file — ~250 statements, 25 functions) has only **18% statement coverage** and **4% function coverage**. `cli.ts` (the main entry point, ~69 statements) has **0% coverage**. These files contain the container lifecycle logic, cleanup handlers, signal processing, and exit code propagation — all of which are critical paths.

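Current overall coverage: **38% statements / 31% branches**. The enforcement thresholds (38% / 30%) sit at or just below the current numbers, so they only guard against regressions and do nothing to push coverage up.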
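
One way to tighten enforcement on the critical files is a per-file floor checked against the coverage report. The sketch below assumes the report is available in Istanbul's `json-summary` shape (file path mapped to per-metric `pct` values). The function name `findUnderCovered` and the floor values are hypothetical, not part of the repository.

```typescript
// Minimal subset of one entry in an Istanbul `json-summary` report.
interface Metric {
  pct: number;
}

interface FileCoverage {
  statements: Metric;
  functions: Metric;
}

type CoverageSummary = Record<string, FileCoverage>;

// Hypothetical per-file floors, in percent.
interface Floor {
  statements: number;
  functions: number;
}

// Returns one message per violated floor. Files are matched by path suffix
// so that absolute paths in the report still resolve.
function findUnderCovered(
  summary: CoverageSummary,
  floors: Record<string, Floor>,
): string[] {
  const problems: string[] = [];
  for (const [suffix, floor] of Object.entries(floors)) {
    const entry = Object.entries(summary).find(([path]) => path.endsWith(suffix));
    if (entry === undefined) {
      problems.push(`${suffix}: missing from coverage report`);
      continue;
    }
    const coverage = entry[1];
    if (coverage.statements.pct < floor.statements) {
      problems.push(`${suffix}: statements ${coverage.statements.pct}% < ${floor.statements}%`);
    }
    if (coverage.functions.pct < floor.functions) {
      problems.push(`${suffix}: functions ${coverage.functions.pct}% < ${floor.functions}%`);
    }
  }
  return problems;
}
```

A CI step could fail the job whenever the returned list is non-empty, which keeps the global threshold unchanged while ratcheting up `docker-manager.ts` and `cli.ts` separately.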

#### 2. `chroot-copilot-home.test.ts` not wired to any CI workflow

The file `tests/integration/chroot-copilot-home.test.ts` exists but is not included in any `--testPathPatterns` in `test-integration-suite.yml` or `test-chroot.yml`. These tests never run in CI, meaning regressions in Copilot home directory handling will go undetected.
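
A small guard can prevent this class of gap from recurring: compare the integration test files on disk against the patterns passed to Jest in the workflows. The sketch below treats each pattern as a regular expression matched against the file path, which is how Jest interprets test path patterns. The function name and the way patterns are collected from the workflow files are assumptions.

```typescript
// Returns the test files that no pattern matches, i.e. tests that would
// never be selected by any CI job. Patterns are treated as regular
// expressions tested against the full file path.
function findUnwiredTests(testFiles: string[], patterns: string[]): string[] {
  const regexes = patterns.map((pattern) => new RegExp(pattern));
  return testFiles.filter((file) => !regexes.some((re) => re.test(file)));
}
```

Run as a pre-merge check, a non-empty result would flag a file like `chroot-copilot-home.test.ts` as soon as it is added without a matching workflow entry.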

#### 3. `build-test-node.md` is uncompiled

`agenticworkflows-status` reports `build-test-node` with `compiled: "No"`. This means the Node.js build-test workflow (which exercises real npm projects through the firewall) **does not execute in CI**. Node.js is the primary ecosystem for this project.

#### 4. API Proxy container is not included in `container-scan.yml`

The Trivy scan in `container-scan.yml` covers the `agent` and `squid` images but **not the `api-proxy` container**. The API proxy is a Node.js HTTP server that handles authentication token injection — a high-value security target. Its base image and npm dependencies should be scanned for CVEs on every change to `containers/api-proxy/**`.

---

### 🟡 Medium Priority

#### 5. Duplicate ESLint execution

Both `build.yml` and `lint.yml` run `npm run lint` on every PR. This is redundant and wastes ~1–2 minutes of CI time per PR. One of these should be removed or consolidated.

#### 6. No code formatting check (Prettier not enforced)

The project has ESLint but no Prettier or `--fix` enforcement. Code style inconsistencies can accumulate silently. There is no formatting gate preventing unformatted code from merging.

#### 7. No shell script linting (shellcheck)

The repository contains multiple shell scripts in `containers/agent/` (`setup-iptables.sh`, `entrypoint.sh`), `containers/squid/`, and `scripts/ci/`. These scripts contain security-critical logic (iptables setup, capability drops) but are **not validated by `shellcheck`** in CI. Shell bugs in these scripts could silently weaken the firewall.

#### 8. `container-scan.yml` only triggers on `containers/**` path changes

The container scan is path-filtered to `containers/**`, meaning PRs that change the container base images or packages indirectly (e.g., through `apt` calls in scripts referenced by Dockerfiles) won't trigger a scan. The weekly schedule catches this eventually, but a window exists.

#### 9. No binary artifact size monitoring

The release pipeline builds standalone binaries (`awf-linux-x64`, `awf-darwin-arm64`, etc.). There is no check to detect unexpected size increases (which could indicate accidental large dependency inclusion). A simple size threshold check in the release workflow or a separate PR check would catch this.
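
As a rough illustration, the size check can be a pure comparison of measured sizes against configured limits. The limits, the function name, and the way sizes are measured (for example with `fs.statSync`) are assumptions here, not existing project code.

```typescript
// Compares measured binary sizes (in bytes) against per-binary limits.
// A binary without a configured limit is reported too, so new release
// targets cannot slip through unchecked.
function checkBinarySizes(
  actualBytes: Record<string, number>,
  maxBytes: Record<string, number>,
): string[] {
  const violations: string[] = [];
  for (const [name, size] of Object.entries(actualBytes)) {
    const limit = maxBytes[name];
    if (limit === undefined) {
      violations.push(`${name}: no size limit configured`);
    } else if (size > limit) {
      violations.push(`${name}: ${size} bytes exceeds limit of ${limit} bytes`);
    }
  }
  return violations;
}
```

The release job would fail when the list is non-empty. Limits are best set with some headroom above the current sizes so that ordinary growth does not trip the check.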

#### 10. CI Doctor shows all-`skipped` runs

All 7 recent CI Doctor runs have conclusion `skipped`. The CI Doctor workflow monitors the health of other workflows via a `workflow_run` trigger, but if the triggering workflows aren't completing as expected (or the name list is stale), the doctor never fires. This monitoring gap means workflow regressions (broken workflows that stop running entirely) may go unnoticed.

#### 11. No integration test coverage for `docs-site`

The `docs-site/` Astro/Starlight documentation site has its own `package.json`, and its dependencies are audited, but CI has no build test for it. The only workflow that builds the site is `deploy-docs.yml`, which deploys it but does not test the build on PRs that don't change docs. A docs build broken by a non-doc PR would only be caught at deploy time.

---

### 🟢 Low Priority

#### 12. No dependency license compliance check

There is no check for license compatibility of new npm dependencies. A contributor could introduce a GPL-licensed dependency that conflicts with the project's MIT license without CI catching it. Tools like `license-checker` or `licensee` could be added.
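
The core of such a check is an allowlist comparison. The sketch below assumes a map from package identifier to license string has already been produced by a tool such as `license-checker`. The function name is hypothetical, and the matching is deliberately exact, so compound SPDX expressions like `(MIT OR Apache-2.0)` would be flagged for manual review.

```typescript
// Licenses treated as compatible in this sketch.
const ALLOWED_LICENSES: readonly string[] = [
  "MIT",
  "ISC",
  "Apache-2.0",
  "BSD-2-Clause",
  "BSD-3-Clause",
  "CC0-1.0",
];

// Returns "package: license" for every package whose license string is not
// an exact match for an allowlisted identifier, sorted for stable output.
function findDisallowedLicenses(
  packages: Record<string, string>,
  allowed: readonly string[] = ALLOWED_LICENSES,
): string[] {
  return Object.entries(packages)
    .filter(([, license]) => !allowed.includes(license))
    .map(([name, license]) => `${name}: ${license}`)
    .sort();
}
```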

#### 13. No performance regression benchmarks

The firewall's container startup time and proxy latency are important UX metrics. There are no benchmarks tracking these across PRs. While this is complex to implement correctly, even a simple "time to first byte" check in the integration tests would surface major regressions.
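
A simple form of this check compares the median of a few timing samples against a stored baseline with a tolerance. The sketch below is one possible approach, not an existing benchmark harness. The 20% default tolerance and the function names are assumptions.

```typescript
// Median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median requires at least one sample");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Flags a regression when the candidate median exceeds the baseline median
// by more than `tolerance` (0.2 means 20% slower).
function isRegression(
  baselineMs: number[],
  candidateMs: number[],
  tolerance: number = 0.2,
): boolean {
  return median(candidateMs) > median(baselineMs) * (1 + tolerance);
}
```

Using the median rather than the mean makes the comparison less sensitive to a single slow container start on a noisy CI runner.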

#### 14. No test flakiness tracking or retry mechanism

Integration tests using Docker containers can have intermittent failures (network timing, container startup races). There's no flakiness tracking or automatic retry configured in the integration test workflows. This leads to manual re-runs and reduces developer confidence in the CI signal.
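
Flakiness tracking can start from a simple definition: a test is flaky if it both passed and failed on the same commit. The sketch below applies that definition to a list of recorded results. The record shape and the function name are hypothetical, and collecting the records from CI is left out.

```typescript
// One recorded execution of a test in CI.
interface TestRun {
  test: string;
  commit: string;
  passed: boolean;
}

// Returns the sorted names of tests that have both a passing and a failing
// run on the same commit. Failures on one commit and passes on another do
// not count, since the code under test differs.
function findFlakyTests(runs: TestRun[]): string[] {
  const outcomes = new Map<string, { test: string; passed: boolean; failed: boolean }>();
  for (const run of runs) {
    const key = JSON.stringify([run.test, run.commit]);
    const entry = outcomes.get(key) ?? { test: run.test, passed: false, failed: false };
    if (run.passed) {
      entry.passed = true;
    } else {
      entry.failed = true;
    }
    outcomes.set(key, entry);
  }
  const flaky = new Set<string>();
  outcomes.forEach((entry) => {
    if (entry.passed && entry.failed) {
      flaky.add(entry.test);
    }
  });
  return Array.from(flaky).sort();
}
```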

#### 15. `SECRET_DIGGER_COPILOT` has a 60% failure rate

The scheduled `Secret Digger (Copilot)` workflow shows 3 failures out of 5 recent runs. This recurring failure should be investigated to determine if it's a token/quota issue or a workflow bug.
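
Both this failure rate and the all-`skipped` CI Doctor pattern can be derived mechanically from run conclusions. The sketch below assumes the conclusions have already been fetched (for example from the GitHub Actions API). The type and function names are hypothetical.

```typescript
// Subset of GitHub Actions run conclusions relevant to this summary.
type Conclusion = "success" | "failure" | "skipped" | "cancelled";

interface RunHealth {
  // Failures divided by runs that actually executed (skipped runs excluded).
  failureRate: number;
  // True when there is at least one run and every run was skipped.
  allSkipped: boolean;
}

function summarizeRuns(conclusions: Conclusion[]): RunHealth {
  const executed = conclusions.filter((c) => c !== "skipped");
  const failures = executed.filter((c) => c === "failure").length;
  return {
    failureRate: executed.length === 0 ? 0 : failures / executed.length,
    allSkipped: conclusions.length > 0 && executed.length === 0,
  };
}
```

With 3 failures in 5 executed runs the function reports a failure rate of 0.6, and 7 skipped runs yield `allSkipped: true`, matching the two cases described in this assessment.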

---

## 📋 Actionable Recommendations

| # | Gap | Recommended Solution | Complexity | Impact |
|---|-----|---------------------|------------|--------|
| 1 | Low coverage on `docker-manager.ts`/`cli.ts` | Add unit tests using Jest mocks for `execa` and file system; target 60%+ coverage | High | High |
| 2 | `chroot-copilot-home.test.ts` not in CI | Add `chroot-copilot-home` to a `--testPathPatterns` in `test-chroot.yml` | Low | High |
| 3 | `build-test-node.md` uncompiled | Run `gh aw compile .github/workflows/build-test-node.md && npx tsx scripts/ci/postprocess-smoke-workflows.ts` | Low | High |
| 4 | No API proxy container scan | Add a `scan-api-proxy` job to `container-scan.yml` mirroring the existing `scan-agent` job | Low | High |
| 5 | Duplicate ESLint | Remove `lint.yml` (keep ESLint in `build.yml`); or remove lint from `build.yml` and keep `lint.yml` | Low | Medium |
| 6 | No Prettier enforcement | Add `prettier --check` step to `build.yml` or create a dedicated formatting workflow | Low | Medium |
| 7 | No shellcheck | Add `shellcheck containers/**/*.sh scripts/ci/*.sh` step to `build.yml` | Low | High |
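| 8 | Container scan path filter too narrow | Extend `paths` in `container-scan.yml` to cover files outside `containers/**` that the Dockerfiles reference (`containers/api-proxy/**` is already matched by the existing filter) | Low | Medium |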
| 9 | No binary size monitoring | Add a step in `release.yml` to assert each binary is within expected size bounds | Low | Low |
| 10 | CI Doctor skipping | Audit the workflow_run trigger list in `ci-doctor.md` and recompile | Low | Medium |
| 11 | Docs site not built on PRs | Add a docs build step (`npm run docs:build`) to `build.yml` or a dedicated docs-check workflow | Low | Medium |
| 12 | No license check | Add `npx license-checker --onlyAllow 'MIT;ISC;Apache-2.0;BSD-2-Clause;BSD-3-Clause;CC0-1.0'` to dependency-audit | Low | Low |
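| 14 | No test retry | Call `jest.retryTimes(2)` in the setup for Docker-dependent integration tests (Jest has no `--retries` CLI flag; `retryTimes` requires the `jest-circus` runner, which is the default) | Low | Medium |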
| 15 | Secret Digger failures | Investigate Copilot token/quota issues in `secret-digger-copilot.md` | Medium | Medium |

---

## 📈 Metrics Summary

| Metric | Value |
|--------|-------|
| Total workflows | 56 (48 active) |
| Workflows triggering on PR | ~28 |
| Unit test statement coverage | 38.39% |
| Unit test branch coverage | 31.78% |
| Integration test files | 27 |
| Integration tests not wired to CI | 1 (`chroot-copilot-home.test.ts`) |
| Agentic workflows uncompiled | 1 (`build-test-node.md`) |
| Container images scanned | 2 of 3 (api-proxy missing) |
| Recent Secret Digger Copilot failure rate | 60% (3/5 runs) |
| CI Doctor effectiveness | ⚠️ All recent runs skipped |

---

*Assessment generated by `ci-cd-gaps-assessment` workflow on 2026-03-01. Workflow run: [#22553999520](https://github.com/github/gh-aw-firewall/actions/runs/22553999520)*

---

> **Note:** This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.
>
> **Tip:** Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.




> Generated by [CI/CD Pipelines and Integration Tests Gap Assessment](https://github.com/github/gh-aw-firewall/actions/runs/22553999520)
> - [x] expires on Mar 8, 2026, 10:20 PM UTC
