[repository-quality] 🎯 Repository Quality Improvement Report - Agentic Workflow Runtime Safety & Compliance (2026-05-27) #35222

2026-05-27T14:07:54Z

github-actions[bot]
Bot May 27, 2026

🎯 Repository Quality Improvement Report - Agentic Workflow Runtime Safety & Compliance

Analysis Date: 2026-05-27
Focus Area: Agentic Workflow Runtime Safety & Compliance
Strategy Type: Custom
Custom Area: Yes — this repository has 236 agentic workflow markdown files with complex runtime configurations (engines, timeouts, A/B experiments, token budgets). No standard category captures these workflow-authoring compliance concerns.

Executive Summary

Analysis of 236 workflow markdown files in .github/workflows/ reveals four actionable compliance gaps affecting production reliability. The most critical: 12 workflows lack timeout-minutes, including 6 with scheduled triggers (daily, weekdays-only, or every 4 hours). With no explicit bound, a hung agent in any of these keeps consuming GitHub Actions minutes until some limit outside the workflow stops the job. The second concern is mode: remote appearing in two non-test production workflows (schema-feature-coverage.md and github-mcp-tools-report.md). Per AGENTS.md this mode does not work with GITHUB_TOKEN and requires a PAT or GitHub App token, so these workflows likely fail their MCP GitHub calls at runtime.

A lower-severity but pervasive concern is stale A/B experiments: 6 experiments started at least 14 days ago, long enough for a daily run to reach the min_samples: 14 threshold used in most experiments. None has a closing deadline in its frontmatter and none has been concluded, so they accumulate data indefinitely without a forcing function for analysis. Finally, daily-observability-report.md sets max-effective-tokens: 80000000 (80 M tokens), twice the next-highest value in the repository, and runs daily with no documented justification for the elevated ceiling.

Full Analysis Report

Focus Area: Agentic Workflow Runtime Safety & Compliance

Current State Assessment

Metrics Collected:

| Metric | Value | Status |
|--------|-------|--------|
| Total workflow `.md` files | 236 | ✅ |
| Copilot engine workflows | 96 | ✅ |
| Claude engine workflows | 51 | ✅ |
| Codex engine workflows | 9 | ✅ |
| Workflows missing `timeout-minutes` | 12 | ❌ |
| Scheduled workflows missing `timeout-minutes` | 6 | ❌ |
| Workflows using `mode: remote` | 4 (2 test, 2 production) | ⚠️ |
| Active A/B experiments | 24 | ⚠️ |
| Stale experiments (started ≥ 14 days ago) | 6 | ⚠️ |
| Workflows with `max-effective-tokens` | 6 | ✅ |
| Max single token cap | 80 M (`daily-observability-report.md`) | ⚠️ |

Findings

Strengths

- 224 of 236 workflows correctly declare `timeout-minutes`, showing high adoption of the safeguard
- All `list_code_scanning_alerts` usages correctly guard with `state: open` and `severity: critical,high` (see `code-scanning-fixer.md`)
- No workflows use the deprecated `needs.activation.outputs.text/title/body` expressions; full migration to `steps.sanitized.*` is complete
- Shared import files (`shared/meta-analysis-base.md`, `shared/gh.md`, etc.) cleanly provide GitHub MCP config to 37+ workflows without duplication

Areas for Improvement

- ❌ HIGH: 6 scheduled workflows have no `timeout-minutes`, so an agent hang consumes CI minutes with no workflow-level bound
- ⚠️ MEDIUM: `schema-feature-coverage.md` and `github-mcp-tools-report.md` use `mode: remote`, which requires a PAT/App token that the standard `GITHUB_TOKEN` runner environment does not provide
- ⚠️ MEDIUM: 6 A/B experiments have run 15–22 days, likely past their `min_samples: 14` threshold, with no deadline/expiry, creating open-ended experiments
- ⚠️ LOW: `daily-observability-report.md` carries `max-effective-tokens: 80000000` with no inline justification comment, making future review difficult

Detailed Analysis

Missing `timeout-minutes` — Scheduled Workflows (Critical Subset)

The following 6 scheduled workflows have no timeout-minutes declaration and therefore no upper bound on execution time:

| Workflow | Engine | Schedule |
|----------|--------|----------|
| `constraint-solving-potd.md` | (unset) | daily |
| `contribution-check.md` | (unset) | every 4 hours |
| `daily-astrostylelite-markdown-spellcheck.md` | claude | daily |
| `daily-semgrep-scan.md` | (unset) | daily |
| `daily-sentrux-report.md` | copilot | daily |
| `otlp-data-quality-validator.md` | (unset) | daily on weekdays |

Six additional non-scheduled workflows also lack the field: ace-editor.md, dependabot-burner.md, dependabot-repair.md, smoke-ci.md (runs on PR + push), test-dispatcher.md, test-project-url-default.md.

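The smoke-ci.md case is particularly notable: it triggers on every pull_request open/sync/reopen event and sets cancel-in-progress: true under concurrency, so a hung run is cancelled only when a newer run arrives in the same concurrency group. Without a timeout, a leaked agent process on the last run for a pull request could hold the concurrency slot open.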
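The finding above can be reproduced approximately with a short script. This is a minimal sketch, not the tooling that produced this report. It assumes each workflow file opens with YAML frontmatter between `---` lines, that `timeout-minutes` is a top-level key, and that an indented `schedule:` key marks a scheduled trigger. All function names are hypothetical, and it scans lines instead of parsing YAML so that it runs on the standard library alone.

```python
import re
from pathlib import Path


def split_frontmatter(text):
    """Return the text between the leading '---' delimiters, or '' if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i])
    return ""


def top_level_keys(frontmatter):
    """Collect unindented keys such as 'on', 'engine', 'timeout-minutes'."""
    keys = set()
    for line in frontmatter.splitlines():
        match = re.match(r"([A-Za-z0-9_-]+)\s*:", line)
        if match:
            keys.add(match.group(1))
    return keys


def is_scheduled(frontmatter):
    """Assumption: an indented 'schedule:' key marks a scheduled trigger."""
    return re.search(r"^[ \t]+schedule\s*:", frontmatter, re.MULTILINE) is not None


def check(text):
    """Report whether one workflow file lacks a timeout and whether it is scheduled."""
    frontmatter = split_frontmatter(text)
    return {
        "missing_timeout": "timeout-minutes" not in top_level_keys(frontmatter),
        "scheduled": is_scheduled(frontmatter),
    }


def scan(root=".github/workflows"):
    """Yield (file name, result) for every workflow markdown file under root."""
    for path in sorted(Path(root).glob("*.md")):
        yield path.name, check(path.read_text(encoding="utf-8"))
```

Iterating over `scan()` from the repository root and keeping the entries where both flags are true should list the scheduled subset. If the real frontmatter expresses schedules differently, `is_scheduled` is the only function that needs to change.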

`mode: remote` in Non-Test Production Workflows

Four workflows use mode: remote for GitHub MCP:

- `codex-github-remote-mcp-test.md`: ✅ test-only (`workflow_dispatch`)
- `github-remote-mcp-auth-test.md`: ✅ test-only, engine unset
- `github-mcp-tools-report.md`: ⚠️ Claude engine, `on: workflow_dispatch`, production use
- `schema-feature-coverage.md`: ⚠️ Codex engine, weekly scheduled, production use

Per AGENTS.md: "Never use mode: remote — it does not work with the GitHub Actions token (GITHUB_TOKEN) and requires a special PAT or GitHub App token." schema-feature-coverage.md is particularly risky as it runs weekly and attempts to create pull requests — if MCP auth silently fails, the schema coverage check produces no output.
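A check for this rule could look like the sketch below. It is an illustration under stated assumptions, not an existing lint in this repository: it assumes the setting appears as a `mode: remote` line nested under a `github:` key, and that test-only workflows can be recognised by the `-test.md` suffix seen in the four files above. The function names are hypothetical.

```python
import re


def uses_remote_mode(frontmatter):
    """True if a 'github:' mapping has 'mode: remote' among its indented children."""
    lines = frontmatter.splitlines()
    for i, line in enumerate(lines):
        match = re.match(r"^(\s*)github\s*:\s*$", line)
        if not match:
            continue
        indent = len(match.group(1))
        for child in lines[i + 1:]:
            if not child.strip():
                continue  # blank lines do not end the block
            child_indent = len(child) - len(child.lstrip())
            if child_indent <= indent:
                break  # dedent: we have left the github: block
            if re.match(r"^\s*mode\s*:\s*remote\s*(#.*)?$", child):
                return True
    return False


def is_test_workflow(file_name):
    """Assumption: test-only workflows use the '*-test.md' naming seen in this report."""
    return file_name.endswith("-test.md")


def classify(file_name, frontmatter):
    """Return 'ok', 'test-only', or 'production' for one workflow file."""
    if not uses_remote_mode(frontmatter):
        return "ok"
    return "test-only" if is_test_workflow(file_name) else "production"
```

Only the `production` result needs action; `test-only` files exercise remote mode on purpose and should stay as they are.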

Stale A/B Experiments

Experiments with min_samples: 14 whose start_date is at least 14 days before today (2026-05-27):

| Workflow | Start Date | Days Running |
|----------|------------|--------------|
| `issue-arborist.md` | 2026-05-05 | 22 |
| `deep-report.md` | 2026-05-06 | 21 |
| `daily-issues-report.md` | 2026-05-07 | 20 |
| `daily-fact.md` | 2026-05-11 | 16 |
| `daily-news.md` | 2026-05-12 | 15 |
| `daily-security-red-team.md` | 2026-05-12 | 15 |

None of these have an end_date or deadline field. Without a forcing function, experiments accumulate indefinitely.
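The day counts above follow from plain date arithmetic, sketched below with the start dates from the table. The 14-day staleness threshold and the 7-day final collection window are the values this report uses; the names `stale` and `days_running` are hypothetical helpers, not part of any experiment tooling.

```python
from datetime import date, timedelta

TODAY = date(2026, 5, 27)  # analysis date of this report
STALE_AFTER_DAYS = 14      # threshold used in this report

# Start dates copied from the table above.
STARTS = {
    "issue-arborist.md": date(2026, 5, 5),
    "deep-report.md": date(2026, 5, 6),
    "daily-issues-report.md": date(2026, 5, 7),
    "daily-fact.md": date(2026, 5, 11),
    "daily-news.md": date(2026, 5, 12),
    "daily-security-red-team.md": date(2026, 5, 12),
}


def days_running(start, today=TODAY):
    """Whole days elapsed between the experiment start and today."""
    return (today - start).days


def stale(starts, today=TODAY, threshold=STALE_AFTER_DAYS):
    """Map each experiment at or past the threshold to its age in days."""
    ages = {name: days_running(start, today) for name, start in starts.items()}
    return {name: age for name, age in ages.items() if age >= threshold}


# A one-week final collection window from the analysis date.
PROPOSED_END = TODAY + timedelta(days=7)
```

Calling `stale(STARTS)` returns all six workflows with ages from 15 to 22 days, and `PROPOSED_END` evaluates to 2026-06-03.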

🤖 Tasks for Copilot Agent

NOTE TO PLANNER AGENT: Split the following tasks into individual work items.

Improvement Tasks

Task 1: Add `timeout-minutes` to All 12 Workflows Missing It

Priority: High
Estimated Effort: Small
Focus Area: Agentic Workflow Runtime Safety

Description: Six scheduled agentic workflows and six additional non-scheduled workflows lack timeout-minutes. Add appropriate defaults by engine and trigger: copilot/claude → 30 min, unset-engine shell-heavy → 20 min, every-4-hours schedule → 15 min (contribution-check.md).

Acceptance Criteria:

- [ ] All 12 workflows listed below have `timeout-minutes` added to their frontmatter
- [ ] Values are appropriate for the engine and task complexity (15–60 min range)
- [ ] `make recompile` passes with zero errors after changes
- [ ] `make build` passes

Code Region: .github/workflows/constraint-solving-potd.md, .github/workflows/contribution-check.md, .github/workflows/daily-astrostylelite-markdown-spellcheck.md, .github/workflows/daily-semgrep-scan.md, .github/workflows/daily-sentrux-report.md, .github/workflows/otlp-data-quality-validator.md, .github/workflows/ace-editor.md, .github/workflows/dependabot-burner.md, .github/workflows/dependabot-repair.md, .github/workflows/smoke-ci.md, .github/workflows/test-dispatcher.md, .github/workflows/test-project-url-default.md

Add `timeout-minutes` to the following 12 workflow markdown files in `.github/workflows/`. Each file is missing this field in its YAML frontmatter (between the `---` delimiters).

For each file, insert `timeout-minutes: <N>` after the `permissions:` block (or after `engine:` if no permissions block exists). Use these values:

- `constraint-solving-potd.md` → `timeout-minutes: 30`
- `contribution-check.md` → `timeout-minutes: 15` (runs every 4 hours; keep short)
- `daily-astrostylelite-markdown-spellcheck.md` → `timeout-minutes: 30`
- `daily-semgrep-scan.md` → `timeout-minutes: 20`
- `daily-sentrux-report.md` → `timeout-minutes: 30`
- `otlp-data-quality-validator.md` → `timeout-minutes: 20`
- `ace-editor.md` → `timeout-minutes: 30`
- `dependabot-burner.md` → `timeout-minutes: 20`
- `dependabot-repair.md` → `timeout-minutes: 20`
- `smoke-ci.md` → `timeout-minutes: 30`
- `test-dispatcher.md` → `timeout-minutes: 20`
- `test-project-url-default.md` → `timeout-minutes: 20`

After editing, run `make recompile` and `make build` to verify no compilation errors. Do NOT create a new file; edit each existing file in place.
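The insertion rule above (after the `permissions:` block, else after `engine:`) can be sketched as a small text transform. This is an illustrative helper under assumptions, not a script that exists in the repository: it assumes `permissions:` and `engine:` are top-level frontmatter keys whose children are indented, and it falls back to appending at the end of the frontmatter when neither key is present. `add_timeout` and `_block_end` are hypothetical names.

```python
# Values copied from the list above.
TIMEOUTS = {
    "constraint-solving-potd.md": 30,
    "contribution-check.md": 15,
    "daily-astrostylelite-markdown-spellcheck.md": 30,
    "daily-semgrep-scan.md": 20,
    "daily-sentrux-report.md": 30,
    "otlp-data-quality-validator.md": 20,
    "ace-editor.md": 30,
    "dependabot-burner.md": 20,
    "dependabot-repair.md": 20,
    "smoke-ci.md": 30,
    "test-dispatcher.md": 20,
    "test-project-url-default.md": 20,
}


def _block_end(frontmatter_lines, key):
    """Index just past a top-level `key:` line and its indented children, or None."""
    for i, line in enumerate(frontmatter_lines):
        if line.startswith(key + ":"):
            j = i + 1
            n = len(frontmatter_lines)
            # Indented or blank lines still belong to the block.
            while j < n and (frontmatter_lines[j][:1] in (" ", "\t")
                             or not frontmatter_lines[j].strip()):
                j += 1
            # Do not swallow blank lines that trail the block.
            while j > i + 1 and not frontmatter_lines[j - 1].strip():
                j -= 1
            return j
    return None


def add_timeout(text, minutes):
    """Return text with 'timeout-minutes: <minutes>' added to the frontmatter."""
    lines = text.split("\n")
    if lines[0].strip() != "---":
        raise ValueError("file does not start with frontmatter")
    closing = next((i for i in range(1, len(lines))
                    if lines[i].strip() == "---"), None)
    if closing is None:
        raise ValueError("frontmatter is not terminated")
    frontmatter = lines[1:closing]
    if any(line.startswith("timeout-minutes:") for line in frontmatter):
        return text  # already set: leave the file untouched
    position = _block_end(frontmatter, "permissions")
    if position is None:
        position = _block_end(frontmatter, "engine")
    if position is None:
        position = len(frontmatter)  # fallback chosen for this sketch
    frontmatter.insert(position, f"timeout-minutes: {minutes}")
    return "\n".join(lines[:1] + frontmatter + lines[closing:])
```

Because the function returns the input unchanged when the key already exists, running it twice over the same file is safe. The result still needs `make recompile` and `make build`, since a line-based edit cannot prove the frontmatter stays valid.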

Task 2: Fix `mode: remote` in Production Workflows

Priority: Medium
Estimated Effort: Small
Focus Area: Agentic Workflow Runtime Compliance

Description: schema-feature-coverage.md (weekly scheduled, codex) and github-mcp-tools-report.md (claude, workflow_dispatch) use mode: remote for their GitHub MCP tool, which per AGENTS.md "does not work with the GitHub Actions token (GITHUB_TOKEN) and requires a special PAT or GitHub App token." Replace with mode: gh-proxy (the safe alternative that works with GITHUB_TOKEN) or remove the mode: key entirely (default is gh-proxy).

Acceptance Criteria:

- [ ] Neither `schema-feature-coverage.md` nor `github-mcp-tools-report.md` contains `mode: remote`
- [ ] Both workflows still declare a `github:` tools section with appropriate toolsets
- [ ] Test-only workflows `codex-github-remote-mcp-test.md` and `github-remote-mcp-auth-test.md` are not modified (they intentionally test remote mode)
- [ ] `make recompile` and `make build` pass

Code Region: .github/workflows/schema-feature-coverage.md, .github/workflows/github-mcp-tools-report.md

Fix `mode: remote` in two production workflows. Per AGENTS.md, `mode: remote` does not work with the standard `GITHUB_TOKEN` and will cause MCP authentication failures at runtime.

In `.github/workflows/schema-feature-coverage.md`, find the tools section:
```yaml
tools:
  ...
  github:
    mode: remote
    toolsets: [default]
```

Change `mode: remote` to `mode: gh-proxy` (or remove the `mode:` line entirely, since `gh-proxy` is the default).

In `.github/workflows/github-mcp-tools-report.md`, find the equivalent `mode: remote` under the `github:` tools key and make the same change.

Do NOT modify `codex-github-remote-mcp-test.md` or `github-remote-mcp-auth-test.md`; those are intentional test workflows for the remote mode feature.

After editing, run `make recompile` and `make build` to verify.


---

#### Task 3: Close or Extend Stale A/B Experiments

**Priority**: Medium
**Estimated Effort**: Medium
**Focus Area**: Agentic Workflow Experiment Hygiene

**Description:** Six A/B experiments started 15–22 days ago with `min_samples: 14` have no `end_date` or closing marker. These experiments have likely accumulated enough data for analysis but will continue indefinitely. Add an `end_date` field to each experiment's configuration and/or add a comment directing the team to analyse the results.

**Acceptance Criteria:**
- [ ] Each of the 6 stale experiments has either an `end_date` added (set to today + 7 days for a final collection window) or is marked `status: concluded` with a brief comment
- [ ] The `notify:` issue (if present) is updated to reference the analysis request
- [ ] `make recompile` passes

**Code Region:** `.github/workflows/issue-arborist.md`, `.github/workflows/deep-report.md`, `.github/workflows/daily-issues-report.md`, `.github/workflows/daily-fact.md`, `.github/workflows/daily-news.md`, `.github/workflows/daily-security-red-team.md`

Six A/B experiments in `.github/workflows/` have been running for ≥15 days and have likely reached or exceeded their `min_samples: 14` threshold, but have no `end_date` or closure marker.

For each of these workflow files, locate the `experiments:` block in the YAML frontmatter and add an `end_date: "2026-06-03"` field inside the experiment definition (one week from today). This signals that data collection should stop and results should be analysed.

Workflows to update:
- `.github/workflows/issue-arborist.md` (started 2026-05-05)
- `.github/workflows/deep-report.md` (started 2026-05-06)
- `.github/workflows/daily-issues-report.md` (started 2026-05-07)
- `.github/workflows/daily-fact.md` (started 2026-05-11)
- `.github/workflows/daily-news.md` (started 2026-05-12)
- `.github/workflows/daily-security-red-team.md` (started 2026-05-12)

For each, find the experiment entry that contains `start_date:` and add `end_date: "2026-06-03"` on the next line. Example:
```yaml
experiments:
  my_experiment:
    start_date: "2026-05-05"
    end_date: "2026-06-03"   # ← add this line
    min_samples: 14
```

After editing all 6 files, run `make recompile` to verify no parse errors.


---

#### Task 4: Document or Reduce Extreme Token Cap in `daily-observability-report.md`

**Priority**: Low
**Estimated Effort**: Small
**Focus Area**: Agentic Workflow Cost Management

**Description:** `daily-observability-report.md` sets `max-effective-tokens: 80000000` — 80 M tokens, the highest in the repository and 2× the next-highest value (40 M in `agent-performance-analyzer.md` and `daily-safe-output-optimizer.md`). This workflow runs daily. Either document the technical justification with an inline comment, or reduce to 40 M if the higher limit was set speculatively.

**Acceptance Criteria:**
- [ ] `daily-observability-report.md` either (a) has a comment above `max-effective-tokens` explaining why 80 M is necessary, or (b) is reduced to `40000000` if no justification exists
- [ ] `make recompile` passes

**Code Region:** `.github/workflows/daily-observability-report.md`

In `.github/workflows/daily-observability-report.md`, locate the `max-effective-tokens: 80000000` line in the YAML frontmatter.

Review the workflow's task description and agent prompt to determine if 80 M tokens is warranted:
- If the workflow does genuinely extensive multi-source analysis that requires the higher limit, add a YAML comment on the preceding line explaining the justification. Example:
  ```yaml
  # Elevated from default: aggregates OTLP traces + CI metrics + 30-day history; 40M insufficient per run logs
  max-effective-tokens: 80000000
  ```
- If no clear justification exists (e.g. the limit was set speculatively), reduce it to `max-effective-tokens: 40000000` to align with other complex daily workflows.

After editing, run `make recompile` and `make build` to verify.
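To make the "2× the next-highest value" comparison reproducible, the caps can be read from frontmatter and compared as in the sketch below. It is an illustration only. It assumes the cap appears as a `max-effective-tokens: <integer>` line, and it lists just the three caps this report states; the other three workflows that set the field are omitted because their values are not given here. The function names are hypothetical.

```python
import re

# Caps stated in this report; the remaining three are not listed in it.
KNOWN_CAPS = {
    "daily-observability-report.md": 80_000_000,
    "agent-performance-analyzer.md": 40_000_000,
    "daily-safe-output-optimizer.md": 40_000_000,
}


def parse_cap(frontmatter):
    """Return the max-effective-tokens value as an int, or None if unset."""
    match = re.search(r"^[ \t]*max-effective-tokens\s*:\s*(\d+)",
                      frontmatter, re.MULTILINE)
    return int(match.group(1)) if match else None


def top_cap_ratio(caps):
    """Return (workflow, cap, ratio of that cap to the next-highest distinct cap)."""
    distinct = sorted(set(caps.values()), reverse=True)
    name = max(caps, key=caps.get)
    ratio = distinct[0] / distinct[1] if len(distinct) > 1 else None
    return name, distinct[0], ratio
```

With the three known values, `top_cap_ratio(KNOWN_CAPS)` identifies `daily-observability-report.md` at a ratio of 2.0, matching the figure in the task description.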


---

## 📊 Historical Context

<details>
<summary>Previous Focus Areas</summary>

| Date | Focus Area | Type | Custom | Key Outcomes |
|------|------------|------|--------|--------------|
| 2026-05-20 | Large File Refactoring & Maintainability | Custom | Y | 5 tasks on files >800 LOC |
| 2026-05-21 | Error Message Quality & User Experience | Custom | Y | 5 tasks on console formatting |
| 2026-05-22 | MCP Integration Robustness & Error Recovery | Custom | Y | 5 tasks on timeout & resilience |
| 2026-05-25 | Test Infrastructure & Build Tag Compliance | Custom | Y | 4 tasks on missing build tags |
| 2026-05-26 | Go Dependency Hygiene & Import Path Consistency | Custom | Y | 4 tasks on yaml.v3 migration |

</details>

---

## 🎯 Recommendations

### Immediate Actions (This Week)
1. Add `timeout-minutes` to all 12 missing workflows — Priority: High

### Short-term Actions (This Month)
1. Replace `mode: remote` with `mode: gh-proxy` in the two production workflows — Priority: Medium
2. Add `end_date` to 6 stale A/B experiments — Priority: Medium

### Long-term Actions (This Quarter)
1. Document or reduce the 80 M token cap in `daily-observability-report.md` — Priority: Low
2. Consider a lint rule (pre-compile check) that warns when `timeout-minutes` is absent from scheduled workflows

---

## 📈 Success Metrics

- **Workflows without timeout**: 12 → 0
- **Production `mode: remote` usage**: 2 → 0
- **Stale experiments (≥14 days, no end_date)**: 6 → 0
- **Undocumented extreme token caps**: 1 → 0

---

## Next Steps

1. Review and prioritise the tasks above
2. Assign tasks to Copilot coding agent via planner agent
3. Track progress on improvement items
4. Re-evaluate this focus area in 2 weeks

---

*Generated by Repository Quality Improvement Agent*
*Next analysis: 2026-05-28 — Focus area selected by diversity algorithm*




> Generated by [⚡ Repository Quality Improvement Agent](https://github.com/github/gh-aw/actions/runs/26515807190) · sonnet46 2.6M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Frepository-quality-improver%22&type=discussions)
> - [x] expires <!-- gh-aw-expires: 2026-05-28T14:07:53.882Z --> on May 28, 2026, 2:07 PM UTC

<!-- gh-aw-agentic-workflow: Repository Quality Improvement Agent, engine: copilot, version: 1.0.52, model: claude-sonnet-4.6, id: 26515807190, workflow_id: repository-quality-improver, run: https://github.com/github/gh-aw/actions/runs/26515807190 -->

<!-- gh-aw-workflow-id: repository-quality-improver -->
<!-- gh-aw-workflow-call-id: github/gh-aw/repository-quality-improver -->

2026-05-27T15:21:18Z

github-actions[bot]
Bot May 27, 2026
Author

Smoke test agent was here. Me bonk button. Sparks fly. Checks roar.

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

accounts.google.com
android.clients.google.com
clients2.google.com
contentautofill.googleapis.com
safebrowsingohttpgateway.googleapis.com
www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

📰 BREAKING: Report filed by Smoke Copilot · gpt55 8.6M · ◷


2026-05-28T14:14:29Z

github-actions[bot]
Bot May 28, 2026
Author

This discussion was automatically closed because it expired on 2026-05-28T14:07:53.882Z.

Closed by Workflow

