[prompt-clustering] Copilot Agent Prompt Clustering Analysis — 2026-04-05 to 2026-04-25 #28491

2026-04-25T20:18:25Z

github-actions[bot]
Bot Apr 25, 2026

Summary

Analysis Period: 2026-04-05 → 2026-04-25 (20 days)
Total PRs Analyzed: 1,000
Clusters Identified: 8
Overall Merge Rate: 78.0%
Silhouette Score: 0.048 (NLP clustering of short code-change descriptions tends to produce low silhouette scores — the clusters are nonetheless semantically coherent)

The Copilot agent produced 1,000 pull requests over the past 20 days. TF-IDF vectorization followed by K-Means (k=8) reveals eight distinct work categories, from code-quality improvements and workflow automation to security hardening and engine/CLI maintenance. Merge rates range from 59% to 84%: validation/WIP tasks are the least likely to merge, and test/refactor tasks the most likely.
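To illustrate why shared vocabulary matters for the vectorization step, here is a minimal, standard-library-only sketch of TF-IDF weighting and cosine similarity applied to four PR titles quoted in this report. It is not the pipeline that produced the analysis; the function names (`tokenize`, `tfidf`, `cosine`) and the smoothed IDF formula are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on anything that is not a letter, digit, or underscore."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def tfidf(docs):
    """Return one L2-normalized {term: weight} dict per document.

    Assumption: smoothed IDF, idf = ln((1 + N) / (1 + df)) + 1.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many titles does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = {
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


# Titles taken from the representative PR lists in this report.
titles = [
    "fix: replace string concatenation loop with strings.Builder",
    "fix: TypeScript type errors in push_to_pull_request_branch",
    "fix: add missing container property to HTTP MCP servers",
    "feat: add --exclude flag to logs command and MCP tool",
]
vectors = tfidf(titles)

# "fix" occurs in three of the four titles, so it is down-weighted
# relative to a term that occurs in only one title.
print("fix:", vectors[0]["fix"], "builder:", vectors[0]["builder"])
print("MCP titles similarity:", cosine(vectors[2], vectors[3]))
```

Because prefixes such as fix and feat occur in many titles, they contribute little to the distance between PRs, while rarer terms such as mcp dominate it.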

Cluster Overview

#	Theme	PRs	Merge Rate	Top Keywords
C1	Code Quality & Docs	195	84%	test, fix, file, refactor, functions, docs
C2	MCP / Gateway Tooling	101	78%	mcp, tool, gateway, server, tools, cli
C3	Workflow & Agent Features	183	81%	workflow, add, feat, daily, workflows, agent
C4	Firewall / Rules / Migrations	150	77%	addresses, firewall rules, blocked, triggering
C5	Validation & WIP Tasks	73	59%	run, validation, update, review, tests
C6	CI / Job / Step Fixes	106	83%	job, fix, ci, step, activation, checkout
C7	Safe Outputs & Security	89	82%	safe outputs, output, handler, config
C8	Engine / CLI / Version Mgmt	103	68%	copilot, cli, engine, awf, version, gemini
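As a consistency check on the table above, the per-cluster counts and merge rates can be combined into a PR-weighted average. This is a small sketch using only the figures from the table; the per-cluster rates are rounded percentages, so the merged-PR count is an estimate, not an exact tally.

```python
# (PR count, merge rate) per cluster, copied from the Cluster Overview table.
clusters = {
    "C1": (195, 0.84),
    "C2": (101, 0.78),
    "C3": (183, 0.81),
    "C4": (150, 0.77),
    "C5": (73, 0.59),
    "C6": (106, 0.83),
    "C7": (89, 0.82),
    "C8": (103, 0.68),
}

total_prs = sum(count for count, _ in clusters.values())
# Estimated merged PRs: each cluster's count weighted by its (rounded) rate.
estimated_merged = sum(count * rate for count, rate in clusters.values())
overall_rate = 100 * estimated_merged / total_prs

lowest = min(clusters, key=lambda name: clusters[name][1])
highest = max(clusters, key=lambda name: clusters[name][1])

print(f"total PRs: {total_prs}")
print(f"overall merge rate: {overall_rate:.1f}%")
print(f"lowest: {lowest}, highest: {highest}")
```

The cluster sizes sum to the 1,000 PRs analyzed, and the weighted average agrees with the reported overall merge rate of 78.0%.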

Cluster-by-Cluster Analysis

C1 — Code Quality & Documentation (195 PRs, 84% merged)

The largest cluster covers targeted improvements to tests, documentation, and code style. Tasks are well scoped and merge 84% of the time, the highest rate of any cluster, indicating that the agent handles them reliably.

Representative PRs:

#27327 — Improve test quality for schedule_cron_detection_test.go
#27108 — fix: replace string concatenation loop with strings.Builder
#27387 — docs: fix mobile navigation and breakpoint conflicts

C2 — MCP / Gateway Tooling (101 PRs, 78% merged)

Tasks that add, fix, or deprecate MCP servers, the MCP gateway, and related CLI tooling. Several closed PRs in this cluster are duplicates (the agent retried the same feature task). The 78% merge rate matches the overall average; the duplicate retries suggest some instability in this rapidly evolving area.

Representative PRs:

#28047 — feat: add --exclude flag to logs command and MCP tool
#28288 — feat: deprecate features.mcp-cli, enable MCP CLI by default
#28324 — fix: add missing container property to HTTP MCP servers

C3 — Workflow & Agent Features (183 PRs, 81% merged)

The second-largest cluster encompasses new agent workflow features, audit tooling, cache-memory improvements, and daily analysis workflows. The healthy 81% merge rate reflects core product development work.

Representative PRs:

#28434 — Restrict protected files/folders collection to active engine
#24957 — feat: audit command accepts multiple run IDs for diff mode
#27948 — Formalize cache-memory location naming convention

C4 — Firewall / Rules / Migrations (150 PRs, 77% merged)

Large-scale migration tasks (e.g., migrating 24 workflows between patterns), firewall-rule updates, and dependency bumps. The slightly lower merge rate (77%) likely reflects the complexity of migration PRs that require manual coordination.

Representative PRs:

#26838 — chore: bump Copilot CLI → 1.0.36, Codex CLI → 0.125.0
#26448 — Migrate 24 workflows from daily-audit-discussion to daily-audit-base
#26017 — Refactor audit workflows with new shared/daily-audit-charts import

C5 — Validation & WIP Tasks (73 PRs, 59% merged) ⚠️

The lowest-performing cluster. It captures tasks that involve pre-flight validation, GitHub Actions version updates, and work-in-progress features. The 59% merge rate indicates that these tasks need iteration or are frequently superseded. Several PRs are explicitly tagged [WIP], and others are [actions] automation PRs that are replaced before merging.

Representative PRs:

#27392 — [actions] Update GitHub Actions versions - 2026-04-24
#27990 — feat: place threat detection CAUTION alert at top of PR body
#27896 — fix: remove run_id from trending cache key

C6 — CI / Job / Step Fixes (106 PRs, 83% merged)

Fixes to GitHub Actions job definitions, step names, checkout behavior, and TypeScript type errors. The high merge rate (83%) reflects focused bug fixes that the agent resolves accurately.

Representative PRs:

#24902 — [WIP] Standardize "Config" abbreviation in step names
#27059 — fix: cross-repo create-pull-request checkout ignores triggering branch ref
#27209 — fix: TypeScript type errors in push_to_pull_request_branch

C7 — Safe Outputs & Security Hardening (89 PRs, 82% merged)

Security-focused tasks: protecting repository folders, sanitizing steganographic channels, fixing XPIA vectors, and improving safe-output handlers. 82% merge rate shows high-value security work that consistently lands.

Representative PRs:

#25378 — feat: protect any top-level folder starting with '.' in safe outputs
#26915 — Count unique files in create_pull_request patch limit
#27077 — fix: add render_template.cjs and is_truthy.cjs to SAFE_OUTPUTS_FILES

C8 — Engine / CLI / Version Management (103 PRs, 68% merged)

Tasks that update engine versions (Copilot CLI, Gemini CLI, AWF), fix engine-specific runtime issues (node not found, npm EROFS, timeout configs), and manage multi-engine routing. The 68% merge rate is the second lowest — these tasks often require environmental fixes and multiple retries.

Representative PRs:

#25577 — chore: remove allocated LLM gateway ports for OpenCode and Crush
#25499 — fix: add GEMINI_CLI_TRUST_WORKSPACE=true to unblock Gemini headless mode
#27594 — fix: resolve node: command not found in Copilot engine on GPU runners

Timeline: Task Distribution Over Time

Cluster Selection Methodology (Elbow + Silhouette)

K=8 was selected because it had the highest silhouette score (0.048) across k=3 to k=11. TF-IDF on short code-change descriptions naturally yields low silhouette scores because many PRs share overlapping vocabulary (fix, add, feat); k=8 provides the best semantic granularity without over-fragmenting.
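For readers unfamiliar with the metric, the following is a minimal standard-library sketch of the mean silhouette score. For each point, a is the mean distance to the other members of its own cluster and b is the mean distance to the nearest other cluster; the point's score is (b - a) / max(a, b). The toy 2-D points and the function name `silhouette_score` are illustrative assumptions, not the data or code used for this report.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over all points, using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # Convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # Mean distance to the *other* members (distance to itself is 0).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # Mean distance to the nearest neighbouring cluster.
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print("well-separated labelling:", good)
print("mixed labelling:", bad)
```

Scores near 1 indicate compact, well-separated clusters; scores near 0 indicate overlapping clusters, which is the regime the reported 0.048 falls into.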

Key Findings

Code Quality & Docs is the most common task type (C1, 19.5% of PRs) and the highest-success cluster (84% merge rate). The agent excels at focused, well-scoped tasks like improving tests, adding docstrings, and fixing style issues.
Security hardening (C7) has a strong 82% merge rate despite its complexity. Tasks involving safe outputs protection, XPIA mitigation, and file sanitization consistently land — suggesting the security perimeter is well-defined enough for the agent to operate confidently.
Validation/WIP tasks (C5) are the weakest cluster at 59% merge rate. Many are [actions] automated updates or exploratory [WIP] branches that get closed without merging. These inflate the closed-PR count but represent normal exploratory work.
Engine & CLI version management (C8, 68% merge rate) involves environment-specific issues (GPU runners, npm permissions, model deprecations) that require iteration. Multi-retry patterns are visible in this cluster.
MCP tooling (C2, 78% merge rate) shows duplicate PRs — the same feature was attempted 2-3 times before landing. This suggests the agent benefits from clearer issue scoping when MCP server integration is involved.
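The duplicate-PR pattern noted for C2 could be surfaced automatically by comparing PR titles. Below is a hedged sketch that flags pairs whose token sets overlap heavily (Jaccard similarity). The helper names and the 0.8 threshold are assumptions for illustration; the titles are ones that appear in this report.

```python
import re
from itertools import combinations


def title_tokens(title):
    """Lowercased set of alphanumeric tokens in a PR title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(title_a, title_b):
    """Size of the token intersection divided by the size of the union."""
    a, b = title_tokens(title_a), title_tokens(title_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_duplicates(titles, threshold=0.8):
    """Return every pair of titles whose Jaccard similarity meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(titles, 2)
        if jaccard(a, b) >= threshold
    ]


titles = [
    "feat: deprecate features.mcp-cli, enable MCP CLI by default",
    "feat: deprecate features.mcp-cli, enable MCP CLI mounting by default",
    "fix: add missing container property to HTTP MCP servers",
]
pairs = likely_duplicates(titles)
for a, b in pairs:
    print(f"possible retry:\n  {a}\n  {b}")
```

A token-set comparison is deliberately crude: it ignores word order and would miss retries whose titles were reworded, so it is best treated as a first-pass filter before manual review.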

Recommendations

For C5 (Validation & WIP): Tag automated dependency-update PRs separately so they don't skew clustering. Consider a distinct workflow for [actions] version bumps that auto-merges when CI passes, reducing the noise of open→close cycles.
For C8 (Engine/CLI): Environment-specific failures (GPU runners, npm EROFS, missing binaries) cause multiple retry PRs. A lightweight pre-flight environment check step in the engine bootstrap could prevent common failures before the agent commits code.
For C2 (MCP Tooling): Duplicate PRs suggest the agent loses context between retries. Persisting a "PR already attempted" flag in cache-memory for active feature tasks would prevent redundant work.
Across all clusters: The 78% overall merge rate is healthy. The 22% that don't merge are mostly WIP branches, exploratory probes, and retried duplicates rather than outright failures. The agent's precision on scoped tasks (C1, C6, C7) is excellent.
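The pre-flight environment check suggested for C8 could look roughly like the sketch below. It only verifies that required binaries are on PATH and that given directories are writable, which corresponds to the "node not found" and npm EROFS failure modes mentioned above. The function name `preflight` and the example arguments are hypothetical; the real engine bootstrap may need different checks.

```python
import os
import shutil


def preflight(required_binaries, writable_dirs):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for name in required_binaries:
        # shutil.which returns None when the executable is not found on PATH.
        if shutil.which(name) is None:
            problems.append(f"missing binary: {name}")
    for path in writable_dirs:
        # A read-only or missing directory would later surface as a write error.
        if not (os.path.isdir(path) and os.access(path, os.W_OK)):
            problems.append(f"not writable: {path}")
    return problems


if __name__ == "__main__":
    # Hypothetical inputs: adjust to the binaries and paths the engine needs.
    issues = preflight(["node", "npm"], [os.path.expanduser("~")])
    if issues:
        print("pre-flight failed:")
        for issue in issues:
            print(" -", issue)
    else:
        print("pre-flight passed")
```

Returning a list of problems rather than raising on the first one lets the bootstrap report every environmental issue in a single run, which may reduce the number of retry PRs.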

Full Data Table (Top 8 PRs per cluster)

PR	Title	Cluster	Outcome	Date
#28480	Improve test quality for schedule_cron_detection_test.go	C1	Merged	2026-04-25
#28479	fix: replace string concatenation loop with strings.Builder	C1	Merged	2026-04-25
#28476	docs: fix mobile navigation and breakpoint conflicts	C1	Merged	2026-04-25
#28477	feat: add --exclude flag to logs command and MCP tool	C2	Closed	2026-04-25
#28448	feat: deprecate features.mcp-cli, enable MCP CLI mounting by default	C2	Closed	2026-04-25
#28392	Add Kreuzberg document intelligence MCP shared workflow	C2	Merged	2026-04-25
#28485	Restrict protected files/folders collection to active engine	C3	Closed	2026-04-25
#28483	feat: audit command accepts multiple run IDs for diff mode	C3	Merged	2026-04-25
#28482	formalize cache-memory location naming convention	C3	Merged	2026-04-25
#28401	chore: bump Copilot CLI, Codex CLI, GitHub MCP Server	C4	Merged	2026-04-25
#28151	Migrate 24 workflows from daily-audit-discussion to base	C4	Merged	2026-04-23
#28444	[actions] Update GitHub Actions versions - 2026-04-24	C5	Merged	2026-04-25
#28429	feat: place threat detection CAUTION alert at top of PR body	C5	Merged	2026-04-25
#28304	[WIP] Add CI pre-check for stale .lock.yml files	C5	Closed	2026-04-24
#28490	[WIP] Standardize "Config" abbreviation in step names	C6	Open	2026-04-25
#28388	fix: TypeScript type errors in push_to_pull_request_branch	C6	Merged	2026-04-25
#28377	fix: resolve target repo checkout path in cross-repo push	C6	Merged	2026-04-25
#28486	feat: protect any top-level folder starting with '.' in safe outputs	C7	Merged	2026-04-25
#28472	Count unique files in create_pull_request patch limit	C7	Merged	2026-04-25
#28055	fix: close XPIA steganographic channel in allowedAliases	C7	Merged	2026-04-23
#28484	chore: remove allocated LLM gateway ports for OpenCode and Crush	C8	Merged	2026-04-25
#28475	fix: add GEMINI_CLI_TRUST_WORKSPACE=true to unblock Gemini headless mode	C8	Merged	2026-04-25
#28451	fix: resolve node: command not found in Copilot engine on GPU runners	C8	Merged	2026-04-25

References: §24939388489

Generated by Copilot Agent Prompt Clustering Analysis · ● 280.6K

expires on Apr 26, 2026, 8:18 PM UTC

2026-04-26T20:28:43Z

github-actions[bot]
Bot Apr 26, 2026
Author

This discussion has been marked as outdated by Copilot Agent Prompt Clustering Analysis.

A newer discussion is available at Discussion #28629.
