[prompt-clustering] Copilot Agent Prompt Clustering Analysis - 2025-12-11 #6165
Closed
Replies: 1 comment
⚓ Avast! This discussion be marked as outdated by Copilot Agent Prompt Clustering Analysis.
🔬 Copilot Agent Prompt Clustering Analysis
Analysis Date: 2025-12-11
Summary
Performed NLP-based clustering analysis on 986 copilot agent task prompts from the last 30 days using TF-IDF vectorization and K-means clustering. Identified 9 distinct clusters representing different types of tasks. The overall success rate, counted as merged PRs, is 77.2% (761 of 986).
Key Findings
Full Clustering Analysis Report
Cluster Visualization
2D visualization of task prompts using PCA dimensionality reduction. Each color represents a distinct cluster.
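For readers unfamiliar with the projection step, here is a minimal, dependency-free sketch of reducing feature vectors to two principal components. It is not the code that produced the plot: the report does not say which library was used, and the function names `pca_2d` and `_top_eigenvector` and the toy points are assumptions made for illustration. It finds the top two eigenvectors of the covariance matrix by power iteration with deflation.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _top_eigenvector(matrix, iters=200):
    """Power iteration from a fixed start vector, so results are repeatable."""
    d = len(matrix)
    v = [1.0 + 0.1 * i for i in range(d)]
    norm = math.sqrt(_dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [_dot(row, v) for row in matrix]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:  # no variance left in this direction
            break
        v = [x / norm for x in w]
    return v


def pca_2d(rows):
    """Project equal-length numeric rows onto their top two principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, means)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    pc1 = _top_eigenvector(cov)
    # Remove the first component's variance, then find the next direction.
    lam1 = _dot(pc1, [_dot(row, pc1) for row in cov])
    deflated = [[cov[a][b] - lam1 * pc1[a] * pc1[b] for b in range(d)]
                for a in range(d)]
    pc2 = _top_eigenvector(deflated)
    return [(_dot(c, pc1), _dot(c, pc2)) for c in centered]


# Hypothetical 3-D points that vary mostly along the (1, 1, 0) direction.
toy_points = [[0, 0, 0.0], [1, 1, 0.1], [2, 2, -0.1], [3, 3, 0.1], [4, 4, -0.1]]
projected = pca_2d(toy_points)
```

In a real run each row would be a prompt's TF-IDF vector, and each projected pair would be plotted and colored by its cluster label.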
Detailed Cluster Analysis
Cluster 7: Testing & Quality
Size: 287 tasks (29.1% of total)
Success Rate: 75.6% (217 merged)
Top Keywords: update, add, error, comment, pull, job
Average Complexity Metrics:
Representative Examples:
Cluster 1: General Tasks
Size: 140 tasks (14.2% of total)
Success Rate: 72.9% (102 merged)
Top Keywords: cli, firewall, mcp, version, logs, aw
Average Complexity Metrics:
Representative Examples:
Cluster 3: CI/CD & Workflows
Size: 121 tasks (12.3% of total)
Success Rate: 86.0% (104 merged)
Top Keywords: pkg, pkg workflow, functions, workflow, code, function
Average Complexity Metrics:
Representative Examples:
Cluster 6: CI/CD & Workflows
Size: 110 tasks (11.2% of total)
Success Rate: 77.3% (85 merged)
Top Keywords: workflows, github, github workflows, md, gh, workflow
Average Complexity Metrics:
Representative Examples:
Cluster 8: Feature Implementation
Size: 90 tasks (9.1% of total)
Success Rate: 80.0% (72 merged)
Top Keywords: agentic workflow, agentic, workflow, update, create, add
Average Complexity Metrics:
Representative Examples:
Cluster 2: CI/CD & Workflows
Size: 73 tasks (7.4% of total)
Success Rate: 76.7% (56 merged)
Top Keywords: agent, agentic workflows, agentic, workflows, github, copilot
Average Complexity Metrics:
Representative Examples:
Cluster 5: Bug Fixes
Size: 67 tasks (6.8% of total)
Success Rate: 74.6% (50 merged)
Top Keywords: schema, json, pkg, error, field, validation
Average Complexity Metrics:
Representative Examples:
Cluster 4: Documentation Updates
Size: 49 tasks (5.0% of total)
Success Rate: 85.7% (42 merged)
Top Keywords: docs, documentation, md, reference, content, update
Average Complexity Metrics:
Representative Examples:
Cluster 9: Bug Fixes
Size: 49 tasks (5.0% of total)
Success Rate: 67.3% (33 merged)
Top Keywords: comments, issuetitle, issue, section, issuedescription, author
Average Complexity Metrics:
Representative Examples:
Success Rate by Cluster
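The per-cluster rates can be recomputed from the task and merged counts listed in the Detailed Cluster Analysis. The sketch below does that and ranks the clusters. The counts are copied from this report; the names `CLUSTERS`, `success_rates`, and `overall_rate` are illustrative only.

```python
# (cluster id, label, tasks, merged), copied from the Detailed Cluster Analysis.
CLUSTERS = [
    (7, "Testing & Quality", 287, 217),
    (1, "General Tasks", 140, 102),
    (3, "CI/CD & Workflows", 121, 104),
    (6, "CI/CD & Workflows", 110, 85),
    (8, "Feature Implementation", 90, 72),
    (2, "CI/CD & Workflows", 73, 56),
    (5, "Bug Fixes", 67, 50),
    (4, "Documentation Updates", 49, 42),
    (9, "Bug Fixes", 49, 33),
]


def success_rates(clusters):
    """Return (cluster id, label, success rate in percent), best first."""
    rows = [(cid, label, round(100 * merged / tasks, 1))
            for cid, label, tasks, merged in clusters]
    return sorted(rows, key=lambda r: r[2], reverse=True)


def overall_rate(clusters):
    """Merged PRs as a percentage of all tasks across clusters."""
    total = sum(tasks for _, _, tasks, _ in clusters)
    merged = sum(m for _, _, _, m in clusters)
    return round(100 * merged / total, 1)


if __name__ == "__main__":
    for cid, label, rate in success_rates(CLUSTERS):
        print(f"Cluster {cid}: {label:<24} {rate:5.1f}%")
    print(f"Overall: {overall_rate(CLUSTERS)}%")
```

Ranked this way, Cluster 3 (86.0%) and Cluster 4 (85.7%) lead and Cluster 9 (67.3%) trails, matching the figures reported per cluster.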
Sample Data (50 Most Recent PRs)
interface{} to any syntax across c...
Insights & Recommendations
1. Documentation Tasks Have Highest Success Rate
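Documentation-related tasks achieve an 85.7% success rate with relatively low complexity (avg 291 lines). This is the highest rate of any task category; the only individual cluster that scores higher is Cluster 3 (CI/CD & Workflows) at 86.0%, while the three CI/CD & Workflows clusters combined reach 80.6% (245 of 304 merged). Recommendation: Documentation tasks are ideal candidates for copilot agents.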
2. Task Complexity Varies Significantly
Average task complexity per cluster ranges from 274 lines (Bug Fixes) to 1613 lines (Feature Implementation). Recommendation: Break down complex tasks into smaller, focused subtasks.
3. Testing & Quality Tasks Are Most Common
The largest cluster (Testing & Quality) contains 29.1% of all tasks. Recommendation: Invest in improving prompt templates and best practices for this category.
4. Review Engagement Varies by Task Type
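Average review engagement ranges from 1.6 to 2.1 reviews per cluster. Both extremes fall in clusters labeled Bug Fixes (Clusters 5 and 9), so the shared label hides the full spread. Recommendation: Standardize review processes across task types.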
5. Prompt Engineering Opportunities
Three clusters have success rates below 75%: Cluster 1 (General Tasks, 72.9%), Cluster 5 (Bug Fixes, 74.6%), and Cluster 9 (Bug Fixes, 67.3%). Recommendation: Analyze failed PRs in these clusters to identify common issues and improve prompt templates.
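Because several clusters share a label, it can help to look at the threshold check by cluster id and at the rates pooled by label. The sketch below does both from the counts in the Detailed Cluster Analysis. The helper names `below_threshold` and `rates_by_label` are hypothetical, and the 75% cutoff is the one used in this recommendation.

```python
from collections import defaultdict

# (cluster id, label, tasks, merged), copied from the Detailed Cluster Analysis.
CLUSTER_COUNTS = [
    (7, "Testing & Quality", 287, 217),
    (1, "General Tasks", 140, 102),
    (3, "CI/CD & Workflows", 121, 104),
    (6, "CI/CD & Workflows", 110, 85),
    (8, "Feature Implementation", 90, 72),
    (2, "CI/CD & Workflows", 73, 56),
    (5, "Bug Fixes", 67, 50),
    (4, "Documentation Updates", 49, 42),
    (9, "Bug Fixes", 49, 33),
]


def below_threshold(clusters, threshold=75.0):
    """Cluster ids whose merged share is under the threshold, in listed order."""
    return [cid for cid, _, tasks, merged in clusters
            if 100 * merged / tasks < threshold]


def rates_by_label(clusters):
    """Pool tasks and merges across clusters that share a label."""
    totals = defaultdict(lambda: [0, 0])
    for _, label, tasks, merged in clusters:
        totals[label][0] += tasks
        totals[label][1] += merged
    return {label: round(100 * m / t, 1) for label, (t, m) in totals.items()}


flagged = below_threshold(CLUSTER_COUNTS)
pooled = rates_by_label(CLUSTER_COUNTS)
```

Pooled by label, Bug Fixes comes out at 71.6% (83 of 116 merged), the lowest of any category.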
Methodology: Analyzed 986 copilot-created PRs using NLP techniques (TF-IDF vectorization, K-means clustering with k=9). Prompts were extracted from PR bodies, cleaned, and clustered by the similarity of their TF-IDF vectors.
Analysis Period: Last 30 days
Generated: 2025-12-11 19:25:35 UTC
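The pipeline named in the methodology note (TF-IDF vectorization followed by K-means) can be illustrated with a small standard-library sketch. This is a simplified stand-in, not the analysis code: the report does not state which library, tokenizer, IDF variant, or initialization was used, so the whitespace tokenization, smoothed IDF, first-k centroid initialization, function names, and toy prompts below are all assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF rows) for whitespace-tokenized docs."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n = len(docs)
    # Document frequency: in how many docs each token appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF, log((1 + n) / (1 + df)) + 1 (one common variant; assumed here).
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1 for tok in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        row = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])
    return vocab, rows


def kmeans(rows, k, iters=20):
    """Lloyd's algorithm; the first k rows seed the centroids (deterministic)."""
    centroids = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(r, centroids[c])))
            for r in rows
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical prompts, loosely modeled on the reported top keywords.
toy_prompts = [
    "update docs reference page",
    "fix schema validation error",
    "add docs reference content",
    "fix json schema field error",
]
vocab, rows = tfidf_vectors(toy_prompts)
labels = kmeans(rows, k=2)
print(labels)  # [0, 1, 0, 1]
```

The two documentation prompts land in one cluster and the two schema prompts in the other because they share no tokens across groups. With 986 real prompts and k=9, a library implementation with randomized or k-means++ initialization would be the more usual choice.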