📊 Agentic Workflow Lock File Statistics Analysis - December 2025 #5262
This discussion was automatically closed because it was created by an agentic workflow more than 3 days ago.
📊 Agentic Workflow Lock File Statistics Analysis
Analysis Date: December 2, 2025
Repository: githubnext/gh-aw
Lockfiles Analyzed: 100
This report provides comprehensive statistical insights into the structure, patterns, and characteristics of agentic workflow lockfiles (.lock.yml files) in the gh-aw repository.
Executive Summary
The gh-aw repository contains 100 lock files representing a diverse ecosystem of agentic workflows. These workflows collectively define 903 jobs executing 6,159 steps, with an average workflow containing approximately 9 jobs and 62 steps. The most common workflow trigger is pull_request (93 occurrences), followed by issues (84) and workflow_dispatch (79). Among safe outputs, create-discussion is the most widely used (36 workflows), and GitHub MCP is the dominant external integration (3,695 references).
Full Statistical Analysis
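As a quick check on the executive summary, the per-workflow averages can be recomputed from the totals it quotes. The sketch below uses only those three totals; the function name is illustrative and is not taken from the analysis scripts.

```python
# Totals quoted in the executive summary of this report.
LOCK_FILES = 100
TOTAL_JOBS = 903
TOTAL_STEPS = 6159


def per_workflow_averages(files, jobs, steps):
    """Return (jobs per workflow, steps per workflow, steps per job)."""
    return jobs / files, steps / files, steps / jobs


jobs_avg, steps_avg, steps_per_job = per_workflow_averages(
    LOCK_FILES, TOTAL_JOBS, TOTAL_STEPS
)
print(f"{jobs_avg:.2f} jobs per workflow")    # 9.03
print(f"{steps_avg:.2f} steps per workflow")  # 61.59
print(f"{steps_per_job:.2f} steps per job")   # 6.82
```

Rounded to whole numbers, these match the "approximately 9 jobs and 62 steps" stated in the summary.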
File Size Distribution
Overview Statistics
Size Distribution
File Size Extremes
Smallest Lock Files:
- arxiv.lock.yml - 81 KB (shared MCP configuration)
- context7.lock.yml - 81 KB (shared MCP configuration)
- test-skip-if-match-object.lock.yml - 86 KB
- test-firewall-default.lock.yml - 97 KB
- example-permissions-warning.lock.yml - 106 KB
Largest Lock Files:
- poem-bot.lock.yml - 605 KB (outlier, 2x larger than average)
- pr-nitpick-reviewer.lock.yml - 398 KB
- copilot-session-insights.lock.yml - 396 KB
- q.lock.yml - 395 KB
- smoke-copilot-no-firewall.lock.yml - 386 KB
Insight: Over half (54%) of lock files fall in the 300-500 KB range, establishing this as the "standard" size for a fully-featured agentic workflow. poem-bot.lock.yml is a notable outlier at 605 KB, potentially due to extensive prompt engineering or complex multi-step logic.
Workflow Trigger Analysis
Trigger Type Distribution
Note: Percentages exceed 100% because workflows commonly use multiple triggers.
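The following sketch illustrates why per-trigger percentages can sum to more than 100%: each workflow is counted once for every trigger it declares. The workflow names and trigger sets in `sample` are hypothetical; only the three counts in `reported` come from this report.

```python
from collections import Counter


def trigger_shares(workflows):
    """Return {trigger: percent of workflows that declare it}."""
    counts = Counter()
    for triggers in workflows.values():
        counts.update(set(triggers))  # count each trigger once per workflow
    total = len(workflows)
    return {name: 100 * n / total for name, n in counts.items()}


# Hypothetical workflows; names and trigger sets are illustrative only.
sample = {
    "wf-a": ["pull_request", "issues", "workflow_dispatch"],
    "wf-b": ["pull_request", "workflow_dispatch"],
    "wf-c": ["issues", "schedule"],
    "wf-d": ["pull_request", "issues"],
}

shares = trigger_shares(sample)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {pct:.0f}%")
print(f"sum of shares: {sum(shares.values()):.0f}%")

# Counts quoted in this report for 100 lock files, so each count doubles
# as a percentage.
reported = {"pull_request": 93, "issues": 84, "workflow_dispatch": 79}
print(f"top three reported triggers sum to {sum(reported.values())}%")
```

With 100 lock files, the three most common triggers alone already add up to 256%.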
Common Trigger Combinations
Most workflows employ multiple trigger types to provide flexibility:
Schedule Patterns
Scheduled Workflows: 61 workflows run on cron schedules
Common Schedules:
- 0 9 * * * - Daily at 9:00 AM UTC (1 workflow)
- 0 10 * * * - Daily at 10:00 AM UTC (1 workflow)
Insight: The heavy use of multiple triggers (93% with pull_request, 84% with issues) indicates workflows designed for broad applicability. The high manual trigger adoption (79% with workflow_dispatch) demonstrates a preference for testability and on-demand execution.
Safe Outputs Analysis
Safe outputs define how agents communicate results back to GitHub, enabling controlled, secure interactions.
Safe Output Type Distribution
Total Workflows Using Safe Outputs: 82 (82% of all workflows)
Discussion Categories
When workflows create discussions, they use these categories:
Insight: The "audits" category (17 total including variants) is heavily favored for automated analysis results, reflecting the repository's focus on code quality and observability.
Safe Output Patterns
- create-discussion to post comprehensive findings
- add-comment to respond directly to user requests
- create-issue to track problems that need attention
- create-pull-request for documentation fixes and code updates
Structural Characteristics
Job and Step Complexity
Insights:
Permission Patterns
GitHub Actions permissions granted across workflows:
Insight: The high prevalence of contents, pull-requests, and issues permissions (80-90% of workflows) reflects the interactive, code-modifying nature of these agentic workflows.
Timeout Configuration
Timeout Distribution
Statistics:
Insight: The 10-minute default timeout is most common, with longer timeouts (20-45 min) reserved for complex analysis workflows like daily-firewall-report and daily-repo-chronicle.
MCP Server Usage
MCP (Model Context Protocol) servers provide external capabilities to agentic workflows.
Most Used MCP Servers
Insight: The GitHub MCP server dominates with 3,695 references (94% of all MCP usage), which makes sense given these workflows operate within GitHub Actions. Playwright (210 references) is the second most popular, enabling workflows to interact with web UIs and gather data from external sources.
Concurrency Patterns
Workflows with Concurrency Control: 100 (100%)
All workflows implement concurrency groups to prevent race conditions and resource conflicts. The standard pattern is:
This ensures that only one instance of a workflow runs per issue/PR, with new runs canceling in-progress ones.
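The behavior described above can be modeled in a few lines. This is a toy simulation of cancel-in-progress semantics, not the configuration used in the lock files: the class, the `group_key` format, and the workflow name are all hypothetical. In GitHub Actions itself the behavior is configured with the workflow-level `concurrency` key and its `group` and `cancel-in-progress` settings.

```python
class ConcurrencyGroups:
    """Toy model of concurrency groups with cancel-in-progress enabled."""

    def __init__(self):
        self._active = {}  # group key -> run id currently in progress

    def start(self, group, run_id):
        """Start run_id in group; return the run it cancels, or None."""
        cancelled = self._active.get(group)
        self._active[group] = run_id
        return cancelled

    def finish(self, group, run_id):
        """Mark run_id as finished if it is still the active run."""
        if self._active.get(group) == run_id:
            del self._active[group]


def group_key(workflow, number):
    """Hypothetical key format: one group per workflow and issue/PR number."""
    return f"{workflow}-{number}"


groups = ConcurrencyGroups()
first = groups.start(group_key("triage", 42), "run-1")   # nothing to cancel
second = groups.start(group_key("triage", 42), "run-2")  # cancels run-1
other = groups.start(group_key("triage", 43), "run-3")   # different PR
print(first, second, other)
```

Runs for the same issue or PR share a key and replace each other, while runs for different issues or PRs proceed independently.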
Engine Distribution
While most workflows use the standard gh-aw engine, some explicitly specify alternative engines:
Copilot: glossary-maintainer, poem-bot, technical-doc-writer
Claude: cloclo, daily-multi-device-docs-tester, unbloat-docs, smoke-claude
Codex: changeset, daily-fact
Insight: While the default engine handles most workflows, specialized engines are used for specific tasks: Copilot for creative/documentation work, Claude for complex analysis, and Codex for code-focused tasks.
Interesting Findings
1. Test Workflows Are Smallest
The five smallest lock files (81-106 KB) are three test and example workflows plus two shared MCP configurations. This makes sense, as they validate or configure specific features rather than implementing complex logic.
2. "Poem Bot" is an Outlier
At 605 KB, poem-bot.lock.yml is 2x larger than the average lock file. This suggests either extensive creative prompting, complex multi-turn interactions, or comprehensive example dialogues.
3. High Manual Trigger Adoption
79% of workflows support manual triggering via workflow_dispatch, indicating a strong emphasis on testability and developer control.
4. Discussion-First Philosophy
With 36 workflows using create-discussion (44% of safe-output workflows), there's a clear preference for creating durable, threaded discussions over ephemeral comments.
5. Comprehensive Concurrency Control
100% of workflows implement concurrency groups, demonstrating mature workflow orchestration practices to prevent resource conflicts.
6. GitHub MCP Dominance
The GitHub MCP server appears 3,695 times across workflows, showing how deeply these agentic workflows integrate with GitHub's ecosystem.
Average Lockfile Profile
Based on statistical medians and modes, a "typical" agentic workflow lock file has:
Recommendations
1. Standardize Discussion Categories
Consider consolidating "audits", "Audits", and "audit" into a single canonical category for consistency.
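A minimal sketch of such a consolidation, assuming "audits" is chosen as the canonical spelling (the report does not say which variant should win). The function name and the sample list are hypothetical.

```python
from collections import Counter

CANONICAL = "audits"            # assumed canonical name, not confirmed by the report
VARIANTS = {"audits", "audit"}  # compared after trimming and lower-casing


def normalize_category(name):
    """Map known audit-category variants to one canonical spelling."""
    cleaned = name.strip()
    return CANONICAL if cleaned.lower() in VARIANTS else cleaned


# Hypothetical category values as they might appear across workflows.
raw = ["audits", "Audits", "audit", "General", " audits "]
merged = Counter(normalize_category(c) for c in raw)
print(merged)
```

Categories outside the variant set pass through unchanged, so the mapping can be applied to every workflow without affecting unrelated categories.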
2. Investigate Large Lock Files
Review workflows over 400 KB (of the largest lock files listed above, only poem-bot exceeds this) to determine whether size optimizations are possible without sacrificing functionality.
3. Document Engine Selection Criteria
Create guidelines for when to use Copilot vs. Claude vs. Codex engines based on task characteristics.
4. Leverage Schedule Patterns
With 61 scheduled workflows, ensure cron schedules are distributed across time slots to avoid resource contention.
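One way to check for crowded time slots is to bucket cron expressions by their hour field. The sketch below is an assumption about how such a check could look, not part of the analysis scripts; it handles only fixed numeric hours in standard 5-field expressions and ignores ranges, lists, and step values. The first two expressions appear in this report, the rest are hypothetical.

```python
from collections import Counter


def cron_hour(expr):
    """Return the hour of a 5-field cron expression, or None if not a fixed hour."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}: {expr!r}")
    hour = fields[1]
    return int(hour) if hour.isdigit() else None


def crowded_hours(schedules, limit):
    """Return {hour: count} for UTC hours used by more than `limit` schedules."""
    counts = Counter(
        hour for hour in map(cron_hour, schedules) if hour is not None
    )
    return {hour: n for hour, n in counts.items() if n > limit}


schedules = [
    "0 9 * * *",     # from this report
    "0 10 * * *",    # from this report
    "30 9 * * 1",    # hypothetical
    "15 9 * * *",    # hypothetical
    "*/15 * * * *",  # hypothetical; no fixed hour, so it is skipped
]
print(crowded_hours(schedules, limit=2))
```

Hours that exceed the chosen limit are candidates for moving some workflows to a neighboring slot.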
5. MCP Server Documentation
Document the 9 MCP servers in use and provide examples of when each should be employed.
Methodology
Analysis Tools:
Data Sources:
- .lock.yml files from .github/workflows/ directory
Cache Memory:
- /tmp/gh-aw/cache-memory/scripts/
- /tmp/gh-aw/cache-memory/data/
Lockfile Statistics Analysis Agent | Automated Statistical Analysis of Agentic Workflow Patterns