Closed
Description
Q Workflow Optimization Report
Issues Found (from live data)
close-old-discussions
- Log Analysis: Run #19622681948 (2025-11-24T03:46:55Z)
- Run URL: https://github.com/githubnext/gh-aw/actions/runs/19622681948
- Issues Identified:
- Excessive MCP calls: Agent made 2+ paginated calls to the `list_discussions` API
- Large data transfer: Each call returned up to 100 discussions, sending potentially large amounts of data to the LLM
- Token inefficiency: Discussion data (titles, bodies, metadata) sent to the LLM even when most discussions don't match the criteria
- Performance: Multiple round-trips to the GitHub API during agent execution
Evidence from logs:
"name": "github-list_discussions",
"arguments": "{\"owner\": \"githubnext\", \"repo\": \"gh-aw\", \"perPage\": 100}"
"name": "github-list_discussions",
"arguments": "{\"owner\": \"githubnext\", \"repo\": \"gh-aw\", \"perPage\": 100, \"after\": \"Y3Vyc29yOnYyOpK0MjAyNS0xMS0xN1QwOToxNjoxOVrOAIuXUw==\"}"
Changes Made
close-old-discussions.md
Added custom step to pre-download and filter discussions:
- GraphQL query: Uses GitHub GraphQL API to fetch discussions in batches of 100
- Server-side filtering: Filters discussions by:
- Author: `github-actions[bot]` only
- Age: Created more than 7 days ago
- Data reduction with jq: Reduces each discussion to only essential fields: `number`, `title`, `createdAt`
- Removes large fields like body, comments, reactions
- JSONL output: Saves filtered data to `/tmp/gh-aw/filtered-discussions.jsonl`
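The field reduction and JSONL output described in this list can be sketched roughly as follows. This is a minimal illustration, not the step from the patch: the sample record and the `to_jsonl` helper name are hypothetical, and only the field names `number`, `title`, and `createdAt` come from this report.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample of one fetched discussion, including bulky fields
# (body, comments) that should not reach the LLM.
sample='[{"number":1,"title":"Weekly report","createdAt":"2025-11-01T00:00:00Z","body":"long text...","comments":{"totalCount":3}}]'

# to_jsonl (hypothetical helper): read a JSON array on stdin, keep only the
# essential fields, and print one compact JSON object per line (JSONL).
to_jsonl() {
  jq -c '.[] | {number, title, createdAt}'
}

printf '%s\n' "$sample" | to_jsonl
```

In the workflow the output would be redirected to `/tmp/gh-aw/filtered-discussions.jsonl`; one object per line lets the agent read records without parsing a large array.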
Updated agent instructions:
- Agent now reads pre-filtered JSONL file instead of calling GitHub API
- No filtering logic needed in agent - data is already filtered
- Simple task: Read file and generate `close_discussion` outputs
Key optimization points:
- Pagination handled upfront: Custom step handles all pagination before agent runs
- Smart filtering with jq: Only necessary fields included in data sent to LLM
- Zero GitHub API calls from agent: All data pre-fetched and filtered
- Reduced token usage: LLM receives only matching discussions with minimal fields
Expected Improvements
Performance Metrics
- API calls reduced: From 2+ paginated calls during agent execution to 0
- Token usage reduced: Estimated 60-80% reduction by:
- Pre-filtering non-matching discussions
- Reducing data to essential fields only
- Eliminating API response overhead from LLM context
- Execution time: Faster agent execution (no API waiting)
- Reliability: More consistent performance regardless of total discussion count
Scalability
- Workflow now handles repositories with 100+ discussions efficiently
- LLM receives manageable data size regardless of total discussions
- Pagination handled once upfront vs multiple times during agent turns
Validation
✅ Workflow compiled successfully using gh aw compile:
✓ .github/workflows/close-old-discussions.md (238.0 KB)
[
{
"workflow": "close-old-discussions.md",
"valid": true,
"errors": [],
"warnings": []
}
]
Note: .lock.yml file will be generated automatically after merge.
Implementation Details
Custom Step Logic
The custom step uses a bash script that:
- Calculates cutoff date (7 days ago) using the `date` command
- Iterates through discussion pages using GraphQL pagination
- Filters discussions inline using jq:
jq -r --arg cutoff "$CUTOFF_DATE" ' .data.repository.discussions.nodes | map(select( .author.login == "github-actions[bot]" and .createdAt < $cutoff )) | map({number, title, createdAt, author: .author.login}) '
- Merges results and removes duplicates
- Outputs JSONL format for easy parsing by agent
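The filter and merge steps listed above might look like the sketch below. The jq filter is the one quoted in this report; the function names (`cutoff_date`, `filter_page`, `merge_pages`), the use of `unique_by(.number)` for deduplication, and the sample page are assumptions for illustration, since the full script is truncated in the patch preview.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Cutoff timestamp 7 days ago in UTC (GNU date syntax, as used in the patch).
cutoff_date() {
  date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ
}

# filter_page CUTOFF: read one GraphQL response on stdin and keep only
# discussions by github-actions[bot] created before CUTOFF.
# ISO-8601 UTC timestamps in the same format compare correctly as strings.
filter_page() {
  local cutoff="$1"
  jq -c --arg cutoff "$cutoff" '
    .data.repository.discussions.nodes
    | map(select(.author.login == "github-actions[bot]" and .createdAt < $cutoff))
    | map({number, title, createdAt, author: .author.login})
  '
}

# merge_pages: read one filtered array per line on stdin, concatenate them,
# and drop duplicates by discussion number (assumed dedup key).
merge_pages() {
  jq -c -s 'add // [] | unique_by(.number)'
}

# Demo with a fixed cutoff and a hypothetical response page.
page='{"data":{"repository":{"discussions":{"nodes":[
  {"number":5,"title":"old bot","createdAt":"2025-11-01T00:00:00Z","author":{"login":"github-actions[bot]"}},
  {"number":6,"title":"old human","createdAt":"2025-11-01T00:00:00Z","author":{"login":"octocat"}}
]}}}}'
printf '%s' "$page" | filter_page "2025-11-17T00:00:00Z" | merge_pages
```

In a real run, `filter_page "$(cutoff_date)"` would replace the fixed cutoff used in the demo.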
Security Considerations
- Uses `${{ github.token }}` with existing permissions (`discussions: read`)
- No elevated permissions required
- GraphQL query is safe and read-only
- Pagination safety limit: 10 pages (1000 discussions max)
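A pagination loop with the 10-page safety limit could be structured along these lines. This is a sketch under assumptions: the patch's actual query is truncated in the preview, so the query shape, the `OWNER`/`REPO` variables, and the `fetch_page`/`paginate` names are illustrative. It passes values as GraphQL variables via `gh api graphql -f`, which avoids interpolating strings into the query text.

```shell
#!/usr/bin/env bash
set -euo pipefail

MAX_PAGES=10   # safety limit from the report: 10 pages x 100 = 1000 discussions

# fetch_page CURSOR: print one GraphQL response as JSON.
# Assumption: OWNER and REPO are exported by the caller.
fetch_page() {
  local cursor="${1:-}"
  local args=(-f owner="${OWNER:?set OWNER}" -f repo="${REPO:?set REPO}")
  if [ -n "$cursor" ]; then
    args+=(-f after="$cursor")
  fi
  gh api graphql "${args[@]}" -f query='
    query($owner: String!, $repo: String!, $after: String) {
      repository(owner: $owner, name: $repo) {
        discussions(first: 100, after: $after) {
          pageInfo { hasNextPage endCursor }
          nodes { number title createdAt author { login } }
        }
      }
    }'
}

# paginate FETCH_CMD: call "FETCH_CMD <cursor>" until hasNextPage is false or
# MAX_PAGES is reached, printing one compact response per line.
paginate() {
  local fetch="$1" cursor="" page=0 result has_next
  while [ "$page" -lt "$MAX_PAGES" ]; do
    result=$("$fetch" "$cursor")
    printf '%s\n' "$result" | jq -c .
    page=$((page + 1))
    has_next=$(printf '%s' "$result" | jq -r '.data.repository.discussions.pageInfo.hasNextPage')
    [ "$has_next" = "true" ] || break
    cursor=$(printf '%s' "$result" | jq -r '.data.repository.discussions.pageInfo.endCursor')
  done
}

# Usage (requires an authenticated gh CLI, so it is not run here):
#   OWNER=githubnext REPO=gh-aw paginate fetch_page > /tmp/gh-aw/pages.jsonl
```

Taking the fetch command as a parameter keeps the loop testable with a stub, and the page counter guarantees termination even if the API keeps reporting a next page.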
References
- Issue: Addresses #4630 ("close discussions" optimization request)
- Original request: Pre-download discussions with smart filtering using jq
- Log analysis: `/tmp/gh-aw/aw-mcp/logs/run-19622681948/`
- Run investigated: https://github.com/githubnext/gh-aw/actions/runs/19622681948
Testing Recommendations
After merge, test the workflow with:
- Manual trigger via `workflow_dispatch`
- Verify discussions are correctly filtered in custom step logs
- Confirm agent successfully reads `/tmp/gh-aw/filtered-discussions.jsonl`
- Check that only matching discussions are closed
- Monitor token usage compared to previous runs
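One way to check the filtered file during testing is a small validation helper like the one below. It is a hypothetical sketch: `check_filtered` is not part of the patch, and it assumes each JSONL record carries a numeric `number` and an ISO-8601 `createdAt`, as described in this report.

```shell
#!/usr/bin/env bash
set -euo pipefail

# check_filtered FILE CUTOFF (hypothetical helper): succeed only if every
# line of FILE is JSON with a numeric "number" and a string "createdAt"
# that is older than CUTOFF. Invalid JSON makes jq fail as well.
check_filtered() {
  local file="$1" cutoff="$2"
  jq -e -s --arg cutoff "$cutoff" '
    all(.[];
        (.number | type) == "number"
        and (.createdAt | type) == "string"
        and .createdAt < $cutoff)
  ' "$file" > /dev/null
}

# Usage on a runner, after the custom step has written the file:
#   check_filtered /tmp/gh-aw/filtered-discussions.jsonl \
#     "$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ)" && echo "filter OK"
```

Comparing `wc -l` on the file with the number of `close_discussion` outputs gives a quick second check that only matching discussions are closed.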
AI generated by Q
Note
This was originally intended as a pull request, but the git push operation failed.
Workflow Run: View run details and download patch artifact
The patch file is available as an artifact (aw.patch) in the workflow run linked above.
To apply the patch locally:
# Download the artifact from the workflow run https://github.com/githubnext/gh-aw/actions/runs/19622754502
# (Use GitHub MCP tools if gh CLI is not available)
gh run download 19622754502 -n aw.patch
# Apply the patch
git am aw.patch
Show patch preview (196 of 196 lines)
From 7ed7c3a19b456a19688d6b34d9232024192c448a Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <github-actions[bot]@users.noreply.github.com>
Date: Mon, 24 Nov 2025 03:56:00 +0000
Subject: [PATCH] Optimize close-old-discussions workflow with pre-filtered
data
- Add custom step to pre-download discussions using GraphQL API
- Filter discussions by author (github-actions[bot]) and age (>7 days) before sending to LLM
- Use jq to reduce data size and prevent overwhelming the LLM
- Remove need for agent to make multiple paginated API calls
- Agent now only reads pre-filtered JSONL file
This addresses issue #4630 by reducing token usage and API calls.
---
.github/workflows/close-old-discussions.md | 146 +++++++++++++++------
1 file changed, 107 insertions(+), 39 deletions(-)
diff --git a/.github/workflows/close-old-discussions.md b/.github/workflows/close-old-discussions.md
index 3f428a8..c074db8 100644
--- a/.github/workflows/close-old-discussions.md
+++ b/.github/workflows/close-old-discussions.md
@@ -19,63 +19,131 @@ safe-outputs:
max: 100
timeout-minutes: 10
strict: true
+steps:
+ - name: Fetch open discussions
+ id: fetch-discussions
+ run: |
+ # Use GraphQL to fetch all open discussions in one query
+ # Filter to only get discussions created by github-actions[bot]
+ # Calculate cutoff date (7 days ago)
+ CUTOFF_DATE=$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ)
+
+ # Fetch discussions with pagination
+ DISCUSSIONS_FILE="/tmp/gh-aw/discussions.json"
+ echo '[]' > "$DISCUSSIONS_FILE"
+
+ CURSOR=""
+ HAS_NEXT_PAGE=true
+
+ while [ "$HAS_NEXT_PAGE" = "true" ]; do
+ if [ -z "$CURSOR" ]; then
+ CURSOR_ARG=""
+ else
+ CURSOR_ARG=", after: \"$CURSOR\""
+ fi
+
+ RESULT=$(gh api graphql -f query="
+ query {
+ repository(owner: \"${{ github.repository_owner }}\", name: \"${{ github.event.repository.name }}\") {
+
... (truncated)