
fix: count caps, trim-by-date, upload-asset limit, & timeout#28916

Closed
Copilot wants to merge 2 commits into main from copilot/deep-dive-agentic-optimization-kit

Conversation

Contributor

Copilot AI commented Apr 28, 2026

Eight correctness bugs across agentic-optimization-kit.md, copilot-token-optimizer.md, and agentic-observability-kit.md — silent data truncation, a cooldown window that breaks under load, and a Monday schedule collision causing concurrent MCP rate-limit pressure.

agentic-optimization-kit.md

  • timeout 35 → 50 min
  • upload-asset declared (max: 5, allowed-exts: [.png]); Phase 4 uploads 5 charts but gateway limit was absent — silent drop on the 5th
  • Phase 3 count 400 → 1000; 400 runs exhaust in ~6 days on an active repo
  • Hardcoded example in optimization-log entry replaced with <TODAY> / <WORKFLOW_NAME> / <TOTAL_TOKENS> / etc. placeholders — the literal 2026-04-28 / daily-observability-report was today's date and copyable verbatim
  • Trim by date, not count: "keep only last 30 entries" → keep entries newer than 30 days; count-based trimming silently evicts entries still inside the 14-day cooldown window
  • Phase timing checkpoints: date logging at phase start/end to get empirical wall-clock data before any future timeout tuning
  • Jaccard — "Jaccard or cosine similarity" → "Jaccard similarity on token sets" for workflow_overlap_score
  • gh api example added for reading workflow source files:

    ```bash
    gh api repos/{owner}/{repo}/contents/.github/workflows/WORKFLOW_NAME.md \
      --jq '.content' | base64 -d
    ```
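The date-window trim can be sketched with jq. The entry shape below (a JSON array with an ISO `date` field) is an invented stand-in for illustration, not the kit's actual log format, and GNU `date -d` is assumed for building the toy input:

```shell
# Toy optimization-log: one entry 5 days old (inside the window), one 45 days
# old (outside). GNU date's -d flag is assumed, as on Linux CI runners.
RECENT=$(date -u -d '5 days ago' +%Y-%m-%d)
OLD=$(date -u -d '45 days ago' +%Y-%m-%d)
LOG=$(jq -n --arg r "$RECENT" --arg o "$OLD" \
  '[{date:$r, note:"inside window"}, {date:$o, note:"outside window"}]')

# Keep entries newer than 30 days instead of "last 30 entries": parse each
# entry date to epoch seconds and compare against a 30-day cutoff.
TRIMMED=$(echo "$LOG" | jq '[ .[]
  | select((.date | strptime("%Y-%m-%d") | mktime) > (now - 30*86400)) ]')
echo "$TRIMMED" | jq -c .
```

A count-based cap can evict a recent entry whenever the log grows quickly; the date predicate guarantees anything written inside the 14-day cooldown window survives the trim.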

copilot-token-optimizer.md

  • jq null guard: map(.tokens) | add returns null on empty input; added // 0 after every add in the pre-aggregation script (AOK already had this fix)
  • Fetch count -c 50 → -c 200; 50 runs cover only ~3 days, biasing target selection
  • Trim by date — same 30-day date-window fix as AOK
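The null propagation is easy to reproduce with a minimal stand-in for the runs JSON (the field names mirror the description above, but the input is invented):

```shell
# With zero matching runs the mapped array is empty, and jq's `add` on an
# empty array yields null, which then poisons arithmetic like `add / length`.
# The `// 0` alternative operator substitutes 0 when the left side is null.
EMPTY='{"runs":[]}'
UNGUARDED=$(echo "$EMPTY" | jq '[.runs[].tokens] | add')
GUARDED=$(echo "$EMPTY" | jq '[.runs[].tokens] | add // 0')
echo "unguarded=$UNGUARDED guarded=$GUARDED"   # unguarded=null guarded=0
```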

agentic-observability-kit.md

  • Schedule weekly on monday → weekly on tuesday — eliminates the concurrent Monday MCP logs call collision with agentic-optimization-kit at zero structural cost

- agentic-optimization-kit: raise timeout 35→50 min
- agentic-optimization-kit: add upload-asset max:5 allowed-exts:[.png] to safe-outputs
- agentic-optimization-kit: Phase 3 logs count 400→1000
- agentic-optimization-kit: replace hardcoded date/workflow in optimization-log example with placeholders
- agentic-optimization-kit: trim optimization-log by 30-day date window instead of 30-entry count
- agentic-optimization-kit: add Phase Timing Checkpoints section
- agentic-optimization-kit: specify Jaccard similarity (not "Jaccard or cosine")
- agentic-optimization-kit: add gh api example for reading workflow source
- copilot-token-optimizer: add // 0 null guard after jq add in pre-aggregation
- copilot-token-optimizer: fetch count -c 50 → -c 200
- copilot-token-optimizer: trim optimization-log by 30-day date window instead of count
- agentic-observability-kit: schedule monday → tuesday to stagger MCP rate-limit load
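The Jaccard choice pinned down above can be sketched in jq; the two token sets are invented examples, and jq's array subtraction supplies the intersection:

```shell
# Jaccard similarity on token sets: |A ∩ B| / |A ∪ B|.
A='["fetch","logs","report"]'
B='["fetch","logs","upload"]'
SCORE=$(jq -n --argjson a "$A" --argjson b "$B" '
  (($a - ($a - $b)) | length) as $inter      # intersection: A minus (A minus B)
  | (($a + $b) | unique | length) as $union  # union of both token sets
  | $inter / $union')
echo "$SCORE"   # 0.5
```

Unlike cosine similarity over embeddings, a plain set measure needs no model call and is deterministic across runs, a reasonable property for workflow_overlap_score.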

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d5b6bfb2-47bf-4aa2-a24c-223cd4459d6a

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
@mnkiefer mnkiefer marked this pull request as ready for review April 28, 2026 11:01
Copilot AI review requested due to automatic review settings April 28, 2026 11:01
@mnkiefer mnkiefer changed the title Fix AOK bugs: jq null propagation, count caps, hardcoded example, trim-by-date, upload-asset limit, timeout, and schedule collision fix: count caps, trim-by-date, upload-asset limit, & timeout Apr 28, 2026
@github-actions github-actions Bot mentioned this pull request Apr 28, 2026
Contributor

Copilot AI left a comment


Pull request overview

Fixes multiple correctness and operational issues in the Agentic Optimization/Observability kits and the Copilot token optimizer workflows (null-safe jq aggregation, higher run caps, safer examples/log retention guidance, asset upload limits, timeout tuning, and schedule deconfliction).

Changes:

  • Increased gh aw logs fetch limits and made jq aggregations null-safe (add // 0) for Copilot token optimizer.
  • Updated Agentic Optimization Kit guardrails: timeout, upload-asset constraints, larger Phase 3 count, placeholder-based examples, date-window trimming guidance, and added phase timing checkpoints + workflow-source fetch example.
  • Moved Agentic Observability Kit weekly schedule from Monday to Tuesday to avoid collisions.
| File | Description |
| --- | --- |
| .github/workflows/copilot-token-optimizer.md | Raises audit run cap, hardens jq aggregation, updates optimization-log retention guidance |
| .github/workflows/copilot-token-optimizer.lock.yml | Regenerated workflow lock with updated log cap and jq aggregation changes |
| .github/workflows/agentic-optimization-kit.md | Adds upload-asset limit + timeout increase, doc/guardrail refinements, phase timing guidance, placeholders |
| .github/workflows/agentic-optimization-kit.lock.yml | Regenerated lock reflecting timeout and upload-asset restriction changes |
| .github/workflows/agentic-observability-kit.md | Shifts weekly schedule to Tuesday |
| .github/workflows/agentic-observability-kit.lock.yml | Regenerated lock reflecting new Tuesday cron schedule |

Copilot's findings


  • Files reviewed: 6/6 changed files
  • Comments generated: 4

Comment on lines +226 to +229
To fetch a workflow file, run (replacing `WORKFLOW_NAME` with the actual filename without extension):

```bash
gh api repos/{owner}/{repo}/contents/.github/workflows/WORKFLOW_NAME.md \
```

Copilot AI Apr 28, 2026


The gh api repos/{owner}/{repo}/... example won’t work when copy/pasted unless {owner} and {repo} are also replaced. Consider either explicitly calling out that all placeholders must be replaced, or use an example that derives the repo from context (e.g., repos/$GITHUB_REPOSITORY/...) so only WORKFLOW_NAME needs editing.

Suggested change

Before:

> To fetch a workflow file, run (replacing `WORKFLOW_NAME` with the actual filename without extension):

```bash
gh api repos/{owner}/{repo}/contents/.github/workflows/WORKFLOW_NAME.md \
```

After:

> To fetch a workflow file, run (replacing `WORKFLOW_NAME` with the actual filename without extension; the repository is derived from `$GITHUB_REPOSITORY`):

```bash
gh api "repos/$GITHUB_REPOSITORY/contents/.github/workflows/WORKFLOW_NAME.md" \
```

Comment on lines 83 to 86
```jq
]
| group_by(.workflow_name)
| map({
workflow_name: .[0].workflow_name,
```

Copilot AI Apr 28, 2026


jq's group_by(.workflow_name) requires the input array to be sorted by the grouping key; otherwise runs with the same workflow name that are non-adjacent will end up in different groups and totals/averages will be wrong. Add a sort_by(.workflow_name) (or equivalent) immediately before group_by(.workflow_name) in this pipeline (as done in agentic-optimization-kit.md).

See below for a potential fix:

```jq
          ]
          | sort_by(.workflow_name)
```

run: "set -euo pipefail\nmkdir -p /tmp/gh-aw/token-audit\n\necho \"📥 Downloading Copilot workflow logs (last 7 days)...\"\n\nLOGS_EXIT=0\ngh aw logs \\\n --engine copilot \\\n --start-date -7d \\\n --json \\\n -c 200 \\\n > /tmp/gh-aw/token-audit/all-runs.json || LOGS_EXIT=$?\n\nif [ -s /tmp/gh-aw/token-audit/all-runs.json ]; then\n TOTAL=$(jq '.runs | length' /tmp/gh-aw/token-audit/all-runs.json)\n echo \"✅ Downloaded $TOTAL Copilot workflow runs (last 7 days)\"\n if [ \"$LOGS_EXIT\" -ne 0 ]; then\n echo \"⚠️ gh aw logs exited with code $LOGS_EXIT (partial results — likely API rate limit)\"\n fi\nelse\n echo \"❌ No log data downloaded (exit code $LOGS_EXIT)\"\n echo '{\"runs\":[],\"summary\":{}}' > /tmp/gh-aw/token-audit/all-runs.json\nfi\n"
- name: Pre-aggregate top workflows by token usage
run: "set -euo pipefail\nmkdir -p /tmp/gh-aw/token-audit\n\njq '{\n generated_at: (now | todateiso8601),\n window_days: 7,\n top_workflows: (\n [.runs[]\n | select(.status == \"completed\")\n | {\n workflow_name: .workflow_name,\n tokens: (.token_usage // 0),\n cost: (.estimated_cost // 0),\n turns: (.turns // 0),\n action_minutes: (.action_minutes // 0)\n }\n ]\n | group_by(.workflow_name)\n | map({\n workflow_name: .[0].workflow_name,\n run_count: length,\n total_tokens: (map(.tokens) | add),\n avg_tokens: ((map(.tokens) | add) / length),\n total_cost: (map(.cost) | add),\n total_turns: (map(.turns) | add),\n total_action_minutes: (map(.action_minutes) | add)\n })\n | sort_by(.total_tokens)\n | reverse\n | .[:10]\n )\n}' /tmp/gh-aw/token-audit/all-runs.json > /tmp/gh-aw/token-audit/top-workflows.json\n\necho \"✅ Generated top workflow summary at /tmp/gh-aw/token-audit/top-workflows.json\"\njq '.top_workflows' /tmp/gh-aw/token-audit/top-workflows.json\n"
run: "set -euo pipefail\nmkdir -p /tmp/gh-aw/token-audit\n\njq '{\n generated_at: (now | todateiso8601),\n window_days: 7,\n top_workflows: (\n [.runs[]\n | select(.status == \"completed\")\n | {\n workflow_name: .workflow_name,\n tokens: (.token_usage // 0),\n cost: (.estimated_cost // 0),\n turns: (.turns // 0),\n action_minutes: (.action_minutes // 0)\n }\n ]\n | group_by(.workflow_name)\n | map({\n workflow_name: .[0].workflow_name,\n run_count: length,\n total_tokens: (map(.tokens) | add // 0),\n avg_tokens: ((map(.tokens) | add // 0) / length),\n total_cost: (map(.cost) | add // 0),\n total_turns: (map(.turns) | add // 0),\n total_action_minutes: (map(.action_minutes) | add // 0)\n })\n | sort_by(.total_tokens)\n | reverse\n | .[:10]\n )\n}' /tmp/gh-aw/token-audit/all-runs.json > /tmp/gh-aw/token-audit/top-workflows.json\n\necho \"✅ Generated top workflow summary at /tmp/gh-aw/token-audit/top-workflows.json\"\njq '.top_workflows' /tmp/gh-aw/token-audit/top-workflows.json\n"

Copilot AI Apr 28, 2026


The pre-aggregation jq snippet uses group_by(.workflow_name) without first sorting by workflow_name. In jq, group_by only groups adjacent items, so totals can be incorrect unless the array is sorted by the same key first. Insert | sort_by(.workflow_name) before | group_by(.workflow_name) inside this jq program.

Suggested change
run: "set -euo pipefail\nmkdir -p /tmp/gh-aw/token-audit\n\njq '{\n generated_at: (now | todateiso8601),\n window_days: 7,\n top_workflows: (\n [.runs[]\n | select(.status == \"completed\")\n | {\n workflow_name: .workflow_name,\n tokens: (.token_usage // 0),\n cost: (.estimated_cost // 0),\n turns: (.turns // 0),\n action_minutes: (.action_minutes // 0)\n }\n ]\n | group_by(.workflow_name)\n | map({\n workflow_name: .[0].workflow_name,\n run_count: length,\n total_tokens: (map(.tokens) | add // 0),\n avg_tokens: ((map(.tokens) | add // 0) / length),\n total_cost: (map(.cost) | add // 0),\n total_turns: (map(.turns) | add // 0),\n total_action_minutes: (map(.action_minutes) | add // 0)\n })\n | sort_by(.total_tokens)\n | reverse\n | .[:10]\n )\n}' /tmp/gh-aw/token-audit/all-runs.json > /tmp/gh-aw/token-audit/top-workflows.json\n\necho \"✅ Generated top workflow summary at /tmp/gh-aw/token-audit/top-workflows.json\"\njq '.top_workflows' /tmp/gh-aw/token-audit/top-workflows.json\n"
run: "set -euo pipefail\nmkdir -p /tmp/gh-aw/token-audit\n\njq '{\n generated_at: (now | todateiso8601),\n window_days: 7,\n top_workflows: (\n [.runs[]\n | select(.status == \"completed\")\n | {\n workflow_name: .workflow_name,\n tokens: (.token_usage // 0),\n cost: (.estimated_cost // 0),\n turns: (.turns // 0),\n action_minutes: (.action_minutes // 0)\n }\n ]\n | sort_by(.workflow_name)\n | group_by(.workflow_name)\n | map({\n workflow_name: .[0].workflow_name,\n run_count: length,\n total_tokens: (map(.tokens) | add // 0),\n avg_tokens: ((map(.tokens) | add // 0) / length),\n total_cost: (map(.cost) | add // 0),\n total_turns: (map(.turns) | add // 0),\n total_action_minutes: (map(.action_minutes) | add // 0)\n })\n | sort_by(.total_tokens)\n | reverse\n | .[:10]\n )\n}' /tmp/gh-aw/token-audit/all-runs.json > /tmp/gh-aw/token-audit/top-workflows.json\n\necho \"✅ Generated top workflow summary at /tmp/gh-aw/token-audit/top-workflows.json\"\njq '.top_workflows' /tmp/gh-aw/token-audit/top-workflows.json\n"


## Phase Timing Checkpoints

At the start of every phase, log a timestamp with `echo "⏱️ Phase N start: $(date -u +%H:%M:%SZ)"`. At the end of each phase, log elapsed time with `echo "✅ Phase N complete: $(date -u +%H:%M:%SZ)"`. This helps identify which phases consume the most wall-clock time so future timeouts can be sized accurately.

Copilot AI Apr 28, 2026


This section says to "log elapsed time" at phase end, but the suggested command logs only another timestamp (no duration). Either adjust the wording to “log a timestamp” or change the guidance to actually compute and print elapsed time (e.g., capture a start epoch/SECONDS and subtract at phase end).

Suggested change
At the start of every phase, log a timestamp with `echo "⏱️ Phase N start: $(date -u +%H:%M:%SZ)"`. At the end of each phase, log elapsed time with `echo "✅ Phase N complete: $(date -u +%H:%M:%SZ)"`. This helps identify which phases consume the most wall-clock time so future timeouts can be sized accurately.
At the start of every phase, log a timestamp with `echo "⏱️ Phase N start: $(date -u +%H:%M:%SZ)"`. At the end of each phase, log an end timestamp with `echo "✅ Phase N complete: $(date -u +%H:%M:%SZ)"`. These checkpoints help identify which phases consume the most wall-clock time by comparing start and end times so future timeouts can be sized accurately.

