[WIP] Fix Daily Model Inventory Checker Copilot CLI silent startup crash#30052

Closed
Copilot wants to merge 1 commit into main from copilot/fix-copilot-cli-startup-crash


Conversation

Contributor

Copilot AI commented May 4, 2026

Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress.


This section describes the original issue you should resolve.

<issue_title>[aw-failures] Daily Model Inventory Checker: Copilot CLI silent startup crash (exit code 1)</issue_title>
<issue_description>### Problem Statement

The Daily Model Inventory Checker workflow fails with a 100% failure rate over the last 6 hours. The Copilot CLI agent binary exits silently with code 1 — producing zero stdout and zero stderr — preventing the agent from executing any logic.

Parent report: #30042

Affected Runs

| Run | Trigger | Conclusion | Duration |
| --- | --- | --- | --- |
| §25294739769 | schedule (main) | failure | 2.2m |
| §25294350506 | workflow_dispatch (main) | failure | 2.1m |

Both failures are on main branch. The pattern is consistent across different trigger types, indicating the issue is environment/config-level, not trigger-specific.

Symptoms

  • All data-collection jobs succeed: collect_anthropic_models, collect_openai_models, collect_gemini_models, collect_copilot_models all exit cleanly
  • models.json artifact: 54 bytes (effectively empty — no model data written)
  • Agent job: 0 turns, 0 tool calls, 0 tokens consumed
  • Copilot CLI binary starts (PID assigned) then immediately exits with code 1
  • Harness message: "no output produced — not retrying (possible causes: binary not found, permission denied, auth failure, or silent startup crash)"
  • Network: 2 requests to api.githubcopilot.com:443 (allowed), 0 blocked — no substantive API calls
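
The "exit code 1 with zero stdout and zero stderr" condition described above can be detected mechanically. The following is a minimal Python sketch, not the harness's actual implementation: `StartupResult` and `run_and_classify` are hypothetical names, and a child Python process that exits with code 1 stands in for the Copilot CLI binary.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class StartupResult:
    """Exit status and captured output of one process launch (hypothetical type)."""
    exit_code: int
    stdout: str
    stderr: str

    @property
    def is_silent_crash(self) -> bool:
        # Non-zero exit with nothing written to either stream.
        return (
            self.exit_code != 0
            and not self.stdout.strip()
            and not self.stderr.strip()
        )


def run_and_classify(cmd: list[str], timeout: float = 60.0) -> StartupResult:
    """Run cmd, capture both streams as text, and return a classifiable result."""
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return StartupResult(proc.returncode, proc.stdout, proc.stderr)


if __name__ == "__main__":
    # Stand-in for the agent binary: exits with code 1 and prints nothing.
    result = run_and_classify([sys.executable, "-c", "import sys; sys.exit(1)"])
    if result.is_silent_crash:
        print(f"silent startup crash: exit code {result.exit_code}, no output")
```

Separating the classification from the launch keeps the check testable without the real binary. A failing process that does write to stderr is deliberately not flagged, since its message already explains the failure.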

Probable Root Cause

The harness message names several possible causes for a zero-output exit with code 1. Combined with the symptoms above, four candidates stand out:

  1. Auth failure — Copilot CLI token expired or invalid (most likely given api.githubcopilot.com was reached but no calls were made)
  2. Binary incompatibility — Copilot CLI v1.0.40 may have a runtime dependency issue in the current container image
  3. Empty input — the 54-byte models.json (empty) may cause the agent's prompt-injection step to abort before generating output
  4. Permission denied — less likely since PID was assigned

The most actionable hypothesis is Copilot CLI auth token expiry — the binary connects to the Copilot API endpoint but fails to authenticate before producing any output.

Proposed Remediation

  1. Verify Copilot CLI credentials: Check that the Copilot auth token used in the scheduled workflow is valid and not expired. Rotate if necessary.
  2. Add startup health check: Run copilot auth status or equivalent as a pre-agent step to surface auth failures with a clear error message before the agent job.
  3. Diagnose empty models.json: Add validation to data-collection jobs to fail loudly if models.json is < 100 bytes, rather than silently passing an empty artifact to the agent.
  4. Enable harness retry for startup crashes: Consider adding one retry with a forced re-auth step when exit code 1 + zero output is detected.
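
Remediation item 3 could look roughly like the Python sketch below. It is an illustration under assumptions: `validate_inventory` and `main` are hypothetical names, the 100-byte threshold comes from the item above, and because the real `models.json` schema is not shown in this issue, the content check only rejects an empty top-level object or array. The `::error::` prefix is the GitHub Actions workflow command for error annotations.

```python
import json
import os

MIN_BYTES = 100  # threshold taken from the remediation list


def validate_inventory(path: str, min_bytes: int = MIN_BYTES) -> list[str]:
    """Return the problems found in the artifact; an empty list means it passed."""
    if not os.path.isfile(path):
        return [f"{path}: file not found"]

    problems = []
    size = os.path.getsize(path)
    if size < min_bytes:
        problems.append(f"{path}: {size} bytes is below the {min_bytes}-byte minimum")

    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        problems.append(f"{path}: not valid JSON ({exc})")
        return problems

    # Schema is unknown here, so only an empty top-level value is rejected.
    if not data:
        problems.append(f"{path}: parsed JSON is empty")
    return problems


def main(argv: list[str]) -> int:
    """Print one error annotation per problem and return a process exit code."""
    target = argv[0] if argv else "models.json"
    issues = validate_inventory(target)
    for issue in issues:
        print(f"::error::{issue}")
    return 1 if issues else 0
```

A data-collection job could end with `raise SystemExit(main(sys.argv[1:]))` (after `import sys`), so that a 54-byte artifact fails that job instead of reaching the agent.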

Success Criteria

  • Daily Model Inventory Checker completes at least one full successful run on schedule
  • models.json contains non-empty model inventory data
  • Agent produces > 0 turns and creates an output artifact
  • Root cause (auth vs binary vs empty input) confirmed from diagnostics
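
The first three criteria are mechanical enough to check in code. The sketch below assumes a hypothetical `RunSummary` record, since the harness's real report format is not shown in this issue, and it reuses the 100-byte threshold from the remediation list as a stand-in for "non-empty". The fourth criterion, confirming the root cause, still needs human judgment.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical per-run record; field names are assumptions for illustration."""
    conclusion: str
    agent_turns: int
    output_artifact_created: bool
    models_json_bytes: int


def unmet_criteria(run: RunSummary, min_bytes: int = 100) -> list[str]:
    """Return the success criteria this run fails; an empty list means all pass."""
    unmet = []
    if run.conclusion != "success":
        unmet.append(f"run concluded with '{run.conclusion}'")
    if run.models_json_bytes < min_bytes:
        unmet.append(f"models.json is only {run.models_json_bytes} bytes")
    if run.agent_turns <= 0:
        unmet.append("agent produced 0 turns")
    if not run.output_artifact_created:
        unmet.append("no output artifact was created")
    return unmet


if __name__ == "__main__":
    # Values mirror the reported symptoms; the artifact flag is assumed False.
    failing = RunSummary("failure", 0, False, 54)
    for reason in unmet_criteria(failing):
        print(reason)
```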

Generated by [aw] Failure Investigator (6h) · ● 594.5K ·

  • expires on May 11, 2026, 1:33 AM UTC

Comments on the Issue (you are @copilot in this section)



Development

Successfully merging this pull request may close these issues.

[aw-failures] Daily Model Inventory Checker: Copilot CLI silent startup crash (exit code 1)

3 participants