
[aw-failures] [aw] Fix aw-portfolio-yield: placeholder npm package + empty OTel endpoint break compile and runtime #31456

@github-actions

Problem

.github/workflows/aw-portfolio-yield.md (introduced by #31363 on 2026-05-11 ~05:35 UTC) imports .github/workflows/shared/otel-observability.md, which contains:

  • a placeholder npm package @your-org/otel-query-mcp that does not exist on the npm registry; and
  • an unconditional gateway.opentelemetry.endpoint set to ${{ secrets.OTLP_ENDPOINT }}, which resolves to the empty string in this repo (the secret is unset).

This causes two separate downstream failures:

  1. Compile-time: Every run of the Agentic Maintenance workflow's compile-workflows job now fails with:

    .github/workflows/aw-portfolio-yield.md:1:1: error: runtime package validation failed
      npx package '`@your-org/otel-query-mcp`' not found on npm registry: npm error code E404
    ✗ compilation failed
    ##[error]Process completed with exit code 1.
    

    Reproduced in §25657460521 and §25654651003; the last green Agentic Maintenance run, §25653034240, was at 05:58 UTC, 5 minutes before #31363 (Add Agentic Workflow Portfolio Yield workflow) was pushed.

  2. Runtime: Agentic Workflow Portfolio Yield itself can't start because the MCP Gateway (v0.3.6) rejects its config:

    config:validation_schema Schema validation failed:
      jsonschema: '/gateway/opentelemetry/endpoint' does not validate with .../endpoint/minLength: length must be >= 1, but got 0
      Error: does not match pattern '^((redacted)+|\${[A-Za-z_][A-Za-z0-9_]*})$'
    failed to load config: Configuration validation error (MCP Gateway version: v0.3.6)
    ##[error]Process completed with exit code 1.
    

    Reproduced in §25654663141. The auto-filed issue #31439 ([aw] Agentic Workflow Portfolio Yield failed) tracks the symptom; this issue tracks the root cause and fix.
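The runtime failure comes from the unset secret expanding to an empty string before the gateway validates its config. The Python sketch below illustrates that failure mode under simplifying assumptions: it models only the minLength rule quoted in the log above, not the full schema, and `build_gateway_config` and `validate_endpoint` are hypothetical names rather than part of MCP Gateway.

```python
def build_gateway_config(secrets):
    """Mimic template expansion: an unset secret becomes an empty string."""
    endpoint = secrets.get("OTLP_ENDPOINT", "")
    return {"gateway": {"opentelemetry": {"endpoint": endpoint}}}


def validate_endpoint(config):
    """Return a list of error strings.

    Only the `minLength >= 1` rule from the log is modelled here; the real
    schema also applies a pattern check that this sketch does not reproduce.
    """
    endpoint = config["gateway"]["opentelemetry"]["endpoint"]
    errors = []
    if len(endpoint) < 1:
        errors.append(
            "/gateway/opentelemetry/endpoint: length must be >= 1, but got 0"
        )
    return errors


# Secret unset: the simplified check rejects the config, as the gateway did.
print(validate_endpoint(build_gateway_config({})))

# Secret set to any non-empty value: the simplified check passes.
print(validate_endpoint(build_gateway_config({"OTLP_ENDPOINT": "placeholder"})))
```

The point of the sketch is that the config is structurally complete but carries an empty value, so omitting the block entirely when the secret is unset avoids the rejection.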

Affected workflows

  • Agentic Maintenance: the compile-workflows job fails and every job downstream of it is skipped on every run. The blast radius is high: maintenance includes cleanup-cache-memory, close-expired-entities, update_pull_request_branches, apply_safe_outputs, and close_agentic_workflows_issues, none of which are currently running.
  • Agentic Workflow Portfolio Yield: 100% failure rate (the run never reaches the agent step).
  • Any other workflow that imports shared/otel-observability.md is exposed to the same failures.

Affected run IDs

  • §25657460521 — Agentic Maintenance (schedule, 07:52 UTC)
  • §25654651003 — Agentic Maintenance (push, 06:44 UTC)
  • §25654663141 — Agentic Workflow Portfolio Yield (workflow_dispatch, 06:44 UTC)

Probable root cause

PR #31363 was authored against a template/example version of shared/otel-observability.md whose values (@your-org/otel-query-mcp, a hard-coded OTLP_ENDPOINT reference) were never replaced with real ones before merge. Compile-time package validation did not catch this at merge time: the lock file for aw-portfolio-yield.md was already present in the merge commit, so the workflow was not recompiled until the compile-workflows regeneration ran in the next Agentic Maintenance run.

Proposed remediation

Pick whichever scope matches intent:

  1. Minimum fix (P0) — Edit .github/workflows/shared/otel-observability.md:

    • Remove the mcp-servers.otel block, or replace @your-org/otel-query-mcp with the real published MCP server package name.
    • Wrap or remove the observability.otlp.endpoint block so the gateway config is omitted when OTLP_ENDPOINT is unset (or write a literal (redacted) URL).
    • Run gh aw compile .github/workflows/aw-portfolio-yield.md locally and confirm 0 errors before committing.
  2. Defense in depth (P1) — Add a CI guard so any new shared file referencing an unpublished npm package is rejected at PR time, not at the next scheduled Agentic Maintenance. The Agentic Maintenance compile-workflows job already does this validation; gating PR merges on it would have caught Add Agentic Workflow Portfolio Yield workflow #31363 before merge.
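One possible shape for the P1 guard is sketched below in Python. It is an illustration under stated assumptions, not the project's actual check: `npm_package_exists`, `find_unpublished`, and `main` are hypothetical names, and how package names are extracted from shared workflow files is left to the caller. The registry lookup shells out to `npm view <name> name`, which exits non-zero when the package is not found.

```python
import subprocess


def npm_package_exists(name, run=subprocess.run):
    """Return True if `npm view <name> name` exits 0.

    `run` is injectable so the lookup can be faked in tests without
    network access or an npm installation.
    """
    result = run(["npm", "view", name, "name"], capture_output=True, text=True)
    return result.returncode == 0


def find_unpublished(packages, exists=npm_package_exists):
    """Return the packages for which the existence check fails, in order."""
    return [name for name in packages if not exists(name)]


def main(argv, exists=npm_package_exists):
    """Print each unpublished package and return a process exit code."""
    missing = find_unpublished(argv, exists=exists)
    for name in missing:
        print(f"unpublished npm package: {name}")
    return 1 if missing else 0
```

A PR-time job could call `sys.exit(main(sys.argv[1:]))` with the package names referenced by the changed files, so a placeholder such as @your-org/otel-query-mcp fails the check before merge instead of at the next maintenance run.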

Success criteria / verification

  • gh aw compile .github/workflows/aw-portfolio-yield.md exits 0 with no runtime.packages errors.
  • gh aw compile over the whole repo produces ✓ Compiled N workflow(s): 0 error(s) again.
  • A fresh workflow_dispatch of Agentic Workflow Portfolio Yield advances past the MCP Gateway startup step (gateway logs do not contain Configuration validation error).
  • The next scheduled Agentic Maintenance run reports compile-workflows success.
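The log-based criteria above could be checked mechanically. The Python sketch below assumes the compile summary line and the gateway error string appear exactly as quoted in this issue; the real output may differ, and `compile_succeeded` and `gateway_started_cleanly` are hypothetical helper names.

```python
import re

# Summary line format taken from the success criteria above (assumed stable).
COMPILE_OK = re.compile(r"Compiled \d+ workflow\(s\): 0 error\(s\)")

# Error string taken from the gateway log quoted in this issue.
GATEWAY_ERROR = "Configuration validation error"


def compile_succeeded(compile_output):
    """True if the compile output reports a summary with zero errors."""
    return COMPILE_OK.search(compile_output) is not None


def gateway_started_cleanly(gateway_log):
    """True if the gateway log contains no configuration validation error."""
    return GATEWAY_ERROR not in gateway_log
```

Both helpers take captured text rather than running any command, so they can be applied to saved logs from a workflow_dispatch run.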

References

Related to #30961

Generated by [aw] Failure Investigator (6h) · ● 39.5M ·

  • expires on May 18, 2026, 8:17 AM UTC
