Skip to content

fix(app-platform): deployment-level component enum + visible log-fetch errors in Diagnostic#51

Merged
intel352 merged 1 commit intomainfrom
fix/component-enum-and-visible-log-errors
May 2, 2026
Merged

fix(app-platform): deployment-level component enum + visible log-fetch errors in Diagnostic#51
intel352 merged 1 commit intomainfrom
fix/component-enum-and-visible-log-errors

Conversation

@intel352
Copy link
Copy Markdown
Contributor

@intel352 intel352 commented May 2, 2026

Summary

Two issues with PR-E2's deploy-log fetch surfaced during real testing on core-dump staging (run 25259980715):

  1. deploymentComponents read from dep.Spec.<...> but dep.Spec is nil from ListDeployments. Falling through to [""] aggregate caused DO API to return no logs.
  2. GetLogs/HTTP-fetch failures wrote to os.Stderr which hashicorp/go-plugin captures at TRACE level — wfctl filters those, so operators saw an empty Troubleshoot block with no failure cause.

Fix

  • deploymentComponents now prefers dep.Services / .StaticSites / .Workers / .Jobs / .Functions (deployment-level arrays populated by both ListDeployments and GetDeployment), falls back to dep.Spec.*, then [""]. StaticSites included at deployment level since DeploymentStaticSite has a .Name field and can have build-time failures.
  • All four log-fetch failure modes (GetLogs API error, HTTP-fetch error, empty HistoricURLs, empty body) now append a brief failure note to Diagnostic.Detail so the failure cause shows up in operator-facing output. The existing stderr write is preserved for plugin-debug.

Test plan

  • go test ./internal/drivers/ — 9 new tests: 4 for deployment-level vs spec-level vs aggregate enumeration (including StaticSites) + 5 for failure-visibility paths (GetLogs error, empty HistoricURLs, HTTP 500, empty body, happy-path)
  • Updated 2 existing tests (GracefulOnGetLogsError, GracefulOnHTTPFetchError) whose "Detail should not contain log block" assertions inverted to "Detail should contain failure note"
  • go test ./... — full suite green
  • Validates against next core-dump staging deploy

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

…h errors in Diagnostic

Issue A: deploymentComponents read only from dep.Spec.* which is nil from
ListDeployments, causing the empty-aggregate fallback and DO API returning no
logs. Now prefers dep.Services/.StaticSites/.Workers/.Jobs/.Functions
(deployment-level, always populated), falls back to dep.Spec.*, then [""].

Issue B: GetLogs/HTTP-fetch failures wrote to os.Stderr captured at TRACE
level by hashicorp/go-plugin — invisible to operators. All four failure modes
(GetLogs API error, HTTP-fetch error, empty HistoricURLs, empty body) now
also append a brief failure note to Diagnostic.Detail.

Surfaced by core-dump staging deploy run 25259980715.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 2, 2026 19:56
@intel352 intel352 merged commit 66a6b71 into main May 2, 2026
5 checks passed
@intel352 intel352 deleted the fix/component-enum-and-visible-log-errors branch May 2, 2026 19:57
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes App Platform Troubleshoot deploy/build log collection by (1) enumerating components from deployment-level fields when dep.Spec is nil (common from ListDeployments) and (2) surfacing log-fetch failure reasons directly in operator-facing Diagnostic.Detail (while preserving stderr logging for plugin-debug).

Changes:

  • Update deploymentComponents to prefer deployment-level component lists (incl. StaticSites), then fall back to dep.Spec.*, then aggregate [""].
  • Make attachDeployLogs append brief, visible failure notes to Diagnostic.Detail for GetLogs errors, empty HistoricURLs, HTTP fetch errors, and empty bodies.
  • Add/adjust tests to cover enumeration precedence and the new failure-visibility behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
internal/drivers/app_platform.go Implements deployment-level component enumeration and appends visible failure notes in Troubleshoot log-fetch paths.
internal/drivers/app_platform_troubleshoot_test.go Adds focused unit tests for deploymentComponents and attachDeployLogs failure/happy paths.
internal/drivers/app_platform_test.go Updates existing Troubleshoot tests to assert visible failure notes rather than silent absence of log blocks.
CHANGELOG.md Documents the Troubleshoot log-fetch fixes under Unreleased.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread CHANGELOG.md
Comment on lines +7 to +12
### Fixed

- **Troubleshoot deploy/build log fetch** — Fixed two issues that prevented log blocks from appearing in operator-facing Diagnostic output:
- `deploymentComponents` now reads component names from `dep.Services / .StaticSites / .Workers / .Jobs / .Functions` (deployment-level arrays populated by both ListDeployments and GetDeployment) before falling back to `dep.Spec.*` and finally `[""]` aggregate. Previously only Spec was inspected, which is nil from ListDeployments, so the empty-aggregate fallback was always hit and DO API returned no logs.
- GetLogs API errors, HTTP-fetch errors, empty HistoricURLs, and empty-body responses now append a brief failure note to `Diagnostic.Detail` (in addition to the existing stderr log, which is captured at hashicorp/go-plugin TRACE level and not surfaced to operators). Operators now see the failure mode in the same Troubleshoot block as the rest of the diagnostic output.

intel352 added a commit that referenced this pull request May 2, 2026
…lus closes the silent-drop bug) (#53)

Root cause: invokeDriverTroubleshoot in internal/module_instance.go serialized
Diagnostic to map[string]any but omitted the "detail" field. wfctl's
remoteResourceDriver.Troubleshoot reads m["detail"] → empty string →
emitDiagnostics never printed the Detail body. All work done in
attachDeployLogs (v0.8.3 + v0.8.4) to populate Diagnostic.Detail with deploy
log content + visible failure notes was silently dropped at the gRPC
boundary.

Symptom: core-dump staging deploy run 25261030300 (post-PR-E5 with v0.8.4)
emitted Troubleshoot block with the Cause line for each failed deployment
("Your deploy failed because your container exited with a non-zero exit
code") but no log content and no failure-mode notes — exactly the operator-
visible output we'd been working to fix for the past 4 plugin releases.

This is the structpb-boundary class of bug warned about in workspace memory
feedback_workflow_plugin_structpb_boundary: "typed slices ... in dispatch
args/Outputs are BLOCKERS under legacy compat". Detail wasn't a typed slice,
but the same pattern — fields in domain types that aren't enumerated in the
gRPC dispatch translator silently drop at the boundary.

Fix: one-line addition to the diagnostics list-builder in
invokeDriverTroubleshoot:

  diagList[i] = map[string]any{
      "id":     d.ID,
      "phase":  d.Phase,
      "cause":  d.Cause,
      "at":     d.At.Format(time.RFC3339),
      "detail": d.Detail,   // NEW
  }

Plus regression test:
TestGRPCDispatch_ResourceDriver_Troubleshoot_Detail_RoundTripFidelity uses a
new fakeTroubleshootingDriver and fakeProviderReturningDriver to exercise
the full grpcRoundTrip → InvokeMethod("ResourceDriver.Troubleshoot") →
serialized response path, asserting the "detail" field survives intact.

CHANGELOG also consolidated: removed the duplicate ### Fixed heading under
[Unreleased] that PR-E4 introduced (Copilot post-merge review on PR #51
flagged it; tracked as plugin#52). Now there is exactly one ### Added and
one ### Fixed under [Unreleased].

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants