Surface diagnostics when endpoint references fail to resolve #17265

Open
IEvangelist wants to merge 1 commit into microsoft:main from IEvangelist:dapine/fix-ts-apphost-endpoint-diagnostics

Conversation

@IEvangelist
Member

Description

Fixes #15486

When a TypeScript (or C#) AppHost calls getEndpoint('https') on a resource that only exposes 'http', the dependent resource silently transitions to FailedToStart with no useful diagnostics — no PID, no exit code, aspire describe shows an empty environment, aspire logs <resource> is empty/uninformative, and the actual underlying exception ("The endpoint 'https' is not defined for the resource 'api-boston'.") is completely lost.

Root cause

GetEndpoint(name) intentionally returns a deferred EndpointReference (so forward-references work). The missing annotation only surfaces later, when EndpointReference.EndpointAnnotation is read during environment resolution. ExecutionConfigurationGathererContext.ResolveAsync catches that InvalidOperationException per-value and packages everything into Configuration.Exception (an AggregateException). DcpExecutor then threw a parameter-less FailedToApplyEnvironmentException() — no message, no inner — and the three catch blocks published an OnResourceFailedToStartContext event with no logging and no error detail. ApplicationOrchestrator.OnResourceFailedToStart only set the state text to FailedToStart.
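The deferred lookup described above can be modeled in a few lines. This is a minimal TypeScript sketch, not Aspire's C# implementation: `ModelResource`, `EndpointReference`, `getEndpoint`, and `resolveAll` are hypothetical names chosen for illustration. It shows why creating the reference never fails and why the error only appears when the value is resolved.

```typescript
// Illustrative model only; these are not Aspire's actual types.
class ModelResource {
  readonly name: string;
  readonly endpointNames: string[];

  constructor(name: string, endpointNames: string[] = []) {
    this.name = name;
    this.endpointNames = endpointNames;
  }
}

class EndpointReference {
  readonly resource: ModelResource;
  readonly endpointName: string;

  constructor(resource: ModelResource, endpointName: string) {
    this.resource = resource;
    this.endpointName = endpointName;
  }

  // The lookup happens on read, not at construction, so a reference can be
  // created before the endpoint is declared (a forward reference).
  get annotation(): string {
    if (this.resource.endpointNames.indexOf(this.endpointName) === -1) {
      throw new Error(
        `The endpoint '${this.endpointName}' is not defined for the resource '${this.resource.name}'.`
      );
    }
    return this.endpointName;
  }
}

// Creating the reference never throws, even for a name that does not exist.
function getEndpoint(resource: ModelResource, name: string): EndpointReference {
  return new EndpointReference(resource, name);
}

// Resolve every value and collect failures instead of stopping at the first,
// loosely mirroring the per-value catch described for ResolveAsync.
function resolveAll(refs: EndpointReference[]): { values: string[]; errors: Error[] } {
  const values: string[] = [];
  const errors: Error[] = [];
  for (const ref of refs) {
    try {
      values.push(ref.annotation);
    } catch (e) {
      errors.push(e instanceof Error ? e : new Error(String(e)));
    }
  }
  return { values, errors };
}
```

In this model, a caller that rethrows a fresh, message-less error instead of forwarding the collected `errors` loses the original text, which is the failure mode this PR addresses.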

Fix

End-to-end diagnostic propagation:

  1. EndpointReference.cs — The EndpointAnnotation getter now throws an InvalidOperationException whose message lists the available endpoint names (or "The resource has no endpoints defined." for resources with zero endpoints).
  2. DcpExecutorEvents.cs — OnResourceFailedToStartContext gains an optional string? ErrorMessage = null field (no breaking change for existing subscribers).
  3. DcpExecutor.cs — The two FailedToApplyEnvironmentException throw sites now pass a flattened message and the original configuration.Exception as the inner. All three catch (FailedToApplyEnvironmentException) blocks now log via the resource logger (LogError, no stack-trace noise) and propagate ex.Message through the new ErrorMessage field on the event. A new FormatConfigurationExceptionMessage helper flattens AggregateException (1 inner → its .Message; multiple → joined with ";"), filters null/blank inner messages, and falls back to the aggregate's own message (or the exception type name) when nothing meaningful is available.
  4. ApplicationOrchestrator.cs — When OnResourceFailedToStart receives a non-empty ErrorMessage, the resulting ResourceStateSnapshot carries KnownResourceStateStyles.Error. The state Text deliberately stays KnownResourceStates.FailedToStart so existing KnownResourceStates.TerminalStates checks and other callers that compare by exact text continue to work.
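The flattening rules in item 3 can be sketched as follows. This is a TypeScript approximation under stated assumptions, not the PR's `FormatConfigurationExceptionMessage`: `AggregateFailure` is a hypothetical stand-in for .NET's `AggregateException`, the `"; "` separator (with a trailing space) is a guess at the exact join string, and the recursive handling of nested aggregates is one plausible reading of the "nested" test case.

```typescript
// Hypothetical stand-in for .NET's AggregateException.
class AggregateFailure extends Error {
  readonly inner: Error[];

  constructor(message: string, inner: Error[]) {
    super(message);
    this.name = "AggregateFailure";
    this.inner = inner;
  }
}

// Duck-typed check so the sketch does not depend on prototype-chain details.
function isAggregate(error: Error): error is AggregateFailure {
  return Array.isArray((error as AggregateFailure).inner);
}

// Walk nested aggregates and keep only non-blank leaf messages.
function collectLeafMessages(error: Error, out: string[]): void {
  if (isAggregate(error)) {
    for (const inner of error.inner) {
      collectLeafMessages(inner, out);
    }
    return;
  }
  if (error.message.trim().length > 0) {
    out.push(error.message);
  }
}

function formatConfigurationErrorMessage(error: Error): string {
  const messages: string[] = [];
  collectLeafMessages(error, messages);
  if (messages.length > 0) {
    // One message is returned as-is; several are joined (separator assumed).
    return messages.join("; ");
  }
  // Nothing meaningful inside: fall back to the outer message, then the type name.
  return error.message.trim().length > 0 ? error.message : error.name;
}
```

With a single inner error the result is just that error's message, so the common case (one bad endpoint name) reads cleanly in logs without any aggregate wrapper text.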

After this change

  • aspire logs <consumer> includes Failed to apply environment configuration for resource 'consumer'. The endpoint 'https' is not defined for the resource 'producer'. Available endpoints: 'http'.
  • The dashboard renders the FailedToStart row with the error style.
  • The original InvalidOperationException is preserved as InnerException on FailedToApplyEnvironmentException for anything inspecting it programmatically.
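To make the log line above concrete, here is a small TypeScript sketch that builds a message of the same shape. The function names are hypothetical, and two details are assumptions rather than facts from the PR: that multiple endpoint names are separated by `", "`, and that the "no endpoints" sentence is appended after the base sentence.

```typescript
// Illustrative only: builds a message shaped like the one quoted above.
function describeMissingEndpoint(
  resourceName: string,
  endpointName: string,
  available: string[]
): string {
  const base = `The endpoint '${endpointName}' is not defined for the resource '${resourceName}'.`;
  if (available.length === 0) {
    return `${base} The resource has no endpoints defined.`;
  }
  // Quote each name; the ", " separator is an assumption.
  const names = available.map((name) => `'${name}'`).join(", ");
  return `${base} Available endpoints: ${names}.`;
}

// Prefix used in the consumer's log line, as quoted in the description.
function describeFailedEnvironment(consumerName: string, detail: string): string {
  return `Failed to apply environment configuration for resource '${consumerName}'. ${detail}`;
}
```

Calling `describeFailedEnvironment("consumer", describeMissingEndpoint("producer", "https", ["http"]))` reproduces the example line in the first bullet.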

Validation

  • All 89 targeted tests pass (47 EndpointReferenceTests + 16 ApplicationOrchestratorTests + 26 ExecutionConfigurationGathererTests).
  • New tests added:
    • EndpointReferenceTests — 4 tests covering the new error message (single available endpoint, multiple, no endpoints defined, custom ErrorMessage override preserved).
    • ApplicationOrchestratorTests — 2 tests verifying the Error style is applied when ErrorMessage is set and null style is preserved when it isn't.
    • ExecutionConfigurationGathererTests — 1 integration test exercising the full WithEnvironment(producer.GetEndpoint("https")) → gatherer → ResolveAsync path (verifies configuration.Exception is an AggregateException whose inner is the descriptive InvalidOperationException and that wrapping it in FailedToApplyEnvironmentException(formatted, inner) preserves the original), plus 5 unit tests for FormatConfigurationExceptionMessage covering single-inner / multi-inner / nested / non-aggregate / all-blank-inner.
  • Build clean: 0 warnings, 0 errors.

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?

Copilot AI review requested due to automatic review settings May 19, 2026 18:15
Contributor

Copilot AI left a comment


Pull request overview

Improves diagnostics when an EndpointReference cannot resolve (e.g. consumer calls GetEndpoint("https") but the producer only has http). Previously, the resulting FailedToApplyEnvironmentException was thrown without a message and the orchestrator silently set state to FailedToStart with no logging. Now the underlying InvalidOperationException lists available endpoint names, the DCP executor logs an error and propagates the flattened message through OnResourceFailedToStartContext.ErrorMessage, and the orchestrator marks the state snapshot with KnownResourceStateStyles.Error so the dashboard surfaces the failure.

Changes:

  • EndpointReference.EndpointAnnotation now throws an InvalidOperationException enumerating the available endpoint names (or "no endpoints defined").
  • DcpExecutor wraps FailedToApplyEnvironmentException with a flattened message + inner exception, logs via the resource logger, and propagates ex.Message through the new optional ErrorMessage field on OnResourceFailedToStartContext; ApplicationOrchestrator applies the Error style when a message is present (keeping Text = FailedToStart).
  • Tests cover the new EndpointReference message variants, the orchestrator style behavior, and FormatConfigurationExceptionMessage for single/multi/nested/blank aggregates plus an integration test through the gatherer/resolve path.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/Aspire.Hosting/ApplicationModel/EndpointReference.cs | Builds a descriptive missing-endpoint message that lists available endpoint names. |
| src/Aspire.Hosting/Dcp/DcpExecutorEvents.cs | Adds optional ErrorMessage to OnResourceFailedToStartContext. |
| src/Aspire.Hosting/Dcp/DcpExecutor.cs | Wraps FailedToApplyEnvironmentException with message+inner, logs on catch, threads message through event; adds FormatConfigurationExceptionMessage helper. |
| src/Aspire.Hosting/Orchestrator/ApplicationOrchestrator.cs | When ErrorMessage set, applies KnownResourceStateStyles.Error to the FailedToStart snapshot. |
| tests/Aspire.Hosting.Tests/EndpointReferenceTests.cs | New tests for descriptive endpoint-not-found error message. |
| tests/Aspire.Hosting.Tests/ExecutionConfigurationGathererTests.cs | Integration test through gatherer + unit tests for FormatConfigurationExceptionMessage. |
| tests/Aspire.Hosting.Tests/Orchestrator/ApplicationOrchestratorTests.cs | Tests verifying Error style with/without ErrorMessage. |

Comment thread src/Aspire.Hosting/Dcp/DcpExecutor.cs Outdated
Contributor


@afscrome removed this before

Member Author


Thanks @davidfowl — moved the per-value logging back into ExecutionConfigurationGathererContext.ResolveAsync (restoring @afscrome's pattern from #7032 that was lost in the builder refactor) and dropped the catch-site LogError calls. The catch sites now just forward ex.Message via OnResourceFailedToStartContext.ErrorMessage so the orchestrator can apply the Error style to the FailedToStart state. Also rebased onto latest main, which moved the throw sites into ExecutableCreator/ContainerCreator.
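The hand-off from catch site to orchestrator can be pictured with the following TypeScript sketch. Everything here is a hypothetical model: the interface names, the `"error"` style value, and the sample terminal-state list are assumptions for illustration, not Aspire's definitions. The point it illustrates is the one made above: the state text is left unchanged, and only the style reflects the presence of a message.

```typescript
// Hypothetical shapes modeling the event payload and the resulting snapshot.
interface ResourceFailedToStartContext {
  resourceName: string;
  errorMessage?: string; // optional, so existing publishers need no changes
}

interface StateSnapshot {
  text: string;
  style: string | null;
}

const FAILED_TO_START = "FailedToStart";
const ERROR_STYLE = "error"; // assumed style value
const SAMPLE_TERMINAL_STATES = ["Finished", "FailedToStart", "Exited"]; // assumed list

function onResourceFailedToStart(context: ResourceFailedToStartContext): StateSnapshot {
  const hasMessage =
    context.errorMessage !== undefined && context.errorMessage.length > 0;
  return {
    // The text never changes, so exact-text comparisons keep working.
    text: FAILED_TO_START,
    // The style is only set when there is a message to surface.
    style: hasMessage ? ERROR_STYLE : null,
  };
}

// Example of a caller that compares by exact text.
function isTerminal(snapshot: StateSnapshot): boolean {
  return SAMPLE_TERMINAL_STATES.indexOf(snapshot.text) !== -1;
}
```

Keeping the text stable and varying only the style means a subscriber that ignores the new field behaves exactly as before.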

Collaborator


#15475 was supposed to fix this, but I screwed up a rebase and pushed up the wrong branch, so what was merged actually fixed #15477 instead.

@github-actions
Contributor

github-actions Bot commented May 19, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17265

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17265"

Fixes microsoft#15486

When a TypeScript (or C#) AppHost calls getEndpoint() with an endpoint name that doesn't exist on the target resource, the dependent resource silently transitioned to FailedToStart with no useful diagnostics. The underlying exception (e.g. "The endpoint 'https' is not defined for the resource 'api'.") was completely lost.

This change surfaces the diagnostic end-to-end:

* EndpointReference.EndpointAnnotation now includes the list of available endpoint names (or `no endpoints defined`) in the InvalidOperationException message.

* ExecutionConfigurationGathererContext.ResolveAsync logs per-value errors against the resource logger when an argument or env var fails to resolve, restoring the pattern previously added by @afscrome and removed during the builder-pattern refactor.

* ExecutableCreator wraps the underlying configuration exception into FailedToApplyEnvironmentException with the inner exception preserved (matching ContainerCreator).

* OnResourceFailedToStartContext gains an optional ErrorMessage forwarded from the catch site so consumers can react to the descriptive message.

* ApplicationOrchestrator.OnResourceFailedToStart applies KnownResourceStateStyles.Error to the state snapshot when an error message is provided, so the dashboard renders FailedToStart with the error style.

The state Text remains KnownResourceStates.FailedToStart so existing checks against KnownResourceStates.TerminalStates continue to work.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@IEvangelist IEvangelist force-pushed the dapine/fix-ts-apphost-endpoint-diagnostics branch from e18178c to 149fd2c Compare May 19, 2026 19:20
@github-actions
Contributor

Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as a retry-safe transient failure in the CI run attempt.
GitHub was asked to rerun all failed jobs for that attempt, and the rerun is being tracked in the rerun attempt.
The job links below point to the jobs in the failed attempt that matched the retry-safe transient failure rules.

Contributor

@karolz-ms karolz-ms left a comment


Looks good, thanks for taking care of this, David


Development

Successfully merging this pull request may close these issues.

TypeScript AppHost: invalid endpoint name in getEndpoint()/withEnvironmentEndpoint() can cause silent FailedToStart with no diagnostics

6 participants