Fix OTLP endpoint resolution in isolated mode #16367
When running under isolated mode (`DcpPublisher__RandomizePorts=true`), the dashboard listens on a randomized port that differs from the statically configured OTLP endpoint URL. Resources were still using the configured URL, causing traces and metrics to not reach the dashboard.

Resolve the OTLP endpoint from the dashboard resource's `EndpointReference` in the `DistributedApplicationModel` when available. `EndpointReference` resolves lazily to the actual allocated URL, so it picks up the correct port regardless of randomization.

Extract the dashboard endpoint lookup into a shared helper used by both `RegisterOtlpEnvironment` (all resources) and the container-specific `OtlpEndpointReferenceGatherer`.

Fixes #16037
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 16367
Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 16367"
Pull request overview
Fixes OTLP exporter endpoint resolution when running in isolated mode (randomized proxy ports) by preferring the Aspire Dashboard’s allocated OTLP endpoint from the DistributedApplicationModel (via EndpointReference) and falling back to the existing config-based resolution when the dashboard isn’t present.
Changes:
- Resolve `OTEL_EXPORTER_OTLP_ENDPOINT` from the dashboard resource’s OTLP `EndpointReference` (allocated URL) when available.
- Extract shared dashboard OTLP endpoint lookup into `ResolveOtlpEndpointFromDashboard` and reuse it from both env-var registration and DCP container gathering.
- Add/update tests covering dashboard endpoint resolution, preference rules, and fallback behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/Aspire.Hosting.Tests/WithOtlpExporterTests.cs | Adds coverage for resolving OTLP endpoint from dashboard allocated endpoints (including preference + fallback cases). |
| src/Aspire.Hosting/OtlpConfigurationExtensions.cs | Uses dashboard EndpointReference to set OTLP endpoint/protocol when a dashboard is in the model; adds shared resolver helper. |
| src/Aspire.Hosting/Dcp/OtlpEndpointReferenceGatherer.cs | Reuses the shared dashboard OTLP endpoint resolver and fixes doc comment typos. |
…oard Catch InvalidOperationException when accessing ServiceProvider, which throws when the DI container hasn't been built yet (e.g. env var evaluation in tests without a fully built host). Falls back to config-based resolution.
@karolz-ms I think this change means the OTLP endpoint gatherer you added can be removed. There are no failing tests, and I tested an app and saw it use a different endpoint for containers with OTLP. Can you confirm this change is ok?
{
    model = context.ExecutionContext.ServiceProvider.GetService<DistributedApplicationModel>();
}
catch (InvalidOperationException)
Take a look at ExecutionContext.ServiceProvider property. It throws if not in a completed state.
    return null;
}

var dashboardResource = model.Resources.SingleOrDefault(r => StringComparers.ResourceName.Equals(r.Name, KnownResourceNames.AspireDashboard)) as IResourceWithEndpoints;
As much as I don't want to claim performance is an issue, this will loop over all resources to find the dashboard for each resource that uses it.
This is true, but I think it is also inevitable: we do not control at which point in the model lifecycle AddOtlpEnvironment() is called, and thus whether the dashboard resource is present or not. So we have to search for it each time.
I know but I can still complain about it 😄
var (url, protocol) = OtlpEndpointResolver.ResolveOtlpEndpoint(configuration, otlpExporterAnnotation.RequiredProtocol);
context.EnvironmentVariables[KnownOtelConfigNames.ExporterOtlpEndpoint] = new HostUrl(url);
context.EnvironmentVariables[KnownOtelConfigNames.ExporterOtlpProtocol] = protocol;
var dashboardEndpoint = ResolveOtlpEndpointFromDashboard(context, otlpExporterAnnotation.RequiredProtocol);
What about the OTLP collector? Does this break that?
I added a test. No, it doesn't break it.
karolz-ms left a comment
Really like this change, thank you James
    return null;
}

var grpcEndpoint = dashboardResource.GetEndpoint(KnownEndpointNames.OtlpGrpcEndpointName);
The reason why this works and makes me happy is that GetEndpoint() returns an EndpointReference that is not tied to any particular network. That reference is subsequently resolved in the context of the network that the resource using the dashboard is connected to.
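The idea in the comment above can be illustrated with a minimal sketch. It uses a hypothetical `EndpointRef` record and hypothetical per-network lookup tables, not Aspire's actual `EndpointReference` API; the resource name, endpoint name, host names, and port are illustrative values only. The point is that the reference records which endpoint is wanted, and the address is computed only when a consumer on a given network asks for it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an endpoint reference: it stores *which* endpoint
// is wanted, not an address, so nothing is resolved at construction time.
public sealed record EndpointRef(string Resource, string EndpointName)
{
    // Resolution happens on demand, against the allocations visible from the
    // consumer's network (a hypothetical lookup table in this sketch).
    public string Resolve(IReadOnlyDictionary<(string, string), string> allocations) =>
        allocations.TryGetValue((Resource, EndpointName), out var url)
            ? url
            : throw new InvalidOperationException(
                $"Endpoint '{EndpointName}' on '{Resource}' is not allocated.");
}

public static class EndpointRefDemo
{
    public static void Main()
    {
        // One reference, created before any address is known.
        var reference = new EndpointRef("aspire-dashboard", "otlp-grpc");

        // Illustrative allocations as seen from two different networks.
        var hostNetwork = new Dictionary<(string, string), string>
        {
            [("aspire-dashboard", "otlp-grpc")] = "http://localhost:51234",
        };
        var containerNetwork = new Dictionary<(string, string), string>
        {
            [("aspire-dashboard", "otlp-grpc")] = "http://host.docker.internal:51234",
        };

        // The same reference yields a different URL per consumer network.
        Console.WriteLine(reference.Resolve(hostNetwork));
        Console.WriteLine(reference.Resolve(containerNetwork));
    }
}
```

Because the reference carries no address of its own, the same object can be handed to an executable and to a container, and each gets a URL that is reachable from where it runs.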
…d use StringComparison
🎬 CLI E2E Test Recordings: 74 recordings uploaded
📹 Recordings uploaded automatically from CI run #24808814992
No documentation PR is required for this change. Reason: This is a bug fix (OTLP endpoint resolution in isolated mode) with no user-facing documentation impact.
Description
When running under isolated mode (`aspire start --isolated`, which sets `DcpPublisher__RandomizePorts=true`), DCP assigns random ports to all proxied endpoints. The dashboard OTLP endpoint therefore listens on a different port than the statically configured `ASPIRE_DASHBOARD_OTLP_ENDPOINT_URL`. Resources were still using the configured URL via `HostUrl`, causing traces and metrics to never reach the dashboard.

This PR resolves the OTLP endpoint from the dashboard resource's `EndpointReference` in the `DistributedApplicationModel` when available. `EndpointReference` implements `IValueProvider` and resolves lazily to the actual allocated URL, so it picks up the correct port regardless of randomization. When no dashboard resource is present (e.g. tests, publish), it falls back to the existing `HostUrl`-based config behavior.

The dashboard endpoint lookup is extracted into a shared helper (`ResolveOtlpEndpointFromDashboard`) used by both `RegisterOtlpEnvironment` (all resources) and the container-specific `OtlpEndpointReferenceGatherer`, eliminating duplicated logic.

Fixes #16037
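The prefer-then-fall-back behavior described above can be sketched roughly as follows. This is not the PR's code: `ChooseOtlpEndpoint` is a hypothetical helper, the lazily resolved dashboard endpoint is modeled as a plain `Func<string>`, and the URLs and ports are illustrative assumptions.

```csharp
using System;

public static class OtlpFallbackSketch
{
    // Hypothetical helper: prefer the lazily resolved dashboard endpoint when
    // one exists (dashboardEndpoint may be null); otherwise fall back to the
    // statically configured URL.
    public static Func<string> ChooseOtlpEndpoint(Func<string> dashboardEndpoint, string configuredUrl) =>
        dashboardEndpoint ?? (() => configuredUrl);

    public static void Main()
    {
        // Illustrative statically configured URL.
        const string configuredUrl = "http://localhost:4317";

        // With a dashboard in the model: the port is not known when the
        // endpoint is chosen, only when the value is finally read.
        int allocatedPort = 0;
        Func<string> dashboard = () => $"http://localhost:{allocatedPort}";
        var withDashboard = ChooseOtlpEndpoint(dashboard, configuredUrl);
        allocatedPort = 51234; // randomized allocation happens later
        Console.WriteLine(withDashboard());

        // Without a dashboard (e.g. tests, publish): the configured URL is used.
        var withoutDashboard = ChooseOtlpEndpoint(null, configuredUrl);
        Console.WriteLine(withoutDashboard());
    }
}
```

Deferring the read is what makes the randomized port harmless: the choice between the two sources is made early, but the dashboard URL is only computed once the port has actually been allocated.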
Checklist