Fix OTLP endpoint resolution in isolated mode #16367

Merged
JamesNK merged 5 commits into main from fix/otlp-endpoint-isolated-mode on Apr 23, 2026

@JamesNK JamesNK commented Apr 22, 2026

Description

When running under isolated mode (aspire start --isolated, which sets DcpPublisher__RandomizePorts=true), DCP assigns random ports to all proxied endpoints. The dashboard OTLP endpoint therefore listens on a different port than the statically configured ASPIRE_DASHBOARD_OTLP_ENDPOINT_URL. Resources were still using the configured URL via HostUrl, causing traces and metrics to never reach the dashboard.

This PR resolves the OTLP endpoint from the dashboard resource's EndpointReference in the DistributedApplicationModel when available. EndpointReference implements IValueProvider and resolves lazily to the actual allocated URL, so it picks up the correct port regardless of randomization. When no dashboard resource is present (e.g. tests, publish), it falls back to the existing HostUrl-based config behavior.

The dashboard endpoint lookup is extracted into a shared helper (ResolveOtlpEndpointFromDashboard) used by both RegisterOtlpEnvironment (all resources) and the container-specific OtlpEndpointReferenceGatherer, eliminating duplicated logic.
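
For readers skimming the thread, here is a minimal sketch of why lazy resolution picks up the randomized port while a value fixed at registration time does not. It is not the code in this PR: the `allocatedUrls` table, the `"otlp-grpc"` key, and both URLs are illustrative assumptions standing in for DCP's allocation and the configured endpoint.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the table DCP fills in once endpoints are allocated.
var allocatedUrls = new Dictionary<string, string>();

// Statically configured URL, known before startup (illustrative value).
const string configuredUrl = "http://localhost:4317";

// Eager: the value is fixed at registration time, like wrapping the configured URL.
string eagerEndpoint = configuredUrl;

// Lazy: only the endpoint name is captured; the URL is looked up when the
// environment variable is finally evaluated, after allocation.
Func<string?> lazyEndpoint = () =>
    allocatedUrls.TryGetValue("otlp-grpc", out var url) ? url : null;

// Prefer the dashboard's allocated endpoint; fall back to config if there is none.
string ResolveOtlpEndpoint() => lazyEndpoint() ?? eagerEndpoint;

// Before allocation (or with no dashboard in the model) the fallback applies.
string beforeAllocation = ResolveOtlpEndpoint();

// Isolated mode: the proxy is assigned a random port at startup (made-up port).
allocatedUrls["otlp-grpc"] = "http://localhost:51873";
string afterAllocation = ResolveOtlpEndpoint();

Console.WriteLine(beforeAllocation); // http://localhost:4317
Console.WriteLine(afterAllocation);  // http://localhost:51873
```

The same call returns a different URL once allocation has happened, which is the property the description attributes to EndpointReference.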

Fixes #16037

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
    • No
  • Does the change require an update in our Aspire docs?
    • Yes
    • No

When running under isolated mode (DcpPublisher__RandomizePorts=true),
the dashboard listens on a randomized port that differs from the
statically configured OTLP endpoint URL. Resources were still using
the configured URL, causing traces and metrics to not reach the
dashboard.

Resolve the OTLP endpoint from the dashboard resource's
EndpointReference in the DistributedApplicationModel when available.
EndpointReference resolves lazily to the actual allocated URL,
so it picks up the correct port regardless of randomization.

Extract the dashboard endpoint lookup into a shared helper used by
both RegisterOtlpEnvironment (all resources) and the container-specific
OtlpEndpointReferenceGatherer.

Fixes #16037
Copilot AI review requested due to automatic review settings April 22, 2026 02:56

github-actions Bot commented Apr 22, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 16367

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 16367"

Copilot AI left a comment

Pull request overview

Fixes OTLP exporter endpoint resolution when running in isolated mode (randomized proxy ports) by preferring the Aspire Dashboard’s allocated OTLP endpoint from the DistributedApplicationModel (via EndpointReference) and falling back to the existing config-based resolution when the dashboard isn’t present.

Changes:

  • Resolve OTEL_EXPORTER_OTLP_ENDPOINT from the dashboard resource’s OTLP EndpointReference (allocated URL) when available.
  • Extract shared dashboard OTLP endpoint lookup into ResolveOtlpEndpointFromDashboard and reuse it from both env-var registration and DCP container gathering.
  • Add/update tests covering dashboard endpoint resolution, preference rules, and fallback behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Files:

  • tests/Aspire.Hosting.Tests/WithOtlpExporterTests.cs: Adds coverage for resolving OTLP endpoint from dashboard allocated endpoints (including preference + fallback cases).
  • src/Aspire.Hosting/OtlpConfigurationExtensions.cs: Uses dashboard EndpointReference to set OTLP endpoint/protocol when a dashboard is in the model; adds shared resolver helper.
  • src/Aspire.Hosting/Dcp/OtlpEndpointReferenceGatherer.cs: Reuses the shared dashboard OTLP endpoint resolver and fixes doc comment typos.

Comment thread src/Aspire.Hosting/OtlpConfigurationExtensions.cs Outdated
JamesNK added 2 commits April 22, 2026 11:22
…oard

Catch InvalidOperationException when accessing ServiceProvider, which
throws when the DI container hasn't been built yet (e.g. env var
evaluation in tests without a fully built host). Falls back to
config-based resolution.
@JamesNK JamesNK added the area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication label Apr 22, 2026

JamesNK commented Apr 22, 2026

@karolz-ms I think this change means the OTLP endpoint gatherer you added can be removed. There are no failing tests, and I tested an app and saw it use a different endpoint for containers with OTLP. Can you confirm this change is ok?

{
    model = context.ExecutionContext.ServiceProvider.GetService<DistributedApplicationModel>();
}
catch (InvalidOperationException)

Contributor

??

Member Author

Take a look at ExecutionContext.ServiceProvider property. It throws if not in a completed state.
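
A minimal sketch of the pattern under discussion, assuming a provider that throws rather than returning null before the host is built. `GetModel`, `TryGetModel`, and the `hostBuilt` flag are hypothetical stand-ins, not Aspire APIs; only the catch of InvalidOperationException and the fallback to config come from the commit message above.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for a host whose services are unavailable until it is built.
bool hostBuilt = false;

// Mimics a property that throws instead of returning null before the host is ready.
string GetModel() =>
    hostBuilt ? "model" : throw new InvalidOperationException("Service provider is not available yet.");

// Returns null when the model cannot be reached so callers can fall back to config.
string? TryGetModel()
{
    try
    {
        return GetModel();
    }
    catch (InvalidOperationException)
    {
        return null;
    }
}

string? early = TryGetModel(); // host not built: null, caller uses configuration
hostBuilt = true;
string? late = TryGetModel();  // host built: the model is returned

Console.WriteLine(early is null); // True
Console.WriteLine(late);          // model
```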

    return null;
}

var dashboardResource = model.Resources.SingleOrDefault(r => StringComparers.ResourceName.Equals(r.Name, KnownResourceNames.AspireDashboard)) as IResourceWithEndpoints;

Contributor

As much as I don't want to claim performance is an issue, this will loop over all resources to find the dashboard for each resource that uses it.

Contributor

This is true, but I think it is also inevitable: we do not control at which point in the model lifecycle AddOtlpEnvironment() is called, and thus whether the dashboard resource is present or not. So we have to search for it each time.

Contributor

I know but I can still complain about it 😄

Member Author

We can improve it: #16388

var (url, protocol) = OtlpEndpointResolver.ResolveOtlpEndpoint(configuration, otlpExporterAnnotation.RequiredProtocol);
context.EnvironmentVariables[KnownOtelConfigNames.ExporterOtlpEndpoint] = new HostUrl(url);
context.EnvironmentVariables[KnownOtelConfigNames.ExporterOtlpProtocol] = protocol;
var dashboardEndpoint = ResolveOtlpEndpointFromDashboard(context, otlpExporterAnnotation.RequiredProtocol);

Contributor

What about the otlp collector? Does this break that?

Member Author

I added a test. No it doesn't break it.

@karolz-ms karolz-ms left a comment

Really like this change, thank you James

    return null;
}

var grpcEndpoint = dashboardResource.GetEndpoint(KnownEndpointNames.OtlpGrpcEndpointName);

Contributor

The reason why this works and makes me happy is that GetEndpoint() returns an EndpointReference that is not tied to any particular network. That reference is subsequently resolved in the context of the network that the resource using the dashboard is connected to.
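
One way to picture the point above, as a rough sketch rather than how Aspire implements it: the reference holds only the endpoint name, and the consumer supplies its network when the value is resolved. The network names, the `addresses` table, and both URLs below are made up for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical table: one dashboard endpoint, reachable at a different address per network.
var addresses = new Dictionary<(string Endpoint, string Network), string>
{
    [("otlp-grpc", "host")] = "http://localhost:51873",
    [("otlp-grpc", "container")] = "http://dashboard.example:18889",
};

// The reference captures only the endpoint name; the network comes from the consumer later.
Func<string, string> dashboardOtlp = network => addresses[("otlp-grpc", network)];

// The same reference yields a different address depending on who resolves it.
string forProject = dashboardOtlp("host");
string forContainer = dashboardOtlp("container");

Console.WriteLine(forProject);   // http://localhost:51873
Console.WriteLine(forContainer); // http://dashboard.example:18889
```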

Member Author

It's clean.

@JamesNK JamesNK enabled auto-merge (squash) April 23, 2026 00:01
@JamesNK JamesNK merged commit f4c6d11 into main Apr 23, 2026
562 of 565 checks passed
@github-actions github-actions Bot added this to the 13.3 milestone Apr 23, 2026

🎬 CLI E2E Test Recordings — 74 recordings uploaded (commit a934c44)

View recordings
Test Recording
AddPackageInteractiveWhileAppHostRunningDetached ▶️ View Recording
AddPackageWhileAppHostRunningDetached ▶️ View Recording
AgentCommands_AllHelpOutputs_AreCorrect ▶️ View Recording
AgentInitCommand_DefaultSelection_InstallsSkillOnly ▶️ View Recording
AgentInitCommand_MigratesDeprecatedConfig ▶️ View Recording
AspireAddPackageVersionToDirectoryPackagesProps ▶️ View Recording
AspireUpdateRemovesAppHostPackageVersionFromDirectoryPackagesProps ▶️ View Recording
Banner_DisplayedOnFirstRun ▶️ View Recording
Banner_DisplayedWithExplicitFlag ▶️ View Recording
Banner_NotDisplayedWithNoLogoFlag ▶️ View Recording
CertificatesClean_RemovesCertificates ▶️ View Recording
CertificatesTrust_WithNoCert_CreatesAndTrustsCertificate ▶️ View Recording
CertificatesTrust_WithUntrustedCert_TrustsCertificate ▶️ View Recording
ConfigSetGet_CreatesNestedJsonFormat ▶️ View Recording
CreateAndRunAspireStarterProject ▶️ View Recording
CreateAndRunAspireStarterProjectWithBundle ▶️ View Recording
CreateAndRunEmptyAppHostProject ▶️ View Recording
CreateAndRunJavaEmptyAppHostProject ▶️ View Recording
CreateAndRunJsReactProject ▶️ View Recording
CreateAndRunPythonReactProject ▶️ View Recording
CreateAndRunTypeScriptEmptyAppHostProject ▶️ View Recording
CreateAndRunTypeScriptStarterProject ▶️ View Recording
CreateJavaAppHostWithViteApp ▶️ View Recording
CreateTypeScriptAppHostWithViteApp_UsesConfiguredToolchain ▶️ View Recording
DashboardRunWithOtelTracesReturnsNoTraces ▶️ View Recording
DeployK8sBasicApiService ▶️ View Recording
DeployK8sWithGarnet ▶️ View Recording
DeployK8sWithMongoDB ▶️ View Recording
DeployK8sWithMySql ▶️ View Recording
DeployK8sWithPostgres ▶️ View Recording
DeployK8sWithRabbitMQ ▶️ View Recording
DeployK8sWithRedis ▶️ View Recording
DeployK8sWithSqlServer ▶️ View Recording
DeployK8sWithValkey ▶️ View Recording
DeployTypeScriptAppToKubernetes ▶️ View Recording
DescribeCommandResolvesReplicaNames ▶️ View Recording
DescribeCommandShowsRunningResources ▶️ View Recording
DetachFormatJsonProducesValidJson ▶️ View Recording
DetachFormatJsonProducesValidJsonWhenRestartingExistingInstance ▶️ View Recording
DoListStepsShowsPipelineSteps ▶️ View Recording
DoctorCommand_DetectsDeprecatedAgentConfig ▶️ View Recording
DoctorCommand_TypeScriptAppHostReportsMissingConfiguredToolchain ▶️ View Recording
DoctorCommand_WithSslCertDir_ShowsTrusted ▶️ View Recording
DoctorCommand_WithoutSslCertDir_ShowsPartiallyTrusted ▶️ View Recording
GlobalMigration_HandlesCommentsAndTrailingCommas ▶️ View Recording
GlobalMigration_HandlesMalformedLegacyJson ▶️ View Recording
GlobalMigration_PreservesAllValueTypes ▶️ View Recording
GlobalMigration_SkipsWhenNewConfigExists ▶️ View Recording
GlobalSettings_MigratedFromLegacyFormat ▶️ View Recording
InitTypeScriptAppHost_AugmentsExistingViteRepoAtRoot ▶️ View Recording
InvalidAppHostPathWithComments_IsHealedOnRun ▶️ View Recording
LegacySettingsMigration_AdjustsRelativeAppHostPath ▶️ View Recording
LogsCommandShowsResourceLogs ▶️ View Recording
OtelLogsReturnsStructuredLogsFromStarterAppCore ▶️ View Recording
PsCommandListsRunningAppHost ▶️ View Recording
PsFormatJsonOutputsOnlyJsonToStdout ▶️ View Recording
PublishWithConfigureEnvFileUpdatesEnvOutput ▶️ View Recording
PublishWithDockerComposeServiceCallbackSucceeds ▶️ View Recording
PublishWithoutOutputPathUsesAppHostDirectoryDefault ▶️ View Recording
RestoreGeneratesSdkFiles ▶️ View Recording
RestoreGeneratesSdkFiles_WithConfiguredToolchain ▶️ View Recording
RestoreRefreshesGeneratedSdkAfterAddingIntegration ▶️ View Recording
RestoreSupportsConfigOnlyHelperPackageAndCrossPackageTypes ▶️ View Recording
RunFromParentDirectory_UsesExistingConfigNearAppHost ▶️ View Recording
SecretCrudOnDotNetAppHost ▶️ View Recording
SecretCrudOnTypeScriptAppHost ▶️ View Recording
StagingChannel_ConfigureAndVerifySettings_ThenSwitchChannels ▶️ View Recording
StartAndWaitForTypeScriptSqlServerAppHostWithNativeAssets ▶️ View Recording
StopAllAppHostsFromAppHostDirectory ▶️ View Recording
StopAllAppHostsFromUnrelatedDirectory ▶️ View Recording
StopNonInteractiveMultipleAppHostsShowsError ▶️ View Recording
StopNonInteractiveSingleAppHost ▶️ View Recording
StopWithNoRunningAppHostExitsSuccessfully ▶️ View Recording
UnAwaitedChainsCompileWithAutoResolvePromises ▶️ View Recording

📹 Recordings uploaded automatically from CI run #24808814992

@aspire-repo-bot

No documentation PR is required for this change.

Reason: This is a bug fix (OTLP endpoint resolution in isolated mode) with no user-facing documentation impact:

  • No new public APIs were added
  • No new configuration options or environment variables introduced
  • The fix corrects internal behavior (using EndpointReference for lazy URL resolution instead of the static HostUrl) without changing how users configure or interact with OTLP/isolated mode
  • The PR author also confirmed docs updates are not needed

Generated by PR Documentation Check for issue #16367

Labels

area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication

Development

Successfully merging this pull request may close these issues.

When started under isolated mode, no Traces and metrics are displayed in the Aspire dashboard

4 participants