Improve Aspire logs search performance #17010
Push log search and tail filtering into the AppHost backchannel so the CLI does not need to transfer and parse large non-matching log streams. Add JSON-RPC profiling context propagation across CLI and Hosting to measure the path end to end. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17010
Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17010"
Pull request overview
This PR speeds up aspire logs --search/--tail by extending the auxiliary backchannel so the AppHost can apply server-side filters (resource, hidden-resource, search, and single-resource tail) before streaming logs back to the CLI, reducing JSON-RPC payload volume. It also adds profiling-only JSON-RPC spans with explicit trace-context propagation via request objects for end-to-end measurement across CLI and AppHost processes.
Changes:
- Add server-side console log filtering in the AppHost (GetConsoleLogsRequest extended with Search/Tail/IncludeHidden) and keep client-side filtering for compatibility/semantic parity.
- Add JSON-RPC client/server profiling spans and propagate W3C trace context via backchannel request objects (BackchannelRequest.ProfilingContext).
- Update unit tests across Hosting/CLI to cover the new console log RPC behavior and the profiling constant refactor.
Show a summary per file
| File | Description |
|---|---|
| tests/Aspire.Hosting.Tests/Backchannel/BackchannelContractTests.cs | Updates contract validation lists and enforces request types derive from BackchannelRequest. |
| tests/Aspire.Hosting.Tests/Backchannel/AuxiliaryBackchannelRpcTargetTests.cs | Adds coverage for server-side console log search/tail/hidden filtering; updates ctor DI requirements. |
| tests/Aspire.Cli.Tests/TestServices/TestAppHostAuxiliaryBackchannel.cs | Extends test backchannel to support GetConsoleLogsAsync with optional override handler. |
| tests/Aspire.Cli.Tests/Telemetry/ProfilingTelemetryTests.cs | Updates tests to new ProfilingTelemetry constant groupings (EnvironmentVariables, Baggage). |
| tests/Aspire.Cli.Tests/Telemetry/ProfilingTelemetryContextTests.cs | Updates environment propagation tests to new constant groupings. |
| tests/Aspire.Cli.Tests/Projects/AppHostCandidateFinderTests.cs | Updates profiling env var names used in tests. |
| tests/Aspire.Cli.Tests/Git/GitRepositoryTests.cs | Updates profiling env var names used in tests. |
| tests/Aspire.Cli.Tests/Commands/RunCommandTests.cs | Updates profiling env var and baggage names used in tests. |
| tests/Aspire.Cli.Tests/Commands/LsCommandTests.cs | Updates profiling env var names used in tests. |
| tests/Aspire.Cli.Tests/Commands/LogsCommandTests.cs | Adds tests ensuring snapshot filters are passed to the console logs request and legacy fallback stays correct. |
| src/Aspire.Hosting/DistributedApplicationBuilder.cs | Registers Hosting ProfilingTelemetry in DI. |
| src/Aspire.Hosting/Diagnostics/ProfilingTelemetry.cs | Adds server-side JSON-RPC span support and trace-context bootstrapping from request-propagated context. |
| src/Aspire.Hosting/Backchannel/BackchannelDataTypes.cs | Adds BackchannelRequest/BackchannelProfilingContext; extends console log request filtering fields. |
| src/Aspire.Hosting/Backchannel/AuxiliaryBackchannelService.cs | Wires configuration + profiling telemetry into the per-connection RPC target. |
| src/Aspire.Hosting/Backchannel/AuxiliaryBackchannelRpcTarget.cs | Implements server-side console log filtering + streaming span instrumentation and context linking. |
| src/Aspire.Hosting/Backchannel/AppHostRpcTarget.cs | Adds server-side JSON-RPC span instrumentation for pipeline steps RPC. |
| src/Aspire.Cli/Telemetry/ProfilingTelemetry.cs | Adds JSON-RPC client span primitives; refactors env var and baggage constants; creates backchannel profiling context objects. |
| src/Aspire.Cli/Commands/LogsCommand.cs | Uses GetConsoleLogsAsync(GetConsoleLogsRequest) for snapshot + follow to enable server-side filtering. |
| src/Aspire.Cli/Backchannel/ProfilingJsonRpcExtensions.cs | New wrappers to add profiling spans around JsonRpc calls and inject profiling context into request objects. |
| src/Aspire.Cli/Backchannel/IAppHostAuxiliaryBackchannel.cs | Adds GetConsoleLogsAsync(GetConsoleLogsRequest) to the auxiliary backchannel interface. |
| src/Aspire.Cli/Backchannel/AppHostCliBackchannel.cs | Wraps JsonRpc calls with profiling helpers (including streaming). |
| src/Aspire.Cli/Backchannel/AppHostAuxiliaryBackchannel.cs | Uses profiling JsonRpc wrappers; adds fallback messaging for missing console-log RPC and switches capability probe to request objects. |
Copilot's findings
- Files reviewed: 22/22 changed files
- Comments generated: 2
JamesNK
left a comment
4 items flagged: 1 stale dead-code discard, 1 hidden-resource filter contract clarity, 1 follow/tail asymmetry documentation, 1 queue capacity clarification. All are non-blocking observations — no bugs found.
JamesNK
left a comment
Additional finding: activity ownership transfer in streaming profiling helper.
Search doesn't need to be kept in the client. It's new in 13.3.
I'm surprised how slow this is without filtering in the app host. All other times the perf bottleneck has been DCP streaming to the app host. Is JSON-RPC really bad at streaming?
Keep all-resource log requests on the legacy RPC for compatibility with older aux.v2 AppHosts. Clarify server-side log filtering and streaming profiling ownership, and add focused coverage for all-resource compatibility paths. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Thanks for the overview. No code change was needed for this generated summary; I addressed the actionable review threads separately and pushed the follow-up commit.
Accepted these observations. I removed the stale discard and added comments clarifying hidden named-resource requests, follow/tail behavior, and the fixed-size tail queue window.
Accepted. I added a comment at the streaming profiling helper to document that activity ownership transfers to the returned enumerable and disposal depends on enumeration. |
Add an aux.v3 console log batching method so newer CLIs can reduce per-line JSON-RPC stream overhead while preserving the existing line-based and legacy fallbacks for older AppHosts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
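The fallback order described in these commits can be pictured with a small sketch. This is an illustration only, not the CLI's code: the function name, the returned labels, and the idea of representing capabilities as a set of strings are all assumptions made for the example; only the aux.v2 and aux.v3 names come from this PR.

```python
from typing import Optional, Set


def choose_console_log_path(capabilities: Set[str], resource_name: Optional[str]) -> str:
    """Pick the richest console-log path the connected AppHost supports.

    Hypothetical helper: the labels "batched", "line", and "legacy" are
    placeholders for the three paths the commits describe.
    """
    if "aux.v3" in capabilities:
        # Newer AppHosts: batched stream with server-side filtering.
        return "batched"
    if "aux.v2" in capabilities and resource_name is not None:
        # Older aux.v2 AppHosts: line-based request for a named resource.
        return "line"
    # All-resource requests against older AppHosts stay on the legacy RPC.
    return "legacy"
```

For example, `choose_console_log_path({"aux.v2"}, None)` falls through to the legacy path, which mirrors the compatibility note about all-resource requests above.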
IEvangelist
left a comment
Approving. Three non-blocking observations from review:
- WithProfilingContext hand-copies properties in 11 request types — adding a new property without updating the override silently drops it on the wire when profiling is enabled. A reflection-based contract test would prevent this.
- Server-side raw-content MatchesSearch can drop lines that the CLI's parsed-log search would match (ANSI-escape edge case).
- GetConsoleLogsRequest.ResourceName relaxed from required to string? is not reflected in the spec doc.
Nice perf win — 131s → 0.9s is a great result.
Normalize ANSI before server-side log search, document console-log ResourceName compatibility, and add contract coverage for WithProfilingContext property preservation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
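The ANSI edge case raised in review is easy to reproduce in isolation. The sketch below is an assumption-laden illustration, not the AppHost's MatchesSearch: the regular expression covers only CSI escape sequences, and case-insensitive substring matching is assumed rather than confirmed by this PR.

```python
import re

# Matches CSI sequences such as "\x1b[31m" (color) or "\x1b[0m" (reset).
# Other escape families (OSC, etc.) are deliberately out of scope here.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove CSI escape sequences so only the visible text remains."""
    return ANSI_CSI.sub("", text)


def matches_search(raw_line: str, search: str) -> bool:
    """Search the visible text of a raw log line (hypothetical helper)."""
    return search.lower() in strip_ansi(raw_line).lower()
```

A raw line like `"\x1b[31mnee\x1b[0mdle found"` does not contain the substring `needle` until the escapes are stripped, which is why searching raw content on the server could drop lines that a search over parsed text would keep.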
Accepted all three observations. I added the WithProfilingContext property-preservation contract test, normalized ANSI stripping in the server and CLI search paths with regression coverage, and documented ResourceName V2/V3 compatibility in the backchannel spec.
I don't know if this was an existing issue with Search with empty string returns everything: This makes sense. Search with no value returns nothing: I feel like it should either return everything, or CLI option validation should complain that a value is required. Edit: I investigated. The value of search is always the value after it, even if it matches another option, i.e.
Performance oddity:
Is tail filtered in the CLI? Edit: I investigated and tail isn't filtered in the app host when there are multiple resources. Is this something we could improve? Most of the time there won't be a single resource specified.
Move W3C trace context propagation to StreamJsonRpc activity tracing and keep Aspire profiling metadata in backchannel baggage. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a reusable ChannelExtensions batching helper for IAsyncEnumerable and use it for console log batch streaming. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
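As a rough picture of what batching an async stream means, here is a minimal sketch. It is not the ChannelExtensions helper from this PR: the function names are hypothetical, and it batches purely by count, whereas a production helper for live log streaming would presumably also flush a partial batch when the source goes idle.

```python
import asyncio
from typing import AsyncIterator, List, TypeVar

T = TypeVar("T")


async def batch(source: AsyncIterator[T], max_batch_size: int) -> AsyncIterator[List[T]]:
    """Group items from an async stream into lists of at most max_batch_size."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    current: List[T] = []
    async for item in source:
        current.append(item)
        if len(current) >= max_batch_size:
            yield current
            current = []
    if current:
        # Flush whatever is left once the source completes.
        yield current


async def _numbers(count: int) -> AsyncIterator[int]:
    """Stand-in for a stream of log lines."""
    for i in range(count):
        yield i


async def collect_batches(count: int, max_batch_size: int) -> List[List[int]]:
    """Drain the batched stream into a list for inspection."""
    return [b async for b in batch(_numbers(count), max_batch_size)]
```

With 75 items and a batch size of 100, `asyncio.run(collect_batches(75, 100))` yields a single batch, which is the shape of the "75 stream frames to 1 batch" reduction reported in the description.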
🎬 CLI E2E Test Recordings — 80 recordings uploaded (commit View all recordings
📹 Recordings uploaded automatically from CI run #25835514343
Documents the --search / -s option added in microsoft/aspire#17010, including server-side filtering behavior when connected to a v2+ AppHost auxiliary backchannel. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request created: #948
📝 Documentation has been drafted in microsoft/aspire.dev#948. Note: This draft PR needs human review before merging.
* Improve Aspire logs search performance
Push log search and tail filtering into the AppHost backchannel so the CLI does not need to transfer and parse large non-matching log streams. Add JSON-RPC profiling context propagation across CLI and Hosting to measure the path end to end.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address log search review feedback
Keep all-resource log requests on the legacy RPC for compatibility with older aux.v2 AppHosts. Clarify server-side log filtering and streaming profiling ownership, and add focused coverage for all-resource compatibility paths.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Batch console log streaming
Add an aux.v3 console log batching method so newer CLIs can reduce per-line JSON-RPC stream overhead while preserving the existing line-based and legacy fallbacks for older AppHosts.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address log search review comments
Normalize ANSI before server-side log search, document console-log ResourceName compatibility, and add contract coverage for WithProfilingContext property preservation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Use protocol trace context for backchannel profiling
Move W3C trace context propagation to StreamJsonRpc activity tracing and keep Aspire profiling metadata in backchannel baggage.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Share async enumerable batching helper
Add a reusable ChannelExtensions batching helper for IAsyncEnumerable and use it for console log batch streaming.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
Console log search is slow when an AppHost has large log streams because the CLI previously transferred every log line over JSON-RPC, parsed it, and then filtered locally. This changes the auxiliary backchannel so newer AppHosts can apply resource, search, hidden-resource, and single-resource tail filters before streaming logs back to the CLI. The CLI keeps its final client-side filtering for compatibility with older AppHosts and for parsed/display-name semantics.
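To make the single-resource filter path concrete, here is a minimal sketch of search followed by a fixed-size tail window. It is an illustration under stated assumptions, not the AppHost implementation: the function name is hypothetical, matching is assumed to be a case-insensitive substring test, and tail is assumed to apply to the lines that survive the search.

```python
from collections import deque
from typing import Iterable, List, Optional


def filter_console_logs(
    lines: Iterable[str],
    search: Optional[str] = None,
    tail: Optional[int] = None,
) -> List[str]:
    """Apply an optional search, then keep only the last `tail` matches."""
    matched = (
        line for line in lines
        if search is None or search.lower() in line.lower()
    )
    if tail is None:
        return list(matched)
    # A bounded deque keeps memory proportional to `tail`, not to the
    # total number of log lines scanned.
    return list(deque(matched, maxlen=tail))
```

Only the surviving lines would then need to cross the JSON-RPC boundary, which is the saving this PR targets; an empty search string matches every line in this sketch.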
This also adds profiling-only JSON-RPC spans for the CLI and Hosting backchannel path, with explicit trace context propagation in request objects so aspire logs --search can be measured end to end across processes. Newer AppHosts also expose an aux.v3 batched console-log stream so the CLI can reduce per-line JSON-RPC stream overhead without changing the existing line-based contract.
User-facing usage
Users keep using the same aspire logs command. The optimization is automatic when the connected AppHost supports the v2/v3 auxiliary backchannel.
Performance
E2E harness: 75,000 emitted log lines, 75 matching needle lines.
Batching drops the filtered tail result from 75 JSON-RPC stream frames to 1 batch in this scenario. The wall clock is comparable to the line-streaming server-side filter path because command startup and AppHost log scanning dominate after the full 75,000-line transfer is removed.
Validation
- dotnet build tests/Aspire.Cli.Tests/Aspire.Cli.Tests.csproj --no-restore /p:SkipNativeBuild=true
- dotnet build tests/Aspire.Hosting.Tests/Aspire.Hosting.Tests.csproj --no-restore /p:SkipNativeBuild=true
- dotnet test --project tests/Aspire.Cli.Tests/Aspire.Cli.Tests.csproj --no-build --no-launch-profile -- --filter-class "*.LogsCommandTests" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"
- dotnet test --project tests/Aspire.Hosting.Tests/Aspire.Hosting.Tests.csproj --no-build --no-launch-profile -- --filter-method "*.GetConsoleLogsAsync_AppliesSearchAndTailForSingleResource" --filter-method "*.GetConsoleLogBatchesAsync_AppliesSearchAndTailForSingleResource" --filter-method "*.GetConsoleLogsAsync_DoesNotApplyTailAcrossMultipleResources" --filter-method "*.GetConsoleLogsAsync_ExcludesHiddenResourcesWhenStreamingAllResources" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"
- dotnet test --project tests/Aspire.Hosting.Tests/Aspire.Hosting.Tests.csproj --no-build --no-launch-profile -- --filter-method "*.GetCapabilitiesAsyncReturnsCurrentCapabilities" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"
- dotnet test --project tests/Aspire.Hosting.Tests/Aspire.Hosting.Tests.csproj --no-build --no-launch-profile -- --filter-method "*.BackchannelTypes_FollowContractRules" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"
- ./eng/scripts/verify-startup-otel.sh --skip-build --target-aspire-path artifacts/bin/Aspire.Cli/Debug/net10.0/aspire --profiler-aspire-path artifacts/bin/Aspire.Cli/Debug/net10.0/aspire --post-start-delay 1
- aspire logs producer --search needle --tail 75 --format Json against a detached AppHost returned 75 matching lines from line 001000 needle through line 075000 needle; repeated batched runs measured 0.93s, 0.95s, and 0.87s wall clock.
Fixes #16981