infra: publish Aspire CLI native AOT symbols (Win + Linux + macOS) to MSDL #17567

Draft
radical wants to merge 8 commits into microsoft:release/13.4 from radical:radical/cli-aot-pdb-publish


@radical
Member

@radical radical commented May 28, 2026

The Aspire CLI ships as a NativeAOT executable but its native debug symbols never reach MSDL/SymWeb. dotnet symbol --symbols against a shipped aspire (any platform) returns nothing, so customers and our own crash triage can't symbolicate stack traces from the CLI binary on any of Windows, Linux, or macOS.

$ dotnet symbol --symbols aspire.exe -o ./syms
Downloading from https://msdl.microsoft.com/download/symbols/
ERROR: Not Found

Root cause

ILC emits a symbol artifact next to the binary at artifacts/bin/Aspire.Cli/<config>/net10.0/<rid>/native/ on every NativeAOT build (aspire.pdb on Windows, aspire.dbg on Linux, aspire.dSYM/ on macOS — per Microsoft.NETCore.Native.targets), but it is then dropped on the floor:

  • <CopyOutputSymbolsToPublishDirectory>false</CopyOutputSymbolsToPublishDirectory> in src/Aspire.Cli/Aspire.Cli.csproj keeps it out of the publish dir.
  • The clipack archive (eng/clipack/Common.projitems) only stages the binary itself into the per-RID aspire-cli-<rid>.<zip|tar.gz>.
  • build_sign_native.yml only publishes packages/ as the native_archives_<rid> pipeline artifact, so the symbol artifact never reaches the downstream build stage.
  • The Windows build stage runs BuildAndTest.yml with /p:SkipNativeBuild=true, so it never produces the symbol artifact locally either.

End result: arcade's symbol-publishing infrastructure has no file to upload, even though it is fully wired up.

The fix

Arcade has two distinct symbol-publishing pipelines with different shape requirements; we use each for the platforms it supports:

  • Windows .pdb — loose-file path via FilesToPublishToSymbolServer (Publish.proj GatherPublishItems). Arcade's PrepLoosePdbsForPublish hard-filters loose files to .pdb/.dll, so this path is Windows-only.
  • Linux .dbg and macOS .dwarf are wrapped in a NuGet symbol package and routed via arcade's _ExistingSymbolPackage filter. SymbolUploadHelper.AddPackageToRequest opens the .symbols.nupkg with raw ZipFile.Open (SymbolUploadHelper.cs#L273), filters entries by an extension allowlist that includes .dbg/.dwarf/.so/.dylib (SymbolUploadHelper.cs#L37), and indexes with symbol.exe adddirectory. The symbol-server key is computed from the file's intrinsic build-id (ELF .note.gnu.build-id on Linux, Mach-O LC_UUID on macOS), giving SSQP keys _.debug/elf-buildid-sym-<id>/_.debug (Linux) and _.dwarf/mach-uuid-sym-<uuid>/_.dwarf (macOS) that dotnet-symbol resolves on lookup.
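
To make the key shapes in the last bullet concrete, here is a minimal Python sketch that builds the two lookup keys from raw identifier bytes. The function names are hypothetical, and the key layout (in particular the leading `_.debug` segment of the ELF key) follows my reading of SSQP_Key_Conventions.md, so it is worth checking against that document rather than treating it as authoritative.

```python
def elf_symbol_key(build_id: bytes) -> str:
    """SSQP-style key for a split ELF debug file (assumed layout).

    build_id is the raw byte sequence from the .note.gnu.build-id note.
    """
    if not build_id:
        raise ValueError("build-id must not be empty")
    return f"_.debug/elf-buildid-sym-{build_id.hex()}/_.debug"


def macho_symbol_key(lc_uuid: bytes) -> str:
    """SSQP-style key for a flat Mach-O DWARF file (assumed layout).

    lc_uuid is the 16-byte payload of the LC_UUID load command.
    """
    if len(lc_uuid) != 16:
        raise ValueError("LC_UUID payload must be exactly 16 bytes")
    return f"_.dwarf/mach-uuid-sym-{lc_uuid.hex()}/_.dwarf"
```

Both helpers emit lowercase hex with no separators, so the same key comes out whether the identifier was read from the shipped binary or from its symbol file.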

dotnet/runtime uses this same path for its CoreCLR and libraries native symbols.

The macOS lookup side is implemented in MachOFileKeyGenerator.cs, with the protocol documented in SSQP_Key_Conventions.md.

Three coordinated edits plumb the symbol artifacts from build_sign_native into the build stage's working directory before arcade's -publish runs:

  • eng/pipelines/templates/build_sign_native.yml
    • Windows agents stage aspire.pdb into artifacts/native-symbols-staging/<rid>/.
    • Linux agents pack aspire.dbg into Aspire.Cli.<rid>.<version>.symbols.nupkg using a minimal hand-built zip + nuspec.
    • macOS agents extract the inner Mach-O DWARF from aspire.dSYM/Contents/Resources/DWARF/aspire and ship it as aspire.dwarf in the same .symbols.nupkg shape. The file's LC_UUID matches the binary's, so dotnet-symbol's Mach-O lookup resolves to it.
    • All three platforms publish under the new per-RID pipeline artifact native_symbols_<rid>.
  • eng/pipelines/azure-pipelines.yml and azure-pipelines-unofficial.yml — Windows build job adds two DownloadPipelineArtifact@2 tasks: **/aspire.pdb into artifacts/native-symbols/ (consumed by the FilesToPublishToSymbolServer glob), and **/Aspire.Cli.*.symbols.nupkg into artifacts/native-symbol-pkgs/. A subsequent pwsh step copies the symbol packages into packages/<config>/Shipping so arcade's manifest generation picks them up as Symbols assets. The existing **/Aspire.Cli*.nupkg download is tightened to exclude .symbols.nupkg so the same file isn't downloaded twice.
  • eng/Publishing.props — project-level FilesToPublishToSymbolServer glob for the Windows pdbs only (the loose-file path is .pdb/.dll-only). Linux and macOS symbol packages flow through arcade's existing .symbols.nupkg routing without needing a separate property.

Why this approach (and not the alternatives)

  • The .dSYM directory bundle is not separately published. The Apple-native automatic symbolication path (lldb / atos / Instruments via Spotlight UUID indexing) needs the bundle form and is tracked by dotnet/runtime#88286 ("Distribute macOS symbols as dSYM, not .dwarf"). For server-mediated symbolication via dotnet-symbol (the primary CLI crash-triage workflow), the flat .dwarf we ship is the working format that dotnet/runtime itself uses.
  • AutoGenerateSymbolPackages stays false. That property controls arcade's managed-PDB → .symbols.nupkg wrapper for shipping NuGet packages, independent of the symbol publishing here.
  • Not unsetting <CopyOutputSymbolsToPublishDirectory>false</CopyOutputSymbolsToPublishDirectory>. Globbing directly from bin/<rid>/native/ avoids re-triggering the SymStore race that the comment at eng/clipack/Common.projitems warns about for the managed pdb. Keeps a clean separation between the managed-pdb path (suppressed) and the native-pdb path (this PR).
  • Hand-built .symbols.nupkg rather than a NuGet Pack invocation. Arcade's SymbolUploadHelper opens the package with raw ZipFile.Open, not NuGet OPC validation, so OPC compliance is unnecessary. A minimal .nuspec + symbol payload zip is sufficient and avoids adding a NuGet Pack task to the build_sign_native job (and the cross-platform tooling that would require).
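
As an illustration of the hand-built package described in the last bullet, the following Python sketch zips a symbol file together with a minimal .nuspec. It is not the PR's actual pipeline step: `pack_symbols`, the nuspec field values, and placing the symbol file at the archive root are all assumptions made for the example.

```python
import zipfile
from pathlib import Path

# Minimal nuspec; the metadata values are placeholders for illustration.
NUSPEC_TEMPLATE = """<?xml version="1.0" encoding="utf-8"?>
<package xmlns="http://schemas.microsoft.com/packaging/2013/05/nuspec.xsd">
  <metadata>
    <id>{package_id}</id>
    <version>{version}</version>
    <authors>Microsoft</authors>
    <description>Native debug symbols for {package_id}</description>
  </metadata>
</package>
"""


def pack_symbols(symbol_file: Path, rid: str, version: str, out_dir: Path) -> Path:
    """Write Aspire.Cli.<rid>.<version>.symbols.nupkg containing one symbol file."""
    package_id = f"Aspire.Cli.{rid}"
    out_dir.mkdir(parents=True, exist_ok=True)
    package = out_dir / f"{package_id}.{version}.symbols.nupkg"
    with zipfile.ZipFile(package, "w", zipfile.ZIP_DEFLATED) as archive:
        nuspec = NUSPEC_TEMPLATE.format(package_id=package_id, version=version)
        archive.writestr(f"{package_id}.nuspec", nuspec)
        # Entry location is an assumption; the description above only says
        # entries are filtered by extension.
        archive.write(symbol_file, arcname=symbol_file.name)
    return package
```

For example, `pack_symbols(Path("aspire.dbg"), "linux-x64", "13.4.0", Path("out"))` would produce `out/Aspire.Cli.linux-x64.13.4.0.symbols.nupkg`; the version string here is only a placeholder.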

Surprises and call-outs

  • CI structurally required build_sign_native → pipeline artifact → build job download for symbols to be on disk when -publish runs. A Publishing.props glob alone could not pick them up — the files aren't on the publishing agent without this plumbing.
  • This change cannot be PR-validated through GitHub Actions: azure-pipelines-public.yml does not run build_sign_native. Verified end-to-end on internal AzDO build 2985850 for the Windows path (an earlier revision): both 🟣Stage native AOT pdb steps succeed; the Windows build job's -publish log emits Uploading 'PdbArtifacts/Windows/native_symbols_win_<arch>/aspire.pdb' to the BAR with distinct relative paths per RID, keeping the same-named pdbs apart on MSDL via RelativePDBPath keying. A fresh AzDO build is in flight to exercise the Linux + macOS paths.
  • Path shape for the Linux .dbg is documented in Microsoft.NETCore.Native.targets (NativeOutputPath = $(OutputPath)native\, NativeSymbolExt = .dbg on Linux, StripSymbols=true default on non-Windows so debug info is split into the .dbg sidecar via objcopy --only-keep-debug + --add-gnu-debuglink).
  • macOS payload shape (aspire.dSYM/Contents/Resources/DWARF/aspire) confirmed locally on macOS NativeAOT publish; dwarfdump --uuid showed the inner file's UUID matches the binary's, which is the contract MachOFileKeyGenerator relies on for symbol-server lookup.
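
The UUID-matching contract in the last bullet can also be checked without Apple tooling. The Python sketch below is a rough illustration that assumes a thin (non-fat), little-endian, 64-bit Mach-O file; `read_lc_uuid` and `uuids_match` are hypothetical helper names, not part of this PR or of dotnet/symstore.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # 64-bit Mach-O magic, little-endian host order
LC_UUID = 0x1B             # load command carrying the 16-byte image UUID
HEADER_SIZE_64 = 32        # sizeof(struct mach_header_64)


def read_lc_uuid(data: bytes) -> bytes:
    """Return the 16-byte LC_UUID payload of a thin 64-bit Mach-O image."""
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags, _rsv = struct.unpack_from(
        "<8I", data, 0
    )
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")
    offset = HEADER_SIZE_64
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd == LC_UUID:
            return data[offset + 8 : offset + 24]
        if cmdsize < 8:
            raise ValueError("malformed load command")
        offset += cmdsize
    raise ValueError("no LC_UUID load command found")


def uuids_match(binary: bytes, dwarf: bytes) -> bool:
    """True when the binary and its DWARF companion carry the same UUID."""
    return read_lc_uuid(binary) == read_lc_uuid(dwarf)
```

Called with the bytes of `aspire` and of `aspire.dSYM/Contents/Resources/DWARF/aspire`, `uuids_match` should return True if the two files share an LC_UUID, which is the same comparison `dwarfdump --uuid` lets you make by eye.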

@github-actions
Contributor

github-actions Bot commented May 28, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17567

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17567"

Member

@joperezr joperezr left a comment

Assuming we have done a dry run and this works.

@radical radical changed the title from "infra: publish Aspire CLI native AOT pdbs (win-x64, win-arm64) to MSDL" to "infra: publish Aspire CLI native AOT symbols (Windows + Linux) to MSDL" May 28, 2026
@radical radical force-pushed the radical/cli-aot-pdb-publish branch 2 times, most recently from 6cc16ec to b432f9c Compare May 28, 2026 20:16
@radical radical changed the title from "infra: publish Aspire CLI native AOT symbols (Windows + Linux) to MSDL" to "infra: publish Aspire CLI native AOT symbols (Win + Linux + macOS) to MSDL" May 28, 2026
@radical radical force-pushed the radical/cli-aot-pdb-publish branch from b432f9c to a9ceac7 Compare May 28, 2026 21:43
@github-actions
Contributor

Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as retry-safe transient failures in the CI run attempt.
GitHub was asked to rerun all failed jobs for that attempt, and the rerun is being tracked in the rerun attempt.
The job links below point to the failed attempt jobs that matched the retry-safe transient failure rules.

@radical radical force-pushed the radical/cli-aot-pdb-publish branch from 4a45433 to 9917ea1 Compare May 29, 2026 01:31
@davidfowl davidfowl added this to the 13.4 milestone May 29, 2026
@radical radical force-pushed the radical/cli-aot-pdb-publish branch 4 times, most recently from 7eb23a7 to 153b3db Compare May 29, 2026 21:49
… MSDL

The Aspire CLI ships as a NativeAOT executable but its native debug
symbols never reached MSDL/SymWeb. `dotnet symbol --symbols` against
a shipped `aspire` binary returned nothing on any platform, so
customers and our own crash triage couldn't symbolicate stack traces.

ILC emits the symbol artifact next to the binary on every NativeAOT
build (.pdb on Windows, .dbg on Linux, .dSYM/ on macOS), but it gets
dropped: CopyOutputSymbolsToPublishDirectory=false keeps it out of
publish, clipack stages only the binary, and the Windows build job
runs with SkipNativeBuild=true. Arcade's symbol-publishing
infrastructure had no file to upload.

Plumb the symbol artifacts from build_sign_native into the Windows
build job before arcade's -publish runs, using each platform's
appropriate arcade pipeline:

* Windows .pdb → loose-file path via FilesToPublishToSymbolServer in
  eng/Publishing.props. Windows-only (PrepLoosePdbsForPublish is
  .pdb/.dll only).

* Linux .dbg / macOS .dwarf → packed into
  Aspire.Cli.<rid>.<version>.symbols.nupkg by build_sign_native.yml
  and routed via arcade's _ExistingSymbolPackage filter to
  SymbolUploadHelper. macOS ships the inner DWARF extracted from
  aspire.dSYM/Contents/Resources/DWARF/aspire as a flat aspire.dwarf;
  the .dSYM directory bundle isn't shipped (dotnet/runtime#88286
  tracks the long-term bundle distribution work).

A fail-fast coverage gate in download_native_symbols.yml asserts the
expected per-RID set arrived before -publish runs. Without it, a
silent 1ES.PublishBuildArtifacts@1 failure or itemPattern typo would
let the pipeline succeed while publishing zero symbols — a post-ship,
silent, unrecoverable failure mode. The gate uses
##vso[task.logissue] so missing artifacts surface in the build
summary's issues panel, not just the log.
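
A coverage gate of this kind can be sketched in a few lines of Python. This is not the content of download_native_symbols.yml: the `<root>/<rid>/` directory layout, the `EXPECTED` map, and the function names are assumptions chosen for illustration. Only the `##vso[task.logissue type=error]` logging command is taken from Azure Pipelines itself.

```python
from pathlib import Path

# Hypothetical per-RID expectations: RID -> glob pattern of the symbol payload.
EXPECTED = {
    "win-x64": "aspire.pdb",
    "linux-x64": "Aspire.Cli.*.symbols.nupkg",
    "osx-arm64": "Aspire.Cli.*.symbols.nupkg",
}


def missing_symbols(root: Path, expected: dict[str, str]) -> list[str]:
    """Return the RIDs whose symbol payload is absent under root/<rid>/."""
    missing = []
    for rid, pattern in sorted(expected.items()):
        rid_dir = root / rid
        if not rid_dir.is_dir() or not any(rid_dir.rglob(pattern)):
            missing.append(rid)
    return missing


def gate(root: Path, expected: dict[str, str]) -> int:
    """Log one pipeline issue per missing RID; non-zero return means fail."""
    missing = missing_symbols(root, expected)
    for rid in missing:
        print(f"##vso[task.logissue type=error]Native symbols for {rid} not found under {root}")
    return 1 if missing else 0
```

A pipeline step would call `sys.exit(gate(Path(...), EXPECTED))` before the publish step, so an empty download fails the job instead of silently publishing nothing.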

This change cannot be PR-validated through GitHub Actions
(azure-pipelines-public.yml doesn't run build_sign_native). See
docs/ci/cli-native-symbols.md for the full architecture (pipelines,
SSQP keys, ILC paths, upstream arcade/runtime/symstore references)
and eng/scripts/validate-cli-symbols.ps1 for the local round-trip
validator.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@radical radical force-pushed the radical/cli-aot-pdb-publish branch from 153b3db to 9626257 Compare May 31, 2026 18:17
…alidation

Companion to the pipeline changes that publish Aspire CLI native AOT
symbols (.pdb / .dbg / .dwarf) to MSDL. Validates the entire symbol
round-trip locally per-RID against a current build, without uploading
anything to MSDL.

Four checks per RID — A→D — each isolating one piece of the pipeline
so a failure points at a specific suspect:

* A. Identifier symmetry — binary's intrinsic ID (PDB GUID+Age, ELF
  BuildID, Mach-O LC_UUID) matches the symbol file's. If A holds,
  symstore's per-format key generator computes the same SSQP key
  from either side.

* B. Pack/extract round-trip — the symbol file inside our hand-built
  .symbols.nupkg is byte-identical to the source after Compress-Archive
  packing + Expand-Archive extraction. Linux/macOS only (Windows uses
  the loose-pdb path).

* C. dotnet-symbol round-trip — runs dotnet-symbol against a local
  HTTP symstore (built in-process with HttpListener; dotnet-symbol
  only accepts http(s) server paths, not file://) keyed per the SSQP
  convention. Exercises the exact lookup path customers hit against
  MSDL. Install/server-infrastructure failures fail Check C loudly
  rather than silently downgrading to SKIP, so a broken validator
  can't be mistaken for a clean run.

* D. Resolver-readable content — the platform symbolicator
  (atos / addr2line / llvm-symbolizer) resolves the binary's
  entry-point VA against the file Check C downloaded. Proves the
  bytes are usable debug info, not just bytes that happen to pass
  an SSQP round-trip.
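
The local-symstore idea behind Check C can be sketched as follows. The validator itself uses HttpListener from PowerShell; this Python version swaps in the standard library's `http.server` as a stand-in, and it assumes a client fetches `<server>/<key>` directly. `publish`, `start_store`, and `fetch` are hypothetical names.

```python
import functools
import http.server
import threading
import urllib.request
from pathlib import Path


class _QuietHandler(http.server.SimpleHTTPRequestHandler):
    def log_message(self, format, *args):
        pass  # keep request logging out of the validator's output


def publish(store_root: Path, key: str, payload: bytes) -> Path:
    """Lay a symbol file out on disk under its SSQP-style key."""
    target = store_root.joinpath(*key.split("/"))
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target


def start_store(store_root: Path) -> http.server.ThreadingHTTPServer:
    """Serve store_root over HTTP on an ephemeral loopback port."""
    handler = functools.partial(_QuietHandler, directory=str(store_root))
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server: http.server.ThreadingHTTPServer, key: str) -> bytes:
    """Download a key the way a symbol client would, bypassing any proxy."""
    host, port = server.server_address[:2]
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://{host}:{port}/{key}") as response:
        return response.read()
```

Pointing a real client at `http://127.0.0.1:<port>/` would then exercise the same lookup path; call `server.shutdown()` and `server.server_close()` when done.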

Skipped checks (genuinely-missing platform tools like dwarfdump
without Xcode CLT) don't fail the script — the SKIP message
identifies what's missing. Detailed usage, parameters, and examples
live in the script's comment-based help; run
`Get-Help eng/scripts/validate-cli-symbols.ps1 -Detailed`.

docs/ci/cli-native-symbols.md is the operating doctrine: when to run
(file-level triggers), how to triage a failed check, the per-RID
baseline for what a clean run looks like, the mapping from each
check to its production-pipeline counterpart, the pipeline
architecture reference (referenced from comments in
eng/Publishing.props and eng/pipelines/templates/build_sign_native.yml),
and the criteria under which the script can be retired.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@radical radical force-pushed the radical/cli-aot-pdb-publish branch from 9626257 to db9e79e Compare May 31, 2026 18:25
@davidfowl
Contributor

While you're here, we also want to upload symbols for aspire-managed.exe

@radical radical force-pushed the radical/cli-aot-pdb-publish branch from cb08c3d to c2224b6 Compare June 1, 2026 00:49
@radical
Member Author

radical commented Jun 1, 2026

aspire-managed.exe already ships with embedded PDBs — the repo defaults to DebugType=embedded (Directory.Build.props:23), so each .dll carries its portable PDB inside the PE debug directory (entry type 17). That's the alternative to MSDL upload, not a precursor; debuggers find the symbols directly in the binary without a server round-trip.

Verified against ~/.aspire/versions/13.4.0_9c260c29a6.../: 4 Aspire-owned assemblies have both an embedded PDB (debug-dir type 17) and a CodeView record (type 2) — aspire-managed.dll, Aspire.Dashboard.dll, Aspire.Hosting.RemoteHost.dll, Aspire.TypeSystem.dll. Runtime BCL .dlls use DebugType=portable + MSDL (CodeView-only), which is the route this PR sets up for the NativeAOT CLI binary (no embedded option exists for native AOT).

@radical radical modified the milestones: 13.4, 13.4.x Jun 1, 2026
@DamianEdwards
Member

DamianEdwards commented Jun 1, 2026

aspire-managed.exe already ships with embedded PDBs — the repo defaults to DebugType=embedded (Directory.Build.props:23), so each .dll carries its portable PDB inside the PE debug directory (entry type 17). That's the alternative to MSDL upload, not a precursor; debuggers find the symbols directly in the binary without a server round-trip.

Arguably, we should consider changing this for the bundle, i.e. optimizing for distribution/layout size rather than ease of debugging. For NuGet packages I think embedded debug symbols are generally the right trade-off, but we don't expect end users to debug the managed host or other parts of the bundle (e.g. dashboard, DCP), and even if they do, they can download the symbols from the symbol store.

Adds a per-RID MSBuild target _PackNativeAotSymbols (in
eng/clipack/Common.projitems, running AfterTargets=PackDotnetTool) that
produces the platform-specific debug-info payload as a first-class build
output of the per-RID clipack agent:

* Windows: copies aspire.pdb to artifacts/native-symbols-staging/ for
  arcade's FilesToPublishToSymbolServer loose-pdb glob.
* Linux/macOS: invokes a tiny helper project
  (eng/clipack/Aspire.Cli.NativeSymbols.proj) that uses NuGet's
  PackTask with TfmSpecificDebugSymbolsFile +
  AllowedOutputExtensionsInSymbolsPackageBuildOutputFolder to route the
  .dbg / .dwarf into a legacy-format .symbols.nupkg (NOT .snupkg) so
  arcade's _ExistingSymbolPackage filter classifies it as a Symbols
  asset and routes it via SymbolUploadHelper to MSDL. The empty
  companion .nupkg the helper produces alongside is discarded in a
  scratch dir so its filename doesn't collide with the real per-RID
  tool nupkg in Shipping/.

The AOT output path is hardcoded to Release/ to mirror _PublishProject's
Configuration=Release pin on the AOT publish, so a local
`build.sh -c Debug -pack` doesn't fail the existence guards looking in
the wrong directory.

Mirrors the prior-art pattern from dotnet/runtime
(runtime.native.System.IO.Ports), so symbol production is exercised on
every local pack rather than only on the internal pipeline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
radical and others added 3 commits June 1, 2026 15:16
Deletes the two PowerShell pack steps in build_sign_native.yml ("Stage
native AOT pdb" and "Pack native AOT symbols nupkg", ~95 lines of
inline heredoc that built .symbols.nupkg via raw ZipFile and staged
aspire.pdb). Production of these artifacts is now an MSBuild output of
the per-RID clipack project on the same agent, so the YAML only needs
to publish the staging directory.

The 1ES.PublishBuildArtifacts step is unchanged - it consumes the same
artifacts/native-symbols-staging/<rid>/ directory the MSBuild target
writes to.

Refreshes inline comments in eng/Publishing.props and
download_native_symbols.yml to reference the new MSBuild producer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The local validation script's pack/extract round-trip check existed
because the .symbols.nupkg construction was hand-rolled in PowerShell;
that's no longer Aspire's contract to prove. NuGet's PackTask now owns
the format, so symmetry against arcade's _ExistingSymbolPackage /
SymbolUploadHelper is no longer something the script needs to verify.

Retires Check B and renames the remaining checks (old C -> B, old D ->
C). Updates synopsis ("Four checks" -> "Three checks") and the
"Required before merging" trigger list to reference
eng/clipack/Common.projitems and eng/clipack/Aspire.Cli.NativeSymbols.proj
instead of the deleted YAML heredoc.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Re-describes the pipeline-architecture subsection so .symbols.nupkg
production is attributed to eng/clipack/Aspire.Cli.NativeSymbols.proj
(invoked from Common.projitems's _PackNativeAotSymbols target on the
per-RID build agent) rather than a pipeline-side PowerShell heredoc.
Links to dotnet/runtime's runtime.native.System.IO.Ports as the
prior-art pattern.

Updates the baseline table (3 checks, not 4), the production-to-script
mapping, and the validate-script trigger list to match the retirement
of Check B.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
4 participants