Fix MST anonymous auth + plugin ALC type identity, add AAS perf parity#201
Fix MST anonymous auth + plugin ALC type identity, add AAS perf parity#201
Conversation
Three layered bugs surfaced when running `mst_register` against an
unauthenticated MST test ledger after AAS signing:
1. MissingMethodException in CodeTransparencyClientHelper
`CoseSign1.Transparent.MST` is in the host AssemblyLoadContext (it
is in the shared "CoseSign1" prefix list), but
`Azure.Security.CodeTransparency` was plugin-local. The two
`CodeTransparencyClientOptions` types had distinct identities so
`ConfigureMstPerformanceOptimizations` would not bind at runtime.
Added Azure.Core, Azure.Security.CodeTransparency, Azure.CodeSigning
and Azure.Developer.ArtifactSigning to PluginLoadContext shared
prefixes so they always load alongside the shared CoseSign1.* libs.
2. Implicit DefaultAzureCredential fallback broke unauthenticated
instances
With no MST_TOKEN set the helper silently fell back to
DefaultAzureCredential, hanging on developer credentials or sending
bogus tokens to test ledgers. New auth contract:
explicit `--token-env`/`MST_TOKEN` > `--azure-auth` > anonymous
Anonymous (no Authorization header) is now the default. `--azure-auth`
is a new boolean flag that opts into `DefaultAzureCredential`.
Explicit `--token-env <NAME>` now fails fast if the env var is
missing or whitespace. The `--timeout` CTS now wraps credential
acquisition so DefaultAzureCredential cannot exceed user-set timeouts.
3. AAS performance parity with MST
Added `CoseSign1.Certificates.AzureArtifactSigning.Extensions.AasClientOptionsExtensions`
in the shared library (mirroring the MST extension layout). Provides:
* `ConfigureAasPerformanceOptimizations(this CertificateProfileClientOptions)`:
250 ms x 8 fixed retries instead of the SDK default 800 ms x 3
exponential, bounding interactive signing latency at ~2 s.
* `ConfigureAasSigningPerformance(taskRetryCount, taskTimeoutSeconds)`:
tunable `AzSignContextOptions` factory.
No retry-After header stripping (that is an MST-specific workaround
for the ledger eventual-consistency window; AAS honours Retry-After
correctly).
Wired into AzureArtifactSigningCertificateProviderPlugin.CreateProvider.
Required adding CoseSign1.Certificates.AzureArtifactSigning as a
ProjectReference to CoseSignTool.csproj so the shared library loads
in the host context, matching the MST pattern.
Tests
-----
* CodeTransparencyClientHelperTests: rewritten to cover anonymous
default, --azure-auth opt-in, fail-fast on explicit --token-env with
missing/whitespace value, token-wins-over-azure-auth precedence.
* MstAzureAuthFlagWiringTests: new, asserts both RegisterCommand and
VerifyCommand expose --azure-auth in Options and BooleanOptions so
the CLI parser accepts a bare flag.
* AasClientOptionsExtensionsTests: new, covers default + override paths
for both extension methods plus null-options guard.
* PluginLoadContextBasicTests: extended shared-assembly list to include
the four Azure SDK assemblies.
Verified end-to-end against the unauthenticated test ledger:
AAS sign (private trust) -> mst_register -> mst_verify all succeed.
Full solution test run: 1195 passed, 5 skipped, 0 failed.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
789cd17 to
699cbeb
Compare
The previous plugin AssemblyLoadContext design relied on prefix matching
(`CoseSign1`, `CoseHandler`, `CoseIndirectSignature`, plus four Azure
SDK prefixes) to decide which assemblies fall back to the host context. Two
real smells:
1. Prefix collisions. A 3rd-party plugin DLL whose name happened to start
with `CoseSign1` (e.g. a hypothetical `CoseSign1Plus.Plugin.dll`)
would be silently absorbed into the host context, breaking isolation
without warning.
2. Load-anchor coupling. `CoseSignTool.csproj` carried ProjectReferences on
`CoseSign1.Transparent.MST` and
`CoseSign1.Certificates.AzureArtifactSigning` purely to satisfy the
prefix match. Verified the host had zero `using` statements and zero
call-sites into either library — they existed in the host bin only as
load anchors so plugins could `share` them via the prefix list. Adding
any new transitive dep to those shared libs (e.g. another Azure SDK
package) silently broke plugins until someone noticed at runtime — that
is exactly how the four Azure SDK prefixes ended up in the shared list
in the first place.
Architectural fix (three coordinated phases in this commit)
-----------------------------------------------------------
Phase 1: Curated exact-name shared list with shared-first short-circuit.
`PluginLoadContext` is rewritten to use:
* a tiny framework prefix list (`System.`, `Microsoft.Extensions.`,
`Microsoft.NETCore.`) plus exact framework names (`System`,
`mscorlib`, `netstandard`, etc.) — these namespaces are owned by
Microsoft and have no third-party collision risk.
* an explicit `HashSet<string>` of repo-curated cross-boundary contract
assemblies, exact-name match only.
The shared decision now SHORT-CIRCUITS the per-plugin probing logic. If
a name is host-shared, `Load` returns null immediately — it does NOT
touch the plugin directory or `AssemblyDependencyResolver`. This
prevents a plugin-shipped duplicate of the same DLL from silently
splitting type identity.
Removed from sharing (these are now plugin-local):
* CoseSign1, CoseHandler, CoseIndirectSignature (loose prefixes)
* Azure.Core, Azure.Security.CodeTransparency, Azure.CodeSigning,
Azure.Developer.ArtifactSigning (prefix workarounds for the
type-identity bug Phase 2 fixes structurally)
* Newtonsoft.Json (leftover prefix that introduced unwanted coupling)
The new shared contract is exactly:
* CoseSignTool.Abstractions — plugin contracts
* CoseSign1.Abstractions — ICoseSigningKeyProvider, ISupportsScittCompliance
* CoseSign1.Headers — ICoseHeaderExtender (returned by CoseHeaderHelper)
Phase 2a: Unify `CertificateProviderPluginManager` onto
`PluginLoadContext`.
`LoadPluginFromAssembly` previously used `Assembly.LoadFrom`, which
loaded cert-provider plugins (AAS) into the host's default ALC. After
unification, AAS plugins get the same isolation as command plugins (MST,
IndirectSignature) — each in its own collectible ALC with its own
private dependency graph.
Phase 2b: Drop the host's load-anchor ProjectReferences on
`CoseSign1.Transparent.MST` and
`CoseSign1.Certificates.AzureArtifactSigning`. Verified host bin no
longer contains these libraries or any Azure SDK assembly; they live
only in their respective plugin subdirectories where they belong.
Also removed the dead `CoseHandler` ProjectReference from the MST
plugin csproj (no source code referenced it).
Phase 3: Replace the host's concrete-type downcast with a shared
capability interface.
`SignCommand` previously did:
if (provider is CoseSign1.Certificates.CertificateCoseSigningKeyProvider certProvider)
certProvider.EnableScittCompliance = ...;
That coupled the host to the concrete plugin-local type. Replaced with:
if (provider is ISupportsScittCompliance scittProvider)
scittProvider.EnableScittCompliance = ...;
`ISupportsScittCompliance` is a new interface in `CoseSign1.Abstractions`
implemented by `CertificateCoseSigningKeyProvider`. The host now sees
plugin-returned providers strictly through the shared abstraction.
Original triggering bug
-----------------------
The prior 699cbeb commit added Azure SDK prefixes to the shared list as
a workaround for `MissingMethodException` when MST was used after AAS
signing — caused by `CodeTransparencyClientOptions` having distinct
type identities in host vs plugin ALCs. Phase 2 makes that bug
STRUCTURALLY IMPOSSIBLE: the extension method's home assembly
(`CoseSign1.Transparent.MST`) and the Azure SDK type it extends are
now both plugin-local, co-located in the same ALC. No type-identity
split is reachable.
Test changes
------------
* `PluginLoadContextAdvancedTests`: replaced reflection-based probing
of the private `IsSharedFrameworkAssembly` with calls to the public
`PluginLoadContext.IsHostShared(string)`. Added negative tests for
the formerly-shared prefixes (`CoseSign1`, `CoseHandler`,
`CoseIndirectSignature`, Azure SDK, `Newtonsoft.Json`) and a
prefix-collision regression test (`CoseSign1Plus`,
`CoseHandlerExtensions`, etc. must not be shared).
* `PluginLoadContextBasicTests`: updated the shared-assemblies list to
reflect the curated set.
* `PluginLoadContextIntegrationTests`: updated the comprehensive
shared-list test; added two new regression tests:
- `Integration_PrefixCollisionImpostor_LoadsPluginLocallyNotHostShared`
proves a stub `CoseSign1Plus.dll` IS probed plugin-locally
(warning observed in stderr).
- `Integration_SharedNameWithDuplicateInPluginDir_ShortCircuitsResolver`
proves a stub `CoseSignTool.Abstractions.dll` in the plugin dir
is NOT probed when the name is host-shared (no warning emitted).
* `CertificateCoseSigningKeyProviderEnableScittTests`: added two tests
for the new `ISupportsScittCompliance` capability — interface
implementation check + round-trip through the interface mirroring
`SignCommand`'s host-side pattern. Includes assertion that the
interface lives in `CoseSign1.Abstractions` (catches accidental
moves to a plugin-local assembly).
Verified
--------
* dotnet build CoseSignTool.sln: 0 warnings, 0 errors.
* Full test suite: 1204 passed, 5 skipped, 0 failed (was 1195 before).
* Host bin no longer contains MST/AAS shared libs or any Azure SDK.
* AAS plugin subdir contains its full Azure SDK closure.
* CoseSignTool.exe help discovers all 4 command plugins and 1 cert
provider plugin correctly.
* Local sign + validate end-to-end pipeline works (1230-byte signature
produced from self-signed cert via host's local cert path).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous commit (d6afc6a) localized the AAS plugin's Azure SDK closure into its plugin subdirectory. Manual e2e then surfaced a real bug that the prefix removal exposed: COSE Sign failed. Could not load file or assembly 'System.ClientModel, Version=1.9.0.0, ...' Two coupled defects: 1. `CoseSignTool.csproj` plugin-deploy target had a blanket `!StartsWith('System.')` filter, stripping every System.* DLL from the plugin's deployed subdirectory on the assumption they were all framework assemblies. That is wrong for out-of-band NuGet packages that legitimately ship via plugins (System.ClientModel, System.Memory.Data, System.IO.Pipelines, System.Text.Encodings.Web, polyfill builds of System.Text.Json, etc.). 2. `PluginLoadContext.FrameworkPrefixes` had `System.` listed as a shared-prefix, so even when a plugin DID ship System.ClientModel.dll the ALC short-circuited the local probe and asked the default ALC, which had nothing (the host doesn't reference it after the load-anchor cleanup). Fixed both: * Plugin deployment now copies all System.* files. Only the truly framework BCL assemblies (mscorlib, netstandard, Microsoft.CSharp, WindowsBase, Microsoft.VisualBasic{.Core}, Microsoft.Win32.Primitives, Microsoft.Win32.Registry) are excluded. * FrameworkPrefixes drops `System.`. Bare `System` stays in FrameworkExactNames (BCL only). True BCL System.* assemblies still resolve correctly via the runtime's natural default-context fallback when the plugin-local probe finds nothing in the plugin directory. This intentionally allows the plugin to load its private System.* NuGet packages from its subdirectory without colliding with TPA, while preserving the host-shared identity for everything in the BCL. Tests ----- * `IsHostShared_WithSystemAssemblies_ShouldReturnTrue` was overstated — trimmed to `IsHostShared_WithBareSystemName_ShouldReturnTrue` (the bare `System` assembly is the only System-prefixed thing still on the hard-coded shared list). * `IsHostShared_WithSystemDotPrefix_ShouldReturnFalse` is new — documents that System.Collections, System.IO, System.ClientModel, etc. are no longer prefix-shared and asserts the contract change explicitly. BCL names still load via runtime fallback; NuGet System.* names are now plugin-local. * `IsHostShared_WithBareFrameworkNames_ShouldReturnTrue` updated to include bare `System` alongside mscorlib/netstandard/etc. * `IsHostShared_WithDifferentCasing_ShouldReturnTrue` swapped System.Collections / System.IO (no longer auto-shared) for plain `system` (still bare-name shared). * All other PluginLoadContext.Load behavioral tests continue to pass: BCL System.* still loads via fallback; plugin-private System.* now loads plugin-locally. Verified -------- * dotnet build CoseSignTool.sln: 0 warnings, 0 errors. * Full test suite: 1205 passed, 5 skipped, 0 failed (one new test). * AAS plugin subdirectory now contains its full System.* NuGet closure (System.ClientModel.dll, System.Diagnostics.DiagnosticSource.dll, System.IO.Pipelines.dll, System.Memory.Data.dll, System.Text.Json.dll, etc.). * Manual end-to-end against the AAS Private Trust + unauthenticated MST test ledger: AAS sign -> 8457-byte signature mst_register -> Registration completed successfully mst_verify --authorized-domains aasmsttest.confidential-ledger.azure.com -> Verification result: VALID, exit 0 All three steps succeed with the refactored isolated plugin loading. No MissingMethodException class of bug remains reachable. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CodeQL flagged two PluginLoadContextIntegrationTests methods that allocated a StringWriter and disposed it manually in a finally block. Convert each to a 'using' declaration and drop the manual Dispose call. Behaviour unchanged: Console.SetError(originalError) still runs in the finally before the using-scope disposes the writer. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ptions) CodeQL's NEUTRAL run on PR #201 surfaced `Generic catch clause` (cs/catch-of-all-exceptions, severity: note) on the two LoadFromAssemblyPath failure handlers in PluginLoadContext.Load. The pattern was preserved verbatim from the pre-refactor code, but the surrounding restructure made CodeQL re-flag it as introduced. Replaced the broad catch (Exception ex) filters with the same exception-set the repo already uses in `PluginLoader.LoadPluginWithContext`: catch (Exception ex) when (ex is FileNotFoundException or BadImageFormatException or FileLoadException or ArgumentException) This is the documented exception surface of `AssemblyLoadContext.LoadFromAssemblyPath` (corrupt or missing DLL, strong-name mismatch, invalid argument). Anything outside this set — `OutOfMemoryException`, `ThreadAbortException`, `StackOverflowException`, security failures from a hostile environment — should now propagate to the caller rather than be silently swallowed and masked behind a stderr warning. Behavioural impact: the only catch path that observably changes is the hostile / impossible cases above, which would have failed downstream anyway once the assembly was needed. The standard load-failure recovery paths (missing DLL, bad image, file-load conflict, bad argument) are unchanged. Verified: build green, `CoseSignTool.Abstractions.Tests` 122/122 pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Commit 7407b4a narrowed the two LoadFromAssemblyPath catch clauses to (FileNotFoundException, BadImageFormatException, FileLoadException, ArgumentException) — the documented exception surface of AssemblyLoadContext.LoadFromAssemblyPath. That filter is correct for direct API failures, but LoadFromAssemblyPath also transitively triggers module initializers and static constructors inside the assembly being loaded. Those can throw arbitrary exception types (TypeInitializationException, PlatformNotSupportedException, DllNotFoundException, etc.) which then propagate out of Load() and break plugin command dispatch. CI confirmed the regression: macOS x64 PluginIntegrationTests.IndirectSignCommand_WithValidArgs_ShouldSucceed and IndirectSignCommand_WithPayloadLocation_ShouldSucceed both flipped from green to ExitCode.UnknownError after 7407b4a. Restoring the broad `catch (Exception ex)` is correct at this plugin loader boundary: a misbehaving plugin dependency must be logged and swallowed so the runtime can fall through to the next probe path (default ALC, etc.) rather than crashing the host. CodeQL's `cs/catch-of-all-exceptions` (severity: note) is suppressed inline with the standard repo pattern (// CodeQL [SM02184] <rationale>) used elsewhere in the codebase for SM02196 / SM05137. Verified: build green, CoseSignTool.Tests PluginIntegrationTests 121/121 pass, CoseSignTool.Abstractions.Tests 122/122 pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Update on the CodeQL The narrow exception filter introduced in 7407b4a cleared the CodeQL annotation, but it broke Root cause: Fix (9bd0d3a): Restored the broad 🧑🎓 Human Learning — Why the broad catch is correct hereWhat changed: Reverted the narrow exception filter and added an inline CodeQL suppression with rationale. Why: The documented exception surface of Priority personas that drove this: Reliability Engineer (host stability under untrusted plugin behavior), Software Tester (regression caught by integration tests on macOS x64 — proving the change wasn't behavior-preserving as commit 7407b4a claimed), Support Engineer (a swallowed warning is debuggable; a host crash from a bad plugin is not). Learn more:
Verified locally:
CodeQL on the new HEAD will be re-checked once CI completes. |
📝 PR description Section 1 is stale — contradicts the actual codeThe Section 1 header describes an approach that the code no longer implements:
It reads like the description was written for an initial approach (force everything into the host ALC), then the code evolved to the cleaner design (keep plugins fully isolated, share only the three contract assemblies via exact-name list), but Section 1 was never updated. Sections 2–3, tests, and verification all accurately reflect the code — just Section 1 needs a rewrite to match the actual approach. |
Two earlier attempts to satisfy CodeQL's cs/catch-of-all-exceptions on the LoadFromAssemblyPath try/catch blocks failed: 1. Commit 7407b4a narrowed to (FileNotFoundException, BadImageFormatException, FileLoadException, ArgumentException) - the documented exception surface of LoadFromAssemblyPath. macOS x64 PluginIntegrationTests regressed (ExitCode.UnknownError) because LoadFromAssemblyPath transitively triggers module/type initializers that throw arbitrary types. 2. Commit 9bd0d3a restored a broad catch with a // CodeQL [SM02184] suppression marker. Tests passed but GitHub's OSS CodeQL does not honor Microsoft-internal SM-prefixed suppression IDs (those are filtered by a downstream Microsoft pipeline like Guardian/PoliCheck, not by the CodeQL query engine itself). Result: NEW alerts #881 and #882 (cs/catch-of-all-exceptions) surfaced on PR #201 at the new HEAD's merge ref. Local repro on Windows uncovered the actual exception leaking from the narrowed catch: 'Could not load file or assembly ... An operation is not legal in the current state. (0x80131509)' - HRESULT COR_E_INVALIDOPERATION = InvalidOperationException, which is NOT in the documented exception surface of LoadFromAssemblyPath but is in fact what the runtime throws when a transitive load conflict occurs. Enumerating-by-type is fragile (we already missed InvalidOperationException; who knows what other CLR-internal types can surface from initializer chains). The safest pattern at a plugin loader boundary is to catch everything EXCEPT the genuinely fatal / unrecoverable exceptions: catch (Exception ex) when (ex is not OutOfMemoryException and not StackOverflowException and not ThreadAbortException and not AccessViolationException) This is a *filtered* catch, which CodeQL's cs/catch-of-all-exceptions does not flag (the rule targets unconditional catch-all). It also matches the canonical 'is this exception fatal?' set used widely in CLR boundary code. Plugin-internal failures get logged and the runtime falls through to probe alternate paths; truly fatal conditions still propagate. Verified: build green, CoseSignTool.Tests PluginIntegrationTests 121/121 pass, CoseSignTool.Abstractions.Tests 122/122 pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Resolution status — HEAD All CI is green and CodeQL is clean.
The aggregate CodeQL check shows NEUTRAL solely because of the pre-existing infrastructure warning "1 configuration not found" and Node.js 20 / CodeQL Action v3 deprecation notices in Lessons from the three iterations on
🧑🎓 Human Learning — When to use the "exclude fatal" filter patternThe pattern: When it's appropriate: At trust boundaries where you must keep the host process alive in the face of misbehaving downstream code (plugin loaders, message handlers, cross-process serialization boundaries, request pipelines). The host's contract is "I run; downstream failures get logged, not propagated". Why it satisfies CodeQL: Why it's safer than enumeration: APIs document what they throw, but in .NET, exceptions can also surface from:
You cannot enumerate this exhaustively. You can enumerate what's truly unrecoverable. When not to use it: Anywhere downstream of the boundary. Application logic should catch specific exceptions for specific recovery actions. The exclude-fatal pattern is a last-resort log-and-continue at a process boundary, not a general-purpose error handler. Priority personas that drove this final design: Reliability Engineer (host stability), Software Tester (regression caught the over-narrow filter twice — in 7407b4a on macOS, in the second attempt on Windows locally), Software Architect (the loader IS the boundary; that's where defensive error handling belongs), Support Engineer (logged warning is debuggable; silent catch-all is not — and the message includes assembly name + path + ex.Message). PR #201 is ready for human review and merge. |
Addresses reviewer feedback from elantiguamsft on
`CodeTransparencyClientHelper.cs:95` ("should we add .ConfigureAwait(false)?").
Repo convention is to call `.ConfigureAwait(false)` on every await in
library code. Confirmed across the existing surface:
CoseHandler/CoseHandler.cs: 4 awaits, all ConfigureAwait(false)
CoseSign1.Transparent.MST/MstTransparencyService.cs: 5 awaits, all ConfigureAwait(false)
CoseSign1.Transparent.MST/MstPerformanceOptimizationPolicy.cs: 3 awaits, all ConfigureAwait(false)
The MST plugin (added in 699cbeb) was the outlier — 11 awaits, none with
`ConfigureAwait(false)`. The reviewer flagged the credential-acquisition
call specifically, but for consistency every await in the plugin is
updated:
CodeTransparencyClientHelper.cs (1)
line 95 — DefaultAzureCredential.GetTokenAsync
MstCommandBase.cs (5)
line 99 — File.ReadAllBytesAsync (signature)
line 138 — File.WriteAllTextAsync (json result)
line 268 — ReadAndDecodeCoseMessage
line 279 — CreateCtsClient
line 282 — ExecuteSpecificOperation
line 289 — WriteJsonResult (conditional)
RegisterCommand.cs (1)
line 64 — transparencyService.MakeTransparentAsync
VerifyCommand.cs (3)
line 112 — File.ReadAllBytesAsync (receipt)
line 114 — transparencyService.VerifyTransparencyAsync (with receipt)
line 120 — message.VerifyTransparencyAsync (embedded)
Why the reviewer is right: even though CoseSignTool.exe is a console app
with no SynchronizationContext (so the DEFAULT continuation behaviour is
already on the threadpool), the MST plugin is library-shaped code that
flows through `CoseSign1.Transparent.MST.MstTransparencyService`
(loaded in the plugin AssemblyLoadContext) and is also reachable from
async callers that may run inside other hosts in the future. Saying
`ConfigureAwait(false)` consistently:
* keeps the plugin safe to host inside a UI / ASP.NET sync context
that legacy callers might add,
* removes the per-await context-capture overhead (small, but real),
* matches the rest of the codebase so a future scan won't ratchet
against this plugin alone.
Verified: full solution build clean, `CoseSignTool.MST.Plugin.Tests`
56/56 pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Updated — thanks for catching this. The PR description was indeed written for the interim fix (commit Section 1 rewrite (PR description) now describes the actual final implementation, including a side-by-side before/after table:
The new Section 1 also adds a callout note up top so future readers see the "this was rewritten" context, and Section 4 records the in-PR review follow-ups (your 🐤 Human Learning — why PR description sync rot is dangerousA stale PR description is a worse problem than missing one because reviewers anchor on it. They read Section 1, build a mental model of the change, and then evaluate the diff against that model. If the description says "we add prefixes" but the code says "we remove prefixes," the reviewer either:
This is especially insidious on long-running PRs where the approach legitimately evolves mid-review. The right discipline is to treat the PR body like a checked-in design doc: every architectural commit should be paired with a description amend, and reviewers should explicitly call out drift (which is exactly what you did here). A small mitigation is to put a |
|
This PR resolves the issue reported in #203 |
Summary
Fixes three coupled bugs surfaced when running
mst_registeragainst an unauthenticated MST test ledger after AAS signing, then takes the architectural cleanup all the way down rather than working around the original ALC type-identity smell with another shared-prefix.MissingMethodExceptionfrom cross-ALC type identity mismatch —CoseSign1.Transparent.MST.dllwas loaded in the host AssemblyLoadContext (sharedCoseSign1prefix) butAzure.Security.CodeTransparencywas plugin-local, soCodeTransparencyClientOptionshad two distinct identities andConfigureMstPerformanceOptimizationswould not bind at runtime.DefaultAzureCredentialfallback broke against unauthenticated endpoints — silently fell back toDefaultAzureCredentialwhenMST_TOKENwas unset, hanging on developer credentials or sending bogus tokens to test ledgers.Changes
1. Plugin AssemblyLoadContext — eliminate prefix-collision smell and host load-anchor coupling
The previous prefix-matching shared list (
"CoseSign1","CoseHandler","CoseIndirectSignature", plus"Azure.*"workarounds) had two real problems:CoseSign1Plus.dllwould have been silently absorbed into the host context.CoseSignTool.csprojcarriedProjectReferencesonCoseSign1.Transparent.MSTandCoseSign1.Certificates.AzureArtifactSigningpurely so the prefix match could find them. Verified zerousingstatements or call-sites from host code into either library — they existed inbin\only as load anchors.The fix replaces prefix matching with an explicit, exact-name allow-list and lets the offending libraries become fully plugin-local:
StartsWithoverstring[]HashSet<string>+ small framework prefix listCoseSignTool.Abstractions,CoseHandler,CoseSign1,CoseIndirectSignature(all loose prefixes)CoseSignTool.Abstractions,CoseSign1.Abstractions,CoseSign1.Headers(exact names)CoseSign1.Transparent.MSTProjectReferencewas the load anchor)CoseSign1.Certificates.AzureArtifactSigningProjectReferencewas the load anchor)Azure.Core/Azure.Security.CodeTransparency/Azure.CodeSigning/Azure.Developer.ArtifactSigningis CoseSign1.Certificates.CertificateCoseSigningKeyProviderdowncast inSignCommandCoseSign1.Certificatesto be host-sharedis ISupportsScittCompliance(new shared interface inCoseSign1.Abstractions); concrete type can be plugin-localCertificateProviderPluginManager.LoadPluginFromAssemblyAssembly.LoadFrominto the default ALC (no isolation)PluginLoadContext— AAS plugin gets the same isolation as MST and IndirectSignatureSystem.*files (brokeSystem.ClientModeletc.)mscorlib,netstandard,Microsoft.CSharp, …)Load()exception handlercatch (Exception)catch (Exception ex) when (ex is not OutOfMemoryException and not StackOverflowException and not ThreadAbortException and not AccessViolationException)— accepts every assembly-load failure mode (incl. surpriseInvalidOperationExceptionon macOS arm64) while letting truly fatal process-killers propagate; CodeQLcs/catch-of-all-exceptionsdoes not flag filtered catchesNet result of this section: zero Azure SDK assemblies in the host bin, full plugin isolation, no host process changes required to support new plugin-private deps, no prefix-collision risk for 3rd-party plugins.
2. MST auth contract (BREAKING)
CodeTransparencyClientHelper.CreateClientAsyncprecedence is now:Authorizationheader is sent. Appropriate for unauthenticated MST instances (test ledgers).--azure-authis a new boolean flag that opts intoDefaultAzureCredential.--token-env <NAME>with missing/whitespace value now throwsInvalidOperationException(no silent downgrade).MST_TOKENand--azure-authare set, the token is used (no DefaultAzureCredential acquisition).--timeoutnow wraps credential acquisition soDefaultAzureCredential.GetTokenAsynccannot exceed user-set timeouts.3. AAS performance parity
New shared-library extension
CoseSign1.Certificates.AzureArtifactSigning.Extensions.AasClientOptionsExtensions(mirrors the MST extension layout):ConfigureAasPerformanceOptimizations(this CertificateProfileClientOptions)— Fixed mode, 250 ms × 8 retries (vs SDK default Exponential 800 ms × 3, ~2 s vs ~5 s ceiling).ConfigureAasSigningPerformance(taskRetryCount, taskTimeoutSeconds)— tunableAzSignContextOptionsfactory; defaults match SDK baseline.No retry-After header stripping (that is an MST-specific workaround for the ledger eventual-consistency window; AAS honours
Retry-Aftercorrectly).Wired into
AzureArtifactSigningCertificateProviderPlugin.CreateProvider.4. Code review follow-ups
.ConfigureAwait(false)to all 11 awaits in the MST plugin (matches the convention used inCoseHandler.cs,MstTransparencyService.cs,MstPerformanceOptimizationPolicy.cs). Resolves the ConfigureAwait review thread.try { … } finally { …; consoleOutput.Dispose(); }blocks to useusingdeclarations (CodeQLcs/missed-using-opportunityalerts #879/#880).Tests
CoseSignTool.MST.Plugin.TestsCodeTransparencyClientHelperTestsfor the new auth contract; addedMstAzureAuthFlagWiringTests(--azure-authis inOptions+BooleanOptionson both commands).[DoNotParallelize]+[TestInitialize]/[TestCleanup]snapshot/restore env vars for hermetic execution.CoseSign1.Certificates.AzureArtifactSigning.TestsAasClientOptionsExtensionsTestscovers default + override paths plus null-options guard.CoseSignTool.Abstractions.TestsIsSharedFrameworkAssemblytests with calls to the new publicPluginLoadContext.IsHostShared. New tests cover: curated cross-boundary contracts, formerly-shared prefixes (now NOT shared), Azure SDK / Newtonsoft.Json (NOT shared), 3rd-party prefix-collision impostors (NOT shared),System.*resolution via default-context fallback, prefix-collision regression (CoseSign1Plus.dllloads plugin-locally), shared-name resolver-bypass guarantee.CoseSign1.Certificates.TestsISupportsScittCompliancecross-boundary contract tests (interface inCoseSign1.Abstractions, implemented byCertificateCoseSigningKeyProvider, round-trip toggling via the interface).Verification
End-to-end against the unauthenticated test ledger
https://aasmsttest.confidential-ledger.azure.com:bin\audit on the host confirms the architectural change: zeroAzure.*, zeroCoseSign1.Transparent.MST, zeroCoseSign1.Certificates.AzureArtifactSigning. All Azure SDK and plugin-implementation assemblies live exclusively inbin\plugins\<PluginName>\.Breaking-change note
Users who previously relied on the implicit
DefaultAzureCredentialfallback (noMST_TOKENset) must now pass--azure-authexplicitly.docs/MST.mdanddocs/Plugins.mdare updated.Pre-PR review
Hey Jeromy / code-review pass surfaced one real bug pre-push:
CodeTransparencyClientHelperTestsmutated process-wide env vars without[TestCleanup], leaking state if a test threw mid-flight. Fixed before pushing.Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com