Surface silent integration-assembly load failures (#16729)#16733
Surface silent integration-assembly load failures (#16729)#16733IEvangelist wants to merge 1 commit intomicrosoft:mainfrom
Conversation
|
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 16733Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 16733" |
| // across the ALC boundary. | ||
| try | ||
| { | ||
| return Default.LoadFromAssemblyName(new AssemblyName(SharedAssemblyName)); |
There was a problem hiding this comment.
I don't think this is correct. The TypeSystem assembly needs to be loaded into the default ALC. See the comment that was deleted.
There was a problem hiding this comment.
Thanks Eric — want to make sure I understand which invariant you're flagging, because I think we may actually agree and the comment in the diff is what's misleading.
The "TypeSystem must live in Default ALC" invariant is satisfied here. AssemblyLoadContext.Default.LoadFromAssemblyName(...) is an instance method on the Default ALC, so the assembly returned is owned by Default — it is not loaded into IntegrationLoadContext. The new regression test asserts this directly:
var resolved = alc.LoadFromAssemblyName(mismatched);
Assert.Same(hostTypeSystem, resolved);
Assert.Same(AssemblyLoadContext.Default, AssemblyLoadContext.GetLoadContext(resolved));tests/Aspire.Hosting.RemoteHost.Tests/IntegrationLoadContextTests.cs — AspireTypeSystem_IsSharedFromDefaultContext_EvenWhenRequestedVersionDoesNotMatch. Same Assembly instance, same Default ALC ownership, so type identity for ICodeGenerator / ILanguageSupport / AtsContext is preserved across the ALC boundary just like before.
The only behavioral difference between this and the old return null path is the version constraint the runtime applies when it asks Default to resolve:
| Path | What happens for an integration that references Aspire.TypeSystem, Version=42.42.42.42 while the bundled copy is 13.3.0.0 |
|---|---|
Old (return null) |
CLR re-runs the bind through Default using the requested AssemblyName (carrying Version=42.42.42.42). Default's binder enforces strict version match → FileLoadException → ReflectionTypeLoadException from GetTypes() → silently swallowed by the resolver. This is exactly #16729. |
New (explicit Default.LoadFromAssemblyName(new AssemblyName("Aspire.TypeSystem"))) |
Default resolves by simple name only and returns its own 13.3.0.0 copy. Same Default ALC, same type identity — just no strict version gate. Canonical "host-shared assembly" pattern for plugin loaders. |
So the change is not "load TypeSystem somewhere other than Default" — it's "stop letting Default's strict version match veto a bind we know is safe by simple name."
Could you confirm which of the following is the actual concern, so I can address it correctly?
- You read it as loading into IntegrationALC — in that case the test above should resolve it, and I'll rewrite the inline comment to make that explicit (frame it like the original deleted comment did).
- You want strict version match preserved as an invariant and prefer fixing the upstream build that ships
Aspire.Hosting.CodeGeneration.*libs stamped against42.42.42.42while the bundledAspire.TypeSystem.dllis13.3.0.0. If that's the position, I'm happy to drop this commit, keep only the diagnostic hardening (warnings, better error messages,WarnIfSharedAssemblyMismatch, log-level promotions), and file a separate build-infra issue. That would still convert Harden diagnostics when integration-assembly resolver discovers zero contributors #16729 from a silent failure into a loud, self-explanatory one without the loader compensating for build drift. - Something else entirely — happy to be told.
Either way I'll wait for your call before pushing more.
|
I'm still not fully following how does someone reach this in real life. What is the main scenario here of a realistic app that would be broken by this? |
When `aspire new` is run with a TypeScript template against a CLI bundle whose `Aspire.TypeSystem` version doesn't match the integration assemblies on disk, `ReflectionTypeLoadException` thrown by the resolver is swallowed at LogDebug level and the user sees only: No code generator found for language: TypeScript No language support found for: typescript/nodejs This makes the binary mismatch invisible. This change improves the diagnostic chain at every layer without altering the success path: * `CodeGeneratorResolver` / `LanguageSupportResolver` now log `ReflectionTypeLoadException` at Warning level and include `LoaderExceptions` text in the message. * When an assembly named `Aspire.Hosting.CodeGeneration.*` produces zero contributors, log a Warning so the silent-failure case is visible. * `AssemblyLoader.LoadAssemblies` performs an `Aspire.TypeSystem` version sanity check at startup against the libs directory and warns on mismatch. * `LanguageService` / `CodeGenerationService` error messages now list the available languages, or point at the apphost server log + binary mismatch when zero are discovered. * `PrebuiltAppHostServer` promotes apphost-server stdout/stderr capture from Trace to Debug/Information, so warnings emitted by the apphost server reach the default file log. Adds `CodeGeneratorResolver.GetSupportedLanguages()` and an internal test-only constructor on both resolvers that takes a synthetic assembly list, along with covering tests. Closes microsoft#16729 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
fe44250 to
ac298d4
Compare
|
Status update — rebased onto The originally-reported So I've:
Net effect: the branch is now a single commit on top of Thanks @eerhardt for pushing on the binding question — the back-and-forth is what made the "this isn''t a real shipped-build bug" conclusion clear. |
Description
This PR hardens the diagnostic chain so that when the integration-assembly load path does fail — for any reason: build-pipeline version skew, stale local NuGet cache, transitive dependency mismatch, partially populated probe directory, etc. — the failure is loud and self-explanatory instead of surfacing only as:
The original report (#16729) was a cryptic CLI failure on
aspire newwith TypeScript templates. After investigation, that specific symptom traced to a stale local NuGet cache anomaly on the reporter''s machine (a mix of package contents from switching between PR / staging / stable13.3.0builds) and is not reproducible after clearing~/.nugetand~/.aspire. So this PR no longer attempts a runtime binding fix — it is diagnostic hardening only, which is the part that has lasting value regardless of which upstream condition produced the resolver miss.Changes
CodeGeneratorResolver/LanguageSupportResolvernow logReflectionTypeLoadExceptionat Warning level and include theLoaderExceptionstext in the message (was previouslyLogDebug, which is below the file logger''s default Information threshold and never reached the user log).Aspire.Hosting.CodeGeneration.*is loaded but contributes zeroICodeGenerator/ILanguageSupporttypes, log a Warning so the silent-failure case is visible.AssemblyLoader.LoadAssembliesperforms anAspire.TypeSystemversion sanity check at startup against the libs directory and warns when the bundled and probed versions diverge — surfacing this lets us catch build-pipeline regressions early even if the runtime tolerates the skew.LanguageService/CodeGenerationServiceerror messages now list the available languages, or point at the apphost-server log + binary mismatch when zero are discovered.PrebuiltAppHostServerpromotes spawned-process stdout/stderr capture fromTracetoDebug/Information, so warnings emitted by the apphost server actually reach the default file log.Internal-only API additions
CodeGeneratorResolver.GetSupportedLanguages()so the service can list available languages in error messages.Func<IReadOnlyList<Assembly>>, used by the new tests.IntegrationLoadContext.GetSharedAssemblyNames()so the assembly loader''s version sanity check stays in sync with the actual load context policy.Tests
New coverage in
tests/Aspire.Hosting.RemoteHost.Tests/:ResolverDiagnosticsTests(4) — proves the resolvers log Warnings on zero-contributor assemblies and onReflectionTypeLoadException.ServiceErrorMessageTests(4) — proves the user-visible error messages list available languages or point at the binary-mismatch scenario.All
Aspire.Hosting.RemoteHost.Testspass locally.Notes
release/13.3tomain. The branch was reset toupstream/mainand the diagnostic commit was cherry-picked clean.IntegrationLoadContext.Loadchange that resolvedAspire.TypeSystemby simple name from the default ALC, intended to tolerate42.42.42.42↔13.3.0.0version skew. That change has been dropped along with its regression test, since the originally-reported failure turned out to be a local cache anomaly rather than a build-pipeline skew, and shipping a binding-policy change without a confirmed root cause to defend against would be premature. The hardening here keeps that scenario visible if it ever does occur.Closes #16729
Checklist
ResolverDiagnosticsTests(4) andServiceErrorMessageTests(4) covering the new diagnostic paths.CodeGeneratorResolver.GetSupportedLanguages(), the additional resolver constructors, andIntegrationLoadContext.GetSharedAssemblyNames()are allinternal; the resolvers themselves areinternal sealed.