Skip to content

Fix spurious type errors for provided types under parallel compilation#19969

Merged
abonie merged 10 commits into
dotnet:mainfrom
T-Gro:fix/tp-tcimport
Jul 2, 2026
Merged

Fix spurious type errors for provided types under parallel compilation#19969
abonie merged 10 commits into
dotnet:mainfrom
T-Gro:fix/tp-tcimport

Conversation

@T-Gro

@T-Gro T-Gro commented Jun 18, 2026

Copy link
Copy Markdown
Member

Two type-provider failures that appear only in full parallel builds, not in the IDE:

  • The same provided type used from multiple files was checked into separate entities, so identity comparisons failed and reported spurious FS0001 type mismatches. Provided-type entities are now interned by name, so all files linking a given provided type share one entity and parallel checking stays enabled.

  • Hosting a TypeProviders-SDK provider under an unoptimized compiler failed with "a field called tcImports must exist in the systemRuntimeContainsType closure", because the closure field the SDK reflects on was named by codegen and only matched in optimized builds. It is now captured stably in every configuration.

Intern provided-type entities by name so the same provided type linked from
multiple files under graph-based parallel checking yields one entity, avoiding
spurious FS0001 type mismatches from identity comparisons.

Capture the systemRuntimeContainsType closure stably so TypeProviders-SDK
providers load under an unoptimized compiler (the SDK reflects on a field
named tcImports whose name was previously codegen-dependent).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… fixes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@T-Gro T-Gro requested a review from abonie June 18, 2026 13:15
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

✅ No release notes required

@T-Gro

T-Gro commented Jun 18, 2026

Copy link
Copy Markdown
Member Author

CC @smoothdeveloper — thanks for the report and the debugging that pinned this down.

@smoothdeveloper

smoothdeveloper commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

@T-Gro thanks for getting to the bottom of it!

I'll check the changes solves the issue in my context (edit: change to CompilerImports.fs is in the PR already).

@github-actions github-actions Bot added ⚠️ Affects-Design-Time Tooling check: PR touches type providers or dependency manager ⚠️ Affects-Compiler-Output Tooling check: PR touches IL emission or codegen labels Jun 18, 2026
@github-actions

Copy link
Copy Markdown
Contributor

🔍 Tooling Safety Check — Affects-Compiler-Output, Affects-Design-Time
Affects-Compiler-Output: ProviderGeneratedTypeRoots sort order changed; closure IL shape differs
Affects-Design-Time: type-provider interning/hosting infrastructure modified

Generated by PR Tooling Safety Check · opus46 5.4M ·

@smoothdeveloper

Copy link
Copy Markdown
Contributor

Confirming the build issue is fixed in my context!

…esthost hang)

All 47 jobs passed except IcedTasks_Test_Debug, which timed out at 120min after all
but one test target completed. The identical suite passed in IcedTasks_Test_Release (8m)
and IcedTasks does not use type providers, so this PR's type-provider-only changes
cannot affect its build or test runtime. The CI agent reported 95% memory usage and
one net9.0 testhost hung. Re-running to clear the transient failure.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@T-Gro

T-Gro commented Jun 25, 2026

Copy link
Copy Markdown
Member Author

The only failing check was IcedTasks_Test_Debug (regression test), which timed out at the 120-minute job limit. All other 46 jobs passed, including IcedTasks_Build and IcedTasks_Test_Release (the identical test suite in Release completed in ~8 minutes). The agent log shows 95% memory usage and a single net9.0 testhost hung after every other target had already passed, so this is a transient resource-starvation timeout rather than a code issue — IcedTasks does not use type providers, and this PR only changes type-provider code paths.

/azp run fsharp-ci

@T-Gro T-Gro left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 This review was generated by AI (@expert-reviewer agent). Findings may contain inaccuracies — please verify independently.

Solid, well-documented fix. The name-based interning, the lock-free CAS append, the ConcurrentDictionary static-link map, and the explicit-lambda capture for the SDK reflection contract are all sound, and the two added tests are good guards (the Lazy + GetOrAdd pattern correctly runs create exactly once). Two inline notes below, plus one that could not be anchored because the file isn't in this diff:

  • AddProvidedTypeEntity is still called directly, bypassing interning, at src/Compiler/Checking/CheckDeclarations.fs:3136 (embedding nested generated types). If that path can ever target the same ModuleOrNamespaceType (same mangled name) as a GetOrInternProvidedEntity caller, the intern table and the entities queue diverge: a second Entity is appended for a name the intern table believes it already owns, reintroducing exactly the identity-mismatch class of bug this PR fixes. If embedding always operates on a freshly-built module being compiled (disjoint from the referenced provided-type modules), this is a non-issue — but it's worth a confirming comment, or routing that path through GetOrInternProvidedEntity too for safety.

Comment thread src/Compiler/TypedTree/TypedTree.fs Outdated
Comment thread src/Compiler/Driver/CompilerImports.fs
@T-Gro T-Gro added the AI-reviewed PR reviewed by AI review council label Jun 30, 2026
…then guard test, document embedding path

- TypedTree.fs: serialize the four lookup caches AddProvidedTypeEntity invalidates against concurrent provided-type interning via a lazily-created lock gated on a volatile hostsProvidedTypes flag (cacheOptByrefWithLock), so non-type-provider modules keep the lock-free fast path. Replaces the CAS append with a lock-guarded append+invalidate, closing the read-compute-store vs invalidation race.

- lib.fs/lib.fsi: add cacheOptByrefWithLock inline helper.

- ManglingNameOfProvidedTypes.fs: strengthen the systemRuntimeContainsType guard test (also assert no synthesized 'objectArg' receiver capture) and document/verify it in Debug and Release; add a concurrency stress test for cache coherence under interning.

- CheckDeclarations.fs: comment confirming the embedding path builds a fresh local module disjoint from interned provided-type modules.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@T-Gro

T-Gro commented Jun 30, 2026

Copy link
Copy Markdown
Member Author

Addressed all three.

The cache-coherence race is real. AddProvidedTypeEntity now appends and invalidates under a lazily-created lock, and the four tables it clears (TypesByMangledName, TypesByAccessNames, TypesByDemangledNameAndArity, AllEntitiesByCompiledAndLogicalMangledNames) run their read/compute/store under the same lock once a module has actually hosted a provided type. Modules that never host provided types stay on the lock-free cacheOptByref path, so there's no cost on the hot resolution path. The stale "only ever accessed from the compiler thread" comment is updated.

The guard test now also asserts the regression symptom is absent — no synthesized objectArg receiver capture on the closure — instead of only asserting tcImports is present, so it bites in the codegen-sensitive configuration rather than only where the SDK happens to still find the field. Verified passing against both the Debug and the optimized Release build of FCS.

The direct AddProvidedTypeEntity at the embedding site is safe by construction and now carries a comment saying so: it targets a freshly-built local module (mkLocalTyconRef) for the generated-type set, disjoint from the referenced provider modules interning serves and built on a single file's checking thread, so there is no shared ModuleOrNamespaceType for the intern table and the entities queue to diverge on.

Also added a concurrency stress test that interns many entities while readers hammer a guarded lookup table and asserts the derived view stays consistent with the entity set.

Copilot and others added 4 commits June 30, 2026 12:29
Replace the lock-based coherence scheme for ModuleOrNamespaceType's
provided-type-mutated lookup tables with a lock-free version-stamped
approach. 'entitiesVersion' is bumped (release) whenever 'entities' is
appended to, and each cached lookup table is tagged with the version it
was built from, so a reader on another graph-based-checking thread
recomputes after a concurrent provided-type append instead of trusting
a stale memo.

- lib: replace 'cacheOptByrefWithLock' with 'cacheOptByrefByVersion'.
- TypedTree: drop 'providedTypeCacheLock'/'hostsProvidedTypes' and the
  'ProvidedTypeCacheGate'; 'AddProvidedTypeEntity' now appends via a CAS
  loop and bumps 'entitiesVersion' last; the version-stamped caches need
  no explicit invalidation.
- Tests: exercise all four version-stamped lookup tables with more
  concurrent readers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The concurrency guard spun 32 hot reader threads against 100 writers over
20 iterations, continuously rebuilding the version-stamped lookup tables
under Server GC. That pegs CPU and piles up allocations on memory-limited
CI agents. Reduce to 6 cooperative (yielding) readers and 40 writers over
5 iterations: still interleaves many version bumps with concurrent reads to
guard the coherence invariant, without the resource spike.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread src/Compiler/TypedTree/TypedTree.fs
@github-project-automation github-project-automation Bot moved this from New to In Progress in F# Compiler and Tooling Jul 1, 2026
@abonie abonie enabled auto-merge (squash) July 2, 2026 07:21
@abonie abonie merged commit 95ff08f into dotnet:main Jul 2, 2026
56 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in F# Compiler and Tooling Jul 2, 2026
T-Gro added a commit to T-Gro/fsharp that referenced this pull request Jul 2, 2026
Follow-up to dotnet#19969, which interned provided *type* entities behind a
lock-free CAS append + entitiesVersion design. The namespace entities on a
provided type's path never got the same treatment: AddModuleOrNamespaceByMutation
appended non-atomically (losing updates that race AddProvidedTypeEntity's CAS
append) and the two namespace-materialization sites deduped with a non-atomic
check-then-act, so two threads could each build a disjoint namespace subtree and
strand the provided types interned under the orphan (spurious FS0001/FS0039).

Make the namespace append atomic (shared AppendEntityByMutation CAS loop) and
add GetOrInternNamespaceEntity, reusing the single providedEntitiesByMangledName
intern table (one mangled name -> one entity per parent). Its factory reuses any
pre-existing module/namespace of that name before creating, so a provider
extending a real namespace no longer forks a duplicate. Both injection sites route
through it. modulesByDemangledNameCache is now version-stamped since namespace
appends can run concurrently with ModulesAndNamespacesByDemangledName reads.

Fixes dotnet#20020

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

⚠️ Affects-Compiler-Output Tooling check: PR touches IL emission or codegen ⚠️ Affects-Design-Time Tooling check: PR touches type providers or dependency manager AI-reviewed PR reviewed by AI review council

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

4 participants