
shard sema and codegen#18

Merged
dylan-conway merged 67 commits into upgrade-0.15.2 from ali/sanic-gotta-go-fast on Apr 9, 2026

alii (Member) commented Apr 7, 2026

image

…threads

Replace the single llvm.Object with a PartitionSet of N parallel Builders.
Each Nav is assigned to one shard via hash(nav.fqn)%N (with the InternPool
nav index appended to disambiguate non-unique fqns). updateFunc/updateNav
route to the owning shard under a per-shard mutex; separate_thread is
enabled for stage2_llvm so IR construction overlaps Sema and runs N-wide.
Each shard emits its own object via parallel toBitcode + opt + emit, and the
linkers consume N partition objects (MachO/ELF self-hosted, Lld ELF).
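
The shard routing described above can be sketched roughly as follows. This is an illustrative C++ model, not the PR's Zig code: `Nav`, `shardFor`, `Shard`, and `ShardSet` are hypothetical names, and FNV-1a is only an example hash, since the description does not say which hash function is used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a Nav: a fully qualified name plus its
// InternPool index, which disambiguates non-unique fqns.
struct Nav {
    std::string fqn;
    std::uint32_t index;
};

// FNV-1a, used here only as an example hash (an assumption).
inline std::uint64_t fnv1a(std::uint64_t h, const void* data, std::size_t len) {
    const auto* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 1099511628211ull;
    }
    return h;
}

// Hash the fqn, then mix in the nav index, and reduce modulo the shard count.
inline std::size_t shardFor(const Nav& nav, std::size_t shard_count) {
    assert(shard_count > 0);
    std::uint64_t h = 14695981039346656037ull;
    h = fnv1a(h, nav.fqn.data(), nav.fqn.size());
    h = fnv1a(h, &nav.index, sizeof nav.index);
    return static_cast<std::size_t>(h % shard_count);
}

// One builder per shard, each guarded by its own mutex so that updates to
// different shards do not contend with each other.
struct Shard {
    std::mutex mutex;
    std::vector<std::string> emitted; // stand-in for per-shard IR
};

struct ShardSet {
    std::vector<Shard> shards;
    explicit ShardSet(std::size_t n) : shards(n) {}

    // Route an update to the owning shard under that shard's lock.
    std::size_t update(const Nav& nav) {
        const std::size_t id = shardFor(nav, shards.size());
        std::lock_guard<std::mutex> guard(shards[id].mutex);
        shards[id].emitted.push_back(nav.fqn);
        return id;
    }
};
```

Because the shard is a pure function of the name and index, any thread can compute the owner of a Nav without consulting shared state; only the owning shard's mutex is taken.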

Cross-shard linkage: owned navs become external+hidden definitions, others
emit matching declarations. Anonymous constants are linkonce_odr so comptime
pointer identity is preserved across generic instantiations in different
shards. Singletons (error name table, lt_errors_len, module asm) live in
shard 0; exports become aliases so cross-shard fqn references stay valid.
Extern declarations are deduped by name since genDecl's collision-replace
only runs in the owning shard.
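
As a rough model of the linkage rules above (not the actual LLVM builder calls), the per-shard decision can be written as a small pure function. `SymbolPlan`, `planNav`, and `ExternTable` are hypothetical names, and reading "matching declarations" as "same linkage and visibility as the definition" is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

enum class Emit { Definition, Declaration };
enum class Linkage { External, LinkOnceODR };
enum class Visibility { Default, Hidden };

struct SymbolPlan {
    Emit emit;
    Linkage linkage;
    Visibility visibility;
};

// A named nav is defined only in its owning shard (external + hidden);
// every other shard that references it emits a declaration instead.
inline SymbolPlan planNav(std::size_t owner_shard, std::size_t this_shard) {
    const Emit emit =
        owner_shard == this_shard ? Emit::Definition : Emit::Declaration;
    return SymbolPlan{emit, Linkage::External, Visibility::Hidden};
}

// Extern declarations are deduplicated by name within a shard, because the
// collision-replace logic only runs in the owning shard.
struct ExternTable {
    std::unordered_set<std::string> seen;

    // Returns true the first time a name is declared, false afterwards.
    bool declareOnce(const std::string& name) {
        assert(!name.empty());
        return seen.insert(name).second;
    }
};
```

Anonymous constants are the exception to the single-owner rule: per the description they are emitted as `linkonce_odr`, so each shard that needs one can carry a copy and the linker keeps exactly one.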

zig_llvm.cpp: factor the optimization pipeline into runOptimizationPipeline
and split the unoptimised module before opt+emit so the existing
SplitModule path runs N-wide on the partitions instead of after a serial
whole-module opt pass. Hoist target registration before the emit fanout.
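
The reordering above (split first, then optimise and emit each partition in parallel, with one-time target registration hoisted out of the fan-out) can be sketched like this. `Partition`, `registerTargets`, `optimize`, and `emit` are hypothetical stand-ins; the real code works on LLVM modules through `runOptimizationPipeline`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

struct Partition {
    std::string ir;     // stand-in for an unoptimised module partition
    std::string object; // stand-in for the emitted object file
};

// Counts how often global setup ran, so the hoisting can be checked.
inline int& registrationCount() {
    static int count = 0;
    return count;
}

// Stand-in for one-time, non-thread-safe global setup.
inline void registerTargets() { ++registrationCount(); }

// Stand-ins for the per-partition optimisation pipeline and emit step.
inline void optimize(Partition& p) { p.ir += ";opt"; }
inline void emit(Partition& p) { p.object = "obj(" + p.ir + ")"; }

// Run opt + emit N-wide: one worker thread per partition.
inline void optAndEmitAll(std::vector<Partition>& parts) {
    registerTargets(); // hoisted: runs once, before any worker starts
    std::vector<std::thread> workers;
    workers.reserve(parts.size());
    for (Partition& p : parts) {
        workers.emplace_back([&p] {
            optimize(p);
            emit(p);
        });
    }
    for (std::thread& t : workers) t.join();
}
```

Each worker touches only its own partition, so no locking is needed inside the fan-out; the serial whole-module optimisation pass disappears from the critical path.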

InternPool: fix three latent races/encodings exposed by Sema running on
non-main tids — (1) the cancel/re-acquire pattern in get() for slice
ptr_type/ptr/aggregate didn't re-check .existing after re-locking;
(2) Key.Func.zir_body_inst_extra_index is borrowed from the generic owner
but was indexed via the instance's tid; add zir_body_inst_tid;
(3) CaptureValue.idx:u30 cannot hold a Nav.Index encoded with tid_shift_32;
repack nav_val/nav_ref via tid_shift_30.
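
Fix (3) is easiest to see with concrete bits. The sketch below assumes the thread id occupies the top `tid_width` bits of an index, so that a shift of `32 - tid_width` places it in a 32-bit word and a shift of `30 - tid_width` places it in a 30-bit field. `TidLayout` and its methods are hypothetical; only the `tid_shift_32`/`tid_shift_30` naming comes from the description.

```cpp
#include <cassert>
#include <cstdint>

struct TidLayout {
    unsigned tid_width; // bits reserved for the thread id, assumed 1..29

    unsigned shift32() const {
        assert(tid_width >= 1 && tid_width <= 29);
        return 32 - tid_width;
    }
    unsigned shift30() const {
        assert(tid_width >= 1 && tid_width <= 29);
        return 30 - tid_width;
    }

    // Encode (tid, local index) with the tid in the top bits of 32.
    std::uint32_t pack32(std::uint32_t tid, std::uint32_t local) const {
        assert(tid < (1u << tid_width));
        assert(local < (1u << shift32()));
        return (tid << shift32()) | local;
    }

    // Re-encode a 32-bit index so that it fits in a 30-bit field.
    std::uint32_t repack30(std::uint32_t idx32) const {
        const std::uint32_t tid = idx32 >> shift32();
        const std::uint32_t local = idx32 & ((1u << shift32()) - 1u);
        assert(local < (1u << shift30())); // local part must fit in fewer bits
        return (tid << shift30()) | local;
    }

    // Inverse of repack30: recover the 32-bit encoding.
    std::uint32_t unpack30(std::uint32_t idx30) const {
        const std::uint32_t tid = idx30 >> shift30();
        const std::uint32_t local = idx30 & ((1u << shift30()) - 1u);
        return pack32(tid, local);
    }
};
```

Storing the 32-bit encoding directly in a 30-bit field would drop the top two bits, which are part of the thread id; repacking moves the thread id down so that nothing is truncated.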

Also adds the scaffolding to spawn .analyze_func to worker threads under a
recursive sema_lock with per-AnalUnit claims (gated on ZIG_PARALLEL_SEMA;
currently serialized pending the WipNamespaceType publish-before-finish
window), --llvm-shard-stats, ZIG_JOB_STATS profiling, and -Dllvm-has-polly
for linking against an LLVM built with Polly.
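
One plausible shape for the per-unit claim mechanism is sketched below. This is an assumption about the semantics, not the PR's implementation: the first thread to claim a unit analyses it, and later claimants block until the claim is released. `ClaimTable` is a hypothetical name, and the unit id is reduced to a plain integer.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <unordered_set>

class ClaimTable {
public:
    // Returns true if the caller now owns the unit and must analyse it.
    // Returns false if another thread owned it; in that case the call
    // only returns once that thread has released its claim.
    bool claimOrWait(std::uint32_t unit) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (done_.count(unit) != 0) return false;
        if (in_progress_.insert(unit).second) return true;
        cv_.wait(lock, [&] { return done_.count(unit) != 0; });
        return false;
    }

    // Marks the unit as finished and wakes every waiter.
    void releaseClaim(std::uint32_t unit) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(in_progress_.count(unit) == 1);
            in_progress_.erase(unit);
            done_.insert(unit);
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::unordered_set<std::uint32_t> in_progress_;
    std::unordered_set<std::uint32_t> done_;
};
```

A thread that re-enters analysis of a unit it already owns would deadlock in this simplified model; handling that case is presumably part of why the real lock is recursive.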
coderabbitai[bot]

This comment was marked as outdated.


alii added 2 commits April 7, 2026 22:33
The parallel-sema fast-path in ensureMemoizedStateUpToDate probed
.Type/.panic/.assembly, which are written before the rest of their stage,
so a thread could observe the stage as 'done' while later entries were
still .none. Probe the last entry of each stage instead and pair an
acquire load with release stores in analyzeMemoizedState.
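
The probe fix can be illustrated with a minimal C++ model. `Stage`, `populate`, and `isDone` are hypothetical names and the stage is reduced to four integer slots; the point is only that the reader probes the slot written last and that the load and stores form an acquire/release pair.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t kNone = 0; // stand-in for the .none sentinel

struct Stage {
    // Entries are written in index order, so the last slot is written last.
    std::array<std::atomic<std::uint32_t>, 4> entries{};

    // Writer: fill every entry with a release store.
    void populate(const std::array<std::uint32_t, 4>& values) {
        for (std::size_t i = 0; i < entries.size(); ++i) {
            assert(values[i] != kNone);
            entries[i].store(values[i], std::memory_order_release);
        }
    }

    // Reader fast path: probe the LAST entry with an acquire load. If it is
    // set, every earlier store in populate() is visible to this thread.
    bool isDone() const {
        return entries.back().load(std::memory_order_acquire) != kNone;
    }
};
```

Probing the first entry instead would reproduce the original bug: it becomes non-`kNone` while later entries are still unset, so a reader could treat a half-written stage as complete.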

In zirStructDecl/zirUnionDecl/zirOpaqueDecl and the reify equivalents,
wip_ty.finish now runs before remaining failable operations, but the
errdefer cancel/destroyNamespace were still active and would tear down
an already-published type on a later error. Gate them on a 'published'
flag set immediately after finish.
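
The `published` gate can be modelled in C++ with a scope-exit guard standing in for Zig's `errdefer`. `ScopeExit`, `WipType`, `FailAt`, and `declareType` are hypothetical names for illustration; the real code operates on `wip_ty` inside Sema.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Runs a callback when the enclosing scope exits.
class ScopeExit {
public:
    explicit ScopeExit(std::function<void()> fn) : fn_(std::move(fn)) {}
    ~ScopeExit() { fn_(); }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

private:
    std::function<void()> fn_;
};

struct WipType {
    bool finished = false;
    bool cancelled = false;
    void finish() { assert(!cancelled); finished = true; }
    void cancel() { cancelled = true; }
};

enum class FailAt { Never, BeforeFinish, AfterFinish };

// Returns false on (simulated) failure. The cleanup only cancels the type
// when the function failed AND the type was not yet published.
inline bool declareType(WipType& wip, FailAt fail_at) {
    bool succeeded = false;
    bool published = false;
    ScopeExit cleanup([&] {
        if (!succeeded && !published) wip.cancel();
    });

    if (fail_at == FailAt::BeforeFinish) return false; // cleanup cancels
    wip.finish();
    published = true; // set immediately after finish
    if (fail_at == FailAt::AfterFinish) return false;  // cleanup is a no-op
    succeeded = true;
    return true;
}
```

Without the `published` check, the late-failure path would cancel a type that other threads may already have observed.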
alii (Member, Author) commented Apr 8, 2026

@coderabbitai off

@oven-sh deleted 21 comments from coderabbitai bot on Apr 8, 2026
alii added 28 commits April 8, 2026 08:16
…s_mutex, inline_ref_mutex, claim-before-lock)
… IP writes); gate removeDependenciesForDepender
…ld_new rejects r_extern relocs to local-range symbols)
…s (Apple ld_new rejects r_extern relocs to local-range syms)
@dylan-conway dylan-conway merged commit 6093f93 into upgrade-0.15.2 Apr 9, 2026
0 of 12 checks passed
coderabbitai bot commented Apr 9, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This PR introduces LLVM backend partitioning for parallel codegen through a new LlvmPartitionSet container, adds comprehensive parallel semantic analysis infrastructure with synchronization primitives and work queue management, extends build/linker support for multi-partition LLVM object handling, improves type resolution under concurrent execution with WIP types and namespace completion, and adds performance diagnostics.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Build System & Configuration | build.zig, src/main.zig | Added llvm-has-polly build option for Polly library linking, --llvm-shard-stats CLI flag for per-shard codegen diagnostics, increased Zig compiler memory limit, and instrumented startup with phase-timing markers. |
| Compilation & Work Management | src/Compilation.zig | Added llvm_shard_stats option, mutex-protected work queues, switched to LlvmPartitionSet for LLVM emission, parallelized .analyze_func jobs under ZIG_PARALLEL_SEMA, and introduced per-shard statistics dumping via dumpLlvmShardStats. |
| Intern Pool & Type Metadata | src/InternPool.zig | Introduced seqlock-style writing flag for Nav.Repr.Bits, added tid/index repacking helpers, extended concurrency helpers for field/type mutations, added function-analysis queue state (is_queued), introduced namespace publish/spin mechanism with sentinels, and refactored getOrPutKey to prevent ABBA deadlocks via lockShardsSorted. |
| Type System & Resolution | src/Type.zig, src/Air/types_resolved.zig | Added eagerResolved predicate, hardened parallel sema with childComptimeOnly and .parallel_sema handling, added namespace completion waits (awaitNamespaceTypeFinished), improved resolution gating with pre-checks, and introduced resolveTypesFully to force resolution during parallel analysis. |
| Semantic Analysis Core | src/Sema.zig, src/Zcu.zig, src/Zcu/PerThread.zig | Added comprehensive parallel sema synchronization: recursive sema_lock, per-unit claim/wait mechanism (claimOrWait/releaseClaim), deferred reference buffering, additional mutexes (compile log, cimport errors), updated analysis failure tracking, improved namespace scanning with atomic generation marking, and replaced analysis_in_progress with aipPut/aipRemove helpers. |
| LLVM Backend Partitioning | src/codegen/llvm.zig, src/zig_llvm.cpp | Introduced PartitionSet container for sharded LLVM emission with per-shard locking and parallel emit via worker threads, extended Object with partition_id and partition_set, added sharded naming helpers (shardedNavName), refactored debug type lowering for unresolved types using eagerResolved checks, and extracted optimization pipeline into runOptimizationPipeline helper for per-partition execution. |
| Linker Object Handling | src/link.zig, src/link/Lld.zig, src/link/MachO.zig, src/link/MachO/relocatable.zig | Added resolveZcuObjectPaths to compute partitioned object file paths based on llvm_codegen_threads, updated Lld to initialize zcu_object_partition_count and iterate over object paths, added appendZcuObjectInputs helper to MachO, and updated linker APIs to accept arena allocator for path construction. |
| MachO Symbol & Export Handling | src/link/MachO/Object.zig, src/link/MachO/Symbol.zig, src/link/MachO/file.zig | Modified symbol visibility handling to preserve .hidden visibility in tentative definition conversion (emitting N_PEXT instead of unconditional .global), updated setOutputSym to set N_PEXT for hidden symbols, and changed export marking to include non-.local symbols (not just .global). |
| Target & Repository | src/target.zig, src/Air.zig, .gitignore | Changed .stage2_llvm backend to support .separate_thread feature (returns true instead of false), re-exported resolveTypesFully from Air, and added .claude/ and bun-cache/ to ignore list. |
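
The `lockShardsSorted` entry in the InternPool row refers to a standard deadlock-avoidance technique: when several shard locks are needed at once, always acquire them in one global order. A minimal sketch of that idea follows; `SortedShardLock` is a hypothetical name, and ascending shard index is assumed as the order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Locks a set of shard mutexes in ascending index order and unlocks them in
// reverse order on destruction. Two threads that need overlapping sets can
// then never hold them in opposite orders (the ABBA pattern).
class SortedShardLock {
public:
    SortedShardLock(std::vector<std::mutex>& shards, std::vector<std::size_t> ids) {
        std::sort(ids.begin(), ids.end());
        // Drop duplicates so no mutex is locked twice by the same thread.
        ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
        held_.reserve(ids.size());
        for (std::size_t id : ids) {
            assert(id < shards.size());
            shards[id].lock();
            held_.push_back(&shards[id]);
        }
    }

    ~SortedShardLock() {
        for (auto it = held_.rbegin(); it != held_.rend(); ++it) (*it)->unlock();
    }

    SortedShardLock(const SortedShardLock&) = delete;
    SortedShardLock& operator=(const SortedShardLock&) = delete;

    std::size_t count() const { return held_.size(); }

private:
    std::vector<std::mutex*> held_;
};
```

A caller that needs shards 3 and 1 and another that needs 1 and 3 both end up locking 1 first, so neither can block while holding a lock the other is waiting for.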
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The pull request title 'shard sema and codegen' directly and concisely describes the main objective of the changeset, which implements sharded semantic analysis and code generation infrastructure. |
| Description check | ✅ Passed | While the PR description consists only of an image reference (a Sonic 'Gotta go FAST' illustration), it is thematically related to the performance optimization goals of the changeset and conveys the intent of the work. |

