v2: finish cleanc flat AST self-host migration by medvednikov · Pull Request #27390 · vlang/v

medvednikov · 2026-06-08T08:17:58Z

Summary

- remove cleanc weak generic specialization emission and weak generic bookkeeping
- rely on normal nested generic discovery for strong concrete specializations
- fix flat-AST cleanc cases for interface defaults, slice bytestr lowering, and fixed-array map keys

Tests

./v -g -keepc -o ./vnew cmd/v
./vnew -nocache -silent test vlib/v2/gen/cleanc/flag_enum_codegen_test.v
./vnew -nocache -silent test vlib/v2/gen/cleanc/result_option_codegen_test.v
./vnew -nocache -silent test vlib/v2/gen/cleanc/
./vnew -nocache -silent test vlib/v2/transformer/transformer_test.v
git diff --check
5-level cleanc self-host chain with -nocache --no-parallel

The flat-codegen migration drops b.files for the cleanc backend (the post-transform FlatAst is the source of truth), but the cleanc cached-module build path still enumerated modules/files from b.files. With b.files empty, has_module('builtin') returned false, so the entire cached-core split-compilation path was silently skipped: no per-module .vh headers were written and the slower non-cached fallback ran instead. This regressed module_storage_cache_test (passed at b6bd49b, failed after the flat commits).

Make the cleanc cache enumeration flat-aware. When b.files is empty and the post-transform FlatAst is present (uses_flat_module_enumeration), source module names, file paths and imports from the flat cursors:

- has_module, collect_modules_excluding, user_entry_module_names, expand_type_modules_with_imports (raw imports via read_file_imports), has_external_cache_module_name_collision
- import_modules_for_cached_modules (comptime-aware active_file_imports_from_flat_with_options)
- collect_virtual_main_modules (+ flat_file_declares_executable_main)
- source_files_for_module_name: required for external (non-vlib) modules whose location the vlib disk-scan fallback cannot resolve; this is what left ext.vh unwritten.

Each helper keeps its existing b.files loop for the legacy (.v/.eval) path and normalizes module names through flat_file_module_name (dots to underscores, 'main' default) to stay bit-identical to ast_file_module_name.

Cache bundles additionally need correct emission: the flat cleanc gen does not yet scope _option_/_result_ wrapper-typedef emission to a type-module subset, so it emits a type-module struct/prototype without the wrapper typedef it references (undefined-type errors in builtin.c). For restricted bundle generation in flat mode, rehydrate just the bundle's type-module files from flat and drive the proven legacy gen, which filters by physical file set. Bounded to cached-module files; the .o is cached so this only runs on a cache miss. The unrestricted main translation unit still uses the flat gen. Fully flat-native restricted bundle codegen remains the follow-up.

Verified: module_storage_cache_test passes; cleanc self-build of cmd/v2/v2.v produces a working compiler; cache-hit (.vh-parse) rebuild works; parsed-AST stats report legacy=0; builder/transformer/cleanc/markused/ssa/ast suites green; the three pre-existing cleanc codegen failures are unchanged.
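The enumeration rule described above can be illustrated with a minimal Python sketch. It is not the V implementation: `module_names` and the list-of-strings inputs are hypothetical stand-ins, and only the normalization rule (dots to underscores, 'main' default) and the "legacy list empty and flat present" condition come from the commit message.

```python
def flat_file_module_name(declared):
    """Normalize a declared module name: dots become underscores, empty means 'main'."""
    if not declared:
        return "main"
    return declared.replace(".", "_")


def module_names(legacy_files, flat_files):
    """Pick the enumeration source, then return the normalized module-name set.

    legacy_files: declared module names from the legacy file list (may be empty).
    flat_files:   declared module names read from the flat AST, or None if absent.
    Both parameters are illustrative assumptions, not the real builder fields.
    """
    # Flat enumeration applies only when the legacy list is empty and a flat AST exists.
    uses_flat = not legacy_files and flat_files is not None
    source = flat_files if uses_flat else legacy_files
    return {flat_file_module_name(name) for name in source}


def has_module(legacy_files, flat_files, name):
    """True if the normalized module set contains `name`."""
    return name in module_names(legacy_files, flat_files)
```

With an empty legacy list and no flat AST, `has_module` reports every module as missing, which mirrors the failure mode the commit describes for has_module('builtin').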

Compiling cmd/v2/v2.v via cleanc (flat codegen) spent ~28s in Transform, ~6.5x slower than the legacy path. Two causes, both fixed:

1) lookup_imported_var_type rebuilt its work on every call: it took the current module scope, allocated objects.keys(), SORTED them, then scanned the whole symbol table for Module (import) objects and resolved each import's scope, all to find one var across the module's imports. Called per-identifier during type propagation, this was one of the hottest functions (post_pass alone was 18s). The import-scope set is stable for a transform run, so precompute it once in cache_env_maps (build_cached_imported_module_scopes -> cached_imported_module_scopes) and just iterate the short import list. The dropped per-call sort does not change the result (the "found in exactly one import" / ambiguity semantics are order-independent). post_pass: 18s -> 2.9s, and the per-identifier lookup is now O(imports) instead of O(objects log objects). This speeds up both the flat and legacy propagation paths.

2) The parallel build branch ran the SEQUENTIAL transform_flat_to_flat_direct for flat-codegen backends, so the per-file transform never used worker threads (the legacy driver fans it out). The flat AST has no thread-safe merge primitive for workers to append into one builder, so route the parallel branch through transform_files_parallel_to_flat_via_driver and drop b.files immediately for flat-codegen backends; codegen stays flat-only, and the legacy files are live only transiently during the parallel transform. The memory-critical arm64 self-host runs --no-parallel, so it takes the sequential branch and keeps the allocation-minimal flat-direct path untouched.

Result: Transform on cmd/v2/v2.v 28.1s -> ~4.2s, on par with the legacy parallel path (~4.1s).

Verified: cleanc self-build of cmd/v2/v2.v produces a working compiler; module_storage_cache_test, all 19 transformer tests (incl. the propagation parity guard), cleanc_test, and the builder suite pass; parsed-AST stats stay legacy=0; arm64 sequential and parallel smokes work; the three pre-existing cleanc codegen failures are unchanged.
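The first fix (precompute the import scopes once, then scan only that short list per lookup) can be sketched in Python. The dictionary shapes and the choice to return None on ambiguity are assumptions made for illustration; only the "found in exactly one import" rule and its order-independence are taken from the commit message.

```python
def build_cached_import_scopes(module_objects, scopes_by_module):
    """One-time pass: keep only import (module) objects and resolve their scopes.

    module_objects:   name -> kind for objects in the current module scope.
    scopes_by_module: module name -> {variable name: type name}.
    """
    return [
        scopes_by_module[name]
        for name, kind in module_objects.items()
        if kind == "module" and name in scopes_by_module
    ]


def lookup_imported_var_type(cached_scopes, var_name):
    """Return the type if `var_name` is found in exactly one import scope, else None."""
    hits = [scope[var_name] for scope in cached_scopes if var_name in scope]
    # Zero hits or several hits both yield None, so scope order cannot matter.
    return hits[0] if len(hits) == 1 else None
```

Because the result depends only on how many scopes contain the name, reversing the cached list gives the same answer, which is why dropping the per-call sort is safe in this model.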

Restricted cleanc cache-bundle generation was the last codegen path that still materialized legacy ast.File: it rehydrated the bundle's type-module files via b.flat.to_files_range(i, i+1) and drove the LEGACY cleanc gen, because the flat gen, given the whole b.flat, emitted foreign-module structs/prototypes (e.g. term__ColorConfig, os__open_file) whose _option_/_result_/Array_ wrapper typedefs are registered only while emitting that module's bodies — which a bundle does not do — producing undefined-type C errors. Fix it the same way the legacy gen did: by scoping the INPUT to the bundle's type modules. The legacy gen filtered its gen_files; the flat gen drives every one of its ~65 emission passes off flat.files, so hand it a FlatAst whose file list is restricted to the bundle's type_module_names (flat_scoped_to_modules). The node/edge/string arena is shared, not copied (V shares the array buffer on struct-literal field assignment), so this is a cheap file-list filter, not an AST rehydrate. Every pass then sees exactly the files the legacy gen saw. This removes the last to_files_range / legacy-ast.File materialization on the cleanc codegen path AND moves bundle generation onto the flat gen (previously legacy). Memory: warm cache (bundles reused) is unchanged (5.7GB); a cold self-build is +1.6GB (12.5 vs 10.9GB) because bundles now run through the flat gen — acceptable on the cleanc C-backend path (the memory-critical arm64 self-host is --no-parallel/flat-direct and never touches this), and only on a cache miss. Verified: cleanc self-build of cmd/v2/v2.v produces a working compiler with zero undefined-type errors in builtin/vlib/v2compiler/imports bundles; module_storage_cache_test and cleanc_test pass; the cleanc codegen suite is 6 passed / 3 failed (the 3 are pre-existing, unchanged); arm64 smoke unaffected.
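A rough Python model of the "filter the file list, share the arenas" idea follows. The `FlatAst` dataclass and its fields are hypothetical simplifications; only the name flat_scoped_to_modules and the shared-arena behaviour come from the description above.

```python
from dataclasses import dataclass


@dataclass
class FlatAst:
    nodes: list    # shared node arena
    strings: list  # shared string arena
    files: list    # (module_name, root_node_id) pairs


def flat_scoped_to_modules(flat, type_module_names):
    """Return a view restricted to the given modules without copying the arenas."""
    wanted = set(type_module_names)
    return FlatAst(
        nodes=flat.nodes,      # same list object, not a copy
        strings=flat.strings,  # same list object, not a copy
        files=[f for f in flat.files if f[0] in wanted],
    )
```

Any pass that iterates `files` on the scoped view sees only the bundle's modules, while node ids stay valid because the arenas are the same objects.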

…lat backends)

For flat-codegen backends (cleanc/c/x64/arm64), the parallel transform (transform_files_parallel_to_flat_via_driver) ran the full legacy file-mutating post-pass on the transformed []ast.File and threaded the result back to the caller, which then immediately dropped it. That work is redundant: post_pass_to_flat already applies the same edits to the flat, and the type-propagation tail can run against the flat (apply_post_pass_tail_from_flat) instead of the legacy files. This is exactly what the sequential transform_flat_to_flat_direct already does.

Add a `keep_files` parameter:

- flat backends pass keep_files=false -> skip post_pass_files_with_generated_parts, run apply_post_pass_tail_from_flat, return [] (the transient `result` is freed in the function instead of being returned and dropped).
- .v/eval still consume the files, so they pass keep_files=true (the previous behavior, unchanged).

Removes the last []ast.File consumer in the post-pass tail on the flat codegen path (advances toward deleting legacy ast) and skips a redundant whole-program legacy post-pass + propagate_types on every default (parallel) cleanc/c/x64 build.

Verified: cleanc self-build of cmd/v2/v2.v -> working compiler; eval backend (keep_files=true path) prints correct output; module_storage_cache_test, cleanc_test, all 19 transformer tests (incl. propagate_types_from_flat parity + transformer_flat_diff), cleanc_target_e2e, native, target_os, type_check_parallel, flat_streaming pass; -stats legacy=0; arm64 smoke ok.

…kers

The parallel transform always streams per-file from the post-parse FlatAst: transform_files_parallel_no_post_pass_impl was only ever called with stream_from_flat=true (both wrappers passed true), so the entire `stream_from_flat == false` machinery (which transformed a legacy []ast.File input) was unreachable. Remove it:

- the `stream==false` branches in the fan-out (Windows + non-Windows, single-thread + worker spawn), folding the impl into a flat-only transform_files_parallel_no_post_pass (drop the bool param);
- the top-level-stmt splitting subsystem (transform_files_parallel_top_level_stmts, TransformStmtJob/Result/ChunkArgs, transform_stmt_chunk_thread, file/stmt_can_split_top_level, file_transform_cost, lpt_stmt_buckets);
- the LPT cost-walkers that recursed over legacy ast.Stmt/Expr/Type (transform_stmt_cost, transform_stmts_cost, transform_expr_cost, transform_type_cost, top_level_transform_stmt_cost, lpt_buckets);
- the uncalled wrappers transform_files_parallel, transform_files_parallel_from_flat, transform_files_parallel_no_post_pass_from_flat;
- the non-flat branch of transform_chunk_thread and the now-unused TransformChunkArgs.files field.

Net -653 lines, and removes a large cluster of legacy-ast.Stmt/Expr-walking code from the hot transform path (advances toward deleting legacy ast).

Verified: cleanc self-build of cmd/v2/v2.v -> working compiler; cleanc_target_e2e_test (compiles AND runs real programs end-to-end through the parallel transform), all 19 transformer tests, module_storage_cache_test, type_check_parallel_test, cleanc_test pass; -stats legacy=0; arm64 + eval (keep_files=true) smokes ok. A pre-existing cleanc bug with struct methods in small standalone programs reproduces identically on HEAD (not a regression).

First step of making the transform read from cursors instead of decoding each whole top-level statement subtree to legacy ast at the transform_cursor_stmts_to_flat_direct loop (flat_write.v:1835) — the last big legacy-AST consumer and the blocker for deleting ast.Stmt/Expr. Add transform_stmt_list_item_cursor_to_flat(c ast.Cursor, ...): it dispatches on c.kind() and falls back, in one line, to the proven legacy list-item path for unconverted kinds (reusing the full guard chain + pending-stmts drain). The loop seam now passes the cursor (stmts.at(i)) instead of stmts.at(i).stmt(). First converted set is the nine TRUE-passthrough top-level kinds that carry no try_expand_* guard and that transform_stmt_to_flat emits verbatim (flat_write.v:3890): stmt_import, stmt_module, stmt_directive, stmt_empty, stmt_enum_decl, stmt_interface_decl, stmt_type_decl, stmt_asm, stmt_flow_control. They route through append_transformed_stmt_to_flat exactly as the fallback would after its guards fail — bit-equal, just skipping the always-false-for-these-kinds guard checks. This is the dispatcher foundation, not yet a decode win: the converted kinds still decode via c.stmt() (dropping that needs flat-to-flat subtree copy). The point is the seam that later stages — const/global, then the FnDecl body (the real per-function whole-body decode that dominates) — extend arm by arm, each gated by the existing parity harness. Verified: transformer_flat_diff_test parity (cursor output bit-equal to the decode path) + all 19 transformer tests; cleanc self-build -> working compiler; cleanc_target_e2e_test (compiles+runs real programs); -stats legacy=0; arm64 + cleanc smokes exercising import/enum/type-decl top-level kinds both correct.
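The dispatcher-with-fallback shape described above can be sketched in Python. The nine kind names are the ones listed in the commit message; the function name `transform_list_item`, the tuple output, and the `legacy_path` callback are illustrative assumptions rather than the V signatures.

```python
# The nine passthrough kinds named in the commit message.
PASSTHROUGH_KINDS = {
    "stmt_import", "stmt_module", "stmt_directive", "stmt_empty",
    "stmt_enum_decl", "stmt_interface_decl", "stmt_type_decl",
    "stmt_asm", "stmt_flow_control",
}


def transform_list_item(kind, payload, out, legacy_path):
    """Dispatch on the statement kind; unconverted kinds take the legacy path.

    Returns which route handled the statement so callers (or tests) can check it.
    """
    if kind in PASSTHROUGH_KINDS:
        # Converted arm: emit verbatim, skipping guards that never fire for these kinds.
        out.append((kind, payload))
        return "cursor"
    # One-line fallback for every kind that has not been converted yet.
    legacy_path(kind, payload, out)
    return "fallback"
```

New arms can then be added one kind at a time, with the fallback covering everything else, which is the incremental structure the commit calls the seam.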

…ge 2)

Second step of the cursor-native transform (slice 4). Convert the stmt_const_decl and stmt_global_decl arms of transform_stmt_list_item_cursor_to_flat to read straight from the cursor instead of decoding the whole statement to legacy ast first.

- transform_const_decl_cursor_to_flat / transform_global_decl_cursor_to_flat mirror the ast.ConstDecl / ast.GlobalDecl arms of transform_stmt_to_flat exactly (same emit order, same field-init/field-decl encoding) but read is_public, the field list, and per-field name/flags/typ/value/attrs from the cursor. The ConstDecl/GlobalDecl wrapper + FieldInit/FieldDecl structure are no longer rehydrated as legacy structs; only the field value/typ exprs still decode (via the existing transform_expr_to_flat), which the later expr-arm stages eliminate.
- Factor append_transformed_stmt_id_to_flat out of append_transformed_stmt_to_flat so the cursor arms drain t.pending_stmts ahead of the emitted stmt identically (bit-equal ordering).

Verified: transformer_flat_diff_test parity (cursor output bit-equal to the decode path) + all 19 transformer tests; cleanc self-build of cmd/v2/v2.v (hundreds of top-level const decls) -> working compiler; a const+__global smoke compiles+runs correctly on cleanc, arm64, and the self-built compiler (hi 49); cleanc_target_e2e_test and module_storage_cache_test pass; -stats legacy=0.

Behavior-preserving refactor that splits the 264-line transform_fn_decl_parts into three reusable pieces, preparing the cursor-native FnDecl body streaming (slice 4, stage 4):

- enter_fn_body_transform(decl) ?FnBodyTransformCtx: the early-exit checks (uninstantiated-generic skip, comptime-attr elision, @[live] detection) and the full prologue (fn scope from cached_fn_scopes or fallback seeding, param/receiver seeding, return-type-name resolution, per-fn state set, smartcast reset). Returns none on early-exit, else a FnBodyTransformCtx holding the ~17 saved-state locals + live_fn_detected + scope keys + has_return_type/fn_return_type.
- restore_fn_body_transform_state(mut ctx): restores the per-fn state saved around the body transform.
- finish_fn_body_transform(decl, mut ctx): cached_fn_scopes writeback, scope restore, and the @[live] noinline attr; returns the final attribute list.

transform_fn_decl_parts now reads: enter -> transform_stmts -> restore_state -> lower_defer_stmts -> finish, identical to before. The split lets a future streaming variant run enter -> stream body via the cursor body driver -> restore_state -> finish (skipping defer, which is a no-op for no-defer fns).

Verified behavior-identical: transformer_flat_diff_test parity + all 19 transformer tests; cleanc self-build of cmd/v2/v2.v (every function routes through the helpers) -> working compiler that compiles+runs a defer'd fn correctly; -stats legacy=0.

…te fix

Stage 4 of the cursor-native transform, the real decode-reduction step: top-level functions whose body has no defers now transform their body directly from the cursor instead of decoding the whole FnDecl (header + entire body + every nested expr) to legacy ast at once.

flat_write.v:

- transform_fn_decl_streaming_to_flat: read the signature via c.fn_decl_signature() (body-less, no whole-decl decode), run the shared enter_fn_body_transform prologue, stream the body from c.list_at(3) via transform_cursor_stmts_to_flat_direct (one statement decoded at a time), then restore_fn_body_transform_state + finish_fn_body_transform and emit. Peak transform memory per function drops from whole-body to one statement.
- flat_subtree_has_defer / flat_body_has_defer: cheap recursive cursor scan (no decode); functions with a defer anywhere fall back to the whole-decl decode path, because lower_defer_stmts needs the complete body. Over-detection (e.g. a defer in a nested closure) only costs a fallback.
- emit_fn_decl_flat: shared FnDecl encoder used by both the legacy arm and the streaming path, so they emit identically.
- wired stmt_fn_decl into transform_stmt_list_item_cursor_to_flat.

cursor.v: fix a latent bug the streaming exposed: Cursor.attribute() decoded comptime_cond (`@[if cond ?]`) with the limited attribute_expr() (ident/string subset only), so any non-trivial condition silently became empty_expr. Decode it with the full expr() (mirrors FlatReader.read_attribute). Without this, streamed @[if X ?] functions (track_heap/trace_error/debug_strconv) were never elided.

Verified BIT-EQUAL to the decode path: diffing the --no-parallel cleanc C of the whole compiler with streaming on vs off shows zero divergence in any compiled function (only the new streaming helpers themselves differ). transformer_flat_diff_test parity + all 19 transformer tests + ast cursor test; cleanc self-build of cmd/v2/v2.v (every fn streams or falls back) -> working compiler that compiles+runs recursion/loops/maps/defer correctly; module_storage_cache_test (attributes), cleanc_test, cleanc_target_e2e_test pass; -stats legacy=0. (The arm64 native self-host is pre-existing-broken on this branch, with unresolved libc/captured-fn-literal symbols on the baseline too, so the bit-equal C diff is the stronger gate.)
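The defer scan that gates streaming can be modelled in a few lines of Python. The node table (a list of `(kind, child_ids)` pairs), the kind name "stmt_defer", and `choose_fn_path` are assumptions for illustration; the two scan function names and the conservative over-detection behaviour follow the commit message.

```python
def flat_subtree_has_defer(nodes, node_id):
    """Recursively scan a subtree of the flat node table for a defer statement."""
    kind, children = nodes[node_id]
    if kind == "stmt_defer":  # hypothetical kind name
        return True
    return any(flat_subtree_has_defer(nodes, child) for child in children)


def flat_body_has_defer(nodes, body_ids):
    """True if any top-level body statement contains a defer anywhere below it."""
    return any(flat_subtree_has_defer(nodes, node_id) for node_id in body_ids)


def choose_fn_path(nodes, body_ids):
    """Stream defer-free bodies; anything with a defer takes the whole-decl path."""
    if flat_body_has_defer(nodes, body_ids):
        return "whole_decl_fallback"
    return "streaming"
```

The scan is deliberately conservative: a defer nested anywhere (even where it would not matter) only routes the function to the slower path and never changes the output.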

Stage 5 of the flat-AST transform migration: stream the body of plain `for` loops (cond / classic / bare) cursor-native instead of decoding the whole loop to legacy AST. Loops nested inside stream recursively, so a function's entire control-flow body materialises one statement at a time.

The dispatcher gains a `.stmt_for` arm. for-in loops (init is a ForInStmt) keep the whole-decl decode fallback: their range/array/string/map/untyped lowering is the monolithic `transform_for_stmt` for-in branch plus the `try_expand_for_in_map` guard. The init-kind check mirrors `transform_for_stmt`'s own `stmt.init is ast.ForInStmt` branch exactly, so nothing for-in changes.

`transform_for_stmt_streaming_to_flat` mirrors `transform_for_stmt`'s non-for-in tail (open scope, push cond `is`-check smartcasts, transform body, pop, transform init/cond/post, close scope). The body is transformed first to keep the legacy transform order (identical synth positions); the for-node edges are still [init, cond, post, body...], so output is structurally identical to the decode path.

ast: CursorList gains an `offset` field (default 0, fully backward compatible) so a node's trailing edges can be viewed as a list; `Cursor.for_body_list()` views a stmt_for node's body (edges 3+).

Verified: bit-identical generated C across all 214,948 lines of `cmd/v2/v2.v` compiled by a streaming vs a forced-fallback compiler (complete cleanc correctness proof); parity test, ast 10/10, transformer 19/19, cleanc_target_e2e, module_storage_cache, legacy=0; nested-loop and while-style programs run correctly.
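The `offset` view over a node's trailing edges can be pictured with a small Python class. This is a simplified model, not the V `CursorList`: the edge list holds plain strings here, and only the default-0 offset and the [init, cond, post, body...] layout are taken from the description above.

```python
class CursorList:
    """A read-only window over an edge list, starting at `offset`."""

    def __init__(self, edges, offset=0):
        self.edges = edges
        self.offset = offset  # default 0 keeps the whole list visible

    def __len__(self):
        return max(0, len(self.edges) - self.offset)

    def at(self, i):
        if i < 0 or i >= len(self):
            raise IndexError(i)
        return self.edges[self.offset + i]


def for_body_list(for_edges):
    """View the body of a for node whose edges are [init, cond, post, body...]."""
    return CursorList(for_edges, offset=3)
```

No edges are copied: the body view and the full view index into the same underlying list, so streaming the body statement by statement needs no extra allocation in this model.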

…lice 6) Slice 6 groundwork for the flat-native parallel transform: a primitive that concatenates a whole src FlatAst (nodes, edges, strings, file roots) into a builder, relocating every node id / edge target and re-interning every string so the merged result decodes identically to src standalone. Returns the node-id offset applied to src's nodes. This will replace the per-worker legacy `ast.File` rehydrate-then-flatten: each worker emits into its own FlatBuilder, the main thread concatenates the outputs with append_flat. The hard part is FlatNode.extra (variant-specific). Audited against the canonical decoder (flat_reader.v) + every emit() call site: extra holds an interned string id for EXACTLY three kinds — .file (mod), .stmt_directive (value), .stmt_import (alias) — and those are re-interned on merge. Every other kind packs ints/counts/flags/list-boundaries in extra (assign lhs_len, map_init keys_len, fn_literal captured.len, string_inter width/precision, expr_lock lock/rlock, aux_int value, stmt_empty enum) and is copied verbatim. aux is always a token/sub-kind enum; pos.id is globally unique across merged inputs (and selector_names is keyed by pos.id), so both copy verbatim. Not yet wired into any path (behavior unchanged) — it's the standalone, unit-tested foundation. Test covers: node/edge relocation + offset accounting, name_id string remap across a deliberately-shifted destination intern table, all three string-extra kinds re-interned (verified by decoding the merged import/directive and reading the merged .file node's extra), two independent sources staying structurally intact, and the empty-source no-op. Verified: ast 11/11, transformer parity, v2 self-build + runs.
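The merge described above (relocate every node id and edge target, re-intern every string, and treat `extra` as a string id only for the three audited kinds) can be sketched in Python. The `FlatBuilder` below is a toy model with dictionary nodes; its fields and `emit` signature are assumptions, while the three string-extra kinds and the returned node-id offset follow the commit message.

```python
# Kinds whose `extra` holds an interned string id, per the audit in the commit message.
STRING_EXTRA_KINDS = {"file", "stmt_directive", "stmt_import"}


class FlatBuilder:
    def __init__(self):
        self.strings = []
        self.string_ids = {}
        self.nodes = []
        self.file_roots = []

    def intern(self, text):
        """Return the id of `text` in this builder's string table, adding it if new."""
        if text not in self.string_ids:
            self.string_ids[text] = len(self.strings)
            self.strings.append(text)
        return self.string_ids[text]

    def emit(self, kind, name, extra, edges):
        """Append a node and return its id."""
        self.nodes.append({"kind": kind, "name_id": self.intern(name),
                           "extra": extra, "edges": list(edges)})
        return len(self.nodes) - 1

    def append_flat(self, src):
        """Concatenate `src` into this builder; return the node-id offset applied."""
        offset = len(self.nodes)
        for node in src.nodes:
            extra = node["extra"]
            if node["kind"] in STRING_EXTRA_KINDS:
                # String ids are table-local, so look up the text and re-intern it.
                extra = self.intern(src.strings[extra])
            self.nodes.append({
                "kind": node["kind"],
                "name_id": self.intern(src.strings[node["name_id"]]),
                "extra": extra,  # all other kinds copy verbatim
                "edges": [target + offset for target in node["edges"]],
            })
        self.file_roots.extend(root + offset for root in src.file_roots)
        return offset
```

Merging into a destination whose string table is already populated is the interesting case: ids from the source table would point at the wrong strings unless each one is translated through its text.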

…ile rehydrate (slice 6) Wires the append_flat merge primitive into the parallel transform. Flat-codegen backends (cleanc/c/x64/arm64, i.e. !keep_files) now take a flat-native parallel path: each worker transforms its contiguous file range cursor-native straight into its OWN FlatBuilder (transform_file_index_with_extra_to_flat), then the main thread concatenates the per-worker flats in file order via FlatBuilder.append_flat and runs the same post_pass tail. No legacy ast.File is materialised on the default build path — this closes the gap the old comment named ("the flat AST has no thread-safe merge primitive to let workers append to one builder"). The `.v`/eval backends (keep_files) keep the legacy rehydrate-then-flatten path, which still needs the transformed []ast.File. Reuses the proven worker machinery unchanged: new_worker_clone (per-worker synth_pos_counter offset -worker_idx*100_000 keeps synth ids disjoint), merge_worker (output-format-independent), contiguous file ranges merged in spawn order to preserve file order. The per-file transform is the same cursor-native code the sequential --no-parallel path already uses (transform_flat_to_flat_direct). Verified: a forced-old-path vs new-path C diff of the whole cmd/v2/v2.v differs only in synthetic temp-variable names (_tuple_tmp_<id> / _st<id> — the cursor transform advances the synth counter in a different order); after normalizing those, new-parallel matches the PROVEN sequential flat-direct's symbol content (and even fixes a pre-existing old-parallel discrepancy where it dropped a few Array_f64/builtin symbols the sequential path keeps). Output is deterministic (run1==run2). Functional: new-path v3 self-compiles to a working v4, both compile and run programs correctly; cleanc_target_e2e + module_storage_cache + transformer (30) + ast (incl. append_flat) all pass.
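The fan-out shape (contiguous file ranges per worker, outputs concatenated in spawn order so file order is preserved) can be sketched in Python. The balancing rule in `contiguous_ranges` and the use of a thread pool are assumptions; the commit message only states that ranges are contiguous and merged in spawn order.

```python
from concurrent.futures import ThreadPoolExecutor


def contiguous_ranges(n_files, n_workers):
    """Split [0, n_files) into at most n_workers contiguous half-open ranges."""
    workers = min(n_workers, n_files)
    if workers <= 0:
        return []
    base, extra = divmod(n_files, workers)
    ranges, start = [], 0
    for w in range(workers):
        end = start + base + (1 if w < extra else 0)  # first `extra` ranges get one more
        ranges.append((start, end))
        start = end
    return ranges


def transform_parallel(files, n_workers, transform_file):
    """Transform each range in its own worker, then merge outputs in spawn order."""
    ranges = contiguous_ranges(len(files), n_workers)
    if not ranges:
        return []

    def run_worker(bounds):
        start, end = bounds
        # Each worker builds its own private output list (its "own builder").
        return [transform_file(f) for f in files[start:end]]

    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        per_worker = list(pool.map(run_worker, ranges))  # map preserves input order

    merged = []
    for output in per_worker:
        merged.extend(output)
    return merged
```

Because each worker writes only to its own list and the merge walks workers in range order, the result matches a sequential pass over the files regardless of which worker finishes first.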

collect_decl_type_aliases reads only signature-level type references (receiver / param / return types for FnDecl, field types for structs), yet the call site decoded the whole stmt, rehydrating every fn body just to read its signature. Decode the body-less signature (fn_decl_signature) for the fn_decl case instead.

This was the only fn-body decode left on the DEFAULT cleanc build path: pass5_file_cost only runs under -stats, live-reload scan only under -live, and consts/globals have no bodies. So the default parse->transform->markused->ssa->cleanc pipeline now decodes no fn body in cleanc gen.

Verified: bit-identical generated C across all of cmd/v2/v2.v (the same type aliases are collected, the body is simply never decoded); cleanc_target_e2e and module_storage_cache (cache-bundle type-alias path) pass.

Comptime-guarded (`$if dbg_sel ?`) eprintln tracing for diagnosing `missing X.Y` selector errors / `write_string` method resolution: traces selector_expr's missing-symbol path, find_field_or_method, the lookup_method_direct intrinsic vs loop branches, lookup_method_for_type_name (env.methods presence/count), and error_with_pos. Inert in normal builds; enable with `-d dbg_sel`.


medvednikov added 22 commits June 8, 2026 04:31

v2: preregister flat fn signatures by cursor

c18630f

v2: keep flat ast on production paths

1c628de

v2: continue flat ast migration

aef3c18

v2: continue flat ast migration

c5ff295

more flat ast

d5228d9

cleanc fix no generics in v self

eb8e818

in t

02b141f

clean up nice

c929735

medvednikov merged commit d358576 into master Jun 8, 2026
65 of 83 checks passed

This was referenced Jun 8, 2026

v2: fix CleanC self-host C output GGRei/v#12

Closed

v2: cleanc selfhost c output #27376

Open

medvednikov merged 22 commits into master from codex/v2-flat-ast-next
