enable ast type-annotator pass (phase-e step 5, pr1) #573

Merged
cs01 merged 1 commit into main from refactor/typed-ast
Apr 20, 2026
@cs01
Owner

@cs01 cs01 commented Apr 20, 2026

Summary

Enables the pre-codegen AST type-annotator pass introduced (but disabled) in the phase-e scaffolding. The annotator walks every expression in the AST once, resolves its type via the memoized TypeInference.resolveExpressionTypeRich, and stores the result in an expression-keyed cache. Codegen consumers can now read canonical types via ctx.typeOf(expr) instead of re-deriving from stringly-typed LLVM value names.
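
A minimal sketch of the walk described above, under stated assumptions: the `Expression` and `ResolvedType` shapes, the `AnnotatorSink` interface, and `toyResolve` are hypothetical stand-ins, and only the method names `resolveExpressionTypeRich` and `appendExpressionType` come from this PR. The real signatures may differ.

```typescript
// Hypothetical three-node AST; the compiler's real node shapes are not shown in this PR.
type Expression =
  | { kind: "number"; value: number }
  | { kind: "string"; value: string }
  | { kind: "binary"; op: string; left: Expression; right: Expression };

// Stand-in for the real ResolvedType.
interface ResolvedType { base: string }

// Assumed sink shape: the two calls named in the PR description.
interface AnnotatorSink {
  resolveExpressionTypeRich(expr: Expression): ResolvedType;
  appendExpressionType(expr: Expression, type: ResolvedType): void;
}

// Post-order: annotate children first, then the node itself.
function annotate(expr: Expression, sink: AnnotatorSink): void {
  if (expr.kind === "binary") {
    annotate(expr.left, sink);
    annotate(expr.right, sink);
  }
  const resolved = sink.resolveExpressionTypeRich(expr);
  // Unknown bases are skipped so they stay on the on-demand path.
  if (resolved.base === "unknown") return;
  sink.appendExpressionType(expr, resolved);
}

// Toy resolver (not the real TypeInference): a binary node is typed
// only when both operands resolve to the same base.
function toyResolve(expr: Expression): ResolvedType {
  if (expr.kind === "number") return { base: "number" };
  if (expr.kind === "string") return { base: "string" };
  const left = toyResolve(expr.left);
  const right = toyResolve(expr.right);
  return left.base === right.base ? left : { base: "unknown" };
}

const recorded: Expression[] = [];
const toySink: AnnotatorSink = {
  resolveExpressionTypeRich: toyResolve,
  appendExpressionType: (expr, _type) => { recorded.push(expr); },
};

const inner: Expression = {
  kind: "binary", op: "+",
  left: { kind: "number", value: 2 },
  right: { kind: "number", value: 3 },
};
const tree: Expression = {
  kind: "binary", op: "*",
  left: { kind: "number", value: 1 },
  right: inner,
};

annotate(tree, toySink);
console.log(recorded.map((e) => e.kind).join(" "));
// prints "number number number binary binary"
```

Children are recorded before their parent, so each node is appended exactly once in a single pass.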

This PR is purely additive: it populates the cache. No codegen consumer reads from it yet. Consumer migrations land in follow-up PRs, one site at a time.

Why users benefit

This is foundation work that retires whole classes of silently-wrong bugs. Today, different codegen sites can disagree about an expression's type because each re-derives it from whatever happens to be in scope; the annotator makes type info authoritative and shared.

Key changes

  • src/semantic/type-annotator.ts — new file. Post-order walker over the full AST, calls sink.resolveExpressionTypeRich then sink.appendExpressionType. Skips expressions with unknown base (resolver gaps remain on-demand for now).
  • BaseGenerator/IGeneratorContext — replaced expressionTypes: Map<Expression, ResolvedType> with parallel arrays expressionTypeNodes / expressionTypeValues. Native self-hosted Map lacks pointer-identity hashing (segfaults — see native-map-object-key-unsupported.md); linear-scan is mandatory until that's fixed.
  • expressionType* arrays are NOT cleared in reset() — they're keyed by AST identity which outlives per-function state.
  • New appendExpressionType fast-path skips the dedup check. This is safe for the annotator because it visits each node exactly once.
  • typeOf() now reads the parallel-array cache first, falls back to on-demand resolution for unknown/missing-base expressions.
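
The storage changes in the bullets above could look roughly like the sketch below. It is an illustration, not the code in BaseGenerator: the class name `ExpressionTypeCache`, the `lookup` helper, the `resolveOnDemand` callback, and the `perFunctionNames` field are hypothetical, and the `Expression` / `ResolvedType` shapes are stand-ins. Only `expressionTypeNodes`, `expressionTypeValues`, `appendExpressionType`, `typeOf`, and `reset` are names taken from this PR.

```typescript
// Stand-ins for the real AST node and resolved-type shapes.
interface Expression { kind: string }
interface ResolvedType { base: string }

class ExpressionTypeCache {
  // Parallel arrays: expressionTypeValues[i] is the type of expressionTypeNodes[i].
  private expressionTypeNodes: Expression[] = [];
  private expressionTypeValues: ResolvedType[] = [];
  // Hypothetical example of per-function state that reset() does clear.
  private perFunctionNames: string[] = [];

  // Fast path: no dedup scan. The caller must guarantee each node is appended once.
  appendExpressionType(expr: Expression, type: ResolvedType): void {
    this.expressionTypeNodes.push(expr);
    this.expressionTypeValues.push(type);
  }

  // Linear scan comparing by reference identity, not by structure.
  lookup(expr: Expression): ResolvedType | undefined {
    for (let i = 0; i < this.expressionTypeNodes.length; i++) {
      if (this.expressionTypeNodes[i] === expr) return this.expressionTypeValues[i];
    }
    return undefined;
  }

  // Cache first, then fall back to on-demand resolution for missing nodes.
  typeOf(expr: Expression, resolveOnDemand: (e: Expression) => ResolvedType): ResolvedType {
    const cached = this.lookup(expr);
    return cached !== undefined ? cached : resolveOnDemand(expr);
  }

  // Clears per-function state only; the expression-type arrays are keyed by
  // AST identity and deliberately survive across functions.
  reset(): void {
    this.perFunctionNames = [];
  }
}

const cache = new ExpressionTypeCache();
const a: Expression = { kind: "literal" };
const b: Expression = { kind: "literal" }; // same shape as a, different identity

cache.appendExpressionType(a, { base: "number" });
cache.reset();

const typeOfA = cache.typeOf(a, () => ({ base: "unknown" })); // cache hit, survives reset()
const typeOfB = cache.typeOf(b, () => ({ base: "string" }));  // miss, falls back
```

The scan is O(n) per lookup, which is the cost accepted here in exchange for not depending on object-keyed hashing.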

Test plan

  • npm run verify (full, stage 2 included) green locally.
  • Stage 0 / 1 / 2 self-hosting all pass. Stage 2 is the byte-exact oracle — proves the annotator doesn't perturb codegen output.

Follow-ups (not in this PR)

  • Migrate one codegen consumer from getVariableType(valueName) to typeOf(expr) to prove the pattern.
  • Expand migrations incrementally; each prior CLAUDE.md rule about stringly-typed side-channels retires as a consumer moves over.
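
To make the first follow-up concrete, here is a hedged before/after sketch of what a single consumer migration might look like. Everything except the method names `getVariableType` and `typeOf` is hypothetical: the `Ctx` slice, both `pickAddOpcode*` helpers, the toy rule that a `number` base maps to `fadd`, and the value names are invented for illustration and are not taken from the codebase.

```typescript
// Stand-ins for the real AST node and resolved-type shapes.
interface Expression { kind: string }
interface ResolvedType { base: string }

// Hypothetical slice of the generator context.
interface Ctx {
  getVariableType(valueName: string): string | undefined;
  typeOf(expr: Expression): ResolvedType;
}

// Before: the type is re-derived from an LLVM value name. A name that is
// not in scope yields undefined and silently falls through to "add".
function pickAddOpcodeByName(ctx: Ctx, valueName: string): string {
  return ctx.getVariableType(valueName) === "double" ? "fadd" : "add";
}

// After: the type comes from the expression node itself.
function pickAddOpcodeByExpr(ctx: Ctx, expr: Expression): string {
  return ctx.typeOf(expr).base === "number" ? "fadd" : "add";
}

// Toy context: one known value name, and every expression typed as number.
const namedTypes = new Map<string, string>([["%1", "double"]]);
const toyCtx: Ctx = {
  getVariableType: (name) => namedTypes.get(name),
  typeOf: (_expr) => ({ base: "number" }),
};

const sum: Expression = { kind: "binary" };
const byKnownName = pickAddOpcodeByName(toyCtx, "%1");   // "fadd"
const byStaleName = pickAddOpcodeByName(toyCtx, "%99");  // "add", silently wrong in this toy
const byExpr = pickAddOpcodeByExpr(toyCtx, sum);         // "fadd"
```

The name-keyed path depends on the caller passing a name that is still in scope, while the node-keyed path does not.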

…) cache pre-codegen; parallel-array storage avoids native map<object,v> segv
@github-actions
Contributor

Benchmark Results (Linux x86-64)

Benchmark C ChadScript Go Node Place
Binary Trees 1.355s 1.256s 2.735s 1.174s 🥈
Cold Start 0.9ms 0.8ms 1.2ms 26.4ms 🥇
Fibonacci 0.814s 0.763s 1.565s 3.177s 🥇
File I/O 0.119s 0.093s 0.089s 0.206s 🥈
JSON Parse/Stringify 0.004s 0.005s 0.017s 0.016s 🥈
Matrix Multiply 0.440s 1.000s 0.613s 0.376s #4
Monte Carlo Pi 0.389s 0.410s 0.404s 2.248s 🥉
N-Body Simulation 1.666s 2.122s 2.202s 2.408s 🥈
Quicksort 0.214s 0.247s 0.212s 0.262s 🥉
SQLite 0.354s 0.369s 0.409s 🥈
Sieve of Eratosthenes 0.016s 0.028s 0.020s 0.039s 🥉
String Manipulation 0.008s 0.019s 0.016s 0.036s 🥉

CLI Tool Benchmarks

Benchmark ChadScript grep node xxd Place
Hex Dump 0.555s 0.999s 0.129s 🥈
Recursive Grep 0.020s 0.010s 0.103s 🥈

@cs01 cs01 merged commit f5c4496 into main Apr 20, 2026
13 checks passed
@cs01 cs01 deleted the refactor/typed-ast branch April 20, 2026 06:11