Skip to content
This repository was archived by the owner on Apr 29, 2026. It is now read-only.

⏺ Part A (compile-time) is done end-to-end:#428

Merged
navicore merged 3 commits intomainfrom
variant-type
Apr 25, 2026
Merged

⏺ Part A (compile-time) is done end-to-end:#428
navicore merged 3 commits intomainfrom
variant-type

Conversation

@navicore
Copy link
Copy Markdown
Owner

  • Negative case: "alpha beta gamma" 0 variant.field-at now produces a clear, line-numbered compile error: at line 4: variant.field-at: stack type mismatch. Expected (..a$0 Variant Int), got (..rest String Int).
  • Positive case: a real union Message { ... } program with Make-Get, is-Get?, Get-chan still compiles and runs correctly (exit 42).

Summary of changes:

  • crates/compiler/src/types.rs — added Type::Variant arm + Display
  • crates/compiler/src/unification.rs — Variant ~ Variant and the Union(_) <: Variant relaxation rule, mirroring the existing Closure/Quotation pattern
  • crates/compiler/src/typechecker/{freshen,validation}.rs — covered the new arm
  • crates/compiler/src/builtins/macros.rs — Variant in the ty! macro
  • crates/compiler/src/builtins/adt.rs — all variant.* and wrap-* signatures now use Variant instead of free V
  • crates/lsp/src/completion.rs, crates/repl/src/ir/stack_effects.rs — display the new arm
  • crates/compiler/src/typechecker/tests.rs — 6 regression tests (5 negative, 1 positive)

  - Negative case: "alpha beta gamma" 0 variant.field-at now produces a
  clear, line-numbered compile error: at line 4: variant.field-at: stack
  type mismatch. Expected (..a$0 Variant Int), got (..rest String Int).
  - Positive case: a real union Message { ... } program with Make-Get,
  is-Get?, Get-chan still compiles and runs correctly (exit 42).

  Summary of changes:
  - crates/compiler/src/types.rs — added Type::Variant arm + Display
  - crates/compiler/src/unification.rs — Variant ~ Variant and the
  Union(_) <: Variant relaxation rule, mirroring the existing
  Closure/Quotation pattern
  - crates/compiler/src/typechecker/{freshen,validation}.rs — covered the
  new arm
  - crates/compiler/src/builtins/macros.rs — Variant in the ty! macro
  - crates/compiler/src/builtins/adt.rs — all variant.* and wrap-*
  signatures now use Variant instead of free V
  - crates/lsp/src/completion.rs, crates/repl/src/ir/stack_effects.rs —
  display the new arm
  - crates/compiler/src/typechecker/tests.rs — 6 regression tests (5
  negative, 1 positive)
@claude
Copy link
Copy Markdown

claude Bot commented Apr 25, 2026

Code Review — PR #428: Part A (compile-time) variant type safety

Overview

This PR closes a real type-safety hole: previously all variant.* builtins accepted any type because they used a free type variable V, which happily unified with String, Int, etc. The fix introduces Type::Variant as a concrete nominal type that acts as a supertype of all Union(_) values. The approach is well-chosen (small blast radius, mirrors the existing Closure <: Quotation pattern) and the implementation is clean throughout.


Strengths

  • Minimal, targeted change. Only the files that needed updating got updated — freshen.rs, validation.rs, unification.rs, macros.rs, the two builtin signature files, and the two display sites in lsp and repl. The codegen wildcard _ arm already handles Type::Variant correctly via the generic value path, so no codegen changes needed.
  • Consistent pattern. The Union(_) <: Variant relaxation rule in unification.rs mirrors Closure <: Quotation line-for-line in structure. Future maintainers will immediately understand the intent.
  • Design document. VARIANT_OP_TYPE_SAFETY.md is thorough and explicitly scopes Part B (locatable panics) as follow-up. Keeping the decision record in the repo is good practice.
  • 6 regression tests covering the primary use cases.

Issues

1. Acknowledged unsoundness is bidirectional (medium severity)

// unification.rs:174
(Type::Union(_), Type::Variant) | (Type::Variant, Type::Union(_)) => Ok(Subst::empty()),

The comment acknowledges that the symmetric arm (Type::Variant, Type::Union(_)) lets a bare Variant value flow into a slot typed as a specific Union(name). Concretely this means code like:

variant.make-1 ... some-match-on-Message

…where variant.make-1 returns Type::Variant and some-match-on-Message expects Type::Union("Message") would typecheck even if the variant tag is wrong. The PR says "we accept for now" — that's a reasonable pragmatic call, but it should be tracked. Suggest adding a // TODO: tighten to Union(_) → Variant direction only or filing a follow-up issue so it doesn't silently become load-bearing.

2. Test message assertions are inconsistent (low severity)

test_variant_field_at_rejects_string verifies both the builtin name and the type appear in the error message:

assert!(err.contains("variant.field-at") && err.contains("String"), ...);

But the other four negative tests only check assert!(result.is_err()). If the error fires for the wrong reason (e.g., an unrelated stack underflow), these tests would still pass. Recommend at minimum checking err.contains("variant.tag") etc., matching the pattern set by the first test.

3. variant.last has no negative test (low severity)

Every variant.* builtin listed in adt.rs has a negative test except variant.last. The PR description says "5 negative tests, 1 positive" — adding variant.last would make coverage complete for the symmetric set.

4. No test for Union(_) → variant.* (i.e., the relaxation rule) (low severity)

The positive test covers Variant → variant.*, but the design doc states a key goal is that "user values typed as Union(name) [are] accepted." There's no test that passes a Union(name)-typed value to a variant.* builtin and verifies it typechecks. That rule is the main soundness-preserving half of the relaxation — it should have explicit coverage.


Minor nits

  • check_word_with_body and variant_call are test helpers defined inline rather than in a shared test utilities module. Not a problem for this PR, but worth extracting if more test modules need them.
  • The design doc's status header Status: design · 2026-04-25 reads as a live design doc but the implementation is already done. Updating to Status: implemented (Part A) · 2026-04-25 would reduce future confusion.

Summary

The core implementation is correct and well-structured. The two actionable items are:

  1. Track the bidirectional unsoundness — either a TODO comment or a follow-up issue. The symmetric arm is fine for now but should not be forgotten.
  2. Strengthen the three weak negative test assertions to match the pattern in test_variant_field_at_rejects_string.

The variant.last gap and missing Union → variant.* test are nice-to-haves that would make the test suite complete for the stated goals.

  Part B implementation:
  - New module crates/compiler/src/codegen/debug_info.rs —
  DICompileUnit/DIFile/DISubprogram/DILocation emitter with a clean
  lifecycle (dbg_init_program / dbg_open_subprogram / dbg_call_suffix /
  dbg_close_subprogram / dbg_emit_module_metadata).
  - crates/compiler/src/codegen/state.rs — added 6 fields for debug info
  state plus a public set_source_file() API.
  - crates/compiler/src/codegen/mod.rs — added the new module.
  - crates/compiler/src/codegen/words.rs — opens/closes a subprogram
  around each user word; the QuotationScope now snapshots
  current_dbg_subprogram_id so quotation bodies don't bleed metadata
  across function boundaries.
  - crates/compiler/src/codegen/statements.rs — every emitted
  call/musttail/ret carries a !dbg record when the statement has a span.
  - crates/compiler/src/codegen/program.rs — flushes the metadata buffer
  at the tail of every IR emit path.
  - crates/compiler/src/lib.rs — passes the source path into codegen and
  -g to clang.

  Runtime cost: none. DWARF lives in object/dSYM files (macOS) or
  non-loaded ELF sections (Linux); the .text and the actual instruction
  stream are unchanged. The 50M tail-call benchmark runs in ~120ms user
  time. just ci is fully green.

  End-to-end, the original repro now fails at compile time with line 4
  named (Part A), and any legitimate runtime panic (FFI failures, OOM,
  division by zero, etc.) resolves the Seq frames in its backtrace to
  .seq:line:col (Part B).
@claude
Copy link
Copy Markdown

claude Bot commented Apr 25, 2026

Code Review — PR #428: Variant-op Type Safety + DWARF Debug Info

Overview

This PR delivers two related improvements in one shot:

  • Part A (type safety): Introduces Type::Variant as a first-class opaque type so all variant.* builtins reject non-variant inputs at compile time. Union(_) <: Variant is a one-way relaxation so named unions continue to work with these ops.
  • Part B (locatable panics): Emits LLVM !DISubprogram and !DILocation metadata in the codegen pass, attaching DWARF locations to every user-defined word and every call site so runtime panics resolve back to .seq:line:col via the standard backtrace path.

The type-system changes are clean and well-reasoned. The DWARF work is architecturally sound. A few issues worth addressing before merging.


Issues

🐛 Metadata ID collision for large programs (debug_info.rs)

dbg_emit_module_metadata reserves IDs cu_id + 1000 and cu_id + 1001 for the module-flag tuples:

let _ = writeln!(ir, "!llvm.module.flags = !{{!{}, !{}}}", cu_id + 1000, cu_id + 1001);
let _ = writeln!(ir, "!{} = !{{i32 7, ...}}", cu_id + 1000);
let _ = writeln!(ir, "!{} = !{{i32 2, ...}}", cu_id + 1001);

cu_id is 1002 (file=1000, cu=1001, sub_ty=1002 at init). The counter keeps incrementing — one DISubprogram per word, one DILocation per call site. At ~1000 metadata records the counter overtakes cu_id + 1000 = 2002 and you get duplicate IDs, which is an LLVM IR verifier hard error.

Fix: allocate the two module-flag IDs with dbg_alloc_id() during dbg_init_program and store them, just like dbg_file_id / dbg_cu_id / dbg_subroutine_type_id.


⚠️ !dbg on ret inside a musttail block may be fragile

In statements.rs, the same dbg suffix is appended to both the musttail call and the following ret:

writeln!(&mut self.output, "  %{} = musttail call tailcc ptr @{}(ptr %{}){}",
    result_var, function_name, stack_var, dbg)?;
writeln!(&mut self.output, "  ret ptr %{}{}", result_var, dbg)?;

musttail in LLVM requires the ret to be an exact mirror of the call with no intervening instructions or metadata records changing the control flow. LLVM's verifier is strict about this in some versions; attaching !dbg to the ret has been a source of miscompiles when it causes the verifier to treat the block as no longer a valid musttail pair. The safe pattern is to put !dbg only on the call and leave the ret bare. Worth adding an IR-level smoke test to catch this if it regresses.


⚠️ Union(_) → Variant direction is acknowledged unsoundness — needs a tracking issue

The comment in unification.rs says:

the symmetric form is a minor unsoundness in the reverse direction (a Variant flowing back into a Union(name) slot) that we accept for now.

This means a bare Variant satisfies a Union("Message") constraint, silently dropping the type name. Since this is intentionally deferred, it deserves a tracking issue so it doesn't get lost. If it's already tracked somewhere, a link here would help.


ℹ️ Debug info is unconditional — design doc said "gated behind --debug if needed"

lib.rs unconditionally calls codegen.set_source_file(...) and passes -g to clang. The design doc (VARIANT_OP_TYPE_SAFETY.md, Part B) mentions gating it behind a flag. This is fine as a starting point, but it silently increases binary size for every compile. Either gate it, or explicitly update the doc to say the flag is always on.


Suggestions

Missing positive test for Union(_)variant.*

All five negative tests confirm rejection, but there is no test that a value typed as a named Union("Foo") is accepted by variant.field-at (exercising the relaxation rule directly). test_variant_make_then_field_at_typechecks covers Variant → variant.* via make-1, but not Union(_) → variant.*. Worth a short test.

Minor: dbg_call_suffix checks dbg_enabled() after the pattern match

pub(super) fn dbg_call_suffix(&mut self, span: Option<&Span>) -> String {
    let (Some(scope), Some(sp)) = (self.current_dbg_subprogram_id, span) else {
        return String::new();
    };
    if !self.dbg_enabled() {  // ← this is dead code when dbg disabled,
        return String::new(); //   since current_dbg_subprogram_id is None
    }
    ...

When debug info is disabled current_dbg_subprogram_id is always None (because dbg_open_subprogram guards on dbg_enabled()), so the second guard is never reached. No bug, but it reads as if it could be — moving dbg_enabled() first would match the pattern in the other methods and clarify the intent.


What's good

  • Type::Variant is the right abstraction. Introducing it as an opaque ground type (not a kinded variable) keeps the change minimal. The Union <: Variant one-way rule mirrors the existing Closure <: Quotation pattern consistently.
  • escape_md is correct. Path strings can contain backslashes on Windows and quotes in unusual directories; both are handled.
  • QuotationScope::dbg_subprogram_id save/restore is correct. Quotations emit into their own LLVM functions with no parent DISubprogram, so clearing current_dbg_subprogram_id during quotation emission and restoring it afterward prevents invalid scope references.
  • dbg_init_program idempotent guard (if self.dbg_cu_id.is_some() { return }) and dbg_open_subprogram's bootstrap call mean callers don't have to coordinate init order — that's good defensive design.
  • Test coverage for negative cases is solid. Five targeted rejection tests with clear assertions cover the main variant.* ops.

Summary

The core of the PR — Type::Variant + unification rule + signature updates — is correct, well-tested, and ready. The DWARF work is mostly solid but the metadata ID collision in dbg_emit_module_metadata is a real bug that will surface on any non-trivial program; the module-flag IDs need to be allocated through the counter, not as fixed offsets. The !dbg on the tail-call ret is worth verifying against LLVM's verifier before shipping.

  Real bug fixed:
  - crates/compiler/src/codegen/state.rs + debug_info.rs — module-flag IDs
   now allocated through dbg_alloc_id() and stored in dbg_module_flag_ids:
   Option<(usize, usize)>. The cu_id + 1000 collision class is gone.

  Correctness/soundness:
  - crates/compiler/src/codegen/statements.rs — !dbg removed from the ret
  after a musttail call. The call's !dbg is sufficient for backtrace
  resolution (confirmed: 4-deep panic trace still resolves all frames).
  - crates/compiler/src/typechecker/tests.rs — three weak negative tests
  (field_count, init, append) now assert the builtin name appears in the
  error message, matching the strong-test pattern.
  - crates/compiler/src/typechecker/tests.rs — added
  test_variant_last_rejects_string (closes the missing variant.*
  coverage).
  - crates/compiler/src/typechecker/tests.rs — added
  test_union_value_accepted_by_variant_field_at directly exercising the
  Union(_) <: Variant relaxation rule.

  Polish:
  - crates/compiler/src/unification.rs — TODO note on the bidirectional
  symmetric arm explaining the intentional unsoundness and the path to
  tighten it.
  - crates/compiler/src/codegen/debug_info.rs — dbg_enabled() check moved
  before the pattern match in dbg_call_suffix (clarifies intent).
  - docs/design/VARIANT_OP_TYPE_SAFETY.md — status header updated to
  implemented (Part A and Part B).

  just ci is green; the deep-panic e2e check still resolves all four Seq
  frames to .seq:line:col.
@claude
Copy link
Copy Markdown

claude Bot commented Apr 25, 2026

Code Review — PR #428: Variant-op type safety + locatable runtime panics

This PR bundles two related but distinct improvements: Part A (compile-time Type::Variant safety) and Part B (DWARF debug info for locatable panics). Both are well-motivated and generally well-implemented. Notes below, roughly ordered by severity.


Overview

Part A introduces Type::Variant as a nominal type for the variant.* / wrap-* builtins, replacing the free type variable V that previously let String, Int, etc. unify silently. Part B emits LLVM !DICompileUnit / !DISubprogram / !DILocation metadata from codegen so runtime-panic backtraces resolve to .seq:line:col. The design doc (VARIANT_OP_TYPE_SAFETY.md) closely tracks the implementation, which is helpful.


Concerns

1. -g is unconditionally passed to clang (Part B — notable)

crates/compiler/src/lib.rs unconditionally calls codegen.set_source_file(...) and always passes -g to clang:

codegen.set_source_file(source_path.to_path_buf());
// ...
clang.arg("-g")

The design doc explicitly hedges: "gated behind a --debug flag if needed". In practice:

  • Binary size increases (DWARF sections can be substantial for generated IR with many call sites).
  • Compile time increases — clang's DWARF generation isn't free.
  • Users can't opt out of debug info for production/release binaries.

Recommendation: gate set_source_file and -g behind a --debug / --emit-debug-info flag (or at minimum a debug_info: bool field in whatever compile config struct drives compile_file_with_config). The zero-runtime-cost claim holds, but binary-size cost is real.

2. Bidirectional Union <: Variant relaxation is acknowledged unsound (Part A — design note)

The TODO comment in unification.rs is admirably honest:

// TODO: tighten to a directional rule once the typechecker tracks
// which side of a unification is "expected" vs "actual". Today a
// `Variant` (e.g. the result of `variant.append`) silently
// satisfies a `Union(name)` constraint without checking the tag —
// intended pragmatic loophole, not a permanent stance.
(Type::Union(_), Type::Variant) | (Type::Variant, Type::Union(_)) => Ok(Subst::empty()),

The consequence: a word typed ( -- Message ) can be implemented with a bare variant.make-1 returning Variant, and the typechecker won't complain — the Variant flows into the Union("Message") slot without checking the tag. This is the intended behavior for now, but it's worth a test that demonstrates the loophole so future tightening has a clear regression target:

#[test]
fn test_variant_silently_satisfies_union_slot() {
    // Documents the known unsoundness: Variant can satisfy Union(name).
    // Remove / update when the TODO in unification.rs is addressed.
    let result = check_word_with_body(
        "loophole",
        StackType::Empty,
        StackType::singleton(Type::Union("Msg".to_string())),
        vec![
            Statement::Symbol("Get".to_string()),
            variant_call("variant.make-0"),
        ],
    );
    assert!(result.is_ok(), "expected Variant to satisfy Union(Msg) via relaxation");
}

3. variant.append return type changed from V2 to Variant (Part A — subtle semantic change)

Previously variant.append was typed (a V T -- a V2) where V2 was a distinct free variable. Now it's (a Variant T -- a Variant). This means callers that had V2 unified with a Union(name) output constraint would previously have gotten a V2 ~ Union("Msg") binding; now they silently succeed via the relaxation rule. This is consistent with the relaxation, but worth calling out explicitly in the test suite or a comment — the semantic is "we've appended a field; the resulting type is still Variant, not the original union, even if the input was Union(name)."

4. Missing test for variant.append returning Variant when input is Union(name)

There are good negative tests for rejecting non-variant inputs, and a positive test for variant.make-1 → variant.field-at. But there's no test verifying:

  • Union("Foo") → variant.append → Variant (loses union info, correct behavior)
  • Union("Foo") → variant.field-at accepted (the Union <: Variant positive path for non-make builtins — test_union_value_accepted_by_variant_field_at covers field-at specifically, which is good)

variant.last, variant.tag, variant.field-count, variant.init with a Union input could each use a positive test mirroring test_union_value_accepted_by_variant_field_at.

5. dbg_emit_module_metadata guard order (Part B — minor)

In debug_info.rs:

pub(super) fn dbg_emit_module_metadata(&self, ir: &mut String) {
    let (Some(cu_id), Some((dwarf_ver_id, debug_info_ver_id))) =
        (self.dbg_cu_id, self.dbg_module_flag_ids)
    else {
        return;
    };
    if !self.dbg_enabled() {   // ← checked after destructuring
        return;
    }

dbg_enabled() is checked after the destructuring. Since dbg_cu_id is None when debug is disabled, the early-return in the let-else covers it — so this is not a bug. But swapping the order (check dbg_enabled() first) would be clearer and consistent with the other methods.

6. escape_md doesn't handle control characters in paths (Part B — low risk)

escape_md escapes \ and " for LLVM IR metadata strings, which is correct for the common case. Linux paths can technically contain arbitrary bytes except / and NUL. LLVM IR metadata strings don't have a well-defined encoding for non-printable chars. This is unlikely to cause issues in practice, but a note or a debug_assert!(!s.contains('\0')) would surface the edge case early.

7. PR title/description mismatch (minor)

The title says "Part A (compile-time) is done end-to-end", but both Part A and Part B are implemented. The description body does enumerate Part B changes, so this is just a title inconsistency — worth a one-word fix for future git log clarity.


What's done well

  • dbg_init_program is idempotent — safe to call from dbg_open_subprogram as a bootstrap, clean API.
  • Quotation scope handling — saving/restoring current_dbg_subprogram_id in QuotationScope correctly prevents !dbg annotations inside quotation bodies from referencing an out-of-scope subprogram. The inline comment explains why.
  • musttail + !dbg — putting the debug suffix on the call and leaving the ret bare is correct per LLVM's musttail constraints.
  • Module-flag IDs allocated at program init — avoids any collision with per-call-site location IDs on large programs.
  • let _ = writeln!(...) throughout — suppresses the spurious must_use lint on String's Write impl without silencing real errors. Idiomatic.
  • Test helper check_word_with_body + variant_call — reduces boilerplate nicely; the helpers are scoped to the test module and don't bleed into production code.
  • Design doc maps directly to the implementation and the checkpoints are useful acceptance criteria.

Summary

The core type-safety improvement (Part A) is sound and well-tested. The main action items are:

  1. Gate -g / set_source_file behind a compile option (avoids unconditional binary-size cost).
  2. Add a documented test for the Variant → Union loophole so future tightening has a regression target.
  3. Consider positive-path tests for the remaining variant.* ops with Union inputs.
  4. Fix the dbg_emit_module_metadata guard order (cosmetic).

@navicore navicore merged commit e492d07 into main Apr 25, 2026
2 checks passed
@navicore navicore deleted the variant-type branch April 25, 2026 16:51
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant