Skip to content

Type aliases: storage, resolution, protocol unions (progress on #15127)#15366

Closed
cgarvis wants to merge 2 commits into
elixir-lang:mainfrom
cgarvis:feature/set-theoretic-type-aliases
Closed

Type aliases: storage, resolution, protocol unions (progress on #15127)#15366
cgarvis wants to merge 2 commits into
elixir-lang:mainfrom
cgarvis:feature/set-theoretic-type-aliases

Conversation

@cgarvis
Copy link
Copy Markdown

@cgarvis cgarvis commented May 13, 2026

Progress on #15127's "Type aliases" bullet.

What ships

@type t :: %__MODULE__{name: binary()} declarations resolve through a new Module.Types.Typespec AST-to-Descr converter and persist via the ExCk chunk. The checker reads them from Of.struct_instance/7 and emits a new :badstructfield diagnostic when a value doesn't match its declared field type. Function-type fields, remote type references, and protocol t/0 (union of impls, written at consolidation) all resolve precisely. Was dynamic() before.

The decision worth flagging: protocol t/0 as closed-world per build artifact

Elixir protocols are open-world by design. Consolidation closes the world for a given build. Three approaches considered:

  • Rust dyn Trait: keep the type opaque. Gives up refinement entirely.
  • Sealed protocols (Scala, Java, Dart, Swift): author opts in. Cleanest, but a language addition.
  • This PR: closed-world per artifact. Union is correct for the build. Downstream code sees it on compilations following consolidation.

@fallback_to_any true protocols (most stdlib) collapse to term() because Of.impl(Any, :open) is term(). Third-party protocols without the fallback get the full benefit.

Not in scope (separate follow-ups)

  • Same-run downstream re-check after consolidation. Precision arrives on subsequent compiles.
  • Parametric t(element).
  • INIT-bootstrap modules' missing :types.

Verification

  • Full checker test suite: 508/508
  • make test: 14/8362 failures, all OTP-28.0 stderr-banner contamination (lib/elixir/src/elixir.erl:134), identical on origin/main. Not branch-related.
  • make format && make clean compile: clean.

Hex bench: top 100 packages

0 new compile errors. 14 distinct upstream typespec violations surfaced (Postgrex.Result, Absinthe.Resolution, nimble_options.error_tuple, Bandit.Adapter, Finch.HTTPError, etc.). First upstream fix already filed: dashbitco/nimble_options#151.

Performance: +1.53% wall-clock across all 100 packages. Concentrated in typespec-heavy libraries (ecto_sql +18%). fetch_remote_types memoization in the second commit recovered about half the regression from the first.

AI disclosure

Implementation assisted by Claude. Human review and architectural decisions throughout. Sign-off carries DCO responsibility.

cgarvis added 2 commits May 12, 2026 22:09
…lixir-lang#15127

This branch advances the Type aliases bullet of elixir-lang#15127. The work is
split across four logical pieces; they're combined here as one PR
since each piece depends on the previous one and they all share a
test surface in lib/elixir/test/elixir/module/types/integration_test.exs.

## 1. Foundation — typed struct fields via @type t

A new `Module.Types.Typespec` converter walks typespec ASTs and
produces `Module.Types.Descr` values. Built-ins, literals, unions,
struct literals (%__MODULE__{...} and %Mod{...}), local references,
and self-qualified references are all handled. Memoized via a
`:pending` sentinel so self-cycles surface as an error; cycles in
real code degrade the whole alias to `dynamic()` rather than break
the build.

`Kernel.Typespec.translate_typespecs_for_module/2` invokes the
converter and stores the result in the module's data tables under
`{:elixir, :types_descr}`. `Module.ParallelChecker.cache_from_module_map`
snapshots it into the long-lived checker ETS table during
`spawn_parallel_checker`, so the async checker can read it after the
per-module data tables are torn down. The chunk-write path in
`elixir_erl:compile/2` adds an optional `:types` key to the ExCk
chunk; older readers ignore it (additive, no version bump).

`Module.Types.Of.struct_instance/7` consults `User.t/0` to type-check
field values, emitting a new `:badstructfield` diagnostic on
mismatch. Default values injected by defstruct are skipped to avoid
noise.

To make this available to stdlib, `module/types/typespec.ex` is
added to bootstrap MAIN — without this, modules compiled before
Module.Types.Typespec is loaded would never get `:types` in their
chunks.

## 2. Function types and remote type references

The converter is extended to handle two more typespec shapes:

  * Function types `(args -> result)` parse as a single-element list
    containing `{:->, _, [args, return]}`. The handler calls
    `Descr.fun/2` (or `Descr.fun/0` for variadic `... -> result`).

  * Remote type references `Mod.t()` are resolved by reading the
    target's ExCk chunk via `:beam_lib.chunks` on the in-memory
    binary (`:code.get_object_code/1`, not the disk path, which
    fails when the beam isn't written yet). `@opaque` is dynamic
    from outside the defining module; parametric arity > 0 degrades
    silently because the offending typespec lives in another module.

## 3. Same-compile-unit ordering for typespec refs

When module B's typespec references `A.t()`, B needs A's beam to be
loaded by the time the converter runs. Without help, the parallel
compiler doesn't see typespec references as deps and may schedule B
before A is loaded.

`Kernel.Typespec.typespec/4`'s remote-call branch now calls
`Kernel.ErrorHandler.ensure_compiled` to block until A's beam is
available — the same mechanism struct expansion uses. A
`:type_reference` trace event is also emitted via `:elixir_env.trace`
so external tools (mix xref, custom tracers) can consume it; the
lexical tracker itself is intentionally not wired, since existing
tests assert that typespecs don't add compile/runtime deps. To make
`fetch_remote_types` see the in-memory checker ETS from each spawn
subprocess, `elixir_module.erl` propagates `:elixir_checker_info`
into the subprocess's process dictionary.

## 4. Protocol t/0 union

Protocol consolidation now writes the union of every implementing
struct/built-in type into the consolidated chunk's
`:types[{:t, 0}]`. `Protocol.consolidate/5` extracts the domain
already computed by `new_signatures/5` (which uses
`Module.Types.Of.impl/2` for each impl) and stores it. The result is
that references like `Enumerable.t()` in downstream typespecs become
semantically meaningful: a value typed as `Foo.t()` is statically
known to be one of the implementing types.

### Trade-off rationale for the protocol approach

Three alternatives were considered:

  * **Rust `dyn Trait`** — keep the type opaque even after link-time
    vtable construction. Gives up refinement entirely.

  * **Scala/Java/Dart/Swift sealed traits** — author opts into a
    closed-world definition. Cleanest answer overall but requires a
    language addition.

  * **Closed-world per-build-artifact** (this commit). Open by
    language design, closed-by-consolidation for the current
    artifact. The union is correct for this build and useful for
    downstream type-checking on subsequent compilations.

For protocols with `@fallback_to_any true` (most stdlib protocols),
`Of.impl(Any, :open)` is `term()` and the union collapses to
`term()`. Technically correct — those protocols runtime-accept any
value — but no useful refinement. Third-party protocols without
the fallback get the full benefit.

## What this branch does NOT address

  * **Same-run downstream re-check after consolidation.**
    Consolidation runs after the rest of compile, so dependents
    type-checked during the same `mix compile` saw the
    unconsolidated chunk. Precision arrives on subsequent compiles.

  * **Parametric `t(element)` of protocols and aliases.** Consolidator
    writes only `{:t, 0}`. Parametric protocol types are a separate
    bullet in elixir-lang#15127.

  * **INIT bootstrap modules' missing `:types`.** Modules in the very
    early bootstrap (Macro, Range, Keyword, etc., before
    `kernel/typespec.ex`) still don't get `:types`. Moving
    `Module.Types.Descr` to INIT would unlock them but is a much
    larger reshuffling.

## Verification

  * 508 checker tests pass (+15 from new typed-struct, function-type,
    remote-ref, dep-tracking, and protocol-union tests).
  * 79 top-level typespec tests pass.
  * 26 Protocol tests pass.
  * 36 lexical tracker tests pass.
  * 55 Mix.Dep tests pass.
  * `make clean && make compile` produces no new stdlib warnings.

Signed-off-by: Chris Garvis <cgarvis@gmail.com>
Follow-up commit on top of the type-aliases foundation. Three categories
of change, all stylistic or performance — no new behavior.

## Bug fix: false-positive "protocol for missing module" warnings

`Protocol.__impl__!/3`'s missing-module check was treating transient
parallel-compiler deadlock states (`Code.ensure_compiled` returns
`{:error, :unavailable}`) as genuinely-missing modules and emitting a
misleading warning. The new typespec `ensure_compiled` calls
introduced earlier in this series caused more of these transient
states, surfacing ~33 false positives across the top 100 Hex
packages.

Fix: distinguish `:unavailable` (transient — compiler may resolve)
from `:nofile`/`:badfile`/`:embedded` (genuinely missing). Only the
latter group warrants the warning.

## Refactors (no behavior change)

  * `Module.Types.Of.struct_instance/7`: extract a small helper for
    the typed-field-vs-expected fallback (`expected_field_type/2`).
  * `Of.format_diagnostic({:badstructfield, ...})`: change message to
    "but expected type:" to match the convention used by `:badmap`
    and sibling diagnostics.
  * `Module.Types.Typespec`: function-spec clause uses two pattern
    heads instead of a single-branch `cond`; collapsed an intermediate
    `case pairs` in `fetch_remote_types_from_checker`; reordered
    `do_to_descr/2` clauses so the hot remote-call shape matches
    earliest.
  * `Kernel.Typespec.convert_type_to_descr/2`: inlined a bare `arity =`
    binding from a `with` chain.
  * `Protocol.consolidate/5`: replaced 3-line Map.get/put/put dance
    with `update_in` matching the adjacent `update_in(checker.exports)`
    style.

## Performance (measurable)

  * `Module.Types.Typespec.fetch_remote_types/1`: memoize via process
    dictionary, scoped per compile worker. Eliminates redundant
    `:beam_lib.chunks` + `:erlang.binary_to_term` calls for the same
    remote module within one typespec block. Highest-payoff
    optimization for typespec-heavy modules (Ecto schemas reference
    `String.t()` etc. 30+ times per module).
  * `Kernel.Typespec.typespec/4` remote handler: dedupe
    `Kernel.ErrorHandler.ensure_compiled` calls per worker process via
    `:"$elixir_typespec_ensured"` process-dict set. Eliminates redundant
    `:waiting` round-trips to the parallel compiler.

## Bootstrap safety

`Kernel.Typespec.collect_defined_type_pairs/1`'s use of `Enum.any?`
was crashing during the bootstrap of `Range` (Enum isn't compiled yet
at that point). Replaced with `:lists.any/2`, an Erlang BIF that's
always available. Pure bootstrap fix; no runtime behavior change.

## Tests

  * Updated 6 `integration_test.exs` assertions to match the new
    `:badstructfield` diagnostic wording.
  * Full checker suite: 508/508 passes on clean rebuild.
  * Protocol suite: 26/26.
  * Top-level typespec: 79/79.
  * `make clean && make compile` clean.

Signed-off-by: Chris Garvis <cgarvis@gmail.com>
cgarvis added a commit to cgarvis/nimble_options that referenced this pull request May 13, 2026
The struct already stores a list of atoms for the "unknown options"
error (lib/nimble_options.ex:524 passes a list of keys; test
"unknown options" asserts `key: [:not_an_option1, :not_an_option2]`).
The @type for the field said `atom()` only, so the new set-theoretic
checker in elixir-lang/elixir#15366 flags every caller of error_tuple
at that site and cascades into every package that depends on
nimble_options.

Widening to `atom() | [atom()]` matches actual runtime behavior. Doc
updated to call out the multi-key case. No code change, no test
change.

Signed-off-by: Chris Garvis <cgarvis@gmail.com>
@josevalim
Copy link
Copy Markdown
Member

Do not use coding agents to tackle existing issues unless they have the "Contributions Welcome" label.

@josevalim josevalim closed this May 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants