-
Notifications
You must be signed in to change notification settings - Fork 24
Inference
Audience: readers who want to understand Hydra's type inference and checking — both how it works and why it is shaped this way.
Hydra performs Hindley-Milner (HM) type inference, and along the way elaborates untyped lambda terms into fully-typed System F terms — readers familiar with HM should note that the output is not just a type but also a typed term.
The work is split across two kernel modules that cooperate bidirectionally:
-
hydra.inferencedrives the synthesis direction: given an untyped term and a graph, it computes a type scheme for the term. This is the Algorithm-W half. -
hydra.checkingdrives the verification direction: given a term and a known type, it verifies that the term has that type. This is the half that consumes the explicit type information already present in the source — ascriptions, nominal-type contexts, expected argument types, and so on — and guides inference where partial type information is available.
Synthesis recovers types for ordinary terms; checking guides inference where partial type information is already known.
Inference takes place in the context of a particular lexical environment, which
Hydra (and the LambdaGraph paper) calls a Graph. The
Graph record threads through every inference judgment:
-
boundTermsandboundTypescarry the term variables already in scope and their type schemes. -
lambdaVariablesandtypeVariablesmark the variables introduced by the current chain ofλandΛbinders, so inference can distinguish "introduced here" from "imported". -
schemaTypescarries every nominal type scheme reachable from the current scope. -
primitivescontributes the type signature of every built-in function and constant. -
classConstraintsaccumulatesequality/orderingconstraints as inference encounters operations that require them.
This is richer than the textbook bare typing context Γ: by carrying the graph, inference can resolve nominal type names, primitive type signatures, and class constraints from one place, rather than threading several environments separately.
Hydra inherits the standard HM generalization discipline:
-
Let-polymorphism: variables bound by
letare generalized to type schemes after inference, then re-instantiated at each use site. This is the power that lets a let-boundidbe used at multiple types in one program. - Monomorphic lambdas: lambda-bound variables remain monomorphic within the body of the lambda. Lifting this restriction would require higher-rank inference, which is undecidable in the general case — HM is the sweet spot that keeps inference decidable while supporting practical polymorphism.
Polymorphism that does not fit this restriction is expressed explicitly via
type abstraction Λα.e and type application e ⟨τ⟩. Inference emits these
System F operations in the elaborated term; the user does not usually have to
write them.
The synthesis side is Algorithm W with the standard Damas-Milner extensions:
- Unification — Robinson's algorithm for solving type equations.
-
Generalization — at each
letbinding, free type variables in the inferred type that do not occur in the surrounding graph are bound by an outer ∀. -
Instantiation — at each use of a
let-bound variable, fresh type variables stand in for the generalized ones. - Substitution — type variables are progressively replaced as unification succeeds.
The checking side runs after — and sometimes during — synthesis to verify the result against any explicit type information present in the source.
A type class in Hydra is a constraint attached to a type variable that
propagates through inference: most commonly equality (the type supports
structural equality) or ordering (the type supports comparison). When
inference encounters an operation that requires one of these, it accumulates
the constraint on the relevant type variable in Graph.classConstraints.
Constraints are resolved at use sites where the type variable is instantiated
to a concrete type; if no satisfying instance exists, inference reports an
error.
This is a lightweight implementation of the type-class machinery and not the full dictionary-passing translation found in Haskell — Hydra's constraints are informational guards rather than runtime evidence.
The split into inference + checking, with elaboration to typed System F, gives Hydra three properties that matter for a translingual language:
- Multi-target soundness. Downstream coders consume the typed term directly and never have to re-run inference.
-
Cross-language type fidelity. Languages with first-class generics
(Java, Scala, TypeScript) translate the explicit
Λand type-application points directly; languages without (Python, Lisps, Go) use the same annotations for runtime checks or erasure under known schemes. -
Decidable inference. Restricting polymorphism to let-bindings and
keeping lambdas monomorphic means inference terminates; the user pays the
cost of explicit
Λonly when reaching for genuinely higher-rank patterns. - No annotations required. HM constructs principal types from untyped DSL sources, which is essential for usability: the developer writes Hydra programs in any DSL flavor without needing to know what type information each downstream coder will expect.
-
Hydra/Sources/Kernel/Terms/Inference.hs— synthesis side. -
Hydra/Sources/Kernel/Terms/Checking.hs— verification side. - Concepts — broader type-system framing (System F foundation, nominal types, the term/graph correspondence).
- The LambdaGraph paper presents the inference rules formally; the Hydra kernel modules above are the corresponding implementation.
- Issue #377 — ongoing alignment work between the paper's formal presentation and the kernel implementation.