Skip to content

Inference

Joshua Shinavier edited this page May 29, 2026 · 2 revisions

Inference

Audience: readers who want to understand Hydra's type inference and checking — both how it works and why it is shaped this way.

What inference does

Hydra performs Hindley-Milner (HM) type inference, and along the way elaborates untyped lambda terms into fully-typed System F terms — readers familiar with HM should note that the output is not just a type but also a typed term.

The work is split across two kernel modules that cooperate bidirectionally:

  • hydra.inference drives the synthesis direction: given an untyped term and a graph, it computes a type scheme for the term. This is the Algorithm-W half.
  • hydra.checking drives the verification direction: given a term and a known type, it verifies that the term has that type. This is the half that consumes the explicit type information already present in the source — ascriptions, nominal-type contexts, expected argument types, and so on — and guides inference where partial type information is available.

Synthesis recovers types for ordinary terms; checking guides inference where partial type information is already known.

The graph as inference context

Inference takes place in the context of a particular lexical environment, which Hydra (and the LambdaGraph paper) calls a Graph. The Graph record threads through every inference judgment:

  • boundTerms and boundTypes carry the term variables already in scope and their type schemes.
  • lambdaVariables and typeVariables mark the variables introduced by the current chain of λ and Λ binders, so inference can distinguish "introduced here" from "imported".
  • schemaTypes carries every nominal type scheme reachable from the current scope.
  • primitives contributes the type signature of every built-in function and constant.
  • classConstraints accumulates equality / ordering constraints as inference encounters operations that require them.

This is richer than the textbook bare typing context Γ: by carrying the graph, inference can resolve nominal type names, primitive type signatures, and class constraints from one place, rather than threading several environments separately.

What gets generalized

Hydra inherits the standard HM generalization discipline:

  • Let-polymorphism: variables bound by let are generalized to type schemes after inference, then re-instantiated at each use site. This is the power that lets a let-bound id be used at multiple types in one program.
  • Monomorphic lambdas: lambda-bound variables remain monomorphic within the body of the lambda. Lifting this restriction would require higher-rank inference, which is undecidable in the general case — HM is the sweet spot that keeps inference decidable while supporting practical polymorphism.

Polymorphism that does not fit this restriction is expressed explicitly via type abstraction Λα.e and type application e ⟨τ⟩. Inference emits these System F operations in the elaborated term; the user does not usually have to write them.

Algorithms

The synthesis side is Algorithm W with the standard Damas-Milner extensions:

  • Unification — Robinson's algorithm for solving type equations.
  • Generalization — at each let binding, free type variables in the inferred type that do not occur in the surrounding graph are bound by an outer ∀.
  • Instantiation — at each use of a let-bound variable, fresh type variables stand in for the generalized ones.
  • Substitution — type variables are progressively replaced as unification succeeds.

The checking side runs after — and sometimes during — synthesis to verify the result against any explicit type information present in the source.

Class constraints

A type class in Hydra is a constraint attached to a type variable that propagates through inference: most commonly equality (the type supports structural equality) or ordering (the type supports comparison). When inference encounters an operation that requires one of these, it accumulates the constraint on the relevant type variable in Graph.classConstraints. Constraints are resolved at use sites where the type variable is instantiated to a concrete type; if no satisfying instance exists, inference reports an error.

This is a lightweight implementation of the type-class machinery and not the full dictionary-passing translation found in Haskell — Hydra's constraints are informational guards rather than runtime evidence.

Why this shape

The split into inference + checking, with elaboration to typed System F, gives Hydra three properties that matter for a translingual language:

  • Multi-target soundness. Downstream coders consume the typed term directly and never have to re-run inference.
  • Cross-language type fidelity. Languages with first-class generics (Java, Scala, TypeScript) translate the explicit Λ and type-application points directly; languages without (Python, Lisps, Go) use the same annotations for runtime checks or erasure under known schemes.
  • Decidable inference. Restricting polymorphism to let-bindings and keeping lambdas monomorphic means inference terminates; the user pays the cost of explicit Λ only when reaching for genuinely higher-rank patterns.
  • No annotations required. HM constructs principal types from untyped DSL sources, which is essential for usability: the developer writes Hydra programs in any DSL flavor without needing to know what type information each downstream coder will expect.

Cross-references

  • Hydra/Sources/Kernel/Terms/Inference.hs — synthesis side.
  • Hydra/Sources/Kernel/Terms/Checking.hs — verification side.
  • Concepts — broader type-system framing (System F foundation, nominal types, the term/graph correspondence).
  • The LambdaGraph paper presents the inference rules formally; the Hydra kernel modules above are the corresponding implementation.
  • Issue #377 — ongoing alignment work between the paper's formal presentation and the kernel implementation.

Clone this wiki locally