Skip to content
Joshua Shinavier edited this page Jun 2, 2026 · 4 revisions

Design rationale

This page records deliberate design decisions in Hydra — the "why we chose this, and not the obvious alternative" behind aspects of the language that a reader might reasonably second-guess.

It is distinct from Concepts, which describes the model as it is. This page answers why. When a choice here is settled, the relevant Concepts/section links to it rather than repeating the justification inline.

Entries are settled decisions, not open questions. In-flight design exploration belongs in GitHub issues; once a decision is made and stable, summarize the rationale here and link the issue.


Ordered map keys

Hydra requires the key type of a map to be orderable. This is a deliberate design choice, not an inherited limitation: every term in Hydra — including a map term — needs a canonical representation that survives translation into any target language. A total order on keys gives a map a single canonical form (a deterministic key/value sequence) that every host reproduces identically, regardless of how that host implements maps internally.

In practice this excludes almost nothing. Hydra terms have a canonical order for all first-order shapes, so the only key types ruled out are those with no canonical identity at all — chiefly functions, which are excluded for a stronger reason than non-orderability: function equality is undecidable, so a function couldn't serve as a key even in an unordered (hash-based) map.

Two alternatives are sometimes raised — keys ordered by a supplied ordering function (rather than a natural order on the type), or merely-hashable keys paired with a hashing function. Both are perfectly constructible in Hydra today (e.g. a list of key/value pairs plus an ordering or hashing function), but neither is a fundamental constructor in the language, because the natural-order map already provides the canonical, translation-stable representation Hydra needs.

Discussion: issue #230.


Union introduction as inject

The constructor that builds a value of a union type (selecting one variant and supplying its payload) is called inject. This is standard typed-lambda-calculus terminology: inject is the introduction form for a sum type, dual to project (the elimination form for a product/record). Alternatives like variant or construct are sometimes proposed, but inject/project is the established, symmetric pair and is kept deliberately.


Annotations as a separate type and term variant

Hydra makes annotated a distinct variant of Type and Term (a record pairing an inner type/term with a map<Name, Term> of metadata), rather than baking metadata fields into every variant or carrying a parallel metadata map alongside each module.

This choice has three load-bearing consequences:

  • Stacking. Multiple annotation layers can be applied to the same inner type or term by nesting annotated nodes; tools that need to reason about a particular annotation key can peel layers from the outside without worrying about which inner type or term they apply to.
  • Small grammar. Every non-annotation variant has exactly the fields its semantic role requires. Adding a new annotation key never modifies the grammar of Type or Term — it just goes into the annotation map of an enclosing annotated node.
  • Easy traversal of the unannotated core. A traversal that doesn't care about metadata can call a single helper (kernel Annotations.deannotateType / …Term) once at the top and then walk the variants directly. The alternative — fields-on- every-variant — would require every traversal to ignore those fields explicitly, multiplied across every host.

The accepted cost is the "peel through annotations" step that consumers of the annotated form have to perform; in practice this is one helper call per traversal and is mechanical.


Coders bundle encode and decode

A Coder in Hydra pairs an encoder (source → target) with a decoder (target → source) in a single value, rather than letting them be authored as two independent functions. Alternatives are common — many serialization stacks ship an encoder and a decoder as separate modules, sometimes from different authors.

The bundled form is preferred for three reasons:

  • Round-trip identity is a single-source claim. When an author writes a Coder, they are simultaneously asserting the pair is consistent. Round-trip tests are then a property of the Coder rather than a cross-module coincidence: decoding the result of encoding a value should yield the original (modulo a documented lossiness flag).
  • One Coder per format pair. A reader who wants to know how Hydra exchanges with Avro looks at one place, not two. Format-pair-specific invariants (which fields encode where, how nominal types map, what's lossless and what isn't) live together with the code that enforces them.
  • Coder composition is closed. Two Coders composing into a third (Avro ↔ Hydra ↔ Protobuf, for instance) is well-defined because each side is itself bidirectional. A composition of two unidirectional encoders would give you a one-way Avro → Protobuf path with no symmetric way home.

The accepted cost is that authoring a Coder is harder than authoring an encoder alone — the author has to think about the inverse direction even when only one is immediately needed. In practice this cost is paid once per format, not per use.


Adapters as the type-aware layer above Coders

A Coder transforms values between two representations that already share a type schema. An Adapter adds the type-level half: source and target types, a Coder over those types' values, and an isLossy flag. The two frameworks are deliberately kept separate.

The rationale:

  • Value-level transformations don't always need a type-level mapping. Within a single type system, encode/decode is purely value-level; folding type-level fields into every Coder would burden the simpler case.
  • Cross-language coding inherently needs both. Adapting a Hydra record into a Java class involves two different type-level objects bound together with a value-level Coder and a lossiness claim. The Adapter is where both halves meet.
  • Composition has different rules at each level. Coders compose pointwise (Avro ↔ Hydra ↔ Protobuf at the value level). Adapters compose more carefully — both type-level mappings must compose, and the combined lossiness flag is the disjunction of the pair's.

The accepted cost is one more abstraction to learn before writing a cross-language coder.

Clone this wiki locally