
Design note: Revising the unifier


F*'s unifier, FStar.TypeChecker.Rel, is inefficient, incomplete and more complicated than it ought to be.

  • Incompleteness: This mainly arises from unimplemented functionality, e.g., it unifies match terms only if they are syntactically identical.

  • Complexity: Besides unification, Rel also attempts to solve subtyping problems while computing logical guards for each subtyping relation. Computing logical guards during subtyping is needed because F* tries to compute a VC for a term in a single pass, inferring types and building a WP at once. With 2-phase type-checking, this is no longer necessary, since type inference can be separated from the WP computation.

But this note is not about the above two points; they will be addressed separately.

Inefficiencies in the Unification Algorithm

A poor choice of representation

The representation chosen for meta-variables (aka unification variables) by F* is extremely conservative and leads to poor performance. Here's a brief description of how it currently works:

  1. F* introduces a meta-variable in a context G |- _ : t where a missing term is to be inferred.

  2. The strategy is to introduce a new meta-variable ?u at the function type G -> t, abstracting over the entire context at that program point. This is represented as Tm_uvar (?u, G -> t).

  3. The missing term is inferred to be ?u G, i.e., the newly introduced meta-variable applied to all the variables in the current context.

This representation has some benefits: e.g., the solutions chosen for ?u will always be a closed term fun G -> e. This simplifies some bookkeeping, e.g., we don't have to worry about a meta-variable escaping its scope; substitutions encountering a Tm_uvar node can just stop, etc.
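For concreteness, here is a minimal OCaml sketch of this scheme over a hypothetical mini term language (the type term and the helper new_uvar_old_style are illustrative inventions, not the real FStar.Syntax.Syntax): a hole G |- _ : t becomes a meta-variable at the arrow type G -> t, applied to every name in G.

```ocaml
(* A hypothetical mini term language, for illustration only. *)
type binder = string * term                 (* a name paired with its type *)
and term =
  | Name    of string                       (* a name from the context *)
  | Arrow   of binder list * term           (* x1:t1 -> ... -> xn:tn -> t *)
  | App     of term * term list
  | Lam     of binder list * term
  | Tm_uvar of int * term                   (* old style: ?u paired with its (closed) type *)

let fresh : unit -> int =
  let r = ref 0 in
  fun () -> incr r; !r

(* Current scheme: for a hole G |- _ : t, create ?u at the arrow type G -> t and
   apply it to every name in G.  Both nodes grow linearly with the size of G. *)
let new_uvar_old_style (g : binder list) (t : term) : term =
  let u = Tm_uvar (fresh (), Arrow (g, t)) in
  App (u, List.map (fun (x, _) -> Name x) g)
```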

But these benefits are outweighed by the significant performance impact, especially when the context G is large, of always building huge abstraction and application nodes, even when the ultimate solution to the meta-variable could be something very simple, like unit.

Worse, although building the term in this form is fast and memory efficient (there's a lot of sharing), preserving sharing is difficult, and when it is inevitably broken, the size of the term explodes, with the large context G being replicated many times over. Note that the context G itself can contain other meta-variables, e.g., G = G', x:(Tm_uvar(?ux, G' -> tx)), etc.

In summary, the current representation is very pessimistic: it picks the most general form for a missing term immediately, presuming that its solution may depend on anything in the context so far. A smarter representation would be more optimistic, presuming that the solution is simple and local and unlikely to depend on much of the context, only introducing generality when it becomes necessary.

A revised representation of meta-variables

Thanks to Leonardo de Moura for many discussions that led to this redesign.

Instead of introducing a uvar at a function type abstracting over the entire context, let's introduce a uvar in a context. That is, the representation of a meta-variable is G |- ?u : t, or Tm_uvar : binders -> uvar -> typ -> term. In what follows, I'll write this as ?u_(G;t), omitting either G or t when it is only distracting.

One intuition to keep in mind is that ?u_(G;t) is really just an optimized representation for the old ?u_(G -> t) G.

Note, with this representation, we immediately give up on the invariant that the solution to a meta-variable will always be a closed term. This means that some optimizations that rely on this invariant, notably in term substitutions, will have to be removed or revised.

Remark 1: Given a Tm_uvar bs u t, the binders bs and the type t are not really part of the term. For instance, when applying a substitution s to this node, we don't need to substitute into either bs or t, since these are part of the term's context, not the term itself.
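As a rough sketch of the revised representation (again a hypothetical mini term language, not the actual F* sources), a Tm_uvar node now carries the meta-variable's context and type, and a substitution can ignore both, per Remark 1:

```ocaml
(* Revised scheme: a meta-variable carries its context and its type, G |- ?u : t. *)
type binder = string * term
and uvar = { uid : int; mutable sol : term option }
and term =
  | Name    of string
  | App     of term * term list
  | Lam     of binder list * term
  | Tm_uvar of binder list * uvar * term    (* ?u_(G;t): binders, uvar, typ *)

(* A naive single-name substitution (capture and shadowing are ignored in this
   sketch).  Per Remark 1, we do not descend into the binders or the type stored
   in a Tm_uvar node: they describe the meta-variable's context, not the term. *)
let rec subst (x : string) (v : term) (t : term) : term =
  match t with
  | Name y -> if y = x then v else t
  | Tm_uvar _ -> t
  | App (h, args) -> App (subst x v h, List.map (subst x v) args)
  | Lam (bs, body) -> Lam (bs, subst x v body)
```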

Closing substitutions and lazily introducing dependencies

Consider typing the following term in a context G: (fun (x:int) -> Nil) 17.

  1. We open the abstraction and introduce a meta-variable for the type argument of Nil, call it ?u_(G,x:int;Type).

  2. When closing the abstraction and producing its function type, we have to be careful.

If we naively represent the term as fun (x:int) -> Nil #(?u_(G,x:int; Type)) and its type as x:int -> list (?u_(G,x:int; Type)) then things will go wrong in several ways:

a. The context subscript G,x:int is no longer well-formed: contexts contain names, but the name x is not in scope in the body of the function, since our representation is locally nameless.

b. When we later type the application with argument 17, we will miss the dependence: the result type would be computed as list (?u_(G,x:int; Type)) [17/x], but this substitution is the identity, given Remark 1 above.

Instead, as we close binders, we need to explicitly introduce dependencies in any meta-variables that remain in the body of the term.

Specifically, a closing substitution for x must turn ?u_(G,x:t; s) into (?u_(G; x:t -> s)) x. In the example above, this means:

  • Producing the term: fun (x:int) -> Nil #(?u_(G; x:int -> Type) x)
  • And its type: x:int -> list (?u_(G; x:int -> Type) x)

This allows us to type the application node (correctly tracking the dependence) as:

  • list (?u_(G; x:int -> Type) 17)

This closing substitution for meta-variables can be bundled with our existing delayed substitution machinery, avoiding repeated term traversals.
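Here is a rough sketch of that closing step over the same kind of hypothetical representation; the helper close_binder is illustrative and only shows the meta-variable part of closing (turning the name into a bound index, as the locally nameless representation requires, is omitted, and binder types are not traversed):

```ocaml
type binder = string * term
and uvar = { uid : int }
and term =
  | Name    of string
  | App     of term * term list
  | Lam     of binder list * term
  | Arrow   of binder list * term
  | Tm_uvar of binder list * uvar * term    (* ?u_(G;t) *)

(* Closing the binder x:tx: a meta-variable whose context ends with x becomes a
   meta-variable in the shorter context, at an arrow type over x, applied to x:
     ?u_(G,x:tx; s)   ~~>   ?u_(G; x:tx -> s) x                                *)
let rec close_binder ((x, tx) as b : binder) (t : term) : term =
  match t with
  | Name _ -> t
  | Tm_uvar (ctx, u, s) ->
      (match List.rev ctx with
       | (y, _) :: outer when y = x ->
           App (Tm_uvar (List.rev outer, u, Arrow ([ (x, tx) ], s)), [ Name x ])
       | _ -> t)
  | App (h, args)  -> App (close_binder b h, List.map (close_binder b) args)
  | Lam (bs, body) -> Lam (bs, close_binder b body)
  | Arrow (bs, c)  -> Arrow (bs, close_binder b c)
```

In the example above, closing x:int rewrites the occurrence ?u_(G,x:int; Type) to ?u_(G; x:int -> Type) x before the lambda and the arrow are built; in practice, as noted, this would be folded into the delayed-substitution machinery rather than run as an eager traversal.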

Basic moves in the unification algorithm

Some terminology:

  • A term with a meta-variable at its head ?u e1 ... en is a 'flex' term
  • Other terms are 'rigid' terms
  • A 'pattern' is a flex term of the form ?u_(G;t) x1...xn where all the names in (G, x1, ..., xn) are pairwise distinct (a sketch of this check follows the list)
  • Note, a meta-variable ?u_(G;t) is always a pattern.
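As a sketch, assuming a hypothetical encoding where the meta-variable's context is a list of names and each application argument is either Some name or None (for a non-name argument), the pattern check is just a distinctness test:

```ocaml
let rec pairwise_distinct (xs : string list) : bool =
  match xs with
  | [] -> true
  | x :: rest -> not (List.mem x rest) && pairwise_distinct rest

(* ?u_(G;t) applied to args is a pattern when every argument is a name and the
   names of G together with those arguments are pairwise distinct. *)
let is_pattern (ctx_names : string list) (args : string option list) : bool =
  List.for_all Option.is_some args
  && pairwise_distinct (ctx_names @ List.filter_map Fun.id args)
```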

Rigid-Rigid

Unification problems of the form t =?= t' where both sides are rigid are easy to process immediately and are unchanged. We match the head symbols (after unfolding etc.) and generate sub-problems for the arguments.
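A sketch of this decomposition over a spine-shaped term (a head symbol applied to arguments); unfolding, universes, and the rest of the real machinery are elided, and the type tm and rigid_rigid are illustrative:

```ocaml
type tm = { head : string; args : tm list }

type problem = Eq of tm * tm

exception Mismatch

(* Match the head symbols and turn the arguments into sub-problems. *)
let rigid_rigid (t : tm) (t' : tm) : problem list =
  if t.head <> t'.head || List.length t.args <> List.length t'.args
  then raise Mismatch
  else List.map2 (fun a a' -> Eq (a, a')) t.args t'.args
```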

Flex-Rigid Patterns (Rigid-Flex is symmetric)

Given ?u_G x1..xn = t:

If the LHS is a pattern and fvs(t) ⊆ G, x1...xn and ?u does not occur in t, we can solve this immediately by assigning: ?u_G <- fun x1...xn -> t

Otherwise, defer this constraint if permissible, or else fail.
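A sketch of the solvable case, over the hypothetical mini term language from earlier; try_solve_flex_rigid_pattern is illustrative: in the real design the lambda would be built through the closing substitution described above, and the occurs check would also look through already-assigned meta-variables.

```ocaml
type binder = string * term
and uvar = { uid : int; mutable sol : term option }
and term =
  | Name    of string
  | App     of term * term list
  | Lam     of binder list * term
  | Tm_uvar of binder list * uvar * term

let rec occurs (u : uvar) (t : term) : bool =
  match t with
  | Name _ -> false
  | Tm_uvar (_, u', _) -> u == u'
  | App (h, args) -> occurs u h || List.exists (occurs u) args
  | Lam (_, body) -> occurs u body

let rec free_names (t : term) : string list =
  match t with
  | Name x -> [ x ]
  | Tm_uvar _ -> []
  | App (h, args) -> free_names h @ List.concat_map free_names args
  | Lam (bs, body) ->
      List.filter (fun x -> not (List.mem_assoc x bs)) (free_names body)

(* ?u_(G;_) x1..xn =?= t : solvable when fvs(t) ⊆ G, x1..xn and ?u not in t. *)
let try_solve_flex_rigid_pattern
    (u : uvar) (ctx : binder list) (xs : binder list) (t : term) : bool =
  let scope = List.map fst ctx @ List.map fst xs in
  if (not (occurs u t)) && List.for_all (fun x -> List.mem x scope) (free_names t)
  then (u.sol <- Some (Lam (xs, t)); true)
  else false
```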

Flex-Rigid Non-Pattern

Given ?u_G e1...en = t, where the LHS is not a pattern,

Defer if permissible, otherwise attempt to solve as follows:

let xs be just those elements of e1 .. en that are variables, such that G, xs are pairwise distinct.

Case Quasi-Pattern: If G, xs includes fvs(t) and ?u does not occur in t, then solve by setting ?u <- (fun x1...xn -> t), where xi is ei whenever ei is a variable in xs, and a fresh name otherwise.

Case First-orderize: If t = t' e and n > 0 then generate sub-problems ?u_G e1...e_{n-1} = t' and en = e.

Otherwise fail.
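The quasi-pattern case is essentially the pattern assignment from the previous section, restricted to the variable arguments xs. The first-orderization step can be sketched over left-nested applications (illustration only; Head, App, and first_orderize are hypothetical):

```ocaml
(* Minimal spine-shaped terms: a head, or an application of a term to one argument. *)
type term =
  | Head of string             (* a rigid head or a meta-variable, identified by name *)
  | App  of term * term        (* left-nested application *)

type problem = Eq of term * term

(* Given  ?u e1..en =?= t' e  with n > 0, peel one argument off each side:
   the new problems are  ?u e1..e_{n-1} =?= t'  and  en =?= e. *)
let first_orderize (lhs : term) (rhs : term) : problem list option =
  match lhs, rhs with
  | App (flex_prefix, e_n), App (t', e) -> Some [ Eq (flex_prefix, t'); Eq (e_n, e) ]
  | _ -> None
```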

Flex-Flex

Defer if possible. Otherwise:

Case patterns: Given patterns on both sides:
?u_(G; x1..xn -> t) x1..xn =?= ?u'_(G'; y1..ym -> t) y1..ym

Let G0 be the greatest common prefix of G, x1..xn and G', y1..ym. Solve by generating a fresh ?w_(G0; t) and setting ?u <- (fun x1..xn -> ?w) and ?u' <- (fun y1..ym -> ?w).
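The key computation here is the greatest common prefix of the two scopes; a small sketch (names only, with types elided; common_prefix is illustrative):

```ocaml
(* G0 is the greatest common prefix of the two scopes  G, x1..xn  and  G', y1..ym. *)
let rec common_prefix (scope1 : string list) (scope2 : string list) : string list =
  match scope1, scope2 with
  | x :: rest1, y :: rest2 when x = y -> x :: common_prefix rest1 rest2
  | _ -> []

(* The fresh ?w lives in the context computed here; ?u and ?u' are then both
   solved by lambdas that return ?w, as described above. *)
let g0 = common_prefix [ "a"; "b"; "x" ] [ "a"; "b"; "y" ]    (* = ["a"; "b"] *)
```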

Case not patterns:

If either side is not a pattern, coerce it to a pattern first. I.e., given

?u_(G; t) e1 ... en

Let xs be just those elements of e1 .. en that are variables, such that G, xs are pairwise distinct. Pick a fresh ?w_(G,xs; t) and set ?u <- (fun x1...xn -> ?w). After this assignment, the original flex term reduces to ?w applied to xs, which is a pattern.
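A sketch of computing xs, assuming the same hypothetical encoding as before (Some x for a name argument, None for anything else); the fresh ?w then lives in the context G, xs:

```ocaml
(* Keep just those arguments that are names while keeping  G, xs  pairwise distinct. *)
let pattern_args (ctx_names : string list) (args : string option list) : string list =
  let keep xs arg =
    match arg with
    | Some x when not (List.mem x (ctx_names @ xs)) -> xs @ [ x ]
    | _ -> xs
  in
  List.fold_left keep [] args
```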

Specifically excluded

Aside from the cases above, the current unifier includes two additional moves: imitation and projection. These are very general but can lead to large, unexpected solutions. Let's try excluding them and evaluate the impact.

Imitation:

Given ?u e1..en =?= t e1'..em'

Solve: ?u <- fun x1...xn -> t (?u1 x1..xn) ... (?um x1..xn)

New problems: ?u1 e1..en =?= e1'; ... ; ?um e1..en =?= em'

Projection:

Given ?u e1..en =?= t e1'..em'

Solve: ?u <- fun x1...xn -> x_i (?u1 x1..xn) ... (?um x1..xn)

New problems: ei (?u1 e1..en) ... (?um e1..en) =?= t e1'..em'
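To make the shape of these moves concrete, consider the illustrative problem ?u x =?= f a x, where f and a are top-level constants and x is a name in scope (this instance is not from the F* sources):

  • Imitation sets ?u <- fun x -> f (?u1 x) (?u2 x), with new problems ?u1 x =?= a and ?u2 x =?= x.
  • Projection sets ?u <- fun x -> x (?u1 x) (?u2 x), which only makes sense if x itself has a function type; the new problem is x (?u1 x) (?u2 x) =?= f a x.

Note that, in this instance, the flex-rigid pattern case above already solves the problem directly with ?u <- fun x -> f a x.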
