Skip to content

Latest commit

 

History

History
336 lines (257 loc) · 13.5 KB

type-directed-search.rst

File metadata and controls

336 lines (257 loc) · 13.5 KB

Type Directed Search :search

Idris' :search command searches for terms according to their approximate type signature (much like Hoogle for Haskell). For example:

Idris> :search e -> List e -> List e
= Prelude.List.(::) : a -> List a -> List a
Cons cell

= Prelude.List.intersperse : a -> List a -> List a
Insert some separator between the elements of a list.

> Prelude.List.delete : Eq a => a -> List a -> List a


< assert_smaller : a -> b -> b
Assert to the totality checker than y is always structurally
smaller than x (which is typically a pattern argument)

< Prelude.Basics.const : a -> b -> a
Constant function. Ignores its second argument.

The best results are listed first. As we can see, (::) and intersperse are exact matches; the = symbol to the left of those results tells us the types of (::) and intersperse are effectively the same as the type that was searched.

The next result is delete, whose type is more specific than the type that was searched; that's indicated by the > symbol. If we had a function with the signature e -> List e -> List e, we could have given it the type Eq a => a -> List a -> List a, but not necessarily the other way around.

The final two results, assert_smaller and const, have types more general than the type that was searched, and so they have < symbols to their left. For example, e -> List e -> List e would be a valid type for assert_smaller. The correspondence for const is more complicated than any of the four previous results. :search shows this result because we could change the order of the arguments! That is, the following definition would be legal:

f : e -> List e -> List e
f x xs = const xs x

About :search results

:search's functionality is based on the notion of type isomorphism. Informally, two types are isomorphic if we can identify terms of one type exactly with terms of the other. For example, we can consider the types Nat -> a -> List a and a -> Nat -> List a to be isomorphic, because if we have f : Nat -> a -> List a, then flip f : a -> Nat -> List a. Similarly, if g : a -> Nat -> List a, then flip g : Nat -> a -> List a.

With :search, we create a partial order on types; that is, given two types A and B, we may choose to say that A <= B, A >= B, or both (in which case we say A == B), or neither. For :search, we say that A >= B if all of the terms inhabiting A correspond to terms of B, but it need not necessarily be the case that all the terms of B correspond to terms of A. Here's an example:

a -> a            >=          Nat -> Nat

The left-hand type has just a single inhabitant, id, which corresponds to the term id {a = Nat}, which has the right-hand type. However, there are various terms inhabiting the right-hand type (such as S) which cannot correspond with terms of type a -> a.

We can consider the partial order for :search to be, in some sense, inductively generated by several classes of "edits" which are described below.

Possible edits

Here is a simple approximate list of the edits that are possible in :search. They are not entirely formal, and do not necessarily reflect the :search command's actual behavior. For example, the argument application rule may be used directly on arguments that are bound after other arguments, without using several applications of the argument transposition rule.

  • Argument transposition

    A : Type          B : Type           a : A, b : B  |-  M : Type

    ----------------------------------------------------------------------------(x : A) -> (y : B) -> [x,y/a,b]M == (y : B) -> (x : A) -> [x,y/a,b]M

    Score: 1 point

Example:

a -> Vect n a -> Vect (S n) a    ==    Vect n a -> a -> Vect (S n) a

Note that in order for it to make sense to change the order of arguments, neither of the arguments' types may depend on the value bound by the other argument!

  • Symmetry of equality

    A = B : Type      t : Type  |-  M : Type
    ----------------------------------------
     [A = B/t]M        ==       [B = A/t]M

    Score: 1 point

Example:

(x,y,z : Nat) -> x + (y + z) = (x + y) + z
                   ==
(x,y,z : Nat) -> (x + y) + z = x + (y + z)

Note that this rule means that we can flip equalities anywhere they occur (i.e., not only in the return type).

  • Argument application

    e : A  |-  M : Type        y1 : T1, ..., yn : Tn  |-  x : A

    -----------------------------------------------------------------(z : A) -> [z/e]M >= (y1 : T1) -> ... -> (yn :Tn) -> [x/e]M

    Score: <= : 3 points, >= : 9 points

Examples:

a -> a            >=    Nat -> Nat
a -> a            >=    List e -> List e
Vect k (Fin k)    >=    Vect 5 (Fin 5)

Note that the n shown in the scheme above may be 0; that is, there are no Pi terms to be added on the right side. For example, that's the case for the first example shown above. This is probably the most important, and most widely used, rule of all.

  • Type class application

    C : Type -> TypeClass
    ,   y1 : T1, ..., yn : Tn  |-  A : Type, instance : C A
    ,   t : Type  |- M : Type
    -----------------------------------------------------------------
    C a => [a/t]M     >=      (y1 : T1) -> ... -> (yn : Tn) -> [A/t]M

    Score: <= : 4 points, >= : 12 points

Examples

Ord a => a                           >=    Int
Show (List e) => List e -> String    >=    Show a => List a -> String

This rule is used by looking at the instances for a particular type class. While the scheme is shown only for single-parameter type classes, it naturally generalizes to multi-parameter type classes. This rule is particularly useful in conjunction with argument application. Again, note that the n in the scheme above may be 0.

  • Type class introduction

    t : Type |- M : Type      C : Type -> TypeClass
    -----------------------------------------------
    (t : Type) -> M      >=      C t => M

    Score: <= : 2 points, >= : 6 points

Example:

a -> a -> Bool    >=    Eq a => a -> a -> Bool

Scoring and listing search results

When a type S is searched, the type is compared against the types of all of the terms which are currently in context. When :search compares two types S and T, it essentially tries to find a chain of inequalities

R1    R2     Rn    Rn+1

S <= A1 <= ... <= An <= T

using the edit rules listed above. It also tries to find chains going the other way (i.e., showing S >= T) as well. Each rule has an associated score which indicates how drastic of a change the rule corresponds to. These scores are listed above. Note that for the rules which are not symmetric, the score depends on the direction in which the rule is used. Finding types which are more general that the searched typed (S <= T) is preferred to finding types which are less general.

The score for the entire chain is, at minimum, the sum of the scores of the individual rules (some non-linear interactions may be added). The :search function tries to find the chain between S and T which results in the lowest score, and this is the score associated to the search result for T.

Search results are listed in order of ascending score. The symbol which is shown along with the search result reflects the type of the chain which resulted in the minimum score.

Practically, naive and undirected application of the rules enumerated above is not possible; not only is this obviously inefficient, but the two application rules (particularly argument application) are really impossible to use without context given by other types. Therefore, we use a heuristic algorithm that is meant to be practical, though it might not find ways to relate two types which may actually be related by the rules listed above.

Suppose we wish to match two types, S and T. We think of the problem as a non-deterministic state machine. There is a State datatype which keeps track of how well we've matched S and T so far. It contains:

  • Names of argument variables (Pi-bound variables) in either type which have yet to be matched
  • A directed acyclic graph (DAG) of arguments (Pi-bindings) for S and T which have yet to be matched
  • A list of typeclass constraints for S and T which have yet to be matched
  • A record of the rules which have been used so far to get to this point

A function nextSteps : State -> [State] finds the next states which may follow from a given state. Some states, where everything has been matched, are considered final. The algorithm can be roughly broken down into multiple stages; if we start from having two types, S and T, which we wish to match, they are as follows:

  1. For each of S and T, split the types up into their return types and directed acyclic graphs of the arguments, where there is an edge from argument A to argument B if the term bound in A appears in the type of B. The topological sorts of the DAG represent all the possible ways in which the arguments may be permuted.
  2. For type T, recursively find (saturated) uses of the = type constructor and produce a list of modified versions of T containing all possible flips of the = constructor (this corresponds to the symmetry of equality rule).
  3. For each modified type for T, try to unify the return type of the modified T with S, considering arguments from both S and T to be holes, so that the unifier may match pieces of the two types. For each modified version of T where this succeeds, an initial State can be made. The arguments and typeclasses are updated accordingly with the results of unification. The remainder of the algorithm involves applying nextSteps to these states until either no states remain (corresponding to no path from S to T) or a final state is found. nextSteps also has several stages:
  4. Try to unify arguments of S with arguments of T, much like is done with the return types. We work "backwards" through the arguments: we try matching all remaining arguments of S which lack outgoing edges in the DAG of remaining arguments (that is, the bound value doesn't appear in the type of any other remaining arguments) with the all of the corresponding remaining arguments of T. This is done recursively until no arguments remain for both S and T; otherwise, we give up at this point. This step corresponds to application of the argument application rule, as well as the argument transposition rule.
  5. Now, we try to match the type classes. First, we take all possible subsets of type class constraints for S and T. So if S and T have a total of n type class constraints, this produces 2^n states for every state, and this quickly becomes infeasible as n grows large. This is probably the biggest bottleneck of the algorithm at the moment. This step corresponds to applications of the type class introduction rule.
  6. Try to match type class constraints for S with those for T. We attempt to unify each type class constraint for S with each constraint for T. This may result in applications of the type class application rule. Once we are unable to match any more type class constraints between S and T, we proceed to the final step.
  7. Try instantiating type classes with their instances (in either S or T). This corresponds to applications of the type class application rule. After instantiating a type class, we hopefully open up more opportunities to match typeclass constraints of S with those of T, so we return to the previous step.

The code for :search is located in the Idris.TypeSearch module.

Aggregating results

The search for chains of rules/edits which relate two types can be viewed as a shortest path problem where nodes correspond to types and edges correspond to rules relating two types. The weights or distances on each edge correspond to the score of each rule. We then may imagine that we have a single start node, our search type S, and several final nodes: all of the types for terms which are currently in context. The problem, then, is to find the shortest paths (where they exist) to all of the final nodes. In particular, we wish to find the "closest" types (those with the minimum score) first, as we'd like to display them first.

This problem nicely maps to usage of Dijkstra's algorithm. We search for all types simultaneously so we can find the closest ones with the minimum amount of work. In practice, this results in using a priority queue of priority queues. We first ask "which goal type should we work on next?", and then ask "which state should we expand upon next?" By using this strategy, the best results can be shown quickly, even if it takes a bit of time to find worse results (or at least rule them out).

Miscellaneous Notes

Whether arguments are explicit or implicit does not affect search results.