Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thoughts on introducing a minimum amount of type inference in Dhall #1370

Open
winitzki opened this issue Nov 30, 2023 · 1 comment
Open

Comments

@winitzki
Copy link
Contributor

winitzki commented Nov 30, 2023

Not sure how easy it would be to back-port "Fall-from-Grace" type inference to Dhall. But I had this idea: We can perhaps proceed in small steps and introduce a small amount of type inference that won't break things and won't hurt either.

So far I came up with some places where type inference could be added without introducing a breaking change:

  1. For any functions f of type ∀(A : Type) → B → C we may apply f (A: Type) (x: B) but we may also just write f (x: B), omitting the type argument. This should type-check only if knowing the type of x allows us to infer A. If so, we rewrite f x to f A x during the beta-normalization step. The normal form of f x is the beta-reduction of f A x.
  2. Rule 1 is applied transitively to curried functions. So, functions with type signatures f : ∀(A : Type) → ∀(B : Type) → C → D could be applied as f (x: C) omitting the two type arguments if they can be inferred. If so, we rewrite f x to f A B x during the beta-normalization step. The normal form of f x is the beta-reduction of f A B x.
  3. Allow the syntax λx → b as well as the standard λ(x : A) → b if the type of x can be inferred from the type of b without knowing anything else (other than the current type inference context that existed when we started type-checking λx → b).

I think about this as a "one-shot" type inference. It is unidirectional and quite limited in scope. There are no unification variables that we need to carry around. At every step, we infer one or more type parameters immediately and can rewrite our expression while preserving correctness.

Initially, this type inference may be restricted to "fully unambiguous" cases, that is, when no polymorphism is needed within the types that we infer.

So, for example, λx → x + 1 would be allowed but not λx → λy → x y where the inferred types of x and y have to be polymorphic (x would have the inferred type ∀(A : Type) → ∀(B : Type) → A → B).

Also, a function f x will be rewritten to f A x (or f A B x, etc.) only in cases when A, (B, ...) can be inferred unambiguously from knowing the type of x and nothing else.

The result will be that type inference will work for functions where value parameters come immediately after type parameters they use. Many Dhall functions have type signatures of that form. For example:

List/zip : (a : Type)  List a  (b : Type)  List b  List { _1 : a, _2 : b }

In those cases, we would be able to apply List/zip simply as List/zip [1, 2, 3] ["a", "b", "c"] without any type applications.

At a later step, we might add more comprehensive type inference where a function application f x y ... z is rewritten to f A ... B x C ... D y E ... F z when the type arguments A, B, ..., F inferred from knowing the combination of types of x, y, z. We might also add type inference for parameters of type Type → Type and higher kinds. But the initial stage is just to see if it is practical to infer simple type parameters.

With this proposal, there are no changes to the parser, no new syntax, no new restrictions, and no changes in existing normal forms. No existing programs become invalid, just some new programs would become valid.

@winitzki
Copy link
Contributor Author

winitzki commented May 24, 2024

I've thought some more on implementing some very simple and limited type inference. Here is what I believe to be the lowest-hanging-fruit case, which I call a "one-place" type inference:

The change concerns only functions of the form f : ∀(A : Type) → B → C. Normally we need to apply such functions as f A x where x : B (and the type expression B can depend on the type A). Now, we want to be able to write simply f x, and we want A to be inferred automatically. We want Dhall to rewrite f x to f A x automatically, with appropriately chosen A.

We know the type expression B (as function of A) and we also know the actual type of x. The type argument A is inferred by matching the type expression B with the actual type of x. For example, if B = { name : Text, value : A } and the type of x is { name : Text, value : Natural } then we will infer A = Natural. But if the type of x is { name : Natural, value : Natural } then the pattern does not match and it is a type error.

If we succeed in finding A, we fill it in, rewriting f x to f A x. The we continue further type-checking, beta-reductions, and so on, with the new type argument automatically filled in the right place. This is the "single-place" type inference.

This requires the following changes to the language definition:

  • The typechecker previously was a function from Γ and t to an inferred type T. The judgment Γ ⊢ t : T had only the type T as output. Now, we will sometimes need to rewrite the input expression t and fill in some type arguments in it. So, we need the typing judgment to be Γ ⊢ t : T , t₁ where the new output t₁ is a rewritten expression with some type arguments filled in.
  • The existing typing judgment Γ ⊢ t : T will have to be replaced by the new judgment Γ ⊢ t : T , t₁ , as I outline below.
  • Once the judgment Γ ⊢ t : T , t₁ is established, all further judgments (type checking, beta reduction, etc.) will need to use t₁ instead of t. The old expression t should never be used again.

This change affects many typing judgments because expressions may be rewritten after a typecheck. So far, rewriting affects only value expressions (but in the future we may also rewrite type expressions).

This change will enable type inference in many cases where value arguments allow us to infer the preceding type arguments. An example is List/length, which has the type signature ∀(A : Type) → List A → Natural. This is of the form ∀(A : Type) → B → C. The single-step and so we will be able to write List/length [1, 2, 3].

An example where this does not immediately work is List/map : ∀(A : Type) → ∀(B : Type) → ∀(f : A → B) → List A → List B. To apply List/map, we currently have to write List/map A B f xs. However, we now have two type arguments, and the "single-place" algorithm will not be able to infer both of them at once. We will have to write A explicitly; the single-place algorithm will be able to infer B automatically. To mitigate this problem, we might want to change the type signature to List/map1 : ∀(A : Type) → List A → ∀(B : Type) → ∀(f : A → B) → List B. Then the single-place algorithm will be able to infer all type arguments.

Here is how the typing judgment needs to be modified:

When type-checking a function application (f x), we first type-check f and x separately. Then it could be that f has type ∀(x : A₀) → B₀ and the argument x has type A₀. (Both f and x could be rewritten after type-checking.) Then we proceed as in the standard: https://github.com/dhall-lang/dhall-lang/blob/master/standard/type-inference.md#functions However, we also output the expression t₁ because now the typing judgment must be of the form Γ ⊢ t : T , t₁, and we need to honor rewriting after each type-checking.

Γ ⊢ f : ∀(x : A₀) → B₀ , f₁
Γ ⊢ a₀ : A₁ , a₁
A₀ ≡ A₁
↑(1, x, 0, a₁) = a₂
B₀[x ≔ a₂] = B₁
↑(-1, x, 0, B₁) = B₂
B₂ ⇥ B₃
t₁ = f₁ a₁
──────────────────────
Γ ⊢ f a₀ : B₃ , t₁

If the equivalence check A₀ ≡ A₁ fails, we do not yet fail this judgment. Instead, we do something like this:

Γ ⊢ f : ∀(A : Type) → ∀(x : B₀) → C₀ , f₁
Γ ⊢ b₀ : B₁ , b₁
↑(1, A, 0, B₁) = B₂   ; in case A was free in B₁
B₀[A ≔ T] ≡ B₂        ; find T such that B₀[A ≔ T] equals B₂
Γ ⊢ f₁ T b₁ : B₃ , t₁    ; fill in the type T and type-check again
──────────────────────
Γ ⊢ f b₀ : B₃ , t₁

Here the judgment B₀[A ≔ T] ≡ B₂ is not a simple equivalence check but a pattern match that tries to find the type T that will make types equal. At this point, we are only looking for a single unknown type T.

Maybe it will be easier if there are two kinds of typing judgments: one that has no rewriting (the current one) and another that has rewriting.

I must say, this is a bit discouraging. Adding even a tiny bit of type inference (just to support the simplest use case, List/length [1, 2, 3]) will require a massive overhaul of the entire typechecking algorithm, where many places will have to honor a possible rewriting. Every time we have Γ ⊢ t : T we will now need to have Γ ⊢ t : T , t₁. We will need to change the algorithm to use t₁ in all later judgments.

Is there a simpler approach to type inference?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant