Add type-based termination checker #7152

knisht · 2024-02-27T14:32:32Z

This is a technical description of the pull request.
For an introductory overview of type-based termination, see the user manual.
For an in-depth discussion of the semantics, see the paper.

The development is pretty much ready to play with, and I welcome any tricky examples you have (including mutual induction-coinduction).

Overview

This PR introduces three options: --type-based-termination (disabled by default), --syntax-based-termination (enabled by default), --size-preservation (enabled by default).

If a module is type-checked with --type-based-termination, all definitions in it are encoded. Encoding must run after polarity checking; currently it is invoked at the beginning of termination checking. During encoding, the internal types of all definitions are converted to sized types (not to be confused with Agda's existing sized types; from here on, the expression "sized types" refers to a separate concept local to this PR), which form a variant of System Fω. Sized types are stored alongside constants and can be serialized.

If termination checking runs for a function definition under --type-based-termination, an alternative termination certificate (alternative to the usual structural recursion) may be produced. Each clause of the function definition is bidirectionally type-checked against the encoded sized type, and each recursive call gives rise to a CallMatrix based on the relations between sized types. As in the classic termination checker, the size-change principle is then applied to the resulting set of matrices to establish that the function is strongly normalizing.
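To make the last step concrete, here is a minimal Python sketch of the size-change criterion over call matrices. It is not the code from this PR or from Agda's existing checker: the three-valued order (`UNKNOWN`, `LE`, `LT`), the matrix layout (rows are callee arguments, columns are caller parameters), and the names `compose` and `terminates` are assumptions chosen for illustration.

```python
from itertools import product

# Assumed three-valued order: no information, callee arg <= caller param,
# callee arg < caller param. Agda's real order is richer than this.
UNKNOWN, LE, LT = 0, 1, 2


def compose(second, first):
    """Matrix for the call sequence `first`, then `second`.

    Entry [i][j] relates argument i of the final callee to parameter j
    of the original caller.
    """
    def seq(a, b):
        # Chaining two relations: unknown if either is unknown,
        # strict if at least one of them is strict.
        return UNKNOWN if UNKNOWN in (a, b) else max(a, b)

    inner, cols = len(first), len(first[0])
    return tuple(
        tuple(
            max((seq(row[k], first[k][j]) for k in range(inner)), default=UNKNOWN)
            for j in range(cols)
        )
        for row in second
    )


def terminates(calls):
    """calls: set of (caller, callee, matrix) triples, matrices as tuples of tuples."""
    closure = set(calls)
    while True:
        # Compose every pair of calls that can follow one another.
        new = {
            (f, h, compose(b, a))
            for (f, g, a), (g2, h, b) in product(closure, closure)
            if g == g2
        } - closure
        if not new:
            break
        closure |= new
    # Every idempotent self-call must strictly decrease some argument.
    return all(
        any(m[i][i] == LT for i in range(len(m)))
        for f, g, m in closure
        if f == g and compose(m, m) == m
    )


# Ackermann-style calls: first argument decreases, or it is kept and the second decreases.
ack_calls = {
    ("ack", "ack", ((LT, UNKNOWN), (UNKNOWN, UNKNOWN))),
    ("ack", "ack", ((LE, UNKNOWN), (UNKNOWN, LT))),
}
# f x y = f y x: the arguments are only swapped, nothing decreases.
swap_calls = {("f", "f", ((UNKNOWN, LE), (LE, UNKNOWN)))}

print(terminates(ack_calls), terminates(swap_calls))
```

In this sketch the Ackermann-style calls are accepted and the argument swap is rejected, because composing the swap with itself yields an idempotent matrix with no strict decrease on the diagonal.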

There is also a size-preservation feature, controlled by --size-preservation. Roughly, the implementation works as follows: for a function signature, the checker tries to identify each output size with each input size, obtaining a modified sized type. If the function still type-checks against the modified sized type, the input and output sizes are in fact the same, and this information is serialized later.
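The search described above can be pictured as a simple loop over candidate identifications. The following Python sketch is only an illustration under assumptions: `still_type_checks` is a hypothetical stand-in for re-checking the function against the modified sized type, size variables are plain strings, and accumulating successful identifications is a guess about how the candidates are combined.

```python
def preserved_sizes(input_sizes, output_sizes, still_type_checks):
    """Return {output_size: input_size} for identifications that survive re-checking.

    `still_type_checks(substitution)` is a hypothetical oracle: it should return
    True when the function still type-checks after the substitution is applied
    to its sized signature.
    """
    preserved = {}
    for out in output_sizes:
        for inp in input_sizes:
            candidate = {**preserved, out: inp}
            if still_type_checks(candidate):
                preserved[out] = inp
                break  # this output size is settled, move on to the next one
    return preserved


# Toy oracle for a filter-like signature: the output size "j" may be
# identified with the input size "i", and nothing else is accepted.
def toy_oracle(substitution):
    return substitution == {"j": "i"}


print(preserved_sizes(["i"], ["j"], toy_oracle))
```

In the worst case this performs one re-check per pair of input and output sizes.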

There should be no problematic interaction with any other feature: in the current implementation, sized types "live in their own world", which makes the whole process similar to the double checker. Additionally, the algorithm fails if it encounters behavior it does not understand (such as pattern-matching in the presence of univalence), so it would rather give up than produce a dubious termination certificate.

Performance

The measurements were taken on an M2 MacBook.

On the standard library, the time measurement produces the following results:

Total                                 275,005ms                 
Typing                                 15,200ms (99,964ms)
TypeBasedTermination                      305ms  (5,639ms)
TypeBasedTermination.SizeTypeChecking   2,735ms           
TypeBasedTermination.SizeGraphSolving   1,060ms           
TypeBasedTermination.SizeTypeEncoding   1,035ms           
TypeBasedTermination.PatternRigids        283ms           
TypeBasedTermination.Preservation         119ms           
TypeBasedTermination.Matrix               100ms           
Termination                               377ms  (2,617ms)
Termination.RecCheck                    2,166ms           
Termination.Compare                        73ms

Without type-based termination, termination checking takes 3,530ms.

On graded type theory, the results are of similar proportion:

Total                                 1,430,003ms            
Typing                                   60,738ms (742,448ms)                 
TypeBasedTermination                        120ms  (34,736ms)
TypeBasedTermination.Matrix              12,595ms            
TypeBasedTermination.SizeTypeChecking     9,050ms            
TypeBasedTermination.SizeGraphSolving     8,385ms            
TypeBasedTermination.SizeTypeEncoding     4,029ms            
TypeBasedTermination.PatternRigids          542ms            
TypeBasedTermination.Preservation            13ms                
Termination                                 829ms   (4,813ms)
Termination.RecCheck                      2,878ms            
Termination.Compare                         794ms            
Termination.Graph                           310ms

Without type-based termination, termination checking takes 31,342ms.
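As a quick cross-check of "similar proportion", the share of total time spent in type-based termination can be computed from the two profiles above. The sketch assumes that the parenthesized figures are cumulative totals for an entry and its sub-entries, which is consistent with the sub-entries summing to approximately those values; the variable names are illustrative.

```python
# Cumulative timings in milliseconds, copied from the two profiles above.
profiles = {
    "standard library": {"total": 275_005, "type_based": 5_639},
    "graded type theory": {"total": 1_430_003, "type_based": 34_736},
}


def share_percent(part_ms, total_ms):
    """Percentage of total_ms accounted for by part_ms."""
    return 100.0 * part_ms / total_ms


shares = {
    name: round(share_percent(p["type_based"], p["total"]), 2)
    for name, p in profiles.items()
}
for name, pct in shares.items():
    print(f"{name}: {pct}% of total time in type-based termination")
```

Under that reading, type-based termination accounts for about 2.05% of the total time on the standard library and about 2.43% on graded type theory.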

One important note is that for graded type theory I had to disable size preservation, which is currently implemented very naïvely. This is an area of future work.

Overall, the performance impact is acceptable. On large projects, matrix solving starts to take most of the time, but the syntax-based termination checker would also suffer from that. I suggest that, in the absence of soundness issues, the type-based termination checker may be enabled by default in future releases.

knisht · 2024-02-29T18:16:00Z

Regarding #1209: my termination checker does not permit these functions.

For example, in #1209 (comment), the problematic function looks like

inh : Stream D
force inh = lim inh , inh

The occurrence of inh inside lim inh fails the termination check: lim : Stream D → D is a constructor that accepts a stream of infinite depth, so (due to contravariance) inh would itself have to be at least an infinite stream.

src/full/Agda/Termination/TypeBased/Checking.hs

cmcmA20 · 2024-03-14T10:39:33Z

Can you also elaborate on interactions with --cubical, --erased-cubical and --erased-matches?
I couldn't manage to reproduce #5910 with --type-based-termination + --no-syntax-based-termination.

knisht · 2024-03-14T10:51:37Z

I am not aware of any problems connected with --cubical. The core principle here is that my checker tries to resemble System F, on which it is based, and if something strange starts happening (like univalence), it fails to proceed. It also means that a terminating function that involves univalence and is accepted by the syntax-based checker would likely be rejected by the type-based checker.

As for #5910: the pattern-matching on <-> unifies two types, which is quite difficult to fit into System F.

My termination checker also correctly rejects #3883 (comment), because it does not consider Bad a recursive datatype, hence b is not a size-increasing constructor.

Thank you for the comments. I think it makes sense to clarify these points in the user manual as well.

andreasabel · 2024-03-20T14:34:33Z

Discussion Agda dev meeting 2024-03-24.

@knisht will push his latest changes
some parts of this PR should be reviewed: documentation, test cases
merge it into master so we can easily test it before 2.7.0

jespercockx

Thank you again for all the work. I reviewed the changelog and the user manual, as requested.

CHANGELOG.md

jespercockx · 2024-03-24T08:56:03Z

CHANGELOG.md

@@ -56,6 +56,26 @@ Additions to the Agda syntax.
  As in a `with`, multiple bindings can be separated by a `|`, and variables to
  the left are in scope in bindings to the right.

+* Type-based termination checker
+
+  Agda is now able to understand polymorphic functions during checking for structural recursion.


It's not clear to me what it means to "understand polymorphic functions". Could you give a concrete example of the benefit here (apart from size preservation)?

jespercockx · 2024-03-24T08:57:39Z

CHANGELOG.md

+  qsort cmp (cons x xs) = qsort cmp (filter (cmp x) xs) ++ cons x (qsort cmp (filter (λ y → cmp y x) xs))
+  ```
+
+  Type-based termination checking also works for coinduction, which improves the guardedness predicate.


Again, one minimal example of something that used to fail would help here.

jespercockx · 2024-03-24T08:58:27Z

src/data/lib/prim/Agda/Builtin/Int.agda

@@ -1,4 +1,4 @@
-{-# OPTIONS --cubical-compatible --safe --no-sized-types --no-guardedness --level-universe #-}
+{-# OPTIONS --cubical-compatible --safe --no-sized-types --no-guardedness --level-universe --type-based-termination #-}


Type-based termination is not necessary for checking the builtin files, so why did you add the flag here?

The type-based termination checker needs to preprocess existing datatypes before using them in the actual checking process. If there is no flag, the preprocessing does not happen.

This is a behavior I would like to change: it is easy to forget to provide the flag for some definition, and then that definition will be useless to the checker. The preprocessing itself is very cheap and syntax-based, so it might make sense to enable it by default for all files. The only drawback I see is that the preprocessing may be unstable, since it is new.

Another solution is to make --type-based-termination coinfective, which I don't like at all.

src/full/Agda/Interaction/Options/Base.hs

jespercockx · 2024-03-24T09:33:51Z