Inductive datatypes

This is a collection of remarks regarding the implementation of inductive datatypes.

Kernel doesn't need to support mutually recursive datatypes

Mutually recursive datatypes can be simulated using indexed families. For example, suppose we want to define

inductive tree (A : Type) :=
| leaf : A → tree A
| node : tree_list A → tree A
with tree_list :=
| nil  : tree_list A
| cons : tree A → tree_list A

We can define these two types in the following way using an indexed family.

inductive tree_core (A : Type) : bool → Type :=
| nil  : tree_core A ff
| cons : tree_core A tt → tree_core A ff → tree_core A ff
| leaf : A → tree_core A tt
| node : tree_core A ff → tree_core A tt

-- Define types
definition tree_list (A : Type) := tree_core A ff
definition tree (A : Type) := tree_core A tt
-- Define constructors
definition tree_list.nil (A : Type) : tree_list A := tree_core.nil A
definition tree_list.cons {A : Type} (h : tree A) (t : tree_list A) : tree_list A := tree_core.cons h t
definition tree.leaf {A : Type} (a : A) : tree A := tree_core.leaf a
definition tree.node {A : Type} (l : tree_list A) := tree_core.node l
-- Define recursors
definition tree.rec {A : Type} {C₁ : tree A → Type} {C₂ : tree_list A → Type}
                    (h₁ : C₂ (tree_list.nil A))
                    (h₂ : ∀ (h : tree A) (t : tree_list A), C₁ h → C₂ t → C₂ (tree_list.cons h t))
                    (h₃ : ∀ (a : A), C₁ (tree.leaf a))
                    (h₄ : ∀ (l : tree_list A), C₂ l → C₁ (tree.node l))
                    (t : tree A) : C₁ t :=
@tree_core.rec A (λ (k : bool), bool.rec_on k (λ t, C₂ t) (λ t, C₁ t))
  h₁
  begin intro h t, esimp, intro ih₁ ih₂,  exact h₂ h t ih₁ ih₂ end
  (λ a, h₃ a)
  begin intro l, esimp, intro ih, exact h₄ l ih end
tt t

definition tree_list.rec
  {A : Type} {C₁ : tree A → Type} {C₂ : tree_list A → Type}
  (h₁ : C₂ (tree_list.nil A))
  (h₂ : ∀ (h : tree A) (t : tree_list A), C₁ h → C₂ t → C₂ (tree_list.cons h t))
  (h₃ : ∀ (a : A), C₁ (tree.leaf a))
  (h₄ : ∀ (l : tree_list A), C₂ l → C₁ (tree.node l))
  (l : tree_list A) : C₂ l :=
@tree_core.rec A (λ (k : bool), bool.rec_on k (λ t, C₂ t) (λ t, C₁ t))
  h₁
  begin intro h t, esimp, intro ih₁ ih₂,  exact h₂ h t ih₁ ih₂ end
  (λ a, h₃ a)
  begin intro l, esimp, intro ih, exact h₄ l ih end
ff l

-- Computational rules hold definitionally
universes l
variables
  {A : Type} {C₁ : tree A → Type.{l}} {C₂ : tree_list A → Type.{l}}
  (h₁ : C₂ (tree_list.nil A))
  (h₂ : ∀ (h : tree A) (t : tree_list A), C₁ h → C₂ t → C₂ (tree_list.cons h t))
  (h₃ : ∀ (a : A), C₁ (tree.leaf a))
  (h₄ : ∀ (l : tree_list A), C₂ l → C₁ (tree.node l))

example : @tree_list.rec A C₁ C₂ h₁ h₂ h₃ h₄ (tree_list.nil A) = h₁ :=
rfl

example (t : tree A) (l : tree_list A) :
  @tree_list.rec A C₁ C₂ h₁ h₂ h₃ h₄ (tree_list.cons t l) = 
  h₂ t l (@tree.rec A C₁ C₂ h₁ h₂ h₃ h₄ t) (@tree_list.rec A C₁ C₂ h₁ h₂ h₃ h₄ l) :=
rfl

example (a : A) : @tree.rec A C₁ C₂ h₁ h₂ h₃ h₄ (tree.leaf a) = h₃ a :=
rfl

example (l : tree_list A) : 
  @tree.rec A C₁ C₂ h₁ h₂ h₃ h₄ (tree.node l) = 
  h₄ l (@tree_list.rec A C₁ C₂ h₁ h₂ h₃ h₄ l) :=
rfl

So, mutually recursive datatypes can be simulated on top of a simpler kernel. We will simplify the Lean kernel and remove this feature.

Encoding datatypes that contain recursive occurrences nested in existing datatypes.

It would be nice to be able to write

inductive tree (A : Type) :=
| leaf : A → tree A
| node : list (tree A) → tree A

The kernel does not accept this definition. So, users have to write

inductive tree (A : Type) :=
| leaf : A → tree A
| node : tree_list A → tree A
with tree_list :=
| nil  : tree_list A
| cons : tree A → tree_list A → tree_list A

This is not just a minor inconvenience because the list lemmas cannot be used with tree_list. One trick is to define a bijection between list (tree A) and tree_list A.

inductive tree (A : Type) :=
| leaf : A → tree A
| node : tree_list A → tree A
with tree_list :=
| nil  : tree_list A
| cons : tree A → tree_list A → tree_list A

definition to_list {A : Type} : tree_list A → list (tree A)
| (tree_list.nil A)    := list.nil
| (tree_list.cons a l) := list.cons a (to_list l)

definition of_list {A : Type} : list (tree A) → tree_list A
| list.nil        := tree_list.nil A
| (list.cons a l) := tree_list.cons a (of_list l)

lemma to_list_of_list {A : Type} : ∀ l : list (tree A), to_list (of_list l) = l
| list.nil        := rfl
| (list.cons a l) :=
  begin
   change list.cons a (to_list (of_list l)) = list.cons a l,
   rewrite (to_list_of_list l)
  end

lemma of_list_to_list {A : Type} : ∀ l : tree_list A, of_list (to_list l) = l
| (tree_list.nil A)    := rfl
| (tree_list.cons a l) :=
  begin
    change tree_list.cons a (of_list (to_list l)) = tree_list.cons a l,
    rewrite (of_list_to_list l)
  end

Then, we can define a "simulation" layer. First, the "constructors":

definition tlist.leaf {A : Type} (a : A) : tree A := tree.leaf a
definition tlist.node {A : Type} (c : list (tree A)) : tree A := tree.node (of_list c)

Then, we define the "simulated" recursors:

definition tlist.cases_on {A : Type} {C : tree A → Type}
                          (t : tree A)
                          (h₁ : ∀ a : A,             C (tlist.leaf a))
                          (h₂ : ∀ c : list (tree A), C (tlist.node c))
                          : C t :=
tree.cases_on t
  begin intro a, exact h₁ a end
  begin intro c, exact eq.rec_on (of_list_to_list c) (h₂ (to_list c)) end

The main problem is that some of the expected computational rules for tlist.cases_on do not hold definitionally. Example: @tlist.cases_on A C (tlist.node c) h₁ h₂ is not definitionally equal to h₂ c. These two terms are provably equal.

example (c : list (tree A)) : @tlist.cases_on A C (tlist.node c) h₁ h₂ = h₂ c :=
have ∀ H : of_list (to_list (of_list c)) = of_list c, @@eq.rec (λ (a : tree_list A), C (tree.node a)) (h₂ (to_list (of_list c))) H = h₂ c, from
 begin rewrite to_list_of_list, intro e, reflexivity end,
this (of_list_to_list (of_list c))

Here are some issues for this approach:

Code generators. We don't want to keep converting back and forth between list (tree A) and tree_list A. The code generator should be smart enough to handle tlist.cases_on efficiently.
Definitional package. One of our current TODO lists is to generate induction principles for recursive definitions. The definitional package would not be able to assume that the tlist.cases_on computational rule holds definitionally.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inductive datatypes

Kernel doesn't need to support mutually recursive datatypes

Encoding datatypes that contain recursive occurrences nested in existing datatypes.

Clone this wiki locally