feat(library/init/data/list): add array_list #1438
def write {α} (l : array_list α) : fin l^.length → α → array_list α :=
λ ⟨n, h⟩ v, { l with data := l^.data^.write ⟨n, l^.lt_capacity h⟩ h v }

The VM will not be able to perform a destructive update on l^.data even if the reference counter to l is 1 and it owns data. This structure update is currently desugared into
array_list.mk l^.capacity l^.length (l^.data^.write ⟨n, l^.lt_capacity h⟩ h v) l^.len_le
In the current code generator we evaluate arguments from left to right.
So, after we execute l^.data, the array has at least two references to it: one on the stack, and one from l. So, a destructive update is not performed.
To make sure we can perform a destructive update at run time, we need to write this function using pattern matching:
def write {α} : ∀ (l : array_list α), fin l^.length → α → array_list α
| ⟨cap, len, data, hle⟩ ⟨n, h⟩ v :=
⟨cap, len, data^.write ⟨n, nat.lt_of_lt_of_le h hle⟩ h v, hle⟩
We can inspect whether we are performing destructive updates or not at run time by using the option trace.array.update.
set_option trace.array.update true
vm_eval ((list.to_array_list [1, 2, 3])^.write (fin.of_nat 1) 10)^.read (fin.of_nat 1)
I have considered using match-with to compile structure updates, but this has nasty consequences.
For example, in the current approach we have the following definitional equality:
{ l with data := ...}^.capacity =?= l^.capacity
That is, the projection of a non-updated field is definitionally equal to the same projection before the update.
This property is essential when we are building hierarchies of type classes (e.g., algebra).
If we use match-with to compile structure updates, then these two terms will be provably equal but not definitionally equal.
This is why I want to support lenses in Lean (#1431). With lenses, we can perform this kind of update in a much more convenient way and make sure a destructive update will be performed at run time if the reference counters are 1. Note that, in the lenses definition we will have to use a match-with. Otherwise we will have the same problem.
That being said, I realize that it would be great to have a mechanism to statically enforce that the reference counters are 1 at run time. To be able to do this, we would have to add linear types to Lean.
We have considered this, but the complexity is too high and they are super inconvenient to use.
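
The definitional-equality point above can be checked on a toy structure. The sketch below is not from the PR: `point`, `set_y`, and `set_y_match` are hypothetical names, and it assumes the Lean 3 syntax used elsewhere in this thread.

```lean
structure point :=
(x : nat) (y : nat)

-- Structure update: the untouched field is rebuilt from a projection of `p`.
def set_y (p : point) (v : nat) : point :=
{ p with y := v }

-- The same update written with pattern matching on the constructor.
def set_y_match : point → nat → point
| ⟨x, _⟩ v := ⟨x, v⟩

-- Holds by `rfl`: the projection of the non-updated field reduces to `p^.x`.
example (p : point) (v : nat) : (set_y p v)^.x = p^.x := rfl

-- `rfl` alone is not expected to work here, because the match cannot reduce
-- while `p` is a variable; a case split on `p` makes the goal provable.
example (p : point) (v : nat) : (set_y_match p v)^.x = p^.x :=
begin cases p, reflexivity end
```

This mirrors the "provably equal but not definitionally equal" distinction: the second example needs `cases`, the first does not.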
@leodemoura I tried your version and a few others:
def write2 {α} : ∀ (l : array_list α), fin l^.length → α → array_list α
| ⟨cap, len, data, hle⟩ ⟨n, h⟩ v :=
⟨cap, len, data^.write ⟨n, nat.lt_of_lt_of_le h hle⟩ h v, hle⟩
def write3 {α} (l : array_list α) : fin l^.length → α → array_list α
| ⟨n, h⟩ v :=
match l, h with
⟨cap, len, data, hle⟩, h := ⟨cap, len, data^.write ⟨n, nat.lt_of_lt_of_le h hle⟩ h v, hle⟩
end
def write4 {α} (l : array_list α) : fin l^.length → α → array_list α
| ⟨n, h⟩ v :=
let cap := l^.capacity, len := l^.length, data := l^.data, hle := l^.len_le in
⟨cap, len, data^.write ⟨n, nat.lt_of_lt_of_le h hle⟩ h v, hle⟩
def write5 {α} (l : array_list α) : fin l^.length → α → array_list α
| ⟨n, h⟩ v :=
match l^.capacity, l^.length, h, l^.data, l^.len_le with
cap, len, h, data, hle := ⟨cap, len, data^.write ⟨n, nat.lt_of_lt_of_le h hle⟩ h v, hle⟩
end
set_option trace.array.update true
vm_eval ([1, 2, 3]^.to_array_list^.write (fin.of_nat 1) 10)^.read (fin.of_nat 1) -- non-destructive
vm_eval ([1, 2, 3]^.to_array_list^.write2 (fin.of_nat 1) 10)^.read (fin.of_nat 1) -- destructive
vm_eval ([1, 2, 3]^.to_array_list^.write3 (fin.of_nat 1) 10)^.read (fin.of_nat 1) -- destructive
vm_eval ([1, 2, 3]^.to_array_list^.write4 (fin.of_nat 1) 10)^.read (fin.of_nat 1) -- non-destructive
vm_eval ([1, 2, 3]^.to_array_list^.write5 (fin.of_nat 1) 10)^.read (fin.of_nat 1) -- destructive
example {α} (l : array_list α) (n : ℕ) (h : n < l^.length) (v : α) : (l^.write ⟨n, h⟩ v)^.capacity = l^.capacity := rfl
example {α} (l : array_list α) (n : ℕ) (h : n < l^.length) (v : α) : (l^.write2 ⟨n, h⟩ v)^.capacity = l^.capacity := rfl --failed
example {α} (l : array_list α) (n : ℕ) (h : n < l^.length) (v : α) : (l^.write3 ⟨n, h⟩ v)^.capacity = l^.capacity := rfl --failed
example {α} (l : array_list α) (n : ℕ) (h : n < l^.length) (v : α) : (l^.write4 ⟨n, h⟩ v)^.capacity = l^.capacity := rfl
example {α} (l : array_list α) (n : ℕ) (h : n < l^.length) (v : α) : (l^.write5 ⟨n, h⟩ v)^.capacity = l^.capacity := rfl
Version 2 is the one you presented above, and version 3 is the suggested match expression. Both of them fail to reduce projections definitionally. I have a few alternatives, versions 4 and 5.
I assume that let expressions allow controlling the evaluation order, so I was hoping that version 4 would work. But it doesn't achieve the destructive write - do you know why? Perhaps it is just because l is still in scope, even though it is not being used. If this version can be made to work, it is a good alternative compilation for {s with ...}.
The only version which has both properties is version 5, which uses a match expression on the projections, but this involves an extra construction / destruction step.
> I was hoping that version 4 would work. But it doesn't achieve the destructive write - do you know why?
The problem is that there are implicit references to l in the data^.write application (i.e., in the implicit arguments, and one of them is being captured by a closure that is just dead code).
They can be eliminated. I will try to fix this.
That being said, I also think write4 is a great alternative for compiling {s with ...}. I will try to work on this too.
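
As a rough illustration of why a `write4`-style expansion is attractive, here is a hand-written stand-in on a small structure. The names `buf` and `buf.set_items` are hypothetical, and the exact form the compiler generates for `{s with ...}` is not shown in this thread; this only sketches the idea of rebuilding from projections bound by `let`.

```lean
structure buf :=
(cap : nat) (len : nat) (items : list nat)

-- Hand-written stand-in for a possible expansion of `{ b with items := v }`:
-- read the untouched fields into lets, then rebuild with the constructor.
def buf.set_items (b : buf) (v : list nat) : buf :=
let c := b^.cap, l := b^.len in
⟨c, l, v⟩

-- The let bindings unfold, so the projections still agree definitionally.
example (b : buf) (v : list nat) : (buf.set_items b v)^.cap = b^.cap := rfl
example (b : buf) (v : list nat) : (buf.set_items b v)^.len = b^.len := rfl
```

Whether such a definition also gets a destructive update at run time depends on the code generator, as discussed in the surrounding comments.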
I pushed a fix that erases the artificial dependency that was preventing the destructive update.
write4 is performing a destructive update after this fix.
I will implement the new approach for {s with ...}
…{s with ...}`
See discussion at leanprover#1438
leanprover#1438 (comment)
@digama0 With this commit, the original `array_list.write` will also
perform a destructive update when the reference counter for `l` is 1.
```lean
def write {α} (l : array_list α) : fin l^.length → α → array_list α :=
λ ⟨n, h⟩ v, { l with data := l^.data^.write ⟨n, l^.lt_capacity h⟩ h v }
```

@digama0 I added the new encoding for

@leodemoura Do you have a specific example with this behavior? I have the following, but it's not showing what you say.

@digama0 Here is an example:
structure test :=
(data1 : array nat 3)
(data2 : array nat 3)
(sz: nat)
def test.write (s : test) (i : fin 3) (v : nat) :=
{s with data1 := s^.data1^.write i v, data2 := s^.data2^.write i v}
def mk_test (n : nat) : test :=
{ data1 := mk_array 3 0,
data2 := mk_array 3 1,
sz := n }
set_option trace.array.update true
#eval test.write (mk_test 10) (fin.of_nat 1) 10
/-
Output:
[array.update] non-destructive write at #1
[array.update] destructive write at #1
-/
def test.write2 (s : test) (i : fin 3) (v : nat) : test :=
match s with
| ⟨d₁, d₂, s⟩ := ⟨d₁^.write i v, d₂^.write i v, s⟩
end
#eval test.write2 (mk_test 10) (fin.of_nat 1) 10
/-
Output:
[array.update] destructive write at #1
[array.update] destructive write at #1
-/

In your example
vm_eval let a := {test . sz := 4, data1 := {data := λ_, 5}, data2 := {data := λ_, 1}}^.write
(fin.of_nat 1) 10 in
(a^.data1^.read (fin.of_nat 1), a^.data2^.read (fin.of_nat 2)) -- destructive write
The compiler simplifies

BTW, your example helped me find a potential performance problem in the compiler.

@leodemoura My recommended solution to your we compile this to (where let bindings are only generated for data, as you suggested).
At commit c58f61e, the transformation was implemented at

@leodemoura That actually makes a lot of sense. Evaluation order isn't even supposed to be a thing users need to worry about in dependent type theory, and the code generator is the one supplying the evaluation order on top of the original structure anyway. What you really need is a proper analysis of the
In their implementation, there are two heaps, one for regular objects and one for linear objects, which must be routed through linear functions and have no GC. For our purposes, you can make sure that

@jroesch and I needed a char buffer yesterday. We defined it as
https://github.com/leanprover/lean/blob/master/library/data/buffer.lean#L8
We realized it is essentially array_list because the internal array implementation already has support for capacity/size.
Sorry for not addressing this PR earlier. I will not merge it since buffer is simpler.

I am confused by your assertion. Exactly how is array being stored internally, that would cause size and capacity to diverge?

@digama0
parray is used to implement Lean arrays.
https://github.com/leanprover/lean/blob/master/src/library/vm/vm_array.cpp#L14
The capacity in parray is stored before the elements.
https://github.com/leanprover/lean/blob/master/src/library/parray.h#L26
The size is stored in a cell:
https://github.com/leanprover/lean/blob/master/src/library/parray.h#L65

@leodemoura Although it is nice to know that the array implementation has these additional features, I would really like to have access to a raw array type (without even array size stored), from which more complicated verified programs can be built. Perhaps parray can also be exposed by a different name, but for proper complexity control for Lean programs, the real underlying array structure should be exposed as much as possible, so that more of it can be formalized and verified and fewer e.g. bounds checks are needed. In particular, defined_array is meant to be an abstraction for uninitialized or partially-initialized arrays, and it would be nice if the VM could actually avoid array initialization for this structure.

@digama0 Yes, it would be nice to have a primitive C-like array. We will need this kind of array to interface with external code. We can call them

@leodemoura Here is a slight variation on
structure partial_carray (α : Sort u) (n) (s : fin n → Prop) :=
(data : Π (n : fin n), s n → α)
def carray (α : Sort u) (n) := partial_carray α n (λ_, true)
If I understand correctly, the parameter

An object of type
BTW, we erase irrelevant data (i.e., types, propositions (which are also types) and proofs) by replacing them with unit (i.e.
list.length (list.map (fun _, true) [1, 2, 3])
That is, the user creates a list of propositions using

I think you can take this one step farther: Just replace
Hm, that's too bad. This seems like an edge case, so it might be worth working around it. What is lambda lifting exactly? If I'm reading the code correctly, you are uncurrying sequences of lambdas. Does a function of type
I'm starting to see why a solid semantics for the VM data types is useful.

For the bytecode interpreter, this would be reasonable. However, for native code generation, we don't want this kind of overhead.

It is one of the preprocessing steps. For each nested lambda (except the ones used for

We reduce the overhead of
def safe_div.aux x y :=
... -- code for the original safe_div is here
def safe_div x y h :=
safe_div.aux x y
Then, for fully applied
That is, we only pay for the overhead for partial applications.
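
For concreteness, a hand-written version of this split might look as follows. This is only a sketch of the idea described in the comment: the body `x / y` and the type of `h` are assumptions, since the original `safe_div` is not shown in the thread.

```lean
-- Auxiliary function without the proof argument (assumed body).
def safe_div.aux (x y : nat) : nat :=
x / y

-- User-facing function: `h` is computationally irrelevant and is not passed on.
def safe_div (x y : nat) (h : y ≠ 0) : nat :=
safe_div.aux x y

-- A fully applied call unfolds to the auxiliary function.
example (x y : nat) (h : y ≠ 0) : safe_div x y h = safe_div.aux x y := rfl
```

Under this reading, a call such as `safe_div x y h` can be replaced by `safe_div.aux x y`, so the erased proof argument costs nothing unless `safe_div` is partially applied.
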
Add support for an array list type, using the "mutable" array implementation.