-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(computability/timed): add complexity analysis for sorting algorithms #14494
base: master
Are you sure you want to change the base?
Conversation
08e3c6e
to
98ace57
Compare
def0de9
to
21714ec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! I've made some light suggestions here. In general: it would be easier to review this if you could break it into smaller PRs. You can keep this one open as a "tracking" PR, and open smaller ones that add one file at a time.
I'll defer to @digama0 about the broader approach.
rw ordered_length, | ||
rw l_ih, } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rw ordered_length, | |
rw l_ih, } | |
rw [ordered_length, l_ih], } |
Here and elsewhere you can combine multiple rw
s into one
unfold insertion_sort, | ||
rw ← ordered_insert_equivalence r l_hd fst, | ||
cases ordered_insert r l_hd fst, | ||
unfold insertion_sort, } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfold insertion_sort, } | |
refl } |
A goal-ending unfold
can usually be replaced by refl
- log_2_times : ∀ (a : ℕ), 2 * nat.log 2 (a + 2) ≤ a + 2 | ||
-/ | ||
|
||
lemma log_pred : ∀ (a : ℕ) , nat.log 2 a - 1 = nat.log 2 (a / 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a lot cleaner by translating to +
and using nat.log_of_one_lt_of_le
. (There may be ways to make this slicker, this was just the first I came up with.)
lemma log_pred (n : ℕ) : nat.log 2 n - 1 = nat.log 2 (n / 2) :=
if h : n < 2 then by simp [nat.log_of_lt h] else
(nat.sub_eq_iff_eq_add (by rw ← nat.pow_le_iff_le_log; linarith)).mpr
(nat.log_of_one_lt_of_le (by norm_num) (le_of_not_lt h))
cases h, | ||
end | ||
|
||
lemma log_2_val : nat.log 2 2 = 1 := |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this is a consequence of a more general lemma that should go in data.nat.log
.
lemma log_base {b : ℕ} (hb : 1 < b) : nat.log b b = 1 :=
by simpa using nat.log_pow hb 1
lemma log_2_val : nat.log 2 2 = 1 :=
log_base (by norm_num)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I think most of these lemmas can be simplified and/or generalized.
simp only [list.length] at IH, | ||
simp only [list.length], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simp only [list.length] at IH, | |
simp only [list.length], | |
simp only [list.length] at IH ⊢, |
and below. The symbol can be entered as \vdash
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds cool! Added some comment and proof-shortening suggestions.
# Timed Insertion Sort | ||
This file defines a new version of Insertion Sort that, besides sorting the input list, counts the | ||
number of comparisons made through the execution of the algorithm. Also, it presents proofs of | ||
it's time complexity and it's equivalence to the one defined in data/list/sort.lean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You want "its", the possessive. ("it's" is simply "it is".)
# Timed Merge | ||
This file defines a new version of Merge that, besides combining the input lists, counts the | ||
number of operations made through the execution of the algorithm. Also, it presents proofs of | ||
it's time complexity and it's equivalence to the one defined in data/list/sort.lean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"its"
# Timed Merge Sort | ||
This file defines a new version of Merge Sort that, besides sorting the input list, counts the | ||
number of operations made through the execution of the algorithm. Also, it presents proofs of | ||
it's time complexity and it's equivalence to the one defined in data/list/sort.lean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"its"
# Timed Split | ||
This file defines a new version of Split that, besides splitting the input lists into two halves, | ||
counts the number of operations made through the execution of the algorithm. Also, it presents | ||
proofs of it's time complexity, it's equivalence to the one defined in data/list/sort.lean and of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"its"
have l₂s_id : (merge_sort r l₂).fst = l₂s := (congr_arg prod.fst h₂).trans rfl, | ||
rw merge_sort_equivalence at l₂s_id, | ||
have same_lengths₂ := list.length_merge_sort r l₂, | ||
have l₂s_len_l₂_len : l₂s.length = l₂.length := | ||
begin | ||
rw l₂s_id at same_lengths₂, | ||
exact same_lengths₂, | ||
end, | ||
rw l₁s_len_l₁_len, | ||
rw l₂s_len_l₂_len, | ||
|
||
exact split_lengths l l₁ l₂ hs, | ||
end, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be shortened by generalizing with the previous section, i.e., lines 176-185
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Robert's comment above that there are other parts that can be generalized. Since merge sort is divide-and-conquer, I'd suspect most proofs for the two sub-lists can be reused.
have ns_bound : ns ≤ 4 * (l.length + 1) * nat.log 2 l.length := | ||
begin | ||
have ns_id : (merge_sort r l₁).snd = ns := (congr_arg prod.snd h₁).trans rfl, | ||
rw ← ns_id, | ||
refine le_trans ih₁ _, | ||
calc 8 * l₁.length * nat.log 2 l₁.length | ||
= 4 * (2 * l₁.length) * nat.log 2 l₁.length : | ||
by linarith | ||
... ≤ 4 * (l.length + 1) * nat.log 2 l₁.length : | ||
begin | ||
rw mul_assoc, | ||
rw mul_assoc 4 (l.length + 1) (nat.log 2 l₁.length), | ||
refine (mul_le_mul_left zero_lt_four).mpr _, | ||
exact nat.mul_le_mul_right (nat.log 2 l₁.length) l₁_length, | ||
end | ||
... ≤ 4 * (l.length + 1) * nat.log 2 l.length : | ||
begin | ||
refine nat.mul_le_mul_left (4 * (l.length + 1)) _, | ||
exact @nat.log_monotone 2 l₁.length l.length l₁_length_weak, | ||
end | ||
end, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be shortened by generalizing with the previous section, i.e., lines 131-152
have (split (a₁ :: a₂ :: t)).snd.fst.length < (a₁ :: a₂ :: t).length := | ||
begin | ||
cases e : split (a₁ :: a₂ :: t) with l₁ l₂n, | ||
cases l₂n with l₂ n, | ||
cases length_split_lt e with h₁ h₂, | ||
exact h₂, | ||
end, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be shortened by generalizing with the previous section, i.e., lines 70-75
Add formalization and proofs for time complexity of insertion_sort and merge_sort, as discussed here and here. Reference (pt-br): https://github.com/tomaz1502/RunTimeFormalization/blob/main/Report.pdf