feat(computability/timed): add complexity analysis for sorting algorithms #14494

tomaz1502 · 2022-05-31T17:07:34Z

Add formalization and proofs for time complexity of insertion_sort and merge_sort, as discussed here and here. Reference (pt-br): https://github.com/tomaz1502/RunTimeFormalization/blob/main/Report.pdf

robertylewis

Thanks for the contribution! I've made some light suggestions here. In general: it would be easier to review this if you could break it into smaller PRs. You can keep this one open as a "tracking" PR, and open smaller ones that add one file at a time.

I'll defer to @digama0 about the broader approach.

robertylewis · 2022-06-14T18:16:55Z

src/computability/timed/insertion_sort.lean

+    rw ordered_length,
+    rw l_ih, }


Suggested change

-    rw ordered_length,
-    rw l_ih, }
+    rw [ordered_length, l_ih], }

Here and elsewhere you can combine multiple rws into one
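As a minimal illustration of the same pattern outside this PR (the hypotheses `h₁` and `h₂` are made up for the example, not taken from the reviewed files), a list of rewrite rules is applied left to right within a single `rw` call:

```lean
-- Lean 3 sketch: two consecutive rewrites merged into one `rw [...]`.
example (a b c : ℕ) (h₁ : a = b) (h₂ : b = c) : a = c :=
begin
  -- equivalent to `rw h₁, rw h₂,`
  rw [h₁, h₂],
end
```

After the last rewrite, `rw` tries `refl` itself, so no closing tactic is needed here.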

robertylewis · 2022-06-14T18:17:28Z

src/computability/timed/insertion_sort.lean

+    unfold insertion_sort,
+    rw ← ordered_insert_equivalence r l_hd fst,
+    cases ordered_insert r l_hd fst,
+    unfold insertion_sort, }


Suggested change

-    unfold insertion_sort, }
+    refl }

A goal-ending unfold can usually be replaced by refl
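A small sketch of why this works, using a hypothetical definition `double` that is not part of the PR: when both sides of the goal agree once the definition is unfolded, `refl` can check that directly.

```lean
-- Lean 3 sketch; `double` is an illustrative definition only.
def double (n : ℕ) : ℕ := n + n

example (n : ℕ) : double n = n + n :=
begin
  -- both sides are definitionally equal, so no explicit `unfold` is needed
  refl
end
```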

robertylewis · 2022-06-14T18:30:50Z

src/computability/timed/lemmas.lean

+  - log_2_times : ∀ (a : ℕ), 2 * nat.log 2 (a + 2) ≤ a + 2
+-/
+
+lemma log_pred : ∀ (a : ℕ) , nat.log 2 a - 1 = nat.log 2 (a / 2)


I think this is a lot cleaner by translating to + and using nat.log_of_one_lt_of_le. (There may be ways to make this slicker; this was just the first I came up with.)

lemma log_pred (n : ℕ) : nat.log 2 n - 1 = nat.log 2 (n / 2) :=
if h : n < 2 then by simp [nat.log_of_lt h]
else (nat.sub_eq_iff_eq_add (by rw ← nat.pow_le_iff_le_log; linarith)).mpr
  (nat.log_of_one_lt_of_le (by norm_num) (le_of_not_lt h))

robertylewis · 2022-06-14T18:37:21Z

src/computability/timed/lemmas.lean

+  cases h,
+end
+
+lemma log_2_val : nat.log 2 2 = 1 :=


And this is a consequence of a more general lemma that should go in data.nat.log.

lemma log_base {b : ℕ} (hb : 1 < b) : nat.log b b = 1 :=
by simpa using nat.log_pow hb 1

lemma log_2_val : nat.log 2 2 = 1 := log_base (by norm_num)

In general, I think most of these lemmas can be simplified and/or generalized.
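To make the benefit of the generalization concrete, here is a hedged sketch of how the general lemma would serve other bases. It restates `log_base` exactly as proposed in the comment above; the import paths are assumptions about the mathlib3 layout, and the block has not been checked against a specific mathlib revision.

```lean
import data.nat.log
import tactic.norm_num

-- `log_base` restated from the review comment above.
lemma log_base {b : ℕ} (hb : 1 < b) : nat.log b b = 1 :=
by simpa using nat.log_pow hb 1

-- The base-2 fact and any other base follow from the same lemma.
example : nat.log 2 2 = 1 := log_base (by norm_num)
example : nat.log 3 3 = 1 := log_base (by norm_num)
example : nat.log 10 10 = 1 := log_base (by norm_num)
```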

robertylewis · 2022-06-14T18:39:07Z

src/computability/timed/merge.lean

+    simp only [list.length] at IH,
+    simp only [list.length],


Suggested change

-    simp only [list.length] at IH,
-    simp only [list.length],
+    simp only [list.length] at IH ⊢,

and below. The symbol can be entered as \vdash.
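For reference, a self-contained sketch of the `at IH ⊢` location syntax; the statement is made up for illustration and is not from the PR. The hypothesis and the goal are simplified in one call:

```lean
-- Lean 3 sketch: simplify a hypothesis and the goal with one `simp only`.
example (xs : list ℕ) (x : ℕ) (IH : (x :: xs).length ≤ 5) :
  (x :: xs).length ≤ 5 :=
begin
  -- rewrites `(x :: xs).length` to `xs.length + 1` in both `IH` and the goal
  simp only [list.length] at IH ⊢,
  exact IH,
end
```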

jn1z

Sounds cool! Added some comments and proof-shortening suggestions.

jn1z · 2022-06-22T15:44:44Z

src/computability/timed/insertion_sort.lean

+# Timed Insertion Sort
+  This file defines a new version of Insertion Sort that, besides sorting the input list, counts the
+  number of comparisons made through the execution of the algorithm. Also, it presents proofs of
+  it's time complexity and it's equivalence to the one defined in data/list/sort.lean


You want "its", the possessive. ("it's" is simply "it is".)

jn1z · 2022-06-22T15:44:59Z

src/computability/timed/merge.lean

+# Timed Merge
+  This file defines a new version of Merge that, besides combining the input lists, counts the
+  number of operations made through the execution of the algorithm. Also, it presents proofs of
+  it's time complexity and it's equivalence to the one defined in data/list/sort.lean


jn1z · 2022-06-22T15:45:09Z

src/computability/timed/merge_sort.lean

+# Timed Merge Sort
+  This file defines a new version of Merge Sort that, besides sorting the input list, counts the
+  number of operations made through the execution of the algorithm. Also, it presents proofs of
+  it's time complexity and it's equivalence to the one defined in data/list/sort.lean


jn1z · 2022-06-22T15:45:23Z

src/computability/timed/split.lean

+# Timed Split
+  This file defines a new version of Split that, besides splitting the input lists into two halves,
+  counts the number of operations made through the execution of the algorithm. Also, it presents
+  proofs of it's time complexity, it's equivalence to the one defined in data/list/sort.lean and of


jn1z · 2022-06-22T15:54:19Z

src/computability/timed/merge_sort.lean

+    have l₂s_id : (merge_sort r l₂).fst = l₂s := (congr_arg prod.fst h₂).trans rfl,
+    rw merge_sort_equivalence at l₂s_id,
+    have same_lengths₂ := list.length_merge_sort r l₂,
+    have l₂s_len_l₂_len : l₂s.length = l₂.length :=
+    begin
+      rw l₂s_id at same_lengths₂,
+      exact same_lengths₂,
+    end,
+    rw l₁s_len_l₁_len,
+    rw l₂s_len_l₂_len,
+
+    exact split_lengths l l₁ l₂ hs,
+  end,


I think this can be shortened by generalizing with the previous section, i.e., lines 176-185

I agree with Robert's comment above that there are other parts that can be generalized. Since merge sort is divide-and-conquer, I'd suspect most proofs for the two sub-lists can be reused.
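One possible shape for such a reusable fact, as a sketch only: the helper name `length_eq_of_merge_sort_eq` is hypothetical, and the statement is phrased against mathlib's `list.merge_sort` rather than the timed version defined in this PR. Stating the length equality once lets it be applied to both `l₁` and `l₂` instead of repeating the proof.

```lean
import data.list.sort

-- Hypothetical helper: if `ls` is the sorted version of `l`,
-- the two lists have the same length.
lemma length_eq_of_merge_sort_eq {α : Type*} (r : α → α → Prop)
  [decidable_rel r] (l ls : list α) (h : list.merge_sort r l = ls) :
  ls.length = l.length :=
by { subst h, exact list.length_merge_sort r l }

-- The same lemma serves both halves of the split.
example {α : Type*} (r : α → α → Prop) [decidable_rel r]
  (l₁ l₂ l₁s l₂s : list α)
  (h₁ : list.merge_sort r l₁ = l₁s) (h₂ : list.merge_sort r l₂ = l₂s) :
  l₁s.length + l₂s.length = l₁.length + l₂.length :=
by rw [length_eq_of_merge_sort_eq r l₁ l₁s h₁,
       length_eq_of_merge_sort_eq r l₂ l₂s h₂]
```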

jn1z · 2022-06-22T15:55:25Z

src/computability/timed/merge_sort.lean

+  have ns_bound : ns ≤ 4 * (l.length + 1) * nat.log 2 l.length :=
+  begin
+    have ns_id : (merge_sort r l₁).snd = ns := (congr_arg prod.snd h₁).trans rfl,
+    rw ← ns_id,
+    refine le_trans ih₁ _,
+    calc 8 * l₁.length * nat.log 2 l₁.length
+                = 4 * (2 * l₁.length) * nat.log 2 l₁.length :
+                      by linarith
+            ... ≤ 4 * (l.length + 1) * nat.log 2 l₁.length :
+                      begin
+                        rw mul_assoc,
+                        rw mul_assoc 4 (l.length + 1) (nat.log 2 l₁.length),
+                        refine (mul_le_mul_left zero_lt_four).mpr _,
+                        exact nat.mul_le_mul_right (nat.log 2 l₁.length) l₁_length,
+                      end
+            ... ≤ 4 * (l.length + 1) * nat.log 2 l.length :
+                      begin
+                        refine nat.mul_le_mul_left (4 * (l.length + 1)) _,
+                        exact @nat.log_monotone 2 l₁.length l.length l₁_length_weak,
+                      end
+  end,


I think this can be shortened by generalizing with the previous section, i.e., lines 131-152

jn1z · 2022-06-22T15:59:12Z

src/computability/timed/merge_sort.lean

+  have (split (a₁ :: a₂ :: t)).snd.fst.length < (a₁ :: a₂ :: t).length :=
+  begin
+    cases e : split (a₁ :: a₂ :: t) with l₁ l₂n,
+    cases l₂n with l₂ n,
+    cases length_split_lt e with h₁ h₂,
+    exact h₂,
+  end,


I think this can be shortened by generalizing with the previous section, i.e., lines 70-75

tomaz1502 added 10 commits May 31, 2022 13:57

Add formalization of run time complexity of the sorting algorithms

aae362d

Replace: Author -> Authors

bdca092

reordering imports

d364a32

fix: tactic -> tactic.linarith

cf6ce81

fix: /- -> /-! in docstring

1c82b2b

removing evals

e4ce62c

fix: malformed curly braces

c303994

move timed to src/computability

102b5c1

move timed to src/computability

98ace57

move timed to src/computability

77a21e1

tomaz1502 marked this pull request as ready for review May 31, 2022 17:10

fix: import path: data.list -> computability

80207b9

tomaz1502 added awaiting-review The author would like community review of the PR undergrad Relates to undergrad.yaml labels May 31, 2022

vihdzp added the awaiting-CI The author would like to see what CI has to say before doing more work. label Jun 2, 2022

github-actions bot removed the awaiting-CI The author would like to see what CI has to say before doing more work. label Jun 2, 2022

tomaz1502 force-pushed the run_time_formalization branch from 08e3c6e to 98ace57 Compare June 6, 2022 13:50

tomaz1502 added 2 commits June 6, 2022 10:54

Added docstrings for sorting algorithms

210123e

remove r in split_lengths

21714ec

tomaz1502 force-pushed the run_time_formalization branch from def0de9 to 21714ec Compare June 6, 2022 13:54

robertylewis reviewed Jun 14, 2022

robertylewis added awaiting-author A reviewer has asked the author a question or requested changes and removed awaiting-review The author would like community review of the PR labels Jun 14, 2022

robertylewis requested a review from digama0 June 14, 2022 18:43

jn1z reviewed Jun 22, 2022

semorrison added the too-late This PR was ready too late for inclusion in mathlib3 label Jul 16, 2023
