# [Divide and Conquer, Sorting and Searching, and Randomized Algorithms - Week 1](https://www.coursera.org/learn/algorithms-divide-conquer/home/week/1)

## Other Versions
- [View on nbviewer](https://nbviewer.jupyter.org/github/johnnyasd12/algorithms-stanford/blob/master/Lec%201%20-%20Divide%20and%20Conquer%2C%20Sorting%20and%20Searching%2C%20and%20Randomized%20Algorithms/w1.ipynb) (LaTeX renders more accurately there)
- HackMD: to be added

## Resources
- Textbook
    - *Algorithms Illuminated (Part 1)* (by Tim Roughgarden): https://www.amazon.com/dp/0999282905
- Resources
    - Mathematics for Computer Science (by Eric Lehman and Tom Leighton): https://www.cs.princeton.edu/courses/archive/fall06/cos341/handouts/mathcs.pdf

## [Lecture Slides](https://www.coursera.org/learn/algorithms-divide-conquer/supplement/s0KO3/lecture-slides)
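- It seems better to read the slides for each lecture before watching it, ~~partly because his handwriting is very artistic~~, and partly because the lectures often have only English subtitles, so previewing the slides makes them easier to follow.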

## I. Introduction

### [Why Study Algorithms?](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/jSwWo/why-study-algorithms)

- important for **all** branches of **computer science**
    - computer graphics, routing and communication networks, cryptography, database indices, computational biology, and so on
- key role in modern techniques
    - PageRank by Google
- a **novel** lens (viewpoint) on processes **outside** of computer science
    - quantum mechanics
    - economic markets
- challenging (i.e. good for the brain)
- fun



### [Integer Multiplication](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/rP869/integer-multiplication)

- input
    - two $n$-digit numbers $x$ and $y$
- output
    - the product $x\times y$
- We want to count how many **primitive operations** the algorithm performs
- What is a primitive operation?
    - **adding or multiplying 2 single-digit numbers**

#### The Grade-School Algorithm
![](https://i.imgur.com/YFrS3JL.png)
- Count **#(primitive operations)**; for convenience, we abbreviate it as $\text{#op}$ from here on
    - **Computing one row:** $\text{#op}\leq 2n$
    - **Computing all rows:** $\text{#op}\leq 2n^2$
        - since $\text{#row}\leq n$
    - ***Adding up all rows:*** $\text{#op}\leq 2n^2$
- so $\text{#op}\leq 4n^2$ **to run this algorithm, i.e. $\text{#op}\leq cn^2$,** where $c$ is a constant
    - this means that if the number of digits doubles, the upper bound on $\text{#op}$ grows by a factor of 4.
    - it also means that **if we halve the number of digits, the bound on $\text{#op}$ drops to 1/4**
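
The counting above can be tried out with a small Python sketch. This is only an illustration, not the slide's exact procedure: the function name `grade_school_multiply` is made up here, the additions are folded into each row as it is computed, and only the single-digit multiplications are counted.

```python
def grade_school_multiply(x: int, y: int) -> tuple[int, int]:
    """Multiply two non-negative integers digit by digit.

    Returns (product, number of single-digit multiplications performed).
    """
    xs = [int(d) for d in str(x)][::-1]  # least significant digit first
    ys = [int(d) for d in str(y)][::-1]
    result = [0] * (len(xs) + len(ys))   # the product has at most this many digits
    digit_mults = 0
    for j, dy in enumerate(ys):          # one partial-product row per digit of y
        carry = 0
        for i, dx in enumerate(xs):
            digit_mults += 1
            total = result[i + j] + dx * dy + carry
            result[i + j] = total % 10
            carry = total // 10
        result[j + len(xs)] += carry     # leftover carry of this row
    product = int("".join(str(d) for d in reversed(result)))
    return product, digit_mults


product, mults = grade_school_multiply(5678, 1234)
print(product, mults)
```

For two $n$-digit inputs the inner loop runs $n\cdot n$ times, which is where the quadratic growth comes from.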

#### The Algorithm Designer's Mantra
- "Perhaps the most important principle for the good algorithm designer is to refuse to be content." -Aho, Hopcroft, and Ullman, *The Design and Analysis of Computer Algorithms*, 1974
- As an algorithm designer, you should always ask yourself, **"Can we do better?"**

### [Karatsuba Multiplication](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/wKEYL/karatsuba-multiplication)

#### Example
![](https://i.imgur.com/XdlNAbu.png)
- Here the **input length has been halved!**


#### A Recursive Algorithm
![](https://i.imgur.com/xxHvrMD.png)
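- we can focus on the **star formula: $10^nac+10^{n/2}(ad+bc)+bd$**
    - This way we only need to compute **products of smaller numbers ($ac,ad,bc,bd$), whose inputs are only half the original length!**
    - Repeating the same step (splitting these smaller numbers in half again), we can **solve the smaller products recursively**
    - Any recursion needs a **base case (termination condition)**.
        - i.e. single-digit multiplication
- In fact, this algorithm can still be improved a little...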


#### Karatsuba Multiplication
![](https://i.imgur.com/87rq3yY.png)
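- On the previous slide we needed **4 recursive calls ($ac,ad,bc,bd$)** to compute the result
- But look at the term $(ad+bc)$: we do not actually need $ad$ and $bc$ separately, since computing each one costs a recursive call
- ~~I didn't want to have to use this move~~: **Gauss's trick** computes $ad+bc$ with only one more recursive call.
    - Compute $(a+b)(c+d)$, then subtract the already computed $ac$ and $bd$ to get $ad+bc$
- therefore **only 3 recursive multiplications ($ac,bd,(a+b)(c+d)$) are needed**
- but at this point we still cannot **judge whether this is better** than the grade-school algorithm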
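
The three-multiplication idea can be sketched in Python as follows. This is a minimal illustration under some assumptions: inputs are non-negative integers, the split point `half` is taken from the longer input, and Python's built-in arithmetic is used for the base case and for the shifts by powers of 10. The function name `karatsuba` is just a label chosen for this sketch.

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply two non-negative integers using 3 recursive multiplications."""
    # Base case: at least one factor is a single digit.
    if x < 10 or y < 10:
        return x * y

    # Split both numbers around the same power of 10.
    half = max(len(str(x)), len(str(y))) // 2
    a, b = divmod(x, 10 ** half)   # x = 10^half * a + b
    c, d = divmod(y, 10 ** half)   # y = 10^half * c + d

    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    # Gauss's trick: (a + b)(c + d) - ac - bd = ad + bc
    ad_plus_bc = karatsuba(a + b, c + d) - ac - bd

    return ac * 10 ** (2 * half) + ad_plus_bc * 10 ** half + bd


print(karatsuba(5678, 1234) == 5678 * 1234)
```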

### [About the Course](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/8vama/about-the-course)

#### Course Topics
- Vocabulary for design and analysis of algorithms
    - E.g., *"Big-oh"* notation
    - *"sweet spot"(?)* for high-level reasoning about algorithms
- **Divide and conquer** algorithm design paradigm
    - Will apply to: *Integer multiplication, sorting, matrix multiplication, closest pair*
    - General analysis methods (*"Master Method/Theorem"*)
- **Randomization** in algorithm design
    - Will apply to: *QuickSort, primality testing, graph partitioning, hashing*
- Primitives for reasoning about **graphs** (?)
    - *Connectivity information, shortest paths, structure of information and social networks*
- Use and implementation of **data structures**
    - *Heaps, balanced binary search trees, hashing and some variants (e.g., bloom filters)*



#### Topics in Sequel Course
- **Greedy algorithm** design paradigm
- **Dynamic programming** algorithm design paradigm
- **NP-complete** problems and what to do about them
- Fast heuristics with provable guarantees (??)
- Fast exact algorithms for special cases (??)
- Exact algorithms that beat brute-force search






#### Skills You'll Learn
- Become a better programmer
- Sharpen your math and analytical skills
- Start "thinking algorithmically"
- Literacy with computer science's "greatest hits"
- Ace your technical interviews

#### Excellent free reference
- "Mathematics for Computer Science", by Eric Lehman and Tom Leighton. (Website)

#### Supporting Materials
![](https://i.imgur.com/z6olMqJ.png)

### [Merge Sort: Motivation and Example](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/4vzQr/merge-sort-motivation-and-example)

#### Why Study Merge Sort?
- Good intro to **divide & conquer**
    - Improves on Selection, Insertion, and Bubble Sort, each of which takes $O(n^2)$ time on an $n$-element array
- Motivate guiding principles for algorithm analysis (?)
    - **worst case** and **asymptotic** analysis (?)
- Analysis generalizes to **"Master Method"**


#### The Sorting Problem
- assume all the numbers are **distinct**
    - **actually, if the numbers are not all distinct, the problem does not get harder. (HW)**
- input
    - unsorted array of $n$ numbers
- output
    - same numbers sorted in increasing order

#### Merge Sort: Example
![](https://i.imgur.com/UyunTwq.png)
- some details are skipped here; they are explained on the next slide

### [Merge Sort: Pseudocode](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/NtFU9/merge-sort-pseudocode)

#### Merge Sort: Pseudocode
0. **base case**: if the input array has 0 or 1 elements, just return it
1. **recursively sort** the 1st half of the input array
2. **recursively sort** the 2nd half of the input array
3. **merge** the two sorted sublists into one


- Details such as how to handle an odd $n$ are not covered here

#### Pseudocode for Merge
![](https://i.imgur.com/vr64QLG.png)
- $C$ is the output array
- $A$ and $B$ are the two recursively sorted halves
- for the **"merge"** step we just compare the heads of the two arrays and repeatedly append the smaller one to the output.
- the pseudocode omits the **end cases** (one array running out before the other); we have to handle those ourselves.

#### Code for Merge Sort (Unfinished)

- TODO: come back and finish this later.
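
In the meantime, here is a minimal Python sketch of the pseudocode above. It is one possible implementation, not the course's reference code: the names `merge` and `merge_sort` are chosen here, the function returns a new list instead of sorting in place, and odd lengths are handled by letting the second half be one element longer.

```python
def merge(a: list, b: list) -> list:
    """Merge two already sorted lists into one sorted list."""
    c = []
    i = j = 0
    # Repeatedly copy over the smaller of the two current heads.
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
    # One of the lists is exhausted; copy whatever remains of the other.
    c.extend(a[i:])
    c.extend(b[j:])
    return c


def merge_sort(xs: list) -> list:
    """Return a sorted copy of xs."""
    if len(xs) <= 1:               # base case: nothing to sort
        return list(xs)
    mid = len(xs) // 2
    left = merge_sort(xs[:mid])    # recursively sort the 1st half
    right = merge_sort(xs[mid:])   # recursively sort the 2nd half
    return merge(left, right)


print(merge_sort([5, 4, 1, 8, 7, 2, 6, 3]))
```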

#### Merge Sort Running Time?
- $\text{running time}\approx \text{#lines of code executed}$
- we first compute the running time of the simpler **"merge"** step. Look at the pseudocode:
    - ![](https://i.imgur.com/r6uLieW.png)

#### Running Time of Merge
- **Note:** here the **output length** is called $m$, while the pseudocode calls it $n$. (a little confusing)
- as the previous slide shows, for the **"merge"** step $\text{#op} \leq 4m+2$.
- for later convenience, we use the looser bound $\text{#op}\leq 6m$
    - this always holds because $m\geq 1$ implies $4m+2\leq 6m$

#### Running Time of Merge Sort
- **Claim**: For an input array of $n$ numbers, **Merge Sort** uses $\text{#op}\leq 6n\log_2 n+6n$
    - this is proved in the next lecture
- look at the **logarithm** in the picture: as $n$ grows, $\log_2 n$ becomes **much smaller** than the identity function $n$.
- so the $O(n\log n)$ **time complexity** of merge sort is better than the $O(n^2)$ of *bubble sort, selection sort, and insertion sort*.
- ![](https://i.imgur.com/Ljis12V.png)

### [Merge Sort: Analysis](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/wW9On/merge-sort-analysis)


#### Proof of claim (assuming n = power of 2)
- assume that
    - **level 0** is the root (the **outermost call to Merge Sort**)
    - **level $k$** contains the recursive calls at **depth $k$**
    - so the **number of levels** is $\log_2 n+1$
- ![](https://i.imgur.com/uS2X6ZB.png)
- key idea: **count the #operations level by level**.
- so for level $j$, let's answer two questions:
    1. what's the **# of subproblems at level** $j$?
        - Ans: $2^j$
    2. what's the **input size of each subproblem at level $j$**?
        - Ans: $\dfrac{n}{2^j}$
- Total # of operations at level $j$
    - note that the **$\text{op}$ at level $j$** counts only the work done by the level-$j$ calls themselves (the merges), **not** the work of their recursive calls
    - $\text{#op}_j \leq\text{#sub_problems}_j\cdot(6\cdot\text{output_length}_j)=2^j\cdot 6\dfrac{n}{2^j}=6n$
- $\text{#op}_{all}=\sum_j\text{#op}_j\leq\sum_j6n =(\log_2n+1)6n=6n\log_2n+6n$

![](https://i.imgur.com/2fwU4gG.png)
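
As a sanity check on the level-by-level argument, the following sketch adds up the per-level bound $2^j\cdot 6\cdot\frac{n}{2^j}$ over all levels and compares it with the closed form $6n\log_2 n+6n$. It assumes $n$ is a power of 2, as in the proof; the function name `level_by_level_bound` is made up for this illustration.

```python
def level_by_level_bound(n: int) -> int:
    """Sum the per-level operation bound of Merge Sort for n a power of 2."""
    log_n = n.bit_length() - 1          # exact log2 for powers of 2
    total = 0
    for j in range(log_n + 1):          # levels 0, 1, ..., log2(n)
        subproblems = 2 ** j            # number of subproblems at level j
        size = n // 2 ** j              # input size of each subproblem
        total += subproblems * 6 * size # each merge costs at most 6 * size
    return total


n = 1024
print(level_by_level_bound(n), 6 * n * 10 + 6 * n)
```

Every level contributes the same $6n$, so the total is just $6n$ times the number of levels.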

### [Guiding Principles for Analysis of Algorithms](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/q5fV4/guiding-principles-for-analysis-of-algorithms)
Principles of algorithm analysis
1. **worst-case** analysis
2. **won't** pay much attention to **constant factors, lower-order terms**
3. **asymptotic** analysis: focus on **large** input size $n$

#### Guiding Principle #1
![](https://i.imgur.com/H5MclqU.png)
- benchmarks: some **practical** or **typical** input for the algorithm

#### Guiding Principle #2
![](https://i.imgur.com/EOs3mBM.png)


#### Guiding Principle #3
![](https://i.imgur.com/VLcdkd3.png)
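- As our computational power grows, we naturally turn to problems with larger input sizes
- Another viewpoint: suppose the cost of algorithm 1 is proportional to $n$ and the cost of algorithm 2 is proportional to $n^2$. When computational power grows to 4 times its current level, the input size algorithm 1 can handle also grows 4 times, while the input size algorithm 2 can handle only doubles. (But how does this lead to "Only big problems are interesting"?)
- The larger $n$ is, the less the constant factors and lower-order terms matter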
    - ![](https://i.imgur.com/P53xOTH.png)
    - ![](https://i.imgur.com/JaeL1ty.png)
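
The "4 times the computing power" viewpoint can be illustrated with a small sketch. It assumes, purely for illustration, that the cost of algorithm 1 is exactly $n$ operations and the cost of algorithm 2 is exactly $n^2$ operations; the function names and the budget value are made up.

```python
import math


def max_size_linear(budget: int) -> int:
    """Largest n with cost n <= budget."""
    return budget


def max_size_quadratic(budget: int) -> int:
    """Largest n with cost n * n <= budget."""
    return math.isqrt(budget)


budget = 10 ** 6
for solver in (max_size_linear, max_size_quadratic):
    before = solver(budget)
    after = solver(4 * budget)      # 4 times the computational power
    print(solver.__name__, before, after, after / before)
```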

#### What Is a "Fast" Algorithm?
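To recap:
1. Focus on the worst case, because we make no assumptions about the domain (i.e. the input).
    - > Thought: "making no assumptions about the domain" could be read as assuming every input value is equally likely, with no value more or less probable than another. If all values are equally likely, then $P(\text{worst case})=P(\text{average case})=P(\text{best case})$, so we only look at the most severe one, the worst-case input.
        - > In hindsight: this does not seem right either. The assumption I made above is exactly a uniform-distribution assumption, which contradicts "making no assumptions about the domain". I now think it should be read as making no assumption about the distribution of the input: $P(\textrm{worst case})$ might be large or small, and the same goes for the best-case input. Since we do not know the probability of each case, we just look at the worst one.
2. Pay little attention to constant factors and lower-order terms
3. Focus on the growth rate

So in summary, **fast algorithm $\approx$ worst-case running time grows slowly with input size**. The sweet spot (the key balance?) here is:
- it can be tracked mathematically
- ignoring constant factors and lower-order terms -> easier to analyze and predict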

![](https://i.imgur.com/hSFQDJX.png)

## II. Asymptotic Analysis


### [The Gist](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/o2NpH/the-gist)

#### Motivation
![](https://i.imgur.com/Wvkfp7p.png)
- All of this seems to have been covered earlier

#### Asymptotic Analysis
![](https://i.imgur.com/lkjXf2t.png)
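- Constant factors are ignored <- because they depend heavily on the compiler and programming language
- Lower-order terms are ignored <- because we only care about large $n$
- The original expression is $6n\log_2n+6n$ (the Example on the slide has a typo)
    - Dropping the lower-order term leaves $6n\log_2n$
    - Dropping the constant factor as well leaves $n\log_2n$
- So in the standard terminology, running time = $O(n\log n)$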
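
To see numerically why dropping the lower-order term is harmless, the sketch below prints the ratio of $6n\log_2 n+6n$ to $n\log_2 n$ for a few values of $n$. The ratio equals $6+\frac{6}{\log_2 n}$, so it settles toward the constant 6 as $n$ grows. The function names are made up for this illustration.

```python
import math


def merge_sort_bound(n: int) -> float:
    """The bound 6 n log2(n) + 6 n from the Merge Sort analysis."""
    return 6 * n * math.log2(n) + 6 * n


def ratio_to_n_log_n(n: int) -> float:
    """How many times larger the bound is than n log2(n)."""
    return merge_sort_bound(n) / (n * math.log2(n))


for n in (2 ** 5, 2 ** 10, 2 ** 20):
    print(n, ratio_to_n_log_n(n))
```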

#### Simple Big-O Examples
- For these, you can just look at the slides

<!--
1. ![](https://i.imgur.com/Ai5NlTf.png)
1. ![](https://i.imgur.com/ZnKtfvq.png)
1. ![](https://i.imgur.com/HfcoQgX.png)
1. ![](https://i.imgur.com/EPcufcs.png)
-->

### [Big-Oh Notation](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/KGtUp/big-oh-notation)


#### Big-Oh: English Definition
![](https://i.imgur.com/elhqxUo.png)
- Definition of $T(n)=O(f(n))$: for all sufficiently large $n$, $c\cdot f(n)$ is an upper bound on $T(n)$.
    - $c$ is a constant

#### Big-Oh: Formal Definition
![](https://i.imgur.com/CWQX5rP.png)

- Formal Definition: 
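    - $T(n)=O(f(n))$ iff **there exist** constants $c,n_0>0$ such that $T(n)\le c\cdot f(n),\,\forall n\ge n_0$
- Warning: $c$ and $n_0$ must be constants, i.e. they cannot depend on $n$
- Look at the figure on the left
    - in the figure, $c=2$
    - in the figure, $n_0=$ the value of $n$ at the intersection point
- This is like playing a game: you pick a pair $c,n_0$ to make the inequality ($T(n)\leq c\cdot f(n)$) hold, and your opponent picks an $n\geq n_0$ to break it.
    - If you have a strategy that wins this game, then $T(n)=O(f(n))$
        - i.e. you pick a pair $c,n_0$ such that the inequality holds no matter how large an $n$ your opponent picks
    - If you cannot win, then it is not $O(f(n))$
        - i.e. no matter which $c,n_0$ you pick, your opponent can find an $n$ that violates the inequality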
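
The game can be played out numerically with a small sketch. Note the limitation: checking finitely many values of $n$ can only find a counterexample, it can never prove a Big-Oh claim. The function name `first_violation` and the example functions are made up for this illustration.

```python
from typing import Callable, Optional


def first_violation(
    T: Callable[[int], float],
    f: Callable[[int], float],
    c: float,
    n0: int,
    n_max: int,
) -> Optional[int]:
    """Play the opponent: return the first n in [n0, n_max] with T(n) > c * f(n).

    Returns None if no such n exists in the checked range.
    """
    for n in range(n0, n_max + 1):
        if T(n) > c * f(n):
            return n
    return None


# You pick c = 10, n0 = 1 for T(n) = 3n^2 + 5n + 2 against f(n) = n^2.
print(first_violation(lambda n: 3 * n * n + 5 * n + 2, lambda n: n * n, 10, 1, 10_000))

# You pick c = 100, n0 = 1 for T(n) = n^2 against f(n) = n: the opponent wins.
print(first_violation(lambda n: n * n, lambda n: n, 100, 1, 10_000))
```

In the second call any larger $c$ only delays the opponent's winning move, which matches the intuition that $n^2$ is not $O(n)$.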

### [Basic Examples](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/mb8bV/basic-examples)

#### Example #1
- Shows that lower-order terms can be ignored
- Claim: if $T(n)=a_kn^k+...+a_1n+a_0$, then $T(n)=O(n^k)$
- Proof: 
    - Choose $n_0=1$ and $c=|a_k|+|a_{k-1}|+...+|a_1|+|a_0|$
    - Need to show that $\forall n\geq 1, T(n)\leq c\cdot n^k$
    - We have, for every $n\geq 1,$
        - $T(n)\leq |a_k|n^k+...+|a_1|n+|a_0|\\\leq|a_k|n^k+...+|a_1|n^k+|a_0|n^k\\=c\cdot n^k$
        - Q.E.D.
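- One question here: how do we know which $n_0,c$ to choose in order to prove a Big-O claim like this?
    - Usually by **reverse engineering (working backwards)**: if I pick some $n_0$, which choice of $c$ lets me carry the proof through?
    - the optional video has more examples that use the reverse engineering trick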
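
The choice $n_0=1$, $c=|a_k|+\dots+|a_0|$ from the proof can be checked on a concrete polynomial. This is only a numeric spot check over a finite range, not a proof; the helper names and the sample coefficients are made up for this illustration.

```python
def poly_eval(coeffs: list[int], n: int) -> int:
    """Evaluate a_0 + a_1 n + ... + a_k n^k, with coeffs = [a_0, a_1, ..., a_k]."""
    return sum(a * n ** i for i, a in enumerate(coeffs))


def big_o_witness(coeffs: list[int]) -> tuple[int, int]:
    """Return the (c, n0) pair used in the proof: c = sum of |a_i|, n0 = 1."""
    return sum(abs(a) for a in coeffs), 1


coeffs = [-7, 0, 4, 2]            # T(n) = 2n^3 + 4n^2 - 7
k = len(coeffs) - 1
c, n0 = big_o_witness(coeffs)

# Check T(n) <= c * n^k on a finite range starting at n0.
ok = all(poly_eval(coeffs, n) <= c * n ** k for n in range(n0, 1001))
print(c, n0, ok)
```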

#### Example #2
- Claim: $\forall k\geq 1, n^k\ne O(n^{k-1})$
- Proof: by contradiction. 
    - Suppose $n^k=O(n^{k-1})$ $\implies \exists c,n_0$ such that $n^k\leq c\cdot n^{k-1},\,\forall n\geq n_0$
        - ***Q: If we use proof by contradiction here, shouldn't this read $\exists k\geq 1,\exists c,n_0$?? The negation of "false for every $k\geq 1$" is "true for some $k\geq 1$", right?*** It does not affect the derivation, though
    - dividing both sides by $n^{k-1}$ gives $n\leq c\,\,\forall n\geq n_0$, which is clearly false (take any $n>\max(c,n_0)$). Contradiction. Q.E.D.

### [Big Omega and Theta](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/SxSch/big-omega-and-theta)

#### Omega Notation
![](https://i.imgur.com/sfQDdO4.png)
- Definition
    - $T(n)=\Omega(f(n))$ iff **there exist** constants $c,n_0>0$ such that $T(n)\geq c\cdot f(n),\,\forall n\geq n_0$

#### Theta Notation
- Definition:
    - $T(n)=\Theta(f(n))$ iff $T(n)=O(f(n))$ and $T(n)=\Omega(f(n))$


- Equivalent:
    - there exist constants $c_1, c_2, n_0>0$ such that $c_1f(n)\leq T(n)\leq c_2f(n),\,\forall n\geq n_0$


- Engineers often use Big-Oh notation to state what is really a Big-Theta bound. For example, we describe $\Theta(n)$ as $O(n)$. The reason: **as algorithm designers, what we really care about is the upper bound**, not the lower bound, so even if an algorithm's complexity is $\Theta(n)$, we do not emphasize the $\Theta$ and simply call it $O(n)$.

#### Quiz: Big-Oh, Big-Omega, Theta
![](https://i.imgur.com/PbXBYEa.png)

#### Little-Oh Notation
- Definition:
    - $T(n)=o(f(n))$ iff **for every** constant $c>0$ there exists $n_0$ such that $T(n)\leq c\cdot f(n),\,\forall n\geq n_0$
- Exercise:
    - $\forall k\geq 1,n^{k-1}=o(n^k)$
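
One possible way to work the exercise, as a sketch: fix any $c>0$. For $n\geq 1$ we can divide both sides by $n^{k-1}>0$, so

$$n^{k-1}\leq c\cdot n^k \iff 1\leq c\cdot n \iff n\geq \frac{1}{c}$$

Choosing $n_0=\lceil 1/c\rceil$ therefore makes the inequality hold for all $n\geq n_0$. Note that $n_0$ is allowed to depend on $c$ here (though not on $n$), which is exactly what distinguishes little-oh from Big-Oh.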

#### Where Does Notation Come From?
![](https://i.imgur.com/Y3BTWnN.png)

### [Additional Examples (Review - Optional)](https://www.coursera.org/learn/algorithms-divide-conquer/lecture/yl6kU/additional-examples-review-optional)

#### Additional Example #1
- A formal proof that one function is Big-Oh of another
- uses reverse engineering
![](https://i.imgur.com/jTSfLg9.png)

#### Additional Example #2
- **Proof by contradiction** is the usual way to show that one function is **not** Big-Oh of another.
![](https://i.imgur.com/2FM1QD6.png)
- In the final inequality, the left side grows with $n$ (it tends to infinity as $n$ does), while the right side is just a constant, so $2^{9n}\le c,\,\forall n\ge n_0$ cannot always hold.

#### Additional Example #3
- ***Theta notation means that the two functions are "asymptotically equal (??)". Q: what does that mean?***
![](https://i.imgur.com/KDPNDV0.png)
- Note that the slide says **"positive"** functions, i.e. they never take negative values

![](https://i.imgur.com/1YCmXAd.png)
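- The inequality $2\cdot\max(f(n),g(n))\geq f(n)+g(n)$ holds because the right side is the larger value plus the smaller one, while the left side is the larger value taken twice.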
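
Assuming the claim on the slide is $\max(f(n),g(n))=\Theta(f(n)+g(n))$ for positive functions, the two-sided bound can be spot-checked numerically with $c_1=\frac{1}{2}$, $c_2=1$, $n_0=1$. The sketch below does this for two sample functions; the function name and the samples are made up, and a finite check like this illustrates the inequality rather than proving it.

```python
from typing import Callable


def check_max_is_theta_of_sum(
    f: Callable[[int], int],
    g: Callable[[int], int],
    n_max: int = 1000,
) -> bool:
    """Check (f + g) / 2 <= max(f, g) <= f + g for n = 1, ..., n_max."""
    for n in range(1, n_max + 1):
        total = f(n) + g(n)
        larger = max(f(n), g(n))
        # Written with integers only: total <= 2 * larger and larger <= total.
        if not (total <= 2 * larger <= 2 * total):
            return False
    return True


print(check_max_is_theta_of_sum(lambda n: n * n, lambda n: 100 * n))
```

The upper bound $\max(f,g)\leq f+g$ is where positivity is needed: it fails if the smaller function can be negative.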