## 1. Quick Sort
`sentry division`, `recursion`




### Basic Idea

Divide and conquer: partition the original array into 2 sub-arrays such that every element of the first sub-array is no larger than every element of the second, then recursively sort the 2 sub-arrays. Specifically: 
1. <font color=orange>**normally, the first or last element is chosen as the base**</font> $x_b$, which stays in place and is only swapped once the 2 sentries meet.
   1. set 2 sentries (pointers) at the start and at the end, denoted as $p_s$ and $p_e$, respectively; 
      - $p_s$ moves from left to right, looking for an element $x_l$ larger than the base;
      - $p_e$ moves from right to left, looking for an element $x_s$ smaller than the base;
   2. swap $x_s$ and $x_l$;
   3. keep moving the 2 sentries and repeat the process above;
2. once $p_s$ and $p_e$ meet, swap $x_b$ with $arr[p_s]$ (*i.e.*, $arr[p_e]$); the base is now at its final sorted position.
3. recursively quick sort the 2 sub-arrays on either side of the base, which finally yields an entirely ordered array.

> 1. <font color=yellow>**When the <u>leftmost</u> element is chosen as the base (`arr[l]`), the <u>right</u> sentry should move first**, *i.e.*, 
> ```
>  while ..... : p_e -= 1;
>  while ..... : p_s += 1;
> ``` 
>  2. **Conversely, when the <u>rightmost</u> element is chosen as the base (`arr[r]`), the <u>left</u> sentry should move first.**
> 
> The reason: suppose the leftmost element is chosen as the base (`arr[l]`). If the left sentry `p_s` moves first, it may be sitting on an element **larger** than the base at the moment it meets the right sentry `p_e`. Swapping the base (at the left end) with `arr[p_s]` then puts that larger element to the left of the base, so the partition is incorrect.</font>
> 
> ps: it follows that, with this scheme, **the base should sit at one end of the sub-array rather than in the middle**; a base chosen elsewhere can first be swapped to an end.
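
To make the ordering rule concrete, here is a minimal sketch that partitions around the leftmost element with either sentry moving first. The helper `partition_left_base` and its `right_first` flag are hypothetical names introduced only for this illustration; they are not part of the classes in these notes.

```python
def partition_left_base(arr, right_first):
    """Partition arr in place around arr[0]; return the base's final index.

    Hypothetical helper for illustration: `right_first` selects which
    sentry moves first in each round.
    """
    lo, hi = 0, len(arr) - 1
    base = arr[0]
    while lo < hi:
        if right_first:
            # correct order for a left base: right sentry first
            while lo < hi and arr[hi] >= base: hi -= 1
            while lo < hi and arr[lo] <= base: lo += 1
        else:
            # wrong order for a left base: left sentry first
            while lo < hi and arr[lo] <= base: lo += 1
            while lo < hi and arr[hi] >= base: hi -= 1
        arr[lo], arr[hi] = arr[hi], arr[lo]
    # move the base to the meeting point
    arr[0], arr[lo] = arr[lo], arr[0]
    return lo

good = [3, 1, 5]
good_idx = partition_left_base(good, right_first=True)
bad = [3, 1, 5]
bad_idx = partition_left_base(bad, right_first=False)
print(good, good_idx)  # [1, 3, 5] 1
print(bad, bad_idx)    # [5, 1, 3] 2
```

In the second call the left sentry runs all the way to `5`, where it meets the right sentry, so the final swap moves `5` in front of the base `3`.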

```python
import random  # used by the random-base version further below
from typing import List

class QuickSort:
    def Sort(self, arr: List[int]):
        self.quickSort(arr, 0, len(arr) - 1)
        return arr

    def Partition(self, arr: List[int], p_s: int, p_e: int):
        # !important: the position of the base determines which sentry moves first.
        # The base is the rightmost element here, so the left sentry moves first.
        p_b = p_e
        while p_s < p_e:
            # left sentry: stop at an element larger than the base
            while p_s < p_e and arr[p_s] <= arr[p_b]: p_s += 1
            # right sentry: stop at an element smaller than the base
            while p_s < p_e and arr[p_e] >= arr[p_b]: p_e -= 1
            arr[p_s], arr[p_e] = arr[p_e], arr[p_s]
        # the sentries have met: move the base to its final position
        arr[p_b], arr[p_s] = arr[p_s], arr[p_b]
        return p_s

    def quickSort(self, arr: List[int], l: int, r: int):
        # sub-arrays of length 0 or 1 are already sorted
        if l >= r:
            return
        par = self.Partition(arr, l, r)
        self.quickSort(arr, l, par - 1)
        self.quickSort(arr, par + 1, r)
```


```python
QS = QuickSort()
arr = [84, 4, 2, 5, 6, 54, 3, 1, 2, 1, 1]
sorted_arr = QS.Sort(arr)
sorted_arr
```

```
[1, 1, 1, 2, 2, 3, 4, 5, 6, 54, 84]
```

### Characteristics

- **time complexity**:
   - best $\Omega(N \log N)$: each round of partition divides the array into 2 <u>**sub-arrays of equal length**</u>; the partition operation takes linear time $O(N)$ ($N-1$ comparisons are made with the base); the total number of recursive rounds is $O(\log N)$;
   - average $\Theta(N \log N)$: the number of recursive rounds is also $O(\log N)$ on average;
   - worst $O(N^2)$: for certain special input arrays, each partition operation divides the array into <u>sub-arrays of lengths $1$ and $N-1$</u>, and the number of recursive rounds increases to $N$.
      > the worst case can be made very unlikely by "*random selection of base*", see below for details.
- **space complexity**:
  - the best and the average complexity are both $O(\log N)$;
  - the worst complexity is $O(N)$, e.g. when the input array is completely reversed.
    > via "*tail call*", the worst space complexity can be reduced to $O(\log N)$, see below for details.
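- Although its average time complexity matches that of merge sort and heap sort, quick sort is usually faster in practice, because:
  - Worst case is rare: the worst time complexity $O(N^2)$ is worse than that of merge sort and heap sort, but statistically this case occurs with very low probability. In most cases quick sort runs in $O(N \log N)$.
  - Efficient cache usage: during sentry partition the whole sub-array is loaded into the cache, so element access is fast; heap sort accesses elements in a jumping pattern and does not have this property.
  - Low constant factor: among the three algorithms, quick sort has the lowest combined cost of comparisons, assignments, and swaps (similar to why insertion sort is faster than bubble sort).
- *In-place*: no auxiliary array is needed; the recursion uses only $O(\log N)$ stack-frame space on average. Unlike merge sort, quick sort needs only $O(1)$ extra storage to rearrange the array within each recursive call, so the overall space complexity depends on the recursion depth.
- *Unstable*: sentry partition may change the relative order of equal elements.
- *Adaptive*: the running time depends on the input; if every round of sentry partition splits an array of length $N$ into sub-arrays of lengths $1$ and $N-1$, the time complexity degrades to $O(N^2)$.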
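
The gap between the balanced and the degenerate case can be observed by measuring how deeply the partitioning nests. The following is a minimal sketch, assuming a rightmost base as in the `QuickSort` class above; `partition` and `max_depth` are hypothetical helper names, and the fixed seed is only there to make the run repeatable.

```python
import random

def partition(arr, lo, hi):
    """Sentry partition with the rightmost element as base; left sentry moves first."""
    base = arr[hi]
    i, j = lo, hi
    while i < j:
        while i < j and arr[i] <= base: i += 1
        while i < j and arr[j] >= base: j -= 1
        arr[i], arr[j] = arr[j], arr[i]
    arr[hi], arr[i] = arr[i], arr[hi]
    return i

def max_depth(arr, lo, hi):
    """Sort arr[lo..hi] in place and return the deepest level of nested partitions."""
    if lo >= hi:
        return 0
    p = partition(arr, lo, hi)
    return 1 + max(max_depth(arr, lo, p - 1), max_depth(arr, p + 1, hi))

n = 64
sorted_input = list(range(n))
shuffled_input = list(range(n))
random.Random(0).shuffle(shuffled_input)  # fixed seed, an arbitrary choice

depth_sorted = max_depth(sorted_input, 0, n - 1)
depth_shuffled = max_depth(shuffled_input, 0, n - 1)
print(depth_sorted)    # 63, i.e. n - 1: every partition splits off only the base
print(depth_shuffled)  # much smaller for a typical shuffled input
```

On already sorted input the base is always the maximum of its sub-array, so each round shrinks the problem by one element and the nesting depth grows linearly with $N$.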


### Tail Call

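In each round, recurse only into the shorter sub-array and handle the longer one in a loop within the same call. The sub-array passed to a recursive call is then less than half the length of the current one, so the worst recursion depth is bounded by $O(\log N)$. 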

<font color=red size=5> TODO </font>

https://blog.51cto.com/nxlhero/1112835

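```python
def quick_sort(nums, l, r):
    # stop once the sub-array has length 1 (or is empty)
    while l < r:
        # sentry partition; Partition is an instance method, so call it on an instance
        i = QuickSort().Partition(nums, l, r)
        # recurse only into the shorter sub-array to bound the recursion depth,
        # then continue the loop on the longer one
        if i - l < r - i:
            quick_sort(nums, l, i - 1)
            l = i + 1
        else:
            quick_sort(nums, i + 1, r)
            r = i - 1
```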
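
The depth bound can be checked with a self-contained sketch of the same idea. Here `partition` re-implements the rightmost-base sentry partition and `quick_sort_shallow` is a hypothetical variant that also reports the deepest recursion level it reached; neither name comes from the original notes.

```python
import math
import random

def partition(arr, lo, hi):
    """Sentry partition with the rightmost element as base; left sentry moves first."""
    base = arr[hi]
    i, j = lo, hi
    while i < j:
        while i < j and arr[i] <= base: i += 1
        while i < j and arr[j] >= base: j -= 1
        arr[i], arr[j] = arr[j], arr[i]
    arr[hi], arr[i] = arr[i], arr[hi]
    return i

def quick_sort_shallow(arr, lo, hi, depth=1):
    """Sort arr[lo..hi]; return the deepest recursion level (initial call = 1)."""
    deepest = depth
    while lo < hi:
        p = partition(arr, lo, hi)
        if p - lo < hi - p:
            # left part is shorter: recurse into it, loop on the right part
            deepest = max(deepest, quick_sort_shallow(arr, lo, p - 1, depth + 1))
            lo = p + 1
        else:
            # right part is shorter (or equal): recurse into it, loop on the left part
            deepest = max(deepest, quick_sort_shallow(arr, p + 1, hi, depth + 1))
            hi = p - 1
    return deepest

n = 1024
already_sorted = list(range(n))
shuffled = list(range(n))
random.Random(1).shuffle(shuffled)  # fixed seed, an arbitrary choice

deepest_sorted = quick_sort_shallow(already_sorted, 0, n - 1)
deepest_shuffled = quick_sort_shallow(shuffled, 0, n - 1)
print(deepest_sorted)  # 2: every recursive call receives an empty sub-array
print(deepest_shuffled <= math.log2(n) + 1)  # True
```

Sorted input is the worst case for the plain recursive version, yet here every recursive call lands on the empty side and the long side is consumed by the loop.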

### Median of Three

We take three elements, from the beginning, the middle, and the end of the sub-array, compare them, and use the median of these 3 values as the base. This usually gives a more balanced partition than always taking the element at a fixed position.

However, if the array to be sorted is relatively large, "median of three" may not be enough, and it may be necessary to take the median of five or of ten elements instead. 
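
A minimal sketch of this selection rule, under the assumption that the chosen base is first swapped to the right end so that a rightmost-base sentry partition can be reused. The names `median_of_three`, `partition_right_base`, and `quick_sort_median3` are hypothetical and introduced only for illustration.

```python
def median_of_three(arr, lo, hi):
    """Return the index (lo, mid, or hi) that holds the median of the three samples."""
    mid = (lo + hi) // 2
    a, b, c = arr[lo], arr[mid], arr[hi]
    if a <= b <= c or c <= b <= a:
        return mid
    if b <= a <= c or c <= a <= b:
        return lo
    return hi

def partition_right_base(arr, lo, hi):
    """Sentry partition with the rightmost element as base; left sentry moves first."""
    base = arr[hi]
    i, j = lo, hi
    while i < j:
        while i < j and arr[i] <= base: i += 1
        while i < j and arr[j] >= base: j -= 1
        arr[i], arr[j] = arr[j], arr[i]
    arr[hi], arr[i] = arr[i], arr[hi]
    return i

def quick_sort_median3(arr, lo, hi):
    if lo >= hi:
        return
    # move the median of the three samples to the right end, then partition as usual
    m = median_of_three(arr, lo, hi)
    arr[m], arr[hi] = arr[hi], arr[m]
    p = partition_right_base(arr, lo, hi)
    quick_sort_median3(arr, lo, p - 1)
    quick_sort_median3(arr, p + 1, hi)

data = [84, 4, 2, 5, 6, 54, 3, 1, 2, 1, 1]
quick_sort_median3(data, 0, len(data) - 1)
print(data)  # [1, 1, 1, 2, 2, 3, 4, 5, 6, 54, 84]
print(median_of_three(list(range(9)), 0, 8))  # 4
```

On an already sorted array the sampled median is the middle element, so the first partition is balanced instead of splitting off a single element.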

### Random Selection of Base

**Poor selection of base**: If quicksort always selects an end element of the sub-array (leftmost or rightmost) as the base, then for a completely ordered or completely reversed input each round splits off only one element, which leads to the worst time complexity $O(N^2)$.

Therefore, a random function can be used to pick a random element of the sub-array as the base in each round, so that the above degradation is avoided with high probability. 

Note that the worst case is still possible, so the worst time complexity of quicksort remains $O(N^2)$: random selection does not guarantee that every partition point is well chosen.

```python
# `random` and `List` are imported in the first code cell
class Solution:
    def Sort(self, arr: List):
        self.recursiveSort(arr, 0, len(arr) - 1)
        return arr

    def Partition(self, arr: List, p_s: int, p_e: int):
        # pick a random base and swap it to the left end
        p_b = random.randint(p_s, p_e)
        arr[p_b], arr[p_s] = arr[p_s], arr[p_b]
        l, r = p_s, p_e
        while l < r:
            # the base is at the left end, so the right sentry moves first
            while l < r and arr[r] >= arr[p_s]: r -= 1
            while l < r and arr[l] <= arr[p_s]: l += 1
            arr[l], arr[r] = arr[r], arr[l]
        # the sentries have met: move the base to its final position
        arr[p_s], arr[l] = arr[l], arr[p_s]
        return l

    def recursiveSort(self, arr: List, l: int, r: int):
        # sub-arrays of length 0 or 1 are already sorted
        if l >= r: return
        mid = self.Partition(arr, l, r)
        self.recursiveSort(arr, l, mid - 1)
        self.recursiveSort(arr, mid + 1, r)
```

```python
ss = Solution()
nums = [20, 24, 15, 12, 1, 2, 3, 54, 6]
ss.Sort(nums)
print(nums)
```

```
[1, 2, 3, 6, 12, 15, 20, 24, 54]
```
