#  Sorting Algorithms



We present sorting algorithms here primarily to provide some practice in **thinking about algorithm design and complexity analysis**.

We begin with a simple but **inefficient** algorithm, **selection sort**.

## 1 Selection sort 

### 1.1 Selection sort in Python
**Selection sort** works by maintaining the **loop invariant** that, given a partitioning of the list into 

* <b>a prefix (L[0:i])</b> 

* <b>a suffix (L[i:len(L)])</b>,

the prefix is sorted and no element in the prefix is larger than <b>the smallest element</b> in the suffix.

We use induction to reason about **loop invariants**.

* **Base case**: At the **start** of the first iteration, the **prefix** is **empty**, i.e., the **suffix** is the **entire list**. An empty prefix is trivially sorted and contains no element larger than anything in the suffix, so the invariant is true.


* **Induction step**: At each step of the algorithm, we move `one` element **from the suffix to the prefix.** 


>   * We do this by appending a **minimum** element of the suffix to the end of the prefix.


>   * Because the invariant held before we moved the element, we know that after we append the element the prefix is **still sorted**.


>   * We also know that, since we removed the smallest element in the suffix, **no element in the prefix is larger than the smallest element in the suffix**.


* When **the loop is exited**, the **prefix** includes the **entire list**, and the **suffix**
is **empty**.


Therefore, the entire list is now sorted in **ascending** order.
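The invariant can also be checked mechanically. The following is a minimal sketch, not the implementation used in this section: `sel_sort_checked` is a hypothetical helper that asserts both parts of the invariant before every pass and then moves a minimum element of the suffix to the end of the prefix.

```python
def sel_sort_checked(values):
    """Selection sort that asserts the loop invariant before every pass."""
    for i in range(len(values)):
        prefix, suffix = values[:i], values[i:]
        # Invariant, part 1: the prefix is sorted.
        assert all(prefix[k] <= prefix[k + 1] for k in range(len(prefix) - 1))
        # Invariant, part 2: nothing in the prefix exceeds the suffix minimum.
        assert not prefix or max(prefix) <= min(suffix)
        # Move a minimum element of the suffix to the end of the prefix.
        min_index = min(range(i, len(values)), key=values.__getitem__)
        values[i], values[min_index] = values[min_index], values[i]
    return values


data = [7, 4, 5, 9, 8, 2, 1]
print(sel_sort_checked(data))  # [1, 2, 4, 5, 7, 8, 9]
```

If a pass ever broke the invariant, one of the assertions would fail on the next pass, which makes this a convenient way to test a modified version of the algorithm.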

Example:
```c
int a[7] = {7, 4, 5, 9, 8, 2, 1};
```


![selsort](./img/ds/selsort.jpg)

In [3]:
def selSort(L):
    """Assumes that L is a list of elements that can be
         compared using <.
       Sorts L in ascending order"""
    suffixStart = 0
    while suffixStart != len(L):
        #look at each element in suffix
        for i in range(suffixStart, len(L)):
            if L[i] < L[suffixStart]:
                #swap position of elements
                L[suffixStart], L[i] = L[i], L[suffixStart]
        suffixStart += 1

In [4]:
L=[7, 4, 5, 9, 8, 2, 1]
selSort(L)
print(L)

[1, 2, 4, 5, 7, 8, 9]


Unfortunately, selection sort is rather inefficient. Its running time is dominated by the nested loops:
```python
while suffixStart != len(L):

   for i in range(suffixStart, len(L)):
  
   suffixStart += 1
```
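For a list of size $n$, the outer loop executes $n$ times, and each pass of the inner loop begins by comparing `L[suffixStart]` with itself. Ignoring that redundant self-comparison:

On the first pass through the outer loop, the inner loop makes $n-1$ comparisons. 
On the second pass through the outer loop, the inner loop makes $n-2$ comparisons.

On the next-to-last pass through the outer loop, the inner loop makes `one` comparison, and the last pass has nothing left to compare.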

Thus, the total number of comparisons for a list of size $n$ is the following:

$(n-1)+(n-2)+...+1=\frac{n(n-1)}{2}=\frac{1}{2}n^2-\frac{1}{2}n$

The complexity of the entire function is $O(n^2)$. I.e., it is **quadratic** in the length n of L.

* The best-case, average-case, and worst-case time complexities are all $O(n^2)$, because the number of comparisons does not depend on the order of the input.
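The comparison count can be confirmed empirically. The sketch below uses a hypothetical helper, `selection_sort_count`, written in the minimum-index style (as in the C version that follows) so that every comparison involves two distinct elements; it prints the measured count next to $\frac{n(n-1)}{2}$.

```python
def selection_sort_count(values):
    """Sort values in place; return the number of element comparisons made."""
    comparisons = 0
    n = len(values)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            comparisons += 1
            if values[j] < values[min_index]:
                min_index = j
        values[i], values[min_index] = values[min_index], values[i]
    return comparisons


for n in (7, 100, 1000):
    count = selection_sort_count(list(range(n, 0, -1)))
    print(n, count, n * (n - 1) // 2)  # the last two columns agree
```

The count is the same for any input of a given size, which is why selection sort has no better best case.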

### 1.2  SelectionSort in C

In [175]:
%%file ./demo/src/SelectionSort.c

/*
 Sorting an array using Selection Sort (SelectionSort.c) 
*/

#include <stdio.h>
#include <stdlib.h>

void selectionSort(int a[], int size);
void print(const int a[], int iMin, int iMax);

// Sort the given array of size using selection sort
void selectionSort(int a[], int size)
{
   int temp; // for swapping
   for (int i = 0; i < size - 1; ++i)
   {
      // for tracing
      print(a, 0, i - 1);
      print(a, i, size - 1);

      // [0, i-1] already sorted
      // Search for the smallest element in [i, size-1]
      //  and swap with a[i]
      int minIndex = i; // assume first element is the smallest
      for (int j = i + 1; j < size; ++j)
      {
         if (a[j] < a[minIndex])
            minIndex = j;
      }
      if (minIndex != i)
      { // swap
         temp = a[i];
         a[i] = a[minIndex];
         a[minIndex] = temp;
      }

      // for tracing
      printf("=> ");
      print(a, 0, i - 1);
      print(a, i, size - 1);
      printf("\n");
   }
}

// Print the contents of the array in [iMin, iMax]
void print(const int a[], int iMin, int iMax)
{
   printf("{");
   for (int i = iMin; i <= iMax; ++i)
   {
      printf("%d", a[i]);
      if (i < iMax)
         printf(",");
   }
   printf("}");
}

int main()
{
   const int SIZE = 7;
   int a[7] = {7, 4, 5, 9, 8, 2, 1};
   print(a, 0, SIZE - 1);
   printf("\n");
   selectionSort(a, SIZE);
   print(a, 0, SIZE - 1);
   printf("\n");

   return 0;
}


Overwriting ./demo/src/SelectionSort.c


In [176]:
!gcc -o ./demo/bin/SelectionSort ./demo/src/SelectionSort.c

In [177]:
!.\demo\bin\SelectionSort

{7,4,5,9,8,2,1}
{}{7,4,5,9,8,2,1}=> {}{1,4,5,9,8,2,7}
{1}{4,5,9,8,2,7}=> {1}{2,5,9,8,4,7}
{1,2}{5,9,8,4,7}=> {1,2}{4,9,8,5,7}
{1,2,4}{9,8,5,7}=> {1,2,4}{5,8,9,7}
{1,2,4,5}{8,9,7}=> {1,2,4,5}{7,9,8}
{1,2,4,5,7}{9,8}=> {1,2,4,5,7}{8,9}
{1,2,4,5,7,8,9}


## 2 Merge Sort
We can do a lot better than quadratic time using a **divide-and-conquer** algorithm.

The basic idea is to combine solutions of simpler instances of the original problem. 


**In general, a divide-and-conquer algorithm is characterized by**

* 1 A **threshold input size**, below which the problem is **not subdivided**

* 2 The size and number of **sub-instances** into which an instance is **split**,

* 3 The algorithm used to **combine** sub-solutions.

The threshold is sometimes called the **recursive base**. For `item 2` it is usual to consider the ratio of `initial` problem size to `sub-instance` size. In most of the examples we’ve seen so far, the ratio was 2.

Merge sort is a prototypical <b>divide-and-conquer algorithm</b>. It was invented in 1945 by John von Neumann, and is still widely used.

Like many divide-and-conquer algorithms it is most easily described recursively:

* If the list is of **length 0 or 1**, it is **already** sorted. 
* If the list has `more than` one element, **split** the list into `two` lists, and use **merge sort** to sort each of them.
* **Merge** the results.

>**A merge sort** works as follows:
>
>* `Divide` the unsorted list into `n` sublists, each containing `1` element (a list of 1 element is considered sorted).
>
>* Repeatedly `merge` sublists to produce new sorted sublists until there is only 1 sublist remaining.



The key observation made by von Neumann is that **two sorted lists** can be efficiently **merged** into a single sorted list.

<font color='blue'>**The merge idea**</font> is: 

look at the **first** element of each list, and move the **smaller of the two** to the **end** of the **result** list.

When one of the lists is empty, all that remains is to copy the remaining items from the other list.


This will be the sorted list.

Consider, for example, merging the two lists
```
 [1,5,12,18,19,20]
 [2,3,4,17]
```
![mergesortedlist](./img/ds/mergesortedlist.png)

**The merge sort**

![merge_sort](./img/ds/merge_sort.jpg)
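The bottom-up description above (start from `n` one-element sublists and merge repeatedly until one remains) can be sketched directly. This is an illustration under assumptions: `merge_sort_bottom_up` is a hypothetical name, and the standard-library `heapq.merge` stands in for the merge step described in this section.

```python
from heapq import merge  # standard-library merge of already-sorted iterables


def merge_sort_bottom_up(values):
    """Return a new sorted list built by repeatedly merging sorted runs."""
    runs = [[v] for v in values]  # n one-element (hence sorted) sublists
    while len(runs) > 1:
        merged = []
        # Merge neighbouring runs pairwise.
        for k in range(0, len(runs) - 1, 2):
            merged.append(list(merge(runs[k], runs[k + 1])))
        if len(runs) % 2 == 1:  # odd run out: carry it to the next round
            merged.append(runs[-1])
        runs = merged
    return runs[0] if runs else []


print(merge_sort_bottom_up([8, 4, 5, 3, 2, 9, 4, 1]))  # [1, 2, 3, 4, 4, 5, 8, 9]
```

Each round halves the number of runs, so the loop body executes about $\log_2 n$ times.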

**The complexity of the merge process**

It involves two constant-time operations:

* 1 comparing the values of elements 

* 2 copying elements from one list to another. 

The number of comparisons is **O(len(L))**, where L is the **longer** of the two lists. 

The number of copy operations is **O(len(L1) + len(L2))**, because each element gets copied exactly once. 

Therefore, `merging` two sorted lists is **linear** in **the length of the lists**: $O(len(L1) + len(L2))$.
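To make the linear bound concrete, here is a small sketch that merges the two example lists and counts the comparisons. `merge_count` is a hypothetical helper written for this illustration; it is not the `merge` function defined later in this section.

```python
def merge_count(left, right):
    """Merge two sorted lists; return (merged list, number of comparisons)."""
    result, comparisons = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One list is exhausted: copy whatever remains of the other.
    result.extend(left[i:])
    result.extend(right[j:])
    return result, comparisons


merged, comparisons = merge_count([1, 5, 12, 18, 19, 20], [2, 3, 4, 17])
print(merged)       # [1, 2, 3, 4, 5, 12, 17, 18, 19, 20]
print(comparisons)  # 7
```

Every comparison moves one element into the result, so the number of comparisons can never exceed `len(left) + len(right) - 1`.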



**Let’s analyze the complexity of mergeSort.** 

We already know that the time complexity of `merge` is $O(len(L))$. At each level of recursion the total number of elements to be merged is $len(L)$. 

Therefore, the time complexity of mergeSort is $O(len(L))$ `multiplied` by the number of levels of `recursion`. 

Since mergeSort divides the list <b>in half</b> each time, we know that the number of levels of recursion is $O(\log(len(L)))$. 
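The level-by-level argument above can also be written as a recurrence. The following is a sketch under two simplifying assumptions that are not in the text: $n$ is a power of 2, and $c$ is a constant bounding the cost per element of one merge.

$$T(1) = c, \qquad T(n) = 2\,T(n/2) + c\,n$$

$$T(n) = 2^k\,T(n/2^k) + k\,c\,n \quad \text{after } k \text{ levels of unrolling}$$

$$T(n) = c\,n + c\,n\log_2 n = O(n\log n) \quad \text{taking } k = \log_2 n$$

The term $k\,c\,n$ is exactly "linear work per level, multiplied by the number of levels".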
  
Therefore, the time complexity of mergeSort is $O(n\log n)$, where $n$ is len(L).

This improvement in time complexity comes with a price.

**Space complexity**

<b>Selection sort</b> is an example of an <b>in-place</b> sorting algorithm.
* Because it works by swapping the place of elements **within** the list, it uses only <b>a constant amount of extra storage</b> (`one` element in our implementation). 

The <b>merge sort</b> algorithm involves making <b>copies of the list</b>. This means that its space complexity is $O(len(L))$.



### 2.1 Merge Sort in Python

In [178]:
def merge(left, right, compare):
    """Assumes left and right are sorted lists and
         compare defines an ordering on the elements.
       Returns a new sorted (by compare) list containing the
         same elements as (left + right) would contain."""
    
    result = []
    i,j = 0, 0
    while i < len(left) and j < len(right):
        if compare(left[i], right[j]):
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    while (i < len(left)):
        result.append(left[i])
        i += 1
    while (j < len(right)):
        result.append(right[j])
        j += 1
    return result

def merge_sort(L, compare = lambda x,y:x<y):
    """Assumes L is a list, compare defines an ordering
         on elements of L
       Returns a new sorted list containing the same elements as L"""
    if len(L) < 2:
        return L[:]
    else:
        middle = len(L)//2
        left = merge_sort(L[:middle], compare)
        right = merge_sort(L[middle:], compare)
        return merge(left, right, compare)

Notice that we have made the `comparison` operator a parameter of the `merge_sort` function and written a `lambda` expression to supply a default value.

In [179]:
L=[1,5,12,18,19,20,2,3,4,17]
L1=merge_sort(L)
print(L1)
L2=merge_sort(L,lambda x,y:x>y)
print(L2)

[1, 2, 3, 4, 5, 12, 17, 18, 19, 20]
[20, 19, 18, 17, 12, 5, 4, 3, 2, 1]


### 2.2 MergeSort in C

In [180]:
%%file ./demo/src/MergeSort.c

/* Sorting an array using Merge Sort (MergeSort.c) */
#include <stdio.h>
#include <stdlib.h>
 
void mSort(int a[], int size);
void mergeSort(int a[], int iLeft, int iRight, int work[]);
void merge(int a[], int iLeftHalfLeft, int iLeftHalfRight,
           int iRightHalfLeft, int iRightHalfRight, int work[]);
void print(const int a[], int iLeft, int iRight);

 
// Sort the given array of size
void mSort(int a[], int size) {
   int work[size];  // work space
   mergeSort(a, 0, size - 1, work);
}
 
// Sort the given array in [iLeft, iRight]
void mergeSort(int a[], int iLeft, int iRight, int work[]) {
   if ((iRight - iLeft) >= 1) {   // more than 1 element, divide and sort
      // Divide into left and right half
      int iLeftHalfLeft = iLeft;
      int iLeftHalfRight = (iRight + iLeft) / 2;   // truncate
      int iRightHalfLeft = iLeftHalfRight + 1;
      int iRightHalfRight = iRight;
 
      // Recursively sort each half
      mergeSort(a, iLeftHalfLeft, iLeftHalfRight, work);
      mergeSort(a, iRightHalfLeft, iRightHalfRight, work);
 
      // Merge two halves
      merge(a, iLeftHalfLeft, iLeftHalfRight, iRightHalfLeft, iRightHalfRight, work);
   }
}
 
// Merge two halves in [iLeftHalfLeft, iLeftHalfRight] and [iRightHalfLeft, iRightHalfRight]
// Assume that iLeftHalfRight + 1 = iRightHalfLeft
void merge(int a[], int iLeftHalfLeft, int iLeftHalfRight,
           int iRightHalfLeft, int iRightHalfRight, int work[]) {
   int size = iRightHalfRight - iLeftHalfLeft + 1;
   int iResult = 0;
   int iLeft = iLeftHalfLeft;
   int iRight = iRightHalfLeft;
   while (iLeft <= iLeftHalfRight && iRight <= iRightHalfRight) {
      if (a[iLeft] <= a[iRight]) {
         work[iResult++] = a[iLeft++];
      } else {
         work[iResult++] = a[iRight++];
      }
   }
   // Copy the remaining left or right into work
   while (iLeft <= iLeftHalfRight) work[iResult++] = a[iLeft++];
   while (iRight <= iRightHalfRight) work[iResult++] = a[iRight++];
 
   // for tracing
   print(a, iLeftHalfLeft, iLeftHalfRight);
   print(a, iRightHalfLeft, iRightHalfRight);
   printf("=> ");
   print(work, 0, size - 1);
   printf("\n");
 
   // Copy the work back to the original array
   for (iResult = 0, iLeft = iLeftHalfLeft; iResult < size; ++iResult, ++iLeft) {
      a[iLeft] = work[iResult];
   }
}
 
// Print the contents of the given array from iLeft to iRight (inclusive)
void print(const int a[], int iLeft, int iRight) {
   printf("{");
   for (int i = iLeft; i <= iRight; ++i) {
      printf("%d", a[i]);
      if (i < iRight) printf(",");
   }
   printf("} ");
}

 
int main() {
   // Test 1
   const int SIZE_1 = 8;
   int a1[8] = {8, 4, 5, 3, 2, 9, 4, 1};
 
   print(a1, 0, SIZE_1 - 1);
   printf("\n");
   mSort(a1, SIZE_1);
   print(a1, 0, SIZE_1 - 1);
   printf("\n \n");
 
   // Test 2
   const int SIZE_2 = 13;
   int a2[13] = {8, 4, 5, 3, 2, 9, 4, 1, 9, 1, 2, 4, 5};
 
   print(a2, 0, SIZE_2 - 1);
   printf("\n");
   mSort(a2, SIZE_2);
   print(a2, 0, SIZE_2 - 1);
   printf("\n \n");
    
   return 0;
}

Overwriting ./demo/src/MergeSort.c


In [181]:
!gcc -o ./demo/bin/MergeSort ./demo/src/MergeSort.c

In [182]:
!.\demo\bin\MergeSort

{8,4,5,3,2,9,4,1} 
{8} {4} => {4,8} 
{5} {3} => {3,5} 
{4,8} {3,5} => {3,4,5,8} 
{2} {9} => {2,9} 
{4} {1} => {1,4} 
{2,9} {1,4} => {1,2,4,9} 
{3,4,5,8} {1,2,4,9} => {1,2,3,4,4,5,8,9} 
{1,2,3,4,4,5,8,9} 
 
{8,4,5,3,2,9,4,1,9,1,2,4,5} 
{8} {4} => {4,8} 
{5} {3} => {3,5} 
{4,8} {3,5} => {3,4,5,8} 
{2} {9} => {2,9} 
{2,9} {4} => {2,4,9} 
{3,4,5,8} {2,4,9} => {2,3,4,4,5,8,9} 
{1} {9} => {1,9} 
{1,9} {1} => {1,1,9} 
{2} {4} => {2,4} 
{2,4} {5} => {2,4,5} 
{1,1,9} {2,4,5} => {1,1,2,4,5,9} 
{2,3,4,4,5,8,9} {1,1,2,4,5,9} => {1,1,2,2,3,4,4,4,5,5,8,9,9} 
{1,1,2,2,3,4,4,4,5,5,8,9,9} 
 


## 3 Quick sort

Quicksort uses `divide` and `conquer` to sort an array. 

Divide and conquer is a technique used for breaking algorithms down into subproblems, solving the subproblems, and then combining the results back together to solve the original problem. It can be helpful to think of this method as `divide, conquer, and combine.`

Here are the divide, conquer, and combine steps that quicksort uses:

**Divide**

1. Pick a pivot element, $A[q]$

2. Partition, or rearrange, the array into two subarrays: $A[p,…,q−1]$ such that all elements are less than $A[q]$, and $A[q+1,…,r]$ such that all elements are greater than or equal to $A[q]$.

**Conquer**: 

Sort the subarrays $A[p,…,q−1]$ and $A[q+1,…,r]$ recursively with `quicksort`.

**Combine**: 

No work is needed to combine the subarrays: because the partitioning is done in place, the whole array is sorted once both recursive calls return.


![quick-sort](./img/ds/quick-sort.jpg)
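The divide, conquer, and combine steps above can be sketched in place in a few lines. This is an illustration under assumptions: the names `partition` and `quick_sort_in_place` are hypothetical, and the pivot is simply the last element of the range, which is only one of several possible pivot choices.

```python
def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index q."""
    pivot = a[r]
    store = p  # a[p..store-1] holds the elements smaller than the pivot
    for i in range(p, r):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[r] = a[r], a[store]  # put the pivot between the two parts
    return store


def quick_sort_in_place(a, p=0, r=None):
    """Sort a[p..r] in place (the whole list by default)."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)             # divide
        quick_sort_in_place(a, p, q - 1)   # conquer the left part
        quick_sort_in_place(a, q + 1, r)   # conquer the right part
        # combine: nothing to do, a[p..r] is now sorted


data = [49, 38, 65, 97, 76, 13, 27, 49]
quick_sort_in_place(data)
print(data)  # [13, 27, 38, 49, 49, 65, 76, 97]
```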

**Complexity**

* The worst-case time complexity is $O(n^2)$. The average-case (typical) and best-case time complexity is $O(n\log n)$. 

* **In-place** sorting can be achieved without an auxiliary array; the only additional space needed is the recursion stack.

![quick-sort](./img/ds/quicksort-1.jpg)
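The gap between the worst and typical cases is easy to observe by counting comparisons. The sketch below is a simplified model, not the implementation used in this section: `quick_sort_count` is a hypothetical helper that always picks the first element as pivot and charges one comparison per non-pivot element in each partition step.

```python
import random


def quick_sort_count(values):
    """Return (sorted copy, comparisons), using the first element as pivot."""
    if len(values) <= 1:
        return list(values), 0
    pivot, rest = values[0], values[1:]
    less = [v for v in rest if v < pivot]
    more = [v for v in rest if v >= pivot]
    sorted_less, c_less = quick_sort_count(less)
    sorted_more, c_more = quick_sort_count(more)
    # One comparison against the pivot per non-pivot element.
    return sorted_less + [pivot] + sorted_more, len(rest) + c_less + c_more


n = 200
_, worst = quick_sort_count(list(range(n)))  # already sorted input
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)           # fixed seed for repeatability
_, typical = quick_sort_count(shuffled)
print(worst, n * (n - 1) // 2)  # both are 19900
print(typical)                  # far fewer comparisons on shuffled input
```

With this pivot rule, an already sorted list makes every partition maximally unbalanced, which is what produces the quadratic count.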

### 3.1  Quick Sort in Python

In [183]:
def quickSort(array):
    less = []
    pivotList = []
    more = []

    if len(array) <= 1:
        return array
    else:
        pivot = array[0]
        for i in array:
            if i < pivot:
                less.append(i)
            elif i > pivot:
                more.append(i)
            else:
                pivotList.append(i)
        less = quickSort(less)
        more = quickSort(more)
        return less + pivotList + more

In [184]:
array=[49, 38,65, 97,76, 13, 27,49]
sortedarray=quickSort(array)
print(sortedarray)

[13, 27, 38, 49, 49, 65, 76, 97]


### 3.2 Quick Sort in CPP


```cpp
// Sort the given array in [left, right]
void quickSort(int a[], int left, int right) {
   if ((right - left) >= 1) {   // more than 1 element, need to sort
      choosePivot(a, left, right);
      int pivotIndex = partition(a, left, right);
      quickSort(a, left, pivotIndex -  1);
      quickSort(a, pivotIndex + 1, right);
   }
}
```

In [185]:
%%file ./demo/src/QuickSort.cpp
/* 
 Sorting an array using Quick Sort (QuickSort.cpp) 

*/
#include <iostream>
using namespace std;
 
void quickSort(int a[], int size);
void quickSort(int a[], int left, int right);
void choosePivot(int a[], int left, int right);
int partition(int a[], int left, int right);
void print(const int a[], int left, int right);
 
// Sort the given array of size
void quickSort(int a[], int size) {
   quickSort(a, 0, size - 1);
}
 
// Sort the given array in [left, right]
void quickSort(int a[], int left, int right) {
   if ((right - left) >= 1) {   // more than 1 element, need to sort
      choosePivot(a, left, right);
      int pivotIndex = partition(a, left, right);
      quickSort(a, left, pivotIndex -  1);
      quickSort(a, pivotIndex + 1, right);
   }
}
 
// Choose a pivot element and swap with the right
void choosePivot(int a[], int left, int right) {
   int pivotIndex = (right + left) / 2;
   int temp;
   temp = a[pivotIndex];
   a[pivotIndex] = a[right];
   a[right] = temp;
}
 
// Partition the array [left, right] with pivot initially on the right.
// Return the index of the pivot after partition. All elements to the
// left of the pivot are smaller; those to the right are larger or equal.
int partition(int a[], int left, int right) {
   int pivot = a[right];
   int temp;  // for swapping
   int storeIndex = left;
      // Start the storeIndex from left, swap elements smaller than
      //  pivot into storeIndex and increase the storeIndex.
      // At the end of the pass, all elements up to storeIndex are
      //  smaller than pivot.
   for (int i = left; i < right; ++i) {  // exclude pivot
      if (a[i] < pivot) {
         // for tracing
         print(a, left, right);
 
         if (i != storeIndex) {
            temp = a[i];
            a[i] = a[storeIndex];
            a[storeIndex] = temp;
         }
         ++storeIndex;
 
         // for tracing
         cout << "=> ";
         print(a, left, right);
         cout << endl;
      }
   }
   // Swap pivot and storeIndex
   a[right] = a[storeIndex];
   a[storeIndex] = pivot;
 
   // for tracing
   print(a, left, storeIndex - 1);
   cout << "{" << a[storeIndex] << "} ";
   print(a, storeIndex + 1, right);
   cout << endl;
 
   return storeIndex;
}
 
// Print the contents of the given array from left to right (inclusive)
void print(const int a[], int left, int right) {
   cout << "{";
   for (int i = left; i <= right; ++i) {
      cout << a[i];
      if (i < right) cout << ",";
   }
   cout << "} ";
}

int main() {
   const int SIZE = 8;
   int a[SIZE] = {49, 38,65, 97,76, 13, 27,49};
 
   print(a, 0, SIZE - 1);
   cout << endl;
   cout <<"Sorting ..."<< endl;
   quickSort(a, SIZE);
   print(a, 0, SIZE - 1);
   cout << endl << endl;
 
}  
 

Overwriting ./demo/src/QuickSort.cpp


In [186]:
!g++ -o ./demo/bin/QuickSort ./demo/src/QuickSort.cpp

In [187]:
!.\demo\bin\QuickSort

{49,38,65,97,76,13,27,49} 
Sorting ...
{49,38,65,49,76,13,27,97} => {49,38,65,49,76,13,27,97} 
{49,38,65,49,76,13,27,97} => {49,38,65,49,76,13,27,97} 
{49,38,65,49,76,13,27,97} => {49,38,65,49,76,13,27,97} 
{49,38,65,49,76,13,27,97} => {49,38,65,49,76,13,27,97} 
{49,38,65,49,76,13,27,97} => {49,38,65,49,76,13,27,97} 
{49,38,65,49,76,13,27,97} => {49,38,65,49,76,13,27,97} 
{49,38,65,49,76,13,27,97} => {49,38,65,49,76,13,27,97} 
{49,38,65,49,76,13,27} {97} {} 
{49,38,65,27,76,13,49} => {38,49,65,27,76,13,49} 
{38,49,65,27,76,13,49} => {38,27,65,49,76,13,49} 
{38,27,65,49,76,13,49} => {38,27,13,49,76,65,49} 
{38,27,13} {49} {76,65,49} 
{38,13,27} => {13,38,27} 
{13} {27} {38} 
{76,49,65} => {49,76,65} 
{49} {65} {76} 
{13,27,38,49,49,65,76,97} 



## 4 Bubble Sort 

In brief, we pass through the list, compare `two adjacent` items, and swap them if they are in the wrong order.

Repeat the pass until no swaps are needed. 

1. Compare $A[0]$ and $A[1]$. If $A[0]$ is bigger than $A[1]$, swap the elements.

2. Move to the next element, $A[1]$ (which might now contain the result of a swap from the previous step), and compare it with $A[2]$. If $A[1]$ is bigger than $A[2]$, swap the elements. Do this for every pair of elements until the end of the list.

3. Do steps $1$ and $2$ $n$ times.

**Here is the pseudocode**

> Pseudocode can be typeset in LaTeX with the algorithm2e package: https://mirrors.cqu.edu.cn/CTAN/macros/latex/contrib/algorithm2e/doc/algorithm2e.pdf
```cpp
for i <- a.length() to 1 do
    for j <- 1 to i-1 do
        if a[j]>a[j+1] then
            swap a[j] <-> a[j+1]
        end if
    end for
end for  
```
Bubble sort has a nested loop. The body of the inner loop executes $\frac{1}{2}n^2-\frac{1}{2}n$ times for a list of size $n$.

Bubble sort is therefore not efficient: its complexity is $O(n^2)$.

![bubblesort](./img/ds/bubblesort.png)




### 4.1 Bubble sort in Python

In [188]:
def bubble_sort(array):
    index = len(array) - 1
    while index >= 0:
        for j in range(index):
            if array[j] > array[j+1]:
                array[j], array[j+1] = array[j+1], array[j]
        index -= 1
    return array

In [189]:
array=[56, 24, 93, 17,77, 31, 44,55,20]
sortedarray=bubble_sort(array)
print(sortedarray)

[17, 20, 24, 31, 44, 55, 56, 77, 93]


### 4.2 Improved bubble sort in Cpp

We can make a minor adjustment to the bubble sort to improve its **best-case** performance to linear.

If **no swaps** occur during a pass through the main loop, then the list is **sorted**. 

This can happen on any pass, and in the **best case** will happen on the **first** pass.

* The **best case** complexity is $O(n)$

We can track the presence of swapping with a Boolean flag and return from the function as soon as a pass completes without any swap.

* Bubble sort won't perform any swaps if the list is already sorted. However, bubble sort's worst-case behavior for exchanges is still quadratic.
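As a minimal sketch of the flag technique in Python (the name `bubble_sort_early_exit` and the returned pass count are additions for illustration only):

```python
def bubble_sort_early_exit(values):
    """Sort values in place; return the number of passes performed."""
    passes = 0
    end = len(values) - 1  # last index that may still be out of place
    swapped = True
    while swapped and end > 0:
        swapped = False
        passes += 1
        for j in range(end):
            if values[j] > values[j + 1]:
                values[j], values[j + 1] = values[j + 1], values[j]
                swapped = True  # a swap happened, so another pass is needed
        end -= 1  # the largest remaining item has bubbled to position end
    return passes


print(bubble_sort_early_exit([1, 2, 3, 4, 5]))  # 1 (a single pass, no swaps)

data = [56, 24, 93, 17, 77, 31, 44, 55, 20]
bubble_sort_early_exit(data)
print(data)  # [17, 20, 24, 31, 44, 55, 56, 77, 93]
```

On an already sorted list the single pass makes $n-1$ comparisons and no swaps, which is the linear best case.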


In [190]:
%%file ./demo/src/BubbleSort.cpp
/* Sorting an array using Bubble Sort (BubbleSort.cpp) */
#include <iostream>
using namespace std;
 
void bubbleSort(int a[], int size);
void print(const int a[], int size);
 

// Sort the given array of size
void bubbleSort(int a[], int size) {
   bool done = false; // terminate if no more swap thru a pass
   int pass = 0;      // pass number, for tracing
   int temp;          // use for swapping
 
   while (!done) {
      cout << "PASS " << ++pass << " ..." << endl;   // for tracing
      done = true;
      // Pass thru the list, compare adjacent items and swap
      // them if they are in wrong order
      for (int i = 0; i < size - 1; ++i) {
         if (a[i] > a[i+1]) {
            print(a, size); // for tracing
            temp = a[i];
            a[i] = a[i+1];
            a[i+1] = temp;
            done = false;   // swap detected, one more pass
            cout << "=> ";  // for tracing
            print(a, size);
            cout << endl;
         }
      }
   }
}
 
// Print the contents of the given array of size
void print(const int a[], int size) {
   cout << "{";
   for (int i = 0; i < size; ++i) {
      cout << a[i];
      if (i < size - 1) cout << ",";
   }
   cout << "} ";
}

int main() {
   const int SIZE = 9;
   int a[] = {56, 24, 93, 17,77, 31, 44,55,20};
 
   print(a, SIZE);
   cout << endl;
   bubbleSort(a, SIZE);
   print(a, SIZE);
   cout << endl;
}


Overwriting ./demo/src/BubbleSort.cpp


In [191]:
!g++ -o ./demo/bin/BubbleSort ./demo/src/BubbleSort.cpp

In [192]:
!.\demo\bin\BubbleSort 

{56,24,93,17,77,31,44,55,20} 
PASS 1 ...
{56,24,93,17,77,31,44,55,20} => {24,56,93,17,77,31,44,55,20} 
{24,56,93,17,77,31,44,55,20} => {24,56,17,93,77,31,44,55,20} 
{24,56,17,93,77,31,44,55,20} => {24,56,17,77,93,31,44,55,20} 
{24,56,17,77,93,31,44,55,20} => {24,56,17,77,31,93,44,55,20} 
{24,56,17,77,31,93,44,55,20} => {24,56,17,77,31,44,93,55,20} 
{24,56,17,77,31,44,93,55,20} => {24,56,17,77,31,44,55,93,20} 
{24,56,17,77,31,44,55,93,20} => {24,56,17,77,31,44,55,20,93} 
PASS 2 ...
{24,56,17,77,31,44,55,20,93} => {24,17,56,77,31,44,55,20,93} 
{24,17,56,77,31,44,55,20,93} => {24,17,56,31,77,44,55,20,93} 
{24,17,56,31,77,44,55,20,93} => {24,17,56,31,44,77,55,20,93} 
{24,17,56,31,44,77,55,20,93} => {24,17,56,31,44,55,77,20,93} 
{24,17,56,31,44,55,77,20,93} => {24,17,56,31,44,55,20,77,93} 
PASS 3 ...
{24,17,56,31,44,55,20,77,93} => {17,24,56,31,44,55,20,77,93} 
{17,24,56,31,44,55,20,77,93} => {17,24,31,56,44,55,20,77,93} 
{17,24,31,56,44,55,20,77,93} => {17,24,31,44,56,55,20,77,93} 
{17,24,

##  5 Insertion Sort

In brief, pass through the list. For each element, compare it with the previous elements and `insert` it at the correct `position` by shifting the other elements. 

Insertion sort has a worst-case complexity of $O(n^2)$.


Here is some **pseudocode**

```cpp
for i <- 1 to length(A) - 1 do
    key <- A[i]
    j <- i - 1
    while j >= 0 and  A[j] > key do
          A[j+1] <- A[j]
          j <- j - 1
    end while
    A[j+1] <- key
end for 
```

The outer loop executes $n-1$ times. In the **worst** case, when `all the data are out of order`, the inner loop iterates once on the first pass through the outer loop, twice on the second pass, and so on, for a total of

$1+2+...+(n-1)=\frac{1}{2}n^2-\frac{1}{2}n$

Thus, the worst-case behavior of insertion sort is $O(n^2)$, so insertion sort is not efficient in general.

However, the more items in the list that are already in **order**, the **better** insertion sort performs.

* In the **best case** of a **sorted** list, the sort's behavior is linear, $O(n)$.

In the **average** case, however, insertion sort is still quadratic, $O(n^2)$.

![insertionsort](./img/ds/insertsort.jpg)
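The difference between the best and worst cases shows up clearly when the element shifts are counted. The following sketch uses a hypothetical helper, `insertion_sort_shifts`, which follows the pseudocode above and returns the number of shifts performed by the inner loop.

```python
def insertion_sort_shifts(values):
    """Sort values in place; return how many element shifts were needed."""
    shifts = 0
    for i in range(1, len(values)):
        key = values[i]
        j = i - 1
        # Shift larger elements one place to the right.
        while j >= 0 and values[j] > key:
            values[j + 1] = values[j]
            shifts += 1
            j -= 1
        values[j + 1] = key
    return shifts


n = 8
print(insertion_sort_shifts(list(range(n))))         # 0  (already sorted)
print(insertion_sort_shifts(list(range(n, 0, -1))))  # 28 (reversed: n*(n-1)/2)
```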

### 5.1 Insertion sort in Python

In [15]:
def insertion_sort(array):
    for i in range(1, len(array)):
        itemToInsert = array[i]
        j = i - 1
        while j >= 0 and array[j]>itemToInsert:
            array[j + 1] = array[j]
            j -= 1
        array[j + 1] = itemToInsert
    return  array

In [16]:
array=[4, 3, 2, 10, 12, 1, 5, 6]
sortedarray=insertion_sort(array)
print(sortedarray)

[1, 2, 3, 4, 5, 6, 10, 12]


### 5.2 Insertion sort in CPP

In [195]:
%%file ./demo/src/InsertionSort.cpp
/* 
  Sorting an array using Insertion Sort (InsertionSort.cpp) 
*/
#include <iostream>
using namespace std;

void insertionSort(int a[], int size);
void print(const int a[], int iMin, int iMax);

// Sort the given array of size using insertion sort
void insertionSort(int a[], int size)
{
   int temp; // for shifting elements
   for (int i = 1; i < size; ++i)
   {
      // for tracing
      print(a, 0, i - 1);    // already sorted
      print(a, i, size - 1); // to be sorted
      cout << endl;

      // For element at i, insert into proper position in [0, i-1]
      //  which is already sorted.
      // Shift down the other elements
      for (int prev = 0; prev < i; ++prev)
      {
         if (a[i] < a[prev])
         {
            // insert a[i] at prev, shift the elements down
            temp = a[i];
            for (int shift = i; shift > prev; --shift)
            {
               a[shift] = a[shift - 1];
            }
            a[prev] = temp;
            break;
         }
      }
   }
}

// Print the contents of the array in [iMin, iMax]
void print(const int a[], int iMin, int iMax)
{
   cout << "{";
   for (int i = iMin; i <= iMax; ++i)
   {
      cout << a[i];
      if (i < iMax)
         cout << ",";
   }
   cout << "} ";
}

int main()
{
   const int SIZE = 8;
   int a[SIZE] = {4, 3, 2, 10, 12, 1, 5, 6};

   print(a, 0, SIZE - 1);
   cout << endl;
   insertionSort(a, SIZE);
   print(a, 0, SIZE - 1);
   cout << endl;
}


Overwriting ./demo/src/InsertionSort.cpp


In [196]:
!g++ -o ./demo/bin/InsertionSort ./demo/src/InsertionSort.cpp

In [197]:
!.\demo\bin\InsertionSort

{4,3,2,10,12,1,5,6} 
{4} {3,2,10,12,1,5,6} 
{3,4} {2,10,12,1,5,6} 
{2,3,4} {10,12,1,5,6} 
{2,3,4,10} {12,1,5,6} 
{2,3,4,10,12} {1,5,6} 
{1,2,3,4,10,12} {5,6} 
{1,2,3,4,5,10,12} {6} 
{1,2,3,4,5,6,10,12} 


## 6 Sorting in Python

### 6.1  The sorting algorithm in Python

The sorting algorithm used in most Python implementations is called 

* <b>Timsort</b>: https://en.wikipedia.org/wiki/Timsort

The **key idea** is to take **advantage** of the fact that in a lot of data sets the data is <b>already partially sorted</b>. 

**Timsort**’s worst-case performance is the same as **merge** sort’s, but on average it performs considerably **better.**

The standard implementation of sorting in most Python implementations runs in roughly $O(n*log(n))$ time, where $n$ is the length of the list.
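The benefit of existing order can be observed by counting comparisons. This is a sketch under assumptions: `Counted` and `comparisons_used` are hypothetical helpers, and the code relies on `sorted` comparing elements only with `<`, as CPython does.

```python
import random


class Counted:
    """Wrap a value and count how many times < is evaluated."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.calls += 1
        return self.value < other.value


def comparisons_used(values):
    """Return the number of < comparisons sorted() makes on values."""
    Counted.calls = 0
    sorted(Counted(v) for v in values)
    return Counted.calls


n = 1000
ordered = list(range(n))
shuffled = ordered[:]
random.Random(0).shuffle(shuffled)  # fixed seed for repeatability

print(comparisons_used(ordered))   # close to n: the input is one long run
print(comparisons_used(shuffled))  # on the order of n*log2(n)
```

Exact counts depend on the Python version, so none are quoted here.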


In most cases, the right thing to do is to use one of Python's built-in sorting facilities:


* 1  method **list.sort** : sorts the `list` it is called on in ascending order, **modifying** that list in place,
    
    
* 2 function **sorted** : takes an `iterable` object (e.g., a list or a dictionary) as its first argument and returns a **new** sorted list.

### 6.2  Stable Sort

Both the **list.sort** method and the sorted function provide <b>stable sorts</b>. 

A sorting algorithm is stable if it preserves the **original** order of elements with equal key values (where the key is the value the algorithm sorts by).

![stable](./img/ds/stablesort.png)
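Stability is easiest to see with records that share a key. In the sketch below (the `records` data is made up for illustration), the list is sorted by the count only, and records with equal counts keep their original relative order.

```python
records = [("pear", 3), ("apple", 1), ("fig", 3), ("kiwi", 1), ("plum", 2)]

# Sort by the count only; ties are NOT broken by name.
by_count = sorted(records, key=lambda record: record[1])
print(by_count)
# [('apple', 1), ('kiwi', 1), ('plum', 2), ('pear', 3), ('fig', 3)]
```

`('pear', 3)` still precedes `('fig', 3)` because it came first in the input, even though `'fig'` would sort before `'pear'` alphabetically.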

In [198]:
# sorted list: new list
L = [3,5,2]
print('sorted L(a new list)=',sorted(L))
print('L=',L)



sorted L(a new list)= [2, 3, 5]
L= [3, 5, 2]


In [199]:
# sorted dict
# dict:iterable
D = {'a':12, 'c':5, 'b':'dog'}
print('sorted D(a new list)=',sorted(D))


sorted D(a new list)= ['a', 'b', 'c']


In [200]:
D.sort()

AttributeError: 'dict' object has no attribute 'sort'

* When the sorted function is applied to a dictionary, it returns a `sorted list of the keys` of the dictionary. 

* In contrast, when the sort method is applied to a dictionary, it causes an exception to be raised, since there is no method dict.sort.

In [48]:
# list.sort in place
L.sort()
print('L(modified L)=',L)

L(modified L)= [2, 3, 5]


Both the **list.sort** method and the **sorted** function can have two additional parameters.

* The <b>key</b> parameter plays a role similar to compare in our implementation of merge sort, but instead of a two-argument comparison function it takes <b>a one-argument function</b> that maps each element to the value used for comparison.


* The <b>reverse</b> parameter specifies whether the list is to be sorted in <b>ascending or descending order</b>.



In [17]:
L = [[1,2,3], (3,2,1,0), 'abc']
print(sorted(L, key = len, reverse = True))

[(3, 2, 1, 0), [1, 2, 3], 'abc']


This sorts the elements of L in `reverse` order of `length` and prints the result shown above.
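A two-argument comparison function, like the `compare` parameter of our merge sort, can still be used by wrapping it with the standard-library `functools.cmp_to_key`. In this sketch the `words` list and `compare_by_length` are made up for illustration.

```python
from functools import cmp_to_key

words = ["pear", "fig", "banana", "kiwi"]

# key: a one-argument function applied to each element.
print(sorted(words, key=len))
# ['fig', 'pear', 'kiwi', 'banana']


def compare_by_length(x, y):
    """Two-argument comparison: negative, zero, or positive result."""
    return len(x) - len(y)


# cmp_to_key adapts the two-argument function to the key protocol.
print(sorted(words, key=cmp_to_key(compare_by_length), reverse=True))
# ['banana', 'pear', 'kiwi', 'fig']
```

In both results `'pear'` stays ahead of `'kiwi'`: the sort is stable, and `reverse=True` preserves the original order of equal elements.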

## 7 Performance Criteria

There are several criteria to be used in evaluating a sorting algorithm:

* **Running time**. Typically, an elementary sorting algorithm requires $O(N^2)$ steps to sort $N$ randomly arranged items. More sophisticated sorting algorithms require $O(N\log N)$ steps on average. Algorithms differ in the constant that appears in front of the $N^2$ or $N\log N$. Furthermore, some sorting algorithms are more sensitive to the nature of the input than others. Quicksort, for example, requires $O(N\log N)$ time in the average case, but requires $O(N^2)$ time in the worst case. 

* **Memory requirements**. The amount of extra memory required by a sorting algorithm is also an important consideration. In-place sorting algorithms are the most memory efficient, since they require practically no additional memory. Linked list representations require an additional $N$ words of memory for a list of pointers. Still other algorithms require sufficient memory for another copy of the input array. These are the least efficient in terms of memory usage. 

* **Stability**. This is the ability of a sorting algorithm to preserve the relative order of equal keys in a file.




## 8 Choosing a Sorting Algorithm


To choose a sorting algorithm for a particular problem, consider the running time, space complexity, and the expected format of the input list.

![choosesort](./img/ds/choosesort.jpg)


* Insertion sort requires linear time for almost `sorted` files. Otherwise, insertion sort should be limited to small files.

* Quicksort is the method to use for very large sorting problems. However, quicksort performs badly if the file is already sorted and the pivot is chosen naively (for example, always the first or last element). Another possible disadvantage is that quicksort is not stable, i.e., it does not preserve the relative order of equal keys. 

* Bubble sort, which is included in the table for comparison purposes only, is generally best avoided. 

Insertion sort, bubble sort, and quicksort are in-place methods. Quicksort requires a small amount of additional memory for the auxiliary stack.

* Merge sort is an $O(N\log N)$ algorithm in the average and worst cases.

Merge sort is an out-of-place method. It requires an amount of additional memory proportional to the size of the input.

## Further Reading

* Yan Weimin, Li Dongmei, Wu Weimin. Data Structures (C Language Edition) [数据结构（C语言版）], 2nd ed., Posts & Telecom Press, February 2015.


* Mark Allen Weiss. Data Structures and Algorithm Analysis in C


