# Sorting 

#### Different sorting techniques
* Comparison-based sorts:
    * $O(n^2)$
        1. bubble sort
        2. insertion sort
        3. selection sort
    * $O(n*log(n))$
        4. heap sort
        5. merge sort
        6. quick sort (average case; the worst case is $O(n^2)$)
        7. tree sort
    * $O(n^{3/2})$
        8. shell sort
* Index-based sorts:
    * $O(n)$
        9. count sort
        10. bucket/bin sort
        11. radix sort


#### Criteria for analysis 
* Number of comparisons (decides the time complexity)
* Number of swaps 
* Adaptive: if the input is already sorted, the algorithm should take less time 
* Stable: the algorithm preserves the relative order of duplicate elements in the list 
    * if data with multiple columns is sorted by one column and then by another, a stable sort keeps the order from the first sort among rows whose values tie in the second 
* Extra memory (whether extra space is required) 
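
The multi-column case can be sketched with the standard library. This is only an illustration: the `Row` struct and the function name `sortByGroupKeepingNames` are made up for this example, and it uses `std::stable_sort` rather than any of the algorithms in these notes.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical two-column record used only for this illustration.
struct Row {
    std::string name;
    int group;
};

// Sorts by name first, then stably by group. Because the second sort is
// stable, rows with the same group keep the name order from the first sort.
std::vector<Row> sortByGroupKeepingNames(std::vector<Row> rows) {
    std::sort(rows.begin(), rows.end(),
              [](const Row& a, const Row& b) { return a.name < b.name; });
    std::stable_sort(rows.begin(), rows.end(),
                     [](const Row& a, const Row& b) { return a.group < b.group; });
    return rows;
}
```

With an unstable sort in the second step, the rows of each group could come out in any name order.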



#### 1. Bubble sort

* adjacent elements are compared 
* if they are out of order, their values are swapped 
* then the next pair of consecutive elements is compared and swapped if needed 
* **When every adjacent pair in the unsorted part has been compared once, it is called a pass**
    * the 1st pass moves the largest element to its final position 
* **for n elements, the first pass performs (n-1) comparisons; each later pass needs one fewer**
* **max swaps or max comparisons: $\frac{(n*(n-1))}{2}$**
* the heavier elements sink while the lighter elements rise like bubbles, that's why it is called bubble sort
* after k passes, the k largest elements are in their sorted positions
* for example, the 3 greatest numbers can be obtained with 3 passes
* we perform n-1 passes to sort the entire array: once n-1 elements are in place, the remaining one must be too (for 5 elements, 4 passes are enough) 
* we use two nested loops to compare consecutive neighboring elements 
* the second loop is 
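* the index i of the outer loop counts the passes
* the index j of the inner loop drives the comparison and swapping 
* with every pass the number of comparisons decreases by one 
* if the list is already sorted, no swapping is done 
* minimum time taken by bubble sort is O(n) when the list is already sorted, provided the algorithm stops after a pass with no swaps; the code below has no such check and always runs all n-1 passes 
* **maximum time is O(n^2)**
* with that early exit, bubble sort is **adaptive**
* bubble sort is stable: it maintains the order of duplicate elements, because equal neighbors are never swapped 
* bubble sort works with and without the `- i` in the inner loop bound; subtracting i only skips the elements that earlier passes already put in place 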

```cpp
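#include <utility>
#include <vector>
using namespace std;

// Bubble sort: after pass i, the i+1 largest elements are in their final positions.
void bubble(vector<int> &v){
    int n = v.size(); // signed copy, so n-1 cannot wrap around for an empty vector
    for(int i = 0; i < n-1; i++){           // one pass per iteration
        for(int j = 0; j < n-1-i; j++){     // skip the i elements already in place
            if(v[j] > v[j+1]){
                swap(v[j], v[j+1]);
            }
        }
    }
}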
```

```
compared 3 5: 3 5 1 6 3 4 6 5 9 
compared 5 1: 3 1 5 6 3 4 6 5 9 
compared 5 6: 3 1 5 6 3 4 6 5 9 
compared 6 3: 3 1 5 3 6 4 6 5 9 
compared 6 4: 3 1 5 3 4 6 6 5 9 
compared 6 6: 3 1 5 3 4 6 6 5 9 
compared 6 5: 3 1 5 3 4 6 5 6 9 
compared 6 9: 3 1 5 3 4 6 5 6 9 
0th pass complete, total comparisons: 8 total swaps: 4
compared 3 1: 1 3 5 3 4 6 5 6 9 
compared 3 5: 1 3 5 3 4 6 5 6 9 
compared 5 3: 1 3 3 5 4 6 5 6 9 
compared 5 4: 1 3 3 4 5 6 5 6 9 
compared 5 6: 1 3 3 4 5 6 5 6 9 
compared 6 5: 1 3 3 4 5 5 6 6 9 
compared 6 6: 1 3 3 4 5 5 6 6 9 
1th pass complete, total comparisons: 7 total swaps: 4
compared 1 3: 1 3 3 4 5 5 6 6 9 
compared 3 3: 1 3 3 4 5 5 6 6 9 
compared 3 4: 1 3 3 4 5 5 6 6 9 
compared 4 5: 1 3 3 4 5 5 6 6 9 
compared 5 5: 1 3 3 4 5 5 6 6 9 
compared 5 6: 1 3 3 4 5 5 6 6 9 
2th pass complete, total comparisons: 6 total swaps: 0
compared 1 3: 1 3 3 4 5 5 6 6 9 
compared 3 3: 1 3 3 4 5 5 6 6 9 
compared 3 4: 1 3 3 4 5 5 6 6 9 
compared 4 5: 1 3 3 4 5 5 6 6 9 
compared 5 5: 1 3 3 4 5 5 6 6 9 
3th pass complete, total comparisons: 5 total swaps: 0
compared 1 3: 1 3 3 4 5 5 6 6 9 
compared 3 3: 1 3 3 4 5 5 6 6 9 
compared 3 4: 1 3 3 4 5 5 6 6 9 
compared 4 5: 1 3 3 4 5 5 6 6 9 
4th pass complete, total comparisons: 4 total swaps: 0
compared 1 3: 1 3 3 4 5 5 6 6 9 
compared 3 3: 1 3 3 4 5 5 6 6 9 
compared 3 4: 1 3 3 4 5 5 6 6 9 
5th pass complete, total comparisons: 3 total swaps: 0
compared 1 3: 1 3 3 4 5 5 6 6 9 
compared 3 3: 1 3 3 4 5 5 6 6 9 
6th pass complete, total comparisons: 2 total swaps: 0
compared 1 3: 1 3 3 4 5 5 6 6 9 
7th pass complete, total comparisons: 1 total swaps: 0 

```
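
The trace above keeps running passes after pass 2 even though nothing is swapped any more. A minimal sketch of the early-exit variant that makes bubble sort adaptive is given here; the name `bubbleAdaptive`, the `swapped` flag and the returned pass count are assumptions added for illustration.

```cpp
#include <utility>
#include <vector>

// Bubble sort that stops as soon as a pass performs no swap.
// Returns the number of passes that were actually run.
int bubbleAdaptive(std::vector<int>& v) {
    int n = static_cast<int>(v.size());
    int passes = 0;
    for (int i = 0; i < n - 1; i++) {
        bool swapped = false;               // did this pass change anything?
        for (int j = 0; j < n - 1 - i; j++) {
            if (v[j] > v[j + 1]) {
                std::swap(v[j], v[j + 1]);
                swapped = true;
            }
        }
        passes++;
        if (!swapped) break;                // no swap means the list is sorted
    }
    return passes;
}
```

On an already sorted list this runs a single pass of n-1 comparisons, which is where the O(n) best case comes from. On the list used in the trace it stops after the third pass.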

#### 2. Insertion sort 

* we take an array 
* We assume that the first element is already sorted
* The remaining elements form the unsorted part of the array
* we take an element out of the unsorted part and insert it into the sorted side 
* We compare and shift elements in the sorted side to make space for the new element 
* continue until all the elements are sorted 
* **Each time an element is transferred, it is called a pass**
* with each pass, the maximum number of comparisons and shifts increases by one 
* **for n elements, (n-1) passes are required**
* **for n elements there are at most 1+2+3+...+(n-1) = $\frac{(n*(n-1))}{2}$ comparisons and shifts, i.e. O(n^2)**
* Intermediate results are not useful in insertion sort: after k passes the first k+1 elements are sorted among themselves, but they are not necessarily the smallest elements of the list 
* In arrays, we have to shift elements; in a linked list, shifting is not required because a node can be relinked in place. 
* Insertion sort is therefore well suited to linked lists 
* Implementation:
    * the first element is already sorted 
    * so insertion starts at i=1 
    * x holds the element at index i, which is the first element of the unsorted part as i moves 
    * j starts at the last element of the sorted part and goes backwards in the while loop that follows 
    * while j has not gone out of bounds at the front and the jth element is greater than x
        * copy the jth element one position to the right, into index j+1 
        * decrement j to go backward 
    * j ends up at the last element that is not greater than x (or at -1), and we put x right after it, at index j+1 
* **if the list is already sorted, the time taken by insertion sort is O(n); it is adaptive**


```cpp
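#include <vector>
using namespace std;

void insertion(vector<int> & v){
    // i starts at 1 and runs for n-1 passes
    int n = v.size();
    int j, x;
    for(int i = 1; i < n; i++){
        j = i-1;   // points to the last element of the sorted part
        x = v[i];  // x is the first element of the unsorted part
        while(j > -1 && v[j] > x){
            v[j+1] = v[j]; // shift the larger element one place to the right
            j--;
        }
        v[j+1] = x; // insert x right after the last element that is not greater
    }
}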
```

```
3 5 1 6 3 4 6 5 9 
x is: 5 j points to 3: 3  | 5 1 6 3 4 6 5 9 
x is: 1 j points to 5: 1 3  | 5 6 3 4 6 5 9 
x is: 6 j points to 5: 1 3 5  | 6 3 4 6 5 9 
x is: 3 j points to 6: 1 3 3 5  | 6 4 6 5 9 
x is: 4 j points to 6: 1 3 3 4 5  | 6 6 5 9 
x is: 6 j points to 6: 1 3 3 4 5  | 6 6 5 9 
x is: 5 j points to 6: 1 3 3 4 5 5  | 6 6 9 
x is: 9 j points to 6: 1 3 3 4 5 5 6 6  | 9 
```

#### 3. Selection sort 

* In each pass, one element is placed in its sorted position 
* for n elements, $n-1$ passes are required 
* In each pass we select a position and then find the element that belongs in that position 
* i marks the position being filled; j and k both start at that same position 
* we move j over the rest of the list, comparing each item to the one at k; if an item smaller than the one at k is found, k is moved to that item 
* When the scan ends, k is at the smallest remaining element, and the elements at positions i and k are swapped 
* after the first pass, the smallest element is the first element 
* implementation 
    * there are two loop indices i and j, and a variable k 
    * i starts at the first element 
    * j and k start at the ith element 
    * j iterates over all the elements from i onwards and looks for the smallest element
    * whenever j finds an element smaller than the kth element, the index j is assigned to k 
    * the ith and kth elements are swapped
    * the ith element is now the smallest of the remaining elements
    * i moves to the next element 
* **number of comparisons is $\frac{n*(n-1)}{2}$**
* **number of swaps: $n-1$**
* **this algorithm sorts the elements with a minimal number of swaps: at most one per pass**
* k passes give the k smallest elements in sorted order 
* while bubble sort gives the k largest elements 
* intermediate results are useful
* **adaptive means that an already sorted list takes minimum time; selection sort is not adaptive, since it always performs all the comparisons**
* **it is not stable, i.e. the original order of duplicate elements can be lost**
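
The loss of stability can be seen on a tiny input. The sketch below sorts (key, label) pairs by key only; the `Item` alias and the name `selectionByKey` are assumptions made for this illustration.

```cpp
#include <utility>
#include <vector>

using Item = std::pair<int, char>; // (key, label); the label records the original order

// Selection sort that compares keys only, so labels show what happens to duplicates.
void selectionByKey(std::vector<Item>& v) {
    int n = static_cast<int>(v.size());
    for (int i = 0; i < n - 1; i++) {
        int k = i;                                    // index of the smallest key so far
        for (int j = i + 1; j < n; j++) {
            if (v[j].first < v[k].first) k = j;
        }
        std::swap(v[i], v[k]);                        // long-distance swap can jump over equal keys
    }
}
```

For the input (2,a), (2,b), (1,c) the first pass swaps (2,a) with (1,c), giving (1,c), (2,b), (2,a): the two items with key 2 have changed their relative order.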


```cpp
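#include <utility>
#include <vector>
using namespace std;

void selection(vector<int> &v){
    int n = v.size();
    int i, j, k = 0;
    // the last iteration (i = n-1) has nothing left to compare, so n-1 passes do the real work
    for(i = 0; i < n; i++){
        k = i; // index of the smallest element seen so far in this pass
        for(j = i+1; j < n; j++){
            if(v[j] < v[k]){
                k = j;
            }
        }
        swap(v[k], v[i]); // move the smallest remaining element to position i
    }
}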
```

```
3 5 1 6 3 4 6 5 9 
smallest item 1 is at index: 2 -> 3 5 1 6 3 4 6 5 9 
smallest item 3 is at index: 2 -> 1 5 3 6 3 4 6 5 9 
smallest item 3 is at index: 4 -> 1 3 5 6 3 4 6 5 9 
smallest item 4 is at index: 5 -> 1 3 3 6 5 4 6 5 9 
smallest item 5 is at index: 4 -> 1 3 3 4 5 6 6 5 9 
smallest item 5 is at index: 7 -> 1 3 3 4 5 6 6 5 9 
smallest item 6 is at index: 6 -> 1 3 3 4 5 5 6 6 9 
smallest item 6 is at index: 7 -> 1 3 3 4 5 5 6 6 9 
smallest item 9 is at index: 8 -> 1 3 3 4 5 5 6 6 9
```

#### 4. Quick sort 

* An element is in its sorted position if all the elements before it are smaller than or equal to it and all the elements after it are greater than or equal to it
* choose a pivot, and two iterators i and j going forward and backward respectively
* i stops at an element greater than the pivot and j stops at an element smaller than or equal to the pivot; swap them and continue
* if i and j cross each other, it means we have walked through the whole list
* move the pivot element to the jth index 
* the list is now partitioned at the pivot; the procedure from selecting the pivot to rearranging the list around it is called partitioning 
* quick sort is recursively applied to the left of the pivot and to the right of the pivot 
* **In the worst case there are $\frac{n*(n-1)}{2}$ comparisons** 
* **for an already sorted list, with the first or last element as pivot, time taken is O(n^2)**
* **for a list sorted in descending order, time taken is also O(n^2), the worst case**
* for best case:
    * assume the partitioning puts the pivot in the middle 
    * equal numbers of elements are passed to the recursive calls on the left and right 
    * if the same pattern repeats, the recursion forms a tree with $log(n)$ levels 
    * each level performs about n comparisons, so total time is $O(log(n)*n)$
* worst case is when the pivot always ends up at one of the extreme ends 
* average case is $O(log(n)*n)$
* steps:
    * choose a pivot 
    * rearrange the list so that values greater than the pivot are on its right and values smaller than the pivot are on its left 
    * quick sort each of those two parts: pick a pivot and rearrange again
    * Recursive steps:
        * bring the pivot to its appropriate position such that left of pivot is smaller and right is greater
        * quick sort the left part 
        * quick sort the right part 
        
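* With balanced partitions, each of the n elements takes part in about log(n) levels of partitioning, so the sort takes $n*log(n)$ time overall
* if the pivot is always the first (or last) element and the list is already sorted, every partition is maximally unbalanced and the runtime becomes $n^2$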
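
The two-iterator partitioning described in the bullets above (i moving forward, j moving backward, pivot moved to index j at the end) can be sketched as follows. Taking the first element as the pivot and the names `partitionTwoPointer` and `quickSortTwoPointer` are assumptions for this illustration; the implementations further down use a different, single-scan partition.

```cpp
#include <utility>
#include <vector>

// Partitions a[low..high] around the pivot a[low] and returns the pivot's final index.
int partitionTwoPointer(std::vector<int>& a, int low, int high) {
    int pivot = a[low];
    int i = low + 1; // scans forward for an element greater than the pivot
    int j = high;    // scans backward for an element not greater than the pivot
    while (true) {
        while (i <= j && a[i] <= pivot) ++i;
        while (i <= j && a[j] > pivot) --j;
        if (i > j) break;        // i and j crossed: the whole range was visited
        std::swap(a[i], a[j]);   // both are on the wrong side, so exchange them
    }
    std::swap(a[low], a[j]);     // j is the last position holding a value <= pivot
    return j;
}

void quickSortTwoPointer(std::vector<int>& a, int low, int high) {
    if (low < high) {
        int p = partitionTwoPointer(a, low, high);
        quickSortTwoPointer(a, low, p - 1);   // left of the pivot
        quickSortTwoPointer(a, p + 1, high);  // right of the pivot
    }
}
```

A call such as `quickSortTwoPointer(v, 0, (int)v.size() - 1)` sorts the whole vector; both bounds are inclusive.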

```cpp
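// implementation one: Lomuto-style partition with the last element as pivot
#include <utility>
using namespace std;

// Places arr[high] (the pivot) at its sorted position and returns that index.
int part(int arr[], int low, int high){
    int pivot = arr[high];
    int i = low-1; // last index of the "<= pivot" region
    for(int j = low; j < high; ++j){
        if(arr[j] <= pivot){
            i++;
            swap(arr[i], arr[j]); // grow the "<= pivot" region by one
        }
    }

    swap(arr[i+1], arr[high]); // pivot goes right after the "<= pivot" region
    return (i+1);
}

void quickSort(int arr[], int low, int high){
    if(low < high){ // ranges with fewer than two elements are already sorted
        int pi = part(arr, low, high);
        quickSort(arr, low, pi-1);  // left of the pivot
        quickSort(arr, pi+1, high); // right of the pivot
    }
}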
```
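
To check the $\frac{n*(n-1)}{2}$ worst-case figure, the partition scheme above can be instrumented to count comparisons. This is a sketch for experimentation; the name `quickSortCount` and the use of a `std::vector` instead of a raw array are assumptions.

```cpp
#include <utility>
#include <vector>

// Quick sort with the last element as pivot, as above, but returning
// the number of element-to-pivot comparisons that were made.
long long quickSortCount(std::vector<int>& a, int low, int high) {
    if (low >= high) return 0;
    long long comparisons = 0;
    int pivot = a[high];
    int i = low - 1;
    for (int j = low; j < high; ++j) {
        ++comparisons;                 // one comparison per scanned element
        if (a[j] <= pivot) {
            ++i;
            std::swap(a[i], a[j]);
        }
    }
    std::swap(a[i + 1], a[high]);
    int pi = i + 1;
    comparisons += quickSortCount(a, low, pi - 1);
    comparisons += quickSortCount(a, pi + 1, high);
    return comparisons;
}
```

For 10 distinct elements that are already in ascending or descending order, the pivot lands at an extreme end in every call, so the count is 9+8+...+1 = 45, which matches $\frac{10*9}{2}$.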

* we keep working on the same array and just keep track of the start and end indices 
* the recursive calls work on the sub-ranges arr(low, pivot_index-1) and arr(pivot_index+1, high)
* we pick the pivot in a deterministic way, for example always the first element or always the last element 
* we recursively partition until a range has at most one element left, which is the termination condition 
* partition steps:
    * p_index starts at the first index of the range (`start`) 
    * p_index is where the pivot will end up after partitioning 
    * as i increments, any element smaller than or equal to the pivot is swapped with the element at p_index, and p_index moves forward by one 
    * at the end we swap the pivot with the element at p_index 
    
```cpp
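#include <iostream>
#include <utility>
#include <vector>
using namespace std;

// Partitions arr[start..end] around the pivot arr[end]; returns the pivot's final index.
int partition(vector<int> & arr, int start, int end){
    int pivot = arr[end];
    int p_index = start; // next slot for an element <= pivot

    for(int i = start; i < end; ++i){
        if(arr[i] <= pivot){
            swap(arr[i], arr[p_index]);
            p_index++;
        }
    }
    swap(arr[p_index], arr[end]); // put the pivot in its sorted position
    return p_index;
}

void quickSort(vector<int>& arr, int low, int high){
    if(low < high){
        int pi = partition(arr, low, high);
        cout << "calling left with pivot: " << arr[pi] << endl;
        quickSort(arr, low, pi-1);
        cout << "calling right with pivot: " << arr[pi] << endl;
        quickSort(arr, pi+1, high);
    }
}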
```

#### 5. Merge sort

* time taken for merging two sorted arrays of sizes m and n into an array c is O(m+n)
* we can't store the merged result in the same array while merging 
* so we need extra space for merging 
* we can implement merge sort recursively or iteratively
* Iteratively:
    * start with single elements and merge every two elements into sorted pairs 
    * next pass, merge the pairs into sorted runs of four elements
    * next pass, merge those into sorted runs of eight elements, and so on
    * time complexity is $O(log(n)*n)$ 
* there are log(n) passes, and each pass merges all n elements in O(n) time.
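
The merge step and the iterative passes can be sketched together. The names `mergeRuns` and `mergeSortIterative` and the shared buffer `tmp` are assumptions for this illustration; the buffer is the extra space mentioned above.

```cpp
#include <algorithm>
#include <vector>

// Merges the sorted runs a[lo, mid) and a[mid, hi) through the buffer tmp.
// Every element is copied once into tmp and once back, so the cost is O(hi - lo).
void mergeRuns(std::vector<int>& a, std::vector<int>& tmp, int lo, int mid, int hi) {
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        // <= takes from the left run on ties, which keeps the sort stable
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    }
    while (i < mid) tmp[k++] = a[i++];   // leftovers of the left run
    while (j < hi) tmp[k++] = a[j++];    // leftovers of the right run
    std::copy(tmp.begin() + lo, tmp.begin() + hi, a.begin() + lo);
}

// Bottom-up merge sort: run width doubles each pass (1, 2, 4, 8, ...).
void mergeSortIterative(std::vector<int>& a) {
    int n = static_cast<int>(a.size());
    std::vector<int> tmp(n);             // extra space of size n
    for (int width = 1; width < n; width *= 2) {
        for (int lo = 0; lo < n; lo += 2 * width) {
            int mid = std::min(lo + width, n);
            int hi = std::min(lo + 2 * width, n);
            mergeRuns(a, tmp, lo, mid, hi);
        }
    }
}
```

The outer loop runs about log(n) times and each of its iterations touches all n elements, which gives the $O(log(n)*n)$ total.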