# Data Structures and Algorithms (in C)
<br>
<div style="opacity: 0.8; font-family: Consolas, Monaco, Lucida Console, Liberation Mono, DejaVu Sans Mono, Bitstream Vera Sans Mono, Courier New; font-size: 12px; font-style: italic;">
    ────────
    for more from the author, visit
    <a href="https://github.com/hazemanwer2000">github.com/hazemanwer2000</a>.
    ────────
</div>

## Table of Contents
* [Algorithms](#algorithms)
    * [Searching Algorithms](#searching-algorithms)
        * [Linear Search](#linear-search)
        * [Binary Search](#binary-search)
    * [Sorting Algorithms](#sorting-algorithms)
        * [Bubble Sort](#bubble-sort)
        * [Selection Sort](#selection-sort)
        * [Insertion Sort](#insertion-sort)
        * [Quick Sort](#quick-sort)
        * [Merge Sort](#merge-sort)
<hr>

## Algorithms

### Searching Algorithms

#### Linear Search

An elementary approach to searching is to iterate through the items of the list, one by one, until the search item is found or the list is exhausted.

This is called *linear search*. It has an average-case *time complexity* of $O(n)$.

In [22]:
//%cflags: .jupyter/print_arr.c

#include <stdio.h>
#define LEN(ARR) (*(&ARR+1) - ARR)
void print_arr(const char *label, const int arr[], const int len);

int linear_search(int elem, int arr[], int len) {
    for (int i = 0; i < len; i++) {
        if (arr[i] == elem) {
            return i;
        }
    }
    return -1;
}

int main() {
    int arr[] = {12, 57, 79, 23, 56};
    int len = LEN(arr);
    int x = 78, y = 79;
    print_arr("Array:", arr, len);
    putchar('\n');
    printf("Index of %d: %d\n", x, linear_search(x, arr, len));
    printf("Index of %d: %d\n", y, linear_search(y, arr, len));
}

Array:     12, 57, 79, 23, 56

Index of 78: -1
Index of 79: 2


#### Binary Search

For an ordered list of items, a *divide-and-conquer* approach becomes feasible. With knowledge of the length of the list, the list is divided into two halves around a middle item. The middle item is compared to the search item, and based on the result (if non-matching), the search continues in either the left half or the right half. Then, the half is further divided into left and right quarters, and so on.

This is called *binary search*. It has an average-case time complexity of $O(\log n)$.

*Note:* A time complexity of $O(\log n)$ is only achievable if any item of the list can be accessed in $O(1)$ time, as in an array.

In [26]:
//%cflags: .jupyter/print_arr.c

#include <stdio.h>
#define LEN(ARR) (*(&ARR+1) - ARR)
void print_arr(const char *label, const int arr[], const int len);

int binary_search(int elem, int arr[], int len) {
    int lower = 0, upper = len - 1, mid;
    while (lower <= upper) {
        mid = lower + (upper - lower) / 2;   /* avoids overflow of (lower + upper) */
        if (elem == arr[mid]) {
            return mid;
        } else if (elem > arr[mid]) {
            lower = mid + 1;
        } else {
            upper = mid - 1;
        }
    }
    return -1;
}

int main() {
    int arr[] = {5, 17, 19, 26, 54};
    int len = LEN(arr);
    int x = 17, y = 54, z = 18;
    print_arr("Array:", arr, len);
    putchar('\n');
    printf("Index of %d: %d\n", x, binary_search(x, arr, len));
    printf("Index of %d: %d\n", y, binary_search(y, arr, len));
    printf("Index of %d: %d\n", z, binary_search(z, arr, len));
}

Array:     5, 17, 19, 26, 54

Index of 17: 1
Index of 54: 4
Index of 18: -1


### Sorting Algorithms

#### Bubble Sort

One easy approach to sorting a list of items is to iterate over the list repeatedly, comparing every two consecutive items and switching them if they are in the wrong order. Since each scan moves the largest remaining item to its final position, $n-1$ scans of $n-1$ comparisons each suffice, so at most $(n-1)^2$ comparisons are made, where $n$ is the number of items in the list.

An optimization is to reset a flag before each scan, and set it whenever two items are switched. If an entire scan ($n-1$ comparisons) completes without the flag being set, the list is sorted and the algorithm terminates, saving further comparisons.

This is called *bubble sort*. It has an average-case time complexity of $O(n^2)$.

*Note:* A *stable* sorting algorithm is one that maintains the relative order between equal items. Since bubble sort never switches two equal items, it is a stable algorithm.

In [36]:
//%cflags: .jupyter/print_arr.c

#include <stdio.h>
#define LEN(ARR) (*(&ARR+1) - ARR)
void print_arr(const char *label, const int arr[], const int len);

void bubble_sort(int arr[], int len) {
    int i, flag, tmp;
    do {
        flag = 0;                       /* flag reset, before each scan */
        for (i = 0; i < len-1; i++) {
            if (arr[i] > arr[i+1]) {
                flag = 1;               /* flag set, if switching occurred */
                tmp = arr[i];
                arr[i] = arr[i+1];
                arr[i+1] = tmp;
            }
        }
    } while (flag);
}

int main() {
    int arr[] = {62, 43, 89, 80, 79, 11};
    int len = LEN(arr);
    print_arr("Before:", arr, len);
    bubble_sort(arr, len);
    print_arr("After:", arr, len);
}

Before:    62, 43, 89, 80, 79, 11
After:     11, 43, 62, 79, 80, 89


#### Selection Sort

Another easy approach is to select the smallest item in the list, and exchange it with the first item. Then, select the second smallest item, and exchange it with the second item, and so on.

This is called *selection sort*. It has an average-case time complexity of $O(n^2)$.

*Note:* Selection sort is not a stable sorting algorithm, since an exchange may move an item across the list, past items equal to it.

*Note:* One advantage of selection sort is that at most $n-1$ exchanges take place.

In [38]:
//%cflags: .jupyter/print_arr.c

#include <stdio.h>
#define LEN(ARR) (*(&ARR+1) - ARR)
void print_arr(const char *label, const int arr[], const int len);

void selection_sort(int arr[], int len) {
    int i, j, smallest, tmp;
    for (i = 0; i < len-1; i++) {        /* first item, second item, so on */
        smallest = i;
        for (j = 1+i; j < len; j++) {         /* comparing each following item */
            if (arr[smallest] > arr[j]) {      /* to determine the smallest */
                smallest = j;
            }
        }
        tmp = arr[smallest];
        arr[smallest] = arr[i];
        arr[i] = tmp;
    }
}

int main() {
    int arr[] = {62, 43, 89, 80, 79, 11};
    int len = LEN(arr);
    print_arr("Before:", arr, len);
    selection_sort(arr, len);
    print_arr("After:", arr, len);
}

Before:    62, 43, 89, 80, 79, 11
After:     11, 43, 62, 79, 80, 89


#### Insertion Sort

One more easy approach is to treat the first element alone as a sorted sub-list. Then, insert the second element backwards into the sub-list, maintaining its sorted property. Then, again, insert the third element backwards, and so on. With each insertion, the sub-list grows, until it spans the whole list of elements.

This is called *insertion sort*. It has an average-case time complexity of $O(n^2)$.

In [39]:
//%cflags: .jupyter/print_arr.c

#include <stdio.h>
#define LEN(ARR) (*(&ARR+1) - ARR)
void print_arr(const char *label, const int arr[], const int len);

void insertion_sort(int arr[], int len) {
    int i, j, tmp;
    for (i = 1; i < len; i++) {           /* inserting each element backwards, beginning with second */
        tmp = arr[i];                      /* saving the element */
        j = i-1;
        while (j >= 0 && arr[j] > tmp) {     /* continuously shifting sorted elements to the right */
            arr[j+1] = arr[j];                /* until element is smaller than element being inserted */
            j--;
        }
        arr[j+1] = tmp;
    }
}

int main() {
    int arr[] = {62, 43, 89, 80, 79, 11};
    int len = LEN(arr);
    print_arr("Before:", arr, len);
    insertion_sort(arr, len);
    print_arr("After:", arr, len);
}

Before:    62, 43, 89, 80, 79, 11
After:     11, 43, 62, 79, 80, 89


*Note:* Since an equal element in the sorted list is never shifted, insertion sort is a stable sorting algorithm.

#### Quick Sort

A more advanced approach to sorting employs the divide-and-conquer paradigm.
* Select an element from the list, called the *partitioning element*.
* Determine the final position of the partitioning element in the sorted list.
* Re-arrange all other elements around the final position, so that all elements to the left of it are smaller than the partitioning element, and to the right, larger. Naturally, this leads the partitioning element to its final position.
* Recursively apply the algorithm on the sub-lists to the right and left of the partitioning element.

But these steps lead to a number of questions.
* How do we select the partitioning element?
* How do we determine its final position?
* How do we re-arrange all other elements around the final position?

Let us consider the case of selecting the last element of the list as the partitioning element.
* Iterate forward through the list with `i`, and backward with `j`, excluding the partitioning element.
* Whenever `list[i]` is larger than the partitioning element, and `list[j]` is smaller than it, switch both items and proceed.
* Stop switching (and iterating) when `i` crosses `j`. In other words, when `i >= j`.
* Switch the partitioning element with `list[i]`, since `i` indexes an element that is no smaller than the partitioning element, and the partitioning element is guaranteed to be at the same or a higher index.
* Finally, recursively apply the algorithm to the left and right sub-lists.

This is called *quick sort*. It has an average-case time complexity of $O(n \log n)$.

*Note:* Quick sort is not a stable sorting algorithm, since partitioning switches items over long distances, possibly past items equal to them.

In [44]:
//%cflags: .jupyter/print_arr.c

#include <stdio.h>
#define LEN(ARR) (*(&ARR+1) - ARR)
void print_arr(const char *label, const int arr[], const int len);

                                         /* classic quick-sort implementation */
                                                 
void quick_sort(int arr[], int start, int end) {
    int i, j, tmp;
    if (start < end) {                   /* making sure it is a sub-array of 2+ elements */
        i = start-1;
        j = end;
        while (1) {
            while (arr[++i] < arr[end]);               /* stops at 'end', at the latest */
            while (j > start && arr[--j] > arr[end]);  /* bounded, so 'j' never drops below 'start' */
            if (i >= j) {                /* check if 'i' and 'j' crossed */
                break;
            }
            tmp = arr[i];                /* otherwise, switch 'i' and 'j' */
            arr[i] = arr[j];
            arr[j] = tmp;
        }
        tmp = arr[i];                    /* switch 'i' and partitioning element at 'end' */
        arr[i] = arr[end];
        arr[end] = tmp;
        quick_sort(arr, start, i-1);     /* recursive sub-array calls */
        quick_sort(arr, i+1, end);
    }
}

int main() {
    int arr[] = {62, 43, 89, 80, 79, 11};
    int len = LEN(arr);
    print_arr("Before:", arr, len);
    quick_sort(arr, 0, len-1);
    print_arr("After:", arr, len);
}

Before:    62, 43, 89, 80, 79, 11
After:     11, 43, 62, 79, 80, 89


The worst-case time complexity is $O(n^2)$, and occurs, for instance, when the list is already (or almost) sorted. This is because selecting the last (or first) element of the list as the partitioning element then leads to lopsided divisions, with one sub-array being very large, and the other very small.

To avoid this, an optimization called *median-of-three partitioning* is frequently employed. The median of the first, last and middle elements of the list is selected, inexpensively ensuring that the partitioning element is not the smallest or largest of those three.

*Note:* Since any recursive algorithm can be converted into an iterative algorithm, another possible optimization is to implement quick sort iteratively using a stack data structure, discussed later.

#### Merge Sort

Another advanced approach to sorting also employs the divide-and-conquer paradigm.

It builds on the algorithm for merging two sorted arrays. If two sorted arrays, `A` and `B`, are to be merged into a new sorted array, `C`, which is initially empty, simply walk through the items of both arrays with `i` and `j`. If `A[i]` is smaller than `B[j]`, append `A[i]` to `C` and increment `i`. Otherwise, append `B[j]` and increment `j`. Proceed until either `A` or `B` runs out of items. Lastly, append the remaining items of the other array to `C`.

In [46]:
//%cflags: .jupyter/print_arr.c

#include <stdio.h>
#define LEN(ARR) (*(&ARR+1) - ARR)
void print_arr(const char *label, const int arr[], const int len);

void merge_sorted(int A[], int lenA, int B[], int lenB, int C[]) {
    int i = 0, j = 0, k = 0;
    
    while (i < lenA && j < lenB) {     /* merging both */
        if (A[i] < B[j]) {
            C[k++] = A[i++];
        } else {
            C[k++] = B[j++];
        }
    }
    
    if (i == lenA) {           /* appending non-empty array */
        while (j < lenB) {
            C[k++] = B[j++];
        }
    } else {
        while (i < lenA) {
            C[k++] = A[i++];
        }
    }
}

int main() {
    int A[] = {1, 3, 7, 8, 9, 13, 15};
    int B[] = {2, 4, 5, 10, 11, 14, 16};
    int lenA = LEN(A);
    int lenB = LEN(B);
    int C[lenA + lenB];
    print_arr("A:", A, lenA);
    print_arr("B:", B, lenB);
    merge_sorted(A, lenA, B, lenB, C);
    print_arr("C:", C, lenA + lenB);
}

A:         1, 3, 7, 8, 9, 13, 15
B:         2, 4, 5, 10, 11, 14, 16
C:         1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 13, 14, 15, 16


Now, let us assume that the two arrays to be merged are actually consecutive sub-arrays within a larger array, `D`. We would like to merge these two sub-arrays, to occupy the same location in `C` as they did in `D`.

In [52]:
//%cflags: .jupyter/print_arr.c

#include <stdio.h>
#define LEN(ARR) (*(&ARR+1) - ARR)
void print_arr(const char *label, const int arr[], const int len);

/* 'mid' is the first index of the second sub-array */
/* 'hi' is exclusive: one past the last index of the second sub-array */
void merge_sorted(int to[], int from[], int low, int mid, int hi) { 
    int i, j, k;
    i = k = low, j = mid;
    
    while (i < mid && j < hi) {       /* merging sub-arrays */
        if (from[i] < from[j]) {
            to[k++] = from[i++];
        } else {
            to[k++] = from[j++];
        }
    }
    
    if (i == mid) {                 /* appending non-empty array */
        while (j < hi) {
            to[k++] = from[j++];
        }
    } else {
        while (i < mid) {
            to[k++] = from[i++];
        }
    }
}

int main() {
    int D[] = {1, 3, 5, 7, 2, 4, 6, 8, 10, 12};
    int len = LEN(D);
    int C[len];
    merge_sorted(C, D, 0, 4, len);
    print_arr("D:", D, len);
    print_arr("C:", C, len);
}

D:         1, 3, 5, 7, 2, 4, 6, 8, 10, 12
C:         1, 2, 3, 4, 5, 6, 7, 8, 10, 12


Now, let us assume `D` is unsorted. What if we divide `D` into sorted sub-arrays, each of unit-length, then iteratively merge every two consecutive sub-arrays, until the larger array is sorted? Sounds like a plan.