# Lab 10 Examples (Bubble Sort)

Press \<shift> \<enter> in each code cell to run the code. Be sure to start with the ```#include``` directives to load the required libraries.

In [1]:
// For Lab 10, we are limited to using only <iostream>, <iomanip>, and <string>

#include <iostream>
#include <iomanip>
#include <string>

# Overview

In today's lab, we will discuss bubble sort and analyze its time complexity in terms of Big-O, Big-Ω, and Big-Θ notation. Bubble sort is a simple sorting algorithm that repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order. It is known as a comparison-based sorting algorithm, which relies on comparing elements to determine their correct order.

<h2>The Meaning of Big <span>\(\mathcal{O}\), \(\Omega\), and \(\Theta\)</span></h2>

These symbols stand for Big-O (the letter O), Big-Omega (the Greek letter Ω), and Big-Theta (the Greek letter Θ). They are used to describe the performance or complexity of an algorithm, in terms of space or time, as the input size grows.

For the rest of the analysis, we will focus on time complexity, although the same terms apply to space complexity too.

- **Big-O** describes an upper bound on the algorithm's complexity, the maximum amount of time the algorithm could take for a given problem size of `n`.

- **Big-Ω** describes a lower bound on the algorithm's complexity, the minimum amount of time the algorithm could take for a given problem size of `n`.

- **Big-Θ** describes a tight bound on the algorithm's complexity, meaning it is bounded at that complexity from above and from below.
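Stated a little more formally (these are the standard textbook definitions, added here as a supplement), for a running time $f(n)$ and a comparison function $g(n)$:

$$f(n) \in \mathcal{O}(g(n)) \iff \exists\, c > 0,\ n_0 \ \text{such that}\ f(n) \le c \cdot g(n) \ \text{for all}\ n \ge n_0$$

$$f(n) \in \Omega(g(n)) \iff \exists\, c > 0,\ n_0 \ \text{such that}\ f(n) \ge c \cdot g(n) \ \text{for all}\ n \ge n_0$$

$$f(n) \in \Theta(g(n)) \iff f(n) \in \mathcal{O}(g(n)) \ \text{and}\ f(n) \in \Omega(g(n))$$

As a small illustrative example, $f(n) = 3n + 2$ satisfies $3n \le 3n + 2 \le 5n$ for all $n \ge 1$, so it is $\mathcal{O}(n)$, $\Omega(n)$, and therefore $\Theta(n)$.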

We will apply these terms to bubble sort, but before we do, let's look at a few very simple algorithms to get some practice at using these terms.

<h3>An Algorithm to Flip a Number's Sign <span>\(\mathcal{O}(1)\), Constant Time</span></h3>

In [2]:
void flipSign(int &num) {
    num = -num;
}

int num = 5;
flipSign(num);
std::cout << "Flipped sign: " << num << std::endl;

Flipped sign: -5


Regardless of the value of `num`, this algorithm takes the same amount of time to run. Flipping the sign of 10 will take the same amount of time as flipping the sign of 1000. It's a constant time operation, so we say it is O(1), Ω(1), and Θ(1).

Big-O says that the algorithm will never run any longer than constant time. It only speaks of this upper bound. It says that constant time is the longest amount of time this algorithm could take.

Big-Ω says that the algorithm will never run any shorter than constant time. It only speaks of this lower bound. It says that constant time is the shortest amount of time this algorithm could take.

Big-Θ says that the algorithm will never run any longer than AND never run any shorter than constant time. It speaks to both the upper and lower bounds. It is a tight bound.

But not all algorithms have the same upper and lower bounds. Let's look at an example.

<h3>Linear Search <span>\(\mathcal{O}(n), \Omega(1)\)</span></h3>

In [3]:
bool linearSearch(const int arr[], int size, int target) {
    for (int i = 0; i < size; i++) {
        if (arr[i] == target) {
            return true; // Found
        }
    }
    return false; // Not Found
}

int list[] = {3, 5, 2, 8, 6};

bool found = linearSearch(list, 5, 8);
if (found) {
    std::cout << "Found." << std::endl;
} else {
    std::cout << "Not Found." << std::endl;
}

Found.


Example list: [3, 5, 2, 8, 6]

If we are searching for the number 3, the algorithm will find it on the first comparison. This is the best-case and we'll have our answer in constant time.

If we are searching for the number 6 (or for a number that isn't there) the algorithm will have to check every element in the list before knowing the answer. This is the worst-case and will take linear time, O(n), where n is the number of elements in the list.

Here the best-case and worst-case are different. The algorithm is O(n) because in the worst-case it will take linear time. The algorithm is Ω(1) because in the best-case it will take constant time. However, we cannot say it is Θ(n) because the lower bound does not equal the upper bound.
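To make the gap between the two bounds concrete, the sketch below counts how many elements linear search examines. The name `linearSearchComparisons` is an illustrative assumption and not part of the lab. It uses the same loop as `linearSearch` but returns the number of comparisons instead of a found flag.

```cpp
// Hypothetical helper (not part of the lab): same loop as linearSearch,
// but it returns how many elements were compared against the target.
int linearSearchComparisons(const int arr[], int size, int target) {
    int comparisons = 0;
    for (int i = 0; i < size; i++) {
        comparisons++;
        if (arr[i] == target) {
            break; // Found, stop early
        }
    }
    return comparisons;
}
```

For the example list `{3, 5, 2, 8, 6}`, searching for 3 costs 1 comparison, while searching for 6 (the last element) or for a missing value such as 7 costs 5 comparisons, which is n.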

Note that sometimes, we will use Θ to describe a specific case such as the worst-case.  It might be said that "the worst-case time complexity of linear search is Θ(n)". Here we are only talking about the worst-case, so we can place a tight bound on it. The worst-case will always be linear time. The worst-case is Θ(n).

Θ will often be used to describe an algorithm's average-case. But Θ itself does not suggest an average-case. It simply places a tight bound (a bound from above and below) on an algorithm's complexity.

<h3>Binary Search <span>\(\mathcal{O}(\log n), \Omega(1)\)</span></h3>

In [4]:
bool binarySearch(const int arr[], int low, int high, int target) {
    if (low > high) {
        return false; // Not Found
    }
    int mid = low + (high - low) / 2;
    if (arr[mid] == target) {
        return true; // Found
    } else if (arr[mid] > target) {
        return binarySearch(arr, low, mid - 1, target); // left half
    } else {
        return binarySearch(arr, mid + 1, high, target); // right half
    }
}

int sortedList[] = {2, 4, 5, 8, 9, 12, 15, 18, 20, 25, 30, 35, 40, 45, 50};
bool foundBinary = binarySearch(sortedList, 0, 14, 45);
if (foundBinary) {
    std::cout << "Found." << std::endl;
} else {
    std::cout << "Not Found." << std::endl;
}

Found.


Binary search is an efficient divide-and-conquer algorithm that only works if the list is sorted. But if the list is sorted, the running time is much better than linear search. Binary search works by dividing the list in half, and checking the middle element. If the middle element is greater than the target, we can disregard the entire upper half of the list. If the middle element is less than the target, we can disregard the entire lower half of the list. We recurse on the half that could contain the target, halving the problem size with each recursive call.

Consider the list `{2, 4, 5, 8, 9, 12, 15, 18, 20, 25, 30, 35, 40, 45, 50}` and a target value of 40.

1. First, we check the middle element (18). Since 40 is greater than 18, we can ignore the lower half of the list.

2. Next, we check the middle element of the remaining upper half (35). Since 40 is greater than 35, we can again ignore the lower half of this sub-list.

3. We then check the middle element of the remaining upper half (45). Since 40 is less than 45, we can ignore the upper half of this sub-list.

4. Finally, we check the middle element of the remaining lower half (40). We found the target!

#### Time Complexity Analysis

Just like linear search, we could get lucky and find the target on the first try. The item we're looking for could be the middle element of the whole list. This best-case is constant time. So, binary search is Ω(1).

The worst-case occurs when we keep halving until the sub-list size is 1. For example, a list of size 128 is repeatedly halved as

128 → 64 → 32 → 16 → 8 → 4 → 2 → 1

This repeated division by 2 means the number of steps grows logarithmically. In general, the number of halvings is about log₂ n. For instance, 2^7 = 128, so log₂ 128 = 7. Therefore, the upper bound on the running time is logarithmic, O(log n). This is impressive: for n ≈ 1,000,000, we have 2^20 = 1,048,576, so log₂ n ≈ 20, and binary search will only need to look at about 20 elements in the worst-case.
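The logarithmic growth can be checked by counting. The sketch below is an assumption-laden illustration, not part of the lab: `binarySearchProbes` is a hypothetical iterative variant of `binarySearch` that returns how many middle elements it examined, and `worstCaseProbes` is a hypothetical helper that counts the sub-list sizes visited when the size is halved until nothing is left.

```cpp
// Hypothetical helper (not part of the lab): iterative binary search that
// returns how many middle elements were examined instead of true/false.
int binarySearchProbes(const int arr[], int size, int target) {
    int low = 0;
    int high = size - 1;
    int probes = 0;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        probes++;
        if (arr[mid] == target) {
            break; // Found
        } else if (arr[mid] > target) {
            high = mid - 1; // keep the left half
        } else {
            low = mid + 1; // keep the right half
        }
    }
    return probes;
}

// Hypothetical helper: counts the sub-list sizes n, n/2, n/4, ..., 1
// (integer division) visited before the sub-list becomes empty.
int worstCaseProbes(int n) {
    int probes = 0;
    while (n > 0) {
        probes++;
        n /= 2;
    }
    return probes;
}
```

On the 15-element `sortedList` above, a target of 18 takes 1 probe and a target of 40 takes 4 probes, matching the four steps traced earlier. `worstCaseProbes(128)` returns 8 (one probe for each of the sizes 128 → 64 → ... → 1), and `worstCaseProbes(1000000)` returns 20.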


<h3>Count the Occurrences of a Number in an Unsorted List <span>\(\mathcal{O}(n), \Theta(n), \Omega(n)\)</span></h3>

In [5]:
int countOccurrences(int arr[], int size, int target) {
    int count = 0;
    for (int i = 0; i < size; i++) {
        if (arr[i] == target) {
            count++;
        }
    }
    return count;
}

int data[] = {1, 2, 3, 2, 4, 2, 5, 2, 6, 2};
int occurrences = countOccurrences(data, 10, 2);
std::cout << "Occurrences of 2: " << occurrences << std::endl;

Occurrences of 2: 5


In order to count the occurrences of a number in an unsorted list, we have to check every element in the list. Here we have a tight bound. Regardless of the input, we will always have to check every element. Counting occurrences of a number is therefore Θ(n).  It is still correct to say it is O(n) and Ω(n), but Θ(n) provides more information since it speaks to both the upper and lower bounds.

<h3>Bubble Sort (Basic) <span>\(\mathcal{O}(n^2)\), \(\Omega(n^2)\), \(\Theta(n^2)\)</span></h3>


Bubble sort is a comparison-based sorting algorithm that repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order.

```text

Consider the list: [5, 2, 9, 1, 4]

First Pass:

j  j+1
5    2    9    1    4   (out of order, swap)
2    5    9    1    4

     j  j+1
2    5    9    1    4   (in order, go on)

          j  j+1
2    5    9    1    4   (out of order, swap)
2    5    1    9    4

               j  j+1
2    5    1    9    4   (out of order, swap)
2    5    1    4    9

After the first pass, the largest element (9) is in its correct position,
so the subsequent passes can ignore it.

Second Pass:

j  j+1
2    5    1    4    9   (in order, go on)

     j  j+1
2    5    1    4    9   (out of order, swap)
2    1    5    4    9

          j  j+1
2    1    5    4    9   (out of order, swap)
2    1    4    5    9

After the second pass, the two largest elements (5 and 9) are in their correct positions.

Third Pass:

j  j+1
2    1    4    5    9   (out of order, swap)
1    2    4    5    9

     j  j+1
1    2    4    5    9   (in order, go on)

After the third pass, the three largest elements (4, 5, and 9) are in their correct positions.

Fourth Pass:

j  j+1
1    2    4    5    9   (in order, the end)

```

The size of the list is 5. n = 5.

- The number of comparisons made on the first pass is (n-1) = 4.
- The number of comparisons made on the second pass is (n-2) = 3.
- The number of comparisons made on the third pass is (n-3) = 2.
- The number of comparisons made on the fourth pass is (n-4) = 1.

The number decreases by 1 with each pass because the largest unsorted element "bubbles up" to its correct position at the end of each pass.

```text
T(5) = (n-1) + (n-2) + (n-3) + (n-4)
     =   4   +   3   +   2   +   1
     = 10

T(9) = (n-1) + (n-2) + (n-3) + (n-4) + (n-5) + (n-6) + (n-7) + (n-8)
     =   8   +   7   +   6   +   5   +   4   +   3   +   2   +   1
     = 36
```

Generalizing, the comparisons on each pass form the sum $(n-1) + (n-2) + \dots + 1$, so we have:

$$T(n) = \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2}$$

Examples:

$$T(5) = \frac{5(5-1)}{2} = \frac{20}{2} = 10$$
<br><br>
$$T(9) = \frac{9(9-1)}{2} = \frac{72}{2} = 36$$

Multiplied out, we have:

$$T(n) = \frac{1}{2}n^2 - \frac{n}{2}$$

As $n$ grows large, the $n^2$ term dominates. The constants and lower-order terms become insignificant.

Therefore, we say that the time complexity of bubble sort is:

$$\mathcal{O}(n^2)$$

This is the upper bound.  Is this a tight bound?  What would be the best case?
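One way to explore that question is to count comparisons directly. The following is a minimal sketch of the basic version, under the assumption that it has no early exit. The name `basicBubbleSortComparisons` is hypothetical and not part of the lab.

```cpp
// Hypothetical helper (not part of the lab): basic bubble sort with no
// early exit. Sorts arr in ascending order and returns the comparison count.
int basicBubbleSortComparisons(int arr[], int size) {
    int comparisons = 0;
    for (int i = 0; i < size - 1; i++) {
        // After i passes, the last i elements are already in place,
        // so this pass only compares up to index size - 1 - i.
        for (int j = 0; j < size - 1 - i; j++) {
            comparisons++;
            if (arr[j] > arr[j + 1]) {
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
        }
    }
    return comparisons;
}
```

Because the loop bounds depend only on `size`, this version makes n(n-1)/2 comparisons on every input: 10 for `{5, 2, 9, 1, 4}`, and also 10 for the already sorted `{1, 2, 4, 5, 9}`. With no cheaper best case, the basic version is Ω(n²) as well as O(n²), which gives the Θ(n²) in the heading.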


<h3>Bubble Sort (Improved) <span>\(\mathcal{O}(n^2), \Omega(n)\)</span></h3>


Improved bubble sort adds a flag that records whether a pass made any swaps. If a pass finishes without a swap, the list is sorted and the algorithm can stop. The best-case occurs when the list is already sorted at the start. A single pass through the list will confirm that no swaps are needed, and the algorithm can exit after only n-1 comparisons. Therefore, the best-case time complexity is linear, Ω(n).

For today's lab assignment, we will implement the improved bubble sort, while counting the number of swaps and comparisons.

```c++
int improvedBubbleSort(int arr[], int size) {

    // The lab assignment requires us to count the number of swaps and
    // comparisons, so keep a running total of actions.

    // Comparison operations count as one action. Swap operations count as
    // three actions (swapping requires three assignments).
    int actions = 0;

    // The outer loop tracks the number of passes, and the inner loop performs
    // the comparisons and swaps. After i passes, the last i elements are in
    // place, so the inner loop stops before them.
    for (int i = 0; i < size - 1; i++) {
        bool swapped = false;
        for (int j = 0; j < size - 1 - i; j++) {
            // Compare elements (counting the comparison as one action).
            actions += 1;
            if (arr[j] > arr[j + 1]) {
                // They are out of order, so swap them
                // (counting the swap as three actions).
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                actions += 3;
                swapped = true;
            }
        }

        // If no swaps were made, we're done. Return the number of actions.
        if (!swapped) {
            return actions;
        }
    }

    // Having reached the end of the loops, we weren't able to return early,
    // so return the number of actions here.
    return actions;
}
```
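To compare the best and worst cases side by side, here is a minimal sketch that tallies comparisons and swaps separately. It is an illustration under stated assumptions rather than the required lab solution: the `SortCounts` struct and the `countImprovedBubbleSort` name are hypothetical, and the function returns both tallies instead of a single action total.

```cpp
// Hypothetical result type (not part of the lab): separate tallies so the
// best and worst cases can be compared directly.
struct SortCounts {
    int comparisons;
    int swaps;
};

// Hypothetical helper: improved bubble sort (early exit when a pass makes
// no swaps). Sorts arr in ascending order and returns both tallies.
SortCounts countImprovedBubbleSort(int arr[], int size) {
    SortCounts counts = {0, 0};
    for (int i = 0; i < size - 1; i++) {
        bool swapped = false;
        for (int j = 0; j < size - 1 - i; j++) {
            counts.comparisons++;
            if (arr[j] > arr[j + 1]) {
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                counts.swaps++;
                swapped = true;
            }
        }
        if (!swapped) {
            break; // A pass with no swaps means the list is sorted
        }
    }
    return counts;
}
```

For n = 5, the already sorted list `{1, 2, 4, 5, 9}` needs 4 comparisons and 0 swaps (the linear best case), while the reversed list `{9, 5, 4, 2, 1}` needs 10 comparisons and 10 swaps (the quadratic worst case). The earlier example `{5, 2, 9, 1, 4}` needs 10 comparisons and 6 swaps, which under the lab's weighting is 10 + 3 × 6 = 28 actions.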