# BubbleSort

BubbleSort is one of the oldest comparison-based sorting algorithms, and also one of the slowest, even compared with other algorithms of the same $O(n^2)$ time complexity. In the best case, when the input is already sorted, it runs in $O(n)$, provided it stops as soon as a pass makes no swaps. In the worst case, when the input is in reverse order, it runs in $O(n^2)$, since it has to make $n(n-1)/2$ comparisons and swaps. BubbleSort is a stable algorithm with $O(1)$ memory complexity.
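Stability means that elements with equal keys keep their original relative order. The following is a minimal sketch of that property, assuming records stored as `(key, label)` pairs; the helper name `bubble_sort_by_key` and the sample records are hypothetical and only serve the illustration. Because the comparison is a strict `>`, two equal keys are never swapped.

```python
def bubble_sort_by_key(pairs):
    """Sort (key, label) pairs by key only, using bubble sort on a copy."""
    pairs = list(pairs)
    n = len(pairs)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            # Strict comparison on the key: equal keys are left in place
            if pairs[j][0] > pairs[j + 1][0]:
                pairs[j], pairs[j + 1] = pairs[j + 1], pairs[j]
                swapped = True
        if not swapped:
            break
    return pairs


# Two records share the key 22; "a" comes before "c" in the input
records = [(22, "a"), (6, "b"), (22, "c"), (15, "d")]
result = bubble_sort_by_key(records)
print(result)  # [(6, 'b'), (15, 'd'), (22, 'a'), (22, 'c')]
```

Replacing `>` with `>=` would still sort the keys, but it would swap equal keys and lose stability.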

## Generating a random sequence

In [1]:
import numpy as np

In [2]:
seed = 17
np.random.seed(seed)
n = 10
high = 100
low = 0
randseq = np.random.randint(low, high, n).tolist()
print(randseq)

[15, 6, 22, 57, 45, 22, 31, 68, 39, 84]


# Sorting one element per pass, starting from the back

BubbleSort works by walking through the list from the front, comparing each element with its neighbour, and swapping the two only if the current element is larger than the next one.

## Step 1: Swapping only if the element is larger

In [3]:
# Length of array
array_len = len(randseq)

# Go from index 0 to n-2
for j in range(0, array_len-1):
    # If the current number is larger than the next number
    if randseq[j]>randseq[j+1]:
        # Swap the two
        temp = randseq[j]
        randseq[j] = randseq[j+1]
        randseq[j+1] = temp
print(randseq)

[6, 15, 22, 45, 22, 31, 57, 39, 68, 84]


## Step 2: Performing the algorithm for all elements

In BubbleSort, the array becomes sorted from the back: after the first iteration, element n-1 is in its final position; after the second iteration, elements n-1 and n-2 are, and so on. The algorithm stops when a full pass makes no swaps.

In [27]:
# Generating the same random sequence again
np.random.seed(seed)
n = 10
high = 100
low = 0
randseq = np.random.randint(low, high, n).tolist()
print(randseq)

[15, 6, 22, 57, 45, 22, 31, 68, 39, 84]


In [28]:
# Length of array
array_len = len(randseq)

# One pass for each element in the array
for i in range(0, array_len):
    # Initialising flag for whether any swaps occurred
    swaps = False
    # After i passes the last i elements are already in place, so only the first n-i elements need to be compared
    for j in range(0, array_len-1-i):
        # If the current number is larger than the next number
        if randseq[j] > randseq[j+1]:
            # Swap the two
            temp = randseq[j]
            randseq[j] = randseq[j+1]
            randseq[j+1] = temp
            # Change the swap flag to True
            swaps = True
    # If no swaps occurred, stop the algorithm
    if not swaps:
        break

    # Print the array after each pass
    print(randseq)

[6, 15, 22, 45, 22, 31, 57, 39, 68, 84]
[6, 15, 22, 22, 31, 45, 39, 57, 68, 84]
[6, 15, 22, 22, 31, 39, 45, 57, 68, 84]
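The claim that the array becomes sorted from the back can be checked directly. The sketch below is an illustration, not part of the original notebook: the generator name `passes` is hypothetical, and the input list is hard-coded to the sequence printed above so that the block runs without NumPy. After pass `k`, the last `k` elements should already match the fully sorted list.

```python
def passes(seq):
    """Yield a snapshot of the list after every pass that made a swap."""
    seq = list(seq)
    n = len(seq)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                swapped = True
        if not swapped:
            break
        yield list(seq)


data = [15, 6, 22, 57, 45, 22, 31, 68, 39, 84]
target = sorted(data)

snapshots = list(passes(data))
for k, snapshot in enumerate(snapshots, start=1):
    # The last k elements are already in their final positions
    assert snapshot[-k:] == target[-k:]
    print(k, snapshot)
```

For this input the loop prints three snapshots, the same three lines as the cell output above; the fourth pass makes no swaps and ends the algorithm.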


## Combining everything into an algorithm

In [32]:
# Generating the same random sequence
np.random.seed(seed)
n = 10
high = 100
low = 0
randseq = np.random.randint(low, high, n).tolist()
print(randseq)

[15, 6, 22, 57, 45, 22, 31, 68, 39, 84]


In [31]:
def BubbleSort(seq):
    # Sorts seq in place and returns it
    array_len = len(seq)
    # One pass for each element in the array
    for i in range(0, array_len):
        # Initialising flag for whether any swaps occurred
        swaps = False
        # After i passes the last i elements are already in place, so only the first n-i elements need to be compared
        for j in range(0, array_len-1-i):
            # If the current number is larger than the next number
            if seq[j] > seq[j+1]:
                # Swap the two
                temp = seq[j]
                seq[j] = seq[j+1]
                seq[j+1] = temp
                # Change the swap flag to True
                swaps = True
        # If no swaps occurred, stop the algorithm
        if not swaps:
            break
    return seq

In [33]:
print(BubbleSort(randseq))

[6, 15, 22, 22, 31, 39, 45, 57, 68, 84]


# Timing the algorithm

In [36]:
def BubbleSortTester(n, high, low=0):
    randseq = np.random.randint(low, high+1, n).tolist()
    return BubbleSort(randseq)

In [37]:
print(BubbleSortTester(100, 100))

[1, 1, 2, 3, 4, 5, 6, 7, 8, 10, 14, 14, 15, 16, 18, 18, 18, 20, 21, 21, 21, 21, 22, 22, 23, 24, 24, 24, 25, 25, 26, 29, 32, 32, 34, 34, 35, 39, 39, 42, 43, 46, 47, 47, 47, 50, 51, 52, 53, 55, 55, 56, 56, 56, 57, 58, 58, 58, 59, 59, 60, 61, 65, 65, 66, 66, 66, 66, 66, 67, 70, 70, 72, 73, 74, 75, 76, 76, 76, 77, 78, 79, 79, 82, 83, 85, 85, 87, 87, 88, 88, 89, 89, 90, 90, 92, 92, 93, 96, 96]


## Standard Cases

In [38]:
%timeit BubbleSortTester(10,10)

5.11 μs ± 15.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


In [39]:
%timeit BubbleSortTester(100, 100)

193 μs ± 3.12 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [48]:
%timeit BubbleSortTester(10_000, 10_000)

2.14 s ± 15.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
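`%timeit` is an IPython magic, and `BubbleSortTester` also generates the random sequence inside the timed call. Outside a notebook, a similar measurement can be sketched with the standard `timeit` module. This is a minimal sketch under stated assumptions: `bubble_sort` is a stand-in with the same logic as `BubbleSort`, the input size of 1,000 is arbitrary, and `random` replaces NumPy only to keep the block free of third-party imports. The list is generated once, and each run sorts a fresh copy, because the sort works in place and a second run on the same list would hit the best case.

```python
import random
import timeit


def bubble_sort(seq):
    """In-place bubble sort with early exit; returns the same list."""
    n = len(seq)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                swapped = True
        if not swapped:
            break
    return seq


random.seed(17)
data = [random.randint(0, 1000) for _ in range(1000)]

runs = 5
# list(data) copies the input so that every run sorts unsorted data;
# the O(n) copy is included in the measurement
total = timeit.timeit(lambda: bubble_sort(list(data)), number=runs)
print(f"{total / runs:.4f} s per sort of {len(data)} elements")
```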


## Best case where all elements are sorted ($O(n)$)

In [41]:
# Sequence of sorted elements
def best_case(n):
    # Named sorted_seq to avoid shadowing the built-in sorted()
    sorted_seq = list(range(n))
    return BubbleSort(sorted_seq)

In [42]:
%timeit best_case(10_000)

343 μs ± 2.86 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


## Worst case where all elements are in reverse order ($O(n^2)$)

In [43]:
def worst_case(n):
    # Sequence of elements sorted in reverse
    rev_sorted = list(range(n-1, -1, -1))
    return BubbleSort(rev_sorted)

In [49]:
%timeit worst_case(10_000)

2.69 s ± 22.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


As expected, the worst case is much slower than the best case, which stops after a single pass because that pass makes no swaps. It is also noticeably slower than the random case (2.69 s versus 2.14 s for 10,000 elements), even though the random-case timing also includes generating the sequence.
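Wall-clock timings depend on the machine, so it can help to count operations instead. The sketch below uses a hypothetical helper `bubble_sort_counts`, which follows the same logic as `BubbleSort` but works on a copy and returns the number of comparisons and swaps. The size `n = 100` is an arbitrary choice for illustration.

```python
def bubble_sort_counts(seq):
    """Run bubble sort on a copy and return (comparisons, swaps)."""
    seq = list(seq)
    n = len(seq)
    comparisons = 0
    swaps = 0
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                swaps += 1
                swapped = True
        if not swapped:
            break
    return comparisons, swaps


n = 100
best = bubble_sort_counts(range(n))               # already sorted
worst = bubble_sort_counts(range(n - 1, -1, -1))  # reverse order
print(best)   # (99, 0)
print(worst)  # (4950, 4950)
```

The sorted input needs a single pass of $n-1$ comparisons, while the reversed input needs $n(n-1)/2$ comparisons and the same number of swaps.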

## Case where there are many duplicates

In [45]:
def duplicates(n):
    # Sequence of elements with many duplicates
    dup = [n] + [n // 2] * (n - 2) + [0]
    return BubbleSort(dup)

In [46]:
duplicates(10)

[0, 5, 5, 5, 5, 5, 5, 5, 5, 10]

In [47]:
%timeit duplicates(10_000)

1.18 s ± 15.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


This is very different from the best case. Although the largest element bubbles to the back in the first iteration, the 0 that starts in the last place has to swap with every element in front of it. It moves only one position per iteration, so it travels to the front very slowly. Since every iteration contains a swap, the algorithm cannot stop early and still needs about n passes, so having many duplicates does not by itself make BubbleSort fast.
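To separate the effect of the duplicates from the effect of the small element at the back, one can count how many passes actually perform a swap. This is a sketch for illustration only: `swapping_passes` and the three test inputs are hypothetical names, and `n = 10` is chosen to match the small example above.

```python
def swapping_passes(seq):
    """Return how many passes of bubble sort perform at least one swap."""
    seq = list(seq)
    n = len(seq)
    passes = 0
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                swapped = True
        if not swapped:
            break
        passes += 1
    return passes


n = 10
small_at_back = [n] + [n // 2] * (n - 2) + [0]  # same shape as duplicates(n)
large_at_front = [n] + [n // 2] * (n - 1)       # no small element at the back
all_equal = [n // 2] * n                        # duplicates only

print(swapping_passes(small_at_back))   # 9
print(swapping_passes(large_at_front))  # 1
print(swapping_passes(all_equal))       # 0
```

With the 0 at the back, $n-1$ passes contain a swap; without it, the same number of duplicates is handled in at most one swapping pass. This suggests that the slowdown comes from the small element that has to travel to the front, not from the duplicates themselves.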