<a href="https://colab.research.google.com/github/bubuloMallone/Algorithms_1/blob/main/3_quick_sort.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# QuickSort Algorithm

QuickSort is one of the most widely used and efficient sorting algorithms. Tony Hoare developed it in 1959 and published it in 1961; it is based on the divide-and-conquer paradigm. Its elegance lies in its recursive structure and its fast average-case performance.


**The Algorithm**

Given an unsorted array of $n$ elements, QuickSort works as follows:

1. Divide: Select a pivot element from the array.
2. Partition, in $O(n)$ time: Rearrange the array so that
  - all elements smaller than the pivot are placed to its left,
  - all elements greater than the pivot are placed to its right.

Elements equal to the pivot may go on either side. The pivot is now in its final sorted position.

3. Conquer: Recursively apply QuickSort to the subarrays on the left and right of the pivot.
4. Combine: Nothing extra is needed; the array is sorted once all subarrays have been processed.
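
The four steps above can be sketched in a few lines. This is a minimal, not-in-place illustration that assumes the first element is used as the pivot; the name `quicksort_simple` and the sample list are made up for this example and are not part of the notebook's implementation.

```python
def quicksort_simple(values):
    """Return a new sorted list (illustrative, not in place)."""
    # Base case: lists of length 0 or 1 are already sorted.
    if len(values) <= 1:
        return list(values)
    # Divide: take the first element as the pivot (an arbitrary choice for this sketch).
    pivot = values[0]
    rest = values[1:]
    # Partition: one pass for each side; elements equal to the pivot go left.
    smaller = [x for x in rest if x <= pivot]
    larger = [x for x in rest if x > pivot]
    # Conquer and combine: concatenating the sorted parts needs no extra merging.
    return quicksort_simple(smaller) + [pivot] + quicksort_simple(larger)


example = [5, 2, 9, 1, 5, 6]  # arbitrary sample input
print(quicksort_simple(example))  # [1, 2, 5, 5, 6, 9]
```

Building new lists keeps the sketch short at the cost of $O(n)$ extra memory.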


**Randomized QuickSort**

The choice of the pivot is crucial for performance:
- If the pivot is always the smallest or largest element, the algorithm degenerates to $O\left(n^2\right)$ runtime.
- If the pivot is close to the median, the algorithm achieves $O(n \log n)$ runtime complexity.

To avoid the worst case on structured or adversarial input, one can randomize:
- Pick the pivot uniformly at random from the current subarray.
- With high probability, this yields a reasonably balanced partition.

Thus, Randomized QuickSort has an expected running time of $O(n \log n)$ on every input: the expectation is taken over the algorithm's own random choices, not over an assumed input distribution.
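
To make the effect of randomization concrete, the sketch below counts comparisons on an already sorted input, once with the last element as pivot and once with a random pivot. It is a self-contained illustration, not the notebook's code: the function `count_comparisons`, the explicit stack (used here only to avoid deep recursion), the input size and the seed are all assumptions of this example.

```python
import random


def count_comparisons(values, randomize, rng):
    """Sort a copy of `values` and return the number of element-vs-pivot comparisons."""
    arr = list(values)
    total = 0
    stack = [(0, len(arr) - 1)]  # explicit stack of (left, right) sub-arrays
    while stack:
        left, right = stack.pop()
        if left >= right:
            continue
        if randomize:
            # Move a uniformly random element to the end, where the pivot is expected.
            p = rng.randint(left, right)
            arr[p], arr[right] = arr[right], arr[p]
        pivot = arr[right]
        total += right - left  # every non-pivot element is compared with the pivot once
        s = left - 1
        for i in range(left, right):
            if arr[i] <= pivot:
                s += 1
                arr[s], arr[i] = arr[i], arr[s]
        arr[s + 1], arr[right] = arr[right], arr[s + 1]
        stack.append((left, s))       # elements left of the pivot
        stack.append((s + 2, right))  # elements right of the pivot
    assert arr == sorted(values)
    return total


n = 200
already_sorted = list(range(n))
rng = random.Random(0)  # fixed seed so the run is repeatable
fixed = count_comparisons(already_sorted, randomize=False, rng=rng)
randomized = count_comparisons(already_sorted, randomize=True, rng=rng)
print(fixed)       # 19900, i.e. n * (n - 1) // 2
print(randomized)  # depends on the seed; expected to be far smaller
```

On sorted input the last element is always the maximum, so each partition step removes only the pivot and the count reaches $n(n-1)/2$.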


**Complexity Analysis**

- Best case (**balanced partitions**): each recursive call splits the array roughly in half

$$
T(n)=2 T(n / 2)+O(n) \quad \Rightarrow \quad T(n)=O(n \log n) .
$$

- Worst case (highly **unbalanced partitions**): e.g. sorted array with naive pivot choice

$$
T(n)=T(n-1)+O(n) \quad \Rightarrow \quad T(n)=O\left(n^2\right) .
$$

- Average case (**random pivot**): expected number of comparisons is $O(n \log n)$.
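
As a worked version of the three cases, write the partition cost as $cn$ for some constant $c$ (a symbol introduced here for illustration) and take $T(0)=0$. Unrolling the worst-case recurrence gives

$$
T(n)=T(n-1)+c n=c \sum_{k=1}^{n} k=\frac{c\, n(n+1)}{2}=O\left(n^2\right) .
$$

In the balanced case the recursion tree has about $\log_2 n$ levels, and the subproblems on each level cost $cn$ in total, so

$$
T(n)=2 T(n / 2)+c n \quad \Rightarrow \quad T(n)=c n \log_2 n+O(n)=O(n \log n) .
$$

For a uniformly random pivot on $n$ distinct elements, the expected number of comparisons $C_n$ has the closed form

$$
\mathbb{E}\left[C_n\right]=2(n+1) H_n-4 n \approx 2 n \ln n \approx 1.39\, n \log_2 n, \qquad H_n=\sum_{k=1}^{n} \frac{1}{k} .
$$

For example, $n=2$ gives $2 \cdot 3 \cdot \frac{3}{2}-8=1$ comparison, as expected.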

Space complexity:

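- In-place version: $O(\log n)$ extra space on average for the recursion stack; the stack can grow to $O(n)$ in the worst case unless the smaller partition is always processed first.
- Simpler recursive version that builds new sublists: up to $O(n)$ extra space.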
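
One common way to keep the recursion stack logarithmic is to recurse only into the smaller partition and handle the larger one in a loop. The sketch below illustrates that idea under stated assumptions: `quicksort_bounded` and its `depth` bookkeeping are hypothetical names added for this example, and the depth is returned only so the bound can be checked.

```python
import math
import random


def partition(arr, left, right):
    """Partition arr[left..right] around arr[right]; return the pivot's final index."""
    pivot = arr[right]
    s = left - 1
    for i in range(left, right):
        if arr[i] <= pivot:
            s += 1
            arr[s], arr[i] = arr[i], arr[s]
    arr[s + 1], arr[right] = arr[right], arr[s + 1]
    return s + 1


def quicksort_bounded(arr, left=0, right=None, depth=1):
    """Sort arr in place and return the deepest recursion level reached."""
    if right is None:
        right = len(arr) - 1
    max_depth = depth
    while left < right:
        # Random pivot, moved to the end before partitioning.
        k = random.randint(left, right)
        arr[k], arr[right] = arr[right], arr[k]
        p = partition(arr, left, right)
        if p - left < right - p:
            # Left side is smaller: recurse into it, then loop on the right side.
            max_depth = max(max_depth, quicksort_bounded(arr, left, p - 1, depth + 1))
            left = p + 1
        else:
            # Right side is smaller (or equal): recurse into it, then loop on the left side.
            max_depth = max(max_depth, quicksort_bounded(arr, p + 1, right, depth + 1))
            right = p - 1
    return max_depth


data = list(range(1000, 0, -1))  # reverse-sorted sample input
depth = quicksort_bounded(data)
bound = int(math.log2(1000)) + 1
print(data == sorted(data), depth <= bound)  # True True
```

Each recursive call receives fewer than half of the elements of its caller, so the depth stays within $\lfloor\log_2 n\rfloor + 1$ regardless of how unbalanced the pivots are.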

In [2]:
# Download the input files, then load each one as a list of integers.

!wget https://raw.githubusercontent.com/bubuloMallone/Algorithms_1/refs/heads/main/datasets/2_quick_sort/dataset.txt
!wget https://raw.githubusercontent.com/bubuloMallone/Algorithms_1/refs/heads/main/datasets/2_quick_sort/testset1.txt
!wget https://raw.githubusercontent.com/bubuloMallone/Algorithms_1/refs/heads/main/datasets/2_quick_sort/testset2.txt


def load_integers(path):
  # read one integer per line, skipping empty lines; the file is closed automatically
  with open(path, 'r') as f:
    return [int(x) for x in f.read().split('\n') if x]

dataset = load_integers('dataset.txt')
testset1 = load_integers('testset1.txt')
testset2 = load_integers('testset2.txt')

testset1



[2148, 9058, 7742, 3153, 6324, 609, 7628, 5469, 7017, 504]

In [3]:
# Randomized In-Place QuickSort
import random

# Partition procedure: partition the sub-array into a <= pivot portion and a > pivot portion, then place the pivot in its correct position.

def partition(arr, left, right):
  # the pivot is assumed to be the last element of the sub-array
  pivot = arr[right]
  s = left - 1   # boundary pointer: arr[left..s] holds the elements <= pivot seen so far
  for i in range(left, right):
    if arr[i] <= pivot:
      s += 1
      arr[s], arr[i] = arr[i], arr[s]  # grow the <= pivot portion, then swap the element into it

  # after scanning the whole sub-array, place the pivot between the two portions
  arr[s+1], arr[right] = arr[right], arr[s+1]  # swap pivot

  # return the pivot position so the caller can split the array into the two partitions
  return s+1

# Pivot selection: choose a pivot index uniformly at random, swap it with the last element, then call partition routine.

def random_partition(arr, left, right):
  # choose a pivot index uniformly at random within the sub-array
  pivot_index = random.randint(left, right)
  # move the pivot to the end of the sub-array, where partition() expects it
  arr[right], arr[pivot_index] = arr[pivot_index], arr[right]  # swap pivot with last element

  # call array partitioning routine
  return partition(arr, left, right)


# QuickSort recursive routine: call QuickSort recursively on the partitions of the array, handle base case.

def quicksort(array, left = 0, right = None):
  # first call: default to the whole array
  if right is None:
    right = len(array) - 1

  # base case: a sub-array of size 0 or 1 is already sorted, so nothing is done

  # recursive case
  if left < right:
    pivot_index = random_partition(array, left, right)  # place the pivot in its final position; its index defines the two partitions
    quicksort(array, left, pivot_index-1)  # recursively sort the left partition
    quicksort(array, pivot_index+1, right)  # recursively sort the right partition

  return array

**Programming Problem: QuickSort**

**Test case 1**: This file contains 10 integers, representing a 10-element array. Your program should count 25 comparisons if you always use the first element as the pivot, 31 comparisons if you always use the last element as the pivot, and 21 comparisons if you always use the median-of-3 as the pivot (not counting the comparisons used to compute the pivot).

**Test case 2**: This file contains 100 integers, representing a 100-element array. Your program should count 620 comparisons if you always use the first element as the pivot, 573 comparisons if you always use the last element as the pivot, and 502 comparisons if you always use the median-of-3 as the pivot (not counting the comparisons used to compute the pivot).

**Data set**: This file contains all of the integers between 1 and 10,000 (inclusive) in some order, with no integer repeated. The ith row of the file indicates the ith entry of an array. How many comparisons does QuickSort make on this input when the first element is always chosen as the pivot? If the last element is always chosen as the pivot? If the median-of-3 is always chosen as the pivot?


The pivot-selection strategy changes the comparison count. Four strategies are compared:

1. First element as pivot;
2. Last element as pivot;
3. Median-of-3 as pivot: median among first, middle, and last elements of the subarray;
4. Random pivot: choose as pivot a random element in the sub-array.
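
The median-of-3 rule is the least obvious of these strategies, so here is a small standalone sketch of it. The helper name `median_of_three_index` and the two sample arrays are assumptions of this example; for an even-length subarray the "middle" is taken to be the last element of the first half, which is what `left + (right - left) // 2` yields.

```python
def median_of_three_index(arr, left, right):
    """Return the index of the median among arr[left], arr[mid] and arr[right]."""
    mid = left + (right - left) // 2
    # Sort the three (value, index) pairs by value and keep the middle one.
    candidates = sorted([(arr[left], left), (arr[mid], mid), (arr[right], right)])
    return candidates[1][1]


# Candidates are 8 (first), 4 (middle), 1 (last); the median is 4 at index 2.
print(median_of_three_index([8, 2, 4, 5, 7, 1], 0, 5))  # 2
# Even length: candidates are 4, 5, 7; the median is 5 at index 1.
print(median_of_three_index([4, 5, 6, 7], 0, 3))  # 1
```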

In [4]:
import random  # needed by the "random" pivot strategy below

# Partition procedure: partition the sub-array into a <= pivot portion and a > pivot portion, then place the pivot in its correct position.

def partition(arr, left, right):
  # the pivot is assumed to be the last element of the sub-array
  pivot = arr[right]
  comparisons = right - left  # each non-pivot element is compared with the pivot exactly once
  s = left - 1   # boundary pointer: arr[left..s] holds the elements <= pivot seen so far
  for i in range(left, right):
    if arr[i] <= pivot:
      s += 1
      arr[s], arr[i] = arr[i], arr[s]  # grow the <= pivot portion, then swap the element into it

  # after scanning the whole sub-array, place the pivot between the two portions
  arr[s+1], arr[right] = arr[right], arr[s+1]  # swap pivot

  # return the pivot position (to split the array) together with the comparison count
  return s+1, comparisons

# Pivot selection: choose a pivot index according to a strategy (first element, last element, median of first/middle/last, or random), swap it with the last element, then call the partition routine.

def choose_pivot(arr, left, right, strategy="random"):
  # choose a pivot index in the sub-array according to the strategy
  if strategy == "first":
    pivot_index = left
  elif strategy == "last":
    pivot_index = right
  elif strategy == "median3":
    mid = left + (right - left) // 2
    candidates = [(arr[left], left), (arr[mid], mid), (arr[right], right)]
    candidates.sort(key=lambda x: x[0])
    pivot_index = candidates[1][1]  # index of the median value
  elif strategy == "random":
    pivot_index = random.randint(left, right)
  else:
    raise ValueError("Unknown pivot strategy: choose 'first', 'last', 'median3', or 'random'")

  # move the pivot to the end of the sub-array, where partition() expects it
  arr[right], arr[pivot_index] = arr[pivot_index], arr[right]
  # partition and return (pivot position, number of comparisons)
  return partition(arr, left, right)


# QuickSort recursive routine: call QuickSort recursively on the partitions of the array, handle base case.

def quicksort(array, left = 0, right = None, strategy = 'random'):
  # first call: default to the whole array
  if right is None:
    right = len(array) - 1
  tot_comparisons = 0

  # base case: a sub-array of size 0 or 1 is already sorted and costs no comparisons

  # recursive case
  if left < right:
    pivot_index, comparisons = choose_pivot(array, left, right, strategy)  # place the pivot in its final position; its index defines the two partitions
    tot_comparisons += comparisons
    tot_comparisons += quicksort(array, left, pivot_index-1, strategy)  # recursively sort the left partition and add its comparisons
    tot_comparisons += quicksort(array, pivot_index+1, right, strategy)  # recursively sort the right partition and add its comparisons

  return tot_comparisons

In [19]:
print("First pivot comparisons:   ", quicksort(testset1.copy(), strategy="first"))
print("Last pivot comparisons:    ", quicksort(testset1.copy(), strategy="last"))
print("Median-of-3 comparisons:   ", quicksort(testset1.copy(), strategy="median3"))
print("Random pivot comparisons:  ", quicksort(testset1.copy(), strategy="random"))

First pivot comparisons:    25
Last pivot comparisons:     29
Median-of-3 comparisons:    21
Random pivot comparisons:   24


In [23]:
print("First pivot comparisons:   ", quicksort(testset2.copy(), strategy="first"))
print("Last pivot comparisons:    ", quicksort(testset2.copy(), strategy="last"))
print("Median-of-3 comparisons:   ", quicksort(testset2.copy(), strategy="median3"))
print("Random pivot comparisons:  ", quicksort(testset2.copy(), strategy="random"))

First pivot comparisons:    718
Last pivot comparisons:     592
Median-of-3 comparisons:    500
Random pivot comparisons:   633
