# Algorithms Lab: Searching & Sorting (College-Level)

Welcome! This notebook is a guided, hands-on lab to help you **understand and implement fundamental algorithms**.

You'll learn by doing:
- Big-O reasoning via small experiments
- Linear search vs. binary search
- Classic sorts (selection, insertion, merge, quick)
- Stability & in-place properties
- Comparator-based sorting & key functions
- Practice problems (LeetCode-style) with tests

**Workflow suggestion** (for GitHub Classroom / repo):
1. Create a feature branch named `algorithms-lab`.
2. Work through each section, filling in `TODO` blocks.
3. Run the local tests (cells named **Run tests**). All tests should pass.
4. Commit your changes and open a Pull Request.

> Pro tip: Write small helper functions, keep code clean, add docstrings, and handle edge cases.


## 0) Setup & Utilities
Run this once. It defines helpers and a small testing harness.

In [1]:
from __future__ import annotations
import math, random, time
from dataclasses import dataclass
from typing import List, Callable, Any, Optional, Iterable, Tuple

def assert_equal(actual, expected, msg: str = ""):
    if actual != expected:
        raise AssertionError(f"Expected {expected!r} but got {actual!r}. {msg}")

def assert_true(expr: bool, msg: str = ""):
    if not expr:
        raise AssertionError(f"Assertion failed. {msg}")

def time_function(fn: Callable, *args, repeats: int = 5, **kwargs) -> float:
    """Return median runtime (seconds) over several repeats."""
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args, **kwargs)
        times.append(time.perf_counter() - t0)
    times.sort()
    return times[len(times)//2]


## 1) Warm-up: Big-O by Experiment

You'll write two functions that do the same job but with different time complexities.
Implement them, then **measure** how their runtime scales with input size.

**Task:**
- Implement `sum_pairs_quadratic` in $O(n^2)$.
- Implement `sum_pairs_linear` in $O(n)$ by aggregating counts.
- Verify scaling with the `benchmark_sum_pairs` cell.
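
The "aggregating counts" idea rests on a small counting identity: a value that occurs $c$ times contributes $c(c-1)/2$ equal pairs. Below is a minimal sketch of that identity using `collections.Counter`. The helper name `count_equal_pairs` is hypothetical and is not one of the lab's required functions.

```python
from collections import Counter

def count_equal_pairs(nums):
    """Count pairs (i, j), i < j, with nums[i] == nums[j] via per-value counts."""
    counts = Counter(nums)  # value -> number of occurrences
    # A value seen c times forms c * (c - 1) / 2 unordered pairs.
    return sum(c * (c - 1) // 2 for c in counts.values())

print(count_equal_pairs([1, 1, 2]))  # 1
print(count_equal_pairs([1, 1, 1]))  # 3
```

This makes one pass to build the counts and one pass over the distinct values, so it can serve as a cross-check for a running-count implementation.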


In [2]:
def sum_pairs_quadratic(nums: list[int]) -> int:
    """Return the number of (i, j) equal-value pairs with i<j. O(n^2)."""
    # Brute force: compare every element with each element before it.
    matching_pairs = 0
    for i in range(1, len(nums)):
        for j in range(i):
            if nums[i] == nums[j]:
                matching_pairs += 1
    return matching_pairs

def sum_pairs_linear(nums: list[int]) -> int:
    """Return number of equal pairs in O(n) using a hashmap."""
    # seen_counts[v] = how many times v has appeared so far.
    seen_counts = {}
    matching_pairs = 0
    for num in nums:
        if num in seen_counts:
            # num pairs with every earlier occurrence of the same value.
            matching_pairs += seen_counts[num]
            seen_counts[num] += 1
        else:
            seen_counts[num] = 1
    return matching_pairs


In [3]:
# Run tests (Warm-up)
assert_equal(sum_pairs_quadratic([1,1,2]), 1)
assert_equal(sum_pairs_quadratic([1,1,1]), 3)  # pairs: (0,1),(0,2),(1,2)
assert_equal(sum_pairs_linear([1,1,2]), 1)
assert_equal(sum_pairs_linear([1,1,1]), 3)
print("[ok] Warm-up tests passed.")


[ok] Warm-up tests passed.


In [4]:
# Benchmark (feel free to tweak sizes)
for n in [1_000, 2_000, 4_000, 8_000]:
    data = [random.randint(0, 1000) for _ in range(n)]
    t_quad = time_function(sum_pairs_quadratic, data, repeats=1)  # may get slow
    t_lin = time_function(sum_pairs_linear, data)
    print(f"n={n:6d} | quadratic ~ {t_quad:7.4f}s | linear ~ {t_lin:7.4f}s")


n=  1000 | quadratic ~  0.0206s | linear ~  0.0001s
n=  2000 | quadratic ~  0.0695s | linear ~  0.0002s
n=  4000 | quadratic ~  0.2940s | linear ~  0.0004s
n=  8000 | quadratic ~  1.1879s | linear ~  0.0008s
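
One way to read the timings above is to estimate the exponent $k$ in $t \approx c \cdot n^k$ from consecutive measurements: doubling $n$ should multiply the time by about $2^k$. The sketch below does that; `growth_exponents` is a hypothetical helper, and the timings are copied from the output above, so they will differ on another machine.

```python
import math

def growth_exponents(sizes, times):
    """Estimate k in time ~ n**k from each pair of consecutive measurements."""
    points = list(zip(sizes, times))
    exponents = []
    for (n1, t1), (n2, t2) in zip(points, points[1:]):
        # t2 / t1 = (n2 / n1) ** k, so k = log(t2 / t1) / log(n2 / n1)
        exponents.append(math.log(t2 / t1) / math.log(n2 / n1))
    return exponents

sizes = [1_000, 2_000, 4_000, 8_000]
quadratic_times = [0.0206, 0.0695, 0.2940, 1.1879]  # from the benchmark output
print([round(k, 2) for k in growth_exponents(sizes, quadratic_times)])
```

For the quadratic version the estimates should come out close to 2. Small inputs are noisy, so the first estimate is usually the least reliable.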


## 2) Searching

### 2.1 Linear Search
Implement a simple linear scan. Return the index or `-1` if not found.

### 2.2 Binary Search
Assume the input list is **sorted ascending**. Implement both **iterative** and **recursive** versions. Return the index or `-1`.
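
For cross-checking a hand-written binary search, the standard library's `bisect` module can serve as a reference. The sketch below is only a reference, not the lab solution, and `binary_search_bisect` is a hypothetical name.

```python
from bisect import bisect_left

def binary_search_bisect(arr, target):
    """Return an index of target in sorted arr, or -1 if absent."""
    i = bisect_left(arr, target)  # first position where arr[i] >= target
    if i < len(arr) and arr[i] == target:
        return i
    return -1

print(binary_search_bisect([1, 3, 5, 7, 9, 11], 7))  # 3
print(binary_search_bisect([1, 3, 5, 7, 9, 11], 4))  # -1
```

Note that with duplicates this returns the leftmost match, while a hand-written version may return any matching index.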


In [5]:
def linear_search(arr: list[int], target: int) -> int:
    # Scan left to right; an empty list simply skips the loop.
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1

def binary_search_iter(arr: list[int], target: int) -> int:
    left = 0
    right = len(arr) - 1  # an empty list gives right = -1, so the loop is skipped

    while left <= right:
        # `//` is floor division: it divides and rounds down, yielding an int
        # without going through a float (unlike int(x / 2)).
        midpoint = left + (right - left) // 2
        if target == arr[midpoint]:
            return midpoint
        elif target < arr[midpoint]:
            right = midpoint - 1
        else:
            left = midpoint + 1

    return -1

def binary_search_rec(arr: list[int], target: int, lo: int = 0, hi: int | None = None) -> int:
    # `hi` defaults to None because a default value cannot depend on `arr`;
    # the first (outermost) call fills in the real upper bound here.
    if hi is None:
        hi = len(arr) - 1

    # Base case: empty search range (this also covers an empty list).
    if lo > hi:
        return -1

    midpoint = lo + (hi - lo) // 2

    # Each recursive result must be returned, and the bounds are passed down
    # as lo/hi rather than being reset on every call.
    if target == arr[midpoint]:
        return midpoint
    elif target < arr[midpoint]:
        return binary_search_rec(arr, target, lo, midpoint - 1)
    else:
        return binary_search_rec(arr, target, midpoint + 1, hi)


In [6]:
# Run tests (Searching)
arr = [1,3,5,7,9,11]
assert_equal(linear_search(arr, 7), 3)
assert_equal(linear_search(arr, 2), -1)
assert_equal(binary_search_iter(arr, 1), 0)
assert_equal(binary_search_iter(arr, 11), 5)
assert_equal(binary_search_iter(arr, 4), -1)
assert_equal(binary_search_rec(arr, 1), 0)
assert_equal(binary_search_rec(arr, 11), 5)
assert_equal(binary_search_rec(arr, 4), -1)
print("[ok] Search tests passed.")


[ok] Search tests passed.


## 3) Sorting

You'll implement four classic algorithms. For each, preserve the function signature. Don't use Python's built-ins for the core logic.

**Tasks**
- `selection_sort` (in-place, not stable)
- `insertion_sort` (in-place, stable)
- `merge_sort` (out-of-place, stable)
- `quick_sort` (in-place partition, average O(n log n))

> **Stability reminder:** Stable sorts keep the original relative order of items that compare equal.
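
To make the reminder concrete, the sketch below sorts the same records by their first field twice: once with Python's built-in `sorted` (which is guaranteed stable) and once with a swap-based selection sort. The helper `selection_sort_by_key` is hypothetical and exists only for this demonstration.

```python
def selection_sort_by_key(items, key):
    """Swap-based selection sort on a copy; used here only to show instability."""
    out = list(items)
    for i in range(len(out)):
        m = i
        for j in range(i + 1, len(out)):
            if key(out[j]) < key(out[m]):
                m = j
        # The long-distance swap can jump an item over an equal one.
        out[i], out[m] = out[m], out[i]
    return out

items = [(2, "a"), (2, "b"), (1, "c")]
stable = sorted(items, key=lambda p: p[0])
unstable = selection_sort_by_key(items, key=lambda p: p[0])
print(stable)    # [(1, 'c'), (2, 'a'), (2, 'b')]
print(unstable)  # [(1, 'c'), (2, 'b'), (2, 'a')]
```

Both results are correctly ordered by key; only the relative order of the two items with key 2 differs.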


In [34]:
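# Bonus: bubble sort (not one of the four required sorts).
def bubble_sort(arr: list[int]) -> list[int]:
    """In-place, stable bubble sort."""
    for j in range(len(arr)):
        # After pass j, the last j elements are already in their final positions.
        for i in range(len(arr) - 1 - j):
            if arr[i] > arr[i+1]:
                arr[i], arr[i+1] = arr[i+1], arr[i]
    return arr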

def selection_sort(arr: list[int]) -> list[int]:
    """In-place or copy is fine. Not stable."""
    for i in range(len(arr)):
        min_index = i
        for j in range(i+1, len(arr)):
            # Compare each element to the current minimum, not to arr[i].
            if arr[j] < arr[min_index]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]

    return arr

def insertion_sort(arr: list[int]) -> list[int]:
    """Stable, in-place insertion sort."""
    for i in range(1, len(arr)):
        temp = arr[i]
        j = i - 1
        # Compare against temp, not arr[i]: the first shift overwrites arr[i].
        # Strict > leaves equal elements in their original order (stability).
        while j >= 0 and arr[j] > temp:
            arr[j+1] = arr[j]
            j -= 1
        arr[j+1] = temp

    return arr

def merge_sort(arr: list[int]) -> list[int]:
    """Stable, out-of-place merge sort."""
    # Base case: a list of length 0 or 1 is already sorted. Return a copy so
    # the caller never receives the input list itself.
    if len(arr) <= 1:
        return arr[:]

    # Split in half; floor division keeps the midpoint an integer.
    # No loop is needed: the recursion stops at the base case above.
    midpoint = len(arr) // 2
    left = merge_sort(arr[:midpoint])
    right = merge_sort(arr[midpoint:])

    # Merge with two indices; a while loop fits because only one index
    # advances per step.
    i = j = 0
    merged = []
    while i < len(left) and j < len(right):
        # On ties take from the left half first; this keeps the sort stable.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One half is exhausted; slicing past the end just yields an empty list.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

def quick_sort(arr: list[int]) -> list[int]:
    """In-place partition, average O(n log n)."""
    # TODO: implement quick sort (choose a pivot, partition, recurse)
    raise NotImplementedError("TODO: implement quick_sort")


In [35]:
# Run tests (Sorting)
samples = [
    [], [1], [2,1], [3,2,1], [5,1,4,2,8],
]
for s in samples:
    exp = sorted(s)
    # Pass a fresh copy to each sort: the in-place sorts mutate their argument,
    # so sharing `s` would hand already-sorted input to the later sorts.
    assert_equal(bubble_sort(list(s)), exp, "bubble_sort failed")
    assert_equal(selection_sort(list(s)), exp, "selection_sort failed")
    assert_equal(insertion_sort(list(s)), exp, "insertion_sort failed")
    assert_equal(merge_sort(list(s)), exp, "merge_sort failed")
    #assert_equal(quick_sort(list(s)), exp, "quick_sort failed")
print("[ok] Sorting tests passed.")


[ok] Sorting tests passed.


### 3.1 Sorting with Custom Keys (Comparators)
Python's `sorted` accepts a `key` function. Implement a small stable sort that accepts a `key`.

**Task:** Implement `stable_sort_by_key` using **insertion sort** (to guarantee stability) and the provided `key`.
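
As background for the task, the sketch below contrasts the two ways Python's built-in `sorted` can be customized: a `key` function, and an old-style comparator adapted with `functools.cmp_to_key`. The word list and the comparator name `compare_length_then_alpha` are illustrative assumptions.

```python
from functools import cmp_to_key

words = ["pear", "fig", "banana", "kiwi"]

# key function: each item is mapped to a sort key once.
by_length = sorted(words, key=len)

def compare_length_then_alpha(a, b):
    """Comparator: negative if a sorts first, positive if b does, 0 on a tie."""
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)

by_cmp = sorted(words, key=cmp_to_key(compare_length_then_alpha))
print(by_length)  # ['fig', 'pear', 'kiwi', 'banana']
print(by_cmp)     # ['fig', 'kiwi', 'pear', 'banana']
```

With `key=len`, "pear" and "kiwi" tie and keep their input order because `sorted` is stable; the comparator breaks that tie alphabetically instead.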


In [9]:
def stable_sort_by_key(arr: list[object], key):
    """Implement a stable sort using insertion sort and a key function."""
    # TODO: use insertion sort comparing key(x)
    raise NotImplementedError("TODO: implement stable_sort_by_key")


In [10]:
# Run tests (Custom key & stability)
people = [("alice", 3), ("bob", 3), ("carl", 2), ("dana", 3)]
res = stable_sort_by_key(people, key=lambda x: x[1])
assert_equal(res, [("carl", 2), ("alice", 3), ("bob", 3), ("dana", 3)])
# stability check: among ties (3), original order alice -> bob -> dana preserved
print("[ok] Stability test passed.")


[ok] Stability test passed.


## 4) Practice Problems (LeetCode-style)
Solve each with the **best possible time complexity**.

1. **Two Sum (classic)** – Return indices of two numbers adding up to target. (Hash map)
2. **Find First Bad Version** – Given `is_bad(version)` (monotonic), find first bad (Binary search).
3. **Search Rotated Sorted Array** – Return index of target in rotated sorted array (Binary search variant).
4. **Kth Largest Element** – Return the k-th largest using Quickselect (average O(n)).
5. **Merge Intervals** – Given intervals, merge overlaps (Sort + linear sweep).
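
Several of these problems reduce to binary search over a monotonic yes/no condition rather than over list values. The sketch below shows that general pattern under the assumption that the predicate is False for a prefix of the range and True afterwards. The name `first_true` is hypothetical.

```python
def first_true(lo, hi, pred):
    """Smallest x in [lo, hi] with pred(x) True, or hi + 1 if there is none.

    Assumes pred is monotonic: False ... False True ... True.
    """
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if pred(mid):
            hi = mid - 1   # mid may be the answer; keep looking to the left
        else:
            lo = mid + 1   # the answer, if any, is strictly to the right
    return lo

print(first_true(1, 20, lambda v: v >= 7))  # 7
```

The loop calls `pred` about $\log_2(hi - lo + 1)$ times, which matters when each check is expensive.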


In [11]:
def two_sum(nums: list[int], target: int) -> list[int]:
    # TODO: return [i, j] with i<j using a hashmap
    raise NotImplementedError("TODO: implement two_sum")

def first_bad_version(n: int, is_bad) -> int:
    """Return the smallest v in [1..n] with is_bad(v)=True using binary search."""
    # TODO: implement binary search on versions
    raise NotImplementedError("TODO: implement first_bad_version")

def search_rotated(nums: list[int], target: int) -> int:
    """Binary search on rotated sorted array."""
    # TODO: implement modified binary search
    raise NotImplementedError("TODO: implement search_rotated")

def kth_largest(nums: list[int], k: int) -> int:
    """Quickselect expected O(n)."""
    # TODO: implement quickselect (k-th largest)
    raise NotImplementedError("TODO: implement kth_largest")

def merge_intervals(intervals: list[list[int]]) -> list[list[int]]:
    """Sort by start, then merge overlaps."""
    # TODO: implement merge intervals
    raise NotImplementedError("TODO: implement merge_intervals")


In [12]:
# Run tests (Practice)
assert_equal(two_sum([2,7,11,15], 9), [0,1])

bad_after = 7
def is_bad(v): return v >= bad_after
assert_equal(first_bad_version(20, is_bad), 7)

assert_equal(search_rotated([4,5,6,7,0,1,2], 0), 4)
assert_equal(search_rotated([4,5,6,7,0,1,2], 3), -1)

assert_equal(kth_largest([3,2,1,5,6,4], 2), 5)

assert_equal(merge_intervals([[1,3],[2,6],[8,10],[15,18]]), [[1,6],[8,10],[15,18]])
print("[ok] Practice problem tests passed.")


[ok] Practice problem tests passed.


## 5) Stability & In-Place Discussion
Answer briefly in Markdown:
1. Which of the four sorts you wrote are stable? Which are in-place?
2. Give an example where stability affects correctness.
3. Why does quicksort have bad worst-case behavior, and how can it be mitigated in practice?
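
As a starting point for question 3, here is a sketch of one common mitigation: choosing the pivot at random, so that no fixed input (such as an already sorted list) reliably triggers the quadratic case. It uses a Lomuto-style partition and is one possible approach, not the required lab solution. The name `quick_sort_random_pivot` is hypothetical.

```python
import random

def quick_sort_random_pivot(arr):
    """In-place quicksort with a uniformly random pivot (Lomuto partition)."""
    def sort_range(lo, hi):
        while lo < hi:
            # Move a randomly chosen pivot to the end of the range.
            p = random.randint(lo, hi)
            arr[p], arr[hi] = arr[hi], arr[p]
            pivot = arr[hi]
            store = lo
            for i in range(lo, hi):
                if arr[i] < pivot:
                    arr[i], arr[store] = arr[store], arr[i]
                    store += 1
            arr[store], arr[hi] = arr[hi], arr[store]
            # Recurse into the smaller side and loop on the larger one,
            # which keeps the recursion depth at O(log n).
            if store - lo < hi - store:
                sort_range(lo, store - 1)
                lo = store + 1
            else:
                sort_range(store + 1, hi)
                hi = store - 1
    sort_range(0, len(arr) - 1)
    return arr

print(quick_sort_random_pivot([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```

Random pivots do not help with inputs made mostly of equal keys: this two-way partition still degrades to quadratic time there, which is one reason three-way partitioning is often used as well.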


_Your notes here..._


## 6) Optional Extensions (Bonus)
- **Counting Sort / Radix Sort** for integers with bounded ranges.
- **Binary Search Variants**: lower_bound / upper_bound positions.
- **Order Statistics**: deterministic linear-time selection (median of medians).
- **Stability Experiment**: craft inputs to visually demonstrate stability vs. instability.
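
For the first extension, a minimal counting sort sketch is shown below. It assumes the inputs are non-negative integers no larger than a known `max_value`. Both that parameter and the function name `counting_sort` are illustrative choices, not part of the lab's provided stubs.

```python
def counting_sort(nums, max_value):
    """Sort integers in [0, max_value] in O(n + max_value) time; returns a new list."""
    counts = [0] * (max_value + 1)
    for x in nums:
        if not 0 <= x <= max_value:
            raise ValueError(f"{x} is outside [0, {max_value}]")
        counts[x] += 1
    result = []
    for value, c in enumerate(counts):
        result.extend([value] * c)  # emit each value as many times as it occurred
    return result

print(counting_sort([3, 0, 2, 3, 1], max_value=3))  # [0, 1, 2, 3, 3]
```

This version rebuilds bare integers, so stability is not observable; sorting records by an integer key would need the prefix-sum variant that places items by position.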


In [13]:
def lower_bound(arr: list[int], target: int) -> int:
    """First index i such that arr[i] >= target."""
    # TODO: implement binary search variant (lower_bound)
    raise NotImplementedError("TODO: implement lower_bound")

def upper_bound(arr: list[int], target: int) -> int:
    """First index i such that arr[i] > target."""
    # TODO: implement binary search variant (upper_bound)
    raise NotImplementedError("TODO: implement upper_bound")


lower_bound for 2: 1
upper_bound for 2: 4


---
### Submission Checklist
- [ ] All `Run tests` cells pass
- [ ] You answered the short discussion questions in Section 5
- [ ] Commit & Push your changes
- [ ] Open a PR titled `Algorithms Lab: Searching & Sorting`

Good luck! 🚀
