<a href="https://colab.research.google.com/github/walkerjian/DailyCode/blob/main/Code_Craft_select_approx_median.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Create an algorithm to efficiently compute the approximate median of a list of numbers.

More precisely, given an unordered list of $N$ numbers, find an element whose rank is between $N / 4$ and $3N / 4$, with a high level of certainty, in less than $O(N)$ time.

To find an element whose rank lies between $N / 4$ and $3N / 4$, we can use a randomized selection algorithm based on Quickselect. Instead of searching for one exact rank, it stops as soon as a pivot lands anywhere in the middle half of the ranks, so it never has to sort the list fully. Its expected running time is $O(N)$. That is efficient, but note that it does not meet the strict sub-$O(N)$ bound in the problem statement, because partitioning reads every element.

Here is a high-level description of the algorithm:

1. **Randomly Select a Pivot:**
   Choose a pivot element randomly from the list. This randomness helps avoid worst-case scenarios often associated with Quickselect, making the algorithm efficient on average.

2. **Partition the List:**
   Reorder the list so that all elements less than or equal to the pivot come before it (the left partition) and all elements greater than the pivot come after it (the right partition). The pivot's final position then gives its rank in the whole list: a pivot that ends up at zero-based index $i$ has rank $i + 1$.

3. **Recursive Selection:**
   - If the rank of the pivot is between $ \frac{N}{4} $ and $ \frac{3N}{4} $, then the pivot is an acceptable approximate median, and we return it.
   - If the rank of the pivot is less than $ \frac{N}{4} $, recursively apply the algorithm to the right partition.
   - If the rank of the pivot is greater than $ \frac{3N}{4} $, recursively apply the algorithm to the left partition.

4. **Termination Condition:**
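   Since the pivot is chosen uniformly at random, each round lands in the middle half of the ranks with probability of about $1/2$ or better (assuming distinct values), so the expected number of rounds is about two rather than growing with $N$. Each round partitions at most $N$ elements, so the expected time complexity is $O(N)$. It cannot go below $O(N)$, because even a single partition pass reads every element; a sub-linear running time requires working on a random sample rather than on the full list.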

5. **Implementation Considerations:**
   - Use a good random number generator to avoid biases in pivot selection.
   - Handle edge cases where the list might have duplicates or is very small.
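
To make the duplicates case concrete: when values repeat, an element occupies a range of ranks rather than a single one. The sketch below is one way to check whether a candidate is an acceptable approximate median under that reading. The helper names `rank_range` and `is_approx_median` are illustrative and not part of the original notebook, and the check assumes `value` actually occurs in `nums`.

```python
def rank_range(nums, value):
    """Return the (lowest, highest) 1-based rank `value` occupies in sorted(nums).

    Assumes `value` occurs in `nums`; with duplicates the two ranks differ.
    """
    less = sum(1 for x in nums if x < value)
    less_or_equal = sum(1 for x in nums if x <= value)
    return less + 1, less_or_equal


def is_approx_median(nums, value):
    """True if some rank occupied by `value` lies in [N/4, 3N/4]."""
    n = len(nums)
    low, high = rank_range(nums, value)
    # The rank interval [low, high] must overlap the interval [n/4, 3n/4].
    return low <= 3 * n / 4 and high >= n / 4


nums = [5, 1, 3, 3, 3, 9, 7, 3]
print(rank_range(nums, 3))        # (2, 5)
print(is_approx_median(nums, 3))  # True
print(is_approx_median(nums, 9))  # False
```

This check scans the whole list, so it is meant for testing a result, not as part of the selection itself.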

The code below computes an approximate median whose rank lies between $N / 4$ and $3N / 4$. It partitions `nums` in place, so the caller's list is reordered as a side effect. The random choice of pivots keeps the expected running time linear.

In [1]:
import random

def select_approx_median(nums):
    def partition(start, end, pivot_index):
        nums[pivot_index], nums[end] = nums[end], nums[pivot_index]
        pivot = nums[end]
        i = start - 1
        for j in range(start, end):
            if nums[j] <= pivot:
                i += 1
                nums[i], nums[j] = nums[j], nums[i]
        nums[i + 1], nums[end] = nums[end], nums[i + 1]
        return i + 1

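    n = len(nums)
    if n == 0:
        raise ValueError("nums must not be empty")

    def select(start, end):
        if start == end:
            return nums[start]

        pivot_index = random.randint(start, end)
        pivot_index = partition(start, end, pivot_index)

        # Everything left of `start` is <= everything in nums[start:end + 1],
        # so the pivot's 1-based rank in the whole list is pivot_index + 1.
        # (Using pivot_index - start + 1 would give the rank within the
        # current subrange only, which is wrong after recursing right.)
        k = pivot_index + 1

        if n // 4 <= k <= 3 * n // 4:
            return nums[pivot_index]
        elif k < n // 4:
            # Pivot is too small: acceptable ranks lie to its right.
            return select(pivot_index + 1, end)
        else:
            # Pivot is too large: acceptable ranks lie to its left.
            return select(start, pivot_index - 1)

    return select(0, n - 1)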

# Example usage
nums = [random.randint(1, 100) for _ in range(100)]
approx_median = select_approx_median(nums)
print("Approximate Median:", approx_median)

Approximate Median: 25
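
The routine above reads every element at least once, so it runs in expected $O(N)$ time. For the sub-linear bound in the problem statement, a common alternative is to take the median of a small random sample. The following is a minimal sketch of that idea, not the notebook's method: the function name `sample_approx_median`, the default `sample_size=200`, and the `rng` parameter are all assumptions made for illustration.

```python
import random


def sample_approx_median(nums, sample_size=200, rng=random):
    """Return the median of `sample_size` elements drawn from `nums`.

    Sampling is done with replacement, so the cost depends on
    `sample_size` only, not on len(nums).
    """
    if not nums:
        raise ValueError("nums must not be empty")
    sample = [nums[rng.randrange(len(nums))] for _ in range(sample_size)]
    sample.sort()
    return sample[len(sample) // 2]


# Example usage on a shuffled list of distinct values
data = list(range(1, 10001))
random.shuffle(data)
print("Sampled approximate median:", sample_approx_median(data))
```

With $m$ samples, the sample median falls below rank $N / 4$ only if more than half of the samples come from the bottom quarter of the list. By Hoeffding's inequality that happens with probability at most $e^{-m/8}$, and the same bound applies to the top quarter, so the failure probability is at most $2e^{-m/8}$ (assuming distinct values). The running time is $O(m \log m)$, independent of $N$.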
