## **Binary Search**
A search algorithm that, given a sorted list and a target value,
returns the index of the target (or `None` if not found).

#### **Use Case**
Binary search is used when you need to efficiently find an element in a **sorted** list or array.
It’s ideal for large datasets where a linear scan would be too slow, as it reduces the search space by half each step.

#### **Algorithm Steps**
1. Define the search range (`left` to `right`).
2. Compute the midpoint of the range.
3. Compare the midpoint value with the target.
    - If target == midpoint value → return the index.
    - If target < midpoint value  → search the left half.
    - If target > midpoint value  → search the right half.
4. Repeat steps 2-3 until the target is found or the range is empty (the target is not in the list).
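
As a worked illustration of the steps above, the sketch below records the `(left, right, mid)` triple visited at each step while searching a small sorted list. The function name `trace_binary_search` and the sample list are assumptions made for illustration only.

```python
def trace_binary_search(ls: list, target: int) -> list:
    """Return the (left, right, mid) triples visited while searching for target."""
    steps = []
    left, right = 0, len(ls) - 1          # Step 1: define the search range
    while left <= right:                  # Step 4: repeat while the range is non-empty
        mid = (left + right) // 2         # Step 2: compute the midpoint
        steps.append((left, right, mid))
        if ls[mid] == target:             # Step 3: compare midpoint value with target
            break
        if target < ls[mid]:
            right = mid - 1               # Continue in the left half
        else:
            left = mid + 1                # Continue in the right half
    return steps


nums = [1, 3, 5, 7, 9, 11, 13]
for left, right, mid in trace_binary_search(nums, 11):
    print(f"left={left} right={right} mid={mid} value={nums[mid]}")
```

Searching for 11 visits index 3 (value 7) and then index 5 (value 11), so the range shrinks from seven elements to three before the target is found.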

#### **Complexity**
| Type | Time | Space |
|------|------|--------|
| Iterative | O(log n) | O(1) |
| Recursive | O(log n) | O(log n) |

##### **Explanation**
**Time Complexity:** For a list of size n, every iteration cuts the search range in half. 
The algorithm therefore needs at most ⌊log_2(n)⌋ + 1 iterations.
Each iteration performs a constant number of operations c, for a total of about c * log n operations, which is O(log n) in big-O notation.
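
To make the bound concrete, the sketch below counts loop iterations for a worst-case search (a target larger than every element, so the search always moves right) and prints them next to ⌊log_2(n)⌋ + 1. The helper name `count_iterations` is an assumption made for illustration.

```python
from math import floor, log2


def count_iterations(n: int) -> int:
    """Count loop iterations when searching range(n) for a value larger than all elements."""
    ls = list(range(n))
    target = n                      # Not in the list, and larger than every element
    left, right = 0, n - 1
    iterations = 0
    while left <= right:
        iterations += 1
        mid = (left + right) // 2
        if ls[mid] == target:
            break
        if target < ls[mid]:
            right = mid - 1
        else:
            left = mid + 1
    return iterations


for n in [1, 10, 100, 1000, 100000]:
    print(n, count_iterations(n), floor(log2(n)) + 1)
```

For each of these sizes the two printed counts should agree, since the range size goes n, ⌊n/2⌋, ⌊n/4⌋, ... until it reaches zero.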

**Space Complexity:** 
* *Iterative:* O(1). No data structures that grow with the input size are used, so the space is constant.
* *Recursive:* O(log n). Both the auxiliary space and the call stack space have to be considered. The auxiliary space per call (temporary space like variables) is just `left`, `right`, `mid` and `val`, so it's O(1).
However, the call stack space, or recursion depth, is O(log n): the range is halved on each call, and every previous call stays on the stack until the answer is found. In total, it's space per call * recursion depth = O(1) * O(log n) = O(log n).


#### **Implementation**

In [24]:
# Import relevant modules

from random import randint, sample
from typing import Optional

##### **Iterative version**

In [25]:
def binary_search_iter(ls: list, target: int) -> Optional[int]:
    # Define search range
    left, right = 0, len(ls) - 1 

    while left <= right:
        mid = (left + right) // 2 # For languages where overflow can be an issue, can use left + (right - left) // 2
        val = ls[mid]

        # Return midpoint index if it's the target
        if target == val:
            return mid
        
        # Otherwise, adjust search range as needed
        if target < val:
            right = mid - 1  # Don't include mid value as that's already checked
        else:
            left = mid + 1

    return None

##### **Recursive version**

In [26]:
# Recursive implementation of binary search
def binary_search_rec(ls: list, target: int) -> Optional[int]:
    def search(left: int, right: int) -> Optional[int]:
        # Base case for when it has looked through the whole range
        if left > right:
            return None
        
        mid = (left + right) // 2
        val = ls[mid]

        # Return midpoint index if it's the target
        if target == val:
            return mid
        
        # Otherwise, recursively search for value in the relevant range
        if target < val:
            idx = search(left, mid - 1)
        else:
            idx = search(mid + 1, right)

        return idx

    return search(0, len(ls) - 1)

#### **Example use**

In [27]:
if __name__ == "__main__":
    nums = sorted(sample(range(1000), 100))         # Sample of 100 random, different numbers 
    target = nums[randint(0, 99)]                   # Pick a random target that's found on the list
    idx_i = binary_search_iter(nums, target)
    idx_r = binary_search_rec(nums, target)
    for idx in [idx_i, idx_r]:
        if idx is not None:
            print(f'Target {target} found at index {idx}: nums[{idx}] = {nums[idx]}')
        else:
            print(f'Target {target} is not on the given list.')
    

Target 638 found at index 65: nums[65] = 638
Target 638 found at index 65: nums[65] = 638
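
For comparison, Python's standard library provides `bisect.bisect_left`, which returns an insertion point rather than `None` on a miss. The sketch below wraps it to match the return convention used here; the wrapper name `binary_search_bisect` is an assumption made for illustration.

```python
from bisect import bisect_left
from typing import Optional


def binary_search_bisect(ls: list, target: int) -> Optional[int]:
    """Return the index of target in the sorted list ls, or None if it is absent."""
    i = bisect_left(ls, target)           # Leftmost position where target could be inserted
    if i < len(ls) and ls[i] == target:   # The position only counts if it holds the target
        return i
    return None


print(binary_search_bisect([1, 3, 5, 7], 5))
print(binary_search_bisect([1, 3, 5, 7], 4))
```

One difference worth noting: if the list contains duplicates, `bisect_left` always lands on the leftmost match, whereas the hand-written versions return whichever match the midpoint hits first.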


#### **Binary search vs Linear Search**

Linear search checks each element in turn until it finds the target, so its time complexity is O(n). In this section, I'll briefly benchmark the two against each other to illustrate the difference. 

In [5]:
# Simple implementation of linear search
def linear_search(ls: list, target: int) -> Optional[int]:
    for i, num in enumerate(ls):
        if num == target:
            return i

    return None

In [6]:
from collections import defaultdict         # Convenient for not having to initialize dictionaries
from time import perf_counter_ns as t_ns    # To measure time with high resolution
from typing import Callable, Any


### **Bonus: Wrapper functions!**

A **wrapper function** is a function that “wraps around” another function — it calls it, often adding extra behavior before or after the main function runs.

##### **Uses**

They’re commonly used for things like:

* Measuring execution time
* Adding logging
* Handling errors
* Managing resources (like opening/closing files or database connections)

##### **Accepting arguments**

Wrappers can also accept the arguments needed by the wrapped function. These are divided into positional (`args`) and keyword (`kwargs`) arguments. 
* **positional:** arguments matched to parameters by position / order (e.g. `greet("Hi", "Bob")`).
* **keyword:** arguments matched by name (e.g. `greet(name="Alice", greeting="Hello")`).
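
A minimal sketch of the two call styles follows. The body and parameter order of `greet` are assumptions, chosen only to be consistent with the calls mentioned above.

```python
def greet(greeting: str, name: str) -> str:
    """Hypothetical example function: build a greeting string."""
    return f"{greeting}, {name}!"


# Positional: "Hi" goes to greeting and "Bob" to name, purely by order
print(greet("Hi", "Bob"))

# Keyword: the order no longer matters because each argument is named
print(greet(name="Alice", greeting="Hello"))
```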

##### **Syntax**
The syntax is as follows:

```python
def wrapper(func, *args, **kwargs):
    # Work before function
    result = func(*args, **kwargs)
    # Work after function
    return result
```

**Note:** `*` and `**` are packing / unpacking operators. In function definitions, they collect positional arguments into a tuple and keyword arguments into a dictionary, respectively. In function calls, they do the opposite: they unpack a list / tuple or a dictionary into separate arguments.
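
Both directions can be seen in a short sketch. The function name `describe` and the sample values are assumptions made for illustration.

```python
def describe(*args, **kwargs):
    """Return the packed positional (tuple) and keyword (dict) arguments."""
    return args, kwargs


# Packing: separate arguments are collected into a tuple and a dictionary
print(describe(1, 2, sep="-"))

# Unpacking: a list and a dictionary are spread back into separate arguments
values = [1, 2]
options = {"sep": "-"}
print(describe(*values, **options))
```

Both calls should print the same pair, since the second call just unpacks what the first one passes explicitly.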

##### **Decorators**

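In Python, wrappers are often written as decorators: functions that take a function and return a new function wrapping it. The `@` syntax is purely syntactic sugar for applying one when the function is defined. Note that this requires `timed` to return a wrapper function rather than call `func` immediately, so a wrapper with the `wrapper(func, *args, **kwargs)` signature shown above cannot be used with `@` as-is.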

Instead of

```python
result = timed(my_function, arg1, arg2)
```

one can write 

```python
@timed
def my_function(...):
    ...

```
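
For the `@` form to work, the function being applied has to return a wrapper instead of running the wrapped function right away. A minimal sketch of a timing decorator under that assumption is shown below; the names `timed_decorator` and `add` are hypothetical and chosen for illustration.

```python
from functools import wraps
from time import perf_counter_ns
from typing import Any, Callable


def timed_decorator(func: Callable[..., Any]) -> Callable[..., tuple[Any, int]]:
    """Decorator: the wrapped function returns (result, elapsed_time_ns)."""
    @wraps(func)                          # Preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs) -> tuple[Any, int]:
        start = perf_counter_ns()
        result = func(*args, **kwargs)
        elapsed = perf_counter_ns() - start
        return result, elapsed
    return wrapper


@timed_decorator
def add(a: int, b: int) -> int:
    return a + b


result, elapsed_ns = add(2, 3)
print(result, elapsed_ns)
```

The trade-off is that every call to a decorated function now returns a tuple, which is why a plain wrapper called explicitly can be more convenient for one-off benchmarks.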

In [11]:
# Wrapper that returns the time a function takes to run
def timed(func: Callable[..., Any], *args, **kwargs) -> tuple[Any, int]:
    """
    Runs a function and returns (result, elapsed_time_ns).
    """
    start = t_ns()
    result = func(*args, **kwargs)
    elapsed = t_ns() - start
    return result, elapsed


In [None]:
sizes = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 100, 1000, 10000, 100000]      # Try different list sizes to see effect
attempts = 1000                             # Run each experiment 1000 times

# search type: {size: [(idx, time_ns), ...]}
results = {"binary": defaultdict(list), "linear": defaultdict(list)}


In [54]:
for size in sizes:
    results_linear = []
    results_binary = []
    for i in range(attempts):
        nums = sorted(sample(range(size*100), size))            # Generate list of random numbers 
        target = nums[randint(0, size - 1)]                     # Pick a random target that's found on list
        res_l = timed(linear_search, nums, target)              # result = (idx, time)
        res_b = timed(binary_search_iter, nums, target)
        
        if res_l[0] != res_b[0]:
            print(f"Oops, something went wrong. Got index {res_l[0]} on linear search but {res_b[0]} on binary")   

        results_linear.append(res_l)
        results_binary.append(res_b)

    results["linear"][size] = results_linear
    results["binary"][size] = results_binary

# TODO: Target that isn't on list?



In [None]:
for size in sizes:
    avg_l = sum(t for _, t in results["linear"][size]) / (attempts * 10**3)  # nanoseconds to microseconds
    avg_b = sum(t for _, t in results["binary"][size]) / (attempts * 10**3)

    print(f"Average time for size {size}: Linear={avg_l: .2f} Binary={avg_b: .2f} [microseconds]")

Average time for size 10: Linear= 0.38 Binary= 0.27 [microseconds]
Average time for size 100: Linear= 1.18 Binary= 0.49 [microseconds]
Average time for size 1000: Linear= 10.93 Binary= 1.17 [microseconds]
Average time for size 10000: Linear= 121.27 Binary= 2.00 [microseconds]
Average time for size 100000: Linear= 1320.26 Binary= 5.44 [microseconds]


In [43]:
from math import log2
for size in sizes:
    print(log2(size))

3.321928094887362
6.643856189774724
9.965784284662087
13.287712379549449
16.609640474436812
