## Is My Data Ordered? Then There's a Lot You Can Do!

Binary search is an efficient algorithm for finding an item in a sorted list: it runs in O(log n) time, whereas linear search runs in O(n). In exchange, it requires the list to be sorted and demands careful handling of edge cases and pointer updates, which makes it slightly harder to implement correctly than linear search.

The sections below cover the pitfalls of binary search and how it differs from linear search.

## Pitfalls of Binary Search

	1.	Sorted List Requirement:
	•	Binary search requires the list to be sorted beforehand. On an unsorted list it still terminates, but it can report that an element is missing when it is actually present.
	•	Sorting the list first adds overhead, especially for large datasets: comparison-based sorting has a time complexity of O(n log n).
	2.	Integer Overflow:
	•	In languages with fixed-width integers, such as C++ or Java, calculating the midpoint as (low + high) // 2 can overflow when low + high exceeds the largest representable integer. This is not a concern in Python, whose integers have arbitrary precision.
	•	A safer way to calculate the midpoint is low + (high - low) // 2.
	3.	Infinite Loops:
	•	Incorrect updates to the low and high pointers can cause infinite loops. Ensuring that low and high are updated correctly in each iteration is crucial.
	4.	Edge Cases:
	•	Handling edge cases like an empty list or a list with one element can be tricky and needs special attention.
	•	If the target element is smaller than the smallest element or larger than the largest element, the algorithm needs to terminate correctly.
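
The integer overflow pitfall can be made concrete with a small sketch. Python integers never overflow, so the example below simulates signed 32-bit wraparound (as in Java) with a hypothetical helper, `to_int32`, which is an assumption of this example and not part of any library:

```python
def to_int32(x: int) -> int:
    """Wrap x into the signed 32-bit range, mimicking fixed-width int arithmetic."""
    x &= 0xFFFFFFFF
    return x - 0x1_0000_0000 if x >= 0x8000_0000 else x


# Both bounds fit in a signed 32-bit int (maximum 2_147_483_647)
low, high = 2_000_000_000, 2_100_000_000

# Naive midpoint: low + high exceeds the 32-bit maximum and wraps around
naive_mid = to_int32(low + high) // 2

# Safe midpoint: high - low is small, so no intermediate value overflows
safe_mid = low + to_int32(high - low) // 2

print(naive_mid)  # a negative number, which is not a valid midpoint
print(safe_mid)   # 2050000000
```

The safe form works because `high - low` is never larger than `high` itself, so every intermediate value stays inside the representable range.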

## How Binary Search Differs from Linear Search

Efficiency:

	•	Linear Search: Time complexity is O(n). It checks each element sequentially until the target is found or the list ends. This makes it inefficient for large lists.
	•	Binary Search: Time complexity is O(log n). It divides the search interval in half with each step, significantly reducing the number of comparisons needed.
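
To give a feel for the gap between O(n) and O(log n), here is a rough sketch that counts how many elements each approach inspects in the worst case for a successful search. The helper names `linear_steps` and `binary_steps` are made up for this illustration:

```python
def linear_steps(items: list[int], target: int) -> int:
    """Count how many elements a linear search inspects."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_steps(items: list[int], target: int) -> int:
    """Count how many midpoints a binary search inspects."""
    low, high, steps = 0, len(items) - 1, 0
    while low <= high:
        steps += 1
        mid = low + (high - low) // 2
        if items[mid] == target:
            break
        elif items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


data = list(range(1_000_000))  # already sorted
print(linear_steps(data, 999_999))  # 1000000: every element is inspected
print(binary_steps(data, 999_999))  # at most 20, since 2**20 > 1_000_000
```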

Applicability:

	•	Linear Search: Can be applied to any list, whether sorted or unsorted.
	•	Binary Search: Only applicable to sorted lists. If the list is unsorted, it must be sorted first, adding overhead.
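
A short sketch of what goes wrong on unsorted input, using a hypothetical `binary_search` helper written for this illustration. The search finishes normally but gives the wrong answer, which is why this failure is easy to miss:

```python
def binary_search(items: list[int], target: int) -> bool:
    """Classic binary search; only correct when items is sorted ascending."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = low + (high - low) // 2
        if items[mid] == target:
            return True
        elif items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return False


unsorted = [90, 3, 71, 1, 69]

# 1 is in the list, but the search discards the half that contains it
found_unsorted = binary_search(unsorted, 1)

# Sorting first restores the precondition, at O(n log n) extra cost
found_sorted = binary_search(sorted(unsorted), 1)

print(found_unsorted)  # False
print(found_sorted)    # True
```

If the data is searched only once, a single linear scan is cheaper than sorting first; sorting pays off when the same list is searched many times.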

Implementation Complexity:

	•	Linear Search: Simple to implement. Iterate through the list and check each element.
	•	Binary Search: More complex. Requires careful management of pointers and calculation of the midpoint.


## Pseudocode for Binary Search

	1.	Initialization:
	•	Set low to 0 (the index of the first element).
	•	Set high to the index of the last element (length of the list minus one).
	2.	Loop:
	•	While low is less than or equal to high:
		•	Calculate mid as the integer division of (low + high) by 2.
		•	If the element at mid is equal to the target element (needle):
			•	Return True (or the index mid).
		•	Else if the element at mid is less than the target element:
			•	Set low to mid + 1.
		•	Else (the element at mid is greater than the target element):
			•	Set high to mid - 1.
	3.	End of Loop:
	•	If the loop exits without finding the target element, return False.

    ```
    FUNCTION binary_search(haystack, needle):
        low ← 0
        high ← LENGTH(haystack) - 1

        WHILE low ≤ high DO:
            mid ← (low + high) // 2

            IF haystack[mid] = needle THEN:
                RETURN True
            ELSE IF haystack[mid] < needle THEN:
                low ← mid + 1
            ELSE:
                high ← mid - 1

        RETURN False
    ```

## Implementing Binary Search

In [5]:
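# A decorator that prints a function's execution time
import functools
import time

def timeit(func):
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # monotonic, high-resolution clock for timing
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"Function {func.__name__} executed in {end-start} seconds")
        return result
    return wrapper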

In [6]:
# Explanation:

# 	1.	Initialization:
# 	•	low is initialized to 0, representing the start of the list.
# 	•	high is initialized to the last index of the list, which is len(haystack) - 1.
# 	2.	Loop:
# 	•	The loop runs while low <= high, so an empty list (high == -1) skips it entirely.
# 	•	Inside the loop, mid is calculated as low + (high - low) // 2 to avoid overflow issues.
# 	•	If the element at mid (haystack[mid]) is equal to the needle, the function returns True.
# 	•	If the element at mid is less than the needle, the search continues in the right half by updating low to mid + 1.
# 	•	If the element at mid is greater than the needle, the search continues in the left half by updating high to mid - 1.
# 	•	The loop ends once low exceeds high, indicating that the search interval is empty.
# 	3.	End of Loop:
# 	•	If the loop exits without finding the needle, the function returns False, indicating that the target element is not in the list.

@timeit
def bs_list(haystack: list[int], needle: int) -> bool:
    """
    Binary search in a sorted list. Returns True if needle is present.
    """
    low = 0
    high = len(haystack) - 1

    # An empty interval (low > high) means the needle cannot be present
    while low <= high:
        # Calculate mid index safely; // already yields an integer
        mid = low + (high - low) // 2

        value = haystack[mid]

        # Check if the mid element is the target element
        if value == needle:
            return True
        # If the target element is greater, ignore the left half
        elif value < needle:
            low = mid + 1
        # If the target element is smaller, ignore the right half
        else:
            high = mid - 1

    # If the element is not found, return False
    return False


In [7]:
# Test the binary search function
def test_bs_list():
    foo = [1, 3, 4, 69, 71, 81, 90, 99, 420, 1337, 69420]
    assert bs_list(foo, 69)
    assert not bs_list(foo, 1336)
    assert bs_list(foo, 69420)
    assert not bs_list(foo, 69421)
    assert bs_list(foo, 1)
    assert not bs_list(foo, 0)

# Run the test function
test_bs_list()

Function bs_list executed in 4.482269287109375e-05 seconds
Function bs_list executed in 2.6226043701171875e-06 seconds
Function bs_list executed in 4.5299530029296875e-06 seconds
Function bs_list executed in 1.430511474609375e-06 seconds
Function bs_list executed in 1.1920928955078125e-06 seconds
Function bs_list executed in 3.0994415283203125e-06 seconds
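
The pseudocode mentions returning the index instead of True. One way to sketch that variant is with `bisect_left` from Python's standard `bisect` module, which performs the binary search itself. The wrapper name `index_of` and the choice of -1 for "not found" are assumptions of this example:

```python
from bisect import bisect_left


def index_of(haystack: list[int], needle: int) -> int:
    """Return the index of needle in a sorted list, or -1 if it is absent."""
    # bisect_left returns the leftmost position where needle could be inserted
    i = bisect_left(haystack, needle)
    if i < len(haystack) and haystack[i] == needle:
        return i
    return -1


foo = [1, 3, 4, 69, 71, 81, 90, 99, 420, 1337, 69420]
print(index_of(foo, 69))    # 3
print(index_of(foo, 1336))  # -1
print(index_of([], 1))      # -1 (the empty list is handled without a special case)
```

Because `bisect_left` only returns an insertion point, the explicit equality check is what turns it into a membership test.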
