# **Lecture 12: Searching and Sorting**

# **Searching Algorithms**

- Linear Search
    - brute force search (aka British Museum algorithm)
    - list does not have to be sorted
- Bisection Search
    - list must be sorted to give correct answer
    - two implementations of the algorithm exist: one with higher time complexity and one with lower.

**Linear Search on Unsorted List: Recap O(n)**
- Must look through all elements to decide that an element is not there.

In [None]:
def linear_search(L,e):
    found = False
    for i in range(len(L)):
        if L[i] == e:
            found = True
    return found    # report whether e was seen anywhere in L

**Linear Search on Sorted List: Recap O(n)**

In [None]:
def search(L,e):
    for i in range(len(L)):
        if L[i] == e:
            return True
        if L[i] > e:
            return False
    return False

**Bisection Search Implementation: Recap O(log n)**
- Halve the search range on each recursive call. Divide and conquer.

In [None]:
def bisect_search2(L,e):
    def bisect_search_helper(L,e,low,high):
        if high == low:
            return L[low] == e
        mid = (low+high)//2
        if L[mid] == e:
            return True
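        elif L[mid] > e:
            if low == mid: # nothing left to search
                return False
            else:
                return bisect_search_helper(L,e,low,mid-1)  # e can only be in the lower half
        else:
            return bisect_search_helper(L,e,mid+1,high)     # e can only be in the upper half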
    if len(L) == 0:
        return False
    else:
        return bisect_search_helper(L,e,0,len(L)-1)
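The notes mention two implementations with different costs, but only the index-based one appears above. As a rough sketch of what a higher-cost variant might look like, assuming it narrows the search by slicing the list on each call (the name `bisect_search_slicing` is hypothetical):

```python
def bisect_search_slicing(L, e):
    """Bisection search that copies half of the list on every call (hypothetical name)."""
    if L == []:
        return False
    elif len(L) == 1:
        return L[0] == e
    else:
        half = len(L) // 2
        if L[half] > e:
            # Slicing copies the lower half before recursing on it
            return bisect_search_slicing(L[:half], e)
        else:
            # Slicing copies the upper half before recursing on it
            return bisect_search_slicing(L[half:], e)


print(bisect_search_slicing([1, 3, 5, 7, 9], 7))
print(bisect_search_slicing([1, 3, 5, 7, 9], 4))
```

There are still only O(log n) recursive calls, but the copies add up to O(n) work, whereas passing `low` and `high` indices avoids copying altogether.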

# **Searching a Sorted List, n is len(List)**

- Using linear search, searching for an element is O(n).
- Using bisection search, searching for an element is O(log n).
    - Assumes the list is sorted.
- When does it make sense to sort first, then search?
    - SORT + O(log n) < O(n) -> SORT < O(n) - O(log n)
    - This requires sorting to cost less than O(n), which is never true: a sort must look at every element at least once.

**Amortized Cost, n is len(List)**
- Why bother sorting first?
- In some cases, a list is sorted once and then searched many times.
- Amortize the cost of the sort over many searches.
- SORT + K * O(log n) < K * O(n)
    - For large K, the sort time becomes irrelevant if the cost of sorting is small enough.
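To make the inequality concrete, here is a back-of-the-envelope cost model. It is only a sketch: it assumes the sort costs n * log2(n) steps, a bisection search costs log2(n) steps, a linear search costs n steps, and all hidden constants are 1. The function names are hypothetical.

```python
import math


def linear_cost(n, k):
    """Estimated steps for k linear searches on an unsorted list of length n."""
    return k * n


def sort_then_bisect_cost(n, k):
    """Estimated steps for one sort (assumed n*log2(n)) plus k bisection searches."""
    return n * math.log2(n) + k * math.log2(n)


def break_even_searches(n):
    """Smallest k for which sorting first is cheaper under this model."""
    k = 1
    while sort_then_bisect_cost(n, k) >= linear_cost(n, k):
        k += 1
    return k


print(break_even_searches(1024))
```

Under these assumptions, a list of 1024 elements pays back the sort after 11 searches; real constants will shift that number, but not the overall trend.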

# **Complexity of Bogo Sort O(?)**
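- A sort that shuffles the list randomly, then checks whether it is in order, repeating until it is.
- Best case:
    - O(n), where n is len(L), to check that the list is already sorted.
- Worst case:
    - O(?): unbounded if really unlucky, since no number of shuffles is guaranteed to produce the sorted order.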

In [None]:
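import random

def bogo_sort(L):
    # Sortedness check written inline: keep shuffling while any adjacent pair is out of order
    while any(L[i] > L[i+1] for i in range(len(L)-1)):
        random.shuffle(L)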
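To see the best and worst cases in practice, the sketch below counts how many shuffles a run needs. The helper names `is_sorted` and `bogo_sort_counted` are hypothetical, and a seeded `random.Random` is assumed only so that runs are repeatable.

```python
import random


def is_sorted(L):
    """One O(n) pass: every adjacent pair must be in order."""
    return all(L[i] <= L[i + 1] for i in range(len(L) - 1))


def bogo_sort_counted(L, rng):
    """Shuffle L in place until sorted; return the number of shuffles used."""
    shuffles = 0
    while not is_sorted(L):
        rng.shuffle(L)
        shuffles += 1
    return shuffles


rng = random.Random(0)
data = [3, 1, 2]
count = bogo_sort_counted(data, rng)
print(data, count)

# Best case: an already sorted list needs no shuffles, only the O(n) check
print(bogo_sort_counted([1, 2, 3], rng))
```

The shuffle count varies from seed to seed, which is the point: nothing bounds it from above.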

# **Complexity of Bubble Sort O(n^2)**
- A sort that starts at the front and compares consecutive pairs: the elements at index i and index i+1.
    - If the element at i is greater than the element at i+1, swap them.
    - Move on, with i+1 as the new i, and repeat.
- Repeat passes over the list until a full pass makes no swaps; the list is then sorted.

In [None]:
def bubble_sort(L):
    swap = False
    while not swap:                 # O(len(L)) or O(n)
        swap = True
        for j in range(1,len(L)):   # O(len(L)) or O(n)
            if L[j-1] > L[j]:
                swap = False
                temp = L[j]
                L[j] = L[j-1]
                L[j-1] = temp
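To see where the two nested O(n) factors come from, the sketch below records the list after every pass. `bubble_sort_passes` is a hypothetical helper that follows the same pairwise-swap idea, with a flag that is True when a swap happened.

```python
def bubble_sort_passes(L):
    """Sort L in place; return a copy of L taken after each pass (hypothetical helper)."""
    snapshots = []
    swapped = True
    while swapped:                      # one iteration per pass
        swapped = False
        for j in range(1, len(L)):      # compare each consecutive pair
            if L[j - 1] > L[j]:
                L[j - 1], L[j] = L[j], L[j - 1]
                swapped = True
        snapshots.append(L[:])
    return snapshots


passes = bubble_sort_passes([4, 3, 2, 1])
for snapshot in passes:
    print(snapshot)
```

A reversed list of 4 elements needs 4 passes (the last one only confirms that nothing moves), and each pass makes n - 1 comparisons, which is the O(n^2) worst case.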

# **Complexity of Selection Sort O(n^2)**
- A sort that repeatedly scans the list for the minimum (or maximum) element and moves it into a new list.
    - Alternatively, sort in place: swap the minimum into position i, where i grows by one each step, instead of building a new list.
- Repeat until the original list is empty and the new list is sorted from least to greatest.
- First Step
    - Extract minimum element
    - Swap with element at index 0
- Subsequent Step
    - In remaining sublist, extract minimum element
    - Swap with element at index 1
- Keep left portion of list sorted.
    - at i'th step, first i elements in list are sorted
    - all other elements are bigger than first i elements

In [None]:
def selection_sort(L):
    suffixSt = 0 # posRecorder
    while suffixSt != len(L):                           # O(n)
        for i in range(suffixSt, len(L)):               # O(n): the range shrinks each pass but averages n/2
            if L[i] < L[suffixSt]:
                L[suffixSt], L[i] = L[i], L[suffixSt]
        suffixSt += 1                                   # advance once per pass, outside the inner loop
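The invariant in the bullets above (after step i, the first i elements are sorted and nothing to their right is smaller) can be checked directly. The sketch below uses the find-the-minimum-then-swap-once variant rather than swapping on every smaller element; `selection_sort_checked` is a hypothetical name.

```python
def selection_sort_checked(L):
    """In-place selection sort that asserts the prefix invariant after each step."""
    for start in range(len(L)):
        # Find the index of the minimum element in the unsorted suffix
        min_idx = start
        for i in range(start + 1, len(L)):
            if L[i] < L[min_idx]:
                min_idx = i
        # Swap it to the front of the suffix
        L[start], L[min_idx] = L[min_idx], L[start]

        prefix, rest = L[:start + 1], L[start + 1:]
        assert prefix == sorted(prefix)                     # left portion is sorted
        assert all(p <= r for p in prefix for r in rest)    # everything else is at least as big
    return L


print(selection_sort_checked([5, 2, 9, 1, 5]))
```

The assertions are there for illustration only; they add extra work and would be removed in real use.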

# **Complexity of Merge Sort O(n log n)**
- A sort that keeps splitting the list in half until each sublist has fewer than two elements.
    - A sublist of length 0 or 1 is already sorted, so return it to the recursive call above, which merges the two sorted halves.
    - Keep merging back up the recursion until the whole list is sorted.
    - Divide and conquer.
- Complexity of merging sublists each step
    - go through two lists only one pass
    - compare only smallest element in each sublist
    - O(len(left) + len(right)) copied elements
    - O(len(longer list)) comparisons
    - Linear in length of the lists
- Complexity of the main merge sort algorithm
    - Divide list into two halves
    - The recursion is depth-first: the smallest pieces down one branch are conquered before moving to larger pieces.
    - Analysis
        - First recursion level
            - n/2 elements in each list
            - O(n) + O(n) = O(n) where n is len(L)
        - Second recursion level
            - n/4 elements in each list
            - two merges -> O(n) where n is len(L)
        - Each recursion level costs O(n) where n is len(L)
        - Dividing the list in half with each recursive call gives
            - O(log(n)) levels where n is len(L)
        - Overall complexity is O(n log(n)) where n is len(L)
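The level-by-level analysis above can also be written as a recurrence. This is a sketch, assuming the merge for a list of length n costs c·n for some constant c, and that n is a power of 2; T(n) and c are notation introduced here, not part of the original notes.

```latex
\begin{aligned}
T(1) &= c \\
T(n) &= 2\,T(n/2) + c\,n \\
     &= 4\,T(n/4) + 2\,c\,n \\
     &= 2^{k}\,T(n/2^{k}) + k\,c\,n \\
     &= n\,T(1) + c\,n\log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n)
\end{aligned}
```

Each unrolling step adds one more c·n term (one recursion level), and there are log2(n) levels before the sublists reach length 1.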

In [None]:
# Portion 1, merge function

def merge_function(left,right): # O(n)
    result = []
    i,j = 0,0
    while i < len(left) and j < len(right):     # O(n)
        if left[i] < right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    while (i < len(left)):                      # O(n)
        result.append(left[i])
        i += 1
    while (j < len(right)):                     # O(n)
        result.append(right[j])
        j += 1
    return result

In [None]:
# Portion 2, Merge Sort Function

def merge_sort(L): # O(n log n) overall
    if len(L) <2:
        return L[:]
    else:                                   # list is halved each call, so the recursion is O(log n) levels deep
        middle = len(L)//2
        left = merge_sort(L[:middle])       # recursively sort the left half (the slice copies O(n) elements)
        right = merge_sort(L[middle:])      # recursively sort the right half
        return merge_function(left,right)
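As a rough check of the O(n log n) claim, the sketch below counts element comparisons and compares the total with n * log2(n). `merge_sort_counted` is a hypothetical, self-contained variant that folds the merge step into the same function.

```python
import math


def merge_sort_counted(L):
    """Return (sorted copy of L, number of element comparisons). Hypothetical helper."""
    if len(L) < 2:
        return L[:], 0
    middle = len(L) // 2
    left, left_cmp = merge_sort_counted(L[:middle])
    right, right_cmp = merge_sort_counted(L[middle:])

    # Merge the two sorted halves, counting each comparison
    result, i, j, comparisons = [], 0, 0, 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] < right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])     # at most one of these two has leftovers
    result.extend(right[j:])
    return result, left_cmp + right_cmp + comparisons


data = [8, 3, 5, 1, 7, 2, 6, 4]
sorted_data, comparisons = merge_sort_counted(data)
bound = len(data) * math.log2(len(data))    # n * log2(n) = 24 for n = 8
print(sorted_data, comparisons, bound)
```

The comparison count stays at or below n * log2(n) when n is a power of 2; the copying done by slices and appends is also linear per level, so it does not change the overall bound.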