# Searching

In Python, there is a very easy way to ask whether an item is in a list of items.

We use the <font color=red>__in__</font> operator.

In [1]:
15 in [3, 5, 2, 4, 1]

False

In [2]:
3 in [3, 5, 2, 4, 1]

True

## The Sequential Search

Starting from the first item in the list, we simply move from item to item, following the underlying sequential order until we either find what we are looking for or run out of items. 

In [3]:
def sequentialSearch(alist, item):
    pos = 0
    found = False
    
    while pos < len(alist) and not found:
        if alist[pos] == item:
            found = True
        else:
            pos = pos+1
    
    return found

In [4]:
testlist = [1, 2, 32, 8, 17, 19, 42, 13, 0]

In [5]:
print(sequentialSearch(testlist, 3))

False


In [6]:
print(sequentialSearch(testlist, 13))

True


## Analysis of Sequential Search

| __Case__ | __Best Case__ | __Worst Case__ | __Average Case__ |
|----------|---------------|----------------|------------------|
| item is present | 1 | n | n/2 |
| item is not present | n | n | n |
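
The counts in the table can be checked with a small instrumented version of the search. This is a minimal sketch: `sequential_search_count` is a hypothetical helper (not part of the code above) that returns the number of items examined alongside the result. For the present-item case, the exact average over all positions is (n + 1)/2, which the table approximates as n/2.

```python
def sequential_search_count(alist, item):
    """Return (found, comparisons) for a left-to-right scan of alist."""
    comparisons = 0
    for value in alist:
        comparisons += 1              # one equality test per item visited
        if value == item:
            return True, comparisons
    return False, comparisons


data = [1, 2, 32, 8, 17, 19, 42, 13, 0]     # n = 9

print(sequential_search_count(data, 1))     # best case, present:  (True, 1)
print(sequential_search_count(data, 0))     # worst case, present: (True, 9)
print(sequential_search_count(data, 3))     # not present:         (False, 9)

# Average number of comparisons over every item that is present.
total = sum(sequential_search_count(data, x)[1] for x in data)
print(total / len(data))                    # 5.0, i.e. (n + 1) / 2
```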

## What if the list is ordered?

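If the list is sorted in ascending order, the search can stop as soon as it reaches an item greater than the one we are looking for, because every later item is larger still. So in some cases, the algorithm does not have to continue looking through all of the items to report that the item was not found.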

In [7]:
def orderedSequentialSearch(alist, item):
    pos = 0
    found = False
    stop = False
    while pos < len(alist) and not found and not stop:
        if alist[pos] == item:
            found = True
        else:
            if alist[pos] > item:
                stop = True
            else:
                pos = pos+1
                
    return found

In [8]:
testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]

In [9]:
print(orderedSequentialSearch(testlist, 3))

False


In [10]:
print(orderedSequentialSearch(testlist, 13))

True
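
To see the early stop at work, the sketch below counts how many items are examined before the search gives up. `ordered_search_count` is a hypothetical helper written for this illustration, and it assumes the list is sorted in ascending order.

```python
def ordered_search_count(alist, item):
    """Return (found, items_examined) for an ascending sorted list."""
    examined = 0
    for value in alist:
        examined += 1
        if value == item:
            return True, examined
        if value > item:              # everything after this is larger too
            return False, examined
    return False, examined


data = [0, 1, 2, 8, 13, 17, 19, 32, 42]

print(ordered_search_count(data, 3))    # stops at 8:         (False, 4)
print(ordered_search_count(data, 50))   # no early stop:      (False, 9)
print(ordered_search_count(data, 13))   # found at position 5: (True, 5)
```

The early stop only helps when the missing item is smaller than some item in the list. Searching for a value larger than every item still examines all n items.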


## The Binary Search

Instead of searching the list in sequence, a __binary search__ starts by examining the middle item.

If it is not the correct item, we can use the ordered nature of the list to eliminate half of the remaining items.

In [11]:
def binarySearch(alist, item):
    first = 0
    last = len(alist)-1
    found = False
    
    while first<=last and not found:
        midpoint = (first + last) // 2
        if alist[midpoint] == item:
            found = True
        else:
            if item < alist[midpoint]:
                last = midpoint-1
            else:
                first = midpoint+1
                
    return found

In [12]:
testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]

In [13]:
print(binarySearch(testlist, 3))

False


In [14]:
print(binarySearch(testlist, 13))

True


This algorithm is a great example of a divide and conquer strategy.

## Recursive Version of Binary Search

In [15]:
def r_binarySearch(alist, item):
    if len(alist) == 0:
        return False
    else:
        midpoint = len(alist) // 2
        if alist[midpoint]==item:
            return True
        else:
            if item < alist[midpoint]:
                return r_binarySearch(alist[:midpoint], item)
            else:
                return r_binarySearch(alist[midpoint+1:], item)

In [16]:
testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]

In [17]:
print(r_binarySearch(testlist, 3))

False


In [18]:
print(r_binarySearch(testlist, 13))

True


## Analysis of Binary Search

Remember that each comparison eliminates about half of the remaining items from consideration.

| __Comparisons__ | __Approximate Number of Items Left__ |
|-----------------|--------------------------------------|
| 1 | n/2 |
| 2 | n/4 |
| 3 | n/8 |
| ... | |
| i | n/2^i |

When we split the list enough times, we end up with a list that has just one item.

Either that is the item we are looking for or it is not.

The number of comparisons necessary to get to this point is i where n/2^i = 1.

Solving for i gives us i = log n, where the logarithm is taken base 2.

Therefore, the binary search is O(log n).
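
The logarithmic growth can be observed by counting loop iterations. The sketch below uses a hypothetical helper, `binary_search_steps`, that mirrors the iterative search above but also returns the number of midpoints examined. It searches for a value larger than every item, which is one way to force the search to run until the range is empty, and compares the count with floor(log2 n) + 1.

```python
import math


def binary_search_steps(alist, item):
    """Return (found, steps), where steps counts midpoints examined."""
    first, last = 0, len(alist) - 1
    steps = 0
    while first <= last:
        steps += 1
        midpoint = (first + last) // 2
        if alist[midpoint] == item:
            return True, steps
        if item < alist[midpoint]:
            last = midpoint - 1
        else:
            first = midpoint + 1
    return False, steps


for n in (9, 1000, 1_000_000):
    data = list(range(n))                 # sorted list 0 .. n-1
    found, steps = binary_search_steps(data, n)   # n itself is absent
    bound = math.floor(math.log2(n)) + 1
    print(n, steps, bound)
```

Growing the list from a thousand items to a million adds only about ten more steps.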

In the solution shown above, the recursive call,

<font color=red>__r_binarySearch(alist[:midpoint], item)__</font>

uses the slice operator to create the left half of the list.

We know that the slice operator in Python is actually O(k), where k is the number of items in the slice, because the slice copies those items into a new list.

This means that the binary search using slice will not perform in strict logarithmic time.
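
One common way to avoid the slicing cost is to pass the bounds of the current range instead of a copy of the sublist. The following is a minimal sketch under that assumption. The name `r_binary_search_idx` and its `first` and `last` parameters are introduced here for illustration and are not part of the code above.

```python
def r_binary_search_idx(alist, item, first=0, last=None):
    """Recursive binary search over alist[first..last] without slicing."""
    if last is None:                  # first call: search the whole list
        last = len(alist) - 1
    if first > last:                  # empty range: item is not present
        return False
    midpoint = (first + last) // 2
    if alist[midpoint] == item:
        return True
    if item < alist[midpoint]:
        # Continue in the left half; no new list is created.
        return r_binary_search_idx(alist, item, first, midpoint - 1)
    # Continue in the right half.
    return r_binary_search_idx(alist, item, midpoint + 1, last)


testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
print(r_binary_search_idx(testlist, 3))     # False
print(r_binary_search_idx(testlist, 13))    # True
print(r_binary_search_idx(testlist, 42))    # True
```

Each call now does a constant amount of work besides the recursive call, so the total stays proportional to the number of halvings.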