# Lecture 14 - Searching
 
In this lecture and [Lecture 15](ME400_Lecture_15.ipynb), we tackle two of the most important practical problems in computing: *searching* and *sorting*.  We'll start with *searching* in this lecture because it is the simpler problem, but efficient searching depends on sorted values.  Along the way, algorithms will be classified by their *order*, a way to describe how good (or bad) an algorithm is for problems of different sizes.

### Objectives

By the end of this lesson, you should be able to

- Search an array of sorted or unsorted numbers using a linear search.
- Search an array of sorted numbers using a binary search.
- Describe what is meant by order and use it to compare algorithms.
- Perform simple numerical experiments to confirm the order of an algorithm.

## The Basic, Linear Search

**The Problem**: given a sequence of values, find the location of the element in the sequence equal to some value of interest. 

**Question**: Does the order of the elements matter?  

```
"""Algorithm for linear search of an unsorted sequence"""
Input: a, n, v # sequence, number elements, value of interest
Set location = Not found
Set i = 0
While ________
    If ________ then
        Set ________
        Break  # could we do this without break?
    Set i = i + 1
Output: location
```

**Exercise**:  Implement this search algorithm as a Python function named `linear_search(a, v)`.
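Once you have tried the exercise yourself, here is one possible way to fill in the blanks, offered as a minimal sketch rather than the only answer. It assumes that "Not found" is represented by `None` and that the length `n` is taken from `len(a)` instead of being passed in.

```python
def linear_search(a, v):
    """Return the index of the first element of a equal to v, or None."""
    location = None          # assumption: None plays the role of "Not found"
    i = 0
    while i < len(a):
        if a[i] == v:
            location = i
            break            # stop at the first match
        i += 1               # move on to the next element
    return location


print(linear_search([7, 3, 9, 1], 9))  # 2
print(linear_search([7, 3, 9, 1], 4))  # None
```

Nothing here relies on the order of the elements, which is why the same function works for sorted and unsorted sequences alike.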

## When Searching for Equality is Not Enough

What if one wants to find
- The location of an element in a sequence equal to some value or, *if not found*,
- The location of the value that is closest to but less/greater than the value of interest.  **This requires sorted elements; we'll cover sorting algorithms next time.**

**Example**:  Modify the linear search algorithm to return the location of the first match or, if no match, then the element closest to but less than the target value. *Assume* that the sequence is sorted in *increasing* order.

```
"""Algorithm for linear search of a sorted sequence (match or closest smaller value)"""
Input: a, n, v # sequence, number elements, value of interest
Set location = Not found
Set i = 0
While ________
    If ________ then
        Set ________
        Break  # could we do this without break?
    Set i = i + 1
Output: location
```
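One way the modified search might look in Python is sketched below. The function name `linear_search_le` is hypothetical, `None` stands in for "Not found", and the structure departs slightly from the template above: it uses two tests per element, one for a match and one for having passed the target.

```python
def linear_search_le(a, v):
    """Search a sequence sorted in increasing order.

    Returns the index of the first element equal to v.  If there is no
    match, returns the index of the closest element less than v, or None
    if every element is greater than v.
    """
    location = None
    i = 0
    while i < len(a):
        if a[i] == v:        # exact match: report it and stop
            location = i
            break
        if a[i] > v:         # passed the target: keep the previous candidate
            break
        location = i         # a[i] < v, so it is the best candidate so far
        i += 1
    return location


a = [1, 3, 7, 9, 11]
print(linear_search_le(a, 7))   # 2 (exact match)
print(linear_search_le(a, 8))   # 2 (7 is the closest value below 8)
print(linear_search_le(a, 0))   # None (nothing is less than 0)
```

The early exit at `a[i] > v` is valid only because the sequence is sorted; on unsorted data a later element could still be a better candidate.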

## A Bit About *Order*

The *order* of an algorithm tells us how *expensive* the algorithm is as a function of $n$ for problems of size $n$.

Example: the cost of linear search is the **number of comparisons** (`a[i] == v`) required to find `v`.  How many is that?

The fancy way to say it: 
 - **order n** 
 - $\mathcal{O}(n)$ (this is "Big O" notation)

The computational cost of an algorithm (time and, sometimes, memory) grows with $n$ at the rate given by its order.  For large $n$, **smaller order means a faster algorithm.**

In this class, *order* is used in an engineering/empirical sense that describes how *computationally expensive* an algorithm is on the average.
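To make the comparison count concrete, the sketch below instruments a linear search with a counter. The helper `linear_search_count` is hypothetical and exists only for this illustration; it counts equality comparisons in the worst case (value absent) and on average (each stored value searched for once).

```python
def linear_search_count(a, v):
    """Linear search returning (location, number of comparisons made)."""
    comparisons = 0
    for i, x in enumerate(a):
        comparisons += 1             # one equality test per element visited
        if x == v:
            return i, comparisons
    return None, comparisons


for n in (10, 100, 1000):
    a = list(range(n))
    _, worst = linear_search_count(a, -1)    # -1 is absent: all n elements are checked
    average = sum(linear_search_count(a, v)[1] for v in a) / n
    print(n, worst, average)
# 10 10 5.5
# 100 100 50.5
# 1000 1000 500.5
```

Both the worst case ($n$ comparisons) and the average case (about $n/2$) grow in proportion to $n$, which is what $\mathcal{O}(n)$ expresses.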

**Exercise**: Suppose that, on the average, a particular implementation of linear search requires about 0.1 seconds to search a list of $10^5$ numbers.  About how long do you think it would take to search a list of $10^6$ numbers?

## Binary Search

**Linear search** is easy to understand and easy to implement, but is it what you should use to search *sorted data*?


Better approach for *sorted data*: check the middle of the sequence to decide in which half the value lives (if it does at all). Then, check the middle value of the new half, and repeat.

The process just described is the basic idea of **binary search** and is the simplest of **divide and conquer** algorithms.

### The basic algorithm

```
"""Algorithm for binary search of a sorted sequence"""
Input: a, n, v # sorted sequence, its length, and value
Set location = Not found
Set L = 0
Set R = n - 1
While L <= R
    Set C = (L + R) // 2
    If v == a[C] then
        Set location = C
        Break
    If v < a[C] then
        Set R = C - 1
    If v > a[C] then
        Set L = C + 1
Output: location
```

**Exercise**: Step through this algorithm for `a = [1, 3, 7, 9, 11]` and `v = 3`.  Do I have a volunteer to trace this algorithm for these inputs?
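For reference, the pseudocode above translates to Python roughly as follows. This is a sketch under a few assumptions: `None` represents "Not found", the length comes from `len(a)`, and the names `left`, `right`, and `center` stand for `L`, `R`, and `C`.

```python
def binary_search(a, v):
    """Return an index of v in the sorted sequence a, or None if absent."""
    location = None
    left, right = 0, len(a) - 1
    while left <= right:
        center = (left + right) // 2     # middle of the current interval
        if v == a[center]:
            location = center
            break
        elif v < a[center]:
            right = center - 1           # v can only be in the left half
        else:
            left = center + 1            # v can only be in the right half
    return location


a = [2, 4, 6, 8, 10, 12]
print(binary_search(a, 10))  # 4
print(binary_search(a, 5))   # None
```

Adding a `print(left, center, right)` inside the loop is an easy way to check a hand trace.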

### Back to Order

Whereas linear search is $\mathcal{O}(n)$, binary search is $\mathcal{O}(\log n)$ (i.e., "log n").
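The logarithm comes from halving: each pass through the loop discards about half of the remaining interval, so roughly $\log_2 n$ passes are enough. The sketch below checks this by counting loop iterations; `binary_search_count` is a hypothetical helper written only for this illustration.

```python
import math


def binary_search_count(a, v):
    """Binary search returning (location, number of loop iterations)."""
    steps = 0
    left, right = 0, len(a) - 1
    while left <= right:
        steps += 1
        center = (left + right) // 2
        if v == a[center]:
            return center, steps
        elif v < a[center]:
            right = center - 1
        else:
            left = center + 1
    return None, steps


for n in (10, 100, 1000, 10000):
    a = list(range(n))
    # largest number of iterations needed to find any stored value
    worst = max(binary_search_count(a, v)[1] for v in a)
    print(n, worst, math.floor(math.log2(n)) + 1)
```

The last two columns should agree: multiplying $n$ by 10 adds only three or four iterations, whereas it multiplies the work of a linear search by 10.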

**Numerical experiments** are a great way to investigate algorithms and their performance.  To test search algorithms, we need:
 - random arrays
 - (and/or) random values to search for
 - arrays of different sizes
 - a way to record how long it takes to search
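A minimal version of such an experiment might look like the sketch below. The choices here are assumptions made for illustration: `random` from the standard library generates the data, `time.perf_counter` does the timing, the arrays are sorted so that binary search is valid, and the array sizes and the number of targets (50) are arbitrary. The helper `average_time` is hypothetical.

```python
import random
import time


def linear_search(a, v):
    for i, x in enumerate(a):
        if x == v:
            return i
    return None


def binary_search(a, v):
    left, right = 0, len(a) - 1
    while left <= right:
        center = (left + right) // 2
        if v == a[center]:
            return center
        elif v < a[center]:
            right = center - 1
        else:
            left = center + 1
    return None


def average_time(search, a, targets):
    """Average wall-clock time (seconds) per call of search(a, v)."""
    start = time.perf_counter()
    for v in targets:
        search(a, v)
    return (time.perf_counter() - start) / len(targets)


random.seed(0)                       # fixed seed so the data are repeatable
results = []
for n in (1_000, 10_000, 100_000):
    a = sorted(random.random() for _ in range(n))       # random, sorted array
    targets = [random.choice(a) for _ in range(50)]     # random values to find
    t_lin = average_time(linear_search, a, targets)
    t_bin = average_time(binary_search, a, targets)
    results.append((n, t_lin, t_bin))
    print(f"n = {n:>7d}   linear: {t_lin:.2e} s   binary: {t_bin:.2e} s")
```

Timings vary from machine to machine and run to run, so look at the trend rather than the individual numbers: the linear times should grow by roughly a factor of 10 per row, while the binary times should barely change.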

## Recap

By now, you should be able to

- Search an array of sorted or unsorted numbers using a linear search.
- Search an array of sorted numbers using a binary search.
- Describe what is meant by order and use it to compare algorithms.
- Perform simple numerical experiments to confirm the order of an algorithm.