# Sequential/Linear Search

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/gao-hongnan/reighns-ml-blog/blob/master/docs/reighns_ml_journey/data_structures_and_algorithms/Stack.ipynb)

## Intuition of Sequential Search

The idea is simple. We are given a container, say a list, whose elements are stored sequentially (linearly), so each element has a position relative to the others.

To search for a target element `e` in the list, we check the elements **sequentially**, from the first to the last. If we find `e` along the way, we return `True`; if we reach the end of the list without finding `e`, we return `False`. We can also return the index of `e` when it is found.

```{figure} ../assets/linear_search_geeksforgeeks.png
---
name: linear_search_diagram
---
Linear Search Algorithm. Image credit to [GeeksforGeeks](https://www.geeksforgeeks.org/linear-search/).
```

## Unordered Sequential Search

The list that we want to search is unordered.

### Algorithm (Iterative)

````{prf:algorithm} Basic Linear Search Algorithm (Iterative)
:label: basic_linear_search_iterative

Given a list $L$ of $n$ elements with values or records $L_0, L_1, ..., L_{n-1}$, and target value $T$, the following subroutine uses linear search to find the index of the target $T$ in $L$.

1. Set $i$ to 0.
2. If $L_i = T$, the search terminates successfully; return $i$. Else, go to step 3.
3. Increase $i$ by 1.
4. If $i < n$, go to step 2. Otherwise, the search terminates unsuccessfully; return $-1$.

Pseudocode:

```python
def linear_search(L, T):
    for i in range(len(L)):  # i = 0, 1, ..., n - 1
        if L[i] == T:
            return i
    return -1
```
````

### Implementation (Iterative)

Assumptions:

1. The `container` is a list.
2. The list holds integers or floats.

In [1]:
from __future__ import annotations

from typing import TypeVar, Tuple, Iterable

T = TypeVar("T")


def unordered_sequential_search_iterative(
    container: Iterable[T], target: T
) -> Tuple[bool, int]:
    """Return (True, index) if the target element is found in the container,
    else return (False, -1), where -1 indicates that there is no valid index."""
    is_found = False  # flag returned alongside the index to make the result explicit
    index = 0
    for item in container:
        if item == target:
            is_found = True
            return is_found, index
        index += 1
    return is_found, -1

In [10]:
unordered_list = [1, 2, 32, 8, 17, 19, 42, 13, 0]

print(unordered_sequential_search_iterative(unordered_list, -1)) # smaller than smallest element
print(unordered_sequential_search_iterative(unordered_list, 45)) # larger than largest element
print(unordered_sequential_search_iterative(unordered_list, 13)) # in the middle

(False, -1)
(False, -1)
(True, 7)


#### Time Complexity

We need to split the time complexity into a few cases, because the
time complexity ***heavily*** depends on the position of the target element we are searching for.

If the element we are searching for is at the beginning of the list, then the time complexity is $\O(1)$, because we only need to check the first element.

If the element is at the end of the list, then the time complexity is $\O(n)$, because we need to check every element in the list.

On average, the search makes about $\frac{n}{2}$ comparisons. This assumes a uniform distribution: for a list with $n$
elements, the target is equally likely to be at any of the $n$ positions, so the expected number of comparisons is $\frac{1 + 2 + \cdots + n}{n} = \frac{n + 1}{2}$. We write this as $\O(\frac{n}{2})$ to emphasise the halving, although asymptotically $\O(\frac{n}{2})$ is the same class as $\O(n)$ because constant factors are dropped.

However, so far we assumed that the element we are searching for is in the list. If the element is not in the list, then the time complexity is $\O(n)$ for all cases,
because we need to check every element in the list.

```{list-table} Time Complexity of Sequential Search
:header-rows: 1
:name: sequential_search_time_complexity

* - Case
  - Worst Case
  - Average Case
  - Best Case
* - Element is in the list
  - $\O(n)$
  - $\O(\frac{n}{2})$
  - $\O(1)$
* - Element is not in the list
  - $\O(n)$
  - $\O(n)$
  - $\O(n)$
```
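The cases in the table can be checked empirically. The following is a minimal sketch, assuming a hypothetical helper `count_comparisons` (not part of the implementation above) that counts how many elements are compared before the search stops.

```python
from __future__ import annotations


def count_comparisons(container: list[int], target: int) -> int:
    """Hypothetical helper: number of equality checks a linear search makes."""
    comparisons = 0
    for item in container:
        comparisons += 1
        if item == target:
            break  # found, stop counting
    return comparisons


unordered_list = [1, 2, 32, 8, 17, 19, 42, 13, 0]
n = len(unordered_list)

best = count_comparisons(unordered_list, 1)  # target is the first element
worst = count_comparisons(unordered_list, 0)  # target is the last element
missing = count_comparisons(unordered_list, 45)  # target is absent

# average over every possible position of a target that is present
average = sum(count_comparisons(unordered_list, x) for x in unordered_list) / n

print(best, worst, missing, average)
```

For this list with $n = 9$ distinct elements, the average over all present targets equals $\frac{n + 1}{2} = 5$ comparisons, while an absent target always costs $n$ comparisons.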

#### Space Complexity

Space complexity: $\O(1)$, because the loop only keeps track of one boolean flag and one index variable.
If we count the space taken by the list itself, then the space complexity is $\O(n)$, since
the list has size $n$.

However, the usual convention is to count only the auxiliary space: if the list is given as input and is not created by the algorithm,
its size is not counted, and thus the space complexity is $\O(1)$.

### Implementation (Recursive)

In [3]:
def unordered_sequential_search_recursive(
    container: list[T], target: T, index: int = 0
) -> int:
    """Recursive implementation of unordered Sequential Search."""
    if len(container) == 0:  # if not container is also fine
        return -1  # not found

    if container[0] == target:  # this is base case
        return index  # found

    # notice we increment index by 1 to mean index += 1 in the iterative case
    return unordered_sequential_search_recursive(
        container[1:], target, index + 1
    )  # recursive case

In [11]:
unordered_list = [1, 2, 32, 8, 17, 19, 42, 13, 0]

print(unordered_sequential_search_recursive(unordered_list, -1)) # smaller than smallest element
print(unordered_sequential_search_recursive(unordered_list, 45)) # larger than largest element
print(unordered_sequential_search_recursive(unordered_list, 13)) # in the middle

-1
-1
7


Let's see if our implementation obeys the 3 Laws of Recursion ({prf:ref}`axiom_three_laws_of_recursion`).

We need to shrink our `container` list from size $n$ all the way down to empty, and at the same time
keep track of `index` so that it points to the correct position in the original `container`.

1. We have two base cases:
    - in `lines 5-6`, we first check whether the list is empty. If it is, we have reached the end of the list without finding the `target` element, and hence return `-1`.
    - in `lines 8-9`, if the list's first element is the `target`, we return the `index` since we found it.
2. Does our recursive algorithm change its state and move towards the base case? Yes, because at each function call
    at `lines 12-14`, we slice our list with `[1:]`, which drops the first element, and move on to check whether the "next" element is our `target`.
    Here, we also need to increment `index` by 1 so that we can recover the original index when we find the `target`.
3. This is a recursive algorithm because the function calls itself at `lines 12-14`.
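
One thing to note is that `container[1:]` copies the remainder of the list on every call. A minimal sketch of a variant that leaves the list intact and only advances `index` is shown below; the name `sequential_search_recursive_no_slice` is hypothetical and not part of the implementation above.

```python
from __future__ import annotations

from typing import Sequence, TypeVar

T = TypeVar("T")


def sequential_search_recursive_no_slice(
    container: Sequence[T], target: T, index: int = 0
) -> int:
    """Hypothetical variant: recurse on the index instead of slicing the list."""
    if index >= len(container):  # base case 1: ran past the end, not found
        return -1

    if container[index] == target:  # base case 2: found
        return index

    # state change: move one position closer to the end of the container
    return sequential_search_recursive_no_slice(container, target, index + 1)


unordered_list = [1, 2, 32, 8, 17, 19, 42, 13, 0]
found_index = sequential_search_recursive_no_slice(unordered_list, 13)
missing_index = sequential_search_recursive_no_slice(unordered_list, 45)
print(found_index, missing_index)
```

Both versions still make one function call per element, so a very long list can exceed Python's recursion limit; the sketch only avoids the repeated copying.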

```{admonition} Tip
:class: tip

It is worth revisiting this recursion for revision, especially to understand how
recursion stacks function calls and pops them later.

I also think converting an iterative solution to a recursive one is easier
than thinking of the recursion from scratch. You just need to observe
which variables change **state** in the iterative version,
and make the same changes in its recursive counterpart.
```

Using Python Tutor to visualize recursive calls [here](https://pythontutor.com/render.html#code=def%20f%28container,%20target,%20index%3D0%29%3A%0A%20%20%20%20if%20len%28container%29%20%3D%3D%200%3A%20%20%23%20if%20not%20container%20is%20also%20fine%0A%20%20%20%20%20%20%20%20return%20-1%20%20%23%20not%20found%0A%0A%20%20%20%20if%20container%5B0%5D%20%3D%3D%20target%3A%20%20%23%20this%20is%20base%20case%0A%20%20%20%20%20%20%20%20return%20index%20%20%23%20found%0A%0A%20%20%20%20%23%20notice%20we%20increment%20index%20by%201%20to%20mean%20index%20%2B%3D%201%20in%20the%20iterative%20case%0A%20%20%20%20return%20f%28container%5B1%3A%5D,%20target,%20index%20%2B%201%29%20%20%23%20recursive%20case%0A%20%20%20%20%0Aunordered_list%20%3D%20%5B1,%202,%2032,%208,%2017,%2019,%2042,%2013,%200%5D%0Aprint%28f%28unordered_list,%2013%29%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false).

<iframe width="800" height="500" frameborder="0" src="https://pythontutor.com/iframe-embed.html#code=def%20f%28container,%20target,%20index%3D0%29%3A%0A%20%20%20%20if%20len%28container%29%20%3D%3D%200%3A%20%20%23%20if%20not%20container%20is%20also%20fine%0A%20%20%20%20%20%20%20%20return%20-1%20%20%23%20not%20found%0A%0A%20%20%20%20if%20container%5B0%5D%20%3D%3D%20target%3A%20%20%23%20this%20is%20base%20case%0A%20%20%20%20%20%20%20%20return%20index%20%20%23%20found%0A%0A%20%20%20%20%23%20notice%20we%20increment%20index%20by%201%20to%20mean%20index%20%2B%3D%201%20in%20the%20iterative%20case%0A%20%20%20%20return%20f%28container%5B1%3A%5D,%20target,%20index%20%2B%201%29%20%20%23%20recursive%20case%0A%20%20%20%20%0Aunordered_list%20%3D%20%5B1,%202,%2032,%208,%2017,%2019,%2042,%2013,%200%5D%0Aprint%28f%28unordered_list,%2013%29%29&codeDivHeight=400&codeDivWidth=350&cumulative=false&curInstr=0&heapPrimitives=nevernest&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false"> </iframe>

## Ordered Sequential Search

Previously, we showed how to perform sequential search on a list without assuming any order.

We noticed that when the item is not in the list, the time complexity is $\O(n)$, because we need to check every element in the list. This can be alleviated if we assume that the list is ordered, and we can stop searching when we reach an element that is greater than the element we are searching for.

For now, we will assume the list contains integers, but this can be generalized to other data types through
mapping. For example, we can map the letters of the alphabet to integers, and then perform ordered sequential search on the resulting list of integers.
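
As a rough illustration of the mapping idea, the sketch below maps lowercase letters to integers with Python's built-in `ord` and then searches the integers. The helper name `ordered_search_ints` and the sample letters are assumptions made for this example; the mapping only works because `ord` preserves alphabetical order for lowercase ASCII letters.

```python
from __future__ import annotations


def ordered_search_ints(sorted_ints: list[int], target: int) -> int:
    """Hypothetical helper: ordered sequential search over sorted integers."""
    index = 0
    for item in sorted_ints:
        if item == target:
            return index
        if item > target:  # every later item is larger, so stop early
            return -1
        index += 1
    return -1


letters = ["a", "c", "f", "k", "z"]  # already in alphabetical order
codes = [ord(ch) for ch in letters]  # map each letter to its code point

index_of_k = ordered_search_ints(codes, ord("k"))  # "k" is present
index_of_d = ordered_search_ints(codes, ord("d"))  # "d" is absent
print(index_of_k, index_of_d)
```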

### Algorithm (Iterative)

```{prf:algorithm} Basic Ordered Linear Search Algorithm (Iterative)
:label: basic_ordered_linear_search_iterative

Given an ordered list $L$ of $n$ elements with values or records $L_0, L_1, ..., L_{n-1}$
such that $L_0 \leq L_1 \leq ... \leq L_{n-1}$, and target value $T$, the following subroutine uses ordered linear search to find the index of the target $T$ in $L$.

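1. Set $i$ to 0.
2. If $L_i = T$, the search terminates successfully; return $i$. Else, go to step 3.
3. If $L_i > T$, the search terminates unsuccessfully; return $-1$. Else, go to step 4.
4. Increase $i$ by 1.
5. If $i < n$, go to step 2. Otherwise, the search terminates unsuccessfully; return $-1$.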
```

### Implementation (Iterative)

In [8]:
def ordered_sequential_search(container: Iterable[T], target: T) -> Tuple[bool, int]:
    """Sequential search for ordered container."""
    is_found = False  # flag returned alongside the index to make the result explicit
    index = 0
    for item in container:
        if item == target:
            is_found = True
            return is_found, index
        if item > target:
            # the container is sorted, so target cannot appear further on
            return is_found, -1
        index += 1
    # reached only when target is larger than every element in the container
    return is_found, -1

We do not use `enumerate` to get the index of an element
while iterating over the list, in order to minimize the
usage of built-in functions.
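
For comparison, a minimal sketch of the same ordered search written with `enumerate` might look like the following. The function name `ordered_sequential_search_enumerate` is hypothetical, and the sketch assumes integer elements.

```python
from __future__ import annotations

from typing import Iterable, Tuple


def ordered_sequential_search_enumerate(
    container: Iterable[int], target: int
) -> Tuple[bool, int]:
    """Hypothetical variant of the ordered search that uses enumerate."""
    for index, item in enumerate(container):
        if item == target:
            return True, index
        if item > target:  # sorted input: target cannot appear later
            return False, -1
    return False, -1


present = ordered_sequential_search_enumerate([0, 1, 2, 8, 13], 8)
absent = ordered_sequential_search_enumerate([0, 1, 2, 8, 13], 5)
print(present, absent)
```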

In [9]:
ordered_list = [0, 1, 2, 8, 13, 17, 19, 32, 42]
print(ordered_sequential_search(ordered_list, -1)) # smaller than smallest element
print(ordered_sequential_search(ordered_list, 45)) # larger than largest element
print(ordered_sequential_search(ordered_list, 13)) # in the middle

(False, -1)
(False, -1)
(True, 4)


#### Time Complexity

Note that for ordered sequential search, the time complexity does not change for the case
when the item is in the list.

However, for the case when the item is not in the list, the
best case is now $\O(1)$: if the first element is already greater than the element we are searching for,
we can stop searching immediately and report that it was not found.

The worst case is still $\O(n)$, which happens when the target is greater than every element, since we have to check every element in the list.

For the average case, we now make about $\frac{n}{2}$ comparisons, written as $\O(\frac{n}{2})$ in the table below, because we can stop searching as soon as we reach an element that is greater than the element we are searching for. Asymptotically this is still linear in $n$.

```{list-table} Time Complexity of Ordered Sequential Search
:header-rows: 1
:name: ordered_sequential_search_time_complexity

* - Case
  - Worst Case
  - Average Case
  - Best Case
* - Element is in the list
  - $\O(n)$
  - $\O(\frac{n}{2})$
  - $\O(1)$
* - Element is not in the list
  - $\O(n)$
  - $\O(\frac{n}{2})$
  - $\O(1)$
```
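The "not in the list" row can be illustrated with a small experiment. This is a sketch under assumptions: `count_checked_items` is a hypothetical helper, and `absent_targets` is a hand-picked list with one missing value per gap of the sample list.

```python
from __future__ import annotations


def count_checked_items(sorted_list: list[int], target: float) -> int:
    """Hypothetical helper: items inspected by an ordered sequential search."""
    checked = 0
    for item in sorted_list:
        checked += 1
        if item == target or item > target:
            break  # found, or passed the place where target would be
    return checked


ordered_list = [0, 1, 2, 8, 13, 17, 19, 32, 42]
n = len(ordered_list)

best = count_checked_items(ordered_list, -1)  # smaller than every element
worst = count_checked_items(ordered_list, 45)  # larger than every element

# one absent target per gap: before the first item, between neighbours,
# and after the last item
absent_targets = [-1, 0.5, 1.5, 5, 10, 15, 18, 25, 40, 45]
total = sum(count_checked_items(ordered_list, t) for t in absent_targets)
average = total / len(absent_targets)

print(best, worst, average)
```

With $n = 9$, the average over these absent targets is close to $\frac{n}{2}$, whereas the unordered search would inspect all $n$ items for each of them.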

#### Space Complexity

Similarly, the space complexity is still $\O(1)$.

## Further Readings

- https://www.geeksforgeeks.org/linear-search/
- https://runestone.academy/ns/books/published/pythonds/SortSearch/TheSequentialSearch.html
- https://en.wikipedia.org/wiki/Linear_search
- https://stackoverflow.com/questions/4295608/recursive-linear-search-returns-list-index
- https://ozaner.github.io/sequential-search/