## 13.5 Binary search variants

Binary search can be adapted to other problems, for example
if we don't know the value being searched for, only some property of it.

To design a binary search we must answer these questions:

1. When can we know the search is unsuccessful and stop?
2. When can we know the search is successful and stop?
3. How do we decide whether to search the left or the right half of the sequence?

For the basic binary search in the previous section, the answers are:

1. When the sequence is empty.
2. When the middle item is the sought item.
3. If the sought item is smaller than the middle item, search the left half;
   otherwise search the right half.
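
As a reminder, the three answers can be read off an iterative sketch of the basic search. The code of the previous section isn't reproduced here, so the function name `has_item` and its parameters are hypothetical, and the sketch assumes the items are in ascending order.

```python
def has_item(items: list, item) -> bool:
    """Return True if and only if item occurs in items.

    Preconditions: items is in ascending order
    """
    start = 0
    end = len(items)
    # Answer 1: the search is unsuccessful when the slice items[start:end] is empty.
    while start < end:
        middle = start + (end - start) // 2
        # Answer 2: the search is successful when the middle item is the sought item.
        if items[middle] == item:
            return True
        # Answer 3: choose the left or the right half, excluding the middle item.
        if item < items[middle]:
            end = middle
        else:
            start = middle + 1
    return False


print(has_item([1, 3, 5, 7], 5))  # True
print(has_item([1, 3, 5, 7], 4))  # False
```

The variants in this section change one or more of these three answers while keeping the halving of the slice.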

### 13.5.1 Transition

Consider the problem of finding the transition between negative and positive
numbers in an ascending sequence. More precisely, we want the first
positive number, i.e. the one with the lowest index. Let's assume there's always at least one.
We don't know its value: it might be 1, 5486, or anything else.
Still, we can use binary search to find it.

<div class="alert alert-info">
<strong>Info:</strong> This problem is inspired by LeetCode problem
<a href="https://leetcode.com/problems/first-bad-version/">278</a>.
</div>

We're assuming there's a positive number in the sequence, so
the search is always successful and question 1 doesn't apply.

What's the answer to question 2, i.e. when can we stop, having found
a positive number? Can we stop when the middle number is positive?

___

No, we can't, because there might be other positive numbers to its left.
We only stop when the current slice has a single number: it must be positive.

What's the answer to question 3? How do we determine which half to search?

___

If the middle number is not positive, any positive numbers must come after it
because the sequence is ascending, so we search the right half.
Otherwise we search the left half. We can't exclude the middle item when
searching the left half, because it might be the first positive number.

Here's an algorithm for function first_positive(*numbers*, *start*, *end*),
with 0 ≤ *start* < *end* ≤ │*numbers*│.
Contrary to previous binary searches, the input sequence isn't empty
(there's at least one positive number),
so the start index is strictly smaller than the end index.

1. if *end* − *start* = 1:
   1. let *first* be *numbers*[*start*]
2. otherwise:
   1. let *middle* be *start* + floor((*end* – *start*) / 2)
   1. let *middle item* be *numbers*[*middle*]
   1. if *middle item* > 0:
      1. let *first* be first_positive(*numbers*, *start*, *middle* + 1)
   2. otherwise:
      1. let *first* be first_positive(*numbers*, *middle* + 1, *end*)

Since we're working on slices, we're following the convention of not including
the end index. Step&nbsp;2.3.1 must therefore set it to *middle* + 1 in order to
include the middle item.

This raises the question of whether the slice is always reducing its length.
Length one is already handled as a base case by step&nbsp;1.
Let's assume the length is two, i.e. *end* – *start* = 2. In that case

*middle* = *start* + floor((*end* – *start*) / 2) = *start* + floor(2 / 2) = *start* + 1

which means that the middle number is the second and last number of the slice.
It can't be zero or negative, because then any positive number would have to come after it, but there are no more numbers in a slice of length&nbsp;2.
So, the middle (actually last) number of the two must be positive.
The algorithm will execute step&nbsp;2.3.1 but *middle* + 1 = *start* + 2 = *end*, which means that the recursive call will be made on the same slice.
To sum up, when the slice has only two numbers, the second, which must be positive, is chosen as the middle number and the recursive call doesn't decrease the slice.

We must handle this input size as a separate base case and choose
either the first number, if both are positive, or else the second number.

2. otherwise if *end* – *start* = 2:
   1. if *numbers*[*start*] > 0:
      1. let *first* be *numbers*[*start*]
   1. otherwise:
      1. let *first* be *numbers*[*start* + 1]
3. otherwise:
   1. ...

Next we must analyse the case *end* – *start* = 3. Again,

*middle* = *start* + floor((*end* – *start*) / 2) = *start* + floor(3 / 2) = *start* + 1

but now this means that *start* < *middle* + 1 < *end*. So, whether
the algorithm takes the left half (*start* to *middle* + 1) or
the right half (*middle* + 1 to *end*), the new slice is smaller than the
input slice from *start* to *end* and there's no risk of infinite recursion.
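
The length analysis above can be checked mechanically. The following is a minimal sketch, not part of the algorithm itself: the helper `half_lengths` is a hypothetical name, and it only computes the lengths of the two sub-slices that the recursive step would produce for a slice of a given length.

```python
def half_lengths(length: int) -> tuple[int, int]:
    """Return the lengths of the left and right sub-slices.

    Preconditions: length >= 2
    """
    start = 0
    end = length
    middle = start + (end - start) // 2
    left = (middle + 1) - start  # left half goes from start to middle + 1
    right = end - (middle + 1)  # right half goes from middle + 1 to end
    return left, right


for length in range(2, 7):
    left, right = half_lengths(length)
    shrinks = left < length and right < length
    print(length, left, right, shrinks)
```

Only length 2 prints `False`: its left sub-slice still has two numbers. For lengths 3 to 6, both sub-slices are shorter than the input slice.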

<div class="alert alert-warning">
<strong>Note:</strong> Check that the recursive calls reduce the input's size or value.
Any case for which they don't must be handled as a base case.
</div>

#### Exercise 13.5.1

Implement the inner auxiliary function below recursively
and run the tests.

In [1]:
from algoesup import test


def first_positive(numbers: list) -> int:
    """Return the first (lowest index) positive integer in numbers.

    Preconditions:
    - numbers is a list of integers in ascending order
    - numbers has a positive integer
    """

    def in_slice(start: int, end: int) -> int:
        """Return the first positive number within numbers[start:end].

        Preconditions: 0 <= start < end <= len(numbers)
        """
        pass

    return in_slice(0, len(numbers))


first_positive_tests = [
    # case,             numbers,            first
    ('one number',      [1],                    1),
    ('is last',         [-2, -2, 0, 3],         3),
    ('all positive',    [2, 3, 4],              2),
    ('all but first',   [0, 1, 2, 2, 2, 2, 2],  1),
]

test(first_positive, first_positive_tests)

[Answer](../32_Answers/Answers_13_5_01.ipynb)

#### Exercise 13.5.2

Implement the function iteratively. The docstring isn't repeated.

In [2]:
def first_positive(numbers: list) -> int:  # noqa: D103
    pass


test(first_positive, first_positive_tests)

[Hint](../31_Hints/Hints_13_5_02.ipynb)
[Answer](../32_Answers/Answers_13_5_02.ipynb)

There's a more efficient version with a more general base case:
if the start number is positive, no matter how long the slice is,
then we've found the first positive integer and we can stop.
Once a slice of only positive numbers is obtained, this version stops
whereas the version above continues decreasing the slice until
it has only one or two numbers. This new version still has worst-case complexity
O(log │*numbers*│), but it's more efficient for some inputs:
in the best case, when all numbers are positive, it stops immediately.
The recursive algorithm starts as follows:

1. if *numbers*[*start*] > 0:
   1. let *first* be *numbers*[*start*]
2. otherwise if *end* – *start* = 2:
   1. let *first* be *numbers*[*start* + 1]
3. otherwise:
   1. ...

If the slice has two numbers (step&nbsp;2 is true) and the first one isn't positive
(step&nbsp;1 is false), then the second one must be positive (step&nbsp;2.1).
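
Here is one possible sketch of this version, assuming the elided step is the same recursive step as in the earlier algorithm. The name `first_positive_early_stop` is hypothetical, chosen so that it doesn't clash with the exercise functions.

```python
def first_positive_early_stop(numbers: list) -> int:
    """Return the first (lowest index) positive integer in numbers.

    Preconditions:
    - numbers is a list of integers in ascending order
    - numbers has a positive integer
    """

    def in_slice(start: int, end: int) -> int:
        """Return the first positive number within numbers[start:end].

        Preconditions: 0 <= start < end <= len(numbers)
        """
        # General base case: the slice starts with a positive number.
        if numbers[start] > 0:
            return numbers[start]
        # Two numbers and the first isn't positive: the second must be.
        if end - start == 2:
            return numbers[start + 1]
        # Assumed recursive step, taken from the earlier algorithm.
        middle = start + (end - start) // 2
        if numbers[middle] > 0:
            return in_slice(start, middle + 1)
        return in_slice(middle + 1, end)

    return in_slice(0, len(numbers))


print(first_positive_early_stop([-2, -2, 0, 3]))  # 3
print(first_positive_early_stop([2, 3, 4]))  # 2
```

A slice of length one needs no separate base case in this sketch: the slice always contains a positive number, so its only number is positive and the first check applies.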

### 13.5.2 Right number in the right place

Given *numbers*, an ascending sequence of integers without duplicates,
we want to know if there's an index *i* such that *numbers*[*i*] = *i*.
For example, (1, 2, 3) doesn't have any number that matches its index,
but for (-1, 0, 2), number 2 is at index&nbsp;2.

Like the previous problem, this one can be easily solved with a linear search,
but you can do much better than that.
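
For comparison, a linear search for this problem could look like the sketch below. The function name `has_matching_index` is hypothetical. It can serve as a baseline against which to test a binary search solution.

```python
def has_matching_index(numbers: list) -> bool:
    """Return True if and only if some numbers[i] == i.

    Preconditions: numbers is a list of integers
    """
    # Check every index in turn: linear in the length of numbers.
    for index, number in enumerate(numbers):
        if number == index:
            return True
    return False


print(has_matching_index([1, 2, 3]))  # False
print(has_matching_index([-1, 0, 2]))  # True
```

This sketch doesn't use the fact that the numbers are ascending and without duplicates, which is what a binary search can exploit.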

#### Exercise 13.5.3

Solve this decision problem using iterative or recursive binary search.
You can outline an algorithm or write it in full, whatever you prefer.
Think about the three questions at the start of this section.

_Write your answer here._

[Hint](../31_Hints/Hints_13_5_03.ipynb)
[Answer](../32_Answers/Answers_13_5_03.ipynb)

⟵ [Previous section](13_4_binary_search.ipynb) | [Up](13-introduction.ipynb) | [Next section](13_6_divide.ipynb) ⟶