## 13.2 Decrease by half

Progressing towards the base case one item at a time is slow.
This and the next section show more efficient decrease-and-conquer algorithms
that decrease the input each time by a substantial amount.

One way is to decrease the input by a **constant factor** *f* rather than by
a constant amount *a*, i.e. the size or value is reduced to *n* / *f*
rather than *n* – *a*. Usually *f* = 2, i.e. the input is decreased by half.

### 13.2.1 Problem

Consider the [exponentiation operation](../02_Sequence/02_2_operations.ipynb#2.2.2-On-integers):
compute $b^{n}$ for integers *b* (the base) and *n* (the exponent),
with non-negative *n*.

A decrease-by-one definition is similar to the factorial:

- if *n* = 0: $b^{n} = 1$
- if *n* > 0: $b^{n} = b^{n-1} \times b$.

This effectively does *n* multiplications by *b*, which takes linear time in *n*.
The algorithm follows directly from the recursive definition,
so I skip to the code. For the mathematical notation,
it's convenient to have single-letter variable names, but
when writing code they should be descriptive.

In [1]:
def power_by_one(base: int, exponent: int) -> int:
    """Return the base to the power of the exponent.

    Preconditions: exponent >= 0
    """
    if exponent == 0:
        return 1
    else:
        return power_by_one(base, exponent - 1) * base


power_by_one(3, 20) == 3**20  # test with Python's power operator

True

### 13.2.2 Algorithm

Using some properties of exponentiation,
we can halve the exponent instead of decreasing it by one,
to do fewer multiplications.
For example, 4⁶ = 4³ × 4³. Assuming 4³ requires three multiplications,
we need four instead of six multiplications to obtain 4⁶.
Halving the exponent yields the same expression twice,
so we only need to compute it once.
If the exponent is odd, we need one extra multiplication, e.g.
4⁵ = 4² × 4² × 4. The general recursive definition is:

- if *n* = 0: $b^{n} = 1$
- if *n* > 0 and is even: $b^{n} = b^{n/2} \times b^{n/2}$
- if *n* is odd: $b^{n} = b^{(n-1)/2} \times b^{(n-1)/2} \times b$.

The definition has one base case and two recurrence relations.
They cover all possible values of *n*.
For example, if *n* = 1 then the last case applies and we have
*b*¹ = *b*⁰ × *b*⁰ × *b* = 1 × 1 × *b* = *b*.

Here's the algorithm, with an auxiliary variable for the subsolution
to avoid recomputing it.

1. if *n* = 0:
   1. let *solution* be 1
1. otherwise:
   1. let *subsolution* be power(*b*, floor(*n* / 2))
   1. if *n* mod 2 = 0:
      1. let *solution* be *subsolution* × *subsolution*
   1. otherwise:
      1. let *solution* be *subsolution* × *subsolution* × *b*

The last steps could also be written as:

2. otherwise:
    1. let *subsolution* be power(*b*, floor(*n* / 2))
    1. let *solution* be *subsolution* × *subsolution*
    1. if *n* mod 2 = 1:
       1. let *solution* be *solution* × *b*

I prefer the first alternative: its intent is clearer, in my opinion.
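For comparison, here is a minimal Python sketch of the second alternative. The name `power_by_half_alt` is only illustrative and isn't used elsewhere in this book.

```python
def power_by_half_alt(base: int, exponent: int) -> int:
    """Return the base to the power of the exponent.

    Preconditions: exponent >= 0
    """
    if exponent == 0:
        return 1
    subsolution = power_by_half_alt(base, exponent // 2)
    solution = subsolution * subsolution
    if exponent % 2 == 1:  # odd exponent: one extra multiplication
        solution = solution * base
    return solution


print(power_by_half_alt(3, 20) == 3**20)
```

Both alternatives do the same multiplications. They differ only in how the odd case is expressed.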

### 13.2.3 Complexity

Each recursive call takes constant time because it does at most four arithmetic
operations: integer division, modulo and one or two multiplications.
The complexity is therefore *r* × Θ(1) = Θ(*r*),
where *r* is the number of recursive calls.

Each extra recursive call can handle up to double the value of the exponent.
With *r* recursive calls, the algorithm can handle any *n* up to $2^r$.
You've seen this exponential growth rate
[before](../11_Search/11_5_subsets.ipynb#11.5.3-Complexity):
every item added to a set doubles the number of subsets the set has.

What we're really interested in is the inverse relationship:
how $r$ grows in terms of the input $n$.
The inverse of the exponential is the logarithm: $\log_b b^y = y$
for any real number $b > 1$.
The notation $\log_b n$ is read 'logarithm of $n$ to base $b$'.
For this problem, $n = 2^r$, so $\log_2 n = \log_2 2^r = r$. We say that
the exponentiation algorithm has **logarithmic complexity** Θ($\log_2 n$).
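The relationship between *n* and *r* can be checked with a small sketch that counts the halvings without computing any power. The function name `recursive_calls` is hypothetical. For *n* > 0 the count equals floor($\log_2 n$) + 1, which Python provides as `n.bit_length()`.

```python
def recursive_calls(exponent: int) -> int:
    """Return how many recursive calls the halving algorithm makes.

    Preconditions: exponent >= 0
    """
    calls = 0
    while exponent > 0:
        exponent = exponent // 2  # each recursive call halves the exponent
        calls = calls + 1
    return calls


for n in (1, 2, 1_000, 1_000_000):
    # the two counts printed on each line are equal
    print(n, recursive_calls(n), n.bit_length())
```

The count is within one of $\log_2 n$, and that difference doesn't affect the growth rate.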

Actually, the base of the logarithm doesn't matter for complexity analysis
because it has been shown that the logarithms of the same number
in different bases only differ by a constant factor.
Thus, $\log_a n$ and $\log_b n$ have the same growth rate for
any bases $a$ and $b$, and we just write Θ($\log n$) without any base.
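As an illustration of the constant factor, the following sketch divides the base 2 logarithm by the base 10 logarithm for several values of *n*. The helper name `log_ratio` is made up for this example. The ratio should always be $\log_2 10$, up to floating-point rounding.

```python
import math


def log_ratio(n: int) -> float:
    """Return log2(n) / log10(n).

    Preconditions: n > 1
    """
    return math.log2(n) / math.log10(n)


for n in (10, 1_000, 1_000_000):
    print(n, log_ratio(n))  # about 3.32 for every n
```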

<div class="alert alert-info">
<strong>Info:</strong> Logarithms are covered in MU123 Unit&nbsp;13 Sections 4 and 5,
and in MST124 Unit&nbsp;3 Section&nbsp;4.
</div>

The safest way to analyse recursive algorithms is to write the
recursive definition of T and see which pattern it follows.
For this algorithm we have:

- if *n* = 0: T(*n*) = Θ(1)
- if *n* > 0: T(*n*) = T(floor(*n* / 2)) + Θ(1).

Whether the algorithm halves an even exponent or halves and rounds down an
odd exponent makes no difference to the complexity, so we can simply write
T(*n*) = T(*n* / 2) + Θ(1). It has been proven that such a
recursive definition leads to T(*n*) = Θ(log *n*).

<div class="alert alert-warning">
<strong>Note:</strong> If T(0) = Θ(1) and T(<em>n</em>) = T(<em>n</em> / 2) + Θ(1), then T(<em>n</em>) = Θ(log <em>n</em>).
</div>
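To see informally where the logarithm comes from, the recurrence can be unrolled. This is only a sketch under two simplifying assumptions that aren't part of the definition above: the Θ(1) term is replaced by a constant $c$, and $n$ is a power of 2, so that every halving is exact.

$$
\begin{aligned}
T(n) &= T(n/2) + c \\
     &= T(n/4) + 2c \\
     &= \dots \\
     &= T(n/2^k) + kc
\end{aligned}
$$

Setting $k = \log_2 n$ gives $T(n) = T(1) + c \log_2 n$. Since $T(1) = T(0) + c$, we obtain $T(n) = T(0) + c\,(\log_2 n + 1)$, which is Θ(log $n$).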

When introducing [run-time measurements](../02_Sequence/02_8_time.ipynb#2.8.1-Checking-growth-rates),
I noted that although we assumed that computing $b^{n}$ takes Θ(*n*) time,
for *n* constant-time multiplications,
Python's interpreter took less than linear time to compute it.
We henceforth assume exponentiation takes logarithmic time in *n*.

### 13.2.4 Code and performance

Let's implement the decrease-by-half approach.

In [2]:
def power_by_half(base: int, exponent: int) -> int:
    """Return the base to the power of the exponent.

    Preconditions: exponent >= 0
    """
    if exponent == 0:
        return 1
    else:
        subsolution = power_by_half(base, exponent // 2)
        if exponent % 2 == 0:
            return subsolution * subsolution
        else:
            return subsolution * subsolution * base


power_by_half(3, 20) == 3**20

True

Since the complexity depends only on the exponent,
to measure the run-time I always use the same base,
start with a not too small exponent and double it each time.

In [3]:
exponent = 20
while exponent <= 200:
    %timeit -r 5 -n 10_000 power_by_one(3, exponent)
    exponent = 2 * exponent

1.21 μs ± 246 ns per loop (mean ± std. dev. of 5 runs, 10,000 loops each)
1.83 μs ± 45 ns per loop (mean ± std. dev. of 5 runs, 10,000 loops each)
3.86 μs ± 13.4 ns per loop (mean ± std. dev. of 5 runs, 10,000 loops each)
10.5 μs ± 133 ns per loop (mean ± std. dev. of 5 runs, 10,000 loops each)


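Each doubling of the exponent multiplies the run-time by a factor of
between about 1.5 and 2.7, i.e. it roughly doubles,
which is consistent with the algorithm being linear in the exponent.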
Now the decrease-by-half approach.

In [4]:
exponent = 20
while exponent <= 200:
    %timeit -r 5 -n 10_000 power_by_half(3, exponent)
    exponent = 2 * exponent

342 ns ± 4.37 ns per loop (mean ± std. dev. of 5 runs, 10,000 loops each)
413 ns ± 1.22 ns per loop (mean ± std. dev. of 5 runs, 10,000 loops each)
496 ns ± 0.582 ns per loop (mean ± std. dev. of 5 runs, 10,000 loops each)
592 ns ± 0.717 ns per loop (mean ± std. dev. of 5 runs, 10,000 loops each)


The run-time increases by a roughly fixed amount each time
because doubling the exponent requires just one extra recursive call.

<div class="alert alert-warning">
<strong>Note:</strong> If doubling the input size increases the run-time by a fixed amount,
then the complexity is logarithmic.
</div>
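The fixed increase in the run-times above can be cross-checked by counting multiplications instead of measuring time. This is a sketch with a hypothetical helper, `multiplications_by_half`, that mirrors the structure of `power_by_half` but computes no power.

```python
def multiplications_by_half(exponent: int) -> int:
    """Return how many multiplications the halving algorithm does.

    Preconditions: exponent >= 0
    """
    if exponent == 0:
        return 0
    # one multiplication for an even exponent, two for an odd one
    here = 1 if exponent % 2 == 0 else 2
    return multiplications_by_half(exponent // 2) + here


for exponent in (20, 40, 80, 160):
    print(exponent, multiplications_by_half(exponent))
```

For these four exponents the counts are 7, 8, 9 and 10: each doubling adds exactly one multiplication, whereas the decrease-by-one approach does 20, 40, 80 and 160 multiplications.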

An exponential function with integer base greater than one grows very fast;
the logarithm function with the same base thus grows very slowly.
For example, $2^{20}$ is about one million, so computing $b^{1{,}000{,}000}$
takes just $\log_2 1{,}000{,}000 \approx 20$ recursive calls!
Even if each one does two multiplications (the worst case),
40 multiplications is far better than doing a million of them.
The efficiency gain is tremendous, even compared to a linear algorithm.
If you find a logarithmic algorithm for a problem, you're on to a winner.
