## 13.3 Variable decrease

The third kind of decrease-and-conquer algorithm reduces the size or value of
the input by a **variable amount**,
i.e. a possibly different amount in every recursive call or iteration.
An example is Euclid's algorithm for the greatest common divisor (GCD),
which you saw in the BBC programme in the
[first week](../01_Introduction/01_1_expectations.ipynb#1.1.1.1-Activities).

### 13.3.1 Problem

The GCD of two positive integers is the greatest positive integer
that divides both without a remainder, e.g. gcd(5, 15) = 5 and gcd(3, 7) = 1.
It can be formulated as a search problem:
given positive integers _a_ and _b_, find the largest integer _n_ that
is a factor of _a_ and of _b_.

#### Exercise 13.3.1

Before looking at Euclid's algorithm, sketch a brute-force search algorithm.

_Write your answer here._

[Hint](../31_Hints/Hints_13_3_01.ipynb)
[Answer](../32_Answers/Answers_13_3_01.ipynb)

### 13.3.2 Algorithm

The BBC programme gave a geometric explanation of Euclid's algorithm:
the GCD is the side of the largest square tile that
covers a rectangle of a given width and length.
If you want to watch it again, it's in [part&nbsp;1](https://learn2.open.ac.uk/mod/oucontent/view.php?id=1550051)
from about 5:30 to 8:30.

We start with a rectangle of area _width_ × _length_,
with _width_ < _length_.
We then fill the rectangle as much as possible with square tiles
of area _width_ × _width_.
The rectangle that remains to be filled now has area
_width_ × (_length_ mod _width_).
By definition of the modulo operation, _length_ mod _width_ < _width_,
so if we take the length to always be the longest side, the width of
the original rectangle is now the length of the remaining rectangle.

We repeat the process until no rectangle to be filled remains,
i.e. when one of its sides is zero.
At that point, the side of the last tile used is the GCD.
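
The tiling process can be sketched in Python to see which tiles are used at each step.
This is only an illustrative sketch: the function name `tiling_steps` and its
return value, a list of (tile side, number of tiles) pairs, are assumptions made
for this example and aren't part of the programme's explanation.

```python
def tiling_steps(width: int, length: int) -> list:
    """Return the (tile side, number of tiles) pairs used to fill the rectangle.

    Hypothetical helper for illustration; assumes 0 < width <= length.
    """
    steps = []
    while width > 0:
        # Fill as much as possible with square tiles of side `width`.
        steps.append((width, length // width))
        # The unfilled rectangle has sides `length % width` and `width`;
        # the old width is now the longest side, so it becomes the length.
        width, length = length % width, width
    return steps


print(tiling_steps(150, 345))  # [(150, 2), (45, 3), (15, 3)]
```

For the 150 × 345 rectangle, the sketch uses two tiles of side 150,
three of side 45 and three of side 15, so the side of the last tile, 15, is the GCD.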

If we define the operation so that the first argument is the width
and the second the length, and we allow the width to become zero,
the definition is:

- if _width_ = 0: gcd(_width_, _length_) = _length_
- if _width_ > 0: gcd(_width_, _length_) = gcd(_length_ mod _width_, _width_).

The programme's example is:
gcd(150, 345) = gcd(45, 150) = gcd(15, 45) = gcd(0, 15) = 15.

The recursive algorithm is trivial: it just follows the recursive definition.

1. if _width_ = 0:
   1. let _factor_ be _length_
2. otherwise:
   1. let _factor_ be gcd(_length_ mod _width_, _width_)
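
The recursive algorithm above might be translated to Python as follows.
This is a minimal sketch, assuming non-negative integer arguments;
the function name `gcd_recursive` is chosen for this example, and Python's `%` operator
plays the role of mod.

```python
def gcd_recursive(width: int, length: int) -> int:
    """Return the greatest common divisor of width and length.

    Illustrative sketch; assumes width >= 0 and length > 0.
    """
    if width == 0:
        # Base case: no rectangle remains, so the length is the GCD.
        factor = length
    else:
        # Reduce the problem: the remainder becomes the new width
        # and the old width becomes the new length.
        factor = gcd_recursive(length % width, width)
    return factor


print(gcd_recursive(150, 345))  # 15
```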

Because the algorithm is tail-recursive, the output of the last call is
the output of the initial call: there's no combination step.
The algorithm just decreases the input
until it computes the solution for the base case.
This can be done iteratively too.

1. while _width_ > 0:
   1. let _new width_ be _length_ mod _width_
   2. let _length_ be _width_
   3. let _width_ be _new width_
2. let _factor_ be _length_
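
The iterative version could be written in Python along these lines.
Again, this is a sketch under the same assumptions (non-negative integers);
the names `gcd_iterative` and `new_width` are chosen for this example to mirror the steps above.

```python
def gcd_iterative(width: int, length: int) -> int:
    """Return the greatest common divisor of width and length.

    Illustrative sketch; assumes width >= 0 and length > 0.
    """
    while width > 0:
        new_width = length % width  # side of the rectangle still to be filled
        length = width              # the old width is now the longest side
        width = new_width
    factor = length
    return factor


print(gcd_iterative(150, 345))  # 15
```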

As Euclid's algorithm shows, decrease and conquer is an old technique that
predates computers. All proper books on algorithms include an ancient algorithm.

### 13.3.3 Complexity

In the best case, the length is a multiple of the width, i.e. the modulo is
zero, and the algorithm computes the result in two calls or one iteration,
e.g. gcd(4, 400) = gcd(0, 4) = 4. The best-case complexity is Θ(1).

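In the worst case, it can be proven that the second argument, the length,
decreases by at least half over every two consecutive recursive calls or iterations,
as the BBC's example illustrates. A single call may decrease it by less,
e.g. gcd(8, 13) = gcd(5, 8), but halving every two calls is enough
to make this algorithm logarithmic too.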

However, the modulo can decrease the length to substantially less than one half.
In the example, the length decreases at one point from 150 to 45,
to less than a third, and then again from 45 to 15, exactly one third.
The number of calls or iterations is at most proportional to the logarithm of
the initial length, but for many input values it's much less.
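
To get a feel for how much the number of iterations varies, the iterative
algorithm can be instrumented with a counter. The sketch below is hypothetical:
the name `gcd_iterations` is made up for this example, and the function
returns the number of loop iterations rather than the GCD.

```python
def gcd_iterations(width: int, length: int) -> int:
    """Return how many loop iterations Euclid's algorithm performs.

    Illustrative sketch; assumes 0 < width <= length.
    """
    iterations = 0
    while width > 0:
        width, length = length % width, width
        iterations += 1
    return iterations


print(gcd_iterations(4, 400))    # 1: the best case, length is a multiple of width
print(gcd_iterations(150, 345))  # 3: the programme's example
print(gcd_iterations(89, 144))   # 10: consecutive Fibonacci numbers
```

Consecutive Fibonacci numbers, such as 89 and 144, are known to be slow inputs
for their size, because each modulo operation removes only one copy of the width.
Even so, the number of iterations stays small compared to the length.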

When we don't know the exact growth rate for each input, e.g. because
each iteration or recursive call decreases a value by a variable amount,
but we know its upper bound, we use the **Big-Oh** notation.
For this algorithm, we write O(log _length_) instead of Θ(log _length_).

The difference between using Big-Oh and Big-Theta is like saying
'Bob is at most 6 feet tall' and 'Bob is exactly 6 feet tall':
the former gives an upper bound whereas the latter is precise.

<div class="alert alert-warning">
<strong>Note:</strong> For decrease-and-conquer algorithms with a variable decrease,
use Big-Oh notation instead of Big-Theta.
</div>

The Big-Oh notation can be used for the best- and worst-case complexity.
For example, if the best-case complexity is O(_n_), this means that
in the best case the algorithm takes at most linear time in _n_
(whatever _n_ means for the problem at hand), but it may take less time,
e.g. logarithmic time; we just don't know for which inputs.
Normally, the best-case scenario is quite clear and the complexity
can be stated precisely, with the Big-Theta notation.

The worst-case complexity is always an upper bound of the best-case complexity,
so on many websites and in many texts you'll read statements of the form
'this algorithm has complexity O(...)'. What those authors often mean is
that the algorithm has worst-case complexity Θ(...)
and some lower complexity for the best case, which they're not interested in.

Every growth rate is an upper bound for a slower growth rate, e.g.
2ⁿ grows faster than *n*² which in turn grows faster than _n_.
So an algorithm with complexity Θ(_n_) also has complexity O(*n*²) and O(2ⁿ).
Although that's technically correct, it's useless information.
It's akin to saying that Bob is at most 3 metres tall:
it really doesn't give a clue about Bob's real height.
If you have to use Big-Oh notation because you can't state the
best- or worst-case complexity precisely for all inputs, then
give an upper bound as low as you can.
