## 13.6 Divide and conquer

A decrease-and-conquer algorithm may divide the input into multiple parts,
but only conquers (i.e. solves the problem for) one of them.
Binary search is an example: the input sequence is
divided into two halves, but only one of them is searched.
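
To make the contrast concrete, here is a minimal Python sketch of binary search written recursively. The function name, the parameter names and the use of –1 to signal an absent item are illustrative assumptions, not necessarily how earlier chapters present the algorithm.

```python
def binary_search(items: list, target, start: int = 0, end=None) -> int:
    """Return an index of target in the sorted items[start:end], or -1 if absent."""
    if end is None:
        end = len(items)
    if start >= end:                 # base case: nothing left to search
        return -1
    middle = start + (end - start) // 2
    if items[middle] == target:
        return middle
    if items[middle] < target:       # conquer the right half only
        return binary_search(items, target, middle + 1, end)
    return binary_search(items, target, start, middle)  # conquer the left half only
```

Each call makes at most one recursive call, which is what makes this decrease and conquer rather than divide and conquer.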

A divide-and-conquer algorithm conquers more than one part, usually all of them,
and then combines their solutions.
The [multiple recursion](../12_Recursion/12_7_multiple.ipynb#12.7-Multiple-recursion) examples
in the previous chapter are divide-and-conquer algorithms.

### 13.6.1 Complexity

Let *n* be the size of the input and
*s* be the size of the smallest input, which is necessarily a base case.
Let's assume the algorithm divides the input into *p* parts of
equal or nearly equal size. Then its complexity is defined by

- if *n* = *s*: T(*n*) = Θ(*b*)
- if *n* > *s*: T(*n*) = Θ(*d*) + *p* × T(*n* / *p*) + Θ(*c*)

where Θ(*b*) is the complexity of handling the base case,
Θ(*d*) is the complexity of dividing the input and
Θ(*c*) is the complexity of combining the subsolutions for the parts.

The expression *p* × T(*n* / *p*) is the time it takes to solve
the *p* subproblems, each of size *n* / *p*.

Let's analyse the complexity of the two divide-and-conquer algorithms for
computing the maximum of a sequence, presented in the previous chapter:
one slices the input, the other uses indices *start* and *end*.
Remember that the input sequence isn't empty.

#### Maximum with slicing

The first algorithm presented was:

1. let *n* be │*numbers*│
1. if *n* = 1:
   1. let *solution* be head(*numbers*)
1. otherwise:
   1. let *middle* be floor(*n* / 2)
   1. let *largest left* be maximum(*numbers*[0:*middle*])
   1. let *largest right* be maximum(*numbers*[*middle*:*n*])
   1. let *solution* be max(*largest left*, *largest right*)
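
The steps above could be translated to Python as follows. This is a sketch: the name `maximum_slicing` is chosen here to tell the two versions apart and may differ from the code in the previous chapter.

```python
def maximum_slicing(numbers: list):
    """Return the largest item of a non-empty list, dividing it by slicing."""
    n = len(numbers)
    if n == 1:
        return numbers[0]
    middle = n // 2
    largest_left = maximum_slicing(numbers[:middle])    # the slice copies the left half
    largest_right = maximum_slicing(numbers[middle:])   # the slice copies the right half
    return max(largest_left, largest_right)
```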

The base case has size *s* = 1 and takes constant time to process (step&nbsp;2.1).
Steps 3.1 to 3.3 divide the input into *p* = 2 parts. This takes linear time,
because the slices in steps 3.2 and 3.3 together copy all *n* items.
Step&nbsp;3.4 takes constant time to combine the subsolutions.
We have:

- if *n* = 1: T(*n*) = Θ(1)
- if *n* > 1: T(*n*) = Θ(*n*) + 2 × T(*n* / 2) + Θ(1) = 2 × T(*n* / 2) + Θ(*n*).

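It has been proven that this corresponds to T(*n*) = Θ(*n* × log *n*).
Informally, the input is halved about log₂ *n* times before reaching size 1, and
the dividing work of all calls at the same level of recursion adds up to Θ(*n*).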
This is called **log-linear** complexity.
It's slightly worse than linear but much better than quadratic complexity,
because logarithmic run-times grow very slowly as the input size grows.
In maths, the multiplication operator is omitted when that causes no confusion,
so we usually write Θ(*n* log *n*).
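
The result can be checked numerically for small cases. The sketch below assumes, purely for illustration, that every Θ term costs exactly as many steps as its argument, i.e. T(1) = 1 and T(*n*) = 2 × T(*n* / 2) + *n*, and that *n* is a power of 2. Under those assumptions the recurrence evaluates to exactly *n* log₂ *n* + *n*. The function name is hypothetical.

```python
import math

def cost_linear_overhead(n: int) -> int:
    """Evaluate T(1) = 1 and T(n) = 2 * T(n / 2) + n, for n a power of 2."""
    if n == 1:
        return 1
    return 2 * cost_linear_overhead(n // 2) + n

# Compare the recurrence against the closed form n * log2(n) + n.
for n in [1, 2, 4, 8, 1024]:
    closed_form = n * int(math.log2(n)) + n
    print(n, cost_linear_overhead(n), closed_form)
```

Both columns printed after *n* agree, and the *n* log₂ *n* term dominates as *n* grows.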

<div class="alert alert-info">
<strong>Info:</strong> Log-linear complexity is also called linearithmic complexity.
</div>

#### Maximum without slicing

The second version presented was:

1. if *start* + 1 = *end*:
   1. let *solution* be *numbers*[*start*]
2. otherwise:
   1. let *middle* be *start* + floor((*end* – *start*) / 2)
   1. let *largest left* be maximum(*numbers*, *start*, *middle*)
   1. let *largest right* be maximum(*numbers*, *middle*, *end*)
   1. let *solution* be max(*largest left*, *largest right*)
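
A possible Python rendering of this version follows. Again, the name `maximum_indices` is an assumption made here to distinguish it from the slicing version. The function finds the largest item from index *start* up to, but excluding, index *end*.

```python
def maximum_indices(numbers: list, start: int, end: int):
    """Return the largest item of numbers[start:end], assuming start < end."""
    if start + 1 == end:
        return numbers[start]
    middle = start + (end - start) // 2
    largest_left = maximum_indices(numbers, start, middle)   # no copying: only indices change
    largest_right = maximum_indices(numbers, middle, end)
    return max(largest_left, largest_right)
```

To process a whole non-empty list `values`, call `maximum_indices(values, 0, len(values))`.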

The base case has size *s* = 1 and takes constant time to process (step&nbsp;1.1).
Steps 2.1 to 2.3 take constant time to divide the input into *p* = 2 parts.
Step&nbsp;2.4 takes constant time to combine the subsolutions.
We have:

- if *n* = 1: T(*n*) = Θ(1)
- if *n* > 1: T(*n*) = Θ(1) + 2 × T(*n* / 2) + Θ(1) = 2 × T(*n* / 2) + Θ(1).

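It has been proven that this corresponds to T(*n*) = Θ(*n*).
Informally, the algorithm makes Θ(*n*) calls in total, and
each call does a constant amount of work besides its recursive calls.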
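
One way to see why the complexity is linear is to count the calls. The sketch below mirrors the recursion of the algorithm without slicing but only counts how many calls are made for an input of size *n*. The function name is hypothetical.

```python
def count_calls(n: int) -> int:
    """Return the number of calls made to find the maximum of n >= 1 items."""
    if n == 1:
        return 1                      # the base case is a single call
    middle = n // 2
    # this call, plus the calls for the left and right parts
    return 1 + count_calls(middle) + count_calls(n - middle)

for n in [1, 2, 3, 10, 1000]:
    print(n, count_calls(n), 2 * n - 1)
```

The count is always 2 × *n* – 1: there are *n* base-case calls, one per item, and *n* – 1 calls that combine two subsolutions.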

#### General comments

The direct expressions for T(*n*) remain the same for any other *p* > 2,
as long as dividing and combining take constant time. In other words,
dividing into more than two parts and combining their results doesn't reduce the
complexity, but it complicates the algorithm and tends to increase the run-time.
Therefore, most divide-and-conquer algorithms just divide the input into halves.

<div class="alert alert-warning">
<strong>Note:</strong> If T(<em>s</em>) = Θ(1), where <em>s</em> is the smallest input size, and
T(<em>n</em>) = <em>p</em> × T(<em>n</em> / <em>p</em>) + Θ(1) for <em>n</em> > <em>s</em> and <em>p</em> > 1,
then T(<em>n</em>) = Θ(<em>n</em>).
If instead T(<em>n</em>) = <em>p</em> × T(<em>n</em> / <em>p</em>) + Θ(<em>n</em>), then T(<em>n</em>) = Θ(<em>n</em> log <em>n</em>).
</div>
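
The note can be illustrated by evaluating both recurrences for several values of *p*. The sketch below makes simplifying assumptions that aren't part of the note: each Θ term costs exactly as many steps as its argument, T(1) = 1, and *n* is a power of *p*. The function and parameter names are hypothetical.

```python
def cost(n: int, p: int, linear: bool) -> int:
    """Evaluate T(1) = 1 and T(n) = p * T(n / p) + f(n), for n a power of p.

    The overhead f(n) is n if linear is true, otherwise it is 1.
    """
    if n == 1:
        return 1
    overhead = n if linear else 1
    return p * cost(n // p, p, linear) + overhead

for p in [2, 3, 4]:
    n = p ** 6
    # Dividing by n shows how many steps are taken per input item.
    print(p, n, cost(n, p, False) / n, cost(n, p, True) / n)
```

With constant overhead, the steps per item stay below 2 whatever the value of *p*, which is consistent with Θ(*n*). With linear overhead, the steps per item are log<sub>*p*</sub> *n* + 1, which is consistent with Θ(*n* log *n*).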

If a divide-and-conquer algorithm, like the one above, does the same steps
for all inputs, i.e. there's no input for which it stops early, then
the complexity obtained is both the best- and worst-case complexity.
Otherwise, the recursive definition captures the worst-case complexity.

The analysis shows that it's not worth computing the maximum with
a divide-and-conquer algorithm: it isn't more efficient than a much
simpler iterative linear search. The next chapter presents two examples
in which divide and conquer pays off.

Divide and conquer is a good approach if implemented in a parallel fashion
to take advantage of multi-processor hardware.
Each recursive call can be executed as a separate thread that
works independently on its part of the input. The operating system allocates
each thread to an available processor, reducing the time the user waits
for the result, compared to executing the algorithm in one thread.
Writing a multi-threaded algorithm requires special libraries or
programming language constructs that are outside the scope of M269.
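
For the curious, here is a minimal sketch of the idea, using the `threading` module of Python's standard library. It is only an illustration of the structure, under several assumptions: the function name and the *depth* parameter are hypothetical, and new threads are created only for the first few levels of recursion, to keep their number small. In the standard Python interpreter, threads typically don't execute Python code simultaneously, so this sketch isn't expected to run faster than the sequential version.

```python
import threading

def maximum_threaded(numbers: list, start: int, end: int, depth: int = 2):
    """Return the largest item of numbers[start:end], assuming start < end.

    The left part is conquered in a new thread while depth > 0.
    """
    if start + 1 == end:
        return numbers[start]
    middle = start + (end - start) // 2
    if depth == 0:  # deep enough: continue sequentially in the current thread
        return max(maximum_threaded(numbers, start, middle, 0),
                   maximum_threaded(numbers, middle, end, 0))

    result = {}  # the thread stores its subsolution here

    def conquer_left():
        result["left"] = maximum_threaded(numbers, start, middle, depth - 1)

    thread = threading.Thread(target=conquer_left)
    thread.start()                    # the left part is handled by the new thread
    largest_right = maximum_threaded(numbers, middle, end, depth - 1)
    thread.join()                     # wait for the left subsolution
    return max(result["left"], largest_right)
```

The two parts don't overlap and the list isn't modified, so the threads can work independently until their subsolutions are combined.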

<div class="alert alert-info">
<strong>Info:</strong> The Operating Systems block of TM129 introduces threads.
</div>
