## 2.7 Complexity

We want algorithms to be correct *and* fast, especially on large inputs.
The run-time of an algorithm, implemented as a Python function,
depends on the hardware, operating system and Python interpreter we're using,
and whether other processes are running in the background,
like checking for software updates.

Computer scientists found a way of talking about algorithms
that is independent of all these factors. Instead of getting bogged down with
the exact run-times for particular input values,
we look at how the run-times increase for ever-larger inputs. In other words,
what we really want to know is how well (or not) an algorithm copes with growing inputs.

### 2.7.1 Constant complexity

The algorithms that best cope with growing inputs are those where
the run-time stays roughly the same, no matter how small or large the input is.
Such algorithms are said to have constant run-time or **constant complexity**.
The term 'constant' doesn't mean that the run-time stays *exactly* the same
for all inputs: it means that it doesn't grow.

The **complexity** of an algorithm is the growth rate of its run-times
as inputs get larger, when executed on the same computational environment
(hardware, operating system, programming language and interpreter).
The complexity is *not* about how fast the algorithm runs. For example,
an addition algorithm that would take a whole day to find out the sum of 3 and 4
but also takes one day (in the same environment) to add two 500-digit numbers
would have constant complexity. A constant complexity algorithm may be slow,
but it won't get slower for larger inputs.

A simple way to see if an algorithm has constant complexity is
to implement the algorithm in some computational environment,
run it with ever larger inputs, measure the run-times
and see if they remain more or less the same.
A better approach is to determine the complexity of an algorithm
before implementing it, from its English description.
This prevents wasting effort in coding and testing algorithms
that turn out to be inefficient.
To determine the complexity of an algorithm
we have to agree on the complexity of each operation it uses.
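The measuring approach mentioned above can be sketched as follows. This is only an illustration: the names `add` and `measure` and the chosen inputs are my own assumptions, and the measured times will differ from one computational environment to another.

```python
from timeit import timeit


def add(x: int, y: int) -> int:
    """Return the sum of x and y."""
    return x + y


def measure(inputs: list, repeat: int = 100_000) -> list:
    """Return, for each input, the seconds taken by `repeat` additions."""
    times = []
    for value in inputs:
        # timeit calls the given function `repeat` times and returns the total time
        times.append(timeit(lambda: add(value, value), number=repeat))
    return times


inputs = [3, 3_000_000, 3_000_000_000_000]  # ever larger inputs
for value, seconds in zip(inputs, measure(inputs)):
    print(value, seconds)
```

If the printed times remain more or less the same as the inputs grow, that suggests (but doesn't prove) constant complexity.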

M269 covers general-purpose algorithms, not specialised ones that
require humongous numbers with hundreds of digits, like in cryptography.
Even though Python supports arbitrarily large integers,
64-bit integers and floats are large enough for our purposes.

```python
(2**63) - 1  # largest 64-bit integer; 1 bit is for the sign
```

```
9223372036854775807
```

Modern processors can do arithmetic operations on two 64-bit numbers
with a single hardware instruction, so for M269's purposes
we can assume that all [arithmetic operations](../02_Sequence/02_2_operations.ipynb#2.2-Arithmetic-operations)
(except exponentiation, which I explain later) have constant complexity.
We're *not* assuming that, for example, multiplication takes the same time as addition, but
rather that adding 3 and 7 takes about the same time as adding 3 million and 7 million,
and that multiplying 3 and 7 takes about the same time as multiplying 3 million and 7 million.

We also assume that assignments and return statements
have constant complexity because the work required is always the same,
no matter how small or large the value being named or returned is.
To be clear, we're not assuming that `x = expression` or `return expression`
always takes the same time, as that will depend on the expression.
However, once the expression is evaluated, assigning the value to a name or
returning the value is a constant-time operation.

If each instruction always takes some fixed amount of time,
and the number of instructions is fixed, i.e. doesn't depend on the inputs,
then the overall time the algorithm takes is also fixed. For example,
floor(*x* × *y* / *z*) consists of three constant-time arithmetic operations,
so the evaluation of the expression also takes constant time.
Multiplication, division and computing the floor all take different times,
but each takes a fixed time, independent of the values of its operands,
so the overall time is also fixed.

<div class="alert alert-warning">
<strong>Note:</strong> An algorithm that executes a fixed number of operations, each with constant complexity, has constant complexity.
</div>
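As an illustration, here is a minimal sketch of the floor(*x* × *y* / *z*) example as a Python function. The name `scaled_floor` and the intermediate names are hypothetical, chosen only for this example. The point is that the function always executes the same three arithmetic operations, one assignment per intermediate value and one return, whatever the inputs.

```python
import math


def scaled_floor(x: float, y: float, z: float) -> int:
    """Return floor(x * y / z), assuming z isn't zero."""
    product = x * y                # one multiplication
    quotient = product / z         # one division
    return math.floor(quotient)    # one floor operation


print(scaled_floor(3, 7, 2))
print(scaled_floor(3_000_000, 7_000_000, 2))
```

Both calls execute the same fixed number of constant-time operations, so under the assumptions above the function takes constant time.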

The **Big-Theta notation** states the complexity in a concise and precise way.
If the run-time is constant, we say that the algorithm has complexity Θ(1), or
takes Θ(1) time, or has run-time Θ(1). The Θ(1) notation informally means
'proportional to 1', which is a roundabout way of saying 'constant' because
a value that is proportional to a constant (1 in this case) is also constant.
While constant complexity could also be written as Θ(2), Θ(57) or
with any other fixed value, the convention is to write Θ(1).

### 2.7.2 Linear complexity

In primary school we learned an algorithm that
adds two arbitrarily large integers digit by digit, from right to left,
carrying over 1 from one addition to the next when necessary.
Since adding two digits (possibly with a carry over) takes constant time,
the time to add two integers is directly proportional to
the number of digits of the longer integer,
e.g. 222 + 88 requires three digit additions, which are (from right to left)
2 + 8, 2 + 8 + 1 carry over, and 2 + 1 carry over.
If the number of digits of the longer integer doubles,
then addition will take double the time.

Algorithms where the run-time grows proportionally to the value or size of the
inputs have **linear complexity** or take linear time.
The **size** of an input is, strictly speaking, how much memory it occupies.
Since the memory allocated to an integer may vary across
computational environments, we use a proxy measure.
For the purposes of M269, the size of integer *n*,
written │*n*│, is the number of its decimal digits, e.g. │102│ = 3.
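The size of an integer, as defined above, can be computed with a one-line sketch. The function name `size` is my own choice, and I'm assuming that the sign of a negative integer doesn't count as a digit.

```python
def size(n: int) -> int:
    """Return the number of decimal digits of n, ignoring any minus sign."""
    return len(str(abs(n)))


print(size(102))
```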

If the run-time is constant, then it doesn't depend on the inputs, but
for linear-time algorithms we have to state
exactly how their run-time depends on the inputs. For example,
the school algorithm for *x* + *y* is linear in max(│*x*│, │*y*│), i.e. its
run-time is proportional to the larger of the sizes of the two integers being added.

The Big-Theta notation Θ(...) indicates that an algorithm's run-time is
proportional to ..., so we can simply state:
the complexity of the school addition algorithm for integer inputs *x* and *y* is Θ(max(│*x*│, │*y*│)).
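To make the digit-by-digit process concrete, here is a sketch of the school algorithm that also counts the digit additions. It's an illustration under my own assumptions: the name `school_add` is hypothetical, both inputs must be non-negative, and Python's own arithmetic is used only to extract and add single digits.

```python
def school_add(x: int, y: int) -> tuple[int, int]:
    """Return x + y and the number of digit additions done, for x, y >= 0."""
    total = 0
    place = 1   # 1 for the units, 10 for the tens, and so on
    carry = 0
    steps = 0
    while x > 0 or y > 0:
        digit_sum = x % 10 + y % 10 + carry   # add the rightmost digits
        total = total + (digit_sum % 10) * place
        carry = digit_sum // 10               # 1 if the digit sum is 10 or more
        x = x // 10                           # drop the rightmost digits
        y = y // 10
        place = place * 10
        steps = steps + 1
    total = total + carry * place             # a final carry adds one more digit
    return total, steps


print(school_add(222, 88))
```

For inputs 222 and 88 the loop runs three times, matching the three digit additions listed earlier. For positive inputs the number of steps is max(│*x*│, │*y*│).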

The school addition algorithm works for arbitrarily large integers,
but in M269 we only use 64-bit integers, which have at most 19 decimal digits
(see the largest 64-bit integer above). So, even if computer hardware were to use the
school algorithm for adding *x* and *y*, the complexity would be at most
Θ(max(│*x*│, │*y*│)) = Θ(max(19, 19)) = Θ(19) = Θ(1).
In other words, while adding arbitrarily large integers takes linear time,
adding integers with a bounded number of digits
(like 64 binary digits or 19 decimal digits) takes constant time,
because the run-time won't grow beyond what it takes to process two operands
that are the largest 64-bit numbers.
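The bound of 19 decimal digits can be checked with a short sketch. The names `LARGEST` and `size` are my own, introduced only for this illustration.

```python
LARGEST = 2**63 - 1  # largest signed 64-bit integer


def size(n: int) -> int:
    """Return the number of decimal digits of non-negative integer n."""
    return len(str(n))


# However the operands are chosen, their sizes never exceed size(LARGEST).
for x, y in [(3, 7), (3_000_000, 7_000_000), (LARGEST, LARGEST)]:
    print(max(size(x), size(y)), 'is at most', size(LARGEST))
```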

Likewise, we can expect subtraction, multiplication, division and modulo to
take longer the more digits they need to process. However, by assuming that
we will only deal with 64-bit numbers we can treat them all as constant-time operations.

Let's now consider exponentiation. (Remember that
we [don't use negative exponents](../02_Sequence/02_2_operations.ipynb#2.2.2-On-integers) in M269.)
For integers *x* and *y*, with *y* ≥ 0, we have $x^y = 1 \times x \times x \times \ldots \times x$, with *y* occurrences of *x*.
Hence *x*⁰ = 1 requires zero multiplications,
*x*¹ = 1 × *x* = *x* requires one multiplication and
in general $x^y$ requires *y* multiplications. If each
multiplication takes constant time and if *y* doubles in value (not size!),
then the number of multiplications (and therefore the run-time) also doubles.
The exponentiation algorithm is therefore linear in the value of the exponent,
not in the number of its digits. We write that
the complexity of $x^y$ is Θ(*y*).
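The repeated multiplication just described can be sketched as follows. The function name `power` and the counter are hypothetical additions, there only to make the number of multiplications visible. This isn't necessarily how Python's `**` operator is implemented.

```python
def power(x: int, y: int) -> tuple[int, int]:
    """Return x to the power of y and the multiplications done, for y >= 0."""
    result = 1
    multiplications = 0
    for _ in range(y):          # repeat y times
        result = result * x
        multiplications = multiplications + 1
    return result, multiplications


print(power(2, 10))
print(power(2, 20))
```

Doubling the exponent's value from 10 to 20 doubles the number of multiplications, even though the exponent's size stays at two digits.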

If the complexity of exponentiation depended on the *size* of the exponent, then we'd know
there would be at most 19 multiplications and we could treat exponentiation as
a constant-time operation, like we did for the other arithmetic operations.
But since the number of multiplications depends on the *value* of the exponent,
even if its size remains fixed, e.g. at 4 decimal digits, the number of multiplications
(and therefore the run-time) keeps growing, e.g. from 1000 to 9999.

Actually, it takes constant time to compute $x^y$ when *y* = 0,
because no multiplication is done.
So the complexity of the algorithm varies:
it's constant for *y* = 0 and linear for *y* > 0.
When the complexity is different for small inputs, we just ignore it,
because we're only interested in how an algorithm behaves for large inputs.
So, we keep stating that the complexity of exponentiation is linear in the
exponent's value, even though it's constant for one small exponent.

To sum up, Θ(*e*),
where *e* is an expression involving zero or more of the input variables,
means that the run-time is proportional to *e* for large inputs.
(It may or may not be proportional to *e* for small inputs.)

Note that we assumed that multiplication takes constant time. In reality,
as we keep multiplying with a 64-bit number *x*, at some point the intermediate
result may not fit into 64&nbsp;bits and we can't assume that each further
multiplication takes constant time. So, for arbitrary integers *x* and *y*,
exponentiation by repeated multiplication actually takes more than linear time.
However, complexity analysis is a back of the envelope calculation to
approximately predict the growth of run-times, so we're entitled to make some
simplifying assumptions, as long as we clearly state them.
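To get a feel for how quickly the intermediate results outgrow 64 bits, here is a small sketch. The names `LARGEST` and `multiplications_that_fit` are my own, and I'm assuming signed 64-bit integers and a base of at least 2.

```python
LARGEST = 2**63 - 1  # largest signed 64-bit integer


def multiplications_that_fit(x: int) -> int:
    """Return the largest y such that x to the power of y fits in 64 bits.

    Assumes x >= 2, otherwise the result never outgrows 64 bits.
    """
    result = 1
    y = 0
    while result * x <= LARGEST:   # the next product still fits
        result = result * x
        y = y + 1
    return y


print(multiplications_that_fit(2))
print(multiplications_that_fit(10))
```

Even for base 2 the result stops fitting after a few dozen multiplications, which is why the constant-time assumption is a simplification for large exponents.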

### 2.7.3 Mistakes

I wrote that *e* involves zero or more input variables because
an algorithm's complexity either doesn't depend at all on the inputs
(constant complexity) or depends on one or more of the inputs.
The variables that appear in the complexity expression must always be
some or all of the input variables, otherwise the complexity isn't defined.
For example, if a function definition starts like this:

**Function**: secret operation\
**Inputs**: *left*, an integer; *right*, an integer

then I can't write that an algorithm for this function has complexity Θ(*x*)
or Θ(max(*l*, *r*)) or Θ(│*y*│) because none of those variables are defined:
they don't refer to any of the inputs. I must write
Θ(*left*) or Θ(max(*left*, *right*)) or Θ(│*right*│) or whatever the complexity is.

<div class="alert alert-warning">
<strong>Note:</strong> Many texts always use the variable <em>n</em> in Big-Theta expressions,
without making clear to what the variable refers. Don't follow their example.
</div>

Another common mistake is to confuse the size and the value of an integer.
For example, if the complexity of $x^y$ were Θ(│*y*│), then
it would mean that the complexity is linear in the size of the exponent.
If that were so, $x^{44}$ would take twice as long to compute as $x^{4}$,
because │44│ = 2 and │4│ = 1, when in fact it takes 11 times longer,
because $x^{44}$ requires 44 multiplications whereas $x^{4}$ requires four.
The complexity of exponentiation is linear in the value (not the size!)
of the exponent, i.e. it is Θ(*y*), not Θ(│*y*│).
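The difference between size and value can be made concrete with a short sketch that compares how both grow from one exponent to another. The function names `size` and `ratios` are hypothetical, used only for this illustration.

```python
def size(n: int) -> int:
    """Return the number of decimal digits of non-negative integer n."""
    return len(str(n))


def ratios(small: int, large: int) -> tuple[float, float]:
    """Return how many times larger the size and the value of `large` are."""
    return size(large) / size(small), large / small


print(ratios(4, 44))
```

From exponent 4 to exponent 44 the size only doubles, but the value, and hence the number of multiplications, grows elevenfold.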

#### Exercise 2.7.1

Here again is an algorithm for the circumference,
where *radius* is the input variable and *length* is the output variable.

1. let *diameter* be 2 × *radius*
2. let *length* be π × *diameter*

What is the complexity of this algorithm?
State it in words and with Big-Theta notation.

_Write your answer here._

[Hint](../31_Hints/Hints_02_7_01.ipynb)
[Answer](../32_Answers/Answers_02_7_01.ipynb)
