## 26.1 Tractable and intractable problems

Before we look at two major classes of problems, we must transfer
the notion of complexity from algorithms to problems.

### 26.1.1 Problem complexity

The classification of problems is based on how efficiently they can be solved,
so the **complexity of a problem** is defined as
the complexity of the most efficient algorithm that solves it.
As with algorithms, we could distinguish between the best-, average- and
worst-case complexity of a problem, but for the classification of problems
we're only interested in worst-case scenarios.

<div class="alert alert-warning">
<strong>Note:</strong> In this chapter, we only consider worst-case complexities
and I will therefore often omit the 'worst-case' adjective.
</div>

It's important to realise that the complexity of a problem is
the lowest complexity of *all* algorithms that solve the problem,
including those that haven't been discovered yet.
To be able to say that a problem has, say, quadratic complexity,
we must construct a quadratic algorithm that solves the problem *and*
we must *prove* that it's impossible to write a more efficient algorithm.

Consider the problem of sorting comparable items.
This problem has log-linear complexity because
there are log-linear algorithms that solve it, like merge sort and heapsort,
and there's [a proof](../14_Sorting/14_8_pigeonhole.ipynb#14.8.1-Comparison-sort-complexity) that
a log-linear number of comparisons is needed to sort the items.
Hence no algorithm with a lower complexity is possible.
(More efficient algorithms, like pigeonhole sort, only work
for items that can be sorted without comparing them.)

For many problems, like the TSP, the exact complexity is unknown because
there's currently no proof that the most efficient algorithm we know of
is also the most efficient algorithm there will ever be. In other words,
there's no proof that a more efficient algorithm is impossible.

All we can say for such problems is that their complexity is
*at most* the complexity of the most efficient known algorithm and
*at least* the complexity it takes to produce the output.
For example, the complexity of the TSP is at least linear, because
it takes linear time to copy all nodes to the output tour,
and at most exponential, because the best known algorithm has that complexity.
The exact complexity of the TSP could be linear, exponential,
or anything in between. We don't know which one it is yet.

### 26.1.2 Tractable problems

A polynomial is an expression of the form
$a_0 + a_1×n + a_2×n² + ... + a_c×n^c$,
where the $a$'s and the $c$ are constants and $n$ is a numeric variable.
For example, 5 + 3*n* + 2.5*n*² + 0.5*n*⁴  is a polynomial with $a_3$ = 0.

<div class="alert alert-info">
<strong>Info:</strong> MST124 Unit&nbsp;3 Section&nbsp;1.6 introduces polynomials.
</div>

In complexity analysis,
only the highest term counts and constant factors are ignored, so
$$\mathrm{O}(a_0 + a_1×n + a_2×n² + ... + a_c×n^c) = \mathrm{O}(n^c).$$
That's why we say that
an algorithm with complexity O($n^c$) has **polynomial complexity**,
or vice versa, that a **polynomial algorithm** has complexity O($n^c$).
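
To see numerically why only the highest term counts, here is a minimal sketch
that evaluates the example polynomial 5 + 3*n* + 2.5*n*² + 0.5*n*⁴ and
divides it by *n*⁴. The function name `poly` is only chosen for this illustration.

```python
def poly(n: int) -> float:
    """Evaluate the example polynomial 5 + 3n + 2.5n² + 0.5n⁴."""
    return 5 + 3 * n + 2.5 * n**2 + 0.5 * n**4


# Divide by the highest power to see how much the lower terms contribute.
for n in (10, 100, 1000):
    print(n, poly(n) / n**4)
```

As *n* grows, the ratio gets ever closer to the constant factor 0.5,
i.e. the lower terms contribute less and less to the total.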

Big-Oh indicates an upper bound, so having polynomial complexity means having
complexity Θ($n^c$) or better. Therefore, any algorithm with
constant, logarithmic, linear, log-linear, quadratic or cubic complexity
is polynomial because it has complexity Θ(*n*³) or better.
Most M269 algorithms have polynomial complexity.

A problem is **tractable** if it has polynomial complexity, i.e. if
the most efficient algorithm that solves the problem
has complexity Θ($n^c$) or better, for input size $n$ and some constant $c$.
Most M269 problems are tractable.

We don't always need to know the exact complexity of a problem to know if
it's tractable: a single algorithm of polynomial complexity is sufficient.
For example, if the only known sorting algorithm were insertion sort,
that would be enough to show that the sorting problem is tractable, as follows:

1. Since the most efficient sorting algorithm can't be worse than
   insertion sort, it must have complexity Θ(*n*²) or better.
1. Since the most efficient sorting algorithm has complexity Θ($n^c$) or better
   for *c* = 2, it has by definition polynomial complexity.
1. Since the most efficient sorting algorithm has polynomial complexity,
   by definition the sorting problem has polynomial complexity.
1. Since the sorting problem has polynomial complexity,
   it is by definition tractable.

Usually, we don't spell out all these steps: we simply say that
the sorting problem is tractable because it is solved by a polynomial algorithm,
e.g. insertion sort, which has quadratic complexity.
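
As an illustration of the quadratic bound, the following sketch instruments
an insertion sort to count comparisons. It is one possible way to write
the algorithm, not necessarily the version given earlier in the book, and
the function name `insertion_sort_comparisons` is an assumption made for this example.

```python
def insertion_sort_comparisons(items: list) -> int:
    """Sort items in place and return the number of comparisons made."""
    comparisons = 0
    for index in range(1, len(items)):
        value = items[index]
        position = index
        # Shift larger items one place to the right to make room for value.
        while position > 0:
            comparisons += 1
            if items[position - 1] <= value:
                break
            items[position] = items[position - 1]
            position -= 1
        items[position] = value
    return comparisons


for n in (10, 100, 1000):
    worst_case = list(range(n, 0, -1))  # items in descending order
    count = insertion_sort_comparisons(worst_case)
    print(n, count, count <= n**2)
```

For a list of *n* items in descending order, this sketch makes
*n*(*n* − 1)/2 comparisons, e.g. 45 for *n* = 10, which never exceeds *n*².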

<div class="alert alert-warning">
<strong>Note:</strong> To show that a problem is tractable you only need to indicate or construct
<em>one</em> polynomial algorithm that solves the problem.
</div>

### 26.1.3 Intractable problems

A problem is **intractable** if it's not tractable:
there's no polynomial algorithm for it.
The most efficient algorithm for an intractable problem has complexity higher
than O($n^c$) for every constant $c$.
There are many such complexities but in M269 you learned
only two: the exponential and factorial complexities Θ(2ⁿ) and Θ(*n*!).

A problem that can be solved with an exponential or factorial algorithm
isn't necessarily intractable: there might be a polynomial algorithm that
solves it, making the problem tractable. For example,
the sorting problem can be solved with the factorial [bogosort](../14_Sorting/14_2_bogosort.ipynb#14.2-Bogosort)
algorithm, but the sorting problem is actually tractable, as explained above.

To classify a problem as intractable we must prove that no polynomial algorithm
solves it. There's a case where this can be easily proven, namely
if the size of the output is exponential or factorial in the size of the input.
In that case, even if each item in the output can be computed in constant time,
it will take an exponential or factorial time to produce the output,
and therefore no polynomial algorithm for it can exist.

For example, the problem of
[computing all subsets](../11_Search/11_5_subsets.ipynb#11.5.4-Code)
of a given set with *n* items is intractable, because there are 2ⁿ subsets.
Even if each subset were produced in constant time,
it would take exponential time to produce all of them,
and so no polynomial algorithm can solve the problem.
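
The growth of the output can be checked with a short sketch.
It uses `itertools.combinations` from the standard library rather than
the code in the linked section, and the function name `all_subsets` is
an assumption made for this example.

```python
from itertools import combinations


def all_subsets(items: set) -> list:
    """Return a list with every subset of items."""
    elements = list(items)
    subsets = []
    # Generate the subsets of size 0, then of size 1, and so on.
    for size in range(len(elements) + 1):
        for chosen in combinations(elements, size):
            subsets.append(set(chosen))
    return subsets


for n in range(6):
    print(n, len(all_subsets(set(range(n)))))
```

Each additional item in the input set doubles the number of subsets:
the loop prints the counts 1, 2, 4, 8, 16 and 32.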

<div class="alert alert-warning">
<strong>Note:</strong> If the size of the output is exponential or factorial in the size of the input,
then the problem is intractable.
</div>

In Python, floating-point numbers have a fixed size (64&nbsp;bits),
but integers can be arbitrarily large. So
be careful when analysing the complexity of problems that have integer inputs,
because the complexity is the growth rate of the run-time with respect to
the input *size*, not the input *value*.

For example, the multiplication of integers *x* and *y* takes
constant time for 64-bit integers. But for arbitrarily large integers,
an algorithm that multiplies each digit of *x* with each digit of *y*
has complexity Θ(│*x*│ × │*y*│), the product of the sizes of both numbers.
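
The digit-by-digit approach can be sketched as follows.
For readability, the sketch works on decimal digits instead of bits and
assumes non-negative integers; the function name `long_multiply` and
the step counter are only for illustration.

```python
def long_multiply(x: int, y: int) -> tuple:
    """Return x × y and the number of digit-by-digit products computed.

    Assumes x and y are non-negative integers.
    """
    x_digits = [int(digit) for digit in reversed(str(x))]
    y_digits = [int(digit) for digit in reversed(str(y))]
    product = 0
    steps = 0
    for i, x_digit in enumerate(x_digits):
        for j, y_digit in enumerate(y_digits):
            # Digits at positions i and j contribute at position i + j.
            product += x_digit * y_digit * 10 ** (i + j)
            steps += 1
    return product, steps


print(long_multiply(123, 45))
```

Multiplying a 3-digit number by a 2-digit number takes 3 × 2 = 6 digit products,
so the number of steps is the product of the sizes of the two inputs.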

The size of an integer *n* is │*n*│ = $\log_2$ *n* because that's the
least number of bits required to store *n*.
For example, │23│ = $\log_2$ 23 ≈ 4.52.
In fact, 23 is written 10111 in binary, which takes 5&nbsp;bits,
but we'll ignore the rounding up because it doesn't affect the complexity.
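
Python can report both values. In this small sketch, the helper name
`size_in_bits` is an assumption for illustration; it relies on
the built-in `int.bit_length()` method.

```python
import math


def size_in_bits(n: int) -> int:
    """Return the number of bits needed to write positive integer n in binary."""
    return n.bit_length()


for n in (23, 255, 256, 10**6):
    # Show the value, its binary form, its exact size and the logarithm.
    print(n, bin(n)[2:], size_in_bits(n), round(math.log2(n), 2))
```

For 23, this prints the binary form 10111, the size 5 and the logarithm 4.52.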

Since the logarithm and exponentiation are inverse operations,
we have $n = 2^{\log_2 n} = 2^{│n│}$.
So, if we have a complexity expression in terms of the value $n$, then
the expression in terms of the size │$n$│ is obtained by
replacing $n$ with $2^{│n│}$, which in turn can be replaced with $2^{\log n}$
because we [ignore the base of logarithms](../13_Divide/13_2_decrease_half.ipynb#13.2.3-Complexity)
when analysing complexity.

For example, our [factorisation algorithm](../11_Search/11_2_factorisation.ipynb#11.2.3-Sort-candidates)
for a positive integer $n$ has complexity $Θ(\sqrt{n}) = Θ(n^{0.5})$.
This means the factorisation algorithm is polynomial in the input *value*
because the complexity is of the form Θ($n^c$).
However, to express the complexity in terms of the input *size* we must write
$$Θ(n^{0.5}) = Θ((2^{│n│})^{0.5}) = Θ(2^{0.5 × │n│}).$$
The input size │*n*│ appears in the exponent, so it turns out that
factorisation is exponential in terms of the input size,
which is the log *n* bits required to represent *n*.
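
The effect of the input size can be observed with a sketch that counts
how many candidate factors are tested. It assumes a simple search through
the candidates from 1 to $\sqrt{n}$, which may differ in detail from
the linked algorithm, and the function name `factors_with_steps` is hypothetical.

```python
def factors_with_steps(n: int) -> tuple:
    """Return the sorted factors of n and the number of candidates tested.

    Assumes n is a positive integer.
    """
    factors = set()
    steps = 0
    candidate = 1
    # Only candidates up to the square root of n need to be tested.
    while candidate * candidate <= n:
        steps += 1
        if n % candidate == 0:
            factors.add(candidate)
            factors.add(n // candidate)
        candidate += 1
    return sorted(factors), steps


for bits in (8, 12, 16, 20):
    n = 2**bits - 1  # the largest integer that fits in that many bits
    print(bits, factors_with_steps(n)[1])
```

The loop prints 15, 63, 255 and 1023 steps: every 4 extra bits in the input
roughly quadruple the number of candidates tested.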

<div class="alert alert-info">
<strong>Info:</strong> MU123 Unit&nbsp;3 Section&nbsp;1.5 explains why $(x^y)^z = x^{y × z}$ and
Section&nbsp;3.4 explains why $\sqrt{x} = x^{0.5}$.
</div>

Algorithms that have polynomial complexity in the *value* (but not in the *size*)
of an integer input are called **pseudo-polynomial**.

<div class="alert alert-warning">
<strong>Note:</strong> A pseudo-polynomial algorithm seems polynomial but is in fact exponential in the size of the input.
</div>

### 26.1.4 The twilight zone

For some problems, we don't know if they're tractable or not.
This happens when the currently most efficient algorithm is *not* polynomial but
the output's size *is* polynomial in the size of the input and so
a polynomial algorithm *might* be possible.

For example, I wrote earlier that the exact complexity of the TSP is
at least linear and at most exponential.
So the problem could be tractable or intractable.

- If someone invents a polynomial algorithm for the TSP,
  then we know it's tractable.
- If someone proves that there can't be a polynomial algorithm for the TSP,
  then we know it's intractable.

<div class="alert alert-warning">
<strong>Note:</strong> If a problem's complexity is at most non-polynomial but could be polynomial,
then we don't know if the problem is tractable or intractable.
</div>

#### Exercise 26.1.1

Based only on what you have read in this book,
is the [interval scheduling](../18_Greed/18_1_scheduling.ipynb#18.1-Interval-scheduling) problem
tractable, intractable or can't you say either way?

_Write your answer here._

[Hint](../31_Hints/Hints_26_1_01.ipynb)
[Answer](../32_Answers/Answers_26_1_01.ipynb)

#### Exercise 26.1.2

Based only on what you have read in this book,
is the [0/1 knapsack](../23_Dynamic_Programming/23_3_knapsack.ipynb#23.3-Knapsack) problem
tractable, intractable or can't you say either way?

_Write your answer here._

[Hint](../31_Hints/Hints_26_1_02.ipynb)
[Answer](../32_Answers/Answers_26_1_02.ipynb)

#### Exercise 26.1.3

A group of tourists is visiting a city on foot.
They start the day in their hotel.
They know the restaurants where they will have lunch and dinner.
They will visit a museum immediately after lunch.

Given an unweighted, undirected, connected graph representing the city's
places and streets, we want to compute all paths from node *Hotel* to
node *Dinner* that go through nodes *Lunch* and *Museum* one after the other.

Is this problem tractable, intractable or can't you say?

_Write your answer here._

[Hint](../31_Hints/Hints_26_1_03.ipynb)
[Answer](../32_Answers/Answers_26_1_03.ipynb)

#### Exercise 26.1.4

Are pseudo-polynomial algorithms intractable?
(This may be considered a trick question.)

_Write your answer here._

[Hint](../31_Hints/Hints_26_1_04.ipynb)
[Answer](../32_Answers/Answers_26_1_04.ipynb)
