## 22.2 Prune the search space

Backtracking takes the brute-force search of the previous section
and adds a simple but powerful idea:
since the candidates are generated incrementally, one item at a time,
stop extending a candidate as soon as it's clear it won't lead to a solution.
This substantially reduces the number of candidates generated,
making backtracking much more efficient than brute-force search.

Let's see backtracking in action on the problem of the previous section:
find all sequences of non-repeated numbers, taken from 1 to *n* > 2, such that

1. the first and last numbers are at least *n* / 2 apart (range)
2. the numbers are odd, even, odd, even, ... (parity).

The sequences don't have to be permutations of 1 to *n*:
they can include only some of the *n* numbers.

Here again is the code that checks these constraints.

In [1]:
def satisfies_range(candidate: list, n: int) -> bool:
    """Check if first and last numbers in candidate are at least n/2 apart.

    Preconditions: candidate is a list of integers; n > 2
    """
    return len(candidate) > 1 and abs(candidate[0] - candidate[-1]) >= n / 2


def satisfies_parity(candidate: list) -> bool:
    """Check if candidate is an odd, even, odd, ... sequence.

    Preconditions: candidate is a list of integers
    """
    for index in range(len(candidate)):
        if index % 2 == candidate[index] % 2:
            return False
    return True

### 22.2.1 Local and global constraints

The key insight that enables pruning the search space is that
the two constraints for this problem differ in nature.
The range constraint involves the first and last numbers of the sequence
and therefore can only be checked on the whole sequence:
it's a **global constraint**.
The second constraint is about the parity of each number,
independently of the other numbers: it's a **local constraint**.

If a partial candidate P doesn't satisfy a global constraint,
an extension of P may satisfy it because of the added items.
For example, for *n* = 3
the sequence (1, 2) doesn't satisfy the range constraint but (1, 2, 3) does.
We therefore must keep extending a candidate that fails a global constraint.
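As a quick check of this example, the sketch below repeats `satisfies_range`
from above and applies it to both sequences.
The names `partial` and `extended` are mine, chosen only for illustration.

```python
def satisfies_range(candidate: list, n: int) -> bool:
    """Check if first and last numbers in candidate are at least n/2 apart."""
    return len(candidate) > 1 and abs(candidate[0] - candidate[-1]) >= n / 2


n = 3
partial = [1, 2]
extended = partial + [3]
print(satisfies_range(partial, n))   # False: |1 - 2| = 1 is less than 1.5
print(satisfies_range(extended, n))  # True: |1 - 3| = 2 is at least 1.5
```

The converse can happen too: for *n* = 3, the sequence (1, 3) satisfies
the range constraint but its extension (1, 3, 2) doesn't.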

However, if partial candidate P doesn't satisfy a local constraint,
neither does any candidate C that extends P because
the item in P that breaks the local constraint is also in C.
For example, if P starts with an even number,
or has two consecutive odd numbers, then so does any extension of P.
This means that there's no point in extending a partial candidate
that violates a local constraint: it won't lead to a solution.
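To see this on a small case, the following sketch takes *n* = 4 and
the partial candidate (1, 3), generates every extension of it and
checks whether any of them satisfies the parity constraint.
It repeats `satisfies_parity` from above; the names `prefix`, `rest` and
`extended` are only illustrative.

```python
from itertools import permutations


def satisfies_parity(candidate: list) -> bool:
    """Check if candidate is an odd, even, odd, ... sequence."""
    for index in range(len(candidate)):
        if index % 2 == candidate[index] % 2:
            return False
    return True


n = 4
prefix = [1, 3]  # two consecutive odd numbers: index 1 should hold an even number
rest = [number for number in range(1, n + 1) if number not in prefix]
# every way of appending one or more of the remaining numbers to prefix
extended = [
    prefix + list(suffix)
    for length in range(1, len(rest) + 1)
    for suffix in permutations(rest, length)
]
print(extended)
print(any(satisfies_parity(candidate) for candidate in extended))  # False
```

All four extensions keep the odd number 3 at index 1, so none of them passes.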

If a candidate fails the local constraints, a **backtracking algorithm**
goes immediately back (hence its name) to a previous partial candidate
and tries a different way to extend it.

In terms of tree traversal, if a node has a candidate that fails
the local constraints, a backtracking algorithm doesn't traverse the subtree
rooted at that node: it instead goes back to the node's parent.
From there it starts traversing the next sibling subtree.
If there's no sibling, the algorithm backtracks to the grandparent, and so on.

The changes to the previous recursive generate-and-test algorithm are minor:
I simply add a new base case. If the candidate fails the local constraints,
the algorithm returns (backtracks) immediately instead of
recursively extending the candidate.

In [2]:
def extend(candidate: list, extensions: set, n: int, solutions: list) -> None:
    """Add to solutions all valid sequences that extend candidate.

    Preconditions: n > 2 and
    - candidate is a list of integers between 1 and n
    - extensions is a set of integers between 1 and n
    - candidate and extensions have no integer in common
    """
    print("Visiting node", candidate, extensions)
    # base case 1: backtrack
    if not satisfies_parity(candidate):
        return
    # base case 2: candidate is solution
    # local constraint is satisfied, so only check global constraint
    if satisfies_range(candidate, n):
        solutions.append(candidate)
    for item in extensions:
        extend(candidate + [item], extensions - {item}, n, solutions)

The main function remains the same.

In [3]:
def valid_permutations(n: int) -> list:
    """Return all valid sequences of numbers from 1 to n, in the order generated."""
    candidate = []
    extensions = set(range(1, n + 1))  # {1, ..., n}
    solutions = []
    extend(candidate, extensions, n, solutions)
    return solutions


print("Solutions:", valid_permutations(3))

Visiting node [] {1, 2, 3}
Visiting node [1] {2, 3}
Visiting node [1, 2] {3}
Visiting node [1, 2, 3] set()
Visiting node [1, 3] {2}
Visiting node [2] {1, 3}
Visiting node [3] {1, 2}
Visiting node [3, 1] {2}
Visiting node [3, 2] {1}
Visiting node [3, 2, 1] set()
Solutions: [[1, 2, 3], [3, 2, 1]]


Now only 10 of the 16 nodes of the full tree are visited. As you can see,
sequences (1, 3), (2) and (3, 1) aren't further extended because
they break the parity constraint.

As suggested in [Chapter&nbsp;15](../15_TMA02-1/15_1_exhaustive_search.ipynb#15.1.3-Generate),
we can further reduce the search space by not even generating those sequences,
since they will be rejected.

### 22.2.2 Avoid visits

The algorithm currently first extends a candidate and then
checks whether it should backtrack.
Since the local constraint applies to each number individually,
we can check the parity of the chosen extension number
*before* appending it to the sequence.
In other words, instead of creating and visiting a node and then backtracking
if needed, we avoid creating the node in the first place.
This further prunes the search space.

Here's the new version, without repeating the docstring.
The check for the local parity constraint moves to the for-loop:
if the chosen number can extend the current sequence, it will.

In [4]:
def extend(candidate: list, extensions: set, n: int, solutions: list) -> None:  # noqa: D103
    print("Visiting node", candidate, extensions)
    if satisfies_range(candidate, n):
        solutions.append(candidate)
    for item in extensions:
        if can_extend(item, candidate):  # added line
            extend(candidate + [item], extensions - {item}, n, solutions)

Instead of an explicit base case that goes back to a previous candidate
if the current one doesn't satisfy the local constraint, the new version
avoids extending candidates with items that fail the local constraint.

The new auxiliary function checks the local constraint on a single number
instead of a sequence and hence is simpler than `satisfies_parity`.

In [5]:
def can_extend(item: int, candidate: list) -> bool:
    """Check if extending candidate with item can lead to a solution."""
    # the number and the index where it will be must have different parity
    return item % 2 != len(candidate) % 2


valid_permutations(3)

Visiting node [] {1, 2, 3}
Visiting node [1] {2, 3}
Visiting node [1, 2] {3}
Visiting node [1, 2, 3] set()
Visiting node [3] {1, 2}
Visiting node [3, 2] {1}
Visiting node [3, 2, 1] set()


[[1, 2, 3], [3, 2, 1]]

As expected, no partial candidate with consecutive numbers of the same parity or
starting with an even number is generated.
Now only 7 of the 16 nodes are created and visited.
The search space has more than halved.
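The node counts mentioned so far can be checked with a small experiment.
The sketch below isn't part of the algorithm: `count_nodes` and `parity_ok`
are hypothetical helpers that traverse the tree in one of three ways and
count the visited nodes instead of collecting solutions.
The strategy labels are made up for this illustration.

```python
def parity_ok(candidate: list) -> bool:
    """Check the local constraint: candidate is an odd, even, odd, ... sequence."""
    return all(number % 2 != index % 2 for index, number in enumerate(candidate))


def count_nodes(candidate: list, extensions: set, strategy: str) -> int:
    """Return the number of nodes visited in the subtree rooted at candidate.

    strategy is one of these labels, invented for this sketch:
    - "brute": visit every node (generate and test)
    - "backtrack": visit a node, then go back if it fails the parity check
    - "avoid": only create nodes that pass the parity check
    """
    visited = 1  # the node for this candidate
    if strategy == "backtrack" and not parity_ok(candidate):
        return visited
    for item in extensions:
        if strategy == "avoid" and item % 2 == len(candidate) % 2:
            continue  # wrong parity for the next index: don't create the node
        visited += count_nodes(candidate + [item], extensions - {item}, strategy)
    return visited


for n in range(3, 8):
    numbers = set(range(1, n + 1))
    counts = [count_nodes([], numbers, s) for s in ("brute", "backtrack", "avoid")]
    print(n, counts)
```

The first line printed is `3 [16, 10, 7]`, matching the counts given in the text.
The remaining lines suggest how the gap widens as *n* grows.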

To emphasise how much more efficient backtracking can be, consider *n* = 10
and only complete candidates, i.e. permutations of the ten numbers.
There are 5 × 9! ≈ 1.8 million permutations starting with one of the five even
numbers (2, 4, 6, 8, 10) followed by a permutation of the other nine numbers.
There are a further 5 × 4 × 8! ≈ 800 thousand permutations starting with two
consecutive odd numbers, followed by a permutation of the other eight numbers.
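Backtracking won't generate these 2.6 million permutations, nor the roughly
one million others that have consecutive numbers of the same parity further along.
It generates only the 5! × 5! = 14,400 permutations that alternate odd and even
numbers, whereas brute-force search generates all 10! ≈ 3.6 million permutations.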
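The arithmetic in this paragraph can be reproduced with `math.factorial`.
This is only a sketch of the counting argument, with variable names of my
choosing; it counts complete candidates, not all nodes of the tree.

```python
from math import factorial

n = 10
odds = evens = n // 2  # five odd and five even numbers in 1, ..., 10

total = factorial(n)
# an even first number, then any permutation of the other nine numbers
start_even = evens * factorial(n - 1)
# two different odd numbers first, then any permutation of the other eight
start_two_odds = odds * (odds - 1) * factorial(n - 2)
# odd, even, odd, ...: arrange the odds in their five places, the evens in theirs
alternating = factorial(odds) * factorial(evens)
# start with odd then even, but break the parity constraint later on
other_rejected = total - start_even - start_two_odds - alternating

print(total)           # 3628800
print(start_even)      # 1814400
print(start_two_odds)  # 806400
print(alternating)     # 14400
print(other_rejected)  # 993600
```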

To sum up, backtracking can efficiently find sequences of items subject to
global and local constraints because it generates candidates incrementally.
Global constraints are checked on all candidate sequences or
only on the complete ones, depending on the problem.
Local constraints are checked for each extension being considered,
to avoid generating candidates that won't lead to solutions.
This usually prunes vast parts of the search space.
If a problem has no local constraints then backtracking becomes
brute-force search because all candidates have to be generated.
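The summary can be turned into a general template.
What follows is a minimal sketch, not a definitive implementation:
the function `backtrack` and its parameters `can_extend` and `is_solution`
are my own generalisation of the `extend` function above, and the function
also returns the number of nodes visited so that the last point can be checked.

```python
from itertools import permutations
from typing import Callable


def backtrack(
    candidate: list,
    extensions: set,
    can_extend: Callable[[int, list], bool],
    is_solution: Callable[[list], bool],
    solutions: list,
) -> int:
    """Add to solutions all solutions extending candidate; return nodes visited."""
    visited = 1
    if is_solution(candidate):  # global constraints
        solutions.append(candidate)
    for item in sorted(extensions):  # sorted only to make the order repeatable
        if can_extend(item, candidate):  # local constraints
            visited += backtrack(
                candidate + [item], extensions - {item}, can_extend, is_solution, solutions
            )
    return visited


n = 3
numbers = set(range(1, n + 1))


def in_range(candidate: list) -> bool:
    """Global constraint: first and last numbers are at least n/2 apart."""
    return len(candidate) > 1 and abs(candidate[0] - candidate[-1]) >= n / 2


def alternates(item: int, candidate: list) -> bool:
    """Local constraint: item and the index it will occupy differ in parity."""
    return item % 2 != len(candidate) % 2


def anything_goes(item: int, candidate: list) -> bool:
    """No local constraint: every item can extend every candidate."""
    return True


pruned = []
print(backtrack([], numbers, alternates, in_range, pruned), pruned)
unpruned = []
print(backtrack([], numbers, anything_goes, in_range, unpruned), unpruned)
# number of sequences of distinct numbers of length 0 to n
print(sum(1 for length in range(n + 1) for _ in permutations(numbers, length)))
```

With both constraints, 7 nodes are visited.
For the problem with only the range constraint, all 16 sequences are generated,
just as in brute-force search.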

The next section shows a typical problem that can be solved with backtracking.
