## 22.6 Generate subsets

Backtracking can also solve problems on sets instead of sequences.
The only thing that changes is the core tree-traversal algorithm.
The rest – handling constraints, computing the value of each candidate,
keeping track of the best solution so far, etc. – remains the same.
So let's see how a tree traversal can generate all subsets of the given items.

When we tried to [greedily solve the knapsack problem](../18_Greed/18_1_scheduling.ipynb#Exercise-18.1.1),
we first put all items into a sequence, sorted from best to worst.
The algorithm then picked one item at a time.
If the knapsack still had capacity for that item,
it was added to the output set; otherwise it was skipped.
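
As a reminder of that add-or-skip loop, here is a minimal sketch. The function name `greedy_fill`, the representation of items as (weight, value) pairs and the example data are all assumptions made for illustration; they aren't taken from the earlier exercise.

```python
def greedy_fill(items: list, capacity: int) -> set:
    """Return the indices of the items chosen by the add-or-skip loop.

    Assumption: items is a list of (weight, value) pairs,
    already sorted from best to worst.
    """
    chosen = set()
    remaining = capacity
    for index, (weight, _value) in enumerate(items):
        if weight <= remaining:  # the item still fits: add it
            chosen.add(index)
            remaining = remaining - weight
        # otherwise the item is skipped and never reconsidered
    return chosen


# hypothetical items, as (weight, value) pairs
print(greedy_fill([(3, 9), (4, 8), (2, 3)], 5))  # {0, 2}
```

The loop makes a single add-or-skip decision per item and never revisits it.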

A tree-traversal algorithm to generate all subsets works in the same way.
It starts with an empty candidate set and
with all items in a sequence of extensions.
Recursively, the algorithm takes each item in turn from the extensions and
adds it to the candidate set or skips it.

Here's the tree of candidate–extension pairs for generating subsets of {1, 2, 3}.
In each node, the candidate is the subset constructed so far and
the extensions are the numbers yet to be considered.

<p id="fig-22.6.1"></p>

*[Figure 22.6.1](../33_Figures/Figures_22_6.ipynb#Figure-22.6.1)*

![Image 22_6_subsets.png](22_6_subsets.png)

The root's candidate is the empty set and the root's extensions are
all items (here the numbers from 1 to 3), in any order.
Each node has two children, both with the first extension removed.
The left child adds the extension to the candidate; the right child doesn't.

As in the tree of permutations, the complete candidates,
those with no extensions left, are in the leaves of this tree.
The other nodes have partial candidates.
Because items can be skipped, some subsets occur in several nodes, e.g.
two nodes have subset {1, 2} and four nodes have the empty set.
Only the leaves count as generated subsets, because only when
the extensions are empty do we know that no further items will be added.

The tree traversal for subsets looks like this,
using `s[i:]` as shorthand for `s[i:len(s)]`
([Section&nbsp;4.9.1](../04_Iteration/04_9_summary.ipynb#4.9.1-Sequence-operations)).

```python
def extend(candidate: set, extensions: list, solutions: list) -> None:
    """Extend candidate with all subsets of extensions and add to solutions."""
    print("Visiting node", candidate, extensions)
    if len(extensions) == 0:  # complete candidate
        solutions.append(candidate)
    else:
        item = extensions[0]  # head of sequence
        rest = extensions[1:]  # tail of sequence
        extend(candidate.union({item}), rest, solutions)  # add item
        extend(candidate, rest, solutions)  # skip item
```

As when generating permutations,
the base case is when the extensions are empty and
the reduction step (the assignment to `rest`) removes one extension.

Also as before, we must start the algorithm in the root node,
with an empty candidate and all items as extensions.

```python
def all_subsets(n: int) -> list:
    """Return all subsets of 1, ..., n in the order generated."""
    candidate = set()
    extensions = list(range(1, n + 1))
    solutions = []
    extend(candidate, extensions, solutions)
    return solutions


print("Subsets:", all_subsets(3))
```

```
Visiting node set() [1, 2, 3]
Visiting node {1} [2, 3]
Visiting node {1, 2} [3]
Visiting node {1, 2, 3} []
Visiting node {1, 2} []
Visiting node {1} [3]
Visiting node {1, 3} []
Visiting node {1} []
Visiting node set() [2, 3]
Visiting node {2} [3]
Visiting node {2, 3} []
Visiting node {2} []
Visiting node set() [3]
Visiting node {3} []
Visiting node set() []
Subsets: [{1, 2, 3}, {1, 2}, {1, 3}, {1}, {2, 3}, {2}, {3}, set()]
```


If you follow how nodes are visited in the tree diagram,
you'll see the algorithm is making a pre-order traversal.
It chooses to add the item (left subtree) before choosing
to skip it (right subtree), so the full subset {1, 2, 3}
is generated first and the empty subset is generated last.
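
The counts stated earlier for the tree diagram (two nodes with {1, 2}, four nodes with the empty set) can be checked with a variation of the traversal that tallies visits instead of printing them. This is only a sketch: the name `count_nodes` is made up for illustration, and it uses `frozenset` rather than `set` for candidates, because Python sets aren't hashable and so can't be keys of a `Counter`.

```python
from collections import Counter


def count_nodes(candidate: frozenset, extensions: list, counts: Counter) -> int:
    """Tally the visits to each candidate; return the number of leaves."""
    counts[candidate] += 1
    if len(extensions) == 0:  # leaf: complete candidate
        return 1
    item = extensions[0]
    rest = extensions[1:]
    leaves = count_nodes(candidate | {item}, rest, counts)  # add item
    leaves += count_nodes(candidate, rest, counts)  # skip item
    return leaves


counts = Counter()
leaves = count_nodes(frozenset(), [1, 2, 3], counts)
print("Nodes:", sum(counts.values()))  # 15
print("Leaves:", leaves)  # 8
print("Nodes with {1, 2}:", counts[frozenset({1, 2})])  # 2
print("Nodes with the empty set:", counts[frozenset()])  # 4
```

The 8 leaves correspond to the 8 subsets of three items, and they match the nodes in the visit log that have empty extensions.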

The algorithm could of course first choose to skip and then
choose to add the item. It would generate the same subsets, in reverse order.
You can swap the order of the code lines that add and skip an item,
to see for yourself.

Changing the order of the extensions, e.g. `extensions = list(range(n, 0, -1))`,
also changes the order in which subsets are generated.
I'll leave it to you to try it out.
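
If you'd rather compare both variations side by side, here is one possible sketch. The names `extend_in_order` and `subsets_in_order` and the `add_first` parameter are my own additions for illustration; the function is otherwise the same traversal as above, without the printing.

```python
def extend_in_order(
    candidate: set, extensions: list, solutions: list, add_first: bool
) -> None:
    """Add to solutions all subsets that extend candidate.

    If add_first is true, the 'add' child is visited before the 'skip' child.
    """
    if len(extensions) == 0:  # complete candidate
        solutions.append(candidate)
        return
    item = extensions[0]
    rest = extensions[1:]
    children = [candidate.union({item}), candidate]  # add, then skip
    if not add_first:
        children.reverse()  # skip, then add
    for child in children:
        extend_in_order(child, rest, solutions, add_first)


def subsets_in_order(extensions: list, add_first: bool) -> list:
    """Return all subsets of the extensions, in the order generated."""
    solutions = []
    extend_in_order(set(), extensions, solutions, add_first)
    return solutions


# skip before add: the empty set comes first, {1, 2, 3} last
print(subsets_in_order([1, 2, 3], add_first=False))
# add before skip, with the extensions in descending order
print(subsets_in_order([3, 2, 1], add_first=True))
```

With the extensions in descending order, the subsets containing 3 are generated before those without it, whereas the original order generates the subsets containing 1 first.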

Now that we have a recursive algorithm to generate each subset incrementally,
we can search for a best subset that satisfies some constraints in exactly the
same way as we did for sequences.
The next section provides an example: the knapsack problem.
