# Lesson 4

This unit is about solving problems where the solution can be expressed as a sequence of steps. The unit starts at [video 229](https://www.youtube.com/watch?v=q6M_pco_5Vo&list=PLAwxTw4SYaPnJVtPvZZ5zXj_wRBjH0FxX&index=229&pp=iAQB).

## The Pouring Problem

We have two glasses that can contain 9 and 4 oz of water respectively, and we have a faucet and a sink where we can refill or empty either glass. The final goal is to have exactly 6 oz of water in the larger glass. The small one can contain any amount.

There are 6 possible actions:

1. Pouring from glass X to glass Y.
2. Pouring from glass Y to glass X.
3. Filling glass X.
4. Filling glass Y.
5. Emptying glass X.
6. Emptying glass Y.

A transfer from X to Y (and vice versa) can end in one of two ways. We pour from X to Y until either

1. Y is full.
2. X is empty. 
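Both stopping conditions can be captured by one expression: the amount transferred is the smaller of what the source holds and the room left in the destination. The sketch below illustrates this; the helper name `pour` is hypothetical and not part of the course code.

```python
def pour(x, y, Y):
    """Pour from a glass holding x into a glass holding y with capacity Y.
    Return the new (source, destination) levels."""
    # Stop when the source is empty or the destination is full,
    # whichever happens first.
    amount = min(x, Y - y)
    return x - amount, y + amount


print(pour(9, 0, 4))  # the destination fills first
print(pour(1, 0, 4))  # the source empties first
```

The first call leaves 5 oz in the source and fills the destination; the second empties the source and leaves the destination at 1 oz.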

### Inventory of Concepts

1. Collection of glasses (we start with two but we can generalize).
2. Total capacity of glass $Z$ ($Z \in \{X, Y\}$).
3. Current water level in glass Z.
4. Our goal (6 oz in the large glass).
5. The six pouring actions.
6. A notion of a _solution_. For us, a solution is a sequence of steps from the initial to the goal configuration.

The collection of glasses and their current level provides a complete description of the current state of the world.

### Combinatorial Complexity

For each state of the world there are at most six possible actions (not all of them are useful at all times). If the solution requires $n$ steps, there are on the order of $6^n$ possible action sequences to consider. However, we don't know the value of $n$ in advance, so we cannot tell beforehand how hard a given instance is. Problems of this kind are examples of **combinatorial optimization**, but we will refer to them as **search** problems.
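To get a feeling for the numbers, the sketch below compares the count of action sequences of length $n$ (six choices per step) with the number of distinct states when levels are whole ounces. The names `BRANCHING`, `sequences`, and `max_states` are illustrative assumptions, not part of the course code.

```python
BRANCHING = 6  # upper bound on the number of actions available in a state


def sequences(n, b=BRANCHING):
    """Number of action sequences of length n with b choices per step."""
    return b ** n


for n in (1, 4, 8):
    print(n, sequences(n))

# With whole-ounce levels, x ranges over 0..9 and y over 0..4,
# so there can never be more than (9 + 1) * (4 + 1) distinct states.
max_states = (9 + 1) * (4 + 1)
print(max_states)
```

The gap between the two numbers is why remembering the states we have already visited pays off: most long action sequences revisit states we have seen before.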

### Exploring the Space

Our goal is a collection of states: those where the large glass holds 6 oz and the small glass holds anything between 0 and 4 oz inclusive. At any given step of the exploration, the farthest states we have explored represent the **frontier**. We can then distinguish three sets of states.

1. The goal state(s).
2. The explored states that represent the frontier.
3. The explored states that are not in the frontier.

To find a solution, we keep expanding the frontier until it overlaps one of the goal states. In this kind of problem there are two situations we want to be able to deal with:

1. When there is no path from the frontier to the solution.
2. If there is a path, we want to be able to find it efficiently, and we want to avoid ending in an infinite loop.

If there is a finite number of states, and if there is a path from the start configuration to the solution, we should be able to find it. We want a strategy, i.e., criteria to decide which states we will explore next.

We define the **successors** of a state as the set of states we can reach from it in one step, together with the action needed to reach each of them.

We use `X` and `Y` to indicate the total capacity of the glasses and `x` and `y` to indicate their current level. We represent our paths as lists containing an alternation of states and actions: `[state, action, state, ...]`. The frontier is technically a set of states, but we represent it as a list of _paths_. If no solution is found, we return an empty list (for consistency with the success case).
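As a small illustration of this representation (the concrete values are made up for the example), slicing separates the states from the actions of a path, and the last element of each path in the frontier is the state that path has reached:

```python
# A path alternates states and actions, starting and ending with a state.
path = [(0, 0), "fill X", (9, 0), "X -> Y", (5, 4)]

states = path[0::2]    # elements at even positions are states
actions = path[1::2]   # elements at odd positions are actions
last_state = path[-1]  # where this path currently ends

# The frontier stores whole paths, so every entry remembers how it was reached.
frontier = [
    [(0, 0), "fill X", (9, 0)],
    [(0, 0), "fill Y", (0, 4)],
]
frontier_states = {p[-1] for p in frontier}

print(states)
print(actions)
print(frontier_states)
```

Storing paths rather than bare states costs some memory, but it means the solution is available as soon as a goal state shows up at the end of a path.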

### MY QUESTIONS

Q.: Why do we need a separate variable for the extended path (`path2` in the code below)? Can't we just append to `path`?
A.: No, we cannot. Consider the case where the call to `successors` returns $k$ states with $k > 1$, none of which is in `explored` yet. If we updated `path = path + [action, state]` and appended it to the frontier, the next successor in the loop would be added to the already extended path instead of the original one. Each new path would then contain the `[action, state]` pairs of all the successors handled before it, which is clearly wrong.
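The difference is easy to see on a toy example. The sketch below uses a hand-written `toy_successors` dictionary (an assumption made for the demonstration) with two successors of the start state, and builds the frontier both ways.

```python
path = [(0, 0)]
toy_successors = {(9, 0): "fill X", (0, 4): "fill Y"}  # two successors of (0, 0)

# Buggy version: the path is reassigned inside the loop, so the second
# successor is added on top of the first one.
buggy_frontier = []
p = path
for state, action in toy_successors.items():
    p = p + [action, state]
    buggy_frontier.append(p)

# Correct version: every successor extends the original, untouched path.
good_frontier = []
for state, action in toy_successors.items():
    path2 = path + [action, state]
    good_frontier.append(path2)

print(buggy_frontier[1])  # claims "fill Y" leads from (9, 0) to (0, 4)
print(good_frontier[1])   # "fill Y" applied to the start state
```

In the buggy frontier the second path has five elements and describes a move that is not even legal, while the correct one has three.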

### Solution

The `successors` function returns a dictionary of `state: action` pairs.

In [2]:
def successors(x, y, X, Y):
    """Return a dict of {state: action} pairs describing what can be reached
    from the (x, y) state and how."""
    assert x <= X and y <= Y  # (x, y) is glass levels; X and Y are glass sizes
    return {
        ((0, y + x) if y + x <= Y else (x - (Y - y), y + (Y - y))): "X -> Y",
        ((x + y, 0) if x + y <= X else (x + (X - x), y - (X - x))): "X <- Y",
        (X, y): "fill X",
        (x, Y): "fill Y",
        (0, y): "empty X",
        (x, 0): "empty Y",
    }

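The `pour_problem()` function takes the capacities of the two glasses, the goal level, and the start state, and returns a path: a list that alternates states and actions. Note that the code accepts the goal level in either glass; with capacities 9 and 4, a level of 6 can only occur in the large one.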

In [3]:
from collections import deque

Fail = []

def pour_problem(X, Y, goal, start=(0, 0)):
    """X and Y are the capacity of glasses; (x, y) is current fill levels
    and represents a state. The goal is a level that can be in either glass.
    Start at start state and follow successors until we reach the goal.
    Keep track of frontier and previously explored; fail when no frontier."""
    if goal in start:
        return [start]  # The trivial path.
    explored = {start}  # set of states we have visited, including the start
    # A path is a list [state, action, state, ...]. A frontier is a list of paths.
    frontier = deque([[start]])  # ordered list of paths we have blazed
    while frontier:
        path = frontier.popleft()
        # The last element of a path is the last visited state.
        (x, y) = path[-1]  # Last state in the first path of the frontier
        for state, action in successors(x, y, X, Y).items():
            # If the state is in explored do nothing.
            if state not in explored:
                explored.add(state)
                path2 = path + [action, state]
                if goal in state:
                    return path2
                else:
                    frontier.append(path2)
    return Fail

In [32]:
pour_problem(9, 4, 6, (9, 4))

[(9, 4),
 'empty Y',
 (9, 0),
 'X -> Y',
 (5, 4),
 'empty Y',
 (5, 0),
 'X -> Y',
 (1, 4),
 'empty Y',
 (1, 0),
 'X -> Y',
 (0, 1),
 'fill X',
 (9, 1),
 'X -> Y',
 (6, 4)]

[Video 235](https://www.youtube.com/watch?v=i46-urAXvPU&list=PLAwxTw4SYaPnJVtPvZZ5zXj_wRBjH0FxX&index=235) describes the use of doctests, which are a convenient way to test our code. Norvig calls them via `print(doctest.testmod())`. **TODO** review video 235.
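As a reminder of what a doctest looks like, here is a minimal sketch. The function `pour` is a hypothetical helper, not the course code. The examples in its docstring are written the way they would appear in an interactive session. To keep the sketch independent of how the file is loaded, it runs the examples of that single docstring explicitly instead of calling `doctest.testmod()`, which would collect every docstring in the current module.

```python
import doctest


def pour(x, y, Y):
    """Pour from a glass holding x into a glass holding y with capacity Y.

    >>> pour(9, 0, 4)
    (5, 4)
    >>> pour(1, 0, 4)
    (0, 1)
    """
    amount = min(x, Y - y)
    return x - amount, y + amount


# Parse the examples out of the docstring and run them.
test = doctest.DocTestParser().get_doctest(
    pour.__doc__, {"pour": pour}, "pour", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)  # a (failed, attempted) summary
```

A failing example would be reported with the expected and the actual output side by side, which makes doctests handy both as documentation and as a quick regression check.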

## The Bridge Problem

This is the problem where four people stand on one side of a bridge (`here`) and want to cross to the other side (`there`). People can only cross one or two at a time, and whoever crosses must carry the light. The concepts we need are:

1. Person.
2. A collection of people. Two actually:
    - A collection of people on the `here` side.
    - A collection of people on the `there` side.
3. The light.
4. The concepts of states and paths we have seen before.

How about the representation? We can represent each of the four people by the number of minutes it takes that person to cross the bridge. For a collection of people we have various choices:

- Tuple.
- List.
- Set.
- Frozenset.

Which ones would be well suited? The representation should be hashable. Why? Because if we want to use the same method as in the pour problem, we are going to use an `explored` set, and a set can only contain hashable objects.

Tuples and frozensets are hashable. Sets and lists are not hashable. Note also that sets, like lists, are mutable: assigning one to a second name does not copy it, so a change made through one name is visible through the other.

In [1]:
s1 = set([10, 20, 30])
s2 = s1
s1.pop()
print(s2)
del s1, s2

{20, 30}


All four of the above representations could hold the people, but only tuples and frozensets are hashable. Norvig opts for a representation of a state as `(here, there, t)` where `here` represents everything on this side, `there` is everything on the other side, and `t` is the total elapsed time. `here` and `there` will be represented as frozensets. We include the flashlight in the frozenset as the string `'light'`. At the beginning, `here` is the frozenset `{1, 2, 5, 10, 'light'}` and `there` is the empty frozenset.
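The following sketch builds the initial state in this representation and checks that it can be stored in an `explored` set, while the same state built from plain sets cannot. The variable names are chosen for the example.

```python
here = frozenset({1, 2, 5, 10, "light"})
there = frozenset()
start = (here, there, 0)  # (here, there, elapsed time)

explored = set()
explored.add(start)  # fine: a tuple of frozensets and a number is hashable

try:
    # The same state built from plain sets cannot be hashed.
    explored.add(({1, 2, 5, 10, "light"}, set(), 0))
    plain_sets_hashable = True
except TypeError:
    plain_sets_hashable = False

print(start in explored)
print(plain_sets_hashable)
```

A tuple is only hashable if everything inside it is hashable, which is why the two sides have to be frozensets and not sets.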

What are the successors of the initial state? At the end of the first step the light will go from `here` to `there`, with at least one person. All combinations of sending 1 or 2 people to the other side are acceptable. There are ${4 \choose 2} = 6$ choices for 2 people and 4 for one person, for a total of 10.
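The count can be checked with `itertools.combinations`. The sketch below lists the groups that can leave from the initial `here` side. The crossing cost assumes the usual statement of the puzzle, in which two people walking together move at the pace of the slower one; the notes above do not spell that rule out.

```python
from itertools import combinations

here = frozenset({1, 2, 5, 10, "light"})
people = sorted(here - {"light"})  # the light is not a person

singles = list(combinations(people, 1))
pairs = list(combinations(people, 2))
moves = singles + pairs

# Assumption: a group crosses at the speed of its slowest member.
costs = {group: max(group) for group in moves}

print(len(singles), len(pairs), len(moves))
for group in moves:
    print(group, costs[group])
```

Each of these groups, together with the light, gives one successor of the initial state.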

In [None]:
from itertools import combinations

def foo(fzset):
    people = fzset.difference({'light'})


def successors(state):
    """Return a dict of {state: action} pairs. A state is a (here, there, t)
    tuple, where here and there are frozensets of people (indicated by their
    times) and/or the 'light', and t is a number indicating the elapsed time.
    Action is represented by '->' for here to there and '<-' for there to here."""
    here, there, t = state