# Lesson 2

## The Zebra Puzzle

We consider a famous puzzle known as the [Zebra Puzzle](https://en.wikipedia.org/wiki/Zebra_Puzzle).

1. There are five houses.
2. The Englishman lives in the red house.
3. The Spaniard owns the dog.
4. Coffee is drunk in the green house.
5. The Ukrainian drinks tea.
6. The green house is immediately to the right of the ivory house.
7. The Old Gold smoker owns snails.
8. Kools are smoked in the yellow house.
9. Milk is drunk in the middle house.
10. The Norwegian lives in the first house.
11. The man who smokes Chesterfields lives in the house next to the man with the fox.
12. Kools are smoked in the house next to the house where the horse is kept.
13. The Lucky Strike smoker drinks orange juice.
14. The Japanese smokes Parliaments.
15. The Norwegian lives next to the blue house.

Now, who drinks water? Who owns the zebra? 

We will develop a solution, discuss whether it is good enough, whether brute-force would work here, and do some back-of-the-envelope calculations.

## Concept Inventory

As usual, we start by making an inventory of the relevant concepts we need to deal with. The first concept is **houses**: there are 5 of them. Then there are **properties** of the inhabitants of these houses, namely:

- Nationality
- Color of the house they live in.
- Pets they own.
- Drinks they drink.
- What they smoke.

Then there is the notion of **assignment** of properties to houses. We can think of assigning, say, the **color** blue to house number two, or the Englishman to the red house. There is also the **location** of a house: the first one, the middle one, the house to the right of another, the house next to another, and so on.

The question is now, do we need to separate out the different types of assignment?

- Do we need a name and a description for each property, for example nationality / "lived in"?
- Do we just need the notion of a property group? For example, the Englishman, Norwegian, Japanese, Spaniard, and Ukrainian belong to the same group, but we don't need to give that group a name.
- Can we ignore this notion of grouping altogether?

The first two choices would both be reasonable, but the third would not. For example, if red is assigned to house 2, then blue cannot be assigned to house 2, but orange juice can. Properties inside a group are therefore mutually exclusive, while properties from different groups are not, and this is an important concept to keep.

The key point is getting the assignments right. Deducing them takes a lot of clever reasoning, so we want to try a "trial and error" approach instead. If we put the Englishman in house 1, where can we put the Spaniard? Since both belong to the same property group, nationality, they cannot be in the same house, so the Spaniard can go in house 2, 3, 4, or 5. For a single property group there are $5!$ possible assignments. Since we have 5 property groups, we have $(5!)^5 = 24883200000$ possibilities, i.e., almost 25 billion! A back-of-the-envelope estimate goes as follows:

- $5! = 120 \approx 10^2$
- $(5!)^5 \approx (10^2)^5 = 10^{10}$
- We rounded 120 down to 100 at the start, so we should round the result up. Let's say 20 billion.

Modern computers can roughly do 1 billion operations per second on a "good second", i.e., one without page faults or cache misses. If we were in the range of millions, we would be fine. This is the "happy valley of computation". If we were in the range of trillions, we could say upfront that the brute force approach is not feasible. We are in the middle, so it's not clear.
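
The estimate can be checked in a few lines. The sketch below computes the exact size of the search space and a rough running time. `OPS_PER_SECOND` and `OPS_PER_CANDIDATE` are assumed round figures chosen for illustration, not measurements.

```python
import math

# Exact size of the search space: 5 property groups, 5! orderings each.
candidates = math.factorial(5) ** 5

# Assumed figures for a rough estimate (not measured):
OPS_PER_SECOND = 1e9      # the "good second" from the text
OPS_PER_CANDIDATE = 100   # hypothetical cost of testing one candidate

seconds = candidates * OPS_PER_CANDIDATE / OPS_PER_SECOND
print(candidates)                            # 24883200000
print(f"about {seconds / 60:.0f} minutes")
```

The result scales linearly with the assumed cost per candidate, so a slower inner loop pushes the estimate proportionally higher.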

How can we represent assignment? Here are three possibilities:

1. We represent houses as an array of sets, i.e., `house[1].add('red')`.
2. We represent houses as classes with properties, i.e., `house[1].color = 'red'`.
3. We just assign the house number to a `red` variable, i.e., `red = 1`.

All three would work, but the last one is simpler, and we will go with this one, at least as long as we don't have evidence that it is *too* simple, and that we need something more complicated.
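
To make the comparison concrete, here is a minimal sketch of the three representations side by side. The `House` class and the container names are hypothetical and serve only to illustrate the options.

```python
# 1. An array of sets: each house holds the set of properties assigned to it.
houses_as_sets = [set() for _ in range(6)]   # index 0 unused, houses are 1..5
houses_as_sets[1].add('red')

# 2. A class with one attribute per property group.
class House:
    def __init__(self):
        self.color = None

houses_as_objects = [House() for _ in range(6)]
houses_as_objects[1].color = 'red'

# 3. One variable per property, holding the house number.
red = 1

# The same question asked of each representation: is house 1 red?
print('red' in houses_as_sets[1])            # True
print(houses_as_objects[1].color == 'red')   # True
print(red == 1)                              # True
```

With the third option, a constraint such as "the Englishman lives in the red house" becomes a plain comparison of two integers.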

The code below would take hours to run, so don't try to call `zebra_puzzle()`.

In [1]:
import itertools


def imright(h1, h2):
    "House h1 is immediately right of h2 if h1 - h2  == 1."
    return h1 - h2 == 1


def nextto(h1, h2):
    "Two houses are next to each other if they differ by 1."
    return abs(h1 - h2) == 1


def zebra_puzzle():
    "Return a tuple (WATER, ZEBRA) indicating their house numbers."
    houses = first, _, middle, _, _ = [1, 2, 3, 4, 5]
    # The numbers below refer to the constraints of the puzzle.
    # Returns the first (WATER, ZEBRA) tuple that satisfies all constraints.
    # House numbers are compared with ==, which tests value equality.
    orderings = list(itertools.permutations(houses))  # 1
    return next((WATER, ZEBRA)
                for (red, green, ivory, yellow, blue) in orderings
                for (Englishman, Spaniard, Ukranian, Japanese, Norwegian) in orderings
                for (dog, snails, fox, horse, ZEBRA) in orderings
                for (coffee, tea, milk, oj, WATER) in orderings
                for (OldGold, Kools, Chesterfields, LuckyStrike, Parliaments) in orderings
                if Englishman == red              # 2
                if Spaniard == dog                # 3
                if coffee == green                # 4
                if Ukranian == tea                # 5
                if imright(green, ivory)          # 6
                if OldGold == snails              # 7
                if Kools == yellow                # 8
                if milk == middle                 # 9
                if Norwegian == first             # 10
                if nextto(Chesterfields, fox)     # 11
                if nextto(Kools, horse)           # 12
                if LuckyStrike == oj              # 13
                if Japanese == Parliaments        # 14
                if nextto(Norwegian, blue))       # 15

Using a generator expression has a few advantages:

1. It requires less indentation than would be necessary if we wrote all the `if` and `for` statements.
2. It does not need to perform all computations, as would be the case for a list comprehension. In other words, generator expressions can stop early.
3. It is easier to edit, for example organizing the constraints as shown above. We have an aligned structure rather than a nested indentation, so we can move things up and down as needed.
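
A small experiment illustrates the second point. In the sketch below, `noisy_squares` and the two log lists are hypothetical helpers that record how many items are actually processed in each case.

```python
def noisy_squares(numbers, log):
    "Yield squares, recording each number actually processed in log."
    for n in numbers:
        log.append(n)
        yield n * n

# Generator expression: next() pulls items only until the first match.
lazy_log = []
first_big = next(sq for sq in noisy_squares(range(1, 1001), lazy_log) if sq > 50)

# List comprehension: every item is processed before we can take the first.
eager_log = []
all_big = [sq for sq in noisy_squares(range(1, 1001), eager_log) if sq > 50]

print(first_big, len(lazy_log))    # 64 8
print(all_big[0], len(eager_log))  # 64 1000
```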

The program above is slow because it generates all $(5!)^5$ combinations and only then filters them. We can make it much faster. For example, the first two `for` clauses go through the orderings of house colors and nationalities, and the first `if` clause checks whether the Englishman lives in the red house. If he does not, there is no point in going through the remaining `for` loops, so we can move that `if` clause up, directly after the second `for` clause.
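
The effect of checking a constraint early can be measured on the first two loops alone. The sketch below counts how many (color, nationality) pairs would go on to the three inner loops, with and without the Englishman check. The names `pairs_without_check` and `pairs_with_check` are illustrative.

```python
import itertools

orderings = list(itertools.permutations([1, 2, 3, 4, 5]))

# No early check: every pair of orderings reaches the inner loops.
pairs_without_check = sum(1
                          for colors in orderings
                          for nationalities in orderings)

# Early check: only pairs where the Englishman lives in the red house survive.
pairs_with_check = sum(1
                       for (red, green, ivory, yellow, blue) in orderings
                       for (Englishman, Spaniard, Ukranian, Japanese, Norwegian) in orderings
                       if Englishman == red)

print(pairs_without_check, pairs_with_check)   # 14400 2880
```

Each surviving pair still triggers up to $120^3$ inner iterations, so removing four fifths of the pairs at this level removes four fifths of all the work below it.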

There are more redundancies that can be removed just by moving certain lines up. The final result is shown below.

In [2]:
%%time

def zebra_puzzle():
    "Return a tuple (WATER, ZEBRA) indicating their house numbers."
    houses = first, _, middle, _, _ = [1, 2, 3, 4, 5]
    # The numbers below refer to the constraints of the puzzle.
    # Returns the first (WATER, ZEBRA) tuple that satisfies all constraints.
    # House numbers are compared with ==, which tests value equality.
    orderings = list(itertools.permutations(houses))  # 1
    return next((WATER, ZEBRA)
                for (red, green, ivory, yellow, blue) in orderings
                if imright(green, ivory)          # 6
                for (Englishman, Spaniard, Ukranian, Japanese, Norwegian) in orderings
                if Englishman == red              # 2
                if Norwegian == first             # 10
                if nextto(Norwegian, blue)        # 15
                for (coffee, tea, milk, oj, WATER) in orderings
                if coffee == green                # 4
                if Ukranian == tea                # 5
                if milk == middle                 # 9
                for (OldGold, Kools, Chesterfields, LuckyStrike, Parliaments) in orderings
                if Kools == yellow                # 8
                if LuckyStrike == oj              # 13
                if Japanese == Parliaments        # 14
                for (dog, snails, fox, horse, ZEBRA) in orderings
                if Spaniard == dog                # 3
                if OldGold == snails              # 7
                if nextto(Chesterfields, fox)     # 11
                if nextto(Kools, horse))          # 12

print(zebra_puzzle())

(1, 5)
CPU times: user 133 µs, sys: 13 µs, total: 146 µs
Wall time: 149 µs


Norvig shows a function `timecall()` to measure the time used by any function call.

In [3]:
import time

def timecall(fn, *args):
    "Call a function and return the elapsed time and the result."
    t0 = time.perf_counter()  # time.clock() was removed in Python 3.8
    result = fn(*args)
    t1 = time.perf_counter()
    return t1 - t0, result
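
As a cross-check on hand-rolled timing, the standard library's `timeit` module offers a similar measurement. The sketch below is only an illustration: `slow_sum` is a hypothetical stand-in workload, not part of the puzzle code.

```python
import timeit

def slow_sum(n):
    "A stand-in workload (hypothetical): sum the integers below n."
    return sum(range(n))

# Run the call 10 times and report the total elapsed time in seconds.
elapsed = timeit.timeit(lambda: slow_sum(100_000), number=10)
print(f"10 calls took {elapsed:.4f} s")
```

Repeating the call several times smooths out the noise of a single measurement, which matters for functions that finish in microseconds.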

There are several aspects we need to consider when writing a program.

1. Is the program correct?
2. Is the program efficient?
3. Is the program easy to debug?

We don't want the code responsible for these different aspects to be all mixed up. The idea of keeping them separate is called "aspect-oriented programming".

Where do we do the counting? When we iterate over `orderings`. We can wrap `orderings` in a function call that produces some debugging information for us. Norvig uses a one-character function name, `c`, so that each `orderings` becomes `c(orderings)`.

## Generator Functions

The function `c()` will take advantage of generator functions. As an example of a generator function, suppose we want a function like `range()` that also includes the upper bound. We can write it like this:

In [4]:
def ints(start, end):
    i = start
    while i <= end:
        yield i
        i += 1

list(ints(3, 7))

[3, 4, 5, 6, 7]

If we want to add the possibility of an infinite sequence, we can set `end=None` and modify the function above as follows:

In [5]:
def ints(start, end=None):
    i = start
    while end is None or i <= end:  # test None first: i <= None raises TypeError
        yield i
        i += 1
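
An infinite generator cannot be passed to `list()` directly, since that would never finish. A minimal sketch of one way to consume it, using `itertools.islice` to take a bounded prefix, is shown below. The version of `ints` here tests `end is None` before comparing, which is an assumption about the intended behavior.

```python
import itertools

def ints(start, end=None):
    "Yield start, start + 1, ... up to and including end, or forever if end is None."
    i = start
    # Test for None first: comparing an int with None raises TypeError in Python 3.
    while end is None or i <= end:
        yield i
        i += 1

print(list(ints(3, 7)))                      # [3, 4, 5, 6, 7]
print(list(itertools.islice(ints(10), 4)))   # [10, 11, 12, 13]
```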

## Nitty Gritty For Loops

What a `for` loop actually does is to take an *iterable* `x` (strings, lists, generators and so on), and call something like

```python
it = iter(x)
```

And then it runs something like 

```python
while True:
    z = next(it)
    print(z) # or something else
```

When the iterator is exhausted, `next(it)` raises a `StopIteration` exception, which the `for` loop catches to end the iteration.
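
Putting the pieces together, the following sketch spells out the whole protocol in runnable form. `manual_for` is a hypothetical helper written for illustration and is not how the interpreter literally implements the loop.

```python
def manual_for(iterable, action):
    "Apply action to each item, spelling out what a for loop does."
    it = iter(iterable)          # ask the iterable for an iterator
    while True:
        try:
            z = next(it)         # fetch the next item
        except StopIteration:    # raised when the iterator is exhausted
            break
        action(z)

seen = []
manual_for('abc', seen.append)
print(seen)   # ['a', 'b', 'c']
```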

In [6]:
def zebra_puzzle():
    "Return a tuple (WATER, ZEBRA) indicating their house numbers."
    houses = first, _, middle, _, _ = [1, 2, 3, 4, 5]
    # The numbers below refer to the constraints of the puzzle.
    # Returns the first (WATER, ZEBRA) tuple that satisfies all constraints.
    # House numbers are compared with ==, which tests value equality.
    orderings = list(itertools.permutations(houses))  # 1
    return next((WATER, ZEBRA)
                for (red, green, ivory, yellow, blue) in c(orderings)
                if imright(green, ivory)          # 6
                for (Englishman, Spaniard, Ukranian, Japanese, Norwegian) in c(orderings)
                if Englishman == red              # 2
                if Norwegian == first             # 10
                if nextto(Norwegian, blue)        # 15
                for (coffee, tea, milk, oj, WATER) in c(orderings)
                if coffee == green                # 4
                if Ukranian == tea                # 5
                if milk == middle                 # 9
                for (OldGold, Kools, Chesterfields, LuckyStrike, Parliaments) in c(orderings)
                if Kools == yellow                # 8
                if LuckyStrike == oj              # 13
                if Japanese == Parliaments        # 14
                for (dog, snails, fox, horse, ZEBRA) in c(orderings)
                if Spaniard == dog                # 3
                if OldGold == snails              # 7
                if nextto(Chesterfields, fox)     # 11
                if nextto(Kools, horse))          # 12

def instrument_fn(fn, *args):
    # c.starts counts the number of times we start iterating over orderings
    # c.items counts the number of times we go through a loop 
    c.starts, c.items = 0, 0
    result = fn(*args)
    print(f'{fn.__name__} got {result} with {c.starts:05} iterations over {c.items:07} items')

def c(sequence):
    """Generate items in sequence, keeping counts as we go. c.starts is the number 
    of sequences started. c.items is the number of items generated."""
    c.starts += 1
    for item in sequence:
        c.items += 1
        yield item

Note that `c()` does not work on its own (for example, `list(c([1, 2, 3]))` raises an `AttributeError`) because the function attributes `c.starts` and `c.items` do not exist until something sets them. Here `instrument_fn` initializes them before calling the function, as shown below.
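
The failure mode is easy to reproduce. The sketch below repeats the definition of `c` so that it is self-contained, then shows the error and a manual initialization of the counters.

```python
def c(sequence):
    "Generate items in sequence, counting starts and items as function attributes."
    c.starts += 1
    for item in sequence:
        c.items += 1
        yield item

# Without initialization the attributes do not exist yet. Because c is a
# generator function, the error appears on the first next(), not on the call.
try:
    list(c([1, 2, 3]))
    failed = False
except AttributeError:
    failed = True
print(failed)   # True

# After setting the counters by hand, c works on its own.
c.starts, c.items = 0, 0
print(list(c([1, 2, 3])), c.starts, c.items)   # [1, 2, 3] 1 3
```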

In [7]:
instrument_fn(zebra_puzzle)

zebra_puzzle got (1, 5) with 00025 iterations over 0002775 items


## Summary

In this lesson we have

1. Made the concept inventory.
2. Refined ideas.
3. Chosen the simplest implementation we could think of.
4. Done a back-of-the-envelope calculation to see how long it would run.
5. Refined the code to make it faster.
6. Built tools for timing, counting, and so on.
7. Introduced the idea of "separation of aspects" to keep the program clean.