# Nested Comprehensions

I love comprehensions. They let us replace many lines of loops with a single concise sentence.
Comprehensions give Python a "declarative" flavor - that means that instead of issuing commands like a cooking recipe ("take 2 eggs, break them, mix in a bowl") we *declare* the result in a single expression ("make a bowl of 2 mixed eggs").

Python has blessed us with the ability to make a comprehension not just out of loops, but also out of *nested* loops.
To my surprise, I found out that nested comprehensions behave counter-intuitively - very uncharacteristic for Python, which I always considered a very English-like language.

Take for example the following nested comprehension:

In [1]:
l = [(i, j) for i in range(3) for j in range(3, 6)]

Which index are we iterating over first? If we looked at the first 3 items in the list - would they have different `i`'s or different `j`'s?
My first answer was that `i` varies faster - we generate the tuple `(i,j)` for 3 different `i`'s, then we do that repeatedly for 3 different `j`'s.

I was wrong:

In [2]:
print(l)

[(0, 3), (0, 4), (0, 5), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)]
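
A quick way to convince yourself of this ordering is to write the same thing as explicit nested loops. This is only a minimal sketch; the names `pairs_from_comprehension` and `pairs_from_loops` are mine, chosen for illustration:

```python
# Comprehension: the `for` clauses are read left to right.
pairs_from_comprehension = [(i, j) for i in range(3) for j in range(3, 6)]

# Equivalent nested loops: the leftmost `for` becomes the outermost loop.
pairs_from_loops = []
for i in range(3):            # outer loop, varies slowest
    for j in range(3, 6):     # inner loop, varies fastest
        pairs_from_loops.append((i, j))

print(pairs_from_comprehension == pairs_from_loops)  # True
print(pairs_from_loops[:3])                          # [(0, 3), (0, 4), (0, 5)]
```

The first three items share the same `i` and differ in `j`, so it is the *last* index that varies the fastest.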


Let's see another example.
We want to generate a flat list of all uppercase letters from a list of words.

Intuitively I would write something like this:

In [2]:
animals = ['dog', 'cat', 'ape']
[char.upper() for char in animal for animal in animals]

NameError: name 'animal' is not defined

But this raises an error - `animal` is referenced before it is defined.
Why?

Because Python interprets this in the opposite order - this comprehension really translates to:

In [3]:
l = []
for char in animal:
    for animal in animals:
        l.append(char.upper())

NameError: name 'animal' is not defined

Indeed, PEP 202 (https://peps.python.org/pep-0202/) states that nested comprehensions behave like nested loops - the last index varying the fastest.
There is a core logic here - Python is always read "top-to-bottom, left-to-right", and to stay consistent with nested loops (which read top-to-bottom), a comprehension lays out the very same `for` and `if` clauses, in the same order, left-to-right.

But in the case of nested comprehensions it's easy to get confused, because the holy "left-to-right" rule is not fully kept.
The resulting expression is positioned *at the start* of the comprehension, instead of at the end as in nested loops.
In our example - if this were a single loop we would phrase it as the (very English) sentence "take `char.upper()` for `char` in `animal`"; so naturally we feel like a double loop should just continue with "... for every `animal` in `animals`".

That's why the correct syntax reads pretty weird:

In [4]:
[char.upper() for animal in animals for char in animal]

['D', 'O', 'G', 'C', 'A', 'T', 'A', 'P', 'E']
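
For comparison, here is how I would spell out the same comprehension as nested loops. It is a sketch rather than anything official; the variable names `letters` and `grouped` are only illustrative. I also added the "English-like" variant, which is valid Python but is a comprehension *inside* a comprehension, so it builds a list of lists instead of a flat list:

```python
animals = ['dog', 'cat', 'ape']

# Loop form of: [char.upper() for animal in animals for char in animal]
letters = []
for animal in animals:        # first `for` clause -> outer loop
    for char in animal:       # second `for` clause -> inner loop
        letters.append(char.upper())

print(letters)
# ['D', 'O', 'G', 'C', 'A', 'T', 'A', 'P', 'E']

# A comprehension nested inside another one reads "inside out" and is not flat.
grouped = [[char.upper() for char in animal] for animal in animals]
print(grouped)
# [['D', 'O', 'G'], ['C', 'A', 'T'], ['A', 'P', 'E']]
```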

## Why Shouldn't It Be Different?

Maybe the Python developers made a mistake? Would it be simpler to write comprehensions with the *first* index varying the fastest?

It seems plausible that for the interpreter itself it's easier to read comprehensions left-to-right: the clauses appear in exactly the same order as in regular nested loops (only without the `:` and the indentation).

But there may be another good reason not to make comprehensions read in the opposite order.
Often in loops we use a mechanism called "early skipping" - if some condition is met (or not met), we skip the rest of the loop body for that iteration:

In [2]:
for i in range(3):
    if i > 0:
        # Do a lot of work...
        print("Work is done")

Work is done
Work is done


Comprehensions can also do early skipping:

In [2]:
# The list will contain the return values of 'print' (all None), which is not really useful;
# we use 'print' only to see work getting done
l = [print("Work is done") for i in range(3) if i > 0]

Work is done
Work is done
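
To make the correspondence explicit, here is a small sketch of the same skip written with `continue`, next to its comprehension twin. The `i * 10` is just my stand-in for "a lot of work", and the variable names are illustrative:

```python
# Loop form: skip the rest of the body when the condition is not met.
kept_by_loop = []
for i in range(3):
    if not i > 0:
        continue                  # early skip: nothing below runs for i == 0
    kept_by_loop.append(i * 10)   # stand-in for "a lot of work"

# Comprehension form: the `if` clause is checked before the expression runs.
kept_by_comprehension = [i * 10 for i in range(3) if i > 0]

print(kept_by_loop)                           # [10, 20]
print(kept_by_loop == kept_by_comprehension)  # True
```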


If we have *nested* loops, then we can have multiple early skips - at each level we can decide whether we want to continue into the inner loops.

Say we have a dictionary of words grouped by category; we want to iterate only over the food categories and get every food starting with the letter 'A'.

In [4]:
dictionary = {
    'Fruits': ['Apple','Banana','Coconut'],
    'Vegetables': ['Artichoke', 'Broccoli', 'Cucumber'],
    'Animals': ['Ant', 'Baboon', 'Cheetah']
}

Solving this with regular loops is straightforward:

In [6]:
for category in dictionary:
    if category in ['Fruits', 'Vegetables']:
        for item in dictionary[category]:
            if item.startswith('A'):
                print(item)

Apple
Artichoke


How will this look with nested comprehensions?

In [5]:
[
    item
    for category in dictionary
    if category in ['Fruits', 'Vegetables']
    for item in dictionary[category]
    if item.startswith('A')
]

['Apple', 'Artichoke']

Suddenly, it is very convenient that the comprehension is arranged exactly like the top-to-bottom syntax!
Not only does it read exactly the same as the nested loops, but it is also super clear that when we reach a non-edible category we early-skip it.
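
If you want to see the early skip happening, one option is to route the inner lookup through a small recording helper. This is only a sketch: `items_of` and `looked_up` are hypothetical names I made up for the demonstration, not part of any library.

```python
dictionary = {
    'Fruits': ['Apple', 'Banana', 'Coconut'],
    'Vegetables': ['Artichoke', 'Broccoli', 'Cucumber'],
    'Animals': ['Ant', 'Baboon', 'Cheetah'],
}

looked_up = []

def items_of(category):
    """Hypothetical helper: records which categories are actually expanded."""
    looked_up.append(category)
    return dictionary[category]

a_foods = [
    item
    for category in dictionary
    if category in ['Fruits', 'Vegetables']
    for item in items_of(category)   # only reached if the filter above passed
    if item.startswith('A')
]

print(a_foods)    # ['Apple', 'Artichoke']
print(looked_up)  # ['Fruits', 'Vegetables']
```

`'Animals'` never shows up in `looked_up`: the category filter sits before the inner `for`, so the inner iterable is not even evaluated for it.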

Let's assume you were convinced by my suggestion to switch the order of the comprehension; the result (in this hypothetical syntax, not in real Python) would look like this:

In [None]:
[
    item
    for item in dictionary[category]
    if item.startswith('A')
    for category in dictionary
    if category in ['Fruits', 'Vegetables']
]

I find it harder to understand the **order** in which iterations are early-skipped here - are we first filtering for every item starting with 'A' or for every food category?

Remember that comprehensions, like loops, can nest quite deep.
Would it be convenient to read through that many lines only to find out that the last line is a condition that skips most of them?
I believe not; in loops with skipping conditions, it's better to do early skipping, well, *early*.
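
Even in today's syntax you can put a filter later than it needs to be, and it costs you. Here is a rough sketch that counts how many items each version inspects; `starts_with_a`, `letter_checks` and `FOOD_CATEGORIES` are hypothetical names I introduced just for counting:

```python
dictionary = {
    'Fruits': ['Apple', 'Banana', 'Coconut'],
    'Vegetables': ['Artichoke', 'Broccoli', 'Cucumber'],
    'Animals': ['Ant', 'Baboon', 'Cheetah'],
}
FOOD_CATEGORIES = ('Fruits', 'Vegetables')
letter_checks = {'early': 0, 'late': 0}

def starts_with_a(item, label):
    """Hypothetical helper: counts how many items each version inspects."""
    letter_checks[label] += 1
    return item.startswith('A')

early = [
    item
    for category in dictionary
    if category in FOOD_CATEGORIES        # filter categories first
    for item in dictionary[category]
    if starts_with_a(item, 'early')
]

late = [
    item
    for category in dictionary
    for item in dictionary[category]
    if starts_with_a(item, 'late')
    if category in FOOD_CATEGORIES        # same filter, applied last
]

print(early == late)   # True
print(letter_checks)   # {'early': 6, 'late': 9}
```

Both versions return the same list, but the late filter walks through the three animals before throwing them away.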

So although the expression starts with `item` - which nudges us towards English-like syntax ("for each item in the category, for each category in the dictionary...") - in complicated cases it's convenient that we can convert comprehensions to the familiar top-to-bottom form (convenient for programmers, at least).
And if you are used to `black`-style formatting, where each iteration or condition gets its own line - that makes it even easier :)

Hopefully this notebook untangled the syntax of nested comprehensions, and may your iterations always be simple and concise.