# Python structure generators (comprehensions)

In Python, many compound structures (e.g. lists, sets) can be declared not only in a static way, but also algorithmically. In that case we don’t simply write down what elements a compound data structure should contain — we declare how to create them.

Generating a list this way is called a *list comprehension*, a term shared by many languages. Python can build other structures the same way, so there are also *set comprehensions* and *dict comprehensions*.

To generate list elements, we can use the `for`, `in`, and `if` keywords. The expression `expression for item in container` is interpreted as: compute the expression for every element of the container, and in the expression we will call the current element `item`.

So:



In [None]:
primes = [2, 3, 5, 7, 11, 13, 17, 19]

prime_squares = [x*x for x in primes]

print(prime_squares)

[4, 9, 25, 49, 121, 169, 289, 361]
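A comprehension can be read as shorthand for an explicit loop. The following is a minimal sketch of the equivalent loop; the name `prime_squares_loop` is introduced here only for illustration.

```python
primes = [2, 3, 5, 7, 11, 13, 17, 19]

# Build the list step by step with an explicit loop.
prime_squares_loop = []
for x in primes:
    prime_squares_loop.append(x * x)

# The comprehension produces the same list in a single expression.
prime_squares = [x * x for x in primes]

print(prime_squares_loop == prime_squares)  # True
```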


After the `in` keyword, anything can appear that you can take elements from.
Such an object is called an *iterable*. (It’s called that because you can *iterate* over it, i.e. step through it.)
Examples of iterables are `list`, `set`, `dict`, or generators.

Let’s see it with a `range` object, which produces its numbers on demand:

In [None]:
numbers = range(10)

# In Python, the exponentiation operator is **
cubes = [num ** 3 for num in numbers]
print(cubes)

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
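The same pattern works for any iterable. As a small sketch (the data and variable names below are made up for illustration), here are comprehensions over a string, a dict, and a set:

```python
# A string is iterable: it yields its characters one by one.
upper_letters = [ch.upper() for ch in "abc"]

# Iterating over a dict yields its keys (in insertion order).
ages = {"Anna": 31, "Bob": 27}  # hypothetical example data
names = [name for name in ages]

# A set is iterable too; sorted() is used because sets have no fixed order.
doubled = sorted([2 * n for n in {1, 2, 3}])

print(upper_letters)  # ['A', 'B', 'C']
print(names)          # ['Anna', 'Bob']
print(doubled)        # [2, 4, 6]
```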


During generation we can also specify a condition — i.e. when we want to include an element in the list and when we don’t:

In [None]:
data = [-1, 8, 6, 12, -99, 103]

positive_data = [x for x in data if x > 0]

print(positive_data)

[8, 6, 12, 103]
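The condition and the expression can be combined: the `if` part decides which elements are kept, and the expression decides what is stored for each kept element. A minimal sketch using the same `data` (the names `positive_squares` and `positive_squares_loop` are illustrative):

```python
data = [-1, 8, 6, 12, -99, 103]

# Keep only the positive values and store their squares.
positive_squares = [x * x for x in data if x > 0]

# Equivalent explicit loop, for comparison.
positive_squares_loop = []
for x in data:
    if x > 0:
        positive_squares_loop.append(x * x)

print(positive_squares)  # [64, 36, 144, 10609]
```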


It’s also possible to take elements not only from one iterable, but from several at the same time, and compute all combinations:

In [None]:
adjectives = ['small', 'big', 'beautiful']
animals = ['dog', 'cat']

[adj + " " + animal for adj in adjectives for animal in animals]

['small dog',
 'small cat',
 'big dog',
 'big cat',
 'beautiful dog',
 'beautiful cat']

You can read the above roughly as: each element of the list is the concatenation of an adjective, a space, and an animal, where `adj` takes the elements of `adjectives` one by one and, for each of them, `animal` takes the elements of `animals` one by one.

Longer comprehensions can be split across several lines for readability:

In [None]:
[adj + " " + animal
 for adj in adjectives
 for animal in animals
]
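The order of the `for` clauses matches the order of nested loops: the first `for` is the outer loop and the second is the inner one. A minimal sketch of that correspondence (the names `pairs` and `pairs_loop` are illustrative):

```python
adjectives = ['small', 'big', 'beautiful']
animals = ['dog', 'cat']

# The first `for` of the comprehension is the outer loop,
# the second `for` is the inner loop.
pairs_loop = []
for adj in adjectives:
    for animal in animals:
        pairs_loop.append(adj + " " + animal)

pairs = [adj + " " + animal for adj in adjectives for animal in animals]

print(pairs == pairs_loop)  # True
print(len(pairs))           # 6
```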

In [None]:

# Is there a number whose first and last digit are the same,
# and which is the sum of the cube and square of a natural number less than 100?

numbers = [x**3 + x**2 for x in range(100)]
matches = [x for x in numbers if str(x)[0] == str(x)[-1]]

print(matches)

[0, 2, 252, 6156, 20412, 230702, 242172, 291852, 689216]


In [None]:

# Is there a pair of (square of a natural number < 100) and (prime < 100)
# such that their sum is divisible by 1234?

primes_lt_100 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
squares = [x*x for x in range(100)]
matches = [(n, p) for n in squares for p in primes_lt_100 if (n+p) % 1234 == 0]
print(matches)




[(2401, 67), (9801, 71)]


In [None]:
# What numbers can be formed from sets a and b such that
# we choose exactly one element from each and multiply them?

a = {8, 12, 9, 31, 14}
b = {2, 4, 11, 7, 6, 3}

products_set = { x*y for x in a for y in b}

print(products_set)

{132, 16, 18, 24, 154, 27, 28, 32, 36, 42, 48, 54, 56, 186, 62, 63, 72, 84, 341, 88, 217, 93, 98, 99, 124}


Why did we use a `set` here and not a `list`?

## Generator

The `for-in` structure can also be used inside simple parentheses (even as a function argument).
So we can write something like:
```python
g = (x*x for x in element_list)
```
Here `g` will not be a tuple (as we know, the essence of tuples is the comma, not the parentheses), but something else: a generator. A generator isn’t really a “container structure”, but it’s a similar concept. It can yield elements without actually having a concrete in-memory structure behind it. You can also use a generator in other comprehensions or in a `for` loop.
It’s worth noting that printing it (e.g. with `print`) is usually not useful: it will only show that it is a generator, not its “contents”.

Generators are tricky because they can be exhausted!
For example, if you created it to generate elements of a finite list, it will yield them one by one — and after they run out, it won’t yield anything.

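We’ve seen something similar before: `range()`.
When we wrote `range(10000)`, there weren’t 10,000 actual integers stored in memory; it created them and yielded them one by one.
Unlike a generator, however, a `range` object is not exhausted: you can iterate over it as many times as you like.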

In [10]:
elements = [1, 9, 3, 2, 5]
g = (x*x for x in elements) # not a tuple, but a generator
type(g)

generator

In [None]:
print(g) # this is not very useful....

In [12]:
# but of course you can convert it to anything:
list(g) # or set(g) or tuple(g) ....

[1, 81, 9, 4, 25]

In [14]:
# but now your generator is exhausted and
# if you try again, there is nothing in it...
list(g)

[]
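To watch the exhaustion happen step by step, you can pull elements from a generator one at a time with the built-in `next()`. A minimal sketch, using a shorter list than above (the variable names are illustrative):

```python
elements = [1, 9, 3]
g = (x * x for x in elements)

first = next(g)    # the generator computes 1*1 only now
second = next(g)   # then 9*9
rest = list(g)     # whatever is left: just 3*3
again = list(g)    # nothing is left, so this is an empty list

# next() accepts a default that is returned when the generator is exhausted
# (without a default it would raise StopIteration).
leftover = next(g, None)

print(first, second, rest, again, leftover)  # 1 81 [9] [] None
```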

Because this “exhaustion” can be confusing, we usually use the generator form only when we are sure we need its elements once, for example as an argument to a function that expects an iterable.
There, however, it is very useful, and when the generator expression is the only argument of the call, we don’t even have to write a second pair of parentheses:

In [None]:
# sum of squares:
sum(x*x for x in elements)

In [None]:
# all letter combinations from two sets:
",".join(p+q for p in "abc" for q in "xyz")
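Other built-in functions that accept an iterable work the same way. The sketch below repeats the two calls above with their results and adds `any`, `all`, and `max` as further examples (the variable names are illustrative):

```python
elements = [1, 9, 3, 2, 5]

total = sum(x * x for x in elements)          # 1 + 81 + 9 + 4 + 25
largest_square = max(x * x for x in elements)
has_large = any(x > 8 for x in elements)      # is at least one element > 8?
all_positive = all(x > 0 for x in elements)   # is every element > 0?
joined = ",".join(p + q for p in "abc" for q in "xyz")

print(total)           # 120
print(largest_square)  # 81
print(has_large)       # True
print(all_positive)    # True
print(joined)          # ax,ay,az,bx,by,bz,cx,cy,cz
```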

In [None]:
range_values = range(10)
list(range_values) # convert to a list
list(range_values) # and try again: unlike a generator, a range is not exhausted
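A `range` object is a lazy sequence rather than a generator, so it can be iterated repeatedly. A minimal sketch contrasting the two (all names below are illustrative):

```python
# A range can be converted to a list as often as you like.
range_values = range(5)
range_first = list(range_values)
range_second = list(range_values)

# A generator expression yields its elements only once.
gen = (n for n in range(5))
gen_first = list(gen)
gen_second = list(gen)

print(range_first, range_second)  # [0, 1, 2, 3, 4] [0, 1, 2, 3, 4]
print(gen_first, gen_second)      # [0, 1, 2, 3, 4] []
```

If the elements are needed more than once, one option is to store them in a list first, or to create the generator again.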

Task:

We have gears with the following tooth counts:
8, 11, 13, 17, 18, 24, 36, 68, 72

We want to place exactly two gears in sequence.
Can we achieve a fourfold (4:1) gear ratio? If yes, how?

Create a list comprehension that generates the correct gear pairs.

Hint: if two 8-tooth gears are placed in sequence, that is a 1:1 ratio. If a 16-tooth driving gear is followed by an 8-tooth driven gear, that is 8:16, i.e. a 1:2 ratio, because while the 16-tooth turns once, it turns the 8-tooth twice (twice as fast and half as strong).

