# Comprehensions and Generators

Two of the most idiomatic concepts in Python are _comprehensions_ and _generators_. The goal of comprehensions is to improve the readability of loops that create iterables (and reduce the runtime of such a loop given the right circumstances). Generators are used to create an iterable that only produces its elements when they are needed.

Let's walk through some examples to illustrate the concepts.


## Comprehensions

Python offers comprehensions for lists, sets, and dictionaries. The most common one is the list comprehension. Let's show how it works by calculating the squares of a list of numbers:


In [None]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

squares = []
for number in numbers:
    squares.append(number**2)
print(squares)

Now let's write a list comprehension that does the same thing as the loop above:


In [None]:
squares = [number**2 for number in numbers]
print(squares)
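To get a feeling for the runtime claim from the introduction, here is a minimal sketch that times both variants with the standard library's `timeit` module. The names `many_numbers`, `squares_with_loop`, and `squares_with_comprehension` are made up for this illustration, and the measured times will vary between machines and Python versions, so treat the result as an indication rather than a benchmark.

```python
import timeit

# A longer input than the ten numbers above, so the difference is measurable
many_numbers = list(range(1, 1001))


def squares_with_loop():
    squares = []
    for number in many_numbers:
        squares.append(number**2)
    return squares


def squares_with_comprehension():
    return [number**2 for number in many_numbers]


# Both variants must produce exactly the same list
assert squares_with_loop() == squares_with_comprehension()

# Run each variant 200 times and report the total time in seconds
loop_time = timeit.timeit(squares_with_loop, number=200)
comprehension_time = timeit.timeit(squares_with_comprehension, number=200)
print(f"for loop:      {loop_time:.4f} s")
print(f"comprehension: {comprehension_time:.4f} s")
```

On many setups the comprehension comes out somewhat faster, but the main benefit remains readability.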

We can also include a condition and thus filter some of the values:


In [None]:
squared_evens = [number**2 for number in numbers if number % 2 == 0]
print(squared_evens)

There are also set comprehensions:


In [None]:
duplicate_numbers = numbers + numbers
squared_evens = {number**2 for number in duplicate_numbers if number % 2 == 0}
print(duplicate_numbers)
squared_evens

And even dictionary comprehensions:


In [None]:
fruits = ["apple", "banana", "orange", "grapes"]
prices = [0.99, 0.50, 0.89, 2.99]

fruit_prices = {fruit: price for fruit, price in zip(fruits, prices)}
print(fruit_prices)

The `zip()` function creates a new iterable from the passed iterables (in this case two lists) by pairing up their elements position by position. It produces these pairs lazily, one at a time, which is an example of another important concept in Python: generators.
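As a small sketch of how `zip()` behaves, the following cell pairs two lists of different lengths. The lists `letters` and `digits` are made up for this illustration.

```python
letters = ["a", "b", "c"]
digits = [1, 2]  # deliberately one element shorter than letters

pairs = zip(letters, digits)

# Pairs are produced one at a time, only when requested
first_pair = next(pairs)
print(first_pair)  # ('a', 1)

# zip() stops as soon as the shortest input is exhausted, so "c" is dropped
remaining_pairs = list(pairs)
print(remaining_pairs)  # [('b', 2)]
```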


### Summary:

- Comprehensions are _syntactic sugar_ for `for` loops, i.e., an optional, more readable way of writing certain `for` loops that create iterables.


## Generators


When working with iterables, we often only need one element at a time. Especially for larger iterables, it would be wasteful to first create every single element in the iterable, and then go through it _again_ when we actually want to use those elements.

Here is an example of such a situation:


In [None]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

numbers_squared = [number**2 for number in numbers]
print(numbers_squared)

Not only is it cumbersome to write out every number by hand, we also build the entire list `numbers` in memory first and then "touch" every element again when we calculate `numbers_squared`.

A much more elegant way to do this is by using the `range` object:


In [None]:
numbers = range(1, 12)  # Lower boundary is included, upper boundary is not
print(numbers)

Printing the range only shows `range(1, 12)`, not the individual numbers. They are not created immediately but are produced one by one when we iterate through the range:


In [None]:
for number in numbers:
    print(number)
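The following sketch illustrates why a range is cheap: it only stores its start, stop, and step values, no matter how many numbers it covers. The names `small_range` and `large_range` are made up for this illustration, and the exact sizes reported by `sys.getsizeof` depend on the Python implementation (the comparison assumes CPython).

```python
import sys

small_range = range(1, 12)
large_range = range(1, 1_000_000)

# Both objects have the same size because no numbers are stored
small_size = sys.getsizeof(small_range)
large_size = sys.getsizeof(large_range)
print(small_size, large_size)

# A range still knows its length and supports indexing and membership tests
range_length = len(large_range)
eleventh_number = large_range[10]
print(range_length)        # 999999
print(eleventh_number)     # 11
print(500 in large_range)  # True
```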

So now we can implement the example above much more concisely, flexibly, and efficiently:


In [None]:
numbers_squared = [number**2 for number in range(1, 12)]
print(numbers_squared)
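If the squares themselves are also only needed one at a time, the square brackets can be replaced with parentheses. This creates a _generator expression_, which looks like a list comprehension but produces its values lazily. The sketch below is an addition for illustration; the names `squares_list` and `squares_generator` are made up, and the size comparison assumes CPython.

```python
import sys

squares_list = [number**2 for number in range(1, 10_001)]
squares_generator = (number**2 for number in range(1, 10_001))

# The list holds all 10,000 results; the generator only keeps track of its position
generator_is_smaller = sys.getsizeof(squares_generator) < sys.getsizeof(squares_list)
print(generator_is_smaller)  # True

# Functions that consume an iterable, such as sum(), accept the generator directly
total = sum(squares_generator)
print(total == sum(squares_list))  # True
```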

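This kind of lazy evaluation can be found in many places. The object returned by `zip()` works the same way (strictly speaking, it is an iterator rather than a generator, but it likewise creates each tuple only when it is needed):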


In [None]:
print(zip(fruits, prices))

In [None]:
for pair in zip(fruits, prices):
    print(pair)

Another example from `pathlib`:


In [None]:
from pathlib import Path

files = Path("..").glob("*/**")
print(files)

There may be a lot of files in a directory, so creating an iterable with all of the files at once might take quite a while. Most of the time, it is much more efficient to generate the next file when it's needed:


In [None]:
for file in Path("..").glob("*/**"):
    print(file)

If you ever need all of a generator's values at once, you can create a list from it, which internally runs a loop and produces all items:


In [None]:
all_files = list(Path("..").glob("*/**"))
print(all_files)
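One thing to keep in mind is that such lazy objects can usually be consumed only once. The sketch below reuses the fruit example from above; the names `fruit_price_pairs`, `first_pass`, and `second_pass` are made up for this illustration.

```python
fruits = ["apple", "banana", "orange", "grapes"]
prices = [0.99, 0.50, 0.89, 2.99]

fruit_price_pairs = zip(fruits, prices)

first_pass = list(fruit_price_pairs)   # consumes every pair
second_pass = list(fruit_price_pairs)  # nothing is left to produce

print(first_pass)
print(second_pass)  # []
```

If the values are needed more than once, it is usually simplest to store them in a list, as shown above, or to create the lazy object again.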

If you just want to print the values, you can also use unpacking with the `*` operator:


In [None]:
print(*range(1, 100))

In [None]:
print(*zip(fruits, prices))
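All examples so far rely on lazy objects that Python already provides. As a minimal sketch, you can also write your own generator as a function that uses `yield` instead of `return`. The function name `generate_squares` and its parameters are hypothetical and only serve this illustration.

```python
def generate_squares(start, stop):
    """Yield the squares of the integers from start up to (but not including) stop."""
    for number in range(start, stop):
        # The function pauses here until the next value is requested
        yield number**2


squares = generate_squares(1, 6)

# Request the first two values individually
first_square = next(squares)
second_square = next(squares)
print(first_square, second_square)  # 1 4

# list() collects whatever has not been produced yet
remaining_squares = list(squares)
print(remaining_squares)  # [9, 16, 25]
```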

<table >
<tbody>
  <tr>
    <td style="padding:0px;border-width:0px;vertical-align:center">    
    Created by Simon Stone for Dartmouth College Library under <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons CC BY-NC 4.0 License</a>.<br>For questions, comments, or improvements, email <a href="mailto:researchdatahelp@groups.dartmouth.edu">Research Data Services</a>.
    </td>
    <td style="padding:0 0 0 1em;border-width:0px;vertical-align:center"><img alt="Creative Commons License" src="https://i.creativecommons.org/l/by/4.0/88x31.png"/></td>
  </tr>
</tbody>
</table>
