# Week 3: Loops

# Advanced: Comprehensions

Welcome to the *Week 3 Advanced Python Notebook*. This notebook is designed for students who already have substantial experience with writing loops in Python and are confident with the content of the [`Beginner`](./week_03_loops_beginner.ipynb) and [`Intermediate`](./week_03_loops_intermediate.ipynb) notebooks.  

Your task today is to carefully read through the content and complete the exercises at the end. These exercises are more challenging and are intended to deepen your understanding of how Python handles data behind the scenes.  

> **Important:** This notebook is only recommended if you are already very confident with Python. Before beginning, you must have attempted at least $4$ exercises from both the [`Beginner`](./week_03_loops_beginner.ipynb) and [`Intermediate`](./week_03_loops_intermediate.ipynb) notebooks. If you have not done so, please return to those notebooks first, as the material here builds directly on the `Beginner` material and is substantially more complex than the `Intermediate` material.  

In this notebook, you will explore comprehensions. Comprehensions are a short-hand syntax for creating collections from `for` loops and are extremely useful for condensing long code and improving readability.  

Work through the examples carefully, and take your time with the exercises. They are designed to stretch your understanding and prepare you for advanced applications of Python.  


### Table of Contents

 - [Welcome Page](./week_03_home.ipynb)

 - [Beginner: For and While Loops](./week_03_loops_beginner.ipynb)
 - [Intermediate: Advanced Iteration](./week_03_loops_intermediate.ipynb)
 - [**Advanced: Comprehensions**](./week_03_loops_advanced.ipynb)
     - [List Comprehensions](#List-Comprehensions)
     - [Set Comprehensions](#Set-Comprehensions)
     - [Dictionary Comprehensions](#Dictionary-Comprehensions)
     - [Combining Conditionals and Comprehensions](#Combining-Conditionals-and-Comprehensions)
     - [Generator Comprehensions](#Generator-Comprehensions)
     - [Exercises](#Exercises)
 - [Slides](./week_03_slides.ipynb) ([Powerpoint](./Lecture3_Loops.pptx))


## List Comprehensions

In this section, we shall look at a nice short-hand for creating a list using a `for` loop, known as a *list comprehension*. To get started, let's consider the following `for` loop:

In [None]:
# List of numbers
numbers = [1,3,5,7,11]

# Create a new empty list (we are going to fill this with numbers in a moment)
numbers_squared = []

# Loop through the list of numbers
for number in numbers:
    
    # Add the square of each number to the new list
    numbers_squared.append(number**2)
    
# Print the result
print(numbers_squared)

Given a list of numbers, `numbers`, this code loops through the list one element at a time, squaring each element and saving the result in the `numbers_squared` list. This code works well, but it is quite long for a fairly simple operation.

The same logic can be written more concisely with a *list comprehension* as follows:

In [None]:
# List of numbers
numbers = [1,3,5,7,11]

# Square the numbers
numbers_squared = [number**2 for number in numbers]

# Print the result
print(numbers_squared)

As can be seen, we still have a lot of the syntax from the original `for` loop (we keep the keywords `for` and `in` for instance), but the code is now much shorter and potentially more readable.

In general the syntax for a list comprehension is as follows:

> ```result = [expression for item in iterable]```

Let's break this down a little. In this syntax, we have:
  - `iterable`: The sequence we loop over (e.g. a `list`, `string`, `range`,... etc).
  - `item`: The variable representing the current element from the iterable.
  - `expression`: What we want to calculate using `item`. 
  - `for... in`: These express the `for` loop itself.
  - `[]`: This indicates that we are building a list.
  - `result`: The variable we save the output to.
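
As a rough illustration of how these pieces map onto concrete code, the sketch below uses a `range` and a string as the iterable. The variable names (`tens`, `letters`) are chosen purely for illustration.

```python
# iterable: range(1, 6)   item: n   expression: n * 10
tens = [n * 10 for n in range(1, 6)]
print(tens)     # [10, 20, 30, 40, 50]

# iterable: the string "abc"   item: letter   expression: letter.upper()
letters = [letter.upper() for letter in "abc"]
print(letters)  # ['A', 'B', 'C']
```

In both cases the `expression` is evaluated once for every `item`, and the results are collected in order.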
  

Here are two more examples of list comprehensions:

In [None]:
# Long string.
my_string = "Hello. I am a long string. I contain many letters and words. I even contain a few sentences."

# List comprehension 1
list_comprehension1 = [len(w) for w in my_string.split()]

# List comprehension 2
list_comprehension2 = [len(w) for w in my_string.split(".")]

# Print the results
print(list_comprehension1)
print(list_comprehension2)

 > **Test your understanding:** Using your knowledge of list comprehensions, explain what the above code is doing. What are the `list_comprehension1` and `list_comprehension2` lists counting? Why is the last element of `list_comprehension2` equal to zero?

List comprehensions are not the only type of comprehension in Python. The same concise syntax can be used to build *sets* and *dictionaries*.

## Set Comprehensions

A *set comprehension* looks almost identical to a *list comprehension*, except that it uses curly braces `{}` instead of square brackets `[]`.

In [None]:
# Example: square numbers using a set comprehension
numbers = [1, 2, 3, -2, 4, 1]

squares = {n**2 for n in numbers}
print(squares)

Notice that sets automatically remove duplicates. For instance, even though both $-2$ and $2$ appear in the list, $4=2^2=(-2)^2$ appears only once in the set.

The general syntax for a *set comprehension* is given by:

> ```result = {expression for item in iterable}```

This is identical to the syntax for list comprehensions, except that the square brackets have been replaced by curly braces.
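
For comparison, here is a sketch of the same set built with a standard `for` loop. The names `squares_loop` and `squares_comprehension` are illustrative only.

```python
numbers = [1, 2, 3, -2, 4, 1]

# Start with an empty set (note that {} would create an empty dictionary)
squares_loop = set()
for n in numbers:
    # Adding a value that is already present leaves the set unchanged
    squares_loop.add(n**2)

# The set comprehension builds the same set in one line
squares_comprehension = {n**2 for n in numbers}

print(squares_loop == squares_comprehension)  # True
```

The loop version has to call `set()` to create the empty set, since empty curly braces are reserved for dictionaries.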

## Dictionary Comprehensions

A *dictionary comprehension* allows us to generate key–value pairs in a single line. The syntax is very similar to list and set comprehensions, but instead of just an expression, we provide a `key: value` pair for each item.

In [None]:
# Example: mapping words to their lengths
words = ["apple", "banana", "pear", "cherry"]

word_lengths = {word: len(word) for word in words}
print(word_lengths)

To see what's happening under the hood, here's the same example written using a standard `for` loop:

In [None]:
words = ["apple", "banana", "pear", "cherry"]

word_lengths = {}  # start with an empty dictionary
for word in words:
    word_lengths[word] = len(word)

print(word_lengths)

Both versions create a dictionary that stores the fruit name as the key and its length as the value. The comprehension just combines the loop and assignment into a single, concise expression.

Again the syntax is similar here, but this time we have:

> ```result = {key_expression: value_expression for item in iterable}```

Notice that we are still using curly braces `{}`, just like in a set comprehension. The difference here is the colon (`:`) inside the braces. The colon tells Python we are building key-value pairs for a *dictionary* rather than values for a *set*.
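
The effect of the colon can be checked directly. The short sketch below builds one collection with a colon and one without, from the same iterable; the names `as_set` and `as_dict` are illustrative only.

```python
numbers = range(4)

# No colon: each expression becomes an element of a set
as_set = {n**2 for n in numbers}

# Colon: each item produces a key (n) and a value (n**2) in a dictionary
as_dict = {n: n**2 for n in numbers}

print(type(as_set), as_set)
print(type(as_dict), as_dict)
```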

## Combining Conditionals and Comprehensions

So far, we've seen comprehensions that apply the same operation to every element of an iterable. But often we want to be more selective. We might instead want to:

 - Only include items that meet some condition.
 - Or transform items differently depending on a condition.

Comprehensions let us do both of these by combining them with *conditional expressions*.

**Filtering Items:** A conditional at the end of a comprehension can be used to keep the items that satisfy a certain criterion and discard the others. For instance, the list comprehension below keeps only the even numbers:

In [None]:
# Example: keep only even numbers
numbers = range(10)
evens = [n for n in numbers if n % 2 == 0]
print(evens)

This comprehension says:

 - loop through the numbers `0–9`,
 - check if each one is divisible by `2`,
 - and only keep those which are divisible by `2`.

The equivalent for loop looks like this:

In [None]:
numbers = range(10)
evens = []
for n in numbers:
    if n % 2 == 0:
        evens.append(n)

print(evens)

This same idea also works for sets and dictionaries:

In [None]:
# Set comprehension: keep even squares only
even_squares = {n**2 for n in numbers if n % 2 == 0}
print(even_squares) 

# Dict comprehension: keep only long words
words = ["ant", "elephant", "cat", "giraffe"]
long_words = {w: len(w) for w in words if len(w) > 3}
print(long_words) 

 > **Test your understanding:** Make sure you understand the above code before moving on. How would you modify the set comprehension to select odd numbers instead?

**Conditional Expressions Inside a Comprehension:** Instead of filtering, you can also use a conditional expression inside the comprehension to decide how to transform each item. Here every element is included, but its value depends on the condition.

In [None]:
# Example: classify numbers as "even" or "odd"
classification = {n: ("even" if n % 2 == 0 else "odd") for n in range(6)}
print(classification)

This comprehension says:

 - loop through the numbers `0–5`,
 - use the number as the key,
 - and store `"even"` or `"odd"` as the value depending on whether it divides by `2`.

The equivalent for loop is given by:

In [None]:
classification = {}
for n in range(6):
    if n % 2 == 0:
        classification[n] = "even"
    else:
        classification[n] = "odd"

print(classification)

 > **Note:** Further information on conditional expressions can be found in the [week 2 advanced notebook](../02/week_02_booleans_and_conditionals_advanced.ipynb).
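
The two techniques can also be combined in a single comprehension. As a minimal sketch (the name `labels` is illustrative only), the conditional expression goes at the start, where it decides the value, and the filter goes at the end, where it decides which items are kept:

```python
numbers = range(10)

# Filter (at the end): keep only the multiples of 3, i.e. 0, 3, 6 and 9
# Conditional expression (at the start): label each kept number
labels = [("even" if n % 2 == 0 else "odd") for n in numbers if n % 3 == 0]
print(labels)  # ['even', 'odd', 'even', 'odd']
```

Note that a filter uses `if` on its own, whereas a conditional expression always needs both `if` and `else`.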

## Generator Comprehensions

All the comprehensions we have seen so far (lists, sets, and dictionaries) build the entire collection as soon as you define them. Python loops through the iterable right away, computes every result, and stores them in memory.

That is fine for small data, but if the sequence is very large the computation can use a lot of memory. Often we do not need to keep everything at once. Instead, it is better to produce values only when they are actually needed.

A generator comprehension (also known as a *generator expression*) allows us to do exactly that. It looks just like a list comprehension, but uses round parentheses `()` instead of square brackets `[]`.

In [None]:
# Generator comprehension: squares of numbers
numbers = range(5)
squares = (n**2 for n in numbers)

print(squares)           # <generator object ...>
print(list(squares))     # [0, 1, 4, 9, 16]

Here, `squares` is not a list. It is a *generator object*. Instead of computing and storing all the values at once, it creates them one at a time as you iterate over it. This makes generator comprehensions memory efficient, since nothing is stored unnecessarily. They are only ever evaluated on demand, producing values when the program asks for them.

For example:

In [None]:
# Create a generator comprehension
generator_comprehension = (n**2 for n in range(1000000))

# Sum of squares without ever building a list in memory
total = sum(generator_comprehension)
print(total)

This calculates the sum of a million squares efficiently because Python generates each square only when the `sum()` function requests it, and then discards it right away.
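
This on-demand behaviour can be observed by requesting values one at a time with the built-in `next()` function. The sketch below is illustrative only; the names `first`, `second`, `third` and `remaining` are not part of the examples above.

```python
squares = (n**2 for n in range(3))

# Each call to next() computes exactly one more value
first = next(squares)
second = next(squares)
third = next(squares)
print(first, second, third)  # 0 1 4

# The generator is now exhausted, so there is nothing left to collect
remaining = list(squares)
print(remaining)  # []
```

Once a generator has been consumed it cannot be restarted. If the values are needed again, the generator comprehension has to be created again.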

Generator comprehensions are preferable when the full collection would be too large to fit in memory, or when the values only need to be passed over once. However, generator comprehensions can also be slower than list comprehensions, since producing each value on demand adds a small overhead per item, and a generator can only be iterated over once. In practice, which tool is best will depend upon the task you are doing. 
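
As a rough way to see the difference in memory use, the sketch below compares the size of a list and a generator built from the same expression using `sys.getsizeof`. The exact numbers depend on your Python version and platform, so none are shown here; `n_items` is an arbitrary size chosen for illustration.

```python
import sys

n_items = 100_000

squares_list = [n**2 for n in range(n_items)]
squares_generator = (n**2 for n in range(n_items))

# sys.getsizeof reports the size in bytes of the object itself.
# For the list, this excludes the integer objects it refers to,
# so the true memory cost of the list is larger still.
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_generator))
```

The generator's size stays small however large `n_items` is, because it only stores what it needs to produce the next value.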

## Exercises

**Question 1:** From `range(1, 21)`, use a comprehension to create a set of the squares of the even numbers only.

In [None]:
# Write your code here...

**Question 2:** Below is a large integer.

In [None]:
my_integer = 347892472349234

Using a list comprehension and the `sum` function, work out the sum of the digits of `my_integer`.

In [None]:
# Write your code here...

*Hint: Convert the integer to a string first and recall that strings are iterable!*

**Question 3:** In this question we shall call an integer that equals the sum of the cubes of its digits "pluriperfect". For example, $153$ is pluriperfect as $153=1^3+5^3+3^3$. Using the `range` function together with list comprehensions, compute all pluriperfect numbers between $0$ and $999$.

In [None]:
# Write your code here...

*Hint: For this question, you can use the `sum` function, which sums the items in a list. You may find that you need more than one list comprehension for this task!*

**Question 4:** Below is a long string:

In [None]:
my_string = "This is a lengthy string. It has many sentences. " + \
            "MaNY LetteRs aRe UpPer caSe. It might " + \
            "have these characters as well ?!,:-'... and numbers like " + \
            "123456789"
print(my_string)

Using an appropriate choice of comprehension and conditional statements, compute and print the unique vowels in `my_string`. Your code should account for spaces, punctuation, numbers and case.

In [None]:
# Write your code here...

*Hint: You might want to consider the `lower` string method from week 1 and the membership test `in "aeiou"`.*

**Question 5:** The built-in function `max()` can be applied to any generator or iterable covered in this notebook. It returns the largest value in a numerical collection. Suppose you want to modify your code from question 3 to find the largest pluriperfect number less than `10000000000`. Which comprehension would be most appropriate for this task, and why? *You do not need to implement your answer for this question.*