## CMPINF 2100 Week 02

### List Comprehensions

## Overview

This notebook contrasts for-loops with the more efficient and streamlined list comprehension. The notebook begins by showing a few shortcomings of the traditional for-loop to motivate why we would consider an alternative **iteration** procedure.

## For-loop pitfalls

Create a list of integers.

In [1]:
my_list = [1, 2, 3, 4, 5]

print( my_list )

[1, 2, 3, 4, 5]


Suppose we want to square every element in the list and store the results in another list. We can do that by initializing a new list, iterating over the elements of the original list, squaring each element, and appending the result to the new list. This approach is shown in the cell below. The iterating variable, `i`, takes the value of each element of `my_list` in turn.

In [2]:
# square every element in the list

my_list_squared = [] # initialize

for i in my_list:
    my_list_squared.append(i ** 2)

In [3]:
print( my_list_squared )

[1, 4, 9, 16, 25]


What happens if we forget to initialize the new list, though? We get an error! The body of the for-loop in the cell below applies the `.append()` method to the `squared_again` object. However, `squared_again` does not exist in the environment, so there is nothing to apply `.append()` to, and Python raises a `NameError`.

In [4]:
for i in my_list:
    squared_again.append(i ** 2)

NameError: name 'squared_again' is not defined

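The approach above calls the `.append()` method of a list, `squared_again`, which does not exist in memory. The `%whos` magic command lists the variables currently defined and confirms that `squared_again` is not among them. Notice that `i` is now 1: the failed loop assigned the first element of `my_list` to `i` before the error interrupted it.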

In [5]:
%whos

Variable          Type    Data/Info
-----------------------------------
i                 int     1
my_list           list    n=5
my_list_squared   list    n=5
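
As a minimal sketch, the same failure can be reproduced without halting the notebook by wrapping the loop in `try`/`except`. The name `not_yet_created` is hypothetical and is intentionally never defined; it stands in for any list we forgot to initialize.

```python
my_list = [1, 2, 3, 4, 5]

try:
    for i in my_list:
        # not_yet_created is a hypothetical name that was never assigned
        not_yet_created.append(i ** 2)
except NameError as err:
    # keep the message so we can inspect it after the loop fails
    error_message = str(err)

print(error_message)
```

The loop stops on its first pass, so `i` is left holding the first element of `my_list`.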


One way to overcome this issue is to define a function. The new list is initialized within the function before iterating over the user-provided list with a for-loop.

In [6]:
def square_all_elements(input_list):
    out_list = [] # initialize output list
    for i in input_list:
        out_list.append(i ** 2)
    return out_list

The function makes it easy to apply the operation to new cases. We do not have to reinitialize an object over and over again. This saves us a step and prevents possible errors from occurring.

In [7]:
square_all_elements(my_list)

[1, 4, 9, 16, 25]

In [8]:
square_all_elements([2, 5, 12])

[4, 25, 144]

## List comprehensions

However, a downside of the function-based approach is that we have to go through the extra effort of defining a function to execute a relatively basic task. After all, the above example consisted of squaring elements. Why do we need to define a function to perform something that sounds so simple? We also need to name and comment the function appropriately so that we know what it does and which data types it accepts.
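
To illustrate that overhead, here is a sketch of how the earlier function might look once it is documented. The docstring wording is only a suggestion; the body is the same loop as before.

```python
def square_all_elements(input_list):
    """Return a new list holding the square of each element.

    input_list: a list (or other iterable) of numbers.
    """
    out_list = []  # initialize output list
    for i in input_list:
        out_list.append(i ** 2)
    return out_list


print(square_all_elements([2, 5, 12]))
```

The documentation is longer than the loop itself, which is the extra effort this section refers to.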

Instead, a more streamlined approach is the **list comprehension**. The list comprehension does not require any initializations. List comprehensions can seem tricky and strange at first. However, once you master their operation you will find that they simplify your code. 

List comprehensions require that we break down the iteration process into 3 main ingredients:
* the action to perform
* the variable to iterate with
* the sequence to iterate over

We unpack those 3 main ingredients and write them "in a single line" using the following structure:

`<output> = [ <action to perform using a variable> for <variable> in <sequence> ]`

Please note the `<>` is a placeholder. The word/phrase within `<>` is what we will change depending on the situation.

Let's see how we can make use of the list comprehension to square the elements within our list. The action is squaring a value, such as `i`. The value `i` is the variable we are iterating with. The value, `i`, comes from a sequence of values **in** `my_list`. Thus, our key ingredients are squaring the value, `i ** 2`, **for** each value **in** `my_list`. Those ingredients are compiled into the list comprehension syntax below. The result is assigned to the `squared_comp` object.

In [9]:
squared_comp = [ i ** 2 for i in my_list ]

List comprehensions produce lists! That's why the 3 ingredients were surrounded by `[]` in the above cell.

In [10]:
print( type(squared_comp) )

<class 'list'>


The elements within the returned list have values equal to the result of the action. The squared values are printed below.

In [11]:
print( squared_comp )

[1, 4, 9, 16, 25]
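
As a quick check, the sketch below builds the squares both ways and compares them. The names `loop_squares` and `comp_squares` are hypothetical and used only for this comparison.

```python
my_list = [1, 2, 3, 4, 5]

# for-loop: the output list must be initialized first
loop_squares = []
for i in my_list:
    loop_squares.append(i ** 2)

# list comprehension: action, variable, and sequence on one line
comp_squares = [i ** 2 for i in my_list]

# the two approaches should produce identical lists
print(loop_squares == comp_squares)
```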


With the list comprehension we do not have to worry about initialization! This makes it easier to apply the structure and operation to a new situation compared with the for-loop approach. For example, let's square every element in a different list.

In [12]:
[ i ** 2 for i in [2, 5, 12] ]

[4, 25, 144]

The real benefit of list comprehensions is that the **action** can be whatever we want! For example, let's consider multiplying 3 by a value. That value will be 2 raised to a power (an exponent). The cell below multiplies 3 by 2 raised to the 3rd power.

In [13]:
3 * (2 ** 3)

24

The action is performing the following steps:

In [14]:
3 * (2 * 2 * 2)

24

However, I want to iterate over different powers! I want to start with the 0th power and end with the 5th power. Let's use a for-loop first to make it easy to print the calculations performed at each step:

In [16]:
for d in range(6):
    print( "3 * (2 ** %d) = %d" % (d, 3 * (2 ** d) ) )

3 * (2 ** 0) = 3
3 * (2 ** 1) = 6
3 * (2 ** 2) = 12
3 * (2 ** 3) = 24
3 * (2 ** 4) = 48
3 * (2 ** 5) = 96


The for-loop works, but the results are NOT stored. The results are simply printed! We need to initialize a list for storage or use a function to handle all operations for us. However, the action performed within the for-loop is **DIFFERENT** from our previous action of squaring elements! Thus, we **cannot** use the `square_all_elements()` function we previously created!

However, a list comprehension is capable of applying the action for each value of the iterating variable! The results are stored in a list without any initialization!

In [18]:
show_math = [ "3 * (2 ** %d) = %d" % (d, 3 * (2 ** d) ) for d in range(6) ]

In [19]:
show_math

['3 * (2 ** 0) = 3',
 '3 * (2 ** 1) = 6',
 '3 * (2 ** 2) = 12',
 '3 * (2 ** 3) = 24',
 '3 * (2 ** 4) = 48',
 '3 * (2 ** 5) = 96']
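
The action above uses `%`-style string formatting. As an alternative sketch, the same strings can be built with an f-string (available in Python 3.6 and later). The name `show_math_f` is hypothetical.

```python
# same action as before, written with an f-string instead of % formatting
show_math_f = [f"3 * (2 ** {d}) = {3 * (2 ** d)}" for d in range(6)]

for line in show_math_f:
    print(line)
```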

The action can be simpler though. The action could be just the returned value, rather than the string which details the mathematical steps.

In [20]:
just_results = [ 3 * (2 ** d) for d in range(6) ]

In [21]:
just_results

[3, 6, 12, 24, 48, 96]

We can also easily change the sequence we iterate over! The iteration can be applied up to the 10th power:

In [22]:
[ 3 * (2 ** d) for d in range(11) ]

[3, 6, 12, 24, 48, 96, 192, 384, 768, 1536, 3072]
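
The sequence does not have to be a run of consecutive integers. As a sketch, the cell below keeps the same action but iterates over only the even powers, using the step argument of `range()`, and then over a hand-picked tuple of powers. The names `even_powers` and `selected_powers` are hypothetical.

```python
# even powers from 0 through 10: d = 0, 2, 4, 6, 8, 10
even_powers = [3 * (2 ** d) for d in range(0, 11, 2)]

# any sequence works, such as a tuple of chosen powers
selected_powers = [3 * (2 ** d) for d in (1, 4, 7)]

print(even_powers)
print(selected_powers)
```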

## Why does this matter?

For-loops are essential programming tools. For-loops allow us to repeat an action many, many, many times. We do not need to copy and paste code. We can let the for-loop handle the procedure for us. However, for-loops are tedious. For-loops require careful thought and planning to properly execute. Sometimes we just want to accomplish a task in a single line of code! The list comprehension provides that streamlined syntax for applying actions over sequences!