# List and Dictionary Comprehension

If you want to type along with me, use [this notebook](https://humboldt.cloudbank.2i2c.cloud/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fbethanyj0%2Fdata271_sp25&branch=main&urlpath=tree%2Fdata271_sp25%2Flectures%2Fdata271_lec04_live.ipynb) instead. 
If you don't want to type and want to follow along just by executing the cells, stay in this notebook. 

## List comprehension
*List comprehensions* are a convenient and widely used Python feature. They allow you to form a new list by filtering the elements of a collection and transforming the elements that pass the filter, all in one concise expression. They take the basic form

```python
[expr for value in collection if condition]
```

where `collection` is an *iterable* object in Python (something we can loop through, like a list) and `value` is an *iterator variable* (a variable that temporarily takes on the value of each element of `collection` as we iterate through it).
This is equivalent to running the following for loop:
```python
result = []
for value in collection:
    if condition:
        result.append(expr)
```

So you can see that they make your code much more concise, condensing multiple lines of code into a single line!


The filter condition can be omitted, leaving only the expression. For example, given a list of strings, we could convert all of the strings to uppercase like this:

In [1]:
strings = ['a','as','bat','car','dove','python']

[x.upper() for x in strings]

['A', 'AS', 'BAT', 'CAR', 'DOVE', 'PYTHON']

In [2]:
# as a for loop

new_list = []
for x in strings:
    new_list.append(x.upper())
    
new_list

['A', 'AS', 'BAT', 'CAR', 'DOVE', 'PYTHON']

Or we could use a filter condition. For example, given the same list of strings, we could filter out strings with length 2 or less and convert the remaining strings to uppercase like this:

In [3]:
[x.upper() for x in strings if len(x)>2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

In [4]:
# as a for loop

new_list2 = []
for x in strings:
    if len(x) > 2:
        new_list2.append(x.upper())
        
new_list2

['BAT', 'CAR', 'DOVE', 'PYTHON']

Just as the name of the iterator variable doesn't really matter in a for loop, it doesn't really matter in a list comprehension either. It is good practice to choose a name that is descriptive.

In [5]:
[i.upper() for i in strings if len(i) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

In [6]:
[string.upper() for string in strings if len(string) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

List comprehensions can also be a convenient way to generate lists.

In [7]:
range(0,31,2)

range(0, 31, 2)

In [8]:
# generate a list of even numbers between 0 and 30 (inclusive)
list1=[x for x in range(0,31) if x%2 == 0]
list1

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30]

In [9]:
# or you can do this without a condition
list2 = [2*num for num in range(0,16)]
list2

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30]

In [10]:
# or another way
list3 = [num for num in range(0,31,2)]
list3

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30]
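
When the expression is just the iterator variable and there is no filter, the comprehension only copies the iterable, so the built-in `list()` does the same job. Here is a minimal sketch of that equivalence (the names `evens_comprehension` and `evens_direct` are illustrative and not part of the lecture):

```python
# A comprehension that copies each element unchanged...
evens_comprehension = [num for num in range(0, 31, 2)]

# ...gives the same list as passing the range straight to list()
evens_direct = list(range(0, 31, 2))

print(evens_comprehension == evens_direct)  # True
```

The comprehension becomes the better tool as soon as you need to transform or filter the elements along the way.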

Note that you can also create these new lists from dictionaries, sets, or tuples.

In [11]:
my_dict = {'first':1,'second':2,'third':3}

[key + str(val) for key,val in my_dict.items()]

['first1', 'second2', 'third3']

In [12]:
my_set = {1,2,3,4,5,5}

[i for i in my_set]

[1, 2, 3, 4, 5]

In [13]:
my_tup = (1,2,3,4,5)

[i for i in my_tup]

[1, 2, 3, 4, 5]

## Dictionary comprehension
A dictionary comprehension is a natural extension and works exactly the same way as list comprehension, but we need to specify both the keys and values. A dictionary comprehension looks like this:
```python
{key-expr:value-expr for value in collection if condition}
```
Like list comprehensions, dictionary comprehensions are mostly for convenience, but they similarly can make code easier to both read and write. Here are some examples:

In [14]:
# Generate a dict that has 0 to 9 as keys and the square of the key as values
dict1 = {x : x**2 for x in range(10)}
dict1

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

In [15]:
# conditions work with dictionary comprehensions too
dict2 = {x : x**2 for x in range(10) if x > 4}
dict2

{5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
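
Just as with list comprehensions, it can help to see the equivalent for loop. This is a sketch of how `dict2` could be built step by step (the name `dict2_loop` is only for illustration):

```python
# Loop equivalent of {x : x**2 for x in range(10) if x > 4}
dict2_loop = {}
for x in range(10):
    if x > 4:                  # the filter condition
        dict2_loop[x] = x**2   # key-expr on the left, value-expr on the right

print(dict2_loop)  # {5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
```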

This can be very handy for creating a lookup map for the location of elements in a list:

In [16]:
# Create a lookup map of strings for their location in the strings list. This can be very nice when lists get big!
lookup = {value:index for index, value in enumerate(strings)}
lookup

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

In [17]:
# How this might be used 
list_related_to_strings = [len(i) for i in strings]
list_related_to_strings[lookup['bat']]

3
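
For comparison, here is a sketch of the lookup map next to the list method `index`, which searches the list from the start on every call. The `dupes` list is a hypothetical example added to show one caveat: when a value appears more than once, the map keeps its last position, while `index` returns the first.

```python
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

# Build the lookup map once...
lookup = {value: index for index, value in enumerate(strings)}

# ...then each query is a dictionary lookup instead of a scan through the list
print(lookup['dove'])         # 4
print(strings.index('dove'))  # 4

# Caveat: with duplicate values, the map keeps the LAST position,
# while list.index returns the FIRST one
dupes = ['a', 'b', 'a']  # hypothetical list for illustration
dupe_lookup = {value: index for index, value in enumerate(dupes)}
print(dupe_lookup['a'], dupes.index('a'))  # 2 0
```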

In [18]:
# Can use a dictionary as your iterable
{key.upper():value**2 for key,value in my_dict.items()}

{'FIRST': 1, 'SECOND': 4, 'THIRD': 9}

In [19]:
# Can use a set as your iterable
{str(i):i for i in {1,2,3,4,5,5}}

{'1': 1, '2': 2, '3': 3, '4': 4, '5': 5}

You can also iterate through tuples. Test this out on your own if you want.
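
If you would like something to check your attempt against, here is one possible sketch. The `pairs` list and the names `tup_dict` and `pair_dict` are made up for this illustration.

```python
my_tup = (1, 2, 3, 4, 5)

# Iterate through a tuple to build a dictionary
tup_dict = {str(i): i for i in my_tup}
print(tup_dict)  # {'1': 1, '2': 2, '3': 3, '4': 4, '5': 5}

# A list of (key, value) tuples can be unpacked in the for clause
pairs = [('first', 1), ('second', 2), ('third', 3)]  # hypothetical data
pair_dict = {key: val**2 for key, val in pairs}
print(pair_dict)  # {'first': 1, 'second': 4, 'third': 9}
```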

## Nested List Comprehension
Suppose we have a list of lists containing some English and Spanish names. 

In [20]:
all_data = [['John','Emily','Michael','Mary','Steven'],['Maria','Juan','Javier','Natalia','Pilar']]

Suppose we wanted to get a single list containing all names with two or more a's in them. How would we do this strictly with for loops?

In [21]:
names_we_want = []

for names in all_data:
    for name in names:
        if name.count("a") >=2:
            names_we_want.append(name)
    
names_we_want

['Maria', 'Natalia']

In [22]:
# or if we use some list comprehension

names_we_want = []

for names in all_data:
    enough_as = [name for name in names if name.count("a") >=2]
    names_we_want.extend(enough_as)
    
names_we_want

['Maria', 'Natalia']

It turns out you can actually wrap this whole operation up in a single *nested* list comprehension, which looks like this:

In [23]:
[name for names in all_data for name in names if name.count("a") >=2]

['Maria', 'Natalia']

At first, nested list comprehensions are a bit hard to wrap your head around. The `for` parts of the list comprehension are arranged according to the order of nesting. The outermost goes first, the innermost goes last. Any filter condition goes at the very end (just like we saw in ordinary list comprehension). Here is another example where we flatten a list of tuples of integers into just a list of integers.

In [24]:
some_tuples = [(1,2,3),(4,5,6),(7,8,9)]
flattened = [x for tup in some_tuples for x in tup]
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

Keep in mind that the order of the `for` expressions would be the same if you wrote a nested for loop instead of a list comprehension. 

In [25]:
flattened = []

for tup in some_tuples:
    for x in tup:
        flattened.append(x)
        
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

You can have arbitrarily many levels of nesting, though if you have more than two or three levels of nesting, you should probably start to question whether this makes sense from a code readability standpoint. Would someone be able to approach your code and understand what is going on? If the answer is no, it's a good idea to rework some things.
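
To get a feel for where readability starts to suffer, here is a sketch with three levels of nesting. The `cube` data is hypothetical and only there to illustrate the point.

```python
# Hypothetical data: a list of lists of lists
cube = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

# Three levels in one comprehension: legal, but harder to read
flat = [x for layer in cube for row in layer for x in row]

# The same logic as explicit loops, in the same order
flat_loop = []
for layer in cube:
    for row in layer:
        for x in row:
            flat_loop.append(x)

print(flat)               # [1, 2, 3, 4, 5, 6, 7, 8]
print(flat == flat_loop)  # True
```

Both versions give the same list, so at this depth the choice mostly comes down to which one a reader can follow more easily.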

## If-else statements
You can include if else statements in list comprehensions with the following syntax:


```python
[expr if condition else expr2 for value in collection]
```

In [26]:
# Squares the number if it is even, puts 0 if it is not 
[num**2 if num % 2 == 0 else 0 for num in range(10)]

[0, 0, 4, 0, 16, 0, 36, 0, 64, 0]

In [27]:
# as a for loop

result = []
for i in range(10):
    if i % 2 == 0:
        result.append(i**2)
    else:
        result.append(0)
result

[0, 0, 4, 0, 16, 0, 36, 0, 64, 0]

You can see that the if-else statement comes before the for in the example above. This may seem different than how you observed `if` come up in list comprehensions before. Let's look into this a little more. Let's try putting our if (not the else) from above at the end of the list comprehension. 

In [28]:
[num**2 for num in range(10) if num % 2 == 0]

[0, 4, 16, 36, 64]

We see here that using `if` in this way resulted in something slightly different. When `if` is used at the end of a list comprehension, it serves as a sort of first-pass filter, where nothing is added to the list if the condition is not met. On the other hand, when the `if` (and else) comes before the `for`, it is used to adjust what is added to the list based on a condition. So in summary, an `if` statement at the end is used as a filter, and an `if-else` at the beginning allows us to change the output based on a condition. Play around with this on your own to gain some understanding. 
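
Here is one more side-by-side sketch of the two uses, this time with the `strings` list from earlier (redefined so the cell runs on its own). The names `filtered` and `transformed` are illustrative.

```python
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

# `if` at the end filters: short strings are dropped
filtered = [s.upper() for s in strings if len(s) > 2]
print(filtered)  # ['BAT', 'CAR', 'DOVE', 'PYTHON']

# `if-else` before the `for` transforms: every string is kept
transformed = [s.upper() if len(s) > 2 else s for s in strings]
print(transformed)  # ['a', 'as', 'BAT', 'CAR', 'DOVE', 'PYTHON']

print(len(filtered), len(transformed))  # 4 6
```

The filter changes how many elements end up in the list, while the if-else only changes what each element looks like.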

What if we want a list comprehension that handles an `elif`? We have to transform the logic a bit first.

In [29]:
# squares the number if it is divisible by 3, keeps the number if it has a remainder of 1 when divided by 3, 0 otherwise

result = []
for i in range(10):
    if i % 3 == 0:
        result.append(i**2)
    elif i % 3 == 1:
        result.append(i)
    else:
        result.append(0)

result

[0, 1, 0, 9, 4, 0, 36, 7, 0, 81]

In [30]:
# rewrite this without any elifs

result = []
for i in range(10):
    if i % 3 == 0:
        result.append(i**2)
    else:
        if i % 3 == 1:
            result.append(i)
        else:
            result.append(0)

result

[0, 1, 0, 9, 4, 0, 36, 7, 0, 81]

In [31]:
# so with list comprehension
[num**2 if num % 3 == 0 else num if num % 3 == 1 else 0 for num in range(10)]

[0, 1, 0, 9, 4, 0, 36, 7, 0, 81]

Note that you can also use this along with conditions as a filter. 

In [32]:
[num**2 if num % 3 == 0 else num if num % 3 == 1 else 0 for num in range(10) if num % 2 == 0]

[0, 0, 4, 36, 0]
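
As a for loop, the combined version could be sketched like this: the filter becomes the outer `if`, and the if-else chain decides what gets appended.

```python
result = []
for num in range(10):
    if num % 2 == 0:               # filter: only even numbers get through
        if num % 3 == 0:
            result.append(num**2)  # divisible by 3: square it
        elif num % 3 == 1:
            result.append(num)     # remainder 1: keep it
        else:
            result.append(0)       # otherwise: 0

print(result)  # [0, 0, 4, 36, 0]
```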

In [33]:
# if-else with dictionary comprehension -- changing values 
{i:i**3 if i < 5 else i**2 for i in range(10)}

{0: 0, 1: 1, 2: 8, 3: 27, 4: 64, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

In [34]:
# if-else with dictionary comprehension -- changing keys and values 
{(i if i < 5 else i**2):(i**3 if i < 5 else i**2) for i in range(10)}

{0: 0, 1: 1, 2: 8, 3: 27, 4: 64, 25: 25, 36: 36, 49: 49, 64: 64, 81: 81}

## Activities

**Activity 1:** Create a list containing each word of the string *Peter piper picked a pair of pickled peppers*. Then create a dictionary containing each word as a key and the length of each word as the value. 

In [38]:
peter = 'Peter piper picked a pair of pickled peppers'
peter_list = peter.split()

{word:len(word) for word in peter_list}

{'Peter': 5,
 'piper': 5,
 'picked': 6,
 'a': 1,
 'pair': 4,
 'of': 2,
 'pickled': 7,
 'peppers': 7}

**Activity 2:** Take the following list:

`doctor = ['house', 'cuddy', 'chase', 'thirteen', 'wilson']`

Use list comprehension to produce a list of the first character of each string in `doctor`.

In [40]:
doctor = ['house', 'cuddy', 'chase', 'thirteen', 'wilson']
[word[0] for word in doctor]

['h', 'c', 'c', 't', 'w']

**Activity 3:** Using the range of numbers from 0 to 9 as your iterable and `i` as your iterator variable, write a list comprehension that produces a list of numbers consisting of the squared values of `i`.

In [41]:
[i**2 for i in range(0,10)]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

**Activity 4:** Matrices can be represented as a list of lists in Python. For example, a 5 x 5 matrix with values 0 to 4 in each row can be written as:

```python
matrix = [[0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4]]
```

Recreate this matrix by using nested list comprehensions. Recall that you can create one of the rows of the matrix with a single list comprehension. To create the list of lists, you simply have to supply the list comprehension as the output expression of the overall list comprehension:

`[[output expression] for iterator variable in iterable]`

In [42]:
matrix = [[i for i in range(5)] for x in range(5)]
matrix

[[0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4]]

**Activity 5:** Use `member` as the iterator variable in a list comprehension to create a list that only includes the members of `fellowship` that have 7 characters or more.

In [43]:
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']
[member for member in fellowship if len(member) >=7]

['samwise', 'aragorn', 'legolas', 'boromir']

**Activity 6:** Using the same `fellowship` list, write a list comprehension with an if-else conditional in the output expression that keeps members of `fellowship` with 7 or more characters and replaces the others with an empty string. Use `member` as the iterator variable in the list comprehension.

In [44]:
[member if len(member) >=7 else '' for member in fellowship]

['', 'samwise', '', 'aragorn', 'legolas', 'boromir', '']