<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Python List & Dictionary Comprehensions

---

### Learning Objectives
*After this lesson, you will be able to:*
- Create list comprehensions 
- Create dictionary comprehensions 
- Use conditional logic (`if`/`else`) within list & dictionary comprehensions
- Use `zip()` and `enumerate()` within list & dictionary comprehensions
- Use nested list & dictionary comprehensions 

---

### Lesson Guide

- [Warm-Up on Python Basics](#warm-up)
- [Basic List Comprehensions](#list_comprehensions)
- [Basic Dictionary Comprehensions](#dictionary_comprehensions)
- [Conditional Logic within Comprehensions](#conditional_comprehensions)
- [Zip and Enumerate within Comprehensions](#zip_enumerate)
- [Application of Comprehensions with Pandas](#with_pandas)
- [Nested Comprehensions](#nested_comprehensions)

<a id='warm-up'></a>

### Warm-Up on Python Basics

---

Take the next 10-15 minutes to write code for the questions below, which cover the Python basics that you reviewed yesterday.

#### Warm-Up A: Remove the last element in `lstA` below, then sort it, insert the number `22` into the 5th position, and take a slice of the 7th through the 10th elements (inclusive).

**Hint:** You can use the function `dir()` to find out which attributes and methods are available for any Python object.
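
The `dir()` output can be long because it includes the double-underscore ("dunder") attributes. As a small preview of the comprehensions covered later in this lesson, here is a minimal sketch that keeps only the everyday method names. The shortened list and the name `public_methods` are illustrative assumptions, not part of the warm-up.

```python
lstA = [13, 15, -4, 8]  # shortened, hypothetical copy of the warm-up list

# Keep only the names that do not start with an underscore
public_methods = [name for name in dir(lstA) if not name.startswith('_')]

print(public_methods)
```

The result should contain familiar methods such as `append`, `insert`, `pop`, and `sort`.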

In [1]:
lstA = [13,15,-4,8,23,25,17,44,-7,-10,0,1,5,0,2,8,45]

In [2]:
dir(lstA)

['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

In [3]:
lstA.sort()
lstA

[-10, -7, -4, 0, 0, 1, 2, 5, 8, 8, 13, 15, 17, 23, 25, 44, 45]

In [4]:
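#lstA was already sorted in place in the previous cell, so lstA[:-1] drops 45 here,
#which also happens to be the last element of the original list
ls = sorted(lstA[:-1])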

In [5]:
ls.insert(4, 22)

In [6]:
ls[6:10]

[1, 2, 5, 8]

#### Warm-Up B: Remove Diesel from `dictB` below. Add Teddy to the dictionary with a value of 5. Get a list of the (key, value) tuples now in the dictionary.

In [7]:
dictB = {'Mabel':10,
         'Wilbur':12,
         'Diesel':4,
         'Schatzie':9}

In [8]:
dictB.pop('Diesel')


4

In [9]:
dictB['Teddy'] = 5

In [10]:
list(dictB.items())

[('Mabel', 10), ('Wilbur', 12), ('Schatzie', 9), ('Teddy', 5)]

<a id='list_comprehensions'></a>

### Basic List Comprehensions

---

List comprehensions are a compact and readable syntax for building a new list from any iterable (a list, tuple, string, range, and so on) in a single expression.

They can often replace a `for` loop that appends to a list!

In [21]:
#Let's write a for-loop to take the list below and return a list where each element has been squared:
numbers_A = [1,2,3,4,5,6,10,12]

ls_ = []
for n in numbers_A:
    ls_.append(n**2)


In [22]:
ls_

[1, 4, 9, 16, 25, 36, 100, 144]

In [23]:
#Now, let's do the same thing with a list comprehension:
numbers_B = [1,3,5,7,9,11,15]


[n**2 for n in numbers_B]

[1, 9, 25, 49, 81, 121, 225]

- Within the brackets, the pieces mirror the parts of a for loop:
  1. The **operation per element** or **expression for the outcome** comes first: `n**2`
  2. Next is the **for loop variable assignment**: `for n`
  3. Last comes the **list of elements to iterate over**: `in numbers_B`
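
The same three-part pattern works over any iterable, not only lists. Here is a minimal sketch using made-up sample values (`word` and `point` are illustrative names, not from the lesson):

```python
# General shape: [expression for item in iterable]

# Iterating over a range of integers
squares = [n**2 for n in range(1, 6)]

# Iterating over the characters of a string
word = 'data'  # hypothetical sample value
upper_letters = [ch.upper() for ch in word]

# Iterating over a tuple
point = (3, 4, 12)  # hypothetical sample value
doubled = [coord * 2 for coord in point]

print(squares)        # [1, 4, 9, 16, 25]
print(upper_letters)  # ['D', 'A', 'T', 'A']
print(doubled)        # [6, 8, 24]
```

Whatever the input type, the result of a list comprehension is always a new list.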

#### Quick Practice: Try these basic list comprehensions!

In [24]:
#Multiply every element in this list by 10, and then subtract 4:
numbers = [6,10,8,5,3]

In [25]:
[n*10-4 for n in numbers]

[56, 96, 76, 46, 26]

In [26]:
#Use .capitalize() to get a list of the names with the first letters capitalized:
names = ['alex','TOM','kate','Emily','hilde']


In [27]:
[s.capitalize() for s in names]

['Alex', 'Tom', 'Kate', 'Emily', 'Hilde']

<a id='dictionary_comprehensions'></a>

### Basic Dictionary Comprehensions

---

You can also use comprehensions to create dictionaries instead of lists!
You'll need to use `{}` instead of `[]`, and the expression at the front needs to be a `key: value` pair, so decide what each key and each value should be!

In [30]:
#let's write a for-loop to create a dictionary that stores how many 'e's there are in the words below:
words_A = ['exasperated','angry','elated','incredulous']


In [31]:
'exasperated'.count('e')

3

In [32]:
d = dict()
for w in words_A:
    d[w] = w.count('e')

In [33]:
d

{'exasperated': 3, 'angry': 0, 'elated': 2, 'incredulous': 1}

In [34]:
#now let's do the same thing with a dictionary comprehension:
words_B = ['embarrassed','exhausted','overjoyed','embittered']
{w:w.count('e') for w in words_B}

{'embarrassed': 2, 'exhausted': 2, 'overjoyed': 2, 'embittered': 3}

In [35]:
#now let's do the same thing again, but this time, let's count both the 'e's and the 'a's:
words_B = ['embarrassed','exhausted','overjoyed','embittered']
{w:w.count('e')+w.count('a') for w in words_B}

{'embarrassed': 4, 'exhausted': 3, 'overjoyed': 2, 'embittered': 3}
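
One thing to keep in mind when choosing the key: dictionary keys are unique, so if the key expression produces the same key twice, the later value replaces the earlier one. A minimal sketch, using a hypothetical word list (`words_C`) that is not part of the lesson data:

```python
# Hypothetical word list in which two words share a first letter
words_C = ['apple', 'banana', 'avocado', 'cherry']

# Keyed by first letter: 'avocado' overwrites 'apple' under the key 'a'
by_first_letter = {w[0]: w for w in words_C}

print(by_first_letter)                     # {'a': 'avocado', 'b': 'banana', 'c': 'cherry'}
print(len(words_C), len(by_first_letter))  # 4 3
```

If the resulting dictionary is shorter than the input, colliding keys are a likely cause.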

#### Quick Practice: Try these basic dictionary comprehensions!

In [36]:
#Create a dictionary storing the length of each word in the list below:
words = ['bus','train','airplane','tram','helicopter']
{w:len(w) for w in words}

{'bus': 3, 'train': 5, 'airplane': 8, 'tram': 4, 'helicopter': 10}

In [37]:
#Create a dictionary that stores the length of each of the surnames in the list below, but with the names capitalized:
#ie: 'Grant': 5, etc
surnames = ['grant','Sketchley','REUSTLE','huse','Mellgard']
#.capitalize() makes the first letter uppercase and the rest lowercase
{n.capitalize():len(n) for n in surnames}

{'Grant': 5, 'Sketchley': 9, 'Reustle': 7, 'Huse': 4, 'Mellgard': 8}

In [38]:
#Create a dictionary that stores the square and the cube of each of the numbers below:
numbers = [1,2,3,4,5]

{n:(n**2, n**3) for n in numbers}


{1: (1, 1), 2: (4, 8), 3: (9, 27), 4: (16, 64), 5: (25, 125)}

<a id='conditional_comprehensions'></a>

### Conditional Logic within Comprehensions

---

You can use `if`/`else` logic within comprehensions, much as you can in a for loop!

A rule of thumb is:
- If the `if` **changes the outcome** for each element, write it together with an `else` as part of the expression **at the beginning of your comprehension**, before the `for`: `[a if condition else b for x in items]`
- If the `if` is **filtering out some of the values** (for example, you ONLY want to find the square roots of the positive numbers in a list, and skip all the negatives), then it goes right **at the end of your comprehension**, after the iterable and with no `else`: `[a for x in items if condition]`
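
To make the rule of thumb concrete, here is a minimal sketch of both placements using the square-root example. The list `values` is hypothetical sample data.

```python
import math

values = [16, -4, 9, -1, 25]  # hypothetical sample data

# Filtering: the `if` goes at the end, so negatives are skipped entirely
roots = [math.sqrt(v) for v in values if v > 0]

# Changing the outcome: `if`/`else` sits before the `for`,
# so every element still produces a result
roots_or_none = [math.sqrt(v) if v > 0 else None for v in values]

print(roots)          # [4.0, 3.0, 5.0]
print(roots_or_none)  # [4.0, None, 3.0, None, 5.0]
```

Note that the filtered list is shorter than the input, while the `if`/`else` version always has the same length as the input.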

In [39]:
#Let's write a for-loop to binarize the list of numbers below depending on whether they are below 10
#(If the number is below 10, we replace it with 0; if it's 10 or above, we replace it with 1)
numbers_A = [5,7,8,19,30]

new_A = []
for n in numbers_A:
    if n < 10:
        new_A.append(0)
    else:
        new_A.append(1)
new_A

[0, 0, 0, 1, 1]


In [40]:
#Now let's do the same thing with a list comprehension:
numbers_B = [34,2,8,13,20]
[0 if n < 10 else 1 for n in numbers_B]

[1, 0, 0, 1, 1]

In [41]:
#Let's write a dictionary comprehension to store whether each word is short or long in the list below
#If the word has six or more letters, we'll label it 'l' (long); otherwise we'll label it 's' (short)
#We want to skip over any items that aren't words (strings)
lst_A = ['ostentatious','house','industrial', None,'dog',8,'eat']


In [42]:
{w:'s' if len(w) < 6 else 'l' for w in lst_A if isinstance(w, str)}

{'ostentatious': 'l', 'house': 's', 'industrial': 'l', 'dog': 's', 'eat': 's'}

In [43]:
#Now let's try the same thing as above, but this time,
# if the word has 4 or 5 letters, classify it as 'm' (medium)
lst_A = ['ostentatious','house','industrial',None,'dog',8,'eat']
#(no backslash is needed to continue a line inside the braces)
{w:'s' if len(w) < 4 else 'm' if len(w) < 6 else 'l'
 for w in lst_A if isinstance(w, str)}

{'ostentatious': 'l', 'house': 'm', 'industrial': 'l', 'dog': 's', 'eat': 's'}

#### Quick Practice: Try these comprehensions with conditionals!

In [44]:
#Write a dictionary comprehension to store the length of each of the words in the list below, 
#but only for the words that end in 't'!
words = ['cat','dog','elephant','rabbit','lizard']

{w:len(w) for w in words if w[-1] == 't'}


{'cat': 3, 'elephant': 8, 'rabbit': 6}

In [45]:
#Write a list comprehension to multiply all the even numbers by 2 and all the odd numbers by 3
#BUT only do this for the positive numbers!
#(remember, you can use % to find the remainder after division for two numbers, 
#so 10%5 would be 0 because 5 fits into 10 evenly with no remainder)
numbers = [4,5,3,10,-6,7]


In [46]:
[n*2 if n % 2 == 0 else n*3 for n in numbers if n > 0]

[8, 15, 9, 20, 21]

<a id='zip_enumerate'></a>

### Zip and Enumerate within Comprehensions

---

The functions `zip()` and `enumerate()` can be really helpful for list and dictionary comprehensions!

`zip()` is great for pairing together items from two different lists.

`enumerate()` is helpful when you want to use both the items and the position of each item in the list.
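
Two details of these functions are worth knowing before using them in comprehensions: `zip()` stops at the end of the shortest input, and `enumerate()` accepts an optional starting number. A minimal sketch, with hypothetical sample lists (`letters` and `counts`):

```python
letters = ['a', 'b', 'c']  # hypothetical sample data
counts = [10, 20]          # one element shorter on purpose

# zip() stops as soon as the shortest input runs out, so 'c' is dropped
pairs = [(letter, count) for letter, count in zip(letters, counts)]

# enumerate() counts from 0 by default; the optional second argument sets the start
from_zero = [(i, letter) for i, letter in enumerate(letters)]
from_one = [(i, letter) for i, letter in enumerate(letters, 1)]

print(pairs)      # [('a', 10), ('b', 20)]
print(from_zero)  # [(0, 'a'), (1, 'b'), (2, 'c')]
print(from_one)   # [(1, 'a'), (2, 'b'), (3, 'c')]
```

Because `zip()` truncates silently, it is worth checking that paired lists have the same length.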

In [75]:
#Let's write a for-loop to create a dictionary that stores the populations of the cities below:
cities_A = ['Tokyo','Shanghai','Jakarta','Delhi','Seoul']
populations_A = [37.8,34.9,31.7,26.5,25.5]

d = dict()
for c, p in zip(cities_A, populations_A):
    d[c] = p

In [76]:
d

{'Tokyo': 37.8,
 'Shanghai': 34.9,
 'Jakarta': 31.7,
 'Delhi': 26.5,
 'Seoul': 25.5}

In [48]:
#Let's write a dictionary comprehension to store the population of the cities below to the nearest million,
#but ONLY if they're more than 22 million
cities_B = ['Karachi','Guangzhou','Beijing','Shenzhen','Mexico City']
populations_B = [25.1,25.0,24.9,23.3,21.5]
#round(p) rounds each population (given in millions) to the nearest whole number
{c: round(p) for c, p in zip(cities_B, populations_B) if p > 22}

{'Karachi': 25, 'Guangzhou': 25, 'Beijing': 25, 'Shenzhen': 23}

In [49]:
list(zip(cities_B, populations_B))

[('Karachi', 25.1),
 ('Guangzhou', 25.0),
 ('Beijing', 24.9),
 ('Shenzhen', 23.3),
 ('Mexico City', 21.5)]

In [50]:
#Let's combine the two lists of cities together and then write a list comprehension to get a list of strings
#that numbers each city. A compact version would look like ['1 Tokyo','2 Shanghai',...]:
# [str(i+1) + " " + c for i, c in enumerate(cities_A + cities_B)]

#Here we use an f-string to build a longer sentence for each city instead:
[f"the city number {i+1} is {c}" for i, c in enumerate(cities_A + cities_B)]


['the city number 1 is Tokyo',
 'the city number 2 is Shanghai',
 'the city number 3 is Jakarta',
 'the city number 4 is Delhi',
 'the city number 5 is Seoul',
 'the city number 6 is Karachi',
 'the city number 7 is Guangzhou',
 'the city number 8 is Beijing',
 'the city number 9 is Shenzhen',
 'the city number 10 is Mexico City']

In [79]:
#enumerate() yields (index, item) tuples, counting from 0 by default
for pair in enumerate(cities_A + cities_B):
    print(pair)

(0, 'Tokyo')
(1, 'Shanghai')
(2, 'Jakarta')
(3, 'Delhi')
(4, 'Seoul')
(5, 'Karachi')
(6, 'Guangzhou')
(7, 'Beijing')
(8, 'Shenzhen')
(9, 'Mexico City')


In [None]:
#The compact '1 Tokyo' format, written with an f-string
[f'{i+1} {c}' for i, c in enumerate(cities_A + cities_B)]

In [78]:
#Let's create a dictionary that holds each city as the key, 
#and a tuple containing the ranking of the city and its population as the value
#but ONLY for the top 8 cities
#each entry should look like:  'Delhi': (4, 26.5)
#(the next few cells build the answer up step by step)

In [82]:
# first combine all the cities and all the populations

all_cities = cities_A + cities_B
all_pop = populations_A+populations_B

In [85]:
# enumerate starting from 1 where i is the sequence number and c is the city
{i:c for i, c in enumerate(all_cities,1)}

{1: 'Tokyo',
 2: 'Shanghai',
 3: 'Jakarta',
 4: 'Delhi',
 5: 'Seoul',
 6: 'Karachi',
 7: 'Guangzhou',
 8: 'Beijing',
 9: 'Shenzhen',
 10: 'Mexico City'}

In [88]:
# since we want the city name first, then the ranking and population,
# zip the enumerated cities with the populations and unpack each item as (i, c), p
{c:(i,p) for (i,c),p in zip(enumerate(all_cities, 1),all_pop) if i <= 8}

{'Tokyo': (1, 37.8),
 'Shanghai': (2, 34.9),
 'Jakarta': (3, 31.7),
 'Delhi': (4, 26.5),
 'Seoul': (5, 25.5),
 'Karachi': (6, 25.1),
 'Guangzhou': (7, 25.0),
 'Beijing': (8, 24.9)}

In [87]:
{t[0]: (i, t[1]) for i, t in enumerate(zip(all_cities, all_pop), 1)
 if i <=8}

{'Tokyo': (1, 37.8),
 'Shanghai': (2, 34.9),
 'Jakarta': (3, 31.7),
 'Delhi': (4, 26.5),
 'Seoul': (5, 25.5),
 'Karachi': (6, 25.1),
 'Guangzhou': (7, 25.0),
 'Beijing': (8, 24.9)}
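
Both solutions above filter with `if i <= 8`. Another option is to slice the lists to the first eight items before zipping, and to unpack each `(city, population)` pair directly instead of indexing into `t`. This is a sketch of one alternative, not the only way to write it. The data is repeated here so the block runs on its own, and `top_8` is an illustrative name.

```python
# Same data as in the lesson, repeated so the sketch runs on its own
all_cities = ['Tokyo', 'Shanghai', 'Jakarta', 'Delhi', 'Seoul',
              'Karachi', 'Guangzhou', 'Beijing', 'Shenzhen', 'Mexico City']
all_pop = [37.8, 34.9, 31.7, 26.5, 25.5, 25.1, 25.0, 24.9, 23.3, 21.5]

# Slice to the first 8 before zipping, and unpack (city, population) directly
top_8 = {c: (i, p)
         for i, (c, p) in enumerate(zip(all_cities[:8], all_pop[:8]), 1)}

print(top_8['Delhi'])  # (4, 26.5)
print(len(top_8))      # 8
```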

#### Quick Practice: Try these comprehensions with zip and enumerate!

In [52]:
#create a dictionary that stores each person's name with the total number of hours they worked last week
#each entry should look like:   'Ollie': 25 
employees = ['Faye','Ollie','Roberto']
hours = [(5,8,10,10,8),(4,0,6,10,5),(8,8,7,9,10)]

{e:sum(h) for e, h in zip(employees, hours)}

{'Faye': 41, 'Ollie': 25, 'Roberto': 42}

In [54]:
#the following is a list of 20 students in order of how well they did on an exam
#the top three students and the bottom three students will change sets
#create a list of only the top three students and the bottom three students, with their ranking
#each entry should look like:   ('Matt', 1)
students = ['Matt','Keri','Raushaun','CJ','Sean',
            'Abdullah','Chris','Mabel','Anna','Liza',
            'Sam','Alfie','Emma','Michael','Boris',
            'Fred','Demi','Renata','Kush','Precious']

[(s, i) for i, s in enumerate(students, 1) 
 if i <=3 or i >= len(students)-2 ]

[('Matt', 1),
 ('Keri', 2),
 ('Raushaun', 3),
 ('Renata', 18),
 ('Kush', 19),
 ('Precious', 20)]

<a id='with_pandas'></a>

### Application of Comprehensions with Pandas

---

It's very easy to create a pandas DataFrame from a dictionary that maps column names to lists of values, so dictionary comprehensions in particular may come in handy!

Here's an example below:

In [13]:
import pandas as pd

column_names = ['height','weight','age']
values = [[62, 54, 60, 50], [180, 120, 200, 100], [33, 40, 25, 28]]

In [14]:
{col:vals for col, vals in zip(column_names, values)}

{'height': [62, 54, 60, 50],
 'weight': [180, 120, 200, 100],
 'age': [33, 40, 25, 28]}

In [15]:
records = pd.DataFrame({col:vals for col, vals in zip(column_names, values)})
records

|   | height | weight | age |
|---|--------|--------|-----|
| 0 | 62     | 180    | 33  |
| 1 | 54     | 120    | 40  |
| 2 | 60     | 200    | 25  |
| 3 | 50     | 100    | 28  |
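
`pd.DataFrame` also accepts a list of row dictionaries, and a comprehension can build those from the same data. The sketch below deliberately leaves out the pandas call so that it runs without pandas installed. The name `rows` is an illustrative assumption.

```python
column_names = ['height', 'weight', 'age']
values = [[62, 54, 60, 50], [180, 120, 200, 100], [33, 40, 25, 28]]

# zip(*values) regroups the column lists into one tuple per row;
# each row is then paired with the column names to make a dict
rows = [dict(zip(column_names, row)) for row in zip(*values)]

print(rows[0])    # {'height': 62, 'weight': 180, 'age': 33}
print(len(rows))  # 4
```

Passing `rows` to `pd.DataFrame(rows)` should give the same table as the column-oriented dictionary above.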
