# **List, Set & Dictionary Comprehensions**

List, set and dictionary comprehensions give you a shorter syntax for looping over an iterable and building a new list, set or dictionary from its elements.

![](https://miro.medium.com/max/765/1*m4oZHplTTZxVr_VvAkwjHQ.png)

## **1) List Comprehension**

Many simple “for loops” in Python can be replaced with list comprehensions. You will often hear list comprehensions described as “more Pythonic”.

The general syntax of a List Comprehension is as follows:

**`[f(x) for x in sequence]`**

### **1.1) What's the difference between for loops and list comprehension though?**

Here's an example that generates a list of the squares of the first 1,000,000 non-negative integers in two ways:
1. Using a For Loop
2. Using a List Comprehension

*Using a For Loop*

In [35]:
import time

# Starting time - to measure execution time
start_time = time.time()

# Build the list of integers 0..999,999 that we want to square
NUMBERS = list(range(1000000))

# Initialize an empty list for the results
squares = []

# Loop through all 1,000,000 integers
for i in NUMBERS:
    # Append the square of each value to the list
    squares.append(i**2)

print("--- For Loop took %s seconds ---" % (time.time() - start_time))

--- For Loop took 0.2338252067565918 seconds ---


*Using List Comprehension*

In [36]:
#Starting time - to measure execution time
start_time = time.time()

# Build the list of squares with a list comprehension
# (no need to create an empty list first)
squares = [i**2 for i in NUMBERS]

print("--- List Comprehension took %s seconds ---" % (time.time() - start_time))

--- List Comprehension took 0.176161527633667 seconds ---


Which execution took less time: the for loop or the list comprehension?

*Your Answer*

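Generally, a list comprehension runs faster than the equivalent for loop, mainly because it avoids looking up and calling `squares.append` on every iteration. Note that the for-loop timing above also includes the time spent building `NUMBERS`, so the measured gap is somewhat larger than the difference between the two techniques alone.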
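Single `time.time()` measurements are noisy, so a repeated benchmark gives a steadier picture. The following is a minimal sketch using the standard-library `timeit` module. The function names and the smaller input size are assumptions made for illustration, and the absolute timings will differ from machine to machine.

```python
import timeit

# Smaller input than the cells above so the benchmark finishes quickly (assumption)
numbers = list(range(50_000))

def squares_with_loop(values):
    """Build the list of squares with an explicit for loop."""
    result = []
    for v in values:
        result.append(v ** 2)
    return result

def squares_with_comprehension(values):
    """Build the same list with a list comprehension."""
    return [v ** 2 for v in values]

# Both approaches must produce identical results
assert squares_with_loop(numbers) == squares_with_comprehension(numbers)

# Run each function 5 times per measurement, repeat 3 times, keep the best run
loop_time = min(timeit.repeat(lambda: squares_with_loop(numbers), number=5, repeat=3))
comp_time = min(timeit.repeat(lambda: squares_with_comprehension(numbers), number=5, repeat=3))

print(f"for loop:           {loop_time:.4f} s")
print(f"list comprehension: {comp_time:.4f} s")
```

Here the input list is built before timing starts, so only the two techniques themselves are compared.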

### **1.2) Having conditionals and more than one operation in List Comprehension**

When each element should be transformed in one of two ways depending on a condition, put a conditional expression in front of the `for` clause:


**`[f(x) if condition else g(x) for x in sequence]`**

Let's look at an example that loops through the first 100 integers and stores doubled evens and halved odds.



In [37]:
# List comprehension that loops through the first 100 integers
# and stores doubled evens and halved odds.
A_list = [2*i if i%2 == 0 else 0.5*i for i in range(100)]

### The equivalent for loop looks like:
# A_list = []
# for i in range(100):
#     if i%2 == 0:
#         A_list.append(i*2)
#     else:
#         A_list.append(i*0.5)

print(A_list)

[0, 0.5, 4, 1.5, 8, 2.5, 12, 3.5, 16, 4.5, 20, 5.5, 24, 6.5, 28, 7.5, 32, 8.5, 36, 9.5, 40, 10.5, 44, 11.5, 48, 12.5, 52, 13.5, 56, 14.5, 60, 15.5, 64, 16.5, 68, 17.5, 72, 18.5, 76, 19.5, 80, 20.5, 84, 21.5, 88, 22.5, 92, 23.5, 96, 24.5, 100, 25.5, 104, 26.5, 108, 27.5, 112, 28.5, 116, 29.5, 120, 30.5, 124, 31.5, 128, 32.5, 132, 33.5, 136, 34.5, 140, 35.5, 144, 36.5, 148, 37.5, 152, 38.5, 156, 39.5, 160, 40.5, 164, 41.5, 168, 42.5, 172, 43.5, 176, 44.5, 180, 45.5, 184, 46.5, 188, 47.5, 192, 48.5, 196, 49.5]
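The position of the `if` matters. A conditional expression before `for` keeps every element and only chooses how to transform it, while an `if` after `for` acts as a filter and drops elements. A small sketch of the difference, with variable names chosen here for illustration:

```python
numbers = range(10)

# Conditional expression BEFORE `for`: every element is kept,
# evens are doubled and odds are halved
transformed = [2 * i if i % 2 == 0 else 0.5 * i for i in numbers]

# `if` AFTER `for`: acts as a filter, so odd numbers are dropped entirely
evens_doubled = [2 * i for i in numbers if i % 2 == 0]

print(len(transformed), transformed)      # 10 items
print(len(evens_doubled), evens_doubled)  # 5 items
```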


## **2) Set Comprehension**

A set comprehension works just like a list comprehension, except that it uses curly brackets { } and produces a set. It can iterate over any iterable, not only sets, and because the result is a set, duplicate values are stored only once.

In [38]:
# Using Set comprehensions to create an output set which contains only 
# the even numbers that are present in the input list.
input_list = [1, 2, 3, 4, 4, 5, 6, 6, 6, 7, 7] 

#set comprehension
set_using_comp = {var for var in input_list if var % 2 == 0} 

#Printing our new set  
print("Output Set using set comprehensions:", set_using_comp)

Output Set using set comprehensions: {2, 4, 6}


A condition after the `for` clause acts as a filter, just as in a list comprehension. Suppose you want to convert all elements of the cars set to lowercase and leave 'Mercedes' out of the result.

In [39]:
# Using a set comprehension to create an output set that lowercases every
# element and drops 'Mercedes'
cars = {'Porsche', 'Ferrari', 'Mercedes', 'RENAULT'}
new_tags = {car.lower() for car in cars if car != 'Mercedes'}

# Printing the set comprehension output
print(new_tags)

{'porsche', 'ferrari', 'renault'}
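The filter form drops every element that fails the test. If the goal is instead to keep an element such as 'Mercedes' with its original capitalization, a conditional expression before `for` is one option. A minimal sketch, using a hypothetical `brands` set:

```python
brands = {'Ferrari', 'Mercedes', 'RENAULT'}  # hypothetical sample data

# Filter: 'Mercedes' is removed from the result
without_mercedes = {b.lower() for b in brands if b != 'Mercedes'}

# Conditional expression: 'Mercedes' is kept unchanged, everything else is lowercased
mercedes_untouched = {b if b == 'Mercedes' else b.lower() for b in brands}

# Sets are unordered, so sort before printing for a stable display
print(sorted(without_mercedes))
print(sorted(mercedes_untouched))
```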


## **3) Dictionary Comprehension**

A dictionary comprehension builds a dictionary from any iterable. When the iterable yields (key, value) pairs, the general form is:

**`output_dict = {key: value for (key, value) in iterable if condition}`**

In [40]:
# Using Dictionary comprehensions to create an output dictionary 
# which contains only the odd numbers that are present in the input 
# list as keys and their cubes as values
  
input_list = [1,2,3,4,5,6,7] 
dict_using_comp = {var:var ** 3 for var in input_list if var % 2 != 0} 
  
print("Output Dictionary using dictionary comprehensions:", dict_using_comp)

Output Dictionary using dictionary comprehensions: {1: 1, 3: 27, 5: 125, 7: 343}
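The general form iterates over (key, value) pairs, which typically come from `dict.items()` or `zip()`. The sketch below shows both. The `prices`, `names` and `scores` data are made up for illustration.

```python
prices = {'apple': 0.50, 'banana': 0.25, 'cherry': 3.00}  # hypothetical data

# Iterate over (key, value) pairs of an existing dictionary:
# keep items cheaper than 1.00 and convert the price to cents
cheap_in_cents = {name: round(price * 100)
                  for name, price in prices.items()
                  if price < 1.00}
print(cheap_in_cents)

# Build a dictionary from two parallel lists by pairing them with zip()
names = ['ann', 'bob', 'cy']
scores = [72, 85, 90]
passed = {name: score for name, score in zip(names, scores) if score >= 80}
print(passed)
```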


## **Further Examples**

Let's go back to our Julius Caesar text to see how dictionary, list and set comprehensions can be used in data science.

In [41]:
## Read a file, parse lines, and get all words
worddict = dict()  # will hold the occurrence counts of the first 100 words

# Read all lines from the text file; `with` closes the file automatically
with open("ceasar.txt") as fd:
    lines = fd.readlines()

# Strip newline characters and other whitespace off the edges
cleaned_lines = [line.strip() for line in lines]

# Make a list of lists:
# each inner list is the list of words on that line
list_of_lines_words = [line.split() for line in cleaned_lines]

# Make a flat list of words
flat_list = []
for sublist in list_of_lines_words:
    for item in sublist:
        flat_list.append(item)
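The nested for loop that flattens the list of lists can itself be written as a list comprehension with two `for` clauses, read left to right in the same order as the nested loops. A small sketch on a hypothetical in-memory sample, so that it runs without the text file:

```python
# Hypothetical stand-in for the lines read from the file
sample_lines = ["Actus Primus. Scoena Prima.\n", "\n", "Enter Flauius, Murellus\n"]

# One list of words per line (an empty line gives an empty list)
words_per_line = [line.split() for line in sample_lines]

# Outer loop first, inner loop second, exactly as in the nested for loops
flat_words = [word for words in words_per_line for word in words]

print(flat_words)
```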

a) Let's create a dictionary that maps each of the first 100 words to the number of times it occurs in the text.

In [42]:
# Record the start time
start_time = time.time()

# Count the occurrences of each of the first 100 words
worddict = {i:flat_list.count(i) for i in flat_list[:100]}

print("--- Dictionary Comprehension took %s seconds ---" % (time.time() - start_time))

--- Dictionary Comprehension took 0.0196530818939209 seconds ---
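Calling `list.count` inside the comprehension rescans the whole word list once for every key. For larger inputs, one alternative is to count every word in a single pass with `collections.Counter` and then look the values up. A minimal sketch on a hypothetical `sample_words` list standing in for `flat_list`:

```python
from collections import Counter

# Hypothetical stand-in for flat_list
sample_words = ["the", "and", "the", "of", "and", "the"]

# Approach from the cell above: one full scan of the list per key
counts_with_count = {word: sample_words.count(word) for word in sample_words[:4]}

# Alternative: count every word once, then look up the counts we need
all_counts = Counter(sample_words)
counts_with_counter = {word: all_counts[word] for word in sample_words[:4]}

print(counts_with_count)
print(counts_with_counter)
```

Both dictionaries hold the same counts; only the amount of work done to compute them differs.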


b) We implement an if/else conditional in the dictionary comprehension. If the length of a word is greater than 5, the value becomes that length. Otherwise, we assign 0 as the value.

In [43]:
words_dict = {i:len(i) if len(i) > 5 else 0 for i in flat_list[:100]}

words_dict

{'Skip': 0,
 'to': 0,
 'content': 7,
 'Search': 6,
 'or': 0,
 'jump': 0,
 'to…': 0,
 'Pull': 0,
 'requests': 8,
 'Issues': 6,
 'Marketplace': 11,
 'Explore': 7,
 '@Sakzsee': 8,
 'teropa': 6,
 '/': 0,
 'nlp': 0,
 '6': 0,
 '100': 0,
 '157': 0,
 'Code': 0,
 'Actions': 7,
 'Projects': 8,
 'Wiki': 0,
 'Security': 8,
 'Insights': 8,
 'nlp/resources/corpora/gutenberg/shakespeare-caesar.txt': 54,
 'Tero': 0,
 'Parviainen': 10,
 'added': 0,
 'nltk': 0,
 'data': 0,
 'Latest': 6,
 'commit': 6,
 'bbe04d6': 7,
 'on': 0,
 'Sep': 0,
 '19,': 0,
 '2010': 0,
 'History': 7,
 '0': 0,
 'contributors': 12,
 '3523': 0,
 'lines': 0,
 '(2774': 0,
 'sloc)': 0,
 '110': 0,
 'KB': 0,
 '[The': 0,
 'Tragedie': 8,
 'of': 0,
 'Julius': 6,
 'Caesar': 6,
 'by': 0,
 'William': 7,
 'Shakespeare': 11,
 '1599]': 0,
 'Actus': 0,
 'Primus.': 7,
 'Scoena': 6,
 'Prima.': 6,
 'Enter': 0,
 'Flauius,': 8,
 'Murellus,': 9,
 'and': 0,
 'certaine': 8,
 'Commoners': 9,
 'ouer': 0,
 'the': 0,
 'Stage.': 6,
 'Flauius.': 8,
 'Hence:': 6,

c) You can also perform operations on both the keys and the values of a dictionary. In our case, let us convert all the keys to lowercase and replace each value with its remainder when divided by 2.

In [45]:
word_dict = {i.lower():j%2 for i, j in words_dict.items()}

word_dict

{'skip': 0,
 'to': 0,
 'content': 1,
 'search': 0,
 'or': 0,
 'jump': 0,
 'to…': 0,
 'pull': 0,
 'requests': 0,
 'issues': 0,
 'marketplace': 1,
 'explore': 1,
 '@sakzsee': 0,
 'teropa': 0,
 '/': 0,
 'nlp': 0,
 '6': 0,
 '100': 0,
 '157': 0,
 'code': 0,
 'actions': 1,
 'projects': 0,
 'wiki': 0,
 'security': 0,
 'insights': 0,
 'nlp/resources/corpora/gutenberg/shakespeare-caesar.txt': 0,
 'tero': 0,
 'parviainen': 0,
 'added': 0,
 'nltk': 0,
 'data': 0,
 'latest': 0,
 'commit': 0,
 'bbe04d6': 1,
 'on': 0,
 'sep': 0,
 '19,': 0,
 '2010': 0,
 'history': 1,
 '0': 0,
 'contributors': 0,
 '3523': 0,
 'lines': 0,
 '(2774': 0,
 'sloc)': 0,
 '110': 0,
 'kb': 0,
 '[the': 0,
 'tragedie': 0,
 'of': 0,
 'julius': 0,
 'caesar': 0,
 'by': 0,
 'william': 1,
 'shakespeare': 1,
 '1599]': 0,
 'actus': 0,
 'primus.': 1,
 'scoena': 0,
 'prima.': 0,
 'enter': 0,
 'flauius,': 0,
 'murellus,': 1,
 'and': 0,
 'certaine': 0,
 'commoners': 1,
 'ouer': 0,
 'the': 0,
 'stage.': 0,
 'flauius.': 0,
 'hence:': 0,
 'ho