# List Comprehension in Python

List comprehension provides a concise and elegant way to construct lists. Every advanced Python programmer should know how to use it.

### Example 1:

Let's say we want to create a list of the squares of the numbers 0 through 4. One way we can do that is like this:

In [1]:
# Create an empty list
list1 = []

# Run a for loop to get squared version of numbers 
for i in range(0,5):
    list1.append(i*i)
    
# Display the results
print(list1)

[0, 1, 4, 9, 16]


We can also achieve the same thing in one line of code using list comprehension:

In [2]:
# Creating the list using list comprehension
list2 = [i*i for i in range(0,5)]

# Display the result
print(list2)

[0, 1, 4, 9, 16]


### List Comprehension Syntax

The general structure when using list comprehension is:

* `new_list = [expression for member in iterable]`

The iterable can be any Python iterable: a `range()` object, a list, a set, or a generator object, for example.
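As a small sketch of that flexibility, the same comprehension shape can be applied to several kinds of iterables. The variable names here are made up for illustration, and a string is included because strings are iterables too.

```python
# The same "expression for member in iterable" shape with different iterables.
from_list = [n * 2 for n in [1, 2, 3]]                    # a list
from_string = [ch.upper() for ch in "abc"]                # a string yields characters
from_set = sorted([n + 1 for n in {10, 20, 30}])          # a set (sorted, since sets are unordered)
from_generator = [n * n for n in (k for k in range(3))]   # a generator object

print(from_list)       # [2, 4, 6]
print(from_string)     # ['A', 'B', 'C']
print(from_set)        # [11, 21, 31]
print(from_generator)  # [0, 1, 4]
```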

The expression is also very flexible. If it is simple, we can just keep it inside the brackets. However, if it is complex, we can move it into a function and call that function:

In [3]:
# Defining the complex function
def cube(i):
    return i*i*i

# Making the new list using the function we defined
cubes = [cube(i) for i in range(0,5)]

# Displaying the result 
print(cubes)

[0, 1, 8, 27, 64]


As you can see, we can write a function for our expression. Furthermore, the list comprehension can have an optional `if` conditional at the end:

* `new_list = [expression for member in iterable (if conditional)]`

This will filter the elements. So, let's say we want all the even numbers from 0 up to (but not including) 20:

In [4]:
# Making the list
evens = [i for i in range(0,20) if i%2 == 0]

# Display the result
print(evens)

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]


### Modifying Numbers

We can also write the `if-else` conditional in the middle. It looks like:

* `new_list = [expression (if else conditional) for member in iterable]`

This will modify the elements; it will not filter them. For example:

In [5]:
# Making the list
a = [1,2,3,4,5,6,7,8]

# Replacing the numbers in a that are smaller than 4 with 0
b = [0 if i < 4 else i for i in a]

# Display the result
print(b)

[0, 0, 0, 4, 5, 6, 7, 8]
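The two forms can also be combined in one comprehension. A minimal sketch, reusing the list `a` from above (the name `c` is just for illustration): the trailing `if` filters first, and the `if-else` expression then modifies whatever passes the filter.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8]

# Keep only the even numbers (filter), then replace those below 4 with 0 (modify).
c = [0 if i < 4 else i for i in a if i % 2 == 0]

print(c)  # [0, 4, 6, 8]
```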


### Unique Vowels

Comprehension syntax works with a variety of data types and data structures, such as strings, sets, and dictionaries.

Let's look at an example where we have a quote and we want to extract the unique vowels in the quote. The syntax for a set comprehension is the same, except that it uses curly braces:

* `{expression for member in iterable}`

No element appears more than once in the result because sets contain only unique elements.

In [6]:
# Making the quote
quote = "hello everybody"

# Extracting unique vowels using a set
unique_vowels = {i for i in quote if i in 'aeiou'}

# Display the result
print(unique_vowels)

{'o', 'e'}


Comprehension syntax also works for dictionaries. So, let's say we want a dictionary of squared numbers, with each number as the key and its square as the value.

In [7]:
# Making the dictionary
squared = {i: i*i for i in range(0,5)}

# Displaying the result
print(squared)

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
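Set and dictionary comprehensions accept the same optional `if` filter as list comprehensions. A brief sketch, with names and data chosen only for illustration:

```python
# Dictionary of squares for the odd numbers below 10 only.
odd_squares = {i: i * i for i in range(10) if i % 2 == 1}
print(odd_squares)  # {1: 1, 3: 9, 5: 25, 7: 49, 9: 81}

# Set of distinct word lengths in a sentence.
lengths = {len(word) for word in "the quick brown fox".split()}
print(sorted(lengths))  # [3, 5]
```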


### Nested List Comprehension

We can also nest one list comprehension inside another. This is typically used when you want to build a 2-D matrix.

The outer list comprehension produces the rows and the inner list comprehension produces the entries of each row. In the example below, the outer comprehension iterates over `j` and the inner one iterates over `i`.

In [8]:
# Making a 2-D matrix using nested list comprehension
matrix2d = [[i*j for i in range(0,5)] for j in range(0,3)]

# Display the result
print(matrix2d)

[[0, 0, 0, 0, 0], [0, 1, 2, 3, 4], [0, 2, 4, 6, 8]]


This can be confusing to read, so it is generally best used sparingly. But we should know that this type of functionality exists.
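For comparison, here is a sketch of the same matrix built with explicit nested `for` loops, which some readers find easier to follow. The names `matrix_loops` and `row` are just for illustration.

```python
# Build the same 3 x 5 matrix with explicit loops.
matrix_loops = []
for j in range(0, 3):        # outer loop: one row per value of j
    row = []
    for i in range(0, 5):    # inner loop: one entry per value of i
        row.append(i * j)
    matrix_loops.append(row)

print(matrix_loops)
# [[0, 0, 0, 0, 0], [0, 1, 2, 3, 4], [0, 2, 4, 6, 8]]
```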

### When Not to Use List Comprehension

List comprehension is convenient, but it builds the entire output list in memory, and that list can become very large. Sometimes it is better to use a generator instead.

Let's say we have a list with squared numbers for a large range:

* `[i*i for i in range(0,1000)]`

And we want to calculate the sum of this list. We can then do:

* `s = sum([i*i for i in range(0,1000)])`

In [9]:
# Making the list
s = sum([i*i for i in range(0,1000)])
        
# Displaying the result
print(s)

332833500


Although this works, the entire list is built in memory before it is summed. A better way to do this is to use a generator expression, which uses parentheses instead of brackets.

* `s = sum((i*i for i in range(0,1000)))`

This will do the same thing but create a generator object instead of a list object:

In [10]:
# Summing with a generator expression
s = sum((i*i for i in range(0,1000)))
        
# Displaying the result
print(s)

332833500


As you can see, this does the same thing. Let's look at the size of both of these objects:

In [11]:
# List object
l = [i*i for i in range(0,1000)]

# Generator object
g = (i*i for i in range(0,1000))

# Loading library
import sys

# Getting sizes
print(sys.getsizeof(l))
print(sys.getsizeof(g))

8856
112


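As we can see, our list is almost 80 times bigger in terms of bytes than our generator object. Note that `sys.getsizeof` reports only the size of the list object itself, not of the integers it refers to. This is a small example, but if the iterable we loop over (here `range()`) were much larger, the list would grow with it, and you can see what type of problems that can cause in terms of memory.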

So, just keep in mind that sometimes generators can be better.
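To see how the difference scales, the following sketch measures both objects for a few input sizes. The exact byte counts depend on the Python version and platform, so no specific numbers are shown here; the point is only that the list grows with the input while the generator object stays the same size. The names `list_sizes` and `gen_sizes` are made up for this example.

```python
import sys

list_sizes = []
gen_sizes = []

for n in (1_000, 10_000, 100_000):
    as_list = [i * i for i in range(n)]   # all n values are stored
    as_gen = (i * i for i in range(n))    # values are produced on demand

    # getsizeof reports the container itself, not the integers it refers to.
    list_sizes.append(sys.getsizeof(as_list))
    gen_sizes.append(sys.getsizeof(as_gen))
    print(n, list_sizes[-1], gen_sizes[-1])
```

Keep in mind that a generator can be consumed only once; if the values are needed again, a list is the better fit.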

### Testing Your Code

List comprehension can sometimes be faster, but this is not always the case. You should always measure your code if you want to improve its speed. Let's time both approaches.

First, we will time our list comprehension.

In [12]:
# Importing library
from timeit import default_timer as timer

# Creating a list using list comprehension and timing it
start = timer()
a = [i*i for i in range(0,1_000_000)] # Underscores can be used as digit separators in numeric literals
end = timer()

# Display the results
print(end - start)

0.06872350000000038


Now, we will time our typical `for` loop.

In [13]:
# Importing library
from timeit import default_timer as timer

# Creating an empty list
a = []

# Running the for loop and timing it
start = timer()
for i in range(0, 1_000_000):
    a.append(i*i)
end = timer()

# Display the results
print(end - start)

0.1394597000000002


As we can see, list comprehension is about 2 times faster than the traditional `for` loop here. This is not always the case, which is why testing your code is a good practice.
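A single run measured with a timer can be noisy. One possible refinement, sketched below, is to use `timeit.repeat` from the standard library and keep the best of several runs. The helper names `with_comprehension` and `with_loop` and the size `N` are made up for this example, and the timings will differ from machine to machine.

```python
from timeit import repeat

N = 100_000  # assumed problem size, smaller than above to keep the run short

def with_comprehension():
    return [i * i for i in range(N)]

def with_loop():
    result = []
    for i in range(N):
        result.append(i * i)
    return result

# Each measurement calls the function 5 times; take the best of 3 measurements.
best_comprehension = min(repeat(with_comprehension, number=5, repeat=3))
best_loop = min(repeat(with_loop, number=5, repeat=3))

print(f"comprehension: {best_comprehension:.4f} s")
print(f"for loop:      {best_loop:.4f} s")
```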

### More List Comprehension Examples 

Based on a list of fruits, you want a new list, containing only the fruits with the letter "a" in the name:

In [14]:
# Traditional for loop
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = []

for x in fruits:
    if "a" in x:
        newlist.append(x)

print(newlist)

# List comprehension 
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

newlist = [x for x in fruits if "a" in x]

print(newlist)

['apple', 'banana', 'mango']
['apple', 'banana', 'mango']


Keeping only the items that are not "apple":

In [15]:
# Traditional for loop
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = []

for x in fruits:
    if x != "apple":
        newlist.append(x)

print(newlist)

# List comprehension 
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

newlist = [x for x in fruits if x != "apple"]

print(newlist)

['banana', 'cherry', 'kiwi', 'mango']
['banana', 'cherry', 'kiwi', 'mango']


Using an `if-else` expression to replace "banana" with "orange" while keeping every other item:

In [16]:
# Traditional for loop
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = []

for x in fruits:
    if x != "banana":
        newlist.append(x)
    else:
        newlist.append("orange")
        

print(newlist)

# List comprehension 
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

newlist = [x if x != "banana" else "orange" for x in fruits]

print(newlist)

['apple', 'orange', 'cherry', 'kiwi', 'mango']
['apple', 'orange', 'cherry', 'kiwi', 'mango']


Let's convert integers to strings:

In [17]:
# Traditional for loop
numbers = [1,2,3,4,5,6,7,8]
newlist = []

for i in numbers:
    newlist.append(str(i))

print(newlist)

# Using list comprehension
numbers = [1,2,3,4,5,6,7,8]

newlist = [str(i) for i in numbers]

print(newlist)

['1', '2', '3', '4', '5', '6', '7', '8']
['1', '2', '3', '4', '5', '6', '7', '8']


### Towards Data Science: 11 Examples to Master List Comprehension

Example 1: Keeping only the numbers greater than 5.

In [18]:
a = [4,6,7,3,2]
b = [x for x in a if x > 5]
print(b)

[6, 7]


Example 2: Keeping the numbers greater than 5 and multiplying them by 2.

In [19]:
a = [4,6,7,3,2]
b = [x*2 for x in a if x > 5]
print(b)

[12, 14]


Example 3: Extracting words that start with "c".

In [20]:
names = ['Ch','Dh','Eh','cb','Tb','Td']
new_names = [name for name in names if name.lower().startswith('c')]
print(new_names)

['Ch', 'cb']


Example 4: The iterable does not have to be a list. It can be any Python iterable. For instance, we can iterate over a 2-dimensional NumPy array, which is effectively a matrix. We iterate over the rows in matrix A and take the maximum of each row.

In [21]:
import numpy as np
A = np.random.randint(10, size=(4,4))
print(A)
max_element = [max(i) for i in A]
print(max_element)

[[8 1 3 5]
 [3 6 0 0]
 [3 6 8 3]
 [6 1 5 9]]
[8, 6, 8, 9]


Example 5: Lists can store any data type. Let’s do an example with a list of lists. We create a list of the maximum values in each list.

In [22]:
vals = [[1,2,3],[4,5,2],[3,2,6]]
print(vals)
vals_max = [max(x) for x in vals]
print(vals_max)

[[1, 2, 3], [4, 5, 2], [3, 2, 6]]
[3, 5, 6]


Example 6: We can have multiple conditions in a list comprehension. We get the strings that end with the letter “b” and have a length greater than 2.

In [23]:
names = ['Ch','Dh','Eh','cb','Tb','Td','Chb','Tdb']
new_names = [name for name in names if name.lower().endswith('b') and len(name) > 2]
print(new_names)

['Chb', 'Tdb']


Example 7: We can combine multiple conditions with other logical operators. Getting names that start with "c" or end with "b". 

In [24]:
names = ['chb', 'ydb', 'thd', 'hgh']
new_names = [name for name in names if name.endswith('b') or name.startswith('c')]
print(new_names)

['chb', 'ydb']
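When combining conditions, the `or` and `and` keywords are generally the safer choice. The `|` operator is a bitwise operator that binds more tightly than comparisons, so it can give surprising results when the operands are not parenthesized. A small sketch with made-up numbers:

```python
nums = [1, 5, 8]

# Intended: numbers smaller than 2 or greater than 6.
with_or = [n for n in nums if n < 2 or n > 6]

# Without parentheses this is parsed as n < (2 | n) > 6, a chained comparison.
with_pipe = [n for n in nums if n < 2 | n > 6]

print(with_or)    # [1, 8]
print(with_pipe)  # [5, 8]
```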


Example 8: We can also have nested list comprehensions, which are a little bit more complex. They represent nested for loops. Suppose we have a list of lists and we want to take out each element from the nested lists.

In [25]:
vals = [[1,2,3],[4,5,2],[3,2,6]]
print(vals)
vals_exp = [y for x in vals for y in x]
print(vals_exp)

[[1, 2, 3], [4, 5, 2], [3, 2, 6]]
[1, 2, 3, 4, 5, 2, 3, 2, 6]


This nested list comprehension looks like:

![image.png](attachment:image.png)

Image by [Soner Yildirim](https://towardsdatascience.com/11-examples-to-master-python-list-comprehensions-33c681b56212)
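Written out as code, the equivalent nested `for` loops would look roughly like this sketch. The loops appear in the same order as the `for` clauses in the comprehension, from left to right.

```python
vals = [[1, 2, 3], [4, 5, 2], [3, 2, 6]]

vals_exp = []
for x in vals:       # first "for" clause in the comprehension
    for y in x:      # second "for" clause in the comprehension
        vals_exp.append(y)

print(vals_exp)  # [1, 2, 3, 4, 5, 2, 3, 2, 6]
```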

The above example can also be done with the `explode` function in `pandas`, as follows:

* `pd.Series(vals).explode()`

It returns a pandas Series, which you can easily convert to a list (for example with `.tolist()`).

Example 9: We can also add conditions in nested list comprehensions. Here we only want the strings from the nested lists that contain more than 3 items.

In [26]:
text = [['bar','foo','fooba'],['Rome','Madrid','Houston'], ['aa','bb','cc','dd']]
text_1 = [y for x in text if len(x)>3 for y in x]
print(text_1)

['aa', 'bb', 'cc', 'dd']


The `for` loop for the above example would look like:

![image.png](attachment:image.png)
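As a text version of those loops, a minimal sketch: the condition on the nested list `x` sits between the two `for` loops, just as it does in the comprehension.

```python
text = [['bar', 'foo', 'fooba'], ['Rome', 'Madrid', 'Houston'], ['aa', 'bb', 'cc', 'dd']]

text_1 = []
for x in text:
    if len(x) > 3:       # condition on the nested list as a whole
        for y in x:
            text_1.append(y)

print(text_1)  # ['aa', 'bb', 'cc', 'dd']
```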

Example 10: We can also put a condition on individual elements.

In [27]:
text = [['bar','foo','fooba'],['Rome','Madrid','Houston'], ['aa','bb','cc','dd']]
text_2 = [y for x in text for y in x if len(y)>4]
print(text_2)

['fooba', 'Madrid', 'Houston']


We now have strings that are longer than 4 characters. Since the condition is on individual elements, the equivalent nested for/if loops are:

![image.png](attachment:image.png)
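A plain-code sketch of those loops follows; here the `if` moves inside the inner loop because it tests each string `y` rather than each nested list.

```python
text = [['bar', 'foo', 'fooba'], ['Rome', 'Madrid', 'Houston'], ['aa', 'bb', 'cc', 'dd']]

text_2 = []
for x in text:
    for y in x:
        if len(y) > 4:   # condition on each individual string
            text_2.append(y)

print(text_2)  # ['fooba', 'Madrid', 'Houston']
```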

Example 11: We may also need to put conditions on both nested lists and individual items.

In [28]:
text = [['bar','foo','fooba'],['Rome','Madrid','Houston'], ['aa','bb','cc','dd']]
text_3 = [y.upper() for x in text if len(x) == 3 for y in x if y.startswith('f')]
print(text_3)

['FOO', 'FOOBA']


The equivalent for/if loops:

![image.png](attachment:image.png)
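In text form, a sketch of those loops could look like the following. Each `for` and `if` clause of the comprehension becomes one level of nesting, in the same left-to-right order.

```python
text = [['bar', 'foo', 'fooba'], ['Rome', 'Madrid', 'Houston'], ['aa', 'bb', 'cc', 'dd']]

text_3 = []
for x in text:
    if len(x) == 3:                  # condition on the nested list
        for y in x:
            if y.startswith('f'):    # condition on the individual string
                text_3.append(y.upper())

print(text_3)  # ['FOO', 'FOOBA']
```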