# 1. Know which version of Python you're using

Although you can check the Python version from the console with the 'python --version' command, you can also check it in code, as shown below.

In [2]:
import sys

print(sys.version_info)
print(sys.version)

sys.version_info(major=3, minor=9, micro=7, releaselevel='final', serial=0)
3.9.7 (default, Sep 16 2021, 16:59:28) [MSC v.1916 64 bit (AMD64)]
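
Because sys.version_info behaves like a tuple, it can also be compared against a minimum version. The following is a minimal sketch; the name is_supported and the minimum of (3, 6) are assumptions chosen only for illustration.

```python
import sys

# Hypothetical minimum version, chosen only for illustration.
MIN_VERSION = (3, 6)

def is_supported(version=None, minimum=MIN_VERSION):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only the leading (major, minor) fields.
    return tuple(version[:2]) >= tuple(minimum)

print(is_supported())         # checks the interpreter running this code
print(is_supported((2, 7)))   # False
```

Comparing tuples avoids parsing the free-form sys.version string.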


# 2. Avoid complex expressions

If an expression gets complicated, split it into smaller pieces and move the logic into a helper function.
Suppose you have a dictionary my_values as below.
You want to look up particular keys. For a key that is present, the value is a list holding a single string, which is either a number or empty. For a key that is not present, get() returns None.

In [4]:
my_values = {'red': ['5'], 'green': [''], 'blue': ['0']}

print("Red: ", my_values.get('red'))
print("Green: ", my_values.get('green'))
print("Opacity: ", my_values.get('opacity'))

Red:  ['5']
Green:  ['']
Opacity:  None


Now, suppose you want the first value for a key as an integer, falling back to 0 if the key is not present or the value is empty.
Python can do it in a single line of code.

In [17]:
my_values = {'red': ['5'], 'green': [''], 'blue': ['0']}

print("Red :", int(my_values.get('red', [''])[0] or 0))
print("Green :", int(my_values.get('green', [''])[0] or 0))
print("Opacity :", int(my_values.get('opacity', [''])[0] or 0))

Red : 5
Green : 0
Opacity : 0


Although short, this code does a lot of things in one line and is confusing.
Instead, consider the following helper function.

In [20]:
def get_first_int(values, key, default=0):
    query_result = values.get(key, [''])
    if query_result[0]:
        query_result = int(query_result[0])
    else:
        query_result = default
    return query_result

my_values = {'red': ['5'], 'green': [''], 'blue': ['0']}

print("Red :", get_first_int(my_values, 'red'))
print("Green :", get_first_int(my_values, 'green'))
print("Opacity :", get_first_int(my_values, 'opacity'))

Red : 5
Green : 0
Opacity : 0
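
One benefit of a helper is that edge cases can be handled in one place. The helper above would raise an IndexError if a key mapped to an empty list. The sketch below is a hypothetical variant (the name get_first_int_safe and the 'alpha' key are assumptions, not part of the original example) that also tolerates an empty list or a None value.

```python
def get_first_int_safe(values, key, default=0):
    """Hypothetical variant of get_first_int that tolerates [] and None values."""
    # A missing key, a None value, or an empty list all fall back to [''].
    found = values.get(key) or ['']
    first = found[0]
    # An empty string is falsy, so it maps to the default.
    return int(first) if first else default

# 'alpha' is an extra key added here only to show the empty-list case.
my_values = {'red': ['5'], 'green': [''], 'blue': ['0'], 'alpha': []}

print("Red :", get_first_int_safe(my_values, 'red'))          # 5
print("Alpha :", get_first_int_safe(my_values, 'alpha'))      # 0
print("Opacity :", get_first_int_safe(my_values, 'opacity'))  # 0
```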


# 3. Tricks with slice operations

If you have a list and want to change a few of the entries in the middle, you can do that with slice assignment. The number of elements you replace need not be the same as the number of elements you add.

In the example below, I am replacing 4 letters with 3 numbers, and it works fine.

In [1]:
my_list = [1, 2, 3, 4, 'a', 'b', 'c', 'd', 8, 9, 10]

my_list[4:8] = [5, 6, 7]

print(my_list)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


Slices are forgiving of bounds that fall outside the list, whereas indexing a single element past the end raises an IndexError.

In [2]:
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(my_list[:20])

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
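
To make the contrast concrete, here is a small sketch that tries an out-of-range slice and an out-of-range index on the same list. The variable raised is only a helper for this illustration.

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Out-of-range slice bounds are silently clipped to the list.
print(my_list[:20])   # the whole list
print(my_list[20:])   # []

# Indexing a single element past the end is an error.
raised = False
try:
    my_list[20]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```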


# 4. Avoid using start, end and stride in a single slice

Specifying start, end and stride in a single slice can be extremely confusing. If you need all three parameters, consider using two assignments: one to stride with a positive value (negative strides are confusing!) and another to slice.

In [19]:
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
b = a[::2]
print(b)
c = b[1:-1]
print(c)

['a', 'c', 'e', 'g']
['c', 'e']
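
For comparison, the sketch below shows what the single-slice equivalent looks like, along with a negative-stride slice. The variable names are chosen only for this illustration.

```python
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Start, end and stride packed into one slice: correct, but hard to read.
one_step = a[2:-2:2]

# The same result in two steps: stride first, then trim the ends.
two_steps = a[::2][1:-1]

print(one_step)    # ['c', 'e']
print(two_steps)   # ['c', 'e']

# A negative stride with explicit bounds is even harder to follow.
backwards = a[-2:2:-2]
print(backwards)   # ['g', 'e']
```

Note that the two-step version creates an intermediate list, so for very large lists the extra copy may matter.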


# 5. Use list comprehensions instead of map and filter

Suppose you have a list of integers and you need a list with the square of each integer. Following are two ways of doing it, but the list comprehension is much more readable.

Note: The map function returns a map object, so it needs to be converted to a list before print() in the example.

In [18]:
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

squares_lc = [x**2 for x in my_list]
print(squares_lc)

squares_map = map(lambda x: x**2, my_list)
print(list(squares_map))

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


Now, suppose your requirement is to get only the squares of the even numbers. This can be done from my_list in two ways as shown below, but using list comprehensions is again more readable.

In [17]:
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

even_squares_lc = [x**2 for x in my_list if x%2 == 0]
print(even_squares_lc)

even_squares_filter = map(lambda x: x**2, filter(lambda x: x%2 == 0, my_list))
print(list(even_squares_filter))

[4, 16, 36, 64, 100]
[4, 16, 36, 64, 100]


# 6. Avoid more than two expressions in a list comprehension

List comprehensions support multiple levels of loops and multiple conditions per loop level.
The rule of thumb is to avoid using more than two expressions in a list comprehension. This could be two conditions, two loops, or one condition and one loop. As soon as it gets more complicated than this, it is better to use if and for statements or write a helper function.

Example 1: suppose you want to simplify a matrix into one flat list of all cells.

In [16]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9]


Example 2: suppose you want to square the value in each cell of a two-dimensional matrix.

In [2]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
squared = [[x**2 for x in row] for row in matrix]
print(squared)

[[1, 4, 9], [16, 25, 36], [49, 64, 81]]


Example 3: suppose you want to filter the above matrix so that the only cells remaining are those divisible by 3, in rows that sum to 10 or higher. The comprehension works, but it packs in too many expressions and conditions to read easily.

In [15]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filtered = [[x for x in row if x % 3 == 0] for row in matrix if sum(row) >=10]
print(filtered)

[[6], [9]]
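
As a possible alternative for Example 3, the same filtering can be written as a helper function with an explicit loop. This is only a sketch; the name filter_matrix and its parameters are assumptions made for illustration.

```python
def filter_matrix(matrix, divisor=3, min_row_sum=10):
    """Keep cells divisible by `divisor`, only from rows summing to at least `min_row_sum`."""
    result = []
    for row in matrix:
        # Skip rows whose total is too small.
        if sum(row) < min_row_sum:
            continue
        # One simple comprehension per row stays readable.
        result.append([x for x in row if x % divisor == 0])
    return result

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filtered = filter_matrix(matrix)
print(filtered)   # [[6], [9]]
```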


# 7. Use generator expressions for large comprehensions

List comprehensions can cause problems for large inputs by using too much memory.
Generator expressions avoid memory issues by producing output one item at a time, like an iterator.
Generator expressions execute very quickly when chained together.

A common use case of generators is to work with data streams or large files, like CSV files. These text files separate data into columns by using commas. This format is a common way to share data. Now, what if you want to read such a file row by row without loading all of it into memory?

In [14]:
rows = (row for row in open("data_sheet.csv"))

print(rows)

for i in range(5):
    print(next(rows))

<generator object <genexpr> at 0x000002029BE3E9E0>
permalink,company,numEmps,category,city,state,fundedDate,raisedAmt,raisedCurrency,round

lifelock,LifeLock,,web,Tempe,AZ,1-May-07,6850000,USD,b

lifelock,LifeLock,,web,Tempe,AZ,1-Oct-06,6000000,USD,a

lifelock,LifeLock,,web,Tempe,AZ,1-Jan-08,25000000,USD,c

mycityfaces,MyCityFaces,7,web,Scottsdale,AZ,1-Jan-08,50000,USD,seed
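
To illustrate chaining, the sketch below counts the data rows of a CSV source with a chain of generator expressions. It assumes a small in-memory stand-in (io.StringIO with shortened rows) instead of the real data_sheet.csv, so that it runs on its own. With a real file, a with open(...) block would also make sure the file is closed.

```python
import io

# In-memory stand-in for a CSV file; the columns are shortened for illustration.
sample = io.StringIO(
    "permalink,company,numEmps\n"
    "lifelock,LifeLock,\n"
    "mycityfaces,MyCityFaces,7\n"
)

# Each generator pulls one item at a time from the previous one.
lines = (line.rstrip("\n") for line in sample)
rows = (line.split(",") for line in lines)

header = next(rows)                # consume the header row
row_count = sum(1 for _ in rows)   # count the rest without building a list

print(header)      # ['permalink', 'company', 'numEmps']
print(row_count)   # 2
```

The plain split(",") does not handle quoted fields that contain commas; the csv module in the standard library covers that case.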



# 8. Prefer enumerate over range

When you have to iterate over a data structure like a list of strings, you can loop over it directly with a for statement.
Sometimes, you want to iterate over a list and also know the index of the current item. One way of doing this is:

In [13]:
flavor_list = ["vanilla", "strawberry", "mango", "chocolate"]
for i in range(len(flavor_list)):
    print(f"{i + 1} : {flavor_list[i]}")

1 : vanilla
2 : strawberry
3 : mango
4 : chocolate


A more elegant way of achieving the same result is to use enumerate like this:

In [12]:
flavor_list = ["vanilla", "strawberry", "mango", "chocolate"]
for i, flavor in enumerate(flavor_list, start=1):
    print(f"{i} : {flavor}")

1 : vanilla
2 : strawberry
3 : mango
4 : chocolate


# 9. Use zip to process iterators in parallel

Often in Python, you need to work with several lists of related objects. List comprehensions make it easy to take a source list and get a derived list.

Suppose you want the longest name in the names list, given a derived list that holds the length of each name. One way of doing this is to iterate over the list using enumerate:

In [5]:
names = ['Cecilia', 'Lise', 'Marie']
letters = [len(n) for n in names]

longest_name = None
max_letters = 0

for i, name in enumerate(names):
    count = letters[i]
    if count > max_letters:
        longest_name = names[i]
        max_letters = count
        
print(longest_name)

Cecilia


Another way of doing this is to use the zip function and process the two lists in parallel. This is clearer, as it needs no indexing.

In [4]:
names = ['Cecilia', 'Lise', 'Marie']
letters = [len(n) for n in names]

longest_name = None
max_letters = 0

for name, count in zip(names, letters):
    if count > max_letters:
        longest_name = name
        max_letters = count
        
print(longest_name)

Cecilia


Note: zip silently truncates its output if you supply it with iterators of different lengths.
The zip_longest function from the itertools standard library module lets you iterate over multiple iterators regardless of their lengths, filling in missing values with None by default.
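
The difference can be seen in a short sketch. The extra name 'Rosalind' is added here only to make the two lists unequal in length, and fillvalue=0 is an illustrative choice.

```python
from itertools import zip_longest

names = ['Cecilia', 'Lise', 'Marie', 'Rosalind']   # one more name than counts
letters = [7, 4, 5]

# zip stops at the shortest input, so 'Rosalind' is dropped without warning.
truncated = list(zip(names, letters))
print(truncated)   # [('Cecilia', 7), ('Lise', 4), ('Marie', 5)]

# zip_longest keeps going and fills the gaps (None unless fillvalue is given).
padded = list(zip_longest(names, letters, fillvalue=0))
print(padded[-1])  # ('Rosalind', 0)
```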

# 10. Avoid 'else' blocks after 'for' and 'while' loops

In a for loop, the else block runs only if the loop completes without encountering a break statement.
In a while loop, the else block runs when the loop condition becomes false, which includes the case where it is false at the start. It is skipped if the loop exits through a break.
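
These rules can be checked with a small sketch. The function names for_else_ran and while_else_ran are hypothetical helpers that report whether the else block executed.

```python
def for_else_ran(items, stop_at=None):
    """Return True if the else block of the for loop ran."""
    for item in items:
        if item == stop_at:
            break
    else:
        return True   # reached only when the loop ended without break
    return False

def while_else_ran(n, limit, break_at=None):
    """Return True if the else block of the while loop ran."""
    while n < limit:
        if n == break_at:
            break
        n += 1
    else:
        return True   # reached when the condition became (or started) false
    return False

print(for_else_ran([1, 2, 3]))               # True: no break
print(for_else_ran([1, 2, 3], stop_at=2))    # False: break skipped the else
print(for_else_ran([]))                      # True: an empty loop still runs else
print(while_else_ran(5, 3))                  # True: condition false at the start
print(while_else_ran(0, 3, break_at=1))      # False: break skipped the else
```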

The rationale for these behaviors is that else blocks after loops are useful when you are using loops to search for something. Suppose you want to determine whether two numbers are co-prime (their only common divisor is 1).

In [6]:
a = 4
b = 9

for i in range(2, min(a, b) + 1):
    print('testing', i)
    if a % i == 0 and b % i == 0:
        print("Not co-prime")
        break
else:
    print("Co-prime")

testing 2
testing 3
testing 4
Co-prime


In practice, this code shouldn't be written this way. Instead, write a helper function to do the calculation.

In [7]:
def coprime(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False
    return True

if coprime(4, 9):
    print("Co-prime")

Co-prime


This approach is much clearer to readers. Simple constructs like loops should be self-evident. Avoid using else blocks after loops because their behavior isn't intuitive and can be confusing.
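
As a side note, for this particular check the loop can be avoided altogether, since two integers are co-prime exactly when their greatest common divisor is 1. The sketch below uses math.gcd from the standard library; the name coprime_gcd is an assumption made for illustration.

```python
import math

def coprime_gcd(a, b):
    """Hypothetical alternative: co-prime means the greatest common divisor is 1."""
    return math.gcd(a, b) == 1

print(coprime_gcd(4, 9))   # True
print(coprime_gcd(6, 9))   # False: both are divisible by 3
```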