## The Zen of Python

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


## Know which version of Python you're using

In [2]:
!python --version

Python 2.7.12 :: Anaconda 4.2.0 (64-bit)


In [3]:
import sys
sys.version

'2.7.12 |Anaconda 4.2.0 (64-bit)| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)]'

In [4]:
sys.version_info

sys.version_info(major=2, minor=7, micro=12, releaselevel='final', serial=0)

In [5]:
sys.version_info.major, sys.version_info.minor

(2, 7)

## Follow the PEP 8 style guide
Pylint (https://www.pylint.org/) is a static analysis tool that can check code against the PEP 8 style guide.

## Know the difference between bytes, str and unicode
Python 3 has two types that represent sequences of characters: bytes (raw 8-bit values) and str (Unicode characters). Python 2 has str (raw 8-bit values) and unicode (Unicode characters). In other words, Python 3 str instances are akin to Python 2 unicode instances, and Python 3 bytes instances are akin to Python 2 str instances. The core of your program should use the Unicode character type (str in Python 3, unicode in Python 2). You'll often need helper functions to convert between the two types.

In [4]:
# Python 3: takes str or bytes and always returns a str
def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode("utf-8")
    else:
        value = bytes_or_str
    return value

# Python 2: takes str or unicode and always returns unicode
# (named to_unicode so that it does not shadow to_str above)
def to_unicode(unicode_or_str):
    if isinstance(unicode_or_str, str):
        value = unicode_or_str.decode("utf-8")
    else:
        value = unicode_or_str
    return value
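
The helpers above only convert toward text. Here is a minimal sketch of the reverse direction for Python 3, assuming UTF-8 is the encoding you want. The name `to_bytes` is chosen for illustration and does not come from the cells above.

```python
# Python 3 only: takes str or bytes and always returns bytes.
# The name to_bytes and the UTF-8 encoding are assumptions for illustration.
def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode("utf-8")
    else:
        value = bytes_or_str
    return value

print(to_bytes("caf\u00e9"))  # text is encoded to UTF-8 bytes
print(to_bytes(b"abc"))       # bytes pass through unchanged
```

Keeping both directions in small helpers like this means the rest of the program can work with one text type and convert only where data enters or leaves.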

## Write helper functions instead of complex expressions
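As one possible illustration of this heading, here is a small sketch. The dictionary `values` and the helper name `get_first_int` are hypothetical and made up for this example. The one-line expression works but is dense, while the helper spells out the same logic and can be reused.

```python
# Hypothetical data: each key maps to a list of strings, possibly empty.
values = {"red": ["5"], "blue": ["0"], "green": [""]}

# Dense one-liner: take the first entry, fall back to 0 if missing or empty.
red = int(values.get("red", [""])[0] or 0)

def get_first_int(mapping, key, default=0):
    """Return the first entry for key as an int, or default if missing/empty."""
    found = mapping.get(key, [""])
    if found and found[0]:
        return int(found[0])
    return default

print(red)
print(get_first_int(values, "green"))
print(get_first_int(values, "opacity"))
```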

## Know how to slice sequences

In [6]:
letters = "abcdefgh"
a = list(letters) # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print("First four:", a[:4])
print("Last four:", a[-4:])
print("Middle two:", a[3:-3])

('First four:', ['a', 'b', 'c', 'd'])
('Last four:', ['e', 'f', 'g', 'h'])
('Middle two:', ['d', 'e'])


## Avoid using start, end, stride in a single slice

In [7]:
a = ["red", "orange", "yellow", "green", "blue", "purple"]
odds = a[::2]
evens = a[1::2]
print(odds)
print(evens)

['red', 'yellow', 'blue']
['orange', 'green', 'purple']
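
The cell above uses a stride on its own, which reads clearly. The sketch below shows why combining start, end and stride in one slice is harder to follow, and one way to split the work into a stride step and a slice step. The list and variable names are illustrative only.

```python
a = ["a", "b", "c", "d", "e", "f", "g", "h"]

# All three parts in one slice: the reader has to work out where it starts,
# where it stops, and in which direction it moves.
dense = a[-2:2:-2]

# Clearer: apply the stride first, then slice the result.
every_other = a[::2]         # keep every second item
trimmed = every_other[1:-1]  # then drop the first and last of those

print(dense)
print(every_other)
print(trimmed)
```

The two-step version creates an extra intermediate list, so it trades a little memory for readability.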


## Use list comprehensions instead of map and filter

In [11]:
a = range(1, 11) # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
squares = [x**2 for x in a]
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


map requires a lambda function for this kind of computation, which is visually noisy.

In [12]:
squares = map(lambda x: x**2, a) # map the lambda function to each item in a
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


In [25]:
a = range(1, 101)

In [26]:
%timeit [x**2 for x in a]

100000 loops, best of 3: 5.72 µs per loop


In [27]:
%timeit map(lambda x: x**2, a) # map was also slower across several variations of a

100000 loops, best of 3: 13.5 µs per loop


List comprehensions also let you easily filter items.

In [29]:
a = range(1, 11)
even_squares = [x**2 for x in a if x % 2 == 0]
print(even_squares)

[4, 16, 36, 64, 100]


filter can be used along with map to achieve the same outcome but is harder to read.

In [30]:
even_squares = map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
print(even_squares)

[4, 16, 36, 64, 100]


In [33]:
a = range(1, 101)

In [36]:
%timeit [x**2 for x in a if x % 2 == 0]

The slowest run took 6.66 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.19 µs per loop


In [37]:
%timeit map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))

100000 loops, best of 3: 18.1 µs per loop


## Avoid more than two expressions in list comprehensions
As a rule of thumb, avoid using more than two expressions (loops or conditions) in a single list comprehension.

In [2]:
# this is relatively easy to read
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
flat

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [4]:
# still relatively easy to read
squared = [[x**2 for x in row] for row in matrix]
squared

[[1, 4, 9], [16, 25, 36], [49, 64, 81]]

If you add extra loops then you should really break it over several lines. It'll likely be clearer in such a scenario to use nested for loops.
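
A minimal sketch of that trade-off, using a hypothetical three-level list named `cube` that is not part of the cells above. Both versions produce the same flat list, but the loop version makes the nesting visible.

```python
# Hypothetical three-level nested list used only for this illustration.
cube = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

# Three loops in one comprehension: valid, but hard to scan.
flat_dense = [x for plane in cube for row in plane for x in row]

# The same flattening with explicit nested loops.
flat = []
for plane in cube:
    for row in plane:
        flat.extend(row)

print(flat_dense)
print(flat)
```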


In [6]:
# we can also use if statements in the list comprehensions
filtered = [[x for x in row if x % 3 == 0]
            for row in matrix if sum(row) >= 10]
filtered

[[6], [9]]

## Consider generator expressions for large comprehensions
A list comprehension builds a whole new list containing one item for each value in the input sequence, which can be memory intensive for large inputs. To get around this we can use a generator expression, which evaluates to an iterator that yields one item at a time.
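
A rough way to see the difference is to compare the size of the two objects. This is a sketch under assumptions: the size `n` is arbitrary, and `sys.getsizeof` measures only the container object itself, not the items it refers to, so the numbers indicate scale rather than exact memory use.

```python
import sys

n = 100000  # arbitrary size chosen for illustration

as_list = [x**2 for x in range(n)]  # all n results are stored at once
as_gen = (x**2 for x in range(n))   # results are produced on demand

# getsizeof reports the size of the container object only
print(sys.getsizeof(as_list))
print(sys.getsizeof(as_gen))

# Functions that consume an iterable work directly on the generator
print(sum(as_gen) == sum(as_list))
```

Note that a generator can be consumed only once, so `as_gen` is exhausted after the call to `sum`.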

In [32]:
it = (x**2 for x in range(10))
it

<generator object <genexpr> at 0x00000000041C95E8>

Here the iterator can be advanced one step at a time.

In [33]:
it = (x**2 for x in range(10))
for i in it:
    print(i)

0
1
4
9
16
25
36
49
64
81


In [34]:
# alternatively
# https://stackoverflow.com/questions/661603/how-do-i-know-if-a-generator-is-empty-from-the-start
it = (x**2 for x in range(10))
while True:
    try:
        value = next(it) # get the next value in the iterator (works in Python 2.6+ and 3)
    except StopIteration: # no more values in the iterator
        break
    print(value)

0
1
4
9
16
25
36
49
64
81


## Prefer enumerate over range

In [26]:
letters = list("ABCDE")
letters

['A', 'B', 'C', 'D', 'E']

In [27]:
for index, value in enumerate(letters):
    print("Value at index %d is %s" % (index, value))

Value at index 0 is A
Value at index 1 is B
Value at index 2 is C
Value at index 3 is D
Value at index 4 is E


In [28]:
# the enumerate loop above is more elegant than indexing with range
for i in range(len(letters)):
    print("Value at index %d is %s" % (i, letters[i]))

Value at index 0 is A
Value at index 1 is B
Value at index 2 is C
Value at index 3 is D
Value at index 4 is E


In [31]:
# we can also specify the starting counter for enumerate
for index, value in enumerate(letters, 10): # start from index 10
    print("Value at index %d is %s" % (index, value))

Value at index 10 is A
Value at index 11 is B
Value at index 12 is C
Value at index 13 is D
Value at index 14 is E


## Use zip to process iterators in parallel

In [36]:
sent = "Here are some words!"
words = sent.split(" ")
words

['Here', 'are', 'some', 'words!']

In [38]:
word_lengths = [len(i) for i in words]
word_lengths

[4, 3, 4, 6]

In [39]:
for word, num_chars in zip(words, word_lengths):
    print(word, num_chars)

('Here', 4)
('are', 3)
('some', 4)
('words!', 6)


## Avoid else blocks after for and while loops

In [42]:
for i in range(3):
    print(i)
else: # this runs only if the loop finishes without hitting break
    print("The else block runs!")

0
1
2
The else block runs!


In [41]:
for i in range(3):
    print(i)
    if i == 1:
        break
else: # this runs only if the loop finishes without hitting break
    print("The else block runs!")

0
1


In the second case the loop broke early, so the else block did not run. Most other languages have no equivalent feature, so it can be confusing to readers. It is usually better to write a helper function instead.
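
One way to follow that advice is to move the loop into a function and return early, so no else block is needed. The function name `contains_negative` is hypothetical and used only for this sketch.

```python
def contains_negative(numbers):
    """Return True as soon as a negative number is found, otherwise False."""
    for n in numbers:
        if n < 0:
            return True  # early exit takes the place of break
    return False         # reached only when the loop ran to completion

print(contains_negative([3, 1, 4]))
print(contains_negative([3, -1, 4]))
```

The final `return False` plays the role of the else block, but its meaning is clear without knowing the loop-else rule.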

## Take advantage of each block in try / except / else / finally
* The finally section is great for cleanup such as closing open resources (database connections / files).
* When the try block doesn't raise an exception the else block will run. The else block helps you minimise the amount of code in the try block and improves readability.
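
The points above can be sketched with all four blocks in one function. The helper name `read_count` and the `"count"` field are hypothetical. `io.StringIO` stands in for a real file so the example is self-contained.

```python
import io
import json

def read_count(handle):
    """Read JSON from an open file-like object and return its "count" field."""
    try:
        data = json.loads(handle.read())  # only the code that may fail goes in try
    except ValueError:
        return None                       # the text was not valid JSON
    else:
        return data["count"]              # runs only if try raised nothing
    finally:
        handle.close()                    # always runs, even with the returns above

good = io.StringIO(u'{"count": 3}')
bad = io.StringIO(u"not json")

good_result = read_count(good)
bad_result = read_count(bad)

print(good_result)
print(bad_result)
print(good.closed and bad.closed)
```

Because the lookup `data["count"]` sits in the else block, a missing key would propagate as a `KeyError` instead of being mistaken for a JSON parsing error.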