- https://github.com/bslatkin/effectivepython

### Outline
- Item 1. Know Which Version of Python You're Using
- Item 2. Follow the PEP 8 Style Guide
- Item 3. Know the Differences Between `bytes`, `str` and `unicode`
- Item 4. Write Helper Functions Instead of Complex Expressions
- Item 5. Know How to Slice Sequences
- Item 6. Avoid Using `start`, `end` and `stride` in a Single Slice
- Item 7. Use List Comprehensions Instead of `map` and `filter`
- Item 8. Avoid More Than Two Expressions in List Comprehensions
- Item 9. Consider Generator Expressions for Large Comprehensions
- Item 10. Prefer `enumerate` Over `range`
- Item 11. Use `zip` to Process Iterators in Parallel
- Item 12. Avoid `else` Blocks After `for` and `while` Loops
- Item 13. Take Advantage of Each Block in `try/except/else/finally`

In [1]:
import logging
from pprint import pprint
from sys import stdout as STDOUT

## Item 1. Know Which Version of Python You're Using

In [2]:
import sys
print(sys.version_info)
print(sys.version)

sys.version_info(major=3, minor=6, micro=3, releaselevel='final', serial=0)
3.6.3 |Anaconda, Inc.| (default, Nov  8 2017, 18:10:31) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]


## Item 2. Follow the PEP 8 Style Guide

- [PEP 8 -- Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/)
- [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html)
- [PEP 20 -- The Zen of Python](https://www.python.org/dev/peps/pep-0020/)

## Item 3. Know the Differences Between `bytes`, `str` and `unicode`

- In Python 3,
  - `bytes` contains sequences of 8-bit values
  - `str` contains sequences of Unicode characters
  - `bytes` and `str` instances can't be used together with operators (like `>` or `+`)
- In Python 2,
  - `str` contains sequences of 8-bit values
  - `unicode` contains sequences of Unicode characters
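  - `str` and `unicode` can be used together with operators if the `str` contains only 7-bit ASCII characters
- Use **helper** functions to convert inputs to the character sequence type you expect (`str` or `bytes`)
- To read or write binary data, open the file in binary mode (`'rb'` or `'wb'`)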
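A minimal sketch of two of the Python 3 points above: mixing `bytes` and `str` with an operator fails, and binary data round-trips through a file opened in binary mode. The file name `random.bin` and the variable names are made up for illustration.

```python
import os
import tempfile

# Mixing bytes and str with an operator raises TypeError in Python 3.
try:
    b'one' + 'two'
except TypeError as e:
    mixing_error = e
else:
    mixing_error = None

# Binary data is written and read back through files opened in binary mode.
payload = bytes([0xf1, 0xf2, 0xf3])
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'random.bin')  # hypothetical file name
    with open(path, 'wb') as f:
        f.write(payload)
    with open(path, 'rb') as f:
        round_trip = f.read()

print(type(mixing_error).__name__)
print(round_trip == payload)
```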

In [3]:
# Convert to str
def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of str

print(repr(to_str(b'foo')))
print(repr(to_str('foo')))

'foo'
'foo'


In [4]:
# Convert to bytes
def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of bytes

print(repr(to_bytes(b'foo')))
print(repr(to_bytes('foo')))

b'foo'
b'foo'


## Item 4. Write Helper Functions Instead of Complex Expressions

- Python's syntax makes it all too easy to write single-line expressions that are overly complicated and difficult to read
- Move complex expressions into helper functions, especially if you need to use the same logic repeatedly
- The `if/else` expression provides a more readable alternative to using Boolean operators like `or` and `and` in expressions
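For comparison, a small sketch of the Boolean-operator style that the last point argues against, next to the `if/else` expression. It assumes a query string like the one used in the cells of this item; the variable names are illustrative only.

```python
from urllib.parse import parse_qs

values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)

# Boolean-operator trick: an empty string is falsy, so `or` falls through to 0.
# It is compact, but the reader has to know how `or` short-circuits.
red_or = int(values.get('red', [''])[0] or 0)
green_or = int(values.get('green', [''])[0] or 0)

# if/else expression: same result, with the condition spelled out.
green_raw = values.get('green', [''])
green_if = int(green_raw[0]) if green_raw[0] else 0

print(red_or, green_or, green_if)
```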

In [5]:
# Prepare example data
from urllib.parse import parse_qs
my_values = parse_qs('red=5&blue=0&green=',
                     keep_blank_values=True)
print(repr(my_values))

{'red': ['5'], 'blue': ['0'], 'green': ['']}


In [6]:
# Difficult to read
red = my_values.get('red', [''])
red = int(red[0]) if red[0] else 0
green = my_values.get('green', [''])
green = int(green[0]) if green[0] else 0
opacity = my_values.get('opacity', [''])
opacity = int(opacity[0]) if opacity[0] else 0
print('Red:     %r' % red)
print('Green:   %r' % green)
print('Opacity: %r' % opacity)

Red:     5
Green:   0
Opacity: 0


In [7]:
# Using Helper function
def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        found = int(found[0])
    else:
        found = default
    return found


# Same lookups, now done through the helper
red = get_first_int(my_values, 'red')
green = get_first_int(my_values, 'green')
opacity = get_first_int(my_values, 'opacity')

print('Red:   %r' % red)
print('Green:   %r' % green)
print('Opacity:   %r' % opacity)

Red:   5
Green:   0
Opacity:   0


## Item 5. Know How to Slice Sequences

- Avoid being verbose: don't supply `0` for the start index or the length of the sequence for the end index.
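A quick sketch of what "verbose" means here: the explicit `0` and `len(a)` produce the same slices as the shorter forms. Variable names are illustrative.

```python
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Verbose forms spell out defaults that Python already assumes.
verbose_head = a[0:5]
verbose_tail = a[5:len(a)]

# Concise forms: leave out the start when it is 0 and the end when it is len(a).
concise_head = a[:5]
concise_tail = a[5:]

print(verbose_head == concise_head, verbose_tail == concise_tail)
```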

In [8]:
# Example 1
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print('First four:', a[:4])
print('Last four: ', a[-4:])
print('Middle two:', a[3:-3])

First four: ['a', 'b', 'c', 'd']
Last four:  ['e', 'f', 'g', 'h']
Middle two: ['d', 'e']


In [9]:
# Slicing a list produces a whole new list
b = a[4:]
print('Before:   ', b)
b[1] = 99
print('After:    ', b)
print('No change:', a)

Before:    ['e', 'f', 'g', 'h']
After:     ['e', 99, 'g', 'h']
No change: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']


In [10]:
# The lengths of slice assignments don't need to be the same
print('Before ', a)
a[2:7] = [99, 22, 14]
print('After  ', a)

Before  ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
After   ['a', 'b', 99, 22, 14, 'h']


In [11]:
# Leaving out both the start and end indexes copies the original list
b = a[:]
assert b == a and b is not a

In [12]:
# Assigning to a slice with no start or end replaces the contents in place
b = a
print('Before', a)
a[:] = [101, 102, 103]
assert a is b           # Still the same list object
print('After ', a)      # Now has different contents

Before ['a', 'b', 99, 22, 14, 'h']
After  [101, 102, 103]


## Item 6. Avoid Using `start`, `end` and `stride` in a Single Slice

In [13]:
# Striding
a = ['red', 'orange', 'yellow', 'green', 'blue', 'purple']
odds = a[::2]
evens = a[1::2]
print(odds)
print(evens)

['red', 'yellow', 'blue']
['orange', 'green', 'purple']


In [14]:
# Reversing a byte string (works for ASCII strings too)
x = b'mongoose'
y = x[::-1]
print(y)

b'esoognom'


In [15]:
# ...but it breaks for Unicode characters encoded as UTF-8 byte strings
try:
    w = '謝謝'
    x = w.encode('utf-8')
    y = x[::-1]
    z = y.decode('utf-8')
except:
    logging.exception('Expected')
else:
    assert False

ERROR:root:Expected
Traceback (most recent call last):
  File "<ipython-input-15-dc6d675da098>", line 6, in <module>
    z = y.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9d in position 0: invalid start byte


In [16]:
# Striding forwards and backwards
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print(a[::2])   # ['a', 'c', 'e', 'g']
print(a[::-2])  # ['h', 'f', 'd', 'b']

['a', 'c', 'e', 'g']
['h', 'f', 'd', 'b']


- Specifying `start`, `end` and `stride` in a single slice can be extremely confusing.
- Prefer using positive `stride` values in slices without `start` or `end` indexes. Avoid using negative `stride` values, if possible.
- If you need all three parameters, consider doing two assignments (one to stride, another to slice) or using `islice` from the `itertools` built-in module.
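A short sketch of the alternatives to a three-part slice: stride and slice in two steps, or use `itertools.islice`. Note that `islice` does not accept negative values for start, stop, or step, so the end index is written as `6` here instead of `-2`.

```python
from itertools import islice

a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# All three parameters at once: correct, but hard to read at a glance.
combined = a[2:-2:2]

# Two steps: stride first, then slice the (smaller) result.
strided = a[::2]
two_step = strided[1:-1]

# islice takes start, stop and step as plain arguments and works lazily.
lazy = list(islice(a, 2, 6, 2))

print(combined, two_step, lazy)
```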

## Item 7. Use List Comprehensions Instead of `map` and `filter`

In [17]:
# Python List comprehension
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
squares = [x**2 for x in a]
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


In [18]:
# List comprehension is clearer than map
squares = map(lambda x: x ** 2, a)
print(list(squares))

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


In [19]:
# Filtering in list comprehension
even_squares = [x**2 for x in a if x % 2 == 0]
print(even_squares)

[4, 16, 36, 64, 100]


In [20]:
# Using filter also requires map
alt = map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
assert even_squares == list(alt)

In [21]:
# Dictionaries and sets have their own comprehension syntax
chile_ranks = {'ghost': 1, 'habanero': 2, 'cayenne': 3}
rank_dict = {rank: name for name, rank in chile_ranks.items()}
chile_len_set = {len(name) for name in rank_dict.values()}
print(rank_dict)
print(chile_len_set)

{1: 'ghost', 2: 'habanero', 3: 'cayenne'}
{8, 5, 7}


- List comprehensions are clearer than the `map` and `filter` built-in functions, because they don't require extra `lambda` expressions
- Dictionaries and sets also support comprehension expressions

## Item 8. Avoid More Than Two Expressions in List Comprehensions

In [22]:
# Matrix to flat list
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9]


In [23]:
# Change each matrix's element
squared = [[x**2 for x in row] for row in matrix]
print(squared)

[[1, 4, 9], [16, 25, 36], [49, 64, 81]]


In [24]:
# Comprehensions with multiple loops are harder to read
my_lists = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
flat = [x for sublist1 in my_lists
        for sublist2 in sublist1
        for x in sublist2]
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]


In [25]:
# Indentation is clearer
flat = []
for sublist1 in my_lists:
    for sublist2 in sublist1:
        flat.extend(sublist2)
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]


In [26]:
# List comprehension is short, but difficult to read
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filtered = [[x for x in row if x % 3 == 0]
            for row in matrix if sum(row) >= 10]
print(filtered)

[[6], [9]]


- List comprehensions support multiple levels of loops and multiple conditions per loop level
- List comprehensions with more than two expressions are difficult to read and should be avoided
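One possible rewrite of the last matrix example, as a sketch: the row filter moves into an ordinary loop, and only the inner single-condition comprehension remains.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

filtered = []
for row in matrix:
    # Keep only rows whose sum is at least 10...
    if sum(row) >= 10:
        # ...and within those rows, keep the multiples of 3.
        filtered.append([x for x in row if x % 3 == 0])

print(filtered)
```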

## Item 9. Consider Generator Expressions for Large Comprehensions

In [27]:
# Bad example of using list comprehensions. (if input is too large)
import random
with open('my_file.txt', 'w') as f:
    for _ in range(10):
        f.write('a' * random.randint(0, 100))
        f.write('\n')

value = [len(x) for x in open('my_file.txt')]
print(value)

[2, 47, 38, 77, 1, 23, 80, 37, 3, 59]


In [28]:
# Generator Expression
it = (len(x) for x in open('my_file.txt'))
print(it)

# get elements by iterator
print(next(it))
print(next(it))

<generator object <genexpr> at 0x109701e08>
2
47


In [29]:
# Generator expressions can be composed together
roots = ((x, x**0.5) for x in it)

# Advancing this iterator also advances the interior iterator
print(next(roots))

(38, 6.164414002968976)


- List comprehensions can **cause problems for large inputs** by using too **much memory**.
- **Generator expressions avoid memory issues** by producing outputs one at a time **as an iterator**.
- **Iterators** returned by generator expressions are **stateful**.
- Generator expressions can be **composed by passing the iterator** from one generator expression into the **for** subexpression of another.
- Generator expressions **execute very quickly** when **chained** together.
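To make the "stateful" point concrete, a minimal sketch using an in-memory list as a stand-in for the lines of a file: once the iterator has been consumed, a second pass yields nothing.

```python
# Stand-in for file lines (each ends with a newline, like lines read from a file).
lines = ['a' * n + '\n' for n in (3, 1, 4)]

it = (len(x) for x in lines)

first_pass = list(it)   # Consumes every value the generator can produce
second_pass = list(it)  # The same iterator is now exhausted

print(first_pass)
print(second_pass)
```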

## Item 10. Prefer `enumerate` Over `range`

In [30]:
# Iterate over set of integers
from random import randint
random_bits = 0
for i in range(64):
    if randint(0, 1):
        random_bits |= 1 << i
print(bin(random_bits))

0b101000001001010111101110110011111101111001001000000100110010100


In [31]:
# Iterate over a sequence
flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']
for flavor in flavor_list:
    print('%s is delicious' % flavor)

vanilla is delicious
chocolate is delicious
pecan is delicious
strawberry is delicious


In [32]:
# Bad: Iterating and using indexes
for i in range(len(flavor_list)):
    flavor = flavor_list[i]
    print('%d: %s' % (i + 1, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry


In [33]:
# Good: Using enumerate
for i, flavor in enumerate(flavor_list):
    print('%d: %s' % (i + 1, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry


In [34]:
# Setting index
for i, flavor in enumerate(flavor_list, 1):
    print('%d: %s' % (i, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry


- *enumerate* provides concise syntax for **looping over an iterator** and **getting the index** of each item from the iterator as you go.
- **Prefer *enumerate*** instead of looping over *range* and indexing into a sequence.
- You **can supply a second parameter** to *enumerate* to specify the number from which to begin counting. (zero is default)

## Item 11. Use `zip` to Process Iterators in Parallel

In [36]:
# Creating related lists
names = ['Cecilia', 'Lise', 'Marie']
letters = [len(n) for n in names]
print(letters)

[7, 4, 5]


In [37]:
# Bad: Iterate over both lists in parallel by indexing into each
longest_name = None
max_letters = 0

for i in range(len(names)):
    count = letters[i]
    if count > max_letters:
        longest_name = names[i]
        max_letters = count

print(longest_name)

Cecilia


In [38]:
# Bad: Enumerate helps, but still not good
longest_name = None
max_letters = 0
for i, name in enumerate(names):
    count = letters[i]
    if count > max_letters:
        longest_name = name
        max_letters = count
print(longest_name)

Cecilia


In [39]:
# Good: Using zip is ideal
longest_name = None
max_letters = 0
for name, count in zip(names, letters):
    if count > max_letters:
        longest_name = name
        max_letters = count
print(longest_name)

Cecilia


In [41]:
# Note: If lists are different lengths
names.append('Rosalind')
for name, count in zip(names, letters):
    print(name, count)

Cecilia 7
Lise 4
Marie 5


- The ***zip*** built-in function can be used to iterate over **multiple iterators in parallel**.
- In Python 3, *zip* is a **lazy generator** that produces tuples. In Python 2, *zip* returns the full result as a list of tuples.
- *zip* **truncates** its output silently if you supply it with iterators of **different lengths**.
- Use the ***zip_longest*** function from the ***itertools*** built-in module to iterate over multiple iterators in parallel regardless of their lengths.
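A short sketch of `zip_longest` on the same kind of data as the cells above, with the lists redefined so the example stands alone. Missing values are filled with `None` by default, or with whatever is passed as `fillvalue`.

```python
from itertools import zip_longest

names = ['Cecilia', 'Lise', 'Marie', 'Rosalind']
letters = [7, 4, 5]  # One entry short on purpose

# Default: missing positions are filled with None.
pairs = list(zip_longest(names, letters))

# fillvalue chooses a different placeholder.
pairs_filled = list(zip_longest(names, letters, fillvalue=0))

print(pairs[-1])
print(pairs_filled[-1])
```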

## Item 12. Avoid `else` Blocks After `for` and `while` Loops

In [42]:
# Note: Python supports 'else' for 'for loop'
for i in range(3):
    print('Loop %d' % i)
else:
    print('Else block!')

Loop 0
Loop 1
Loop 2
Else block!


In [43]:
# Note: Using 'break' skips 'else'
for i in range(3):
    print('Loop %d' % i)
    if i == 1:
        break
else:
    print('Else block!')

Loop 0
Loop 1


In [44]:
# Note: 'else' runs immediately when the loop body never executes

for x in []:
    print('Never runs')
else:
    print('For Else block!')
    
# The same applies to 'while' loops
while False:
    print('Never runs')
else:
    print('While Else block!')

For Else block!
While Else block!


In [46]:
# Bad: Usage of 'else'
a = 4
b = 9

for i in range(2, min(a, b) + 1):
    print('Testing', i)
    if a % i == 0 and b % i == 0:
        print('Not coprime')
        break
else:
    print('Coprime')

Testing 2
Testing 3
Testing 4
Coprime


In [48]:
# Good: Both approaches work fine

# Example 1
def coprime(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False
    return True
print(coprime(4, 9))
print(coprime(3, 6))


# Example 2
def coprime2(a, b):
    is_coprime = True
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            is_coprime = False
            break
    return is_coprime
print(coprime2(4, 9))
print(coprime2(3, 6))

True
False
True
False


- Python has special syntax that allows ***else*** blocks to immediately follow ***for*** and ***while*** loop interior blocks.
- The *else* block after a loop **only runs** if the loop did not encounter a **break** statement.
- **Avoid using *else*** blocks after loops because their behavior isn't intuitive and can be confusing.

## Item 13. Take Advantage of Each Block in `try/except/else/finally`

In [51]:
# Note: FINALLY block
handle = open('random_data.txt', 'w', encoding='utf-8')
handle.write('success\nand\nnew\nlines')
handle.close()

handle = open('random_data.txt')  # May raise IOError
try:
    data = handle.read()  # May raise UnicodeDecodeError
finally:
    handle.close()        # Always runs after try:

In [52]:
# Note: ELSE block
import json

def load_json_key(data, key):
    try:
        result_dict = json.loads(data)  # May raise ValueError
    except ValueError as e:
        raise KeyError from e
    else:
        return result_dict[key]         # May raise KeyError

# JSON decode successful
assert load_json_key('{"foo": "bar"}', 'foo') == 'bar'
try:
    load_json_key('{"foo": "bar"}', 'does not exist')
    assert False
except KeyError:
    pass  # Expected

# JSON decode fails
try:
    load_json_key('{"foo": bad payload', 'foo')
    assert False
except KeyError:
    pass  # Expected

In [53]:
# Note: All together
import json
UNDEFINED = object()

def divide_json(path):
    handle = open(path, 'r+')   # May raise IOError
    try:
        data = handle.read()    # May raise UnicodeDecodeError
        op = json.loads(data)   # May raise ValueError
        value = (
            op['numerator'] /
            op['denominator'])  # May raise ZeroDivisionError
    except ZeroDivisionError as e:
        return UNDEFINED
    else:
        op['result'] = value
        result = json.dumps(op)
        handle.seek(0)
        handle.write(result)    # May raise IOError
        return value
    finally:
        handle.close()          # Always runs

# Everything works
temp_path = 'random_data.json'
handle = open(temp_path, 'w')
handle.write('{"numerator": 1, "denominator": 10}')
handle.close()
assert divide_json(temp_path) == 0.1

# Divide by Zero error
handle = open(temp_path, 'w')
handle.write('{"numerator": 1, "denominator": 0}')
handle.close()
assert divide_json(temp_path) is UNDEFINED

# JSON decode error
handle = open(temp_path, 'w')
handle.write('{"numerator": 1 bad data')
handle.close()
try:
    divide_json(temp_path)
    assert False
except ValueError:
    pass  # Expected

- The ***try/finally*** compound statement lets you run **cleanup code** regardless of whether **exceptions** were raised in the *try* block.
- The ***else*** block helps you **minimize the amount of code** in *try* blocks and visually distinguish the **success case** from the ***try/except*** blocks.
- An ***else*** block can be used to perform **additional actions** after a successful *try* block but before common cleanup in a *finally* block.
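To see the order in which the blocks run, here is a small sketch with a hypothetical helper `run_blocks` that records which blocks execute for a successful and a failing division.

```python
def run_blocks(divisor):
    """Return the names of the blocks that ran, in order (illustrative helper)."""
    events = []
    try:
        events.append('try')
        10 / divisor                # May raise ZeroDivisionError
    except ZeroDivisionError:
        events.append('except')     # Runs only when the try block fails
    else:
        events.append('else')       # Runs only when the try block succeeds
    finally:
        events.append('finally')    # Always runs last
    return events

print(run_blocks(2))
print(run_blocks(0))
```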