## 'Effective Python' Notes

### Chapter 1: Pythonic thinking

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


#####  Python version 
On command line:  
$ python --version  
Python 3.6.8 :: Anaconda custom (64-bit)

##### PEP8

**Whitespace:**  

- 4 spaces for indentation
- Lines <= 79 characters
- Continuation of long lines indented by 4 extra spaces
- In a file, functions and classes are separated by 2 blank lines
- In a class, methods are separated by 1 blank line
- One space before and after variable assignments

**Naming:**

- functions, variables, attributes in lowercase_underscore format.
- protected instance attributes in \_leading_underscore format.
- private instance attributes in \_\_double_leading_underscore format. 
- Classes in CapitalizedWord format. 
- Module-level constants in ALL_CAPS format. 

**Expressions and statements**:

- inline negation is better than negation of positive expressions: 
    - **if a is not b** vs. if not a is b
- check empty values using **if not somelist** rather than if len(somelist) == 0. 
    - Similarly, use **if something** for non-empty values (evaluates to True if not empty). 
- avoid single-line **if** statements, **for** and **while** loops and **except** statements; spread them over multiple lines for clarity. 
- Put **import** statements at top of file.
- Use absolute, not relative, names when importing modules. 

Pylint is a good linter to enforce the PEP 8 style guide.
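
The naming, whitespace and expression rules above can be sketched in a few lines. The class and every name in it are hypothetical and serve only to illustrate the conventions.

```python
MAX_RETRIES = 3                      # module-level constant: ALL_CAPS


class RetryPolicy:                   # class: CapitalizedWord
    def __init__(self, retries=MAX_RETRIES):
        self.retries = retries       # public attribute: lowercase_underscore
        self._outcomes = []          # protected attribute: _leading_underscore

    def record_outcome(self, outcome):
        self._outcomes.append(outcome)

    def last_outcome(self):
        # Truthiness check instead of len(self._outcomes) == 0
        if not self._outcomes:
            return None
        return self._outcomes[-1]


policy = RetryPolicy()
first = policy.last_outcome()
if first is not None:                # inline negation, not "not first is None"
    print(first)
policy.record_outcome('timeout')
```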

##### bytes, str and unicode

- In Python 3, two types represent sequences of characters: bytes and str.
    - Instances of **bytes** contain raw **8-bit** values (binary). These are **machine-readable** and can thus be directly stored on disk.
    - Instances of **str** contain **Unicode** characters. These are **human-readable** and must thus be encoded (to a bytes object) before they can be stored on disk.
        - An encoding is a format to represent audio, images, text, etc. in bytes.
        - PNG, JPEG, MP3, WAV, ASCII, UTF-8 are forms of encoding. 
        - Strings are converted to bytes using a text encoding such as ASCII or UTF-8 (**str.encode** defaults to UTF-8).
        
- Encoding and decoding of Unicode (str) should be done at the _furthest boundary of your interfaces._ The core of the program should use Unicode character types and should not assume anything about character encoding. Output text encoding should ideally be UTF-8.

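- Note that file handles (returned by the **open** function) default to text mode, which reads and writes str using a text encoding. Unless you pass **encoding**, this is the platform's preferred encoding (often, but not always, UTF-8). 
    - must use 'wb', 'rb' modes when writing and reading binary files (rather than 'w' and 'r').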
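
A minimal sketch of the difference between binary and text mode, assuming a throwaway file in a temporary directory (the file name `data.bin` is hypothetical):

```python
import os
import tempfile

payload = bytes([0xF1, 0xF2, 0xF3])  # arbitrary binary data

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.bin')

    # 'wb' and 'rb' work with bytes and do no encoding or decoding
    with open(path, 'wb') as f:
        f.write(payload)
    with open(path, 'rb') as f:
        round_trip = f.read()

    # Text mode expects str, so writing bytes raises TypeError
    try:
        with open(path, 'w') as f:
            f.write(payload)
    except TypeError as err:
        text_mode_error = err
```

When text mode is what you want, passing `encoding='utf-8'` to `open` avoids depending on the platform default.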

In [13]:
# encode string to bytes using either UTF-8 or ASCII
a = 'string_to_be_encoded'
print(a)
print(type(a))
c = a.encode('UTF-8')
print(c)
print(type(c))
d = a.encode('ASCII')
print(d)
print(type(d))
# decode
e = c.decode()
print(a==e)
# decode() defaults to UTF-8; that also works for ASCII bytes,
# because ASCII is a subset of UTF-8
f = d.decode()
print(a==f)

string_to_be_encoded
<class 'str'>
b'string_to_be_encoded'
<class 'bytes'>
b'string_to_be_encoded'
<class 'bytes'>
True
True
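
The example above uses only ASCII characters, so the choice of encoding makes no visible difference. A short sketch with one non-ASCII character shows where the encodings diverge (the sample string is an arbitrary choice):

```python
text = 'caf\u00e9'                        # 'café', one non-ASCII character

utf8_bytes = text.encode('utf-8')         # 'é' becomes two bytes: b'caf\xc3\xa9'
latin1_bytes = text.encode('latin-1')     # 'é' becomes one byte:  b'caf\xe9'

# ASCII cannot represent 'é', so encoding fails
try:
    text.encode('ascii')
except UnicodeEncodeError as err:
    encode_error = err

# Bytes must be decoded with the encoding that produced them;
# decode() assumes UTF-8, which is wrong for the Latin-1 bytes
try:
    latin1_bytes.decode()
except UnicodeDecodeError as err:
    decode_error = err

restored = latin1_bytes.decode('latin-1')
```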


In [17]:
def to_str(bytes_or_string):
    if isinstance(bytes_or_string, bytes):
        new_val = bytes_or_string.decode('UTF-8')
    else: 
        new_val = bytes_or_string
    return new_val

def to_bytes(bytes_or_string):
    if isinstance(bytes_or_string, str):
        new_val = bytes_or_string.encode('UTF-8')
    else:
        new_val = bytes_or_string
    return new_val
        

In [18]:
print(type(a))
print(type(c))
a_to_str = to_str(a)
c_to_str = to_str(c)
print(type(a_to_str))
print(type(c_to_str))
a_to_bytes = to_bytes(a)
c_to_bytes = to_bytes(c)
print(type(a_to_bytes))
print(type(c_to_bytes))

<class 'str'>
<class 'bytes'>
<class 'str'>
<class 'str'>
<class 'bytes'>
<class 'bytes'>


##### write helper functions instead of complex expressions

- As soon as expressions get complex, split them into smaller pieces and write a helper function.
- **parse_qs** returns a dictionary: the keys are the unique query variable names and the values are lists of values for each name.

In [38]:
from urllib.parse import parse_qs
my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)
print(f'{my_values}\n')

print(f"Red: {my_values.get('red')}")
print(f"Blue: {my_values.get('blue')}")
print(f"Green: {my_values.get('green')}") 
print(f"Yellow: {my_values.get('yellow')}") 

{'red': ['5'], 'blue': ['0'], 'green': ['']}

Red: ['5']
Blue: ['0']
Green: ['']
Yellow: None


In [42]:
# Empty string, empty list and 0 evaluate to False. 
# If you want 0 in all 3 cases, one (BAD) option is to use a Boolean expression
# Note dict's get method has parameter of value to return if key does not exist, 
# which we must use here
print(f"Red: {my_values.get('red', [''])[0] or 0}")
print(f"Blue: {my_values.get('blue', [''])[0] or 0}")
print(f"Green: {my_values.get('green', [''])[0] or 0}") 
print(f"Yellow: {my_values.get('yellow', [''])[0] or 0}") 

Red: 5
Blue: 0
Green: 0
Yellow: 0


In [45]:
# BETTER option: write a helper function!
# And we'll add additional logic to ensure that int is returned

def get_first_int(values, key):
    found = values.get(key, [''])
    if found[0]:
        found = int(found[0])
    else:
        found = 0
    return found
        

In [48]:
print(f"Red: {get_first_int(my_values, 'red')}")
print(f"Blue: {get_first_int(my_values, 'blue')}")
print(f"Green: {get_first_int(my_values, 'green')}")
print(f"Yellow: {get_first_int(my_values, 'yellow')}")

Red: 5
Blue: 0
Green: 0
Yellow: 0
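
One possible extension of the helper, sketched here under the assumption that callers may want a fallback other than 0. The `default` parameter is an addition for illustration and is not part of the function above.

```python
from urllib.parse import parse_qs


def get_first_int(values, key, default=0):
    """Return the first value for key as an int, or default if missing or blank."""
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default


my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)
red = get_first_int(my_values, 'red')
green = get_first_int(my_values, 'green', default=-1)
```

Returning early from each branch also avoids reusing `found` for both the list and the final integer.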


##### slice sequences

- Simplest use: for built-in classes **list, str, bytes**. 
- Can be extended to any class that implements the \_\_getitem\_\_ and \_\_setitem\_\_ methods.
- start is **inclusive** and end is **exclusive**.
- Slicing a list makes a copy; the original object is unaffected.

In [55]:
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print(a[:4])
print(a[-4:])
print(a[3:-3])
print(a[-1:])

['a', 'b', 'c', 'd']
['e', 'f', 'g', 'h']
['d', 'e']
['h']


In [58]:
# Slicing deals gracefully with start and end indices beyond the boundaries of a list
# Indexing beyond boundaries causes an exception

print(a[:20])
print(a[-20:])
# print(a[20])

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']


In [62]:
# leaving out start and end makes copy of whole list
b = a[:]
assert b == a
assert b is not a
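
Slicing can also be supported by a user-defined class through \_\_getitem\_\_ and \_\_setitem\_\_. The following is a minimal sketch; the `Playlist` class is hypothetical and simply delegates to an internal list.

```python
class Playlist:
    """Hypothetical container that delegates indexing and slicing to a list."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __getitem__(self, index):
        # index is an int for p[0] and a slice object for p[1:3]
        if isinstance(index, slice):
            return Playlist(self._tracks[index])
        return self._tracks[index]

    def __setitem__(self, index, value):
        self._tracks[index] = value

    def __len__(self):
        return len(self._tracks)


playlist = Playlist(['a', 'b', 'c', 'd', 'e'])
middle = playlist[1:4]    # start inclusive, end exclusive: 'b', 'c', 'd'
middle[0] = 'z'           # changes the sliced copy only
```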

##### stride of a slice syntax

- stride allows you to take every n^th item when slicing a sequence
- Avoid using start, end and stride together. It's confusing!
- Avoid using negative stride; it's also confusing!

In [67]:
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
evens = a[::2]
odds = a[1::2]
print(evens)
print(odds)
print(a[2::2])
# works, but confusing
print(a[-2::-2])
# works, but very confusing
print(a[-2:2:-2])

['a', 'c', 'e', 'g']
['b', 'd', 'f', 'h']
['c', 'e', 'g']
['g', 'e', 'c', 'a']
['g', 'e']


##### Use list comprehension instead of map and filter

- list comprehension = compact syntax for deriving one list from another. 
- **list comprehensions** are clearer than **map** functions.

In [77]:
# list comprehension is preferred here
a = [1, 2, 3]
a_sqr = [x**2 for x in a]
print(a_sqr)
# map works, but is more confusing - requires a lambda function for the computation
a_sqr_map = map(lambda x: x**2, a)
print(list(a_sqr_map))

[1, 4, 9]
[1, 4, 9]


In [80]:
# filtering using list comprehension is also easier than combining map and filter (not shown)
a_sqr_even = [x**2 for x in a if x%2 == 0]
print(a_sqr_even)

[4]
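
For comparison, a sketch of the same computation written with **map** and **filter**. It needs two lambdas and nested calls, which is why the comprehension reads more easily.

```python
a = [1, 2, 3]

# List comprehension: filter and transform in one expression
squares_of_evens = [x**2 for x in a if x % 2 == 0]

# Equivalent with map and filter
squares_of_evens_map = list(
    map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
)
```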


- Dictionaries and sets also have comprehensions

In [85]:
my_dict = dict(one = 1, two = 2, three = 3)
print(my_dict)
reverse_my_dict = {v:k for k, v in my_dict.items()}
print(reverse_my_dict)

{'one': 1, 'two': 2, 'three': 3}
{1: 'one', 2: 'two', 3: 'three'}


In [86]:
{type(v) for v in reverse_my_dict.values()}

{str}

##### avoid more than two expressions in list comprehension

- below examples are reasonable

In [91]:
# expressions run in order provided from left to right
list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for one_list in list_of_lists for x in one_list]
print(flat)

# list comprehension is equivalent to 
flat_list_w_loop = []
for one_list in list_of_lists:
    for x in one_list:
        flat_list_w_loop.append(x)
        
print(flat_list_w_loop)

[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]


In [94]:
# square each 
squared = [[x**2 for x in one_list] for one_list in list_of_lists]
print(squared)

[[1, 4, 9], [16, 25, 36], [49, 64, 81]]


In [105]:
# multiple if conditions are allowed in list comprehensions
# two equivalent ways to include them: join with "and", or chain several ifs
filtered_1 = [x for x in flat_list_w_loop if x > 4 and x % 2 == 0]
filtered_2 = [x for x in flat_list_w_loop if x > 4 if x % 2 == 0]
print(filtered_1)
print(filtered_2)

# if/else in a list comprehension
# note that the condition must move in front of the for clause
filtered_3 = [x if x > 4 and x % 2 == 0 else 0 for x in flat_list_w_loop]
print(filtered_3)


[6, 8]
[6, 8]
[0, 0, 0, 0, 0, 6, 0, 8, 0]


##### Consider generator expressions for large comprehensions

- List comprehensions create a whole new list.
- For large inputs, this can consume a significant amount of memory and crash your program. 
- _**Generator expressions**_ use the same syntax as list comprehensions but return an iterator that yields one item at a time.
- Syntax: list-comprehension syntax between '( )'.

In [109]:
list_a = [0, 1, 2, 3, 4, 5]
it_a = (x**2 for x in list_a)
print(it_a)
for i in range(3):
    print(next(it_a))

<generator object <genexpr> at 0x10b462e60>
0
1
4
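
A rough sketch of the memory difference, using an arbitrary input of 100,000 numbers. Note that `sys.getsizeof` measures only the container itself, so the figure for the list understates its real footprint.

```python
import sys

values = range(100000)

squares_list = [x**2 for x in values]   # all 100,000 results exist at once
squares_gen = (x**2 for x in values)    # nothing is computed yet

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)   # small and independent of input length

total = sum(squares_gen)                # consumes the generator item by item
leftover = next(squares_gen, None)      # None: a generator can be used only once
```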


##### Prefer enumerate over range

In [116]:
item_list = ['apple', 'pear', 'kiwi', 'orange']

# print items in list
for item in item_list:
    print(item)
    
# Now, print item with number    
# This is not advised - clunky!
print('\n')
for i in range(len(item_list)):
    print(f"Item {i}: {item_list[i]}")
    
# Instead, use enumerate:
print('\n')
for i, item in enumerate(item_list):
    print(f"Item {i}: {item}")    

apple
pear
kiwi
orange


Item 0: apple
Item 1: pear
Item 2: kiwi
Item 3: orange


Item 0: apple
Item 1: pear
Item 2: kiwi
Item 3: orange
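
**enumerate** also accepts a second argument giving the number to start counting from, which is handy for 1-based labels. A short sketch:

```python
item_list = ['apple', 'pear', 'kiwi', 'orange']

# Start counting at 1 instead of 0
numbered = [f"Item {i}: {item}" for i, item in enumerate(item_list, 1)]
```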


##### Use zip to process iterators in parallel

- Note that zip returns an iterator in python 3

In [121]:
students = ['Mark', 'Jack', 'Bill']
grades = ['A', 'B', 'C']

for s, g in zip(students, grades):
    print(f"{s}: {g}")
    
# Note zip behavior with unequal lists.
# Zip yields tuples until a list is exhausted. Longer list will get chopped!
students.append('Lenny')
print(students)
for s, g in zip(students, grades):
    print(f"{s}: {g}")


Mark: A
Jack: B
Bill: C
['Mark', 'Jack', 'Bill', 'Lenny']
Mark: A
Jack: B
Bill: C
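
If the truncation is not what you want, **itertools.zip_longest** from the standard library keeps going until the longest input is exhausted and fills the gaps. A sketch, where the fill value `'n/a'` is an arbitrary choice:

```python
from itertools import zip_longest

students = ['Mark', 'Jack', 'Bill', 'Lenny']
grades = ['A', 'B', 'C']

# Missing grades are replaced by fillvalue (None if not given)
pairs = list(zip_longest(students, grades, fillvalue='n/a'))
```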


##### Avoid else block after for and while loops

- Python (unusually) allows an else block immediately after for and while loops.   
- Avoid this! The else block runs only when the loop completes without hitting a **break** statement, which is counterintuitive and confusing.
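
For reference, a sketch of what a loop's else block actually does, next to an alternative that returns early from a helper function. Both function names are hypothetical.

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only when the loop finished without hitting break,
        # including when numbers is empty
        result = None
    return result


def find_first_negative_clear(numbers):
    # Same behavior without else: return as soon as a match is found
    for n in numbers:
        if n < 0:
            return n
    return None
```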

##### Use each block in try/except/else/finally

- There are 4 potential times to take an action when handling an exception (captured in try/except/else/finally).  
- Various combinations are useful.

1. Use **try/finally** when you want exception to propagate up, but you want to run clean up code even when the exception occurs. 

Common example: want to close a file handle even when exception occurs. 

- Note that an exception raised by the read method will propagate up to the calling code. 
- The close method is guaranteed to run. 
- Because the call to open is outside the try/finally block, close will not be executed if opening the file fails. 

In [122]:
handle = open('file1.txt') # May raise FileNotFoundError
try: 
    data = handle.read() # May raise UnicodeDecodeError
finally:
    handle.close() # Always runs after try

2. Use **try/except/else** to make it clear which exceptions will be handled by your code and which exceptions will propagate up. When the try block doesn't raise an exception, the else block will run. 

3. Use **try/except/else/finally** when you want to do it all in one compound statement.
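
Cases 2 and 3 can be sketched as follows. Both functions are illustrative: `load_json_key` shows try/except/else, and the hypothetical `safe_divide` records which blocks ran so the order is visible.

```python
import json


def load_json_key(data, key):
    """try/except/else: handle bad JSON here, let a missing key propagate."""
    try:
        parsed = json.loads(data)          # may raise ValueError on bad JSON
    except ValueError as err:
        raise KeyError(key) from err       # handled: translated to KeyError
    else:
        return parsed[key]                 # a KeyError here propagates untouched


def safe_divide(numerator, denominator, log):
    """try/except/else/finally: all four blocks in one statement."""
    try:
        result = numerator / denominator   # may raise ZeroDivisionError
    except ZeroDivisionError:
        log.append('except')
        result = None
    else:
        log.append('else')                 # runs only if try raised nothing
    finally:
        log.append('finally')              # always runs
    return result


events = []
safe_divide(6, 3, events)
safe_divide(1, 0, events)
# events is now ['else', 'finally', 'except', 'finally']
```

Keeping the try block down to the one call that can fail, and moving the follow-up work into else, makes it obvious which exception each except clause is meant to catch.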