Questions for the group:
- how do you work with this book?
- how useful do you find it? especially people new to Python?

My general impression:
- I think this book is useful if you're already familiar with Python; I'd find it hard to use as a learning tool if I didn't know the language
- it lacks examples outside of simple syntax (so far)
- it assumes knowledge on the reader's part unevenly (e.g. it doesn't assume you know what version of Python you're running, but throws bitwise operations into the middle of examples without explanation)
- it reads more like a set of tips that would be useful to look up than a comprehensive treatment of topics (e.g. the exception handling item assumes you already know the syntax and doesn't provide a general explanation of how errors propagate in Python)

# Pythonic thinking

In [10]:
# Install pycodestyle for PEP8 to work
# add a global setting in pycodestyle config to ignore the no new line rule

# pip install pycodestyle
# pip install pycodestyle_magic
%load_ext pycodestyle_magic

The pycodestyle_magic extension is already loaded. To reload it, use:
  %reload_ext pycodestyle_magic


## Item 2: PEP8

Whitespace:
- use spaces instead of tabs for indentation
- four spaces for syntactic indenting
- 79 chars lines
- long expressions indented by four extra spaces on the next line
- in a file, functions and classes separated by two blank lines
- in a class, methods separated by one blank line
- no spaces around: list indexes, function calls, keyword argument assignments
- one space before and after variable assignments

Naming:
- `lowercase_underscore` - functions, variables, attributes
- `_leading_underscore` - protected instance attributes (by convention, accessible within the class and sub-classes)
- `__double_leading_underscore` - private instance attributes (accessible within the class)
- `CapitalizedWord` - classes and exceptions
- `ALL_CAPS` - module-level constants
- `self` - name of the first parameter of instance methods
- `cls` - name of the first parameter of class methods

Expressions and statements:
- use inline negation (`if a is not b`) instead of negating a positive expression (`if not a is b`)
- don't check empty by checking length, use `if not <item>` (empty values implicitly evaluate to `False`)
- don't check non-empty by checking length, use `if <item>`
- avoid single-line `if`, `for`, `while`, and `except` - multiple lines for clarity
- put `import` at the top of the file
- use absolute names for importing modules, e.g. `from bar import foo`
- for relative imports use explicit syntax, e.g. `from . import foo`
- imports: standard library modules, third-party modules, your own modules; each subsection alphabetical
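
A few of the expression rules above, as a minimal sketch (the names `wishlist`, `a`, and `b` are made up for illustration):

```python
wishlist = []
a = [1, 2]
b = [1, 2]

# Prefer implicit truthiness over `if len(wishlist) == 0`
if not wishlist:
    status = 'empty'
else:
    status = 'has items'

# Prefer inline negation over `if not a is b`
if a is not b:
    identity = 'different objects'
else:
    identity = 'same object'

print(status)
print(identity)
```

Note that `a == b` is still `True` here; `is not` only compares identity.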

## Item 3: `bytes`, `str`, and `unicode`

Python 3:
- `bytes` - raw 8-bit values, binary serialization format represented by a sequence of 8-bit integers, good for storing data and sending it to other applications, can only use ASCII literal characters
- `str` - Unicode characters, no binary encoding associated (represented internally as a sequence of Unicode codepoints), default type when creating a string

Python 2:
- `unicode` - Unicode characters, no binary encoding associated
- `str` - raw 8-bit values

Represent Unicode characters as binary:
- most common encoding is `UTF-8`
- Unicode -> binary - `encode`
- binary -> Unicode - `decode`

Encode and decode at the furthest boundary of interfaces - the core should use Unicode (`str` in Python 3 and `unicode` in Python 2) and not assume a specific character encoding. Be accepting of alternative text encodings on input and strict about the output encoding (ideally `UTF-8`).

Two most common cases:
- operate on raw 8-bit values with `UTF-8` encoded characters (or a different encoding)
- operate on Unicode characters that have no specific encoding

It's useful to have helper methods that convert input to `bytes` and `str` (Python 3, Python 2 would need `unicode` and `str`).

In [26]:
%%pycodestyle


def to_str(bytes_or_str):
    """
    @input - bytes or str
    @output - str
    """
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value


def to_bytes(bytes_or_str):
    """
    @input - bytes or str
    @output - bytes
    """
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value

There are gotchas when dealing with raw 8-bit values and Unicode.

1. In Python 2, `unicode` and `str` seem to be the same type when a `str` only contains 7-bit ASCII characters:
    - you can combine them together using `+`
    - you can compare them using equality and inequality operators
    - you can use `unicode` in format strings like '%s'


2. In Python 3, `bytes` and `str` instances are never equivalent (not even when empty), so you must be more deliberate about types. Comparing them with `==` always gives `False`, and combining them with `+` or ordering them with `<` raises a `TypeError`. They're not friends. They don't mix.


3. Operations involving file handles (`open`) default to text mode in Python 3 (reading and writing `str`, with a platform-dependent encoding that is usually `UTF-8`), while in Python 2 they work with raw bytes.
    - in Python 2, `with open(file, 'w')` will work for binary data, would fail in Python 3
    - in Python 3, you need to open in write binary mode instead `with open(file, 'wb')`
    - same with reading - Python 2 uses `r` for binary, Python 3 requires `rb` mode
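
A small sketch of gotchas 2 and 3 in Python 3. The file name `data.bin` and the variable names are made up for illustration; the file lives in a temporary directory that is removed afterwards.

```python
import os
import tempfile

# bytes and str never compare equal in Python 3, even when both are empty
empty_equal = (b'' == '')
print(empty_equal)

# Combining them with + raises TypeError
try:
    b'kit' + 'ten'
except TypeError as error:
    concat_failed = True
    print('TypeError:', error)
else:
    concat_failed = False

payload = b'\xf1\xf2\xf3'
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'data.bin')

    # Text mode ('w') expects str, so writing bytes fails
    try:
        with open(path, 'w') as handle:
            handle.write(payload)
    except TypeError as error:
        text_mode_failed = True
        print('TypeError:', error)
    else:
        text_mode_failed = False

    # Binary mode ('wb' / 'rb') round-trips the raw bytes
    with open(path, 'wb') as handle:
        handle.write(payload)
    with open(path, 'rb') as handle:
        round_trip = handle.read()

print(round_trip == payload)
```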

In [30]:
print(type('Kitten!'))
print(type(b'Kitten!'))

<class 'str'>
<class 'bytes'>


In [36]:
string = '$%@#'

# print bytes
encoded = string.encode('utf-8')
print(encoded)

# print str
encoded.decode()

b'$%@#'


'$%@#'

### But why?
There is no way to tell which encoding a byte string uses just by looking at it. Technically, at the lowest level, everything is made of bytes, but to be practically usable in applications we need to know the encoding. Using Unicode strings helps us preserve this information.

Sources:
- https://timothybramlett.com/Strings_Bytes_and_Unicode_in_Python_2_and_3.html
- https://medium.com/better-programming/strings-unicode-and-bytes-in-python-3-everything-you-always-wanted-to-know-27dc02ff2686

### The Unicode sandwich
As mentioned before, you want `bytes` on input and output, but manipulate `str` in your application.

```
------------------
   bytes (input)    <-- data from the outside world
------------------
     decode()
------------------
str (manipulation)  <-- access to many useful string manipulation libraries and methods
------------------
     encode()
------------------
  bytes (output)    <-- send it back to the outside world
------------------
```
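
The sandwich could look roughly like this in code. `normalize_name` is a hypothetical helper, and the input is assumed to be `UTF-8` encoded:

```python
def normalize_name(raw: bytes) -> bytes:
    """Hypothetical helper: tidy up a name received as UTF-8 bytes."""
    text = raw.decode('utf-8')      # bytes -> str at the input boundary
    text = text.strip().upper()     # all manipulation happens on str
    return text.encode('utf-8')     # str -> bytes at the output boundary


incoming = '  tiny kitten \n'.encode('utf-8')
outgoing = normalize_name(incoming)
print(outgoing)
```

Only the first and last lines of the function know about the encoding; everything in between works on `str`.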

Fun talk about Unicode: https://nedbatchelder.com/text/unipain.html

## Item 4: Helper functions
Python is pretty expressive, which may encourage you to write single-line expressions with a lot of logic. But readability outweighs brevity.

In [52]:
from urllib.parse import parse_qs

In [76]:
my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)
print(repr(my_values))

print('\nRed:     ', my_values.get('red')) 
print('Green:   ', my_values.get('green'))
print('Opacity: ', my_values.get('opacity'))

red = my_values.get('red', [''])[0] or 0
green = my_values.get('green', [''])[0] or 0
opacity = my_values.get('opacity', [''])[0] or 0

print('\nRed:     %r' % red) 
print('Green:   %r' % green)
print('Opacity: %r' % opacity)

red = int(my_values.get('red', [''])[0] or 0)
green = int(my_values.get('green', [''])[0] or 0)
opacity = int(my_values.get('opacity', [''])[0] or 0)

print('\nRed:     %r' % red) 
print('Green:   %r' % green)
print('Opacity: %r' % opacity)

# ternary operator is a bit clearer, but not too much of an improvement
red = my_values.get('red', ['']) 
red = int(red[0]) if red[0] else 0
green = my_values.get('green', ['']) 
green = int(green[0]) if green[0] else 0
opacity = my_values.get('opacity', [''])
opacity = int(opacity[0]) if opacity[0] else 0

print('\nRed:     %r' % red) 
print('Green:   %r' % green)
print('Opacity: %r' % opacity)

{'red': ['5'], 'blue': ['0'], 'green': ['']}

Red:      ['5']
Green:    ['']
Opacity:  None

Red:     '5'
Green:   0
Opacity: 0

Red:     5
Green:   0
Opacity: 0

Red:     5
Green:   0
Opacity: 0


In [72]:
def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        found = int(found[0])
    else:
        found = default
    return found


get_first_int(my_values, 'green')

0

## Item 5: Slicing sequences
Slicing gets a subset of a sequence with minimal effort, built-in for `list`, `str`, and `bytes`. Can be extended to any Python class with `__getitem__` and `__setitem__`.

Basic syntax: `somelist[start:end]` (start inclusive, end exclusive)

In [125]:
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

print('First four                  ', letters[:4])
print('Last four                                       ', letters[-4:])
print('Middle two                                 ', letters[3:-3])

# Would throw AssertionError if not correct
assert letters[:5] == letters[0:5]
assert letters[5:] == letters[5:len(letters)]

print("All                         ", letters[:])
print("First five                  ", letters[:5])
print("All but last                ", letters[:-1])
print("Everything after first four                     ", letters[4:])
print("Last three                                           ", letters[-3:])
print("Third to fifth                        ", letters[2:5])
print("Third to second last                  ", letters[2:-1])
print("Third last to second last                            ", letters[-3:-1])

# calling letters[20] directly causes an IndexError
# but slicing out of bounds is fine
print("First twenty items          ", letters[:20])
print("Last twenty items           ", letters[-20:])

# copy of the original list
letters[-0:]

First four                   ['a', 'b', 'c', 'd']
Last four                                        ['e', 'f', 'g', 'h']
Middle two                                  ['d', 'e']
All                          ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
First five                   ['a', 'b', 'c', 'd', 'e']
All but last                 ['a', 'b', 'c', 'd', 'e', 'f', 'g']
Everything after first four                      ['e', 'f', 'g', 'h']
Last three                                            ['f', 'g', 'h']
Third to fifth                         ['c', 'd', 'e']
Third to second last                   ['c', 'd', 'e', 'f', 'g']
Third last to second last                             ['f', 'g']
First twenty items           ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Last twenty items            ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']


['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

Slicing a list creates a new list. References to objects from the original list are maintained, but a change in the result of slicing won't affect the original list.

In [115]:
letters_two = letters[4:]
print("Before:                         ", letters_two)
letters_two[1] = 'kitten'
print("After:                          ", letters_two)
print("No change:  ", letters)

Before:                          ['e', 'f', 'g', 'h']
After:                           ['e', 'kitten', 'g', 'h']
No change:   ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']


When used in assignments, slices will replace the specified range. The two sides don't need to be the same size (unlike tuple assignments such as `a, b = c[:2]`). Values before and after the slice will be preserved and the list will grow or shrink.

In [119]:
print("Before ", letters)
letters[2:7] = ['tiny', 'lil', 'kitten']
print("After  ", letters)

Before  ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
After   ['a', 'b', 'tiny', 'lil', 'kitten', 'h']


Leaving both start and end empty will make a copy of the original list.

In [123]:
a = [1, 2]
b = a[:]

assert b == a and b is not a

Assigning to a slice with no start or end replaces the entire contents of the list in place with a copy of what's referenced (instead of allocating a new list).

In [124]:
b = a
print("Before a ", a)
print("Before b ", b)
a[:] = ['tiny', 'lil', 'kitten']
assert a is b
print("After a  ", a)
print("After b  ", b)

Before a  [1, 2]
Before b  [1, 2]
After a   ['tiny', 'lil', 'kitten']
After b   ['tiny', 'lil', 'kitten']
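
Coming back to the point that slicing can be extended to any class with `__getitem__` and `__setitem__`: a minimal sketch, where `Playlist` is a made-up class that simply delegates to an internal list.

```python
class Playlist:
    """Hypothetical container that supports indexing and slicing."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __getitem__(self, index):
        # For playlist[1:3] Python passes a slice object as the index
        if isinstance(index, slice):
            return Playlist(self._tracks[index])
        return self._tracks[index]

    def __setitem__(self, index, value):
        # Delegating to the list handles both single items and slices
        self._tracks[index] = value

    def __repr__(self):
        return 'Playlist(%r)' % self._tracks


playlist = Playlist(['a', 'b', 'c', 'd'])
first_two = playlist[:2]
playlist[1:3] = ['x']

print(first_two)
print(playlist)
```

Returning a new `Playlist` from a slice mirrors how slicing a `list` returns a new `list`; that is a design choice here, not a requirement.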


## Item 6: Avoid using `start`, `end`, and `stride` in a single slice
Python has special syntax for the stride of a slice: `somelist[start:end:stride]`. It lets you take every n-th item when slicing a sequence.

In [2]:
a = ['red', 'orange', 'yellow', 'green', 'blue', 'purple']
odds = a[::2]
evens = a[1::2]
print(odds)
print(evens)

['red', 'yellow', 'blue']
['orange', 'green', 'purple']


That can cause some unexpected behaviour. E.g. a common trick for reversing a byte string is to slice it with a stride of `-1`.

In [3]:
x = b'kitten'
y = x[::-1]
print(y)

b'nettik'


It works well for byte strings and ASCII text, but will break for Unicode characters encoded as `UTF-8` byte strings.

In [5]:
w = 'ąę'
x = w.encode('utf-8')
y = x[::-1]
z = y.decode('utf-8')

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x99 in position 0: invalid start byte
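
One way around this, assuming the goal is to reverse characters rather than bytes, is to reverse the `str` first and encode afterwards. This is only a sketch: characters built from several code points (e.g. combining accents) would still get scrambled.

```python
w = 'ąę'

# Reverse the Unicode characters first, then encode
reversed_text = w[::-1]
reversed_bytes = reversed_text.encode('utf-8')

print(reversed_text)
print(reversed_bytes.decode('utf-8'))
```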

In [11]:
# It's not always obvious when stride, start, and end come to play
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print('Get every second item from start: ', letters[::2])
print('Get every second item from end (going backwards): ', letters[::-2])
print('Get every second item starting at third item: ', letters[2::2])
print('Get every second item from second last (going backwards): ', letters[-2::-2])
print('Get every second item from second last (going backwards), and end at second step', letters[-2:2:-2])
print('Get every second item from an empty set (going backwards)', letters[2:2:-2])

Get every second item from start:  ['a', 'c', 'e', 'g']
Get every second item from end (going backwards):  ['h', 'f', 'd', 'b']
Get every second item starting at third item:  ['c', 'e', 'g']
Get every second item from second last (going backwards):  ['g', 'e', 'c', 'a']
Get every second item from second last (going backwards), and end at second step ['g', 'e']
Get every second item from an empty set (going backwards) []


In [13]:
# Consider using one assignment for stride and another to slice
# this will create an extra shallow copy, might be better to use itertools
print(letters[::2])
print(letters[::2][1:-1])

['a', 'c', 'e', 'g']
['c', 'e']


Prefer:
- using positive `stride` in slices without `start` or `end`
- avoiding negative `stride`
- avoiding `start`, `end` and `stride` in a single slice - prefer two steps or `islice` from `itertools`
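
For the last point, a rough comparison of the two-step approach with `islice` from `itertools`. The `letters` list is the one used earlier in this item.

```python
from itertools import islice

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Two steps: stride first, then slice (creates an extra shallow copy)
two_steps = letters[::2][1:-1]

# islice(iterable, start, stop, step) yields lazily without that copy;
# it does not accept negative start, stop, or step values
lazy = list(islice(letters, 2, 6, 2))

print(two_steps)
print(lazy)
```

Because `islice` can't take negative indexes, the end of the range has to be spelled out (`6` here) instead of counting from the back.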

## Item 7: List comprehensions

In [14]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
squares = [num**2 for num in numbers]

print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


List comprehensions have a simpler syntax than `map`, which requires creating a lambda function for the same computation.

In [18]:
squares = map(lambda x: x**2, numbers)
print(list(squares))

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


List comprehensions allow for easily filtering items from the input list.

In [19]:
even_squares = [num**2 for num in numbers if num % 2 == 0]
print(even_squares)

[4, 16, 36, 64, 100]


`map` can be combined with `filter` for the same effect.

In [20]:
even_squares = map(lambda x: x**2, filter(lambda x: x % 2 == 0, numbers))
print(list(even_squares))

[4, 16, 36, 64, 100]


In [21]:
# Dictionary comprehension
chile_ranks = {'ghost': 1, 'habanero': 2, 'cayenne': 3}
rank_dict = {rank: name for name, rank in chile_ranks.items()}
print(rank_dict)

{1: 'ghost', 2: 'habanero', 3: 'cayenne'}


In [22]:
# Set comprehension
chile_len_set = {len(name) for name in rank_dict.values()}
print(chile_len_set)

{8, 5, 7}


## Item 8: Avoid more than two expressions in list comprehensions
List comprehensions support multiple levels of looping.

In [24]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9]


In [26]:
squared = [[x**2 for x in row] for row in matrix]
print(squared)

[[1, 4, 9], [16, 25, 36], [49, 64, 81]]


Including another loop might get noisy.

In [31]:
lists = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
flat = [x for level_1 in lists
       for level_2 in level_1
       for x in level_2]
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]


In [32]:
flat_2 = []
for level_1 in lists:
    for level_2 in level_1:
        flat_2.extend(level_2)
print(flat_2)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]


In [34]:
# multiple if conditions
# two conditions at the same loop level are an implicit `and`
first = [x for x in numbers if x > 4 if x % 2 == 0]
second = [x for x in numbers if x > 4 and x % 2 == 0]
first == second

True

In [37]:
# avoid expressions like this as they're difficult to comprehend
filtered = [[x for x in row if x % 3 == 0] 
            for row in matrix if sum(row) >= 10]
print(filtered)

[[6], [9]]
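
A possible alternative to the comprehension above, sketched as a plain loop so that each condition gets its own line. It uses the same `matrix` as before.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

filtered = []
for row in matrix:
    # Skip rows whose sum is too small
    if sum(row) < 10:
        continue
    # Keep only the values divisible by 3
    filtered.append([x for x in row if x % 3 == 0])

print(filtered)
```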


## Item 9: Generator expressions for large comprehensions
Using list comprehensions is good for small inputs, but with large amounts of data they can hoard memory, because the comprehension builds a whole new list with one item for every value in the input.

In [41]:
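with open("my_file.txt", "w") as my_file:
    my_file.write('a' * 100 + '\n')
    my_file.write('b' * 22 + '\n')
    my_file.write('c' * 74 + '\n')

# the with block closes (and flushes) the file before it is read back
with open("my_file.txt") as my_file:
    value = [len(x) for x in my_file]
print(value)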

[101, 23, 75]


Python provides generator expressions, a generalization of list comprehensions and generators (generators are a simple way of creating iterators). They don't materialize the whole sequence; instead they evaluate to an iterator that yields one item at a time from the expression.

Generator expressions are like building generators on the fly.

In [45]:
it = (len(x) for x in open("my_file.txt"))
print(it)
print(next(it))
print(next(it))
print(next(it))
print(next(it))

<generator object <genexpr> at 0x1096fad00>
101
23
75


StopIteration: 

In [47]:
it_2 = (len(x) for x in open("my_file.txt"))
roots = ((x, x**0.5) for x in it_2)

print(next(roots))
print(next(roots))

(101, 10.04987562112089)
(23, 4.795831523312719)


Chaining generators is fast and can be used for composing functionality on a large stream of input. However, the iterators they return are stateful, so you must be careful to use each one only once.
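
A quick illustration of what "stateful" means here, using a made-up list of words: once a generator has been consumed, a second pass yields nothing and raises no error.

```python
words = ['kitten', 'bird', 'puppy']
lengths = (len(word) for word in words)

first_pass = list(lengths)   # consumes the generator
second_pass = list(lengths)  # nothing left to yield

print(first_pass)
print(second_pass)
```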

## Item 10: Prefer enumerate over range
`range` is useful for loops that iterate over a set of integers.

In [88]:
from random import randint
random_bits = 0
for i in range(64):
    if randint(0, 1):
        # do a bitwise OR
        # left shift by i
        random_bits |= 1 << i
print(random_bits)

4015867568697019049


In [90]:
# you can loop directly over a sequence
flavour_list = ['strawberry', 'mango', 'raspberry', 'chocolate']
for flavour in flavour_list:
    print('%s is delicious' % flavour)

strawberry is delicious
mango is delicious
raspberry is delicious
chocolate is delicious


In [91]:
# sometimes you want to know the index of an item
for i in range(len(flavour_list)):
    flavour = flavour_list[i]
    print('%d: %s' % (i + 1, flavour))

1: strawberry
2: mango
3: raspberry
4: chocolate


In [92]:
# enumerate is better (wraps an iterator with a lazy generator)
for i, flavour in enumerate(flavour_list):
    print('%d: %s' % (i + 1, flavour))

1: strawberry
2: mango
3: raspberry
4: chocolate


In [93]:
# can specify the initial index
for i, flavour in enumerate(flavour_list, 1):
    print('%d: %s' % (i, flavour))

1: strawberry
2: mango
3: raspberry
4: chocolate


### Bitwise operations

#### `&` - AND
```
      1010
    & 1000
    ------
      1000
```

    Can be used to:
    - quickly test divisibility by 2, e.g. `6 & 1 -> 0`


#### `|` - OR
```
      1010
    | 1000
    ------
      1010
```

#### `^` - XOR
```
      1010
    ^ 1000
    ------
      0010
```

#### `~` - NOT (complement) (TBD)
```

```

    Can be used to:
    - invert a gray scale image

    Note: Python integers are signed, so `~x` evaluates to `-x - 1`.
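
A small sketch of how NOT behaves with Python's signed integers, and how a mask recovers the fixed-width result. The 4-bit and 8-bit widths and the `pixel` value are assumptions made for the example.

```python
a = 0b1010

# Python integers are signed and unbounded, so ~a is -a - 1
plain_not = ~a

# Masking keeps a fixed number of bits, here the last 4
four_bit_not = ~a & 0b1111

# Inverting an 8-bit gray scale value (0-255) the same way
pixel = 200
inverted = ~pixel & 0xFF

print(plain_not)
print(bin(four_bit_not), four_bit_not)
print(inverted)
```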


#### `<<` - left shift (pad zeros to the right)
```
     1010 << 1
    ----------
    10100
```

    It's like multiplying a number by 2.

    In other languages, this could discard the overflow values (effectively creating a mask over the value), in Python a mask needs to be applied with &, e.g. `& 15` will leave only the last 4 bits.
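
For example, assuming we want to treat the value as 4 bits wide:

```python
a = 0b1010

shifted = a << 1            # the value just keeps growing in Python
masked = (a << 1) & 0b1111  # keep only the last 4 bits

print(bin(shifted), shifted)
print(bin(masked), masked)
```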


#### `>>` - right shift (cut on the right, pad zeros to the left)
```
    1010 >> 1
    ---------
    0101
```

    It's like dividing a number by 2.

In [78]:
a = 0b1010
b = 0b1000

print('a ', a, '\nb ', b)
print('AND ', bin(a & b), a & b)
print('OR ', bin(a | b), a | b)
print('XOR ', bin(a ^ b), a ^ b)
print('NOT a ', bin(~a), ~a)
print('LEFT a ', bin(a << 1), a << 1)
print('RIGHT a ', bin(a >> 1), a >> 1)

a  10 
b  8
AND  0b1000 8
OR  0b1010 10
XOR  0b10 2
NOT a  -0b1011 -11
LEFT a  0b10100 20
RIGHT a  0b101 5


## Item 11: Use zip for parallel iterators

In [105]:
animals = ['kitten', 'bird', 'puppy']
letters = [len(n) for n in animals]
print(letters)

[6, 4, 5]


In [98]:
longest_animal = None
max_letters = 0

for i in range(len(animals)):
    count = letters[i]
    if count > max_letters:
        longest_animal = animals[i]
        max_letters = count

print(longest_animal)

kitten


In [99]:
for i, animal in enumerate(animals):
    count = letters[i]
    if count > max_letters:
        longest_animal = animal
        max_letters = count

print(longest_animal)

kitten


In [100]:
# zip wraps two or more iterators with a lazy generator
for animal, count in zip(animals, letters):
    if count > max_letters:
        longest_animal = animal
        max_letters = count

Two potential issues:
- in Python 2, `zip` doesn't return a generator; it will fully exhaust the iterators and return a list of all the tuples
- if the input iterators are of different lengths, `zip` keeps yielding tuples only until one of the wrapped iterators is exhausted (meaning the shorter list)

In [106]:
animals.append('giraffe')
for animal, count in zip(animals, letters):
    print(animal)

kitten
bird
puppy


In [108]:
# to exhaust the longer list
from itertools import zip_longest

for animal, count in zip_longest(animals, letters):
    print(animal, count)

kitten 6
bird 4
puppy 5
giraffe None


## Item 12: Avoid `else` after `for` and `while`

In [109]:
for i in range(3):
    print('wee' + i*'ee')
else:
    print('wee no more')

wee
weeee
weeeeee
wee no more


In [110]:
for i in range(3):
    print('wee' + i*'ee')
    if i == 1:
        break
else:
    print('wee no more')

wee
weeee


In [111]:
for i in []:
    print('I will never run')
else:
    print('But I will')

But I will


In [112]:
while False:
    print('Nothing to see here')
else:
    print('I am free!')

I am free!


When would you use that?

You could use it when you need to know that the loop finished without ever hitting a `break`: the `else` block runs only in that case.

In [113]:
# in practice, you wouldn't write it like this
a = 4
b = 9
for i in range(2, min(a, b) + 1):
    print(i)
    if a % i == 0 and b % i == 0:
        print('Not coprime')
        break
else:
    print('Coprime')

2
3
4
Coprime


In [115]:
# 1. return early
def coprime(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False
    return True

coprime(a, b)

True

In [116]:
# 2. have a result variable
def coprime(a, b):
    is_coprime = True
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            is_coprime = False
            break
    return is_coprime

coprime(a, b)

True

## Item 13: Use each block in `try`/`except`/`else`/`finally`
Each block in the exception handling has a purpose.

### Finally
Use `try`/`finally` when you want exceptions to propagate up, but also want cleanup code to run even when exceptions occur.

In [117]:
handle = open("my_file.txt")
try:
    # any exception raised here propagates up
    data = handle.read()
finally:
    # guaranteed to run
    handle.close()

### Else
If `try` doesn't raise an exception, `else` will run. Minimize the amount of code in `try`.

In [118]:
import json


def load_json_key(data, key):
    try:
        # may raise ValueError
        result_dict = json.loads(data)
    except ValueError as e:
        raise KeyError from e
    else:
        # may raise KeyError
        return result_dict[key]

### All the things

In [120]:
UNDEFINED = object()

def divide_json(path):
    # may raise IOError
    handle = open(path, 'r+')
    try:
        # may raise UnicodeDecodeError
        data = handle.read()
        # may raise ValueError
        op = json.loads(data)
        # may raise ZeroDivisionError
        value = (op['numerator'] /
                op['denominator'])
    # fires on specific error
    except ZeroDivisionError as e:
        return UNDEFINED
    # runs only if the try block raised no exception
    else:
        op['result'] = value
        result = json.dumps(op)
        handle.seek(0)
        # may raise IOError
        handle.write(result)
        return value
    # happens no matter the result
    finally:
        handle.close()
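
To see the order in which the blocks run, here is a minimal sketch. `trace` is a hypothetical helper that records each block it passes through.

```python
def trace(action):
    """Hypothetical helper that records which blocks ran."""
    blocks = []
    try:
        blocks.append('try')
        action()
    except ZeroDivisionError:
        blocks.append('except')
    else:
        blocks.append('else')
    finally:
        blocks.append('finally')
    return blocks


no_error = trace(lambda: 1 / 1)
with_error = trace(lambda: 1 / 0)

print(no_error)
print(with_error)
```

An exception that isn't listed in `except` (say a `ValueError`) would skip both `except` and `else`, still run `finally`, and then propagate to the caller.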