# Chapter 1: Pythonic Thinking

<div id="toc"></div>

## Item 1: Know Which Version of Python You’re Using

In [None]:
python --version

In [None]:
python3 --version

In [6]:
import sys
print(sys.version_info)
print(sys.version)

sys.version_info(major=3, minor=6, micro=0, releaselevel='final', serial=0)
3.6.0 |Anaconda custom (64-bit)| (default, Dec 23 2016, 11:57:41) [MSC v.1900 64 bit (AMD64)]


* There are two major versions of Python still in active use: Python 2 and Python 3. 
* There are multiple popular runtimes for Python: CPython, Jython, IronPython, PyPy, etc. 
* Be sure that the command-line executable for running Python on your system is the version you expect it to be.
* Prefer Python 3 for your next project because that is the primary focus of the Python community. 
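
A script can also check the interpreter version at startup instead of relying on whichever `python` happens to be on the path. The following is a minimal sketch; the helper name `require_python` and the `(3, 0)` minimum are illustrative assumptions, not something from the book.

```python
import sys


def require_python(minimum=(3, 0)):
    """Return True if the running interpreter is at least `minimum`.

    `require_python` is a hypothetical helper name, not a standard
    library function.
    """
    # sys.version_info behaves like a tuple, so a slice of it can be
    # compared against a (major, minor) pair.
    return sys.version_info[:2] >= tuple(minimum)


if not require_python((3, 0)):
    raise SystemExit('This script needs Python 3')

print('Running Python %d.%d' % sys.version_info[:2])
```

Comparing `sys.version_info` as a tuple avoids parsing the free-form `sys.version` string.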

## Item 2: Follow the PEP 8 Style Guide

http://www.python.org/dev/peps/pep-0008/

> Note
The Pylint tool (http://www.pylint.org/) is a popular static analyzer for Python source code.  
Pylint provides automated enforcement of the PEP 8 style guide and detects many other types of common errors in Python programs.  


* Always follow the PEP 8 style guide when writing Python code.  
* Sharing a common style with the larger Python community facilitates collaboration with others.  
* Using a consistent style makes it easier to modify your own code later.  

## Item 3: Know the Differences Between bytes, str, and unicode

In [7]:
def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of str

In [8]:
def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of bytes

In [None]:
# Python 2
def to_unicode(unicode_or_str):
    if isinstance(unicode_or_str, str):
        value = unicode_or_str.decode('utf-8')
    else:
        value = unicode_or_str
    return value  # Instance of unicode

In [None]:
# Python 2
def to_str(unicode_or_str):
    if isinstance(unicode_or_str, unicode):
        value = unicode_or_str.encode('utf-8')
    else:
        value = unicode_or_str
    return value  # Instance of str

In [None]:
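import os
# Fails in Python 3: text mode ('w') expects str, but os.urandom returns bytes
with open('/tmp/random.bin', 'w') as f:
    f.write(os.urandom(10))  # Raises TypeError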

In [None]:
with open('/tmp/random.bin', 'wb') as f:
    f.write(os.urandom(10))

* In Python 3, bytes contains sequences of 8-bit values, str contains sequences of Unicode characters.  
* bytes and str instances can’t be used together with operators (like > or +).  
* In Python 2, str contains sequences of 8-bit values, unicode contains sequences of Unicode characters.  
* str and unicode can be used together with operators if the str only contains 7-bit ASCII characters.  
* Use helper functions to ensure that the inputs you operate on are the type of character sequence you expect (8-bit values, UTF-8 encoded characters, Unicode characters, etc.).  
* If you want to read or write binary data to/from a file, always open the file using a binary mode (like 'rb' or 'wb').
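
Two of the points above can be checked directly in Python 3: mixing `bytes` and `str` with `+` fails, and binary modes round-trip raw bytes unchanged. This is a small sketch that writes to a temporary directory; the file name `random.bin` is only illustrative.

```python
import os
import tempfile

# Mixing bytes and str with + raises TypeError in Python 3.
try:
    b'one' + 'two'
except TypeError:
    mixing_failed = True
else:
    mixing_failed = False

# Write and read raw bytes using the binary modes 'wb' and 'rb'.
payload = os.urandom(10)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'random.bin')
    with open(path, 'wb') as f:
        f.write(payload)
    with open(path, 'rb') as f:
        round_trip = f.read()

print(mixing_failed)
print(round_trip == payload)
```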


## Item 4: Write Helper Functions Instead of Complex Expressions

In [16]:
from urllib.parse import parse_qs
my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)
print(repr(my_values))

{'red': ['5'], 'blue': ['0'], 'green': ['']}


In [17]:
print('Red: ', my_values.get('red'))
print('Green: ', my_values.get('green'))
print('Opacity:', my_values.get('opacity'))

Red:  ['5']
Green:  ['']
Opacity: None


In [None]:
# For query string 'red=5&blue=0&green='
red = my_values.get('red', [''])[0] or 0
green = my_values.get('green', [''])[0] or 0
opacity = my_values.get('opacity', [''])[0] or 0
print('Red: %r' % red)
print('Green: %r' % green)
print('Opacity: %r' % opacity)

In [None]:
red = int(my_values.get('red', [''])[0] or 0)

In [None]:
red = my_values.get('red', [''])
red = int(red[0]) if red[0] else 0

In [None]:
green = my_values.get('green', [''])
if green[0]:
    green = int(green[0])
else:
    green = 0

In [None]:
def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        found = int(found[0])
    else:
        found = default
    return found

In [None]:
green = get_first_int(my_values, 'green')

* Python’s syntax makes it all too easy to write single-line expressions that are overly complicated and difficult to read.  
* Move complex expressions into helper functions, especially if you need to use the same logic repeatedly.  
* The if/else expression provides a more readable alternative to using Boolean operators like or and and in expressions.  
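
The payoff of the helper shows up once the same logic is needed for several keys. The sketch below redefines `get_first_int` in a slightly shortened but equivalent form so the block runs on its own; the `colors` dictionary is an illustrative name.

```python
from urllib.parse import parse_qs


def get_first_int(values, key, default=0):
    """Return the first value for `key` as an int, or `default`."""
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default


my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)

# One readable call per key instead of three copies of a long expression.
colors = {name: get_first_int(my_values, name)
          for name in ('red', 'green', 'opacity')}
print(colors)
```

A present value (`red`), a blank value (`green`), and a missing key (`opacity`) all go through the same code path.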



## Item 5: Know How to Slice Sequences

In [2]:
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print('First four:', a[:4])
print('Last four:', a[-4:])
print('Middle two:', a[3:-3])

First four: ['a', 'b', 'c', 'd']
Last four: ['e', 'f', 'g', 'h']
Middle two: ['d', 'e']


In [19]:
assert a[:5] == a[0:5]

In [3]:
a[:5] == a[0:5]

True

In [20]:
assert a[5:] == a[5:len(a)]

In [4]:
a[5:] == a[5:len(a)]

True

In [None]:
a[:]       # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
a[:5]      # ['a', 'b', 'c', 'd', 'e']
a[:-1]     # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
a[4:]      # ['e', 'f', 'g', 'h']
a[-3:]     # ['f', 'g', 'h']
a[2:5]     # ['c', 'd', 'e']
a[2:-1]    # ['c', 'd', 'e', 'f', 'g']
a[-3:-1]   # ['f', 'g']

In [None]:
first_twenty_items = a[:20]
last_twenty_items = a[-20:]

In [19]:
a[20]  # Raises IndexError: list index out of range

In [5]:
b = a[4:]
print('Before:', b)
b[1] = 99
print('After:', b)
print('No change:', a)

Before: ['e', 'f', 'g', 'h']
After: ['e', 99, 'g', 'h']
No change: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']


In [6]:
print('Before', a)
a[2:7] = [99, 22, 14]
print('After', a)

Before ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
After ['a', 'b', 99, 22, 14, 'h']


* Avoid being verbose: Don’t supply 0 for the start index or the length of the sequence for the end index.  
* Slicing is forgiving of start or end indexes that are out of bounds, making it easy to express slices on the front or back boundaries of a sequence (like a[:20] or a[-20:]).  
* Assigning to a list slice will replace that range in the original sequence with what’s referenced even if their lengths are different.  
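
The contrast between forgiving slices and strict indexing, and between copying with a slice and assigning to one, can be sketched as follows. The variable names are illustrative.

```python
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Slices tolerate out-of-range bounds; direct indexing does not.
first_twenty = a[:20]
try:
    a[20]
except IndexError:
    index_failed = True
else:
    index_failed = False

# A slice with no bounds on the right-hand side makes a copy ...
b = a[:]
# ... while assigning to a slice replaces the contents in place,
# so every other reference to the same list sees the change.
alias = a
a[:] = [101, 102, 103]

print(first_twenty == b, index_failed)
print(alias)
```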

## Item 6: Avoid Using start, end, and stride in a Single Slice

In [8]:
a = ['red', 'orange', 'yellow', 'green', 'blue', 'purple']
odds = a[::2]
evens = a[1::2]
print(odds)

['red', 'yellow', 'blue']


In [9]:
x = b'mongoose'
y = x[::-1]
print(y)

b'esoognom'


In [10]:
w = '謝謝'
x = w.encode('utf-8')
y = x[::-1]
z = y.decode('utf-8')

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9d in position 0: invalid start byte

In [None]:
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
a[::2] # ['a', 'c', 'e', 'g']
a[::-2] # ['h', 'f', 'd', 'b']

In [None]:
a[2::2]     # ['c','e','g']
a[-2::-2]   # ['g','e','c','a']
a[-2:2:-2]  # ['g','e']
a[2:2:-2]   # []

In [None]:
b = a[::2]  # ['a', 'c', 'e', 'g']
c = b[1:-1] # ['c', 'e']

* Specifying start, end, and stride in a slice can be extremely confusing.  
* Prefer using positive stride values in slices without start or end indexes.  
* Avoid negative stride values if possible.  
* Avoid using start, end, and stride together in a single slice.  
* If you need all three parameters, consider doing two assignments (one to slice, another to stride) or using islice from the itertools built-in module.    
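
As a sketch of the last point, the same elements can be selected either with two separate steps or with `itertools.islice`, which takes `start`, `stop`, and `step` arguments but does not accept negative values for any of them.

```python
from itertools import islice

a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Two steps: stride first, then slice the smaller result.
strided = a[::2]          # every second item
two_step = strided[1:-1]  # drop the first and last of those

# One call: islice(iterable, start, stop, step).
via_islice = list(islice(a, 2, 6, 2))

print(two_step)
print(via_islice)
```

Striding first keeps the intermediate list as small as possible; `islice` avoids building an intermediate list at all.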


## Item 7: Use List Comprehensions Instead of map and filter

In [12]:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
squares = [x**2 for x in a]
print(squares)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


In [15]:
squares = map(lambda x: x ** 2, a)
print(squares)

<map object at 0x0000003EB99820B8>


In [16]:
even_squares = [x**2 for x in a if x % 2 == 0]
print(even_squares)

[4, 16, 36, 64, 100]


In [17]:
alt = map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
assert even_squares == list(alt)

In [18]:
chile_ranks = {'ghost': 1, 'habanero': 2, 'cayenne': 3}
rank_dict = {rank: name for name, rank in chile_ranks.items()}
chile_len_set = {len(name) for name in rank_dict.values()}
print(rank_dict)
print(chile_len_set)

{1: 'ghost', 2: 'habanero', 3: 'cayenne'}
{8, 5, 7}


* List comprehensions are clearer than the map and filter built-in functions because they don’t require extra lambda expressions.  
* List comprehensions allow you to easily skip items from the input list, a behavior map doesn’t support without help from filter.  
* Dictionaries and sets also support comprehension expressions.  
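
One practical difference in Python 3 is that `map` returns a lazy iterator, which is why printing it shows a `<map object ...>` rather than the values. A short sketch of the consequence:

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

squares_iter = map(lambda x: x ** 2, a)
first_pass = list(squares_iter)   # consumes the iterator
second_pass = list(squares_iter)  # nothing left the second time

# A list comprehension produces a real list that can be reused.
squares_list = [x ** 2 for x in a]

print(first_pass == squares_list)
print(second_pass)
```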

## Item 8: Avoid More Than Two Expressions in List Comprehensions

In [20]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
print(flat)

[1, 2, 3, 4, 5, 6, 7, 8, 9]


In [22]:
squared = [[x**2 for x in row] for row in matrix]
print(squared)

[[1, 4, 9], [16, 25, 36], [49, 64, 81]]


In [23]:
my_lists = [
    [[1, 2, 3], [4, 5, 6]],
    # …
]
flat = [x for sublist1 in my_lists
        for sublist2 in sublist1
        for x in sublist2]

In [24]:
flat = []
for sublist1 in my_lists:
    for sublist2 in sublist1:
        flat.extend(sublist2)

In [28]:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [x for x in a if x > 4 if x % 2 == 0]
print(b)
c = [x for x in a if x > 4 and x % 2 == 0]
print(c)

[6, 8, 10]
[6, 8, 10]


In [26]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filtered = [[x for x in row if x % 3 == 0]
            for row in matrix if sum(row) >= 10]
print(filtered)

[[6], [9]]


* List comprehensions support multiple levels of loops and multiple conditions per loop level.  
* List comprehensions with more than two expressions are very difficult to read and should be avoided.  
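
For comparison, here is one way the matrix-filtering comprehension could be unrolled so that each condition sits on its own line. It is a sketch of the alternative, keeping a single-expression comprehension only for the innermost step.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

filtered = []
for row in matrix:
    # Outer condition: only keep rows whose sum is at least 10.
    if sum(row) >= 10:
        # Inner condition: only keep cells divisible by 3.
        filtered.append([x for x in row if x % 3 == 0])

print(filtered)
```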

## Item 9: Consider Generator Expressions for Large Comprehensions

In [29]:
value = [len(x) for x in open('./chapter1/my_file.txt')]
print(value)

[73, 77, 72, 74, 78, 72]


In [30]:
it = (len(x) for x in open('./chapter1/my_file.txt'))
print(it)

<generator object <genexpr> at 0x0000003EB99448E0>


In [31]:
print(next(it))
print(next(it))

73
77


In [32]:
roots = ((x, x**0.5) for x in it)

In [33]:
print(next(roots))

(72, 8.48528137423857)


In [43]:
print(next(roots))

(74, 8.602325267042627)


* List comprehensions can cause problems for large inputs by using too much memory.  
* Generator expressions avoid memory issues by producing outputs one at a time as an iterator.  
* Generator expressions can be composed by passing the iterator from one generator expression into the for subexpression of another.  
* Generator expressions execute very quickly when chained together.  
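
The chaining behavior can be reproduced without a file on disk. The sketch below uses `io.StringIO` with made-up sample text as a stand-in for the open file, which is an assumption for illustration only.

```python
import io

# Hypothetical sample text standing in for a large file.
text = 'alpha\nbeta\ngamma\n'

# Nothing is read yet: both expressions are lazy.
lengths = (len(line) for line in io.StringIO(text))
roots = ((n, n ** 0.5) for n in lengths)

# Advancing the outer generator pulls one item through the inner one.
first = next(roots)
rest = list(roots)

print(first[0])
print([n for n, _ in rest])
```

Because `roots` draws from `lengths`, consuming `roots` also exhausts `lengths`; a generator can only be iterated once.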



## Item 10: Prefer enumerate Over range

In [37]:
from random import randint

random_bits = 0
for i in range(64):
    if randint(0, 1):
        random_bits |= 1 << i

In [38]:
flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']
for flavor in flavor_list:
    print('%s is delicious' % flavor)

vanilla is delicious
chocolate is delicious
pecan is delicious
strawberry is delicious


In [39]:
for i in range(len(flavor_list)):
    flavor = flavor_list[i]
    print('%d: %s' % (i + 1, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry


In [40]:
for i, flavor in enumerate(flavor_list):
    print('%d: %s' % (i + 1, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry


In [42]:
for i, flavor in enumerate(flavor_list, 1):
    print('%d: %s' % (i, flavor))

1: vanilla
2: chocolate
3: pecan
4: strawberry


* enumerate provides concise syntax for looping over an iterator and getting the index of each item from the iterator as you go.  
* Prefer enumerate instead of looping over a range and indexing into a sequence.  
* You can supply a second parameter to enumerate to specify the number from which to begin counting (zero is the default).  
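
Because `enumerate` works on any iterator, it also covers cases where `range(len(...))` is not an option, such as a generator that has no length and cannot be indexed. A small sketch, with `flavors` as a hypothetical generator function:

```python
def flavors():
    """Hypothetical generator: it has no len() and cannot be indexed."""
    yield 'vanilla'
    yield 'chocolate'
    yield 'pecan'


# Start counting at 1 by passing the second argument.
numbered = list(enumerate(flavors(), 1))

for i, flavor in numbered:
    print('%d: %s' % (i, flavor))
```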



## Item 11: Use zip to Process Iterators in Parallel

In [49]:
names = ['Cecilia', 'Lise', 'Marie']
letters = [len(n) for n in names]

In [52]:
longest_name = None
max_letters = 0
for i in range(len(names)):
    count = letters[i]
    if count > max_letters:
        longest_name = names[i]
        max_letters = count
print(longest_name)

Cecilia


In [58]:
for i, name in enumerate(names):
    count = letters[i]
    if count > max_letters:
        longest_name = name
        max_letters = count
print(longest_name)        

Cecilia


In [59]:
for name, count in zip(names, letters):
    if count > max_letters:
        longest_name = name
        max_letters = count
print(longest_name)        

Cecilia


In [60]:
names.append('Rosalind')
for name, count in zip(names, letters):
    print(name)

Cecilia
Lise
Marie


* The zip built-in function can be used to iterate over multiple iterators in parallel.  
* In Python 3, zip is a lazy generator that produces tuples.  
* In Python 2, zip returns the full result as a list of tuples.  
* zip truncates its output silently if you supply it with iterators of different lengths.  
* The zip_longest function from the itertools built-in module lets you iterate over multiple iterators in parallel regardless of their lengths (see Item 46: “Use Built-in Algorithms and Data Structures”).  
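
The truncation and the `zip_longest` alternative can be seen side by side. In this sketch the fill value of `0` is an arbitrary choice; `zip_longest` fills with `None` if no `fillvalue` is given.

```python
from itertools import zip_longest

names = ['Cecilia', 'Lise', 'Marie', 'Rosalind']
letters = [7, 4, 5]  # one entry short, as after names.append('Rosalind')

# zip stops at the shortest input without any warning.
truncated = list(zip(names, letters))

# zip_longest keeps going and pads the shorter input.
padded = list(zip_longest(names, letters, fillvalue=0))

print(len(truncated), len(padded))
print(padded[-1])
```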



## Item 12: Avoid else Blocks After for and while Loops

In [61]:
for i in range(3):
    print('Loop %d' % i)
else:
    print('Else block!')

Loop 0
Loop 1
Loop 2
Else block!


In [62]:
for i in range(3):
    print('Loop %d' % i)
    if i == 1:
        break
else:
    print('Else block!')

Loop 0
Loop 1


In [63]:
for x in []:
    print('Never runs')
else:
    print('For Else block!')

For Else block!


In [64]:
while False:
    print('Never runs')
else:
    print('While Else block! ')

While Else block! 


In [65]:
a = 4
b = 9
for i in range(2, min(a, b) + 1):
    print('Testing', i)
    if a % i == 0 and b % i == 0:
        print('Not coprime')
        break
else:
    print('Coprime')

Testing 2
Testing 3
Testing 4
Coprime


In [66]:
def coprime(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False
    return True

In [67]:
def coprime2(a, b):
    is_coprime = True
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            is_coprime = False
            break
    return is_coprime

* Python has special syntax that allows else blocks to immediately follow for and while loop interior blocks.  
* The else block after a loop only runs if the loop body did not encounter a break statement.  
* Avoid using else blocks after loops because their behavior isn’t intuitive and can be confusing.  



## Item 13: Take Advantage of Each Block in try/except/else/finally
 

In [None]:
handle = open('/tmp/random_data.txt')  # May raise IOError
try:
    data = handle.read()  # May raise UnicodeDecodeError
finally:
    handle.close()  # Always runs after try:

In [None]:
import json

def load_json_key(data, key):
    try:
        result_dict = json.loads(data)  # May raise ValueError
    except ValueError as e:
        raise KeyError from e
    else:
        return result_dict[key]  # May raise KeyError

In [None]:
import json

UNDEFINED = object()

def divide_json(path):
    handle = open(path, 'r+')  # May raise IOError
    try:
        data = handle.read()  # May raise UnicodeDecodeError
        op = json.loads(data)  # May raise ValueError
        value = (
            op['numerator'] /
            op['denominator'])  # May raise ZeroDivisionError
    except ZeroDivisionError as e:
        return UNDEFINED
    else:
        op['result'] = value
        result = json.dumps(op)
        handle.seek(0)
        handle.write(result)  # May raise IOError
        return value
    finally:
        handle.close()  # Always runs

* The try/finally compound statement lets you run cleanup code regardless of whether exceptions were raised in the try block.  
* The else block helps you minimize the amount of code in try blocks and visually distinguish the success case from the try/except blocks.  
* An else block can be used to perform additional actions after a successful try block but before common cleanup in a finally block.  
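
The order in which the four blocks run can be traced with a small sketch that records each step in a list. The function name `run` and the event labels are illustrative.

```python
def run(divisor):
    """Record which blocks execute for a given divisor."""
    events = []
    try:
        events.append('try')
        result = 10 / divisor
    except ZeroDivisionError:
        events.append('except')  # only when the try block raised
        result = None
    else:
        events.append('else')    # only when the try block succeeded
    finally:
        events.append('finally')  # in both cases, last
    return result, events


print(run(2))
print(run(0))
```

On success the sequence is `try`, `else`, `finally`; on a division by zero it is `try`, `except`, `finally`.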

