# Chapter 1: Pythonic Thinking #

## Item 1: Know Which Version of Python You're Using ##

You can check the version from the command line with `python3 --version`, or at runtime with the `sys` module:

In [4]:
import sys
print(sys.version_info)
print(sys.version)

sys.version_info(major=3, minor=8, micro=10, releaselevel='final', serial=0)
3.8.10 (default, Jun  2 2021, 10:49:15) 
[GCC 9.4.0]
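
When code depends on a newer feature, the same `sys.version_info` tuple can be compared against a minimum version at runtime. A minimal sketch, assuming a hypothetical minimum of Python 3.8 (`MIN_VERSION` and `check_version` are illustrative names, not from these notes):

```python
import sys

# Hypothetical minimum version, chosen only for illustration.
MIN_VERSION = (3, 8)

def check_version(minimum=MIN_VERSION):
    """Return True if the running interpreter is at least `minimum`."""
    # sys.version_info compares element by element against a plain tuple.
    return sys.version_info >= minimum

if not check_version():
    print(f'Python {MIN_VERSION[0]}.{MIN_VERSION[1]}+ is recommended')
```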


## Item 2: Follow the PEP 8 Style Guide ##

Use 4 spaces, not tabs...  
tl;dr use pylint: `pylint <file> -r y`

## Item 3: Know the Differences Between bytes and str ##

In Python, there are two types that represent sequences of character data: `bytes` and `str`.  Instances of `bytes` contain raw, unsigned 8-bit values.  Instances of `str` contain Unicode code points that represent textual characters from human languages.  Importantly, `str` instances do not have an associated binary encoding, and `bytes` instances do not have an associated text encoding.  To convert Unicode data to binary data, you must call the `encode` method of `str`; to convert binary data to Unicode data, you must call the `decode` method of `bytes`.  You can explicitly specify the encoding you want to use for these methods, or accept the default, which is UTF-8 for both methods in Python 3.

Most binary operators work between instances of the same type, for example `assert b'red' > b'blue'`, but `bytes` and `str` instances cannot be mixed in operators such as `>` or `+`.
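
A short sketch of how the two types behave with operators, same-type first and then mixed. `raises_type_error` is a hypothetical helper added here only to keep the checks compact:

```python
# Instances of the same type compare and concatenate as expected.
assert b'red' > b'blue'
assert 'red' > 'blue'
assert b'one' + b'two' == b'onetwo'

# Mixing the two types is different: equality is always False ...
assert (b'foo' == 'foo') is False

# ... while concatenation and ordering raise TypeError.
def raises_type_error(operation):
    """Return True if calling `operation` raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False

assert raises_type_error(lambda: b'one' + 'two')
assert raises_type_error(lambda: 'red' > b'blue')
```

This is why helpers like `to_str` and `to_bytes` are handy at the boundaries of a program: they normalize input to one type before any operators are applied.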

In [9]:
def to_str(bytes_or_str):
    """ 
    function to take a bytes or str instance and always return a str
    """
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value

print(repr(to_str(b'foo')))
print(repr(to_str('bar')))


def to_bytes(bytes_or_str):
    """
    function to take a bytes or str instance and always return a bytes
    """
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value

print(repr(to_bytes(b'foo')))
print(repr(to_bytes('bar')))


'foo'
'bar'
b'foo'
b'bar'


## Item 4: Prefer Interpolated F-Strings Over C-Style Format Strings and str.format ##

There are four problems with C-style format strings in Python:
1. If you change the type or order of data values in the tuple on the right side of the formatting expression, you can get errors due to type conversion incompatibility.
2. They become difficult to read when you need to make small modifications to values before formatting them into a string, and this is an extremely common need.
3. If you want to use the same value in a format string multiple times, you have to repeat it in the right-side tuple.
4. Using dictionaries in formatting expressions also increases verbosity.
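
Problems 1 and 3 are easy to reproduce. The sketch below uses illustrative values (`key`, `value`, and `name` are assumptions, not taken from a specific program) to show a swapped tuple failing at runtime and a repeated value being passed twice:

```python
key = 'my_var'
value = 1.234

# Correct order: %-10s takes the str, %.2f takes the float.
formatted = '%-10s = %.2f' % (key, value)
print(formatted)  # my_var     = 1.23

# Problem 1: swapping the tuple order fails at runtime, because
# %.2f cannot format a str.
swapped_failed = False
try:
    '%-10s = %.2f' % (value, key)
except TypeError as exc:
    swapped_failed = True
    print('TypeError:', exc)

# Problem 3: a value used twice must appear twice in the tuple.
name = 'Max'
repeated = '%s loves food. See %s cook.' % (name, name)
print(repeated)  # Max loves food. See Max cook.
```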

To ameliorate these issues, Python 3 added support for *advanced string formatting* that is more expressive than the old C-style format strings that use the `%` operator; this functionality can be accessed through the `format` built-in function.

In [11]:
a = 1234.5678
formatted = format(a, ',.2f')
print(formatted)

1,234.57


The same formatting can be applied to multiple values together by calling the `format` method of the `str` type.

In [14]:
key = 'my_var'
value = 1.234

formatted = '{} = {}'.format(key, value)
print(formatted)

my_var = 1.234


You can also optionally provide a colon character followed by format specifiers to customize how values will be converted into strings:

In [16]:
formatted = '{:<10} = {:.2f}'.format(key, value)
print(formatted)

my_var     = 1.23
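
Placeholders in `str.format` can also carry a positional index, which addresses problem 3 from the list above (repeating a value) and lets the template reorder arguments. A brief sketch with illustrative values:

```python
name = 'Max'
# Index 0 refers to the first argument and can be reused.
formatted = '{0} loves food. See {0} cook.'.format(name)
print(formatted)  # Max loves food. See Max cook.

key = 'my_var'
value = 1.234
# Indexes also let the template reorder values without touching the arguments.
reordered = '{1} = {0}'.format(key, value)
print(reordered)  # 1.234 = my_var
```

The remaining problems (readability with inline modifications, verbosity with dictionaries) are not helped much by `str.format`, which is what motivates f-strings.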


Python 3.6 added *interpolated format strings* - f-strings for short - to solve issues 1-4 above once and for all.  F-strings solve problem number 4 by completely eliminating the redundancy of providing keys and values to be formatted.  They achieve this pithiness by allowing you to reference all names in the current Python scope as part of a formatting expression.

In [19]:
key = 'my_var'
value = 1.234

formatted = f'{key} = {value}'
print(formatted)

my_var = 1.234


F-strings also enable you to put a full Python expression within the placeholder braces, solving problem 2 from above by allowing small modifications to the values being formatted with concise syntax.

In [38]:
pantry = [
    ('avocados', 1.25),
    ('bananas', 2.5),
    ('cherries', 15),
]

# comparing C-style, format and f-string formatting
for i, (item, count) in enumerate(pantry):
    old_style = '#%d: %-10s = %d' % (i+1, item.title(), round(count))
    
    new_style = '#{}: {:<10s} = {}'.format(i+1, item.title(), round(count))
    
    f_string = f'#{i+1}: {item.title():<10s} = {round(count)}'
    
    print(old_style)
    print(new_style)
    print(f_string)
    
    assert old_style == new_style == f_string


#1: Avocados   = 1
#1: Avocados   = 1
#1: Avocados   = 1
#2: Bananas    = 2
#2: Bananas    = 2
#2: Bananas    = 2
#3: Cherries   = 15
#3: Cherries   = 15
#3: Cherries   = 15


## Item 5: Write Helper Functions Instead of Complex Expressions ##

The empty string, the empty list, and zero all evaluate to `False` implicitly.  
The behavior of the `get` method of dictionaries is to return its second argument if the key doesn't exist in the dictionary.  
Use if/else or ternary expressions:  
`red_str = my_values.get('red', [''])`  
`red = int(red_str[0]) if red_str[0] else 0`

Take care not to chain these pithy expressions into a hard-to-read one-liner; once an expression gets complicated or is needed more than once, move it into a helper function.
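
A minimal sketch of such a helper. It assumes `my_values` comes from parsing a query string with `urllib.parse.parse_qs`, which these notes do not state; the query string and the name `get_first_int` are illustrative:

```python
from urllib.parse import parse_qs

# Hypothetical query string used only for illustration.
my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)

def get_first_int(values, key, default=0):
    """Return the first value for `key` as an int, or `default` if missing or empty."""
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default

red = get_first_int(my_values, 'red')
green = get_first_int(my_values, 'green')      # present but empty
opacity = get_first_int(my_values, 'opacity')  # missing key
print(red, green, opacity)  # 5 0 0
```

The call sites stay short, and the handling of missing and empty values lives in one place.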

## Item 6: Prefer Multiple Assignment Unpacking Over Indexing ##
Python has a built-in tuple type that can be used to create immutable, ordered sequences of values.  Besides the immutability of tuples, one of their key features is *tuple unpacking*, which allows for assigning multiple values in a single statement.  
Tuple unpacking has less visual noise than accessing the tuple's indices, and it often requires fewer lines.


In [40]:
item = ('Peanut butter', 'Jelly')
first, second = item # tuple unpacking
print(first, 'and', second)

Peanut butter and Jelly


You can even use unpacking to swap values in place without the need to create temporary variables:

In [46]:
def bubble_sort(a):
    """ Bubble sort an input list in place """
    for _ in range(len(a)):
        for i in range(1, len(a)):
            if a[i] < a[i-1]:
                a[i-1], a[i] = a[i], a[i-1] #swap
                
names = ['pretzels', 'carrots', 'arugula', 'bacon']
bubble_sort(names)
print(names)


['arugula', 'bacon', 'carrots', 'pretzels']


In [48]:
# in IPython/Jupyter you can also perform introspection by typing the object name followed by a question mark (?)
names?

Another valuable application of unpacking is in the target list of for loops and similar constructs, such as comprehensions and generator expressions.

Compare this verbose way:


In [63]:
snacks = [('bacon', 350), ('donut', 240), ('muffin', 190)]
for i in range(len(snacks)):
    item = snacks[i]
    name = item[0]
    calories = item[1]
    print(f'#{i+1}: {name} has {calories} calories')

#1: bacon has 350 calories
#2: donut has 240 calories
#3: muffin has 190 calories


with this succinct, Pythonic version that combines tuple unpacking with the `enumerate` built-in function:

In [65]:
snacks = [('bacon', 350), ('donut', 240), ('muffin', 190)]
# start counting at 1 so the ranks match the verbose version above
for rank, (name, calories) in enumerate(snacks, 1):
    print(f'#{rank}: {name} has {calories} calories')

#1: bacon has 350 calories
#2: donut has 240 calories
#3: muffin has 190 calories


## Item 7: Prefer enumerate Over range ##
The `range` built-in function is useful for loops that iterate over a set of integers.  When you have a data structure to iterate over, like a list of strings, you can loop directly over the sequence instead.

Often you'll want to iterate over a list and also know the index of the current item in the list; for this case use the built-in `enumerate` function.  `enumerate` wraps any iterator with a lazy generator.  It yields pairs of the loop index and the next value from the given iterator, and accepts an optional second parameter that sets the value at which the loop index starts.


In [77]:
flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']
for flavor in flavor_list:
    print(f'{flavor} is delicious!')
    
print('\nUsing the range and len functions:')    
for i in range(len(flavor_list)):
    flavor = flavor_list[i]
    print(f'{i+1}: {flavor}')
    
print('\nUsing the enumerate function and stepping through the generator manually:')
it = enumerate(flavor_list)
print(next(it))
print(next(it))
print(next(it))
print(next(it))

print('\nUsing the enumerate function in a for loop:')
for i, flavor in enumerate(flavor_list):
    print(f'{i+1}: {flavor}')
    
# you can actually make the above more concise by specifying the number from which enumerate
# should begin counting (1 in this case) as the second parameter:
# This is the most pythonic way of doing it.
print('\nUsing the enumerate function in a for loop starting at index 1:')
for i, flavor in enumerate(flavor_list, 1):
    print(f'{i}: {flavor}')


vanilla is delicious!
chocolate is delicious!
pecan is delicious!
strawberry is delicious!

Using the range and len functions:
1: vanilla
2: chocolate
3: pecan
4: strawberry

Using the enumerate function and stepping through the generator manually:
(0, 'vanilla')
(1, 'chocolate')
(2, 'pecan')
(3, 'strawberry')

Using the enumerate function in a for loop:
1: vanilla
2: chocolate
3: pecan
4: strawberry

Using the enumerate function in a for loop starting at index 1:
1: vanilla
2: chocolate
3: pecan
4: strawberry


## Item 8: Use zip to Process Iterators in Parallel ##


In [94]:
names = ['Cecilia', 'Lise', 'Marie']
counts = [len(n) for n in names]
print(counts)

longest_name = None
max_count = 0
for i in range(len(names)):
    count = counts[i]
    if count > max_count:
        longest_name = names[i]
        max_count = count
        
print(longest_name)

[7, 4, 5]
Cecilia


A more pythonic, and cleaner, way of doing the above is to use the built-in function `zip`:


In [95]:
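# reset the running maximum so this cell does not rely on state left by the previous one
longest_name = None
max_count = 0
for name, count in zip(names, counts):
    if count > max_count:
        longest_name = name
        max_count = count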
        
print(longest_name)

Cecilia


One issue to be aware of with `zip` is that it only keeps yielding tuples until one of the wrapped iterators is exhausted, so its output is only as long as its shortest input.  This truncating behavior can be surprising, because the extra items are dropped silently.  When the inputs may differ in length, use the `zip_longest` function in the `itertools` module instead:

In [96]:
import itertools
names.append('Rosalind')

In [101]:
import itertools

# even though we've appended Rosalind to the names list, we won't see it by using the built-in zip
print('Even though we have appended Rosalind to the names list,'
      ' we will not see it by using the built-in zip function')
for name, count in zip(names, counts):
    print(name)

print('\nTo fix this we can use the itertools.zip_longest function:')
for name, count in itertools.zip_longest(names, counts):
    print(f'{name}: {count}')


Even though we have appended Rosalind to the names list, we will not see it by using the built-in zip function
Cecilia
Lise
Marie

To fix this we can use the itertools.zip_longest function:
Cecilia: 7
Lise: 4
Marie: 5
Rosalind: None
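
`zip_longest` pads the shorter input with `None` by default; its `fillvalue` keyword argument lets you pick a different placeholder. A small sketch, reusing the same names with the counts written out by hand:

```python
import itertools

names = ['Cecilia', 'Lise', 'Marie', 'Rosalind']
counts = [7, 4, 5]  # one entry short of names

# fillvalue replaces the default None for the exhausted iterator.
pairs = list(itertools.zip_longest(names, counts, fillvalue=0))
for name, count in pairs:
    print(f'{name}: {count}')
```

Whether a placeholder such as `0` is appropriate depends on the data; here it would misreport the length of 'Rosalind', so `None` is often the safer signal that a value is missing.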


## Item 9: Avoid else Blocks After for and while Loops ##

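In Python, an `else` block placed after a loop runs immediately after the loop finishes, but only if the loop was not exited with `break`.  This is easy to misread, because `else` after an `if` means "do this if the preceding block did not run".  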
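
A minimal sketch of when the `else` block of a loop actually runs: it is skipped when the loop exits through `break`, and runs otherwise, including when the loop body never executes. `describe_loop` is a hypothetical helper written for this illustration:

```python
def describe_loop(values, target):
    """Report whether the else block of a for loop ran."""
    for value in values:
        if value == target:
            result = 'break hit, else skipped'
            break
    else:
        # Runs only when the loop was never interrupted by break.
        result = 'loop completed, else ran'
    return result

print(describe_loop([1, 2, 3], 2))  # break hit, else skipped
print(describe_loop([1, 2, 3], 9))  # loop completed, else ran
print(describe_loop([], 9))         # loop completed, else ran
```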


In [105]:
# poor way of doing this:

a = 4
b = 9

for i in range(2, min(a,b) + 1):
    print('Testing', i)
    if a % i == 0 and b % i == 0:
        print('Not coprime')
        break
        
else:
    print('Coprime')

Testing 2
Testing 3
Testing 4
Coprime


In [107]:
# Better way 1: return early when I find a condition I'm looking for:

def coprime(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False
    return True

assert coprime(4, 9)
assert not coprime(3, 6)

In [110]:
# Better way 2: have a result variable that indicates whether I've found what I'm looking for in the loop:

def coprime_alternate(a, b):
    is_coprime = True
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            is_coprime = False
            break
    return is_coprime

assert coprime_alternate(4, 9)
assert not coprime_alternate(3, 6)

## Item 10: Prevent Repetition with Assignment Expressions ##

An assignment expression - also known as the *walrus operator* - is a new syntax introduced in Python 3.8 to solve a long-standing problem with the language that can cause code duplication.  Whereas normal assignment statements are written `a = b` and pronounced "a equals b", assignment expressions are written `a := b` and pronounced "a walrus b".  

Assignment expressions are useful because they enable you to assign variables in places where assignment statements are disallowed, such as in the conditional expression of an if statement.

In [115]:
fresh_fruit = {
    'apple': 10,
    'banana': 8,
    'lemon': 5,
}

def make_lemonade(count):
    print('Making lemonade.')

def out_of_stock():
    print('Out of stock.')

count = fresh_fruit.get('lemon', 0)
if count:
    make_lemonade(count)
else:
    out_of_stock()

Making lemonade.


The problem with this simple code is that it's noisier than it needs to be.  The `count` variable is used only within the first block of the if statement.  Defining `count` above the if statement makes it appear more important than it really is, as if all the code that follows, including the else block, will need to access it, when in fact that is not the case.

In the example below, the code is only one line shorter, but it's more readable because it's now clear that `count` is only relevant to the first block of the if statement.  The assignment expression first assigns a value to the `count` variable, and then evaluates that value in the context of the if statement to determine how to proceed with flow control.  This two-step behavior - assign and then evaluate - is the fundamental nature of the walrus operator.

In [118]:
# Better way

if count := fresh_fruit.get('lemon', 0):
    make_lemonade(count)
else:
    out_of_stock()

Making lemonade.
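
When the assigned value needs to be compared rather than just tested for truthiness, the assignment expression has to be wrapped in parentheses. A sketch under assumed names: `make_cider` and the threshold of 4 apples are hypothetical, and the helpers return strings instead of printing so the result is easy to inspect:

```python
fresh_fruit = {'apple': 10, 'banana': 8, 'lemon': 5}

def make_cider(count):
    # Hypothetical helper, in the spirit of make_lemonade.
    return f'Making cider with {count} apples.'

def out_of_stock():
    return 'Out of stock.'

# Parentheses are required: without them, count would be bound to the
# result of the comparison (a bool) rather than the number of apples.
if (count := fresh_fruit.get('apple', 0)) >= 4:
    message = make_cider(count)
else:
    message = out_of_stock()

print(message)  # Making cider with 10 apples.
```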
