### Effective Python: 59 Specific Ways to Write Better Python
by Brett Slatkin

### Chapter 1

#### Item 3: Know the Difference Between _bytes_, _str_, and _unicode_ (python3)

In python3, there are two types that represent sequences of characters: _**bytes**_ and _**str**_. Instances of bytes contain raw 8-bit values. Instances of str contain Unicode characters.
The most common Unicode encoding is utf-8.
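
As a quick illustration of the difference, the sketch below encodes a short string and compares lengths. The sample text `'café'` and the variable names are assumptions chosen for the example, not taken from the book.

```python
# A minimal sketch: the same text as str (Unicode characters) and as bytes (UTF-8).
text = 'café'                       # str: 4 Unicode characters
encoded = text.encode('utf-8')      # bytes: 'é' needs two bytes in UTF-8
decoded = encoded.decode('utf-8')   # back to a str

print(len(text))        # 4
print(len(encoded))     # 5
print(encoded)          # b'caf\xc3\xa9'
print(decoded == text)  # True
```

The str has four characters while the bytes value has five, because a single Unicode character may need more than one byte once encoded.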

Below are helper functions that convert between these two types and ensure that the type of input values matches your code's expectations.

- the following function takes a _**str**_ or _**bytes**_ and always returns a _**str**_.

Raw bytes have the _**decode()**_ method applied, which converts them to Unicode characters; the bytes are assumed to be utf-8 encoded in this case.

In [1]:
def to_str(bytes_or_str):
    """test for str or bytes, and return a str"""
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of str

In [2]:
a_bytes1 = b'hello world'

In [3]:
type(a_bytes1)

bytes

- call to\_str() function with a _**bytes**_ string:

In [4]:
new_str1 = to_str(a_bytes1)

In [5]:
type(new_str1)

str

In [6]:
a_str1 = 'hello python'

In [7]:
type(a_str1)

str

- call to_str() function with a **_str_** string:

In [8]:
new_str2 = to_str(a_str1)

In [9]:
type(new_str2)

str

- the following function takes a _**str**_ or _**bytes**_ and always returns a _**bytes**_

Unicode characters have the _**encode()**_ method applied, converting them to raw 8-bit values (utf-8 encoded in this case).

In [10]:
def to_bytes(bytes_or_str):
    """test for str or bytes and return a bytes"""
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of bytes

In [11]:
a_str2 = 'hello cheerlights'

In [12]:
type(a_str2)

str

- call to\_bytes() function with a _**str**_ string:

In [13]:
new_bytes1 = to_bytes(a_str2)

In [14]:
type(new_bytes1)

bytes

In [15]:
a_bytes2 = b'hello colorful'

In [16]:
type(a_bytes2)

bytes

- call to\_bytes() function with a _**bytes**_ string:

In [17]:
new_bytes2 = to_bytes(a_bytes2)

In [18]:
type(new_bytes2)

bytes

- In python3, **_bytes_** and **_str_** instances are never equivalent - not even the empty string - so you must be more deliberate about the types of character sequences that you're passing around.

- Also in python3, file handles (returned by the _open_ built-in function) default to text mode, which expects _**str**_ data and encodes it (utf-8 on most systems). This can cause unexpected failures when trying to write binary data to a file.

This will break in python3 (but not in python2)

<code>with open('random.bin', 'w') as f:
    f.write(os.urandom(10))</code>

In [35]:
import os

In [36]:
with open('random.bin', 'w') as f:
    f.write(os.urandom(10))

TypeError: write() argument must be str, not bytes

\>>> 
TypeError: must be str, not bytes

To make this work properly, you must indicate that the file is being opened in write binary mode (**'wb'**) instead of write character mode (**'w'**).
The following would work correctly in both python2 and python3.

<code>with open('random.bin', 'wb') as f:
    f.write(os.urandom(10))</code>

In [38]:
with open('random.bin', 'wb') as f:
    f.write(os.urandom(10))

With the correct file mode specified, this code succeeds and no error is raised.
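
The same rule applies when reading. The sketch below writes random bytes and reads them back with **'rb'**. It uses a temporary directory so nothing is left on disk; that choice, and the names `data` and `read_back`, are assumptions made for the example.

```python
import os
import tempfile

data = os.urandom(10)  # 10 random bytes

# Work in a temporary directory that is removed automatically afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'random.bin')
    with open(path, 'wb') as f:   # write binary mode accepts bytes
        f.write(data)
    with open(path, 'rb') as f:   # read binary mode returns bytes
        read_back = f.read()

print(type(read_back))    # <class 'bytes'>
print(read_back == data)  # True
```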

- In python3, bytes contains sequences of 8-bit values, str contains sequences of Unicode characters. bytes and str instances can't be used together with operators (like > or +).
- In python2, str contains sequences of 8-bit values, unicode contains sequences of Unicode characters. str and unicode _can_ be used together with operators if the str only contains 7-bit ASCII characters.
- Use helper functions to ensure that the inputs you operate on are the type of character sequence you expect (8-bit values, UTF-8 encoded characters, Unicode characters, etc.).
- If you want to read or write binary data to/from a file, always open the file using a binary mode (like 'rb' or 'wb').
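
The first point can be checked directly in python3. This is a small sketch; the variable names are only illustrative.

```python
# bytes and str never compare equal in python3, even when both are empty.
empty_equal = (b'' == '')             # False
ascii_equal = (b'hello' == 'hello')   # False

# Mixing the two types with + raises TypeError.
concat_error = None
try:
    b'hello ' + 'world'
except TypeError as exc:
    concat_error = type(exc).__name__

# Ordering comparisons such as > raise TypeError as well.
compare_error = None
try:
    b'a' > 'b'
except TypeError as exc:
    compare_error = type(exc).__name__

print(empty_equal, ascii_equal)     # False False
print(concat_error, compare_error)  # TypeError TypeError
```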

---

#### Item 4: Write Helper Functions Instead of Complex Expressions

- An example of Python's 'pithy' syntax, which makes it easy to write single-line expressions that implement a lot of logic:

In [19]:
from urllib.parse import parse_qs
my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)

In [20]:
print(repr(my_values))

{'red': ['5'], 'blue': ['0'], 'green': ['']}


Some query string parameters may have multiple values, some may have single values, some may be present but have blank values, and some may be missing entirely.

In [21]:
print('Red:      ', my_values.get('red'))
print('Green:    ', my_values.get('green'))
print('Opacity:  ', my_values.get('opacity'))

Red:       ['5']
Green:     ['']
Opacity:   None
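
The multiple-value case is not shown in the example above. As a sketch, assuming a hypothetical query string in which `red` appears twice, `parse_qs` collects every value for the key into the same list:

```python
from urllib.parse import parse_qs

# Hypothetical query string: 'red' is repeated, 'green' is blank.
multi_values = parse_qs('red=5&red=7&blue=0&green=', keep_blank_values=True)

print(multi_values.get('red'))      # ['5', '7']
print(multi_values.get('blue'))     # ['0']
print(multi_values.get('green'))    # ['']
print(multi_values.get('opacity'))  # None
```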


We'd prefer if a default value of 0 was assigned when a parameter isn't supplied or is blank.

Using one-line Boolean expressions instead of if statements:

In [22]:
red = my_values.get('red', [''])[0] or 0
green = my_values.get('green', [''])[0] or 0
opacity = my_values.get('opacity', [''])[0] or 0

In [23]:
print('Red:      %r' % red)
print('Green:    %r' % green)
print('Opacity:  %r' % opacity)

Red:      '5'
Green:    0
Opacity:  0
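
The trick relies on the empty string being falsy, so `or` falls through to the right-hand operand. A short sketch of that behaviour, with illustrative variable names:

```python
# `x or y` evaluates to x when x is truthy, otherwise to y.
blank = '' or 0         # empty string is falsy, so the result is the int 0
present = '5' or 0      # non-empty string is truthy, so the result is the str '5'
zero_string = '0' or 0  # '0' is a non-empty string, so it is truthy too

print(repr(blank))        # 0
print(repr(present))      # '5'
print(repr(zero_string))  # '0'
```

Note that the results have mixed types (str when a value is present, int when it is not), which matches the output above.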


To convert the values to integers for further use in mathematical expressions:

In [24]:
red = int(my_values.get('red', [''])[0] or 0)

This is even harder to read than the previous expressions. There's so much **visual noise**.

<code>if/else</code> conditional (or ternary) expressions make cases like this clearer while keeping the code short.

In [25]:
red = my_values.get('red', [''])
red = int(red[0]) if red[0] else 0

In [26]:
print(red)

5


The full <code>if/else</code> statement spread over multiple lines. Seeing all the logic laid out like this makes the dense version seem even more complex.

In [27]:
green = my_values.get('green', [''])
if green[0]:
    green = int(green[0])
else:
    green = 0

In [28]:
print(green)

0


- Writing a helper function is the way to go, especially if you might need to use this logic repeatedly.

In [33]:
def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        found = int(found[0])
    else:
        found = default
    return found

- Using the **get\_first\_int()** with *my\_values*:

In [30]:
get_first_int(my_values, 'red')

5

In [31]:
get_first_int(my_values, 'green')

0

In [32]:
get_first_int(my_values, 'opacity')

0
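
The `default` parameter can also be overridden per call. The sketch below restates the helper in a slightly shorter form so that it runs on its own; the query string and the default values are hypothetical, chosen only to exercise each branch.

```python
from urllib.parse import parse_qs


def get_first_int(values, key, default=0):
    """Return the first value for key as an int, or default if missing or blank."""
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default


# Hypothetical query string: 'red' repeated, 'green' blank, 'opacity' missing.
query = parse_qs('red=5&red=7&green=', keep_blank_values=True)

print(get_first_int(query, 'red'))                 # 5 (only the first value is used)
print(get_first_int(query, 'green', default=255))  # 255
print(get_first_int(query, 'opacity', default=1))  # 1
```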

- Python's syntax makes it all too easy to write single-line expressions that are overly complicated and difficult to read.
- Move complex expressions into helper functions, especially if you need to use the same logic repeatedly.
- The <code>if/else</code> expression provides a more readable alternative to using Boolean operators like _or_ and _and_ in expressions.

---