# Item 14: Prefer Exceptions to Returning None

- When writing utility functions, Python programmers are often tempted to give special meaning to a return value of None. This seems to make sense in some cases. For example, say you want a helper function that divides one number by another. In the case of dividing by zero, returning None seems natural because the result is undefined.

In [1]:
def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None

- Code using this function can interpret the return value accordingly.

In [2]:
x, y = 1, 0
result = divide(x, y)
if result is None:
    print('Invalid inputs')

Invalid inputs

- What happens when the numerator is zero? The return value will also be zero (as long as the denominator is nonzero). This can cause problems when you evaluate the result in a condition like an if statement. You may accidentally look for any False-equivalent value to indicate errors instead of only looking for None.

In [3]:
x, y = 0, 5
result = divide(x, y)
if not result:
    print('Invalid inputs') # This is wrong!

Invalid inputs
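
- To make the pitfall concrete, here is a small sketch that repeats the None-returning divide helper and compares a truthiness check with an explicit `is None` check. The variable names (`zero_result`, `failed_result`, and so on) are illustrative only.

```python
def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None

zero_result = divide(0, 5)    # a valid result: 0.0
failed_result = divide(1, 0)  # the error case: None

# A truthiness check cannot tell the two cases apart.
looks_invalid = (not zero_result, not failed_result)

# An identity check against None flags only the real failure.
is_invalid = (zero_result is None, failed_result is None)

print(looks_invalid)  # (True, True)
print(is_invalid)     # (False, True)
```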


- This is a common mistake in Python code when None has special meaning. This is why returning None from a function is error prone. There are two ways to reduce the chance of such errors.

- The first way is to split the return value into a two-tuple. The first part of the tuple indicates that the operation was a success or failure. The second part is the actual result that was computed.

In [4]:
def divide(a, b):
    try:
        return True, a / b
    except ZeroDivisionError:
        return False, None

- Callers of this function have to unpack the tuple. That forces them to consider the status part of the tuple instead of just looking at the result of division.

In [5]:
success, result = divide(x, y)
if not success:
    print('Invalid inputs')

- The problem is that callers can easily ignore the first part of the tuple (using the underscore variable name, a Python convention for unused variables). The resulting code doesn't look wrong at first glance. This is as bad as just returning None.

In [6]:
_, result = divide(x, y)
if not result:
    print('Invalid inputs')
    

Invalid inputs


- The second, better way to reduce these errors is to never return None at all. Instead, raise an exception up to the caller and make them deal with it. Here, I turn a ZeroDivisionError into a ValueError to indicate to the caller that the input values are bad:

In [7]:
def divide(a, b):
    try:
        return a / b 
    except ZeroDivisionError as e:
        raise ValueError('Invalid inputs') from e
        

- Now the caller should handle the exception for the invalid input case (this behavior should be documented). The caller no longer requires a condition on the return value of the function. If the function didn't raise an exception, then the return value must be good. The outcome of exception handling is clear.

In [8]:
x, y = 5, 2
try:
    result = divide(x, y)
except ValueError:
    print('Invalid inputs')
else:
    print('Result is %.1f' % result)

Result is 2.5
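
- As a hedged sketch of what documenting this behavior could look like, the version below repeats divide with a docstring (the docstring wording is my own assumption, not part of the original). It then checks two things: a zero numerator is now just an ordinary result, and the original ZeroDivisionError stays attached to the ValueError as `__cause__` because of `raise ... from`.

```python
def divide(a, b):
    """Return a / b.

    Raises:
        ValueError: if b is zero.
    """
    try:
        return a / b
    except ZeroDivisionError as e:
        raise ValueError('Invalid inputs') from e

zero_result = divide(0, 5)  # 0.0, no special handling needed

caught = None
cause = None
try:
    divide(1, 0)
except ValueError as error:
    caught = error           # keep a reference; 'error' is unbound after the block
    cause = error.__cause__  # the original ZeroDivisionError

print(zero_result)
print(type(cause).__name__)
```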


## Things to Remember

- Functions that return None to indicate special meaning are error prone because None and other values (e.g., zero, the empty string) all evaluate to False in conditional expressions.

- Raise exceptions to indicate special situations instead of returning None. Expect the calling code to handle exceptions properly when they're documented.

# Item 15: Know How Closures Interact with Variable Scope

- Say you want to sort a list of numbers but prioritize one group of numbers to come first. This pattern is useful when you're rendering a user interface and want important messages or exceptional events to be displayed before everything else.

- A common way to do this is to pass a helper function as the key argument to a list's sort method. The helper's return value will be used as the value for sorting each item in the list. The helper can check whether the given item is in the important group and can vary the sort key accordingly.

In [5]:
def sort_priority(values, group):
    def helper(x):
        if x in group:
            return (0, x)
        return (1, x)
    values.sort(key=helper)

- This function works for simple inputs.

In [6]:
numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
sort_priority(numbers, group)
print(numbers)

[2, 3, 5, 7, 1, 4, 6, 8]


- There are three reasons why this function operates as expected:

* Python supports closures: functions that refer to variables from the scope in which they were defined. This is why the helper function is able to access the group argument to sort_priority.

* Functions are first-class objects in Python, meaning you can refer to them directly, assign them to variables, pass them as arguments to other functions, compare them in expressions and if statements, etc. This is how the sort method can accept a closure function as the key argument.

* Python has specific rules for comparing tuples. It first compares items in index zero, then index one, then index two, and so on. This is why the return value from the helper closure causes the sort order to have two distinct groups.
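
- The tuple comparison rule can be checked directly. This is a minimal sketch; the sample keys are made up to resemble what helper returns.

```python
# Index zero decides the order whenever it differs.
first_item_decides = (0, 9) < (1, 2)

# When index zero ties, the comparison moves on to index one.
tie_broken_by_second = (1, 2) < (1, 9)

# Keys shaped like helper's return values: (0, x) for items in the
# group, (1, x) for everything else.
keys = [(1, 8), (0, 3), (1, 1), (0, 2)]
ordered = sorted(keys)

print(first_item_decides, tie_broken_by_second)
print(ordered)
```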

- It'd be nice if this function returned whether higher-priority items were seen at all so the user interface code can act accordingly. Adding such behavior seems straightforward. There's already a closure that checks each item against the group, so why not also use it to flip a flag when high-priority items are seen? Then the function can return the flag value after it's been modified by the closure.

In [7]:
def sort_priority2(numbers, group):
    found = False
    def helper(x):
        if x in group:
            found = True
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found

In [8]:
found = sort_priority2(numbers, group)
print('Found:', found)
print(numbers)

Found: False
[2, 3, 5, 7, 1, 4, 6, 8]


- The sorted results are correct, but the found result is wrong. Items from group were definitely found in numbers, but the function returned False. How could this happen?

- When you reference a variable in an expression, the Python interpreter will traverse the scope to resolve the reference in this order:


1. The current function's scope
2. Any enclosing scopes (like other containing functions)
3. The scope of the module that contains the code (also called the global scope)
4. The built-in scope (that contains functions like len and str)

- If none of these places have a defined variable with the referenced name, then a NameError exception is raised.
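
- The lookup order above can be traced with a small sketch. All names here (`label`, `outer`, `reads_missing`, and so on) are hypothetical and exist only for illustration.

```python
label = 'module'           # module (global) scope

def outer():
    label = 'enclosing'    # scope of the containing function

    def inner():
        label = 'local'    # scope of the current function
        return label

    def inner_no_local():
        return label       # no local definition: falls back to the enclosing scope

    return inner(), inner_no_local()

def reads_module():
    return label           # no local or enclosing definition: module scope

def reads_builtin():
    return len('abc')      # len is resolved in the built-in scope

def reads_missing():
    return undefined_name  # defined nowhere: NameError when called

found_in = outer() + (reads_module(),)
builtin_result = reads_builtin()

missing_raised = False
try:
    reads_missing()
except NameError:
    missing_raised = True

print(found_in)
```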

- Assigning a value to a variable works differently. If the variable is already defined in the current scope, then it will just take on the new value. If the variable doesn't exist in the current scope, then Python treats the assignment as a variable definition. The scope of the newly defined variable is the function that contains the assignment.
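
- A stripped-down sketch of this assignment rule, using hypothetical names (`outer`, `inner`, `value`): the assignment inside inner defines a new local variable and leaves the enclosing one untouched.

```python
def outer():
    value = 'outer'

    def inner():
        value = 'inner'  # defines a new variable local to inner
        return value

    inner_value = inner()
    return inner_value, value  # outer's own value is unchanged

result = outer()
print(result)  # ('inner', 'outer')
```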

- This assignment behavior explains the wrong return value of the sort_priority2 function. The found variable is assigned to True in the helper closure. The closure's assignment is treated as a new variable definition within helper, not as an assignment within sort_priority2.

In [None]:
def sort_priority2(numbers, group):
    found = False         # Scope: 'sort_priority2'
    def helper(x):
        if x in group:
            found = True  # Scope: 'helper' -- Bad!
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found

- Encountering this problem is sometimes called the scoping bug because it can be so surprising to newcomers. But this is the intended result. This behavior prevents local variables in a function from polluting the containing module. Otherwise, every assignment within a function would put garbage into the global module scope. Not only would that be noise, but the interplay of the resulting global variables could cause obscure bugs.

### Getting Data Out

- In Python 3, there is special syntax for getting data out of a closure. The nonlocal statement is used to indicate that scope traversal should happen upon assignment for a specific variable name. The only limit is that nonlocal won't traverse up to the module-level scope (to avoid polluting globals).

In [9]:
def sort_priority3(numbers, group):
    found = False
    def helper(x):
        nonlocal found
        if x in group:
            found = True
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found
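
- A quick check of this version. The definition is repeated so the snippet runs on its own; the second call, with a list that shares nothing with the group, is my own addition to show the False case.

```python
def sort_priority3(numbers, group):
    found = False
    def helper(x):
        nonlocal found
        if x in group:
            found = True
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found

numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}

found = sort_priority3(numbers, group)
found_without_match = sort_priority3([1, 4, 6], group)

print('Found:', found)
print(numbers)
print('Found without match:', found_without_match)
```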

- The nonlocal statement makes it clear when data is being assigned out of a closure into another scope. It's complementary to the global statement, which indicates that a variable's assignment should go directly into the module scope.
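
- The module-level limit of nonlocal can be observed without breaking a script. In this sketch the offending function is kept in a string and handed to the built-in compile, so the SyntaxError can be caught. The function name `set_flag` and the variable `flag` are hypothetical.

```python
# A function with no enclosing function scope: nonlocal has nothing to bind to.
source = (
    "def set_flag():\n"
    "    nonlocal flag\n"
    "    flag = True\n"
)

rejected = False
message = ''
try:
    compile(source, '<sketch>', 'exec')
except SyntaxError as error:
    rejected = True
    message = str(error)

print(rejected)
print(message)
```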

- However, much like the anti-pattern of global variables, I'd caution against using nonlocal for anything beyond simple functions. The side effect of nonlocal can be hard to follow. It's especially hard to understand in long functions where the nonlocal statement and assignments to associated variables are far apart. 

- When your usage of nonlocal starts getting complicated, it's better to wrap your state in a helper class. Here, I define a class that achieves the same result as the nonlocal approach. It's a little longer, but much easier to read.

In [10]:
class Sorter(object):
    def __init__(self, group):
        self.group = group
        self.found = False  # start as False so the attribute always exists

    def __call__(self, x):
        if x in self.group:
            self.found = True
            return (0, x)
        return (1, x)

sorter = Sorter(group)
numbers.sort(key=sorter)
assert sorter.found is True

### Scope in Python 2

- Unfortunately, Python 2 doesn't support the nonlocal keyword. In order to get similar behavior, you need to use a workaround that takes advantage of Python's scoping rules. This approach isn't pretty, but it's the common Python idiom.

In [11]:
# Python 2
def sort_priority(numbers, group):
    found = [False]
    def helper(x):
        if x in group:
            found[0] = True
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found[0]
    

- As explained above, Python will traverse up the scope in which the found variable is referenced to resolve its current value. The trick is that the value for found is a list, which is mutable. This means that once retrieved, the closure can modify the state of found to send data out of the inner scope (with found[0] = True).

- This approach also works when the variable used to traverse the scope is a dictionary, a set, or an instance of a class you've defined.
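
- As an illustration of the dictionary variant, here is a minimal sketch. The function name `sort_priority_dict` and the `state` dictionary are hypothetical; the code runs unchanged on Python 3 as well.

```python
def sort_priority_dict(numbers, group):
    state = {'found': False}       # mutable container in the enclosing scope
    def helper(x):
        if x in group:
            state['found'] = True  # mutates the dict; the name state is never reassigned
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return state['found']

numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
found = sort_priority_dict(numbers, group)

print('Found:', found)
print(numbers)
```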

## Things to Remember

- Closure functions can refer to variables from any of the scopes in which they were defined.
- By default, closures can't affect enclosing scopes by assigning variables. 
- In Python 3, use the nonlocal statement to indicate when a closure can modify a variable in its enclosing scopes.
- In Python 2, use a mutable value (like a single-item list) to work around the lack of the nonlocal statement.
- Avoid using nonlocal statements for anything beyond simple functions.

# Item 16: Consider Generators Instead of Returning Lists

- The simplest choice for functions that produce a sequence of results is to return a list of items. For example, say you want to find the index of every word in a string. Here, I accumulate results in a list using the append method and return it at the end of the function:

In [12]:
def index_words(text):
    result = []
    if text:
        result.append(0)
    for index, letter in enumerate(text):
        if letter == ' ':
            result.append(index + 1)
    return result  # always return the list, even for empty text

In [13]:
address = 'Four score and seven years ago...'
result = index_words(address)
print(result[:3])

[0, 5, 11]


- There are two problems with the index_words function.

- The first problem is that the code is a bit dense and noisy. Each time a new result is found, I call the append method. The method call's bulk (result.append) deemphasizes the value being added to the list (index + 1). There is one line for creating the result list and another for returning it. While the function body contains ~130 characters, only ~75 characters are important.

- A better way to write the function is using a generator. Generators are functions that use yield expressions. When called, generator functions do not actually run but instead immediately return an iterator. With each call to the next built-in function, the iterator will advance the generator to its next yield expression. Each value passed to yield by the generator will be returned by the iterator to the caller.

- Here, I define a generator function that produces the same results as before.

In [14]:
def index_words_iter(text):
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1

In [15]:
result = list(index_words_iter(address))

In [16]:
result

[0, 5, 11, 15, 21, 27]

In [17]:
address

'Four score and seven years ago...'
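
- Stepping the iterator by hand with next shows the lazy behavior of a generator. This sketch repeats index_words_iter so it runs on its own and uses a shortened input string of my own choosing.

```python
def index_words_iter(text):
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1

it = index_words_iter('Four score and')  # nothing has run yet

first = next(it)   # runs up to the first yield
second = next(it)  # resumes and runs to the next yield
third = next(it)

exhausted = False
try:
    next(it)       # no more yields: the iterator signals the end
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```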

- The second problem with index_words is that it requires all results to be stored in the list before being returned. For huge inputs, this can cause your program to run out of memory and crash. In contrast, a generator version of this function can easily be adapted to take inputs of arbitrary length.

- Here, I define a generator that streams input from a file one line at a time and yields outputs one word at a time. The working memory for this function is bounded to the maximum length of one line of input.

In [18]:
def index_file(handle):
    offset = 0
    for line in handle:
        if line:
            yield offset
        for letter in line:
            offset += 1
            if letter == ' ':
                yield offset
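
- A possible way to try index_file without a real file is to feed it an in-memory handle. In this sketch, io.StringIO stands in for an open file (an assumption made for self-containment), and itertools.islice takes only the first few results, leaving the rest unconsumed.

```python
import io
from itertools import islice

def index_file(handle):
    offset = 0
    for line in handle:
        if line:
            yield offset
        for letter in line:
            offset += 1
            if letter == ' ':
                yield offset

# Two short lines stand in for a large input file.
handle = io.StringIO('Four score\nand seven\n')
it = index_file(handle)

first_three = list(islice(it, 0, 3))  # consume only three results
remaining = list(it)                  # whatever is left

print(first_three)
print(remaining)
```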

- The only gotcha of defining generators like this is that the callers must be aware that the iterators returned are stateful and can't be reused.
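
- The statefulness is easy to demonstrate. In this sketch (index_words_iter is repeated so the snippet stands alone), a second pass over the same iterator yields nothing, while calling the generator function again gives a fresh iterator.

```python
def index_words_iter(text):
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1

text = 'Four score and'
it = index_words_iter(text)

first_pass = list(it)   # consumes the iterator
second_pass = list(it)  # already exhausted: produces nothing

fresh_pass = list(index_words_iter(text))  # a new call gives a new iterator

print(first_pass)
print(second_pass)
print(fresh_pass)
```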

## Things to Remember

- Using generators can be clearer than the alternative of returning lists of accumulated results.
- The iterator returned by a generator produces the set of values passed to yield expressions within the generator function's body. 
- Generators can produce a sequence of outputs for arbitrarily large inputs because their working memory doesn't include all inputs and outputs.

# Item 17: Be Defensive When Iterating Over Arguments