Welcome to **Lesson 4** of the Noisebridge Python Class! (https://github.com/audiodude/PythonClass)

In this lesson, we will try to make sure you can write basic Python scripts on your own. This means we will go over some of the more nitpicky details of the language that we may have glossed over in previous lessons, such as indentation and function definitions. We will also cover exceptions and basic debugging. We will learn what can go wrong with your programs and how to recover.

You will learn:

* Specifics about how Python indentation/whitespace works
* Definitions of positional and keyword arguments to functions
* The special Python arguments: `*args` and `**kwargs`
* Using keyword arguments to extend functionality in a backwards compatible way

You may have heard that in Python, "whitespace matters". If you've been editing the code examples in these notebooks, or writing code in a Replit, you might have already come across an `IndentationError` or two.

The rule in Python, in general, is that additional levels of indentation are used to define increasingly nested blocks of code. Inside a given "block", all of the lines that comprise the block must have the same indentation, i.e. the same number of tabs or spaces.

In [None]:
# Top level pseudo-block (not actually a block), no indentation
x = 42
y = x * 2
if y > x:
    # The colon above starts a block, inside the if statement
    print('y is greater')
    z = y * 2
    # Note that every line of code in this block aligns
    
for i in range(x):
    if i > 39 and i % 2 == 0:
        # A second block requires a new level of indentation
        print('%s is over 39 and even' % i)
        continue

    # Blank lines don't matter
    if i % 17 == 0:
            # The indentation of a block doesn't need to match
            # the indentation of other sibling blocks. It just
            # needs to be internally consistent
            print('%s is divisible by 17' % i)
            x2 = i + 10
            
    # Outdenting means the end of the block. This line runs after
    # each of the if statements above
    y2 = x - i
    # This is an IndentationError, because there's no new block,
    # but the indentation of the next line doesn't match the previous
    #  y3 = y2 * 3

*(note to self, uncomment the indentation error and demonstrate that)*
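
One way to see the error without breaking a whole cell is to hand the mis-indented code to Python's built-in `compile` as a string and catch the exception. This is only a minimal sketch; `bad_source` and `error_message` are names made up for this illustration.

```python
# A deliberately mis-indented snippet, kept in a string so that this
# cell itself still runs.
bad_source = (
    "x = 42\n"
    "y2 = x - 1\n"
    "  y3 = y2 * 3\n"  # extra indentation, but no new block was started
)

try:
    # compile() parses the source without running it
    compile(bad_source, '<example>', 'exec')
except IndentationError as err:
    error_message = err.msg
    print('IndentationError:', error_message)
```

Because indentation is checked when the code is parsed, none of the lines in `bad_source` get a chance to run, not even the correctly indented ones.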

---

Let's move on to function definitions. Functions can have any number of **positional arguments** and **keyword arguments**. Positional arguments are what we have seen so far. They are required when calling a function:

In [None]:
def my_func(pos1, pos2):
    print(pos1, pos2)

my_func(42, 'foo')
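
To see what "required" means in practice, here is a small sketch that leaves out the second argument and catches the resulting `TypeError`. The variable `missing_arg_error` is just a name chosen for this example.

```python
def my_func(pos1, pos2):
    print(pos1, pos2)

try:
    my_func(42)  # pos2 is missing
except TypeError as err:
    # Python names the missing argument in the error message
    missing_arg_error = str(err)
    print('TypeError:', missing_arg_error)
```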

Keyword arguments are optional and are defined with a **default value**. If the function is called with a given keyword argument missing, the default value is used inside the function. Otherwise, you can assign a value to a keyword argument when calling a function by specifying the name of the argument with an equal sign, then the value.

In [None]:
def my_func2(pos1, pos2, kw1=42):
    print(pos1, pos2, kw1)
    
my_func2(10, 'red', kw1=100)
my_func2(20, 'blue')

When calling a function, you must specify the keyword arguments *after* the positional arguments. So the following is an error:

In [None]:
my_func2('foo', kw1='bar', 100)
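
The call above is rejected as a `SyntaxError`, which means the cell can't even start running. A rough way to inspect the error, assuming we are willing to put the bad call in a string (`bad_call` and `syntax_error_message` are made-up names), is to pass it to the built-in `compile`:

```python
def my_func2(pos1, pos2, kw1=42):
    print(pos1, pos2, kw1)

# The bad call lives in a string so the rest of this cell can run
bad_call = "my_func2('foo', kw1='bar', 100)"

try:
    compile(bad_call, '<example>', 'eval')
except SyntaxError as err:
    syntax_error_message = err.msg
    print('SyntaxError:', syntax_error_message)

# Putting the positional arguments first is fine
my_func2('foo', 100, kw1='bar')
```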

Keyword arguments themselves, however, can be specified in any order.

In [None]:
def my_func3(pos1, pos2, kw1='foo', kw2='bar', kw3='baz'):
    print(pos1, pos2, kw1, kw2, kw3)
    
my_func3(10, 20, kw3='red', kw2='blue')

You can also specify keyword arguments as if they were positional arguments:

In [None]:
my_func3('red', 'yellow', 'blue')

And positional arguments as if they were keyword arguments:

In [None]:
my_func3(pos1=10, pos2=20)

Though in practice, doing so can cause confusion for folks who are reading your code.

---

Python provides the special parameters `*args` and `**kwargs`, which capture all of the remaining positional (`*args`) and keyword (`**kwargs`) arguments to a function. Let's see this in practice.

In [None]:
def color_them(color, *args):
    for arg in args:
        print('%s: %s' % (color, arg))
        
color_them('red', 1, 2, 3)

The first argument, `'red'`, is assigned to the parameter `color`. Then the next positional arguments, as many as we want, are collected into `args`, which is a tuple. Notice that when referring to `args` in the code, we omit the asterisk (`*`), which is only used in the function definition to indicate that `args` is a special variable that is capturing all of the remaining positional arguments.
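
It can help to inspect what `args` actually holds. The sketch below uses a hypothetical helper, `show_args`, that prints the type of `args` and returns it so we can look at it afterwards.

```python
def show_args(color, *args):
    # show_args is a made-up helper for this illustration
    print(color, type(args).__name__, args)
    return args

captured = show_args('red', 1, 2, 3)  # prints: red tuple (1, 2, 3)
empty = show_args('red')              # prints: red tuple ()
```

Note that calling the function with no extra positional arguments is not an error. In that case `args` is simply an empty tuple.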

We can define keyword arguments after `*args` if we like.

In [None]:
def color_them2(color, *args, print_twice=False):
    for arg in args:
        i = 1
        if print_twice:
            i = 2
        # We use a single underscore, '_', to indicate that
        # we're not using a variable. It doesn't have any
        # special meaning, it's just a convention.
        for _ in range(i):
            print('%s: %s' % (color, arg))
            
color_them2('blue', 10, 20, 30, 40, 50, print_twice=True)

What if we want to use a variable length list of `*args` to call a function that can take a variable length list of `*args`?

In [None]:
def color_with_header(color, *args):
    print('=== %s ===' % color)
    color_them(color, *args)
    
color_with_header('green', 100, 150, 200, 250)

Here, we are again using the asterisk (`*`), but it has a different meaning. When we use it on line 3 above in our call to `color_them`, we are using it as the **unpacking operator**. This means: take an actual sequence of items (here, the tuple `args`) and pass each item as its own positional argument, rather than passing the whole sequence as a single argument.

You may be wondering why we would use `*args` instead of just passing a single item that represents a list. We'll get back to that, promise.

In [None]:
def color_them3(color, things):
    for thing in things:
        print('%s: %s' % (color, thing))

color_them3('yellow', [3, 5, 7])

Just like we have a way to capture any variable number of positional args, we can also capture keyword args using `**kwargs`.

In [None]:
def print_prices(header, multiplier=1, **kwargs):
    print(header)
    for thing, price in kwargs.items():
        print('%s costs %s' % (thing, price * multiplier))
        
print_prices('The prices:', apple=1.29, orange=1.59, banana=0.89)

We can pass keyword arguments named with literally any valid Python identifier to the `print_prices` function, and they will all be captured in the dictionary `kwargs`. Notice that there is still a positional argument (we can have as many of those as we like) and a named keyword argument (`multiplier`) that can be specified as well and will be captured outside of `kwargs` (so `multiplier` won't be part of the `kwargs` dictionary).

In [None]:
print_prices('Toy prices:', train=5.50, multiplier=2, blocks=1.00)

Just as `*` unpacks a sequence into positional arguments, we can use the dictionary unpacking operator `**` to pass a dictionary to a function as keyword arguments.

In [None]:
def turn_the_car(direction='left', speed=30):
    print(direction, speed)
    
my_kwargs = {'direction': 'right'}
turn_the_car(**my_kwargs)

It's important to note that you can't call the function `turn_the_car` with an arbitrary unpacked dictionary, because it's not set up to accept arbitrary keyword arguments. The following call raises a `TypeError`:

In [None]:
my_kwargs2 = {'direction': 'up', 'brake': True, 'foo': 'bar'}
turn_the_car(**my_kwargs2)
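
If we want to look at the error without stopping the program, we can catch it. This is only a sketch; `unexpected_kw_error` is a name picked for the example.

```python
def turn_the_car(direction='left', speed=30):
    print(direction, speed)

my_kwargs2 = {'direction': 'up', 'brake': True, 'foo': 'bar'}

try:
    turn_the_car(**my_kwargs2)
except TypeError as err:
    # The function body never runs, so nothing is printed by turn_the_car
    unexpected_kw_error = str(err)
    print('TypeError:', unexpected_kw_error)
```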

So what's the point of all this? The main reason to capture `*args` and `**kwargs` is so that you can confidently delegate to or wrap helper functions. Let's say we had a function that performs some task. Maybe we want to print out a logging message before and after the task.

In [None]:
def perform_task(data, instruction, preference=False, num_rows=100):
    # Doesn't actually do anything, left to your imagination
    print(data, instruction, preference, num_rows)

def log_perform_task(*args, **kwargs):
    print('About to run perform_task')
    perform_task(*args, **kwargs)
    print('Done with perform_task')
    
perform_task([1,2,3], 'foo')
log_perform_task([4,5,6], 'bar', num_rows=50)

Here, what we're basically saying is: "Whatever positional arguments and keyword arguments were passed to this function, pass those same arguments to the function we're calling". So the `*args` and `**kwargs` arguments in the definition of `log_perform_task` capture the positional and keyword arguments, which are then **unpacked** and passed as the positional and keyword arguments of `perform_task`.

We could also modify or remove parameters:

In [None]:
def perform_twice_as_many_rows(*args, **kwargs):
    if 'num_rows' in kwargs:
        kwargs['num_rows'] *= 2
    perform_task(*args, **kwargs)
    
perform_twice_as_many_rows([1,2,3], 'foo', num_rows=500)
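
The example above modifies a parameter. Removing one could look like the following sketch, where `perform_with_default_rows` is a hypothetical wrapper that throws away any `num_rows` the caller passed. It returns the forwarded keyword arguments purely so the effect is easy to inspect.

```python
def perform_task(data, instruction, preference=False, num_rows=100):
    # Same stand-in as in the lesson: just print what we received
    print(data, instruction, preference, num_rows)

def perform_with_default_rows(*args, **kwargs):
    # Hypothetical wrapper: discard num_rows if present, so that
    # perform_task falls back to its own default of 100.
    # The second argument to pop() avoids a KeyError when it's absent.
    kwargs.pop('num_rows', None)
    perform_task(*args, **kwargs)
    # Returned only so we can see what was forwarded
    return kwargs

forwarded = perform_with_default_rows(
    [1, 2, 3], 'foo', preference=True, num_rows=500)
# perform_task prints: [1, 2, 3] foo True 100
```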

We could have also explicitly defined the necessary parameters for our utility function:

In [None]:
def log_perform_task_worse(data, instruction, preference=False, num_rows=100):
    print('About to run perform_task')
    perform_task(data, instruction, preference=preference, num_rows=num_rows)
    print('Done with perform_task')

The problem with that approach is that we have to update all of our utility functions (and we already have two of them!) whenever the definition of `perform_task` changes. So if we add a new parameter to `perform_task`, the function `log_perform_task_worse` will also need to be updated.

In [None]:
def perform_task(data, instruction, preference=False, num_rows=100, capture=True):
    print(data, instruction, preference, num_rows, capture)
    
def log_perform_task_worse(data, instruction, preference=False, num_rows=100, capture=True):
    print('About to run perform_task')
    perform_task(data, instruction, preference=preference, num_rows=num_rows, capture=capture)
    print('Done with perform_task')

Instead, the `*args`/`**kwargs` approach lets us basically say "We don't care what the arguments to the delegated function are, pass them".
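
To make that concrete, here is a sketch that repeats the `*args`/`**kwargs` wrapper from earlier next to the updated `perform_task`. The wrapper is character-for-character the same as before, yet it can already forward the new `capture` argument.

```python
def log_perform_task(*args, **kwargs):
    # Unchanged from the earlier definition
    print('About to run perform_task')
    perform_task(*args, **kwargs)
    print('Done with perform_task')

def perform_task(data, instruction, preference=False, num_rows=100, capture=True):
    print(data, instruction, preference, num_rows, capture)

# The new keyword argument passes straight through the wrapper
log_perform_task([4, 5, 6], 'bar', capture=False)
# perform_task prints: [4, 5, 6] bar False 100 False
```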

---

A popular pattern when writing Python code is to use keyword arguments to introduce new features to a function without having to update all of the existing places where it is called.

In [None]:
def find_job(database, cpu):
    workers = []
    for name, cycles in database.items():
        if cycles >= cpu:
            workers.append(name)
    return workers
        
def find_increasing_jobs(database):
    candidates = {}
    for i in range(0, 100, 10):
        candidates[i] = find_job(database, i)
    return candidates
        
db = {
    'alpha': 45,
    'beta': 55,
    'gamma': 91,
    'phi': 27,
}

data = find_increasing_jobs(db)
print(data)

We can add an argument for returning only the first job that meets our criteria. The main thing to consider here is that the default value of the argument should match the behavior before we modified the code. Here we introduce the `first_only` keyword argument, and set it to `False` because the old version of the function behaved as if this value were `False`.

In [None]:
def find_job(database, cpu, first_only=False):
    workers = []
    for name, cycles in database.items():
        if cycles >= cpu:
            workers.append(name)
            if first_only:
                break
    return workers

def find_first_increasing_jobs(database):
    candidates = {}
    for i in range(0, 100, 10):
        candidates[i] = find_job(database, i, first_only=True)
    return candidates

data = find_increasing_jobs(db)
print(data)

print('===')

data2 = find_first_increasing_jobs(db)
print(data2)
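
One way to convince ourselves that the change really is backwards compatible is to keep a copy of the old function around and compare results. In this sketch, `find_job_old` and `all_match` are names invented for the comparison; the function bodies are the ones from the cells above.

```python
def find_job_old(database, cpu):
    # The original version, before first_only existed
    workers = []
    for name, cycles in database.items():
        if cycles >= cpu:
            workers.append(name)
    return workers

def find_job(database, cpu, first_only=False):
    workers = []
    for name, cycles in database.items():
        if cycles >= cpu:
            workers.append(name)
            if first_only:
                break
    return workers

db = {
    'alpha': 45,
    'beta': 55,
    'gamma': 91,
    'phi': 27,
}

# Old-style calls (no first_only) should give identical answers
all_match = all(
    find_job(db, cpu) == find_job_old(db, cpu)
    for cpu in range(0, 100, 10)
)
print('Old call sites unaffected:', all_match)
print(find_job(db, 50))                   # ['beta', 'gamma']
print(find_job(db, 50, first_only=True))  # ['beta']
```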