# Functions

One of the most important concepts in programming is the **function**, a bit of reusable code. We already saw a couple of built-in functions (what were they again?) and some methods (which are functions attached to an object), but the important thing is that you can define your own. Like `for` and `while` statements, a `def` statement ends with a colon, and the function code is indented.

In [1]:
def square(x):
    return x ** 2

A function has zero or more **parameters** (in this case, one) and returns a value. We can now call it like this, passing in an **argument**.

In [2]:
square(5)

25

Many languages verify the types of the values passed to functions before the code runs, but you don't need to worry about that in Python; nothing is checked when you call a function.
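As a small sketch of what that looks like in practice (reusing `square` from above): any value that supports `**` will work, and a value that doesn't only fails once that line actually runs. The `try`/`except` is just there so the cell keeps going; errors are a later topic.

```python
def square(x):
    return x ** 2

print(square(5))    # an int works: 25
print(square(2.5))  # so does a float: 6.25

try:
    square("five")  # a string can't be raised to a power
except TypeError as error:
    print(f"TypeError: {error}")
```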

Here's a longer example:

In [3]:
def add_numbers(numbers):
    """Add a list of numbers together."""
    total = 0
    for number in numbers:
        total += number
    return total

In [4]:
add_numbers([1, 2, 3, 4, 5])

15

One important point is that variables defined inside of a function are **local**, and don't affect anything outside. 

In [5]:
total

NameError: name 'total' is not defined

See? The variable `total` we defined doesn't exist outside the function. This is important because it lets you treat a function as a black box, only worrying about the input and output, knowing it won't mess everything else up.
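Here's a quick sketch of the same idea from the other direction. It assumes we deliberately create a variable called `total` outside the function (the cells above don't), to show that the function's own `total` leaves it alone.

```python
total = 100  # a variable outside the function that happens to share the name

def add_numbers(numbers):
    """Add a list of numbers together."""
    total = 0  # a new, local variable; the outer total is untouched
    for number in numbers:
        total += number
    return total

result = add_numbers([1, 2, 3, 4, 5])
print(result)  # 15
print(total)   # still 100
```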

You might have noticed that string at the beginning of the function, not really doing anything. That's called a **docstring** and is used to describe the function. It's generally inside sets of three quotes, so it can be spread over multiple lines, but that's not really necessary.

The `help` function will display it.

In [6]:
help(add_numbers)

Help on function add_numbers in module __main__:

add_numbers(numbers)
    Add a list of numbers together.



You might have noticed I passed a function (`add_numbers`) as an argument to another function (`help`). Weird, huh? In Python, functions are just another type of object, like `int`s and `list`s and everything else we've talked about. You can pass them to functions or assign them to variables just like any other object. The thing that's special about a function is that you can follow it with parentheses and it will run some code and return the result. Think of it a little like how you can follow a list with square brackets to get some element or slice of elements.
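A small sketch of treating a function like any other object. The names `call_with_three` and `same_function` are made up for this illustration.

```python
def square(x):
    return x ** 2

def call_with_three(function):
    """Call whatever function we were given, passing in 3."""
    return function(3)

same_function = square           # no parentheses, so this does not call square
print(same_function(4))          # 16
print(call_with_three(square))   # 9
print(type(square))              # <class 'function'>
```

Notice the difference between `square` (the function object itself) and `square(4)` (the result of calling it).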

It turns out that wasn't a very exciting function, since a built-in version already exists.

In [7]:
sum([1, 2, 3, 4, 5])

15

## Recursive functions

To help you appreciate how variables are local, let's look at a **recursive function**, that is, a function that calls itself. This can be pretty confusing; if it doesn't make sense let me know. Recursive functions aren't that common in Python, but understanding them will help you understand other concepts.

Suppose you want to compute the factorial of a number. One approach (the preferred approach in most languages) is to set a variable to one, then count up to the number in question, multiplying the variable by each number along the way. I'll let you write most of it; if you get stuck, scroll to the bottom for the answer.

In [8]:
def factorial(n):
    product = 1
    # replace this with some loop, updating product inside of the loop
    return product

Note that the line that starts with `#` is a **comment**. Anything after a `#` in a line is ignored; it's there to help other people understand your code. Sometimes "other people" is yourself, six months from now, trying to understand what you were thinking. Be kind to your future self.

But there's another way to write this function. You might also define factorials by saying the factorial of 0 is 1, and the factorial of any larger integer n is just n times the factorial of n-1. Writing this in code:

In [9]:
def recursive_factorial(n):
    if n == 0:
        return 1
    else:
        return n * recursive_factorial(n-1)

In [10]:
recursive_factorial(6)

720

Each invocation of the function has its own copy of the variable `n`. When you call it above, `n` is set to 6, but then it calls a version of itself with `n` set to 5, and so on until it gets to `0`. At that point it returns 1 to the version with `n` equal to 1, which returns to the version with `n` equal to 2, and so on, until the original function call returns.
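If that's hard to picture, here's a sketch that makes the calls visible. `traced_factorial` is a made-up name; it's the same logic as `recursive_factorial` with a `print` on the way in and on the way out.

```python
def traced_factorial(n):
    """Same logic as recursive_factorial, but prints each call and return."""
    print(f"called with n = {n}")
    if n == 0:
        result = 1
    else:
        result = n * traced_factorial(n - 1)
    print(f"n = {n} returning {result}")
    return result

answer = traced_factorial(3)
```

Running it prints all of the "called" lines first, counting down from 3 to 0, and only then the "returning" lines, counting back up. Each of those lines sees its own `n`.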

## Return values

The recursive function above has two `return` statements; that's perfectly normal. As soon as a function encounters `return` it's done. If the computer gets to the end of a function without hitting a `return` statement, it returns a special value called `None`, which is the only possible value of an object of type `NoneType`.

In [11]:
def dummy_function():
    pass # pass is a special command that means "do nothing"

In [12]:
dummy_function()

It looks like it didn't return anything at all, but that's just because Jupyter doesn't display a result of `None`.

In [13]:
print(dummy_function())

None


What if you want to return more than one value? That's not that common, but you can use a tuple for that.

In [14]:
def min_and_max(numbers):
    minimum = maximum = numbers[0] # I didn't explain this but this should make sense
    for number in numbers:
        if number < minimum:
            minimum = number
        if number > maximum:
            maximum = number
    return minimum, maximum

In [15]:
a, b = min_and_max([3,5,2,7,8,1])
print(f"minimum = {a}")
print(f"maximum = {b}")

minimum = 1
maximum = 8


Note how I assigned the returned tuple to a pair of variables. That's called **unpacking** a tuple and is sometimes useful.
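A short sketch of unpacking on its own, using a hand-written tuple in place of the one `min_and_max` returns. The variable names are just for illustration.

```python
pair = (1, 8)          # the kind of tuple min_and_max returns

low, high = pair       # unpacking: one variable per element
print(low, high)       # 1 8

first = pair[0]        # ordinary indexing works too
print(first)           # 1

low, high = high, low  # unpacking also gives you a one-line swap
print(low, high)       # 8 1
```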

I should note that when I first wrote this function, I made a mistake and it didn't work. See if you can figure out what I did wrong, and why it gave the answer it did.

In [16]:
def incorrect_min_and_max(numbers):
    minimum = maximum = numbers[0] # I didn't explain this but this should make sense
    for number in numbers:
        if number < minimum:
            minimum = number
        if number > maximum:
            maximum = number
        return minimum, maximum

In [17]:
incorrect_min_and_max([3,5,2,7,8,1])

(3, 3)

## Keyword and positional arguments

As I mentioned, functions can have multiple parameters, though we haven't seen that yet. Let's look at a simple example.

In [18]:
def divide(numerator, denominator):
    return numerator / denominator

We'd call it by passing in multiple arguments, the same number as there are parameters, and they are matched up in order.

In [19]:
divide(3, 4)

0.75

The order is easy to remember here, but some functions have lots of parameters, so we can also use **keyword arguments** (as opposed to **positional arguments**), that work like this:

In [20]:
divide(denominator=3, numerator=2)

0.6666666666666666

We specify the names of the parameters. We can even mix these styles; the positional arguments are matched to parameters first, then the keyword arguments are used.

In [21]:
divide(5, denominator=7)

0.7142857142857143

You might try playing with this a bit, seeing what happens if you switch that to `numerator` in the above code. Errors are a whole topic that we'll get to later, but for now you should know not to be afraid of them, and try to make them happen to learn how they work and what the output means.
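If you'd rather see it than try it, here's a sketch of that experiment. The `try`/`except` is only there so the cell keeps running.

```python
def divide(numerator, denominator):
    return numerator / denominator

try:
    # 5 is matched to numerator by position, then numerator is given again
    divide(5, numerator=7)
except TypeError as error:
    print(f"TypeError: {error}")
```

Python complains because `numerator` ends up with two values while `denominator` gets none.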

## Default parameters

Sometimes we want to be able to specify defaults.

Let's suppose we're calculating the distance between two points. We'll pass each of the points in as a list or tuple and get the usual Euclidean distance back. We'll assume they each have the same number of dimensions.

In [22]:
def distance(a, b):
    total = 0
    for ai, bi in zip(a, b):
        total += (ai - bi) ** 2 
    return total ** 0.5

In [23]:
distance((1, 2), (4, 6))

5.0

There's some tricky stuff in there with the `zip` function and unpacking the returned tuples into `ai` and `bi`, but I won't get into that now. You can experiment a bit if you'd like. Here's an equivalent that might make more sense:

In [24]:
def distance(a, b):
    total = 0
    for i in range(len(a)):
        total += (a[i] - b[i]) ** 2 
    return total ** 0.5

In [25]:
distance((1, 2), (4, 6))

5.0
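If you're curious what `zip` was doing in the first version, here's a small sketch using the same two points: it pairs up the elements of `a` and `b` position by position, and the `for` loop unpacks each pair.

```python
a = (1, 2)
b = (4, 6)

pairs = list(zip(a, b))  # list() just lets us look at everything zip produces
print(pairs)             # [(1, 4), (2, 6)]

for ai, bi in zip(a, b):
    # each pair is unpacked into ai and bi, like the tuple unpacking earlier
    print(ai, bi, (ai - bi) ** 2)
```

The last column is the squared difference for each dimension (9 and 16 here), which is what `distance` adds up before taking the square root.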

There are other ways to measure distance; Euclidean is really just a special case of [Minkowski distance](https://en.wikipedia.org/wiki/Minkowski_distance). Suppose we want to write a general function to return any such distance measure. We could add another parameter, `p`, like this:

In [26]:
def distance(a, b, p):
    total = 0
    for ai, bi in zip(a, b):
        total += abs(ai - bi) ** p
    return total ** (1/p)

In [27]:
distance((1, 2), (4, 6), 2)

5.0

Or we can use `p=1` (sometimes called the L1 distance, the taxicab distance, or the Manhattan distance):

In [28]:
distance((1, 2), (4, 6), 1)

7.0

That's nice, except it's annoying that we need to specify `p` every time when it will usually be 2. So we can give the parameter a **default value** like this:

In [29]:
def distance(a, b, p=2):
    total = 0
    for ai, bi in zip(a, b):
        total += abs(ai - bi) ** p
    return total ** (1/p)

Now we can call it with or without specifying `p`.

In [30]:
distance((0, 0), (3, 4))

5.0

In [31]:
distance((0, 0), (3, 4), 2)

5.0
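Default values also combine nicely with keyword arguments. Here's a sketch using a made-up name, `minkowski_distance`, so it doesn't clobber the `distance` function above; it takes the `p`-th root at the end so that values of `p` other than 2 work too.

```python
def minkowski_distance(a, b, p=2):
    """Minkowski distance between two points; p defaults to 2 (Euclidean)."""
    total = 0
    for ai, bi in zip(a, b):
        total += abs(ai - bi) ** p
    return total ** (1 / p)

print(minkowski_distance((0, 0), (3, 4)))           # default p=2 gives 5.0
print(minkowski_distance((0, 0), (3, 4), p=1))      # taxicab distance: 7.0
print(minkowski_distance(p=1, a=(0, 0), b=(3, 4)))  # all keywords, any order: 7.0
```

Passing `p=1` by name makes the call easier to read than a bare `1` at the end.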

## Answers

Here's the answer to the factorial question from earlier.

In [32]:
def factorial(n):
    product = 1
    for i in range(1, n+1):
        product *= i
    return product

In [33]:
factorial(5)

120

And if you didn't figure out what I did wrong with `incorrect_min_and_max`: the `return` statement is indented too much, so it's inside the loop, so the function exits on the first iteration of the loop. That's a pretty common mistake.