# Functions

## What are functions?

* A function is a named block of one or more lines of code that performs a task.
* Functions can contain any code you could write elsewhere in a program.
* Functions can take inputs (arguments), but they don't have to.
* Functions can return results.

## How does one make a function?

Functions require 2 basic components: a definition line and a body.

The definition line starts with the keyword `def`, and includes the function name and any parameters that the function will accept. The names of parameters specified here will be the variable names used to refer to provided inputs within the function body.

The function body contains the lines of code that run when the function is executed (or called).

The minimal function looks like the following:

In [1]:
def my_func():
    pass

That is everything you need to define a function named `my_func`. However, it doesn't do anything. The `pass` keyword is a placeholder statement that does nothing. It is needed here because Python requires an indented body after the definition line and would otherwise raise an error.
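
As a small sketch of what "doesn't do anything" means in practice (the name `placeholder_func` is made up for this illustration), a function whose body is only `pass` can still be called. Because it never hands back a value explicitly, the call evaluates to `None`.

```python
def placeholder_func():
    # `pass` is a do-nothing statement that satisfies the need for a body
    pass


result = placeholder_func()  # runs without error and prints nothing
print(result)  # None
```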

If we want the function to do something, then we need to add something to the body. To be as cliche as possible, let's make this function print "Hello world!".

In [2]:
def my_func():
    print("Hello world!")

Note that when you run the above code, nothing prints. That's because, although the function definition has run, the function itself has not yet been called, so the code in its body hasn't run. To run the function, we need to call it just as we have been calling built-in functions: write the name of the function followed by parentheses.

In [3]:
my_func()

Hello world!


When the function is called, the code within is executed. We can call the same function over and over again.

In [4]:
my_func()
my_func()
my_func()

Hello world!
Hello world!
Hello world!


## Defining function inputs

### args

Functions are often more useful if they take an input and then do something with it. We can specify the inputs accepted by a function as follows.

In [5]:
def print_loudly(thing):
    print(thing + "!!!")

The function definition line now has a parameter specified within the parentheses, `thing`. You can name parameters whatever you like. The chosen name can then be used to refer to the data that was given to the function when it was called.

**Note:** when naming parameters, it is good practice not to use a name that already refers to something else, such as a built-in function. If you name your parameter something like `sum`, then within the function you will no longer be able to refer to the built-in `sum()` function by that name.
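
To make the shadowing problem concrete, here is a minimal sketch. The function names `total_badly` and `total_well` are invented for this illustration, and the `try`/`except` is only there so the example runs to completion (exceptions are covered later).

```python
def total_badly(sum):
    # Here `sum` refers to the argument, which hides the built-in sum()
    try:
        return sum([1, 2, 3])
    except TypeError as err:
        # An int cannot be called like a function, so Python raises TypeError
        return "TypeError: " + str(err)


def total_well(values):
    # The parameter has its own name, so the built-in sum() is still available
    return sum(values)


print(total_badly(10))
print(total_well([1, 2, 3]))  # 6
```

The first call fails because, inside `total_badly()`, the name `sum` is the integer `10` rather than the built-in function.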

When calling a function that has required parameters, you must provide values for each.

In [6]:
print_loudly("I am the input")

I am the input!!!


**A note about jargon:** you will hear function inputs referred to as both "parameters" and "arguments". The two words are often used interchangeably. Strictly, however, "parameters" are the variables used to refer to input information within a function, while "arguments" are the values you provide when you call the function. In the above example, `thing` is a parameter of `print_loudly()`, while `"I am the input"` is an argument. Don't fret too much about using the right word for this, but if you ever need to be specific, you can use these definitions.

Functions can take multiple inputs. Input parameters are specified as a comma-separated list in the function definition.

In [7]:
def print_two_things(thing1, thing2):
    print(thing1 + " and " + thing2)

print_two_things("ham", "eggs")

ham and eggs


When multiple parameters are defined for a function, arguments are treated as positional. That means the first argument you provide will be stored in the first parameter, etc. This is similar to the process used in unpacking `tuple`s.
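
A short sketch of positional matching, using a made-up function `join_in_order`: swapping the order of the arguments swaps which parameter each one lands in, in the same way that tuple unpacking assigns by position.

```python
def join_in_order(first, second):
    return first + " before " + second


print(join_in_order("ham", "eggs"))  # ham before eggs
print(join_in_order("eggs", "ham"))  # eggs before ham

# Tuple unpacking assigns by position in the same way
first, second = ("ham", "eggs")
print(first)   # ham
print(second)  # eggs
```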

### kwargs

You might not always want to provide arguments as positional arguments. We might instead want to specify which parameter each argument is to be associated with. We can do that using something called keyword arguments or kwargs. Basically, we just use something that looks like variable assignment within the function call, to associate each argument with a named parameter.

In [8]:
print_two_things(thing2="ham", thing1="eggs")

eggs and ham


As you can see in the above function call, we can provide kwargs in either order. Because we named the parameter in which each argument should be stored, Python knows what to do.

### Default values

In some cases you might want your function to have default values assigned to parameters. Using defaults allows you to only specify an argument if you want to use something other than the default value. If no argument is provided, then the default is used.

In [9]:
def double_it(x, times=1):
    for i in range(times):
        x *= 2
    print(x)

In the above function, the provided input is doubled `times` times. If you don't specify `times`, the default value of 1 is used and the input is doubled once.

Writing functions with default values is often useful to provide optional flexibility to do things, without requiring really long lists of arguments whenever someone uses the function.

In [10]:
double_it(2, 2)

8


In [11]:
double_it(x=5, times=10)

5120


In [12]:
double_it(times=2, 5)

SyntaxError: positional argument follows keyword argument (4150095670.py, line 1)

As the error above shows, Python is strict about argument order when you mix positional and keyword arguments. In a function call, positional arguments must come first, followed by keyword arguments. A similar rule applies when defining a function: parameters without defaults must come before parameters with defaults.
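
For contrast with the failing call, here is a sketch of calls that respect the ordering rule. `double_it_demo` is a stand-in written for this example; it uses `return` (covered in the next section) instead of `print()` so that the results are easy to check.

```python
def double_it_demo(x, times=1):
    # Same doubling logic as double_it(), but handing the result back
    for _ in range(times):
        x *= 2
    return x


# Positional argument first, keyword argument second: allowed
print(double_it_demo(5, times=2))  # 20

# Leaving out the optional argument falls back to the default of 1
print(double_it_demo(5))  # 10

# Writing the definition as `def broken(times=1, x):` would be a SyntaxError,
# because a parameter without a default may not follow one with a default.
```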

### Return values

In addition to printing to the terminal, functions can also produce output that can be assigned to a variable. In order to return an output, you use the `return` keyword. Let's see how the `double_it()` function would look if it returned `x` instead of printing it.

In [13]:
def double_it(x, times=1):
    for i in range(times):
        x *= 2
    return x

In [14]:
result = double_it(2)
print(result)

4


Note that once a line with the `return` keyword in it has been executed, the function ends. Therefore, a single call can only return once. Any code after the executed `return`, including other `return` statements, is never reached.
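
A minimal sketch of that rule, using a made-up function `first_return_wins`:

```python
def first_return_wins():
    return "first"
    return "second"  # unreachable: the function has already ended


print(first_return_wins())  # first
```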

You can use this behaviour to return different results depending on some processing within the function. In addition, you can use `return` similarly to how `break` works within a loop, to end a function's execution early if some condition is met.

In [15]:
def conditional_function(x):
    if x > 10:
        return "x greater than 10"
    
    return "x less than or equal to 10"

print(conditional_function(5))

x less than or equal to 10


In [16]:
print(conditional_function(100))

x greater than 10


As you can see, when we run `conditional_function()` with a value greater than 10, only the first `return` statement is executed, even though the second `return` statement is outside the `if` block. If we replaced each `return` with a `print()` call instead, both lines would run:

In [17]:
def conditional_function(x):
    if x > 10:
        print("x greater than 10")
    
    print("x less than or equal to 10")

conditional_function(100)

x greater than 10
x less than or equal to 10


## Documenting functions - docstrings and type hints

Now consider the following calls to the `double_it()` function.

In [18]:
print(double_it([1,2,3]))

[1, 2, 3, 1, 2, 3]


In [19]:
print(double_it("abc"))

abcabc


In [20]:
print(double_it({1:"a", 2:"b"}))

TypeError: unsupported operand type(s) for *=: 'dict' and 'int'

As you can see, the function we have defined works for some object classes, but not others. If we wanted to know what the function does and what classes we can use it with, we would need to read the code and figure it out. For this simple function, that isn't a difficult task. However, for functions that do anything more complicated than a line or two of code, it is worthwhile describing what the function is doing, what kinds of inputs it requires, and what outputs it produces. In addition, all functions, no matter how simple, are clearer with type hints.

### Docstrings

Docstrings are string literals that Python treats in a special way. They act as a multi-line comment for anyone reading your code, and they are also stored as part of the function. Because docstrings are stored with the function, you can use the built-in function `help()` to see the docstring and usage for a function. For example, our function `double_it()` has a help message that doesn't say much for now.

In [21]:
help(double_it)

Help on function double_it in module __main__:

double_it(x, times=1)



If we take a look at the help message for a built-in function, we will see a bit more information.

In [22]:
help(print)

Help on built-in function print in module builtins:

print(*args, sep=' ', end='\n', file=None, flush=False)
    Prints the values to a stream, or to sys.stdout by default.
    
    sep
      string inserted between values, default a space.
    end
      string appended after the last value, default a newline.
    file
      a file-like object (stream); defaults to the current sys.stdout.
    flush
      whether to forcibly flush the stream.



We can flesh out the help message for our function, and explain its behaviour to anyone reading the code, by adding a docstring.

[The developers of Python have prescribed a recommended formatting style for docstrings](https://peps.python.org/pep-0257/). There are also additional docstring styles, of which [Google](https://google.github.io/styleguide/pyguide.html#383-functions-and-methods) and [NumPy](https://numpydoc.readthedocs.io/en/latest/format.html) are commonly used. For personal use, you can choose your favourite. However, it is best to stick with either the Google or NumPy format, as those are what other people (i.e., the readers of your docstrings) are likely to be familiar with. In addition, you can then take advantage of [tools like Sphinx](https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html), which can automatically generate documentation for your Python code. [There are also plugins for IDEs like VS Code that streamline docstring writing](https://marketplace.visualstudio.com/items?itemName=njpwerner.autodocstring).

The exact specifics of docstrings depend upon which style you choose. I will use Google format docstrings as I find them more readable than NumPy. However, something all docstrings have in common is a few key elements:

1. Docstrings use the multiline string triple quotes `"""`, even if you are only writing a single-line docstring.
2. If you are documenting a very simple function (1 or 2 lines), you can use a single-line docstring stating what the function does and what it returns.
3. Docstrings for longer functions still have the single-line summary, but may also have a longer description below it, separated by a blank line (i.e., single-line summary, then blank line, then longer description).
4. If your function takes arguments, describe them in an `Args` block. If you use type hints in the function definition, don't repeat the types here. If you don't use type hints, include the types here.
5. Whether or not your function returns something, include a `Returns` section. State simply what your function returns. Again, if you didn't put type hints in your function definition, put the return type here.
6. If your function can raise exceptions, describe them in a `Raises` section, including the type of exception and a brief description of the condition that causes it. We'll talk about exceptions later.
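
As a sketch of how the elements listed above fit together in the Google style, here is a docstring for a hypothetical function `divide` (not part of this lesson's running example) that has `Args`, `Returns`, and `Raises` sections. The types are given as inline type hints, so they are not repeated in the docstring.

```python
def divide(numerator: float, denominator: float) -> float:
    """Divide one number by another.

    Args:
        numerator: The number to be divided.
        denominator: The number to divide by.

    Returns:
        The result of numerator / denominator.

    Raises:
        ZeroDivisionError: If denominator is zero.
    """
    return numerator / denominator


print(divide(6, 3))  # 2.0
```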

The docstring for our `double_it()` function would look something like this:

In [23]:
def double_it(x, times=1):
    """Double a number as many times as you want
    
    Args:
        x (int): The number to double
        times (int, optional): The number of times to double the provided number. Defaults to 1.
    
    Returns:
        int: The doubled number
    """
    for i in range(times):
        x *= 2
    return x

Now that we've added a docstring to the function, we can call `help(double_it)` to check how to use it if we ever forget. Or we can go to the code and read it directly.

In [24]:
help(double_it)

Help on function double_it in module __main__:

double_it(x, times=1)
    Double a number as many times as you want
    
    Args:
        x (int): The number to double
        times (int, optional): The number of times to double the provided number. Defaults to 1.
    
    Returns:
        int: The doubled number
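
Besides `help()`, the stored docstring can be read directly from the function's `__doc__` attribute. The sketch below uses a made-up function `greet` to show this.

```python
def greet(name):
    """Return a friendly greeting for name."""
    return "Hello, " + name + "!"


# The docstring is stored on the function object itself
print(greet.__doc__)  # Return a friendly greeting for name.
```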



### Type hints

In the above docstring, we are specifying the type of each object. Specifying the type is important so that people using your function know what to expect when they use it. As we saw above, we were able to call the `double_it()` function with either `int`, `str`, or `list` arguments. It ran and produced outputs in all three cases. However, in each case it behaved differently.

Python uses a system called duck typing (if it walks like a duck and quacks like a duck, then it must be a duck). That means that Python doesn't explicitly check what type an object is. It simply attempts to perform the operations you write and then raises an error if any of them are unsupported. You will typically write functions that are designed to do something with a certain type. In many cases, if a function does work with an additional type, it will be by accident, and the result may not be what you expect.

Using type hints tells the user of your code exactly what functions were written to work with. They then know that if they use the stated types, they will see the described behaviour. Python doesn't enforce types when you use type hints, so a user could still give a different type as an argument, but any resulting behaviour of your code is their problem. You did all that can be asked of you by writing type hints. If someone were to post a GitHub issue on your code complaining that your function that takes `int`s doesn't work correctly when they give it `str`s, then you can reasonably ignore them and close the issue if you have documented the correct usage with type hints.
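
The following sketch illustrates that hints are not enforced when the code runs. `double_once` is a made-up function for this example. A static type checker such as mypy would flag the second call, but Python itself runs it without complaint.

```python
def double_once(x: int) -> int:
    return x * 2


print(double_once(4))  # 8

# No error: the hint says int, but Python does not check it at runtime
print(double_once("ab"))  # abab

# The hints are stored on the function as plain metadata
print(double_once.__annotations__)
```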

In addition to describing the types for which your functions have defined behaviour, type hints are also useful documentation. For complicated functions, it can sometimes not be immediately obvious in what form the function is expecting its input data. Type hints make that clear.

We are already using type hints in the docstring example above. However, we can also specify type hints inline in our code. That would look like the following for our `double_it()` function.

In [25]:
def double_it(x: int, times: int = 1) -> int:
    """Double a number as many times as you want
    
    Args:
        x: The number to double
        times (optional): The number of times to double the provided number. Defaults to 1.
    
    Returns:
        The doubled number
    """
    for i in range(times):
        x *= 2
    return x

When you specify type hints inline, you don't also need them in the docstring. Now our help message looks a bit different, but still has all the information we need.

In [26]:
help(double_it)

Help on function double_it in module __main__:

double_it(x: int, times: int = 1) -> int
    Double a number as many times as you want
    
    Args:
        x: The number to double
        times (optional): The number of times to double the provided number. Defaults to 1.
    
    Returns:
        The doubled number



In addition, you can specify multiple possible types using pipes (`|`) to delimit the options (this syntax requires Python 3.10 or later). For example, our `double_it()` function actually works on `int`s, `str`s, and `list`s.

In [27]:
def double_it(x: int | str | list, times: int = 1) -> int | str | list:
    """Double a number as many times as you want
    
    Args:
        x: The number to double
        times (optional): The number of times to double the provided number. Defaults to 1.
    
    Returns:
        The doubled number
    """
    for i in range(times):
        x *= 2
    return x

In [28]:
help(double_it)

Help on function double_it in module __main__:

double_it(x: int | str | list, times: int = 1) -> int | str | list
    Double a number as many times as you want
    
    Args:
        x: The number to double
        times (optional): The number of times to double the provided number. Defaults to 1.
    
    Returns:
        The doubled number



You can find a short cheatsheet describing how to type hint different objects [at this link](https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html).

## Be careful with mutable function defaults

When you specify default values for your function parameters, those defaults are created once and only once, when the function definition is executed. This means that if you use a mutable object such as a `list` or `dict` as a default, every call that relies on that default uses the same instance. If one call to the function changes the default value in some way (e.g., by appending to it), then future calls will use the changed object.

Let's look at that in action with an example:

In [29]:
def bad_append_func(thing_to_append: str, mutable_default_list: list[str | None] = []) -> list[str]:
    """Append a string to a list"""
    
    mutable_default_list.append(thing_to_append)
    return mutable_default_list

x = bad_append_func(thing_to_append="first call")
print(x)
    

['first call']


In [30]:
y = bad_append_func(thing_to_append="second call")
print(y)

['first call', 'second call']


As you can see, even though we called the function and stored the returned list in different variables, the append operation from the first call was seen in the output of the second call. Furthermore, `x` has also been modified when we called the function the second time!

In [31]:
print(x)

['first call', 'second call']
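
One way to confirm that both results are the very same object is to compare them with `is` and to inspect the function's stored defaults. This is a self-contained sketch with a made-up function `append_to_shared`, not the function defined above.

```python
def append_to_shared(item, bucket=[]):
    # `bucket` defaults to one list that is created when the def statement runs
    bucket.append(item)
    return bucket


first = append_to_shared("a")
second = append_to_shared("b")

print(first is second)  # True: both names refer to the same list
print(append_to_shared.__defaults__)  # (['a', 'b'],)
```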


To get around this, we simply avoid using a mutable instance created in the function definition line. The usual idiom is to make the default `None`, then check whether the default was used and create a new empty list inside the function if so.

In [32]:
def good_append_func(thing_to_append: str, target_list: list[str] | None = None) -> list[str]:
    """Append a string to a list, creating a new list if none is given"""
    if target_list is None:
        # A fresh list is created on every call that relies on the default
        target_list = []
    target_list.append(thing_to_append)
    return target_list

x = good_append_func(thing_to_append="first call")
print(x)
y = good_append_func(thing_to_append="second call")
print(y)

['first call']
['second call']


Alternatively, you may want to consider whether your function actually benefits from having a default for that parameter.

## `yield` and generator functions

Finally, we will take a quick look at writing generator functions. We won't spend much time on them as they are outside the scope of this class. However, I want to introduce them here so that you have at least an awareness of them.

[Generator functions](https://docs.python.org/3/reference/expressions.html#yieldexpr) are functions that can "generate" an iterable output one element at a time. They look very similar to the functions described above except they use the `yield` keyword to output the elements to be iterated over.

Note that generator functions (functions that contain a `yield` expression) return a generator object, [which has the type hint](https://docs.python.org/3/library/typing.html#typing.Generator) `Generator[YieldType, SendType, ReturnType]`. We'll only use the `YieldType` field here, but the full form of the type hint has all three fields, with unused ones set to `None`.


In [33]:
from typing import Generator # importing for the type hint

def iter_int(x: int) -> Generator[int, None, None]:
    """iterate over an int, yielding the digits"""
    for i in str(x):
        yield int(i)  # convert the character back to an int, matching the type hint

gen = iter_int(12345)


Previously, we observed that `int` doesn't support iterating. That's because the `int` class doesn't have an implementation of the `__iter__()` method. Don't worry about that specific syntax right now; it just means nobody wrote a method that hands out parts of the `int` when iterating over it. How might such a function behave? One option is to output the entire `int` in one go and then end. After all, it's not really possible to separate out the digits meaningfully. A second option could be to return each digit as a representation of the value it contributes to the `int`. For example, 12 could be outputted as 10 and 2. A third option could be to simply output the digits one by one. That's what the above function does. As a side note, perhaps try writing your own generator functions to implement the other two options.
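
Iteration support in Python comes from the `__iter__()` method, which `int` does not define. The quick check below is a sketch of how to see that for yourself. The variable names are made up, and the `try`/`except` is only there so the example runs to completion.

```python
int_has_iter = hasattr(int, "__iter__")
str_has_iter = hasattr(str, "__iter__")

print(int_has_iter)  # False
print(str_has_iter)  # True

try:
    iter(12345)
except TypeError as err:
    # Python refuses to iterate over an int directly
    print(err)
```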

After writing our generator, we called it with an `int` and stored the output in a variable, `gen`. Let's take a look at what that object is. The type hint should already give you an idea of what it might be...

In [34]:
print(type(gen))

<class 'generator'>


As the type hint suggested, the generator function returned a `generator` class instance. As with other classes we've looked at, what this means is that we can interact with this object in defined ways. Mostly, `generator` instances support operations related to looping. One such operation which will allow us to understand how our generator function works is the `next()` function. `next()` simply "yields" the next element in the sequence produced by the `generator`. 

In [35]:
print(next(gen))

1


The first time we run `next()` we get the first element. If we run it again we get the second.

In [36]:
print(next(gen))

2


Note that we can run other code in between calls to `next()`, and the generator still remembers its place in the sequence.

In [37]:
print("this is an interruption")
print(next(gen))

this is an interruption
3


The above code and outputs show that `yield` creates a function that, instead of ending when it reaches `return`, pauses whenever it reaches `yield`. Then, when the resulting generator is instructed to continue, it simply picks up where it left off. We can see that explicitly if we put some `print()` calls in our generator function.

In [38]:
def iter_int(x: int) -> Generator[int, None, None]:
    """iterate over an int, yielding the digits"""
    print("starting")
    for i in str(x):
        print("Pre-yield")
        yield int(i)  # convert the character back to an int, matching the type hint
        print("Post-yield")

gen = iter_int(12345)

Now let's call `next(gen)` like before and see what prints.

In [39]:
print(next(gen))

starting
Pre-yield
1


In [40]:
print(next(gen))

Post-yield
Pre-yield
2


When we called `print(next(gen))` the first time, the function started, then ran until `yield` was reached and then stopped. However, when we called it the second time, it started up again at the line after `yield`, then it went to the next iteration of the loop. `yield` simply paused the execution of the function until it was told to start again.

In addition to using `next()` to see how the function is working, we can also iterate over it in a loop. For example (keeping the prints in to see what happens):

In [41]:
for i in gen:
    print(i)

Post-yield
Pre-yield
3
Post-yield
Pre-yield
4
Post-yield
Pre-yield
5
Post-yield


As you can see, a regular loop worked just like us running `next()` over and over. Also see that the loop started at 3. As we had run `next()` a couple of times, we had consumed the first two elements in the sequence. If we started it again, we could loop over the whole sequence.

In [42]:
for i in iter_int(12345):
    print(i)

starting
Pre-yield
1
Post-yield
Pre-yield
2
Post-yield
Pre-yield
3
Post-yield
Pre-yield
4
Post-yield
Pre-yield
5
Post-yield


You can therefore use `next()` to advance a generator one element at a time, or to skip elements (such as a header line in a file).
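
As a sketch of the header-skipping idea, the example below uses a made-up generator function `iter_lines` and invented sample text in place of a real file. Calling `next()` once consumes the header, so everything collected afterwards is data only.

```python
from typing import Generator


def iter_lines(text: str) -> Generator[str, None, None]:
    """Yield the lines of a block of text one at a time"""
    for line in text.splitlines():
        yield line


rows = iter_lines("name,age\nAda,36\nAlan,41")

header = next(rows)  # consume the first line so it is not treated as data
data = list(rows)    # the remaining lines, starting after the header

print(header)  # name,age
print(data)    # ['Ada,36', 'Alan,41']
```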

We're not going to talk any more about generator functions in this course. You can use them in exercises if you find a use for them, but you won't be asked to. Two examples of cases in which you will find `yield` useful are if you want to define a method to iterate over a custom class (we'll cover custom classes later) or if you are writing an implementation of a recursive algorithm (e.g., merge sort).