In [1]:
name = '2019-02-28-functions-will'
title = 'Functions in Python, an introduction'
tags = 'functions, basics'
author = 'Wilhelm Hodder'

In [2]:
from nb_tools import connect_notebook_to_post
from IPython.core.display import HTML

html = connect_notebook_to_post(name, title, tags, author)

## Basic principles and features

Functions are exactly that: they usually take an input and return an output. 
When you do your typical Python coding you will be using functions all the time, for example **np.mean()** or **np.arange()** from the numpy library.
The great thing is you can write them yourself!

Below is a very simple example:

In [3]:
def print_a_phrase(): # we start the definition of a function with "def"
    print("Academics of the world unite! You have nothing to lose but your over-priced proprietary software licenses.")

In [4]:
print_a_phrase()

Academics of the world unite! You have nothing to lose but your over-priced proprietary software licenses.


We define the function using **def**, followed by whatever name we choose for the function, then a pair of round brackets **()** and a colon. The brackets are where you list your *arguments*, for example your input variable.

The function above doesn't take an input, but it prints a pre-defined phrase. If we were to make it more dynamic and make it print any input we want to give it, it would just become **print()**, so not much use for this here: Do not re-invent the wheel!

It also doesn't actually *return* anything, it just *does* something. In C++ this function would be declared as **void**; in Python a function without a **return** statement simply returns **None**.
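As a small sketch of what "not returning anything" means in practice, the hypothetical function **say_hello()** below (not part of the original post) only prints. If we store the result of calling it, we get **None**:

```python
def say_hello():
    # this function prints a greeting but has no return statement
    print("Hello!")

result = say_hello()  # the greeting is printed here
print(result)         # the call itself evaluates to None
```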

What if we do give it an input? Below are two examples of simple functions for unit conversion:

In [5]:
def convert_pa_to_mb(pascal):
    millibar = pascal * 0.01
    return millibar

See how the above takes the input, operates on it, and then **returns** your output. Note how we didn't specify what type **pascal** has to be: we could put in an integer or a float. This is due to Python's dynamic typing, which makes functions *polymorphic*: they work with any input that supports the operations used inside them. We could also put a string in, and Python wouldn't complain until it has to do maths on it, which is when it raises an error. More on this later.

In [6]:
convert_pa_to_mb(80000)

800.0
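To make the point about input types concrete, here is a minimal sketch that repeats the conversion function and feeds it an integer, a float, a numpy array and finally a string. The input values are made up for illustration; only the string should fail, and it fails at the multiplication:

```python
import numpy as np

def convert_pa_to_mb(pascal):
    return pascal * 0.01

print(convert_pa_to_mb(101325))                         # integer input
print(convert_pa_to_mb(101325.0))                       # float input
print(convert_pa_to_mb(np.array([100000.0, 50000.0])))  # array input, converted element by element

try:
    convert_pa_to_mb("101325")  # a string cannot be multiplied by a float
except TypeError as err:
    print("TypeError:", err)
```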

But if your function is that simple, you can save a line by calculating and returning the result in one go:

In [7]:
def convert_mb_to_pa(millibar):
    return millibar * 100

In [8]:
convert_mb_to_pa(1050)

105000

Now all of this was very basic, and you might think, why not just write the calculation directly into my main code? 

For a start, as your work becomes more sophisticated, your calculations and operations become longer and more complex. Then consider that you'll probably end up wanting to reuse the same task in your code or in another program. If you wrote out the same stuff again and again your code would get very messy very quickly. Instead it really does help to keep recurring functions neatly tidied away at the top or bottom of your code, or in a separate file.

In [9]:
def check_integer_and_change(value):
    if not value.is_integer():
        remainder = value
        while remainder > 1:
            remainder = remainder - 1  # strip the whole part, leaving the fractional part (assumes value >= 0)
        if remainder >= 0.5:  # if half or above, round up
            value = value + 1 - remainder
        else:
            value = value - remainder  # round down
    return value

Above is just another example of a function; it figures out if a value is an integer, and if it is not, it rounds it up or down, as appropriate.

Notice **is_integer()** is also a function, built into Python's **float** type. Here it doesn't take an input in its brackets **()**, but instead it's "attached" to the variable **value**. This is because it is a *method*, i.e. a function that belongs to a class, and in Python every value is an *object* of some class, but that's a story for another presentation. In fact, you can find an excellent tutorial on classes here: https://docs.python.org/3/tutorial/classes.html .
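A quick sketch of what "attached" means: calling the method on a float is the same as calling the function stored on the **float** class and passing the value in explicitly. The numbers are arbitrary examples:

```python
value = 2.5

print(value.is_integer())       # False: called as a method on the object
print(float.is_integer(value))  # False: the same function, called via the class
print((3.0).is_integer())       # True: 3.0 is a whole number
```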

In [10]:
import numpy as np  # need this to make array

my_array = np.arange(5)  # make an array of integers
new_array = np.array([1.2, 8, 4.5])  #  make another array to insert into our main array
my_array = np.concatenate((my_array, new_array))  # let's add some non-integers to our little array
print(my_array)

[0.  1.  2.  3.  4.  1.2 8.  4.5]


In [11]:
for i in range(len(my_array)):
    my_array[i] = check_integer_and_change(my_array[i])  # we perform our function from above on each array element

print(my_array)

[0. 1. 2. 3. 4. 1. 8. 5.]
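In the spirit of not re-inventing the wheel, the same result can be had without a loop. The sketch below is an alternative, not the method used above: it rounds halves up with **np.floor(x + 0.5)** on the whole array at once. Note that **np.round()** behaves differently for values exactly halfway between two integers, as it rounds them to the nearest even number:

```python
import numpy as np

my_array = np.array([0, 1, 2, 3, 4, 1.2, 8, 4.5])

rounded_half_up = np.floor(my_array + 0.5)  # halves always go up
print(rounded_half_up)                      # [0. 1. 2. 3. 4. 1. 8. 5.]

rounded_half_even = np.round(my_array)      # halves go to the nearest even number
print(rounded_half_even)                    # [0. 1. 2. 3. 4. 1. 8. 4.]
```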


Another good reason for using functions is how they deal with *memory*: if you use arrays a lot and do all your calculations in line with your main code, you might start piling up a lot of *stuff*, i.e. intermediate variables that stay in memory for as long as your program runs, which can slow it down or even exhaust your memory. Functions instead take your input, temporarily use some extra memory, and once they return their output, the local variables they created for the calculation go out of scope and can be freed, without affecting your input unless you want them to. **for** and **while** loops and **if** blocks, on the other hand, do not create a new scope in Python: variables defined inside them live on afterwards, and they might change your input from before the loop if you're not careful!

To illustrate this danger:

In [12]:
def double_something(value):
    value = value * 2 # this only rebinds the local name "value"; the caller's variable is left untouched
    return value

In [13]:
input_value = 5.0
new_value = double_something(input_value)
print("new_value =", new_value)
# now check if our original input is the same
print("input =", input_value)

new_value = 10.0
input = 5.0


In [14]:
# now let's do the same the function does, but in a loop for 1 iteration:
value = 5.0
new_value = 0
for i in range(1):
    # we copy the function above exactly
    value = value * 2
    new_value = value # our "return"
    
    
print("new_value =", new_value)
# now check if our original input is the same
print("input =", value)

new_value = 10.0
input = 10.0


So we can see that due to sloppy coding we now changed our original input in the **for** loop, but preserved it when we used a function instead.
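The difference comes down to *scope*. As a minimal sketch with made-up names: a variable created inside a function is gone once the function returns, while a variable created inside a loop is still there after the loop ends:

```python
def area_of_square(side):
    squared = side * side  # "squared" is local to the function
    return squared

area = area_of_square(3.0)

try:
    print(squared)  # the local variable no longer exists out here
except NameError as err:
    print("NameError:", err)

for i in range(3):
    last_seen = i * 10  # defined inside the loop...

print(i, last_seen)  # ...but still available afterwards: 2 20
```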

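There are instances where a function could permanently affect your input data. In Python the **=** operator does not copy data: it just makes a name refer to an object, so the argument inside a function refers to the same object as the variable you passed in. If that object is *mutable*, such as a list or a numpy array, and the function modifies it in place, your original data changes too. It is recommended to make sure you've properly copied vital data before passing it to such a function.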
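Here is a small sketch of that situation with numpy arrays. The two functions are hypothetical examples: one builds a new array, the other modifies its argument in place with `*=`. Passing a **.copy()** protects the original:

```python
import numpy as np

def double_copy(values):
    return values * 2  # builds a new array, the input is untouched

def double_in_place(values):
    values *= 2        # modifies the very array that was passed in
    return values

data = np.array([1.0, 2.0, 3.0])

double_copy(data)
print(data)  # unchanged: [1. 2. 3.]

double_in_place(data)
print(data)  # the original has been doubled: [2. 4. 6.]

safe = np.array([1.0, 2.0, 3.0])
double_in_place(safe.copy())  # hand over a copy instead
print(safe)  # unchanged: [1. 2. 3.]
```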

Now let's look at putting different variable types in, see what we can get out, and think of how what we'll find could be useful to us.
Say you want a function that can operate on a single quantity, as well as multiple quantities, e.g. in an array. For this example we will do a unit conversion and then calculate the average of all the input values.
But we will use **try** and **except** to make a distinction between single values and arrays.

In [15]:
def convert_JeV_and_mean(eV_values): # take input in eV
    
    joule_values = 1.60218e-19 * eV_values # here just do the simple conversion: 1 eV = 1.60218e-19 J
    
    # if we want the mean from numpy we need to make sure we only do that for an array input
    try:
        len(joule_values) # if we have a single value, this line will raise a TypeError
        mean = np.mean(joule_values) 
    except TypeError: 
        return joule_values # if it's just a single value, return it by itself
    
    return joule_values, mean # if it's an array, we can return both it and its mean
        

Let's put in a float...

In [16]:
convert_JeV_and_mean(1.0)

1.60218e-19

Only a single output was returned: the converted float.

Now let's input a **numpy array**. (By the way, the values seen below are of the order GeV, which is a common sight in high-energy particle physics and in astrophysics.)

In [17]:
some_data = np.array([5.0956e-11, 5.1130e-11, 4.8856e-11 ])
convert_JeV_and_mean(some_data)

(array([8.16406841e-30, 8.19194634e-30, 7.82761061e-30]), 8.061208452e-30)

The output above looks complicated, but it is essentially first the array of converted values, and then the mean. This bracketed output looks messy, though, and in your own code you would do something like this:

In [18]:
results, mean = convert_JeV_and_mean(some_data)

print("results =", results)
print("mean =", mean, "J")

results = [8.16406841e-30 8.19194634e-30 7.82761061e-30]
mean = 8.061208452e-30 J


As the function can return two different objects, we can assign them to separate variables as shown above.

Notice how the order of the output (i.e. values *then* mean) corresponds to the order in which we wrote them after **return** at the bottom of our function.
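Under the hood, a function that returns several values separated by commas really returns a single *tuple*, which we can keep whole or unpack. A minimal sketch, using a made-up function **min_and_max()**:

```python
def min_and_max(values):
    return min(values), max(values)  # the comma builds a tuple

both = min_and_max([3, 1, 4, 1, 5])
print(type(both), both)  # <class 'tuple'> (1, 5)

lowest, highest = both   # unpacking follows the order after "return"
print("lowest =", lowest, "highest =", highest)
```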

## Examples of advanced features

### Decorators and Wrappers

Quite neatly, we can define functions within functions, and also return functions from functions. An example where this is applied is the use of **decorators** and **wrappers**, which are types of functions. Below is a demonstration of a very simple example: we have a function called **func()**, and we *wrap* it up in another function, which will just measure time, inside the decorator.

In [19]:
from functools import wraps
import time 

def decorator(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        start_time = time.time() 
        rv = f(*args, **kwargs) # here we run the function f
        print("Time taken =", time.time() - start_time) # difference in points in time gives duration of f run
        return rv
    return wrapper # we return the function called wrapper

@decorator
def func():
    pass # does nothing, just for demonstration

The `@` followed by the name of a function, here **wraps** and **decorator**, applies that function to the definition directly below it. You place the @ right above a function definition, which, for example, tells Python that **func** is to be *decorated* by the function called **decorator**: the name **func** now refers to whatever **decorator** returns. **decorator** returns **wrapper**, which will now run every time **func** is called. **\*args** and **\*\*kwargs** are ways of allowing you to pass a flexible number of arguments to your function, and are explained here: http://book.pythontips.com/en/latest/args_and_kwargs.html .
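To see what the `@` line does, the sketch below decorates one function with `@` and another by hand; the two forms are equivalent. The names **announce**, **add** and **multiply** are made up for this illustration. It also shows **\*args** and **\*\*kwargs** passing arguments through, and **wraps** keeping the original name and docstring:

```python
from functools import wraps

def announce(f):
    @wraps(f)  # copy the name and docstring of f onto wrapper
    def wrapper(*args, **kwargs):
        print("calling", f.__name__)
        return f(*args, **kwargs)  # pass all arguments straight through
    return wrapper

@announce
def add(a, b=0):
    """Return the sum of a and b."""
    return a + b

def multiply(a, b=1):
    return a * b

multiply = announce(multiply)  # the same thing the @ line does, written out by hand

print(add(2, b=3))        # positional and keyword arguments both reach add
print(multiply(2, 3))
print(add.__name__, "|", add.__doc__)
```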

Let's see if it works:

In [20]:
func()

Time taken = 9.5367431640625e-07


Running **func()** printed the time taken, which will naturally be negligible, due to the simplicity of the function.
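For a function that does some real work and takes arguments, a timing decorator becomes more informative. The sketch below is a variation on the decorator above, not the same code: it assumes **time.perf_counter()** from the standard library, which is intended for measuring short durations, in place of **time.time()**. The names **timed** and **sum_of_squares** are made up:

```python
import time
from functools import wraps

def timed(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = f(*args, **kwargs)  # run the decorated function
        duration = time.perf_counter() - start
        print(f"{f.__name__} took {duration:.6f} s")
        return result                # hand the result back unchanged
    return wrapper

@timed
def sum_of_squares(n):
    return sum(i * i for i in range(n))

total = sum_of_squares(100_000)
print("total =", total)
```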

### Docstrings

Finally, not necessarily an "advanced" feature but still useful, we have **docstrings**, which serve as *documentation* inside your code. A docstring is a string literal placed as the first statement of a function, class or module. You can write a bare string anywhere else in your code, but Python does not treat it as documentation there, so in most places you should use *comments* instead. Docstrings are very useful, say, at the top of a function definition to explain *what* the function does. For example:

In [21]:
def check_integer_and_round(value):
    """
    Function takes a float or integer value and tests the quantity for its type.
    If the input is an integer the function does nothing more and returns the original input.
    If the input is a non-integer it rounds it to the nearest whole integer, and returns the result.
    """
    if not value.is_integer():
        remainder = value
        while remainder > 1:
            remainder = remainder - 1 # strip the whole part, leaving the fractional part (assumes value >= 0)
        if remainder >= 0.5: # if half or above, round up
            value = value + 1 - remainder
        else:
            value = value - remainder # round down
    return value

We reused the previous function for rounding numbers to integers, with a small change to its name to avoid any issues with doubly defining it. We also added some text between two sets of triple quotation marks `"""`; this is a docstring. Like a comment, which starts with a hash `#`, it does nothing functionally when the code is run; it simply serves to help the user understand what the function does. In your terminal or notebook you can display it with the **help()** function:

In [22]:
help(check_integer_and_round)

Help on function check_integer_and_round in module __main__:

check_integer_and_round(value)
    Function takes a float or integer value and tests the quantity for its type.
    If the input is an integer the function does nothing more and returns the original input.
    If the input is a non-integer it rounds it to the nearest whole integer, and returns the result.
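The docstring is also stored on the function itself, in its **\_\_doc\_\_** attribute, which is what **help()** reads. A minimal sketch, reusing the earlier unit conversion with a one-line docstring added for illustration:

```python
def convert_mb_to_pa(millibar):
    """Convert a pressure from millibar to pascal."""
    return millibar * 100

print(convert_mb_to_pa.__doc__)  # Convert a pressure from millibar to pascal.
```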



A few more examples can be found in an earlier post: [Some peculiarities of using functions in Python](https://ueapy.github.io/some-peculiarities-of-using-functions-in-python.html).

In [23]:
HTML(html)