# Functions

## Function calls

`function`: a **sequence of statements** that performs a computation and that goes under a given **name**. 

When you define a function, you specify the name and the sequence of statements. Later, you can “call” the function by name.

You already used a function before...

In [1]:
type(42)

int

The name of the function is _type_. The expression in parentheses is called the argument of the function. The result, for this function, is the type of the argument.

Python provides functions that convert values from one type to another.

In [4]:
int(3.8)

3

In [10]:
float(42)

42.0

In [11]:
str(3.1415)

'3.1415'
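As a small sketch of how these conversions behave (the values below are chosen here for illustration and are not part of the original notebook), note that `int` cuts off the fractional part instead of rounding, and that `int` and `float` also accept strings that look like numbers:

```python
# int() truncates toward zero; it does not round
print(int(3.8))     # 3
print(int(-3.8))    # -3

# round() is the built-in to use when rounding is wanted
print(round(3.8))   # 4

# strings that look like numbers can be converted as well
print(int('42') + 1)       # 43
print(float('3.5') * 2)    # 7.0

# int('hello') would raise a ValueError, because the text is not a number
```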

## Math functions

As in most programming languages, related functions can be collected together in a file; in Python such a file is called a module. A package is a special kind of module that can contain submodules.
Python has a `math` module that provides most of the familiar mathematical functions.
However, we are not going to use this module but a package called `numpy`, which offers more advanced numerical computation capabilities than simple math.

Before we can use the functions in a module, we have to import it with an import statement:

In [2]:
import numpy as np

This statement creates a module object named `np` (if you don't provide the `as` part, the name will be just `numpy`).

To access one of the functions of the module, you have to specify the name of the module and the name of the function, separated by a dot (also known as a period). This format is called **dot notation**.

In [3]:
degrees = 45
radians = degrees / 180.0 * np.pi
height = np.sin(radians)
print(height)

0.7071067811865475


As you can see, some modules define not only functions but also variables, such as `np.pi`, which holds the value of $\pi$.
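The same import and dot-notation rules apply to every module. The sketch below repeats the calculation with the standard-library `math` module mentioned earlier instead of `numpy` (a substitution made here so the example runs without extra packages), and shows that the `as` part only changes the name under which the module is available:

```python
import math         # the module object is named math
import math as m    # the same module under a shorter name

degrees = 45
radians = degrees / 180.0 * math.pi   # math.pi is a variable defined by the module
height = math.sin(radians)            # math.sin is a function defined by the module
print(height)

# both names refer to the same module object
print(m is math)          # True
print(m.pi == math.pi)    # True
```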

### Math with strings

You can't really use math functions with strings.

In [20]:
np.sin('3.14')

TypeError: ufunc 'sin' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

However, you can use two mathematical operators with strings: `+` and `*`.
They have a different meaning, though: `+` concatenates two strings, while `*` repeats a string.

So, for example:

In [21]:
print("mouse" + "trap")

mousetrap


In [23]:
print("rat"*5)

ratratratratrat
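If a number arrives as text, one way around this limitation is to convert it first with the conversion functions shown earlier. The following is a minimal sketch; it uses the standard-library `math` module rather than `numpy` (an assumption made here to keep it self-contained), and the variable names are illustrative:

```python
import math

text = '3.14'

# convert the string to a number first, then apply the math function
value = float(text)
print(math.sin(value))

# on strings, + and * build new strings; they never do arithmetic on the digits
print('3' + '4')              # 34
print('3' * 4)                # 3333

# after conversion the same operators do arithmetic
print(int('3') + int('4'))    # 7
print(int('3') * 4)           # 12
```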


## Compositionality

So far, we have looked at the elements of a program in isolation without talking about how to combine them.
One of the most useful features of programming languages is their ability to take small building blocks and compose them. For example, the argument of a function can be any kind of expression, including arithmetic operators:

In [24]:
x = np.sin(degrees / 360.0 * 2 * np.pi)
x

0.7071067811865475

And even function calls:

In [27]:
print(x)
x = np.exp(np.log(x))
x

1.7071067811865475


1.7071067811865475

Almost anywhere you can put a value, you can put an arbitrary expression, with one exception: the left side of an assignment statement has to be a variable name. Any other expression on the left side is a syntax error.

In [28]:
minutes = hours * 60                 # right
hours * 60 = minutes                 # wrong!

SyntaxError: can't assign to operator (<ipython-input-28-e40822aac543>, line 2)

## Adding new functions

So far, we have only been using the functions that come with modules, but it is also possible to add new functions. A **function definition** specifies the name of a new function and the sequence of statements that run when the function is called.

Here is an example:

In [29]:
def print_exp():
    print("In vivo mouse experiment.")
    print("Multielectrode array recordings (32 channels).")

`def` is a keyword that indicates that this is a function definition. The name of the function is `print_exp`. 
The rules for function names are the same as for variable names: letters, numbers and underscore are legal, but the first character can’t be a number. You can’t use a keyword as the name of a function, and you should avoid having a variable and a function with the same name.

The empty parentheses after the name indicate that this function doesn’t take any arguments.

The first line of the function definition is called the header; the rest is called the body. The header has to end with a colon and the body has to be indented. By convention, indentation is always four spaces. The body can contain any number of statements.

The strings in the print statements are enclosed in double quotes. Single quotes and double quotes do the same thing; most people use single quotes except when a single quote (which is also an apostrophe) appears in the string.

Defining a function creates a **function object**, which has type `function`:

In [30]:
print(print_exp)
type(print_exp)

<function print_exp at 0x7fed697ae290>


function

The syntax for calling the new function is the same as for built-in functions:

In [31]:
print_exp()

In vivo mouse experiment.
Multielectrode array recordings (32 channels).


Once you have defined a function, you can use it inside another function. For example, to print the experiment description twice, we could write a function called `print_twice`:

In [32]:
def print_twice():
    print_exp()
    print_exp()

And then call print_twice:

In [33]:
print_twice()

In vivo mouse experiment.
Multielectrode array recordings (32 channels).
In vivo mouse experiment.
Multielectrode array recordings (32 channels).


But that’s not really useful.

## Definitions and uses


We can put all the function definitions and the function call together. Here we slightly modify the code fragments from the previous section, and the whole program looks like this:

In [4]:
def print_lab():
    print("------- Cortical Networks Lab -------")

def print_exp():
    print("In vivo mouse experiment.")
    print("Multielectrode array recordings (32 channels).")

def print_header():
    print_lab()
    print_exp()

print_header()

------- Cortical Networks Lab -------
In vivo mouse experiment.
Multielectrode array recordings (32 channels).


This program contains three function definitions: `print_lab`, `print_exp` and `print_header`. Function definitions get executed just like other statements, but the effect is to create function objects. The statements inside the function do not run until the function is called, and the function definition generates no output.

As you might expect, you have to create a function before you can run it. In other words, the function definition has to run before the function gets called.

As an exercise, move the last line of this program to the top, so the function call appears before the definitions. Run the program and see what error message you get.

Now move the function call back to the bottom and move the definition of `print_exp` after the definition of `print_header`. What happens when you run this program?

## Flow of execution

To ensure that a function is defined before its first use, you have to know the order statements run in, which is called the **flow of execution**.

Execution always begins at the first statement of the program. Statements are run one at a time, in order from top to bottom.

Function definitions do not alter the flow of execution of the program, but remember that statements inside the function don’t run until the function is called.

A function call is like a detour in the flow of execution. Instead of going to the next statement, the flow jumps to the body of the function, runs the statements there, and then comes back to pick up where it left off.

That sounds simple enough, until you remember that one function can call another. While in the middle of one function, the program might have to run the statements in another function. Then, while running that new function, the program might have to run yet another function!

Fortunately, Python is good at keeping track of where it is, so each time a function completes, the program picks up where it left off in the function that called it. When it gets to the end of the program, it terminates.

In summary, when you read a program, you don’t always want to read from top to bottom. Sometimes it makes more sense if you follow the flow of execution.
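To make the detours visible, here is a sketch based on the program from the previous section. The function bodies are replaced with numbered messages (a change made here purely for illustration), so the printed numbers show the order in which the statements actually run:

```python
def print_lab():
    print("3. inside print_lab")

def print_exp():
    print("4. inside print_exp")

def print_header():
    print("2. print_header starts")
    print_lab()     # detour into print_lab, then back here
    print_exp()     # detour into print_exp, then back here
    print("5. print_header ends")

# the three definitions above ran first, but produced no output
print("1. first statement that prints")
print_header()
print("6. back at the top level")
```

The messages appear in numerical order, even though the lines that print 3 and 4 come first in the file.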

## Parameters and arguments

Some of the functions we have seen require **arguments**. For example, when you call `np.sin` you pass a number as an argument. Some functions take more than one argument: `np.power` takes two, the base and the exponent.

Inside the function, the arguments are assigned to variables called **parameters**. For example, we can modify the function `print_twice` to print whatever we pass to it two times:

In [7]:
def print_twice(bruce):
    print(bruce)
    print(bruce)
    
print_twice("Mayday")

Mayday
Mayday


It's common to display information about an analysis while it is running, and usually you want to personalize this information, e.g. by displaying the name of the subject you are analyzing.

This is easily accomplished using functions that take parameters.

In [21]:
def print_subject(bruce):
    print("Subject name:", end=' ')
    print(bruce)

This function assigns the argument to a parameter named `bruce`. When the function is called, it prints the string `"Subject name: "` and then the value of the parameter (whatever it is). By default `print` ends its output with a newline; here we change this behaviour with `end=' '`, which sets the character printed at the end to a space.
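As a sketch of what `end` does, the two functions below (their names are made up here for illustration) should print the same line. The first uses `end=' '` as above; the second relies on the fact that `print` accepts several arguments and separates them with a space:

```python
def print_subject_two_calls(name):
    # end=' ' replaces the default newline, so the next print continues the line
    print("Subject name:", end=' ')
    print(name)

def print_subject_one_call(name):
    # several arguments to one print call are separated by a single space
    print("Subject name:", name)

print_subject_two_calls('P01')   # Subject name: P01
print_subject_one_call('P01')    # Subject name: P01
```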

This function works with any value that can be printed.

In [14]:
print_subject('P01')
print_subject('Spam')
print_subject(42)
print_subject(np.pi)

Subject name: P01
Subject name: Spam
Subject name: 42
Subject name: 3.141592653589793


The same rules of composition that apply to built-in functions also apply to programmer-defined functions, so we can use any kind of expression as an argument for `print_subject`:

In [16]:
print_subject('P01, ' + 'P03')
print_subject('P01 ' * 3)

Subject name: P01, P03
Subject name: P01 P01 P01 


The argument is evaluated before the function is called, so in these examples the expressions `'P01, ' + 'P03'` and `'P01 ' * 3` are evaluated only once.

Be careful to distinguish the name of an argument, the value of an argument, and the parameter. The following is a somewhat convoluted example that is worth thinking about.

In [22]:
bruce = 'batman'
print_subject('bruce')
print_subject(bruce)

Subject name: bruce
Subject name: batman


The name of the variable we pass as an argument (`bruce`) has nothing to do with the name of the parameter (`bruce`), even though here they happen to be spelled the same. It doesn’t matter what the value was called back home (in the caller); here in `print_subject`, we call everybody `bruce`.

## Variables and parameters are local

When you create a variable inside a function, it is local, which means that it only exists inside the function. 
For example:

In [24]:
def sum_and_print_twice(num1, num2):
    tmp = num1 + num2
    print_twice(tmp)

This function takes two arguments, sums them, and prints the result twice. Here is an example that uses it:

In [31]:
num_one = 4
num_two = 3
sum_and_print_twice(num_one, num_two)

7
7


When `sum_and_print_twice` terminates, the variable `tmp` is destroyed. If we try to print it, we get an exception:

In [30]:
print(tmp)

NameError: name 'tmp' is not defined

Parameters are also local. For example, outside `print_twice`, there is no such thing as `bruce` (unless you defined something else to be named bruce).
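One consequence, sketched below with a hypothetical function `relabel` that is not part of the original notebook, is that assigning a new value to a parameter inside a function changes only the local name. The caller's variable keeps its value:

```python
def relabel(bruce):
    # this assignment rebinds the local parameter only
    bruce = bruce + '_copy'
    print("inside:", bruce)

name = 'P01'
relabel(name)              # inside: P01_copy
print("outside:", name)    # outside: P01
```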

## Stack diagram

A stack diagram is like a state diagram, but with variables and parameters grouped by the function they belong to.
For example, the code we ran above corresponds to this stack diagram:

|                      |              |
|:---------------------|-------------:|
|`__main__`            | num_one --> 4|
|                      | num_two --> 3|
|                      |              |
|`sum_and_print_twice` | num1 --> 4   |
|                      | num2 --> 3   |
|                      | tmp --> 7    |
|                      |              |
|`print_twice`         | bruce --> 7  |

Functions are organized in a stack, with the order reflecting the calling order. Here `print_twice` was called by `sum_and_print_twice`, which in turn was called by `__main__`, the name of the topmost caller.

While this example is very simple and you don't need a stack diagram to understand the code, stack diagrams can be very useful for understanding and analyzing complex code.
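Python can also report the stack while the program is running. The sketch below uses `inspect.stack()` from the standard library inside a variant of `print_twice` modified here for illustration; the first names it prints are the functions from the diagram, innermost first:

```python
import inspect

def print_twice(bruce):
    # inspect.stack() lists the active function calls, innermost first
    for frame in inspect.stack():
        print(frame.function)
    print(bruce)
    print(bruce)

def sum_and_print_twice(num1, num2):
    tmp = num1 + num2
    print_twice(tmp)

sum_and_print_twice(4, 3)
```

The output starts with `print_twice` and `sum_and_print_twice`. What follows depends on how the code is run: a plain script shows `<module>` for the top level, while a notebook adds frames that belong to the notebook machinery itself.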

## Traceback

If an error occurs during a function call, Python prints the name of the function, the name of the function that called it, and the name of the function that called that, all the way back to the topmost caller. In the printed output the topmost caller appears as `<module>`; it corresponds to the `__main__` frame of the stack diagram.

For example, if you try to access `tmp` from within `print_twice`, you get a `NameError`:

In [32]:
def print_twice(bruce):
    print(tmp)
    print(bruce)
    print(bruce)

sum_and_print_twice(3, 4)

NameError: name 'tmp' is not defined

This list of functions is called a **traceback**. It tells you what program file the error occurred in, and what line, and what functions were executing at the time. It also shows the line of code that caused the error. 
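The output shown above contains only the last line of the traceback. To see the full list of functions, the sketch below catches the error and prints the traceback text with `traceback.format_exc()` from the standard library. The `try`/`except` construct has not been introduced yet; it is used here only so the example keeps running after the error:

```python
import traceback

def print_twice(bruce):
    print(tmp)      # tmp is local to sum_and_print_twice, not defined here
    print(bruce)
    print(bruce)

def sum_and_print_twice(num1, num2):
    tmp = num1 + num2
    print_twice(tmp)

try:
    sum_and_print_twice(3, 4)
except NameError:
    # format_exc() returns the traceback text Python would normally print
    report = traceback.format_exc()
    print(report)
```

Reading the report from top to bottom follows the calls from the topmost caller down to `print_twice`, where the error occurred.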

## Fruitful functions and void functions

Some of the functions we have used, such as the math functions, return results; for lack of a better name, I call them fruitful functions. Other functions, like `print_twice`, perform an action but don’t return a value. They are called void functions.

When you call a fruitful function, you almost always want to do something with the result; for example, you might assign it to a variable or use it as part of an expression:

In [None]:
x = np.cos(radians)
golden = (np.sqrt(5) + 1) / 2

When you call a function in interactive mode, Python displays the result:

In [33]:
np.sqrt(5)

2.23606797749979

but since it doesn’t store the result, it cannot be further used.

Void functions might display something on the screen or have some other effect, but they don’t have a return value. If you assign the result to a variable, you get a special value called None.

In [34]:
result = print_subject('Braille')
print(result)

Subject name: Braille
None


The value None is not the same as the string 'None'. It is a special value that has its own type:

In [35]:
type(None)

NoneType
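A short sketch of how to check for this value: `print_subject` is repeated from above so the block runs on its own, and the comparison with `is` is the usual Python idiom for testing against `None`:

```python
def print_subject(bruce):
    print("Subject name:", end=' ')
    print(bruce)

result = print_subject('Braille')

print(result is None)      # True: a void function returns None
print(result == 'None')    # False: None is not the string 'None'
print(type(result))        # <class 'NoneType'>
```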

##  Return values

The functions we have written so far are all void. Speaking casually, they have no return value; more precisely, their return value is `None`.

In this section we will write fruitful functions. The first example is area, which returns the area of a circle with the given radius:

In [41]:
def area(radius):
    a = np.pi * radius**2
    return a

The return statement ends the execution of the function and hands control back to the caller. The expression after `return` is evaluated, and its value, called the **return value**, becomes the value of the function call in the caller.

This means that when the function `area` is called it will return the value of local variable `a`.

We can assign the return value to a variable.

In [42]:
unit_circle_area = area(1)
print(unit_circle_area)

3.141592653589793


The return expression can be arbitrarily complicated, so we could have written this function more concisely:

In [38]:
def area(radius):
    return np.pi * radius**2

On the other hand, temporary variables like `a` can make debugging easier.

As soon as a return statement runs, the function terminates without executing any subsequent statements. Code that appears after a return statement, or any other place the flow of execution can never reach, is called _dead code_.
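As an illustration, the version of `area` below has a `print` placed after the `return`. This is a sketch that uses the standard-library `math` module in place of `numpy` (an assumption made here so it runs without extra packages); the extra line is dead code and never produces output:

```python
import math

def area(radius):
    return math.pi * radius**2
    print("this line never runs")   # dead code: it comes after the return

unit_circle_area = area(1)
print(unit_circle_area)   # 3.141592653589793
```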

## Incremental development

As you write larger functions, you might find yourself spending more time debugging.

To deal with increasingly complex programs, you might want to try a process called __incremental development__. The goal of incremental development is to avoid long debugging sessions by adding and testing only a small amount of code at a time.

The key aspects of the process are:

1. Start with a working program and make small incremental changes. At any point, if there is an error, you should have a good idea where it is.
2. Use variables to hold intermediate values so you can display and check them.
3. Once the program is working, clean the code and simplify statements, but only if it does not make the program difficult to read.

For example, let's write a function that calculates Pearson's correlation coefficient from the covariance and variances of two variables:

$\rho = \mathrm{cov}(x, y) / \sqrt{\sigma_x^2 \sigma_y^2}$

The first thing we need to do is design the function interface, i.e. the parameters it takes and the value it returns.

In [None]:
def correlation(cov, vx, vy):
    # TODO
    return 0.0

Of course this function doesn't compute the correlation but it's a good start.

The next step is to calculate the product of the variances. We print it to check that the result is correct.
When testing the function, I choose parameter values for which I know the answer, so that I can easily track any error in the intermediate steps.

In [41]:
def correlation(cov, vx, vy):
    prod = vx * vy
    print(prod)
    return 0.0

correlation(2, 4, 9)

36


0.0

Now we can calculate the normalization factor by taking the square root of the product just calculated.

In [42]:
def correlation(cov, vx, vy):
    prod = vx * vy
    norm_factor = np.sqrt(prod)
    print(norm_factor)
    return 0.0

correlation(2, 4, 9)

6.0


0.0

Finally we divide the covariance by the normalization factor.

In [43]:
def correlation(cov, vx, vy):
    prod = vx * vy
    norm_factor = np.sqrt(prod)
    corr = cov / norm_factor
    return corr

correlation(3, 4, 9)

0.5

Now we can also make our function a little more compact. We could even write it as a single return expression, but that would be harder to read.

In [None]:
def correlation(cov, vx, vy):
    norm_factor = np.sqrt(vx * vy)
    corr = cov / norm_factor
    return corr

A __docstring__ is a string at the beginning of a function that explains the interface. Here is an example:

In [None]:
def correlation(cov, vx, vy):
    """
    Calculates Pearson's correlation coefficient from the variances and covariance of two variables x and y.
    PARAMETERS:
        cov : float covariance of x and y
        vx : float variance of x
        vy : float variance of y
    RETURN:
        corr : float correlation coefficient of x and y
    """
    norm_factor = np.sqrt(vx * vy)
    corr = cov / norm_factor
    return corr
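Python stores the docstring with the function object, so it can be read back later. The sketch below uses a shortened docstring and the standard-library `math.sqrt` in place of `np.sqrt` (both are simplifications made here) and prints the text through the `__doc__` attribute:

```python
import math

def correlation(cov, vx, vy):
    """
    Calculates Pearson's correlation coefficient from the
    covariance (cov) and the variances (vx, vy) of x and y.
    """
    norm_factor = math.sqrt(vx * vy)
    corr = cov / norm_factor
    return corr

# the docstring is available as an attribute of the function object
print(correlation.__doc__)
print(correlation(3, 4, 9))   # 0.5
```

Calling the built-in `help(correlation)` displays the same text together with the function signature.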

## Why functions?

1. make your program easier to read and debug by giving names to chunks of code

2. make a program smaller by eliminating repetitive code

3. allow debugging by parts (_divide et impera_)

4. allow you to reuse functions in other programs

## Debugging

One of the most important skills you will acquire is debugging. Although it can be frustrating, debugging is one of the most intellectually rich, challenging, and interesting parts of programming.

In some ways debugging is like detective work. You are confronted with clues and you have to infer the processes and events that led to the results you see.

Debugging is also like an experimental science. Once you have an idea about what is going wrong, you modify your program and try again. If your hypothesis was correct, you can predict the result of the modification, and you take a step closer to a working program. If your hypothesis was wrong, you have to come up with a new one. As Sherlock Holmes pointed out, “When you have eliminated the impossible, whatever remains, however improbable, must be the truth.” (A. Conan Doyle, The Sign of Four)

For some people, programming and debugging are the same thing. That is, programming is the process of gradually debugging a program until it does what you want. The idea is that you should start with a working program and make small modifications, debugging them as you go.

## Exercises

### Exercise 1  

Draw a stack diagram for the following program. What does the program print?
```
def b(z):
    prod = a(z, z)
    print(z, prod)
    return prod

def a(x, y):
    x = x + 1
    return x * y

def c(x, y, z):
    total = x + y + z
    square = b(total)**2
    return square

x = 1
y = x + 1
print(c(x, y+3, x+y))
```

### Exercise 2

In calcium imaging, neurons are often identified as circles of different size and location in the reference image.
You are given the center (x and y coordinates) and the area of each neuron.
Write a function that calculates the distance between two neurons.

Use incremental development and include docstrings in your functions. Your code should contain at least three functions.

In [51]:
def distance(x1, y1, x2, y2):
    """
    Calculates the Euclidean distance between the points (x1, y1) and (x2, y2).
    """
    diffx = x1-x2
    diffy = y1-y2
    sq1 = diffx**2
    sq2 = diffy**2
    dist = np.sqrt(sq1 + sq2)
    return dist

def get_radius(area):
    radius = np.sqrt(area / np.pi)
    return radius

def get_neurons_distance(centers_distance, radius1, radius2):
    return centers_distance - radius1 - radius2

centerx1 = 0
centery1 = 0
area1 = 10
centerx2 = 2
centery2 = 2
area2 = np.pi

dist_centers = distance(centerx1, centery1, centerx2, centery2)
radius1 = get_radius(area1)
radius2 = get_radius(area2)
neurons_dist = get_neurons_distance(dist_centers, radius1, radius2)
print(neurons_dist)




0.04430300859341907


### Exercise 3

How many rectangular cuboids of sides $a > b > c$ can you fit in a cube of edge $a+b+c$?
You only need to define one function.

After finding the solution you can read about this interesting mathematical puzzle: https://en.wikipedia.org/wiki/Hoffman%27s_packing_puzzle

### Exercise 4

Given two parallel lines $y_1 = a x + c_1$ and $y_2 = a x + c_2$, calculate the distance between the two lines.

Hint: the intersection points between each line and the line through the origin that is perpendicular to both are: $(\frac{-c_1 a}{a^2+1}, \frac{c_1}{a^2+1})$ and $(\frac{-c_2 a}{a^2+1}, \frac{c_2}{a^2+1})$. You only need to define a function that returns these two points given the parameters of the two lines. If you reuse a function from exercise 2, you don't need to define other functions.