# Python functions tutorial

In this tutorial, we will learn what functions are for and how to create them. We will also learn about parameters, arguments, and returned values.

A Python function is a named block of code that can be executed as many times as desired. For example, `print()` and `type()` are two of Python's built-in functions. When we write our own functions, there are several good practices to follow.

First:

Every function should do **only one specific task** and do it well. This doesn't mean that the function's body consists of a single line of code; it can contain loops, several variable assignments, etc. Rather, **one specific task** means one conceptual task (changing the format of a variable, filtering a list, training a model, etc.). If we find that our function is doing more than one task, it is better to split it into two functions. For example, imagine that we defined a function named `train_and_evaluate_model()`. This is not a good idea; it is better to define two functions: `train_model()` and `evaluate_model()`.
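
As a rough illustration of this first rule, the sketch below splits the work into two small functions. The "model" is only a stand-in (the mean of the training values), and the bodies of `train_model()` and `evaluate_model()` are assumptions made for this example, not a real training procedure.

```python
def train_model(training_values):
    # Toy "model" (an assumption for this sketch): the mean of the training values
    return sum(training_values) / len(training_values)


def evaluate_model(model, test_values):
    # Mean absolute error of the toy model on the test values
    errors = [abs(value - model) for value in test_values]
    return sum(errors) / len(errors)


# Each function does one task, so each can be reused and tested on its own
model = train_model([1, 2, 3])
error = evaluate_model(model, [2, 4])
print(model, error)
```

Keeping the two tasks apart means we can, for instance, evaluate the same model on several data sets without training it again.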

Second:

As the previous examples show, every user-defined function (there are other types of functions, such as the built-in ones) **must have a self-explanatory name** that tells us what the function's goal is. Of course, we can use any name we like for our functions, but with a cryptic name it becomes harder over time to remember what the function does. In addition: write the name in lowercase, avoid numbers where possible (a name can never start with one), and don't include white spaces in the function name (Python doesn't allow them; use underscores instead).
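
The naming rules that Python itself enforces can be checked programmatically. The following is a minimal sketch using `str.isidentifier()` and the standard `keyword` module; the candidate names in the list are made up for illustration.

```python
import keyword

# Hypothetical candidate names, chosen only to illustrate the rules
candidate_names = ["days_until_end_of_year", "days until end", "2nd_try", "def"]

validity = {}
for name in candidate_names:
    # isidentifier() rejects spaces and leading digits;
    # iskeyword() rejects reserved words such as "def"
    validity[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(name, "->", validity[name])
```

Only the first candidate is a legal function name. Note that this check says nothing about whether a name is self-explanatory; that part remains our job.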

Third:

It's a good practice to add a `docstring` to the function. A `docstring` is a (usually multiline) Python string that explains what the function does, what its arguments and their types are, and what the types of its outputs are.
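
A docstring is more than a comment: Python stores it with the function, so it can be read back later. The sketch below assumes a small example function, `add_two_numbers()`, and reads its docstring through the `__doc__` attribute and through `inspect.getdoc()` from the standard library.

```python
import inspect


def add_two_numbers(a, b):
    """
    Return the sum of two numbers.

    Inputs:
    a -> number
    b -> number

    Outputs:
    The sum of a and b
    """
    return a + b


# The raw docstring, exactly as written (including its indentation)
print(add_two_numbers.__doc__)

# inspect.getdoc() returns the same text with the indentation cleaned up
print(inspect.getdoc(add_two_numbers))
```

In an interactive session, `help(add_two_numbers)` displays the same text together with the function's signature.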

# Functions syntax 

To create a function in Python we need to use the following syntax:

```python
def function_name():
    """
    docstring
    """
    ...
    [return]
```

where `function_name` can be any name that we want to give to our function. However, remember to follow the second rule from the previous section! Here are a bad and a good function name: `d()` and `days_until_end_of_year()`. Don't you agree that the second name is more meaningful than the first?

As we can see in the previous example, the function definition starts with the **keyword** `def` followed by the function's name (which can't be a reserved keyword such as `def`), parentheses, and `:`. Next comes the `docstring`, which is optional but good practice to include because it allows you and third parties to understand the function's purpose, its arguments, its outputs, and their types (integers, floats, lists, tuples, etc.). One example of a `docstring` could be:

```python
def days_until_end_of_year():
    """
    This function computes how many days are left until the end of the year based on the current date.

    Inputs:
    None

    Outputs:
    Integer with the number of days left until the end of the year
    """
    ...
    [return]
```

In this way, anyone who wants to know how to use our function can find out by reading its documentation.

It is important to notice that the code inside the function must be **indented**. What does this mean? It means that every line of the function's body must start with the same amount of leading white space, conventionally four spaces (a tab also works, but spaces and tabs should not be mixed). That's how Python identifies that those lines of code belong to the function and not to the main program. A line that is not indented is treated as part of the main program, and a `def` line that is not followed by at least one indented line raises an `IndentationError`. It's very important to pay attention to this point when we copy and paste Python code from one text editor to another because the indentation can be lost.
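
To make the role of indentation concrete, here is a minimal sketch with a hypothetical function named `greet()`. The indented lines belong to the function and only run when it is called; the unindented lines belong to the main program.

```python
def greet():
    # These indented lines form the body of the function
    message = "Hello from inside the function"
    print(message)

# This line is not indented, so it belongs to the main program
print("Hello from the main program")

# The body of greet() runs only now, when the function is called
greet()
```

Although the function is defined first, the message from the main program is printed first, because defining a function does not execute its body.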

At this point, you might wonder what the `...` after the `docstring` means. It stands for any number of lines of code: `print()` calls, variable assignments, etc.

Finally, we have the `return` clause within square brackets. The square brackets are not Python syntax; they only indicate that the `return` clause (written without the brackets) is optional. What is the purpose of the `return` clause? When we define a Python function, we usually want it to produce something (a calculation, a plot, a new variable, etc.) and store the result **in a variable external to the function**. If we don't include a `return` clause we can still make calculations, plots, etc., but we will not be able to store their results in a variable external to the function.
However, **by default the returned value of a function is `None` unless we use the `return` clause** to replace it. What does this mean? Let's look at a couple of examples:

```python
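from datetime import date

def days_until_end_of_year():
    """
    This function computes how many days are left until the end of the year based on the current date.

    Inputs:
    None

    """

    current_date = date.today()
    date_end_of_year = date(current_date.year, 12, 31)
    # Subtracting two dates gives a timedelta; .days is its length in whole days
    days_to_end_of_year = (date_end_of_year - current_date).days

    print("The number of days until the end of the year is: ", days_to_end_of_year)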
```

This function doesn't include a `return` clause, so it only prints the desired value on the screen. We will not be able to store this value in a variable external to the function. See the code below:


```python
days_left = days_until_end_of_year()
print(days_left)
```

In this case, the value of `days_left` will be `None` because our function doesn't include a `return` clause to replace the default returned value. Compare this with another version of the function:

```python
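from datetime import date

def days_until_end_of_year():
    """
    This function computes how many days are left until the end of the year based on the current date.

    Inputs:
    None

    Outputs:
    Integer with the number of days left until the end of the year
    """

    current_date = date.today()
    date_end_of_year = date(current_date.year, 12, 31)
    # Subtracting two dates gives a timedelta; .days is its length in whole days
    days_to_end_of_year = (date_end_of_year - current_date).days

    return days_to_end_of_year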

days_left = days_until_end_of_year()
print(days_left)
```

In this case, the value of the variable `days_left` is no longer `None`. Instead, it is the value of the variable `days_to_end_of_year` defined inside the function.

# Variable scoping

At this point, it is worth mentioning the `scope` of variables. What does this mean? Let's pay attention to a couple of variables defined in the previous block of code: `days_left` and `days_to_end_of_year`. The first one is **external** to the function, while the second one is **internal** to it. This means that the main Python code **can't access the variable `days_to_end_of_year` because it is internal to the function** and only exists while the function is running.

A good analogy to understand this concept could be your house. You have your house (your function) and from it, you can see what's outside (the main program). However, the people outside your house (your function) can't see what you have inside your house. 
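
The scoping rule can be observed directly. The sketch below uses a hypothetical function, `days_in_two_weeks()`, whose internal variable `days_total` is not visible from the main program; trying to read it there raises a `NameError`.

```python
def days_in_two_weeks():
    # days_total is internal: it exists only while the function is running
    days_total = 2 * 7
    return days_total

# The returned value is available in the main program
result = days_in_two_weeks()
print(result)

try:
    # The internal name is not visible out here
    print(days_total)
except NameError as error:
    print("NameError:", error)
```

The only way to get the internal value out of the function is to return it, as done here with `result`.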

# Function arguments

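So far, none of our example functions included **arguments** in the function definition. What is an argument? It's a value that needs to be provided to the function to make it work. Arguments are listed within the parentheses of the function definition (strictly speaking, the names in the definition are called *parameters* and the values passed in a call are the *arguments*; this tutorial uses "argument" for both). Let's look at an example: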

```python
from datetime import date

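def days_until_end_of_year(current_date):
    """
    This function computes how many days are left until the end of the year based on the current date.

    Inputs:
    current_date -> datetime.date

    Outputs:
    Integer with the number of days left until the end of the year
    """

    date_end_of_year = date(current_date.year, 12, 31)
    # Subtracting two dates gives a timedelta; .days is its length in whole days
    days_to_end_of_year = (date_end_of_year - current_date).days

    return days_to_end_of_year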

today = date.today()
days_left = days_until_end_of_year(today)
print(days_left)
```

In this case, the function definition has one argument within the parentheses: `current_date`, which needs to be provided when the function is called. In the previous example, we call the function with the variable `today`. What happens here? Python implicitly performs the assignment `current_date = today`. In other words, inside the function `current_date` refers to the same value as the variable `today`.

At this point, you might wonder: if the function can see the variables defined in the main program (in particular `today`), why should I include the `current_date` argument in my function instead of simply using `today`, as shown below?

```python
from datetime import date

def days_until_end_of_year():
    """
    This function computes how many days are left until the end of the year based on the current date.

    Inputs:
    None

    Outputs:
    Integer with the number of days left until the end of the year
    """

    # Bad practice: `today` is not an argument; it must already exist in the main program
    date_end_of_year = date(today.year, 12, 31)
    days_to_end_of_year = (date_end_of_year - today).days

    return days_to_end_of_year

today = date.today()
days_left = days_until_end_of_year()
print(days_left)
```

Because **it's very bad practice to assume that a variable the function needs is defined in the main program**. What will happen if the variable `today` is not defined in the main program? Our function will fail with a `NameError`!
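
The failure is easy to reproduce. The following sketch uses two hypothetical functions: `greet_user()` relies on a variable `user_name` that is assumed to exist in the main program, while `greet()` receives everything it needs through its argument.

```python
def greet_user():
    # Bad practice: relies on a variable named user_name existing in the main program
    return "Hello, " + user_name

def greet(name):
    # Better: everything the function needs arrives through its argument
    return "Hello, " + name

try:
    # user_name was never defined, so this call fails
    greet_user()
except NameError as error:
    print("NameError:", error)

# This call works no matter which variables exist in the main program
print(greet("Ada"))
```

The second version can also be copied into any other program without worrying about which variables that program defines.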

At this point, a good practice is to **avoid naming the arguments with the same name as another variable of the main program**. Instead, we should use a generic and self-explanatory name.

Our functions can have as many arguments as needed: none, one, two,...etc. Consider the following silly example:

```python
def add_two_numbers(a,b):
    """
    This function returns the sum of the two arguments
    
    Inputs:
    a -> number
    b -> number
    
    Output:
    The sum of the numbers
    """
    
    return a+b

total = add_two_numbers(4,7)
print(total)
```

When called, this function assigns `4` and `7` to `a` and `b` respectively and returns their sum, which replaces the default returned value `None`.

# Optional arguments

Sometimes, we might want our functions to use default values for some of the arguments. In that case, we simply assign a value to the argument within the parentheses of the function definition. When the function is called from the main program, we can then omit the corresponding optional argument, and the function will use its default value. Let's see an example.

```python
def add_two_numbers(a, b=5):
    """
    This function returns the sum of the two arguments
    
    Inputs:
    a -> number
    b -> number
    
    Output:
    The sum of the numbers
    """
    
    return a+b

total_1 = add_two_numbers(4,7)
total_2 = add_two_numbers(4)
print(total_1)
print(total_2)
```

In this case, we fill `total_1` by calling the function with both arguments (`a` and `b`), so the content of `total_1` will be 11. In contrast, we fill `total_2` by calling the function **omitting the second argument**, so the content of `total_2` will be 9 because `b` takes the default value 5.

It's important to notice that when we define a function with optional arguments, they must be placed **after the mandatory arguments**. In other words, we can't write `def add_two_numbers(b=5, a)`; Python rejects it with a `SyntaxError`.
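
One way to see this rule in action without breaking the rest of a script is to hand the invalid definition to `compile()` as a string. This is only a demonstration trick, not something needed in everyday code.

```python
# The definition is kept in a string so that the rest of this example can still run
invalid_definition = "def add_two_numbers(b=5, a):\n    return a + b\n"

try:
    compile(invalid_definition, "<example>", "exec")
    definition_is_valid = True
except SyntaxError as error:
    definition_is_valid = False
    print("SyntaxError:", error.msg)

print(definition_is_valid)
```

The exact wording of the error message depends on the Python version, but the definition is rejected in all of them.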

Thus, it's important to remember a couple of points:

First: The order of the arguments matters!!!

Second: The type of the argument matters!!!

A good analogy for a Python function is a coffee machine. The machine needs two arguments: water and coffee. However, if we place the water in the coffee container and the coffee in the water container, the machine will not work. On the other hand, if we place ice in the water container, the machine will not work until the ice melts.

What will happen with the previous function if we make the following call?

```python
def add_two_numbers(a, b=5):
    """
    This function returns the sum of the two arguments
    
    Inputs:
    a -> number
    b -> number
    
    Output:
    The sum of the numbers
    """
    
    return a+b

total = add_two_numbers(4,"Hello")
```

Of course, we can't add `4` and `"Hello"`, so Python will raise a `TypeError`.
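
If we want the program to keep running after such a call, the error can be caught. The sketch below wraps the call in `try`/`except`; the docstring is left out only to keep the example short.

```python
def add_two_numbers(a, b=5):
    return a + b

try:
    total = add_two_numbers(4, "Hello")
except TypeError as error:
    # Adding an integer and a string is not defined, so Python raises a TypeError
    print("TypeError:", error)
```

Note that the error appears only when the function is called with the wrong type, not when the function is defined.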

# Returning multiple variables

So far, we have considered functions whose `return` clause returned only one value. However, we can define functions that return more than one value; in fact, as many as we want. Python packs the returned values into a single tuple, so to capture each one in its own variable we need to unpack them. Let's see a simplistic example to illustrate this point:

```python
def add_and_substract(a,b=4):
    """
    This function returns the sum and subtraction of the two arguments
    
    Inputs:
    a -> number
    b -> number
    
    Output:
    The sum and subtraction of two numbers
    """
    
    return a+b, a-b
```

At this point, you might wonder how we can collect the output, now that we have two values. We simply list the receiving variables of the main program separated by `,` as shown below.

```python
addition, difference = add_and_substract(4,3)
```

At this point, the contents of `addition` and `difference` will be 7 and 1 respectively. Moreover, in some cases, we might not be interested in storing one or more of the returned values. In those cases, when we call the function we can replace the variable name with an underscore `_`, a conventional name for values we intend to ignore. See the following examples:

```python
_, difference_1 = add_and_substract(4,3)
addition_2, _   = add_and_substract(7)
```

In these cases, `difference_1` will be 1 and `addition_2` will be 11 (remember that `b` has a default value of 4). However, we will not keep the values of 4+3 and 7-4 because we have discarded them using the underscore `_`.
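
It may help to look at what the function returns when we do not unpack the result. In the sketch below, the function from this section is repeated without its docstring, and its output is captured in a single variable.

```python
def add_and_substract(a, b=4):
    return a + b, a - b

# Without unpacking, both values arrive together as a single tuple
result = add_and_substract(4, 3)
print(type(result))
print(result)

# Unpacking assigns each element of the tuple to its own variable
addition, difference = result
print(addition, difference)
```

Writing `addition, difference = add_and_substract(4, 3)` does both steps at once.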


# Important considerations

1. When we create functions that modify a pandas DataFrame, it is a safe practice to create an independent copy of the DataFrame and work with the copy. Otherwise, a mistake in the function's code would unintentionally modify the original DataFrame, and we would have to start over.

```python
def add_the_first_two_columns_of_a_pandas_dataframe(df):
    """
    This function adds a new column to a pandas DataFrame with the addition of the first two columns
    and returns the resulting dataframe
    
    Inputs:
    df -> pandas DataFrame
    
    Outputs:
    df2 -> the modified pandas DataFrame
    """
    
    df2 = df.copy()
    
    df2['new_column'] = df2.iloc[:,0] + df2.iloc[:,1]

    return df2
```

2. Always test your functions with simple examples before using them in your code.
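
One lightweight way to run such tests is with `assert` statements. The sketch below checks the `add_two_numbers()` function from earlier in this tutorial against a few cases whose results are easy to work out by hand.

```python
def add_two_numbers(a, b=5):
    return a + b

# Each assert stays silent when the condition holds and raises AssertionError otherwise
assert add_two_numbers(4, 7) == 11
assert add_two_numbers(4) == 9       # uses the default value b=5
assert add_two_numbers(-1, 1) == 0   # a case with a negative number

print("All checks passed")
```

If one of the checks fails, Python stops at that line, which points directly at the case that needs attention.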

# Creating functions from your code

Imagine that we have a piece of code that we are using over and over. Wouldn't it be nicer to create a function to store this piece of code and replace those lines of our main program with calls to our new function? In this way, the main code will be shorter and easier to read. An even bigger advantage is that, once we get used to using functions in our code, it is easier to locate mistakes or to make modifications that only affect one function.

If we want to turn a piece of code into a function, there are several steps to follow:

* Look at the piece of code that you want to turn into a function and figure out which variables it needs for its calculations

* Use these variables as the arguments of your function

* Determine a self-explanatory name for the function

* Add a docstring

Let's see a trivial example:

```python
a = 5
b = 6
print(a+b)

def add_two_numbers(a,b):
    """
    This function prints the sum of two numbers
    
    Inputs:
    a -> number
    b -> number
    """
    print(a+b)
```

In this example, the third line of code (`print(a+b)`) can't run unless `a` and `b` are defined, so they become the arguments of the function. Let's see a more elaborate example:

```python
import numpy as np

y_real = np.array([1,5,3,6,3,5])
y_pred = np.array([5,1,6,3,6,7])

mse = round(np.mean(np.power(y_real - y_pred, 2)), 2)

def get_mean_squared_error(y_real, y_pred):
    """
    This function computes the mean squared error of any model predictions using the real values and the model's predictions
    
    Inputs:
    
    y_real -> real values (numpy array)
    y_pred -> model predictions (numpy array)
    
    Output:
    
    mean squared error
    """
    
    mse = round(np.mean(np.power(y_real - y_pred, 2)), 2)
    
    return mse
```

Again, the previous code can't work unless we provide `y_real` and `y_pred`. Therefore, those variables need to be used as the function's arguments.

# Exercises

1. Define a function that given an integer as an argument prints the factorial of the argument. Then, create a new variable named `result`, assign to it the output of the function, and print the content of `result`.

2. Modify the previous function to return the factorial of a number. Then, create a new variable named `result`, assign to it the output of the function, and print the content of `result`.

3. Add a docstring to your function.

4. Modify the argument of the function to be optional and assign a default value of 5.

5. Define a function that, given an angle (in radians), returns cos(angle) and sin(angle).

6. Define a function that takes a list of numbers as an argument, and returns the same list but changes the sign of the odd numbers.

7. Define a function that, given a list of numbers as an argument, returns a dictionary with the keys being the same numbers but as strings and the same numbers as values. For example: [1,2,3] -> {"1":1,"2":2,"3":3}

8. Define a function that, given a pandas DataFrame as an argument, returns two DataFrames: the first one with the numeric columns of the input DataFrame and the second with its categorical columns.

9. Define a function that, given a pandas DataFrame as an argument, returns a **new DataFrame** with the same columns as the original but in reverse order, **without modifying the original one**.