# FUNCTIONS

## Overview 
One construct that’s extremely useful and provided by almost all programming languages is functions.

We have already met several functions, such as

- the `sqrt()` function from NumPy and
- the built-in `print()` function

In this lecture we’ll treat functions systematically and begin to learn just how useful and important they are.

One of the things we will learn to do is build our own user-defined functions.

We will use the following imports.

In [81]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (10,6) # global figure size

## Function Basics

A function is a named section of a program that implements a specific task.

Many functions exist already and we can use them off the shelf.

First, we review these functions and then discuss how we can build our own.

### Built-In Functions

Python has a number of *built-in* functions that are available without `import`.

In [2]:
max(19, 20)

20

In [3]:
print('foobar')

foobar


In [4]:
str(22) # make number into a string

'22'

In [5]:
type(22)

int

Two more useful built-in functions are `any()` and `all()`.

In [13]:
bools = False, True, True
all(bools)  # True if all are True and False otherwise

False

In [14]:
any(bools)  # False if all are False and True otherwise

True

The full list of Python built-ins is [here](https://docs.python.org/3/library/functions.html).

## Third Party Functions

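If the built-in functions don’t cover what we need, we either need to import functions or create our own.

We have already imported functions from third-party packages such as NumPy. Functions can be imported in the same way from modules in Python's standard library. Here’s one, `isleap()` from the `calendar` module, which tests whether a given year is a leap year: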

In [18]:
import calendar

calendar.isleap(2020)

True

## Defining Functions

In many instances, it is useful to be able to define our own functions.

This will become clearer as you see more examples.

Let’s start by discussing how it’s done.

### Syntax

Here’s a very simple Python function that implements the mathematical function $f(x) = 2x + 1$:

In [20]:
def f(x):  # <- function header
    return 2 * x + 1 

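- Returning values is generally more desirable than printing them, because `print()` itself returns `None`: a function that only prints gives the calling code nothing to store or reuse, and assigning its result to a variable yields an object of type `NoneType`.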
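
To make the difference concrete, here is a minimal sketch. The names `f_return` and `f_print` are illustrative and not part of the lecture; they are two variants of the same calculation, one returning its result and one only printing it.

```python
def f_return(x):
    """Return 2x + 1 to the calling code."""
    return 2 * x + 1


def f_print(x):
    """Print 2x + 1; nothing is returned explicitly."""
    print(2 * x + 1)


a = f_return(1)   # a holds the returned value
b = f_print(1)    # the value is printed, but b is None

print(a + 1)      # the returned value can be reused in later calculations
print(type(b))    # the printing variant handed back NoneType
```

Trying `b + 1` here would raise a `TypeError`, which is why returning is usually the better default.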

Now that we've *defined* this function, let's *call* it and check whether it does what we expect:

In [23]:
f(1) # let x value in function be 1

3

In [24]:
f(10)

21

Here’s a longer function, that computes the absolute value of a given number.

In [1]:
def new_abs_function(x):
    
    if x < 0:
        abs_value = -x
        
    else: 
        abs_value = x
    
    return abs_value # return the value to the caller instead of printing it

Let’s review the syntax here.

- `def` is a Python keyword used to start function definitions.
- `def new_abs_function(x):` indicates that the function is called `new_abs_function` and that it has a single argument `x`.
- The indented code is a code block called the function body.
- The `return` keyword indicates that `abs_value` is the object that should be returned to the calling code.

This whole function definition is read by the Python interpreter and stored in memory.

Let’s call it to check that it works:

In [28]:
print(new_abs_function(3))
print(new_abs_function(-3))

3
3


## Docstrings

- Docstrings describe what your function does.
- They serve as documentation for your function.
- A docstring is placed on the line immediately after the function header, between triple quotes (`"""`).

### Anatomy of a docstring

In [1]:
def function_name(arguments):
    """
    Description of what the function does.
    
    Description of the arguments, if any. 
    
    Description of the return value(s), if any. 
    
    Description of errors raised, if any. 
    
    Optional extra notes or example of usage. 
    """

## Docstring formats

- Google Style
- Numpydoc 
- reStructuredText
- EpyText

#### Google style
- In Google style, the docstring starts with a concise description of what the function does. This should be written in imperative language. 
    - For instance: "Split the data frame and stack the columns" instead of "This function will split the data frame and stack the columns".
- Next comes the "Args" section where you list each argument name, followed by its expected type in parentheses, and then what its role is in the function. If you need extra space, you can break to the next line and indent. If an argument has a default value, mark it as "optional" when describing the type. If the function does not take any parameters, feel free to leave this section out.
- The next section is the "Returns" section, where you list the expected type or types of what gets returned. You can also provide some comment about what gets returned, but often the name of the function and the description will make this clear. Additional lines should not be indented.

- Finally, if your function intentionally raises any errors, you should add a "Raises" section. You can also include any additional notes or examples of usage in free form text at the end.

#### Google Style Example

In [44]:
def function(arg_1,arg_2=42):
    """Description of what the function does.
    
    
    Args:
        arg_1 (str): Description of arg_1 that can break onto the next line if needed.
        arg_2 (int, optional): Write optional when an argument has a default value. 
        
    Returns:
        bool: Optional description of the return value
        Extra lines are not indented.
        
    Raises:
        ValueError: Include any error types that the function intentionally raises. 
        
    Notes:
        See "X" for more info. 
    """

#### Numpydoc
- The Numpydoc format is very similar and is the most common format in the scientific Python community. 

In [46]:
def function(arg_1,arg_2=42):
    """
    Description of what the function does.
    
    Parameters
    ----------
    arg_1 : expected type of arg_1
        Description of arg_1
    arg_2 : int, optional
        Write optional when an argument has a default value.
        Default=42
        
    Returns
    -------
    The type of the return value
        Can include a description of the return value.
        Replace "Returns" with "Yields" if this function is a generator.
    """

## Retrieving docstrings

In [47]:
def the_answer():
    """ Return the answer to life, the universe, and everything.
    
    Returns:
            int 
    """
    return 42
print(the_answer.__doc__)

 Return the answer to life, the universe, and everything.
    
    Returns:
            int 
    


A cleaner version:

In [50]:
import inspect
print(inspect.getdoc(the_answer))

Return the answer to life, the universe, and everything.

Returns:
        int 


In IPython and Jupyter, appending one question mark to a name brings up its docstring, and appending two shows the source code as well.

In [51]:
the_answer?

Signature: the_answer()
Docstring:
Return the answer to life, the universe, and everything.

Returns:
        int 
File:      /var/folders/3j/22yv_sj10t96slp1b8b6z1zw0000gn/T/ipykernel_1454/3516594323.py
Type:      function


In [52]:
the_answer??

Signature: the_answer()
Source:   
def the_answer():
    """ Return the answer to life, the universe, and everything.
    
    Returns:
            int 
    """
    return 42
File:      /var/folders/3j/22yv_sj10t96slp1b8b6z1zw0000gn/T/ipykernel_1454/3516594323.py
Type:      function


# Code smells and refactoring

### DRY ("don't repeat yourself") principle

- It is totally normal to copy and paste a bit of code, tweak it slightly, and re-run it. However, this kind of repeated code can lead to real problems.
    - One of the problems with copying and pasting is that it is easy to accidentally introduce errors that are hard to spot.
- Repeated code like this is a good sign that **you should write a function**.
    - Wrapping the repeated logic in a function and then calling that function several times makes it much easier to avoid the kind of errors introduced by copying and pasting. 

### DRY - Example

Instead of this:

```Python
# Standardize the GPAs for each year
df['y1_z'] = (df.y1_gpa - df.y1_gpa.mean()) / df.y1_gpa.std()
df['y2_z'] = (df.y2_gpa - df.y2_gpa.mean()) / df.y2_gpa.std()
df['y3_z'] = (df.y3_gpa - df.y3_gpa.mean()) / df.y3_gpa.std()
df['y4_z'] = (df.y4_gpa - df.y4_gpa.mean()) / df.y4_gpa.std()
```

You should write a function:

```Python
def standardize(column):
  """Standardize the values in a column.

  Args:
    column (pandas Series): The data to standardize.

  Returns:
    pandas Series: the values as z-scores
  """
  # Compute the z-scores: subtract the mean, divide by the standard deviation
  z_score = (column - column.mean()) / column.std()
  return z_score

# Use the standardize() function to calculate the z-scores
df['y1_z'] = standardize(df.y1_gpa)
df['y2_z'] = standardize(df.y2_gpa)
df['y3_z'] = standardize(df.y3_gpa)
df['y4_z'] = standardize(df.y4_gpa)
```

## "Do One Thing"

- Instead of one big function, for example a `load_and_plot()` function that both loads the data and plots it, we could have a more nimble function that just loads the data and a second one for plotting.
- We get several advantages from splitting `load_and_plot()` into two smaller functions.
    - First of all, our code becomes more flexible.
    - The code will also be easier for other developers to understand.
    - It will be more pleasant to test and debug.
    - Finally, if you ever need to update your code, functions that each have a single responsibility make it easier to predict how changes in one place will affect the rest of the code.


### Do One thing - Example

This violates the "Do One Thing" principle:

```Python
def mean_and_median(values):
  """Get the mean and median of a sorted list of values

  Args:
    values (iterable of float): A list of numbers

  Returns:
    tuple (float, float): The mean and median
  """
  mean = sum(values) / len(values)
  midpoint = int(len(values) / 2)
  if len(values) % 2 == 0:
    median = (values[midpoint - 1] + values[midpoint]) / 2
  else:
    median = values[midpoint]

  return mean, median
```

Instead, split it up and write one function for the mean and one for the median:

```Python
def mean(values):
  """Get the mean of a sorted list of values

  Args:
    values (iterable of float): A list of numbers

  Returns:
    float
  """
  # Compute the mean
  mean = sum(values) / len(values)
  return mean
```

```Python
def median(values):
  """Get the median of a sorted list of values

  Args:
    values (iterable of float): A list of numbers

  Returns:
    float
  """
  # Compute the median: middle value, or the average of the two middle values
  midpoint = int(len(values) / 2)
  if len(values) % 2 == 0:
    median = (values[midpoint - 1] + values[midpoint]) / 2
  else:
    median = values[midpoint]
  return median
```

### Pandas Example

In [10]:
import pandas as pd

data = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

- A good practice is to split a large function, here a hypothetical `process_data` that does everything at once, into smaller functions that each do only one thing. In the code below, I split the work of `process_data` into 4 different functions and apply these functions to a pandas DataFrame in order using `pipe`.

In [11]:
def create_a_copy(df: pd.DataFrame):
    return df.copy()


def add_new_features(df: pd.DataFrame):
    df["c"] = [1, 1, 1]
    return df


def add_one(df: pd.DataFrame):
    df["a"] = df["a"] + 1
    return df


def sum_all_columns(df: pd.DataFrame):
    df["sum"] = df.sum(axis=1)
    return df


(data
    .pipe(create_a_copy)
    .pipe(add_new_features)
    .pipe(add_one)
    .pipe(sum_all_columns)
)

|   | a | b | c | sum |
|---|---|---|---|-----|
| 0 | 2 | 4 | 1 | 7   |
| 1 | 3 | 5 | 1 | 9   |
| 2 | 4 | 6 | 1 | 11  |


# Mutable default arguments are dangerous!
Finally, here is a thing that can get you into trouble. `foo()` is a function that appends the value 1 to the end of a list. But whoever wrote this function gave the argument an empty list as a default value. When we call `foo()` the first time, we get what you would expect: a list with one entry.

In [71]:
def foo(var=[]):
    var.append(1)
    return var

foo()

[1]

But, when we call foo() again, the default value has already been modified! 

In [72]:
foo()

[1, 1]
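
The reason is that Python evaluates a default value once, when the `def` statement runs, and stores it on the function object, so every call without an argument shares the same list. The sketch below checks this directly; `append_one` is an illustrative stand-in with the same body as the function above.

```python
def append_one(var=[]):          # illustrative stand-in with a mutable default
    var.append(1)
    return var


first = append_one()
second = append_one()

# Both calls returned the very same list object
print(first is second)

# The stored default itself has been modified by the two calls
print(append_one.__defaults__)
```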

If you really want a mutable variable as a default value, consider defaulting to None and setting the argument in the function.

In [73]:
def foo(var=None):
    if var is None:
        var = []
    var.append(1)
    return var

foo()

[1]

In [74]:
foo()

[1]

# Context managers

Any time you use a context manager, it will look like this. 
- The keyword "with" lets Python know that you are trying to enter a context.
- Then you call a function. You can call any function that is built to work as a context manager.
- A context manager can take arguments like any normal function.
- You end the "with" statement with a colon as if you were writing a for loop or an if statement.
- Any code that you want to run inside the context that the context manager created needs to be indented.
- When the indented block is done, the context manager gets a chance to clean up anything that it needs to, like when the "open()" context manager closed the file.

```Python
with <context-manager>(<args>): 
    # Run your code here
    # This code is running "inside the context"
```

Some context managers want to return a value that you can use inside the context. \
By adding "as" and a variable name at the end of the "with" statement, you can assign the returned value to the variable name. 

```Python
with <context-manager>(<args>) as <variable-name>: 
    # Run your code here
    # This code is running "inside the context"
```

### Context managers - example

```Python
with open('my_file.txt') as my_file: 
    text = my_file.read()
    length = len(text)

print('The file is {} characters long'.format(length))
```

- The "open()" function can be used as a context manager, because the file object it returns knows how to close itself. When you write "with open()", it opens a file that you can read from or write to. 
- Then, it gives control back to your code so that you can perform operations on the file object. 
- In this example, we read the text of the file, store the contents of the file in the variable "text", and store the length of the contents in the variable "length". When the code inside the indented block is done, the context manager makes sure that the file is closed before continuing on in the script. 
- The print call is outside of the context, so by the time it runs the file is closed.
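
You can also build your own context manager. The following is a minimal sketch, assuming the standard library's `contextlib.contextmanager` helper; the `timer` function and the `info` dictionary are hypothetical names chosen for illustration. Everything before `yield` runs when the context is entered, the yielded object is what `as` binds to, and the `finally` part is the clean-up that runs when the indented block is done.

```python
import time
from contextlib import contextmanager


@contextmanager
def timer():
    """Measure how long the code inside the with-block takes (illustrative)."""
    info = {}
    start = time.time()          # set-up: runs when the context is entered
    try:
        yield info               # this object is bound by "as"
    finally:
        # clean-up: runs when the indented block finishes, even after an error
        info['elapsed'] = time.time() - start


with timer() as t:
    total = sum(range(1000))     # this code runs "inside the context"

print(total)
print('Elapsed:', t['elapsed'], 's')
```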

## Multiple function parameters

In [2]:
def raise_to_power(value1,value2): #<-- 2 parameters in function header
    """Raise value1 to the power of value2."""
    new_value = value1**value2
    return new_value

In [3]:
result = raise_to_power(2,3)
print(result)

8


### Making functions return multiple values: TUPLES!
Tuples:
- like a list - can contain multiple values
- immutable - can't modify values
- constructed using parentheses () 
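
A short sketch of what "immutable" means in practice (the variable names are illustrative): assigning to an element of a tuple raises a `TypeError`, whereas a list built from the same values can be changed in place.

```python
nums = (2, 4, 6)

try:
    nums[0] = 10              # tuples do not support item assignment
except TypeError as err:
    print('TypeError:', err)

# A list with the same values can be modified in place
nums_list = list(nums)
nums_list[0] = 10

print(nums_list)
print(nums)                   # the original tuple is unchanged
```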

#### Unpack tuple

In [8]:
even_nums = (2,4,6)
a, b, c = even_nums # unpacking needs one name per tuple element

result = raise_to_power(a,b)
print(result)

16


#### Accessing tuple elements

In [10]:
even_nums = (2,4,6)
print(even_nums[1]) # second element

4


### Returning Multiple Values (using Tuples)

In [11]:
def raise_both(value1,value2): #<-- 2 parameters in function header
    """Raise value1 to the power of value2 and vice versa"""
    
    new_value1 = value1**value2
    new_value2 = value2**value1
    
    new_tuple = (new_value1, new_value2)
    
    return new_tuple

In [12]:
result = raise_both(2,3)
print(result)

(8, 9)


## Scope in function

- not all objects are accessible everywhere in a program/script
- scope - the part of a program where an object or name may be accessible
- 3 types of scope:
    - global scope - defined in the main body of a script
    - local scope - defined inside a function
    - built-in scope - names in the pre-defined built-ins module (e.g. print())
    
If a name is not found locally, Python will search in enclosing functions, then in the global scope and lastly in the built-in scope.
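
The search order can be sketched as follows. All function and variable names here are illustrative; each function looks up a name that it does not define itself.

```python
x = 'global'


def outer():
    x = 'enclosing'

    def inner():
        # No local x here, so Python uses the x of the enclosing function
        return x

    return inner()


def use_global():
    # No local or enclosing x, so the global x is used
    return x


def use_builtin():
    # len is defined neither locally nor globally, so the built-in is used
    return len('abc')


print(outer())
print(use_global())
print(use_builtin())
```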

## Nested functions 

- **WHY**: if we want to use a process multiple times within a function 

In [14]:
def mod2plus5(x1,x2,x3):
    """Returns the remainder plus 5 of three values."""
    
    def inner(x):
        """Returns the remainder plus 5 of a value."""
        return x % 2 + 5
    
    return (inner(x1),inner(x2),inner(x3))

In [16]:
print(mod2plus5(1,2,3))

(6, 5, 6)


# Python Pass Statement
- If you want to create code that does a particular thing but don’t know how to write that code yet, put it in a function and use `pass` as the function body.

- Once you have finished writing the code at a high level, go back to the functions and replace `pass` with the code for each function. This prevents your train of thought from being disrupted.

In [9]:
def say_hello():
    pass 

def ask_to_sign_in():
    pass 

def main(is_user: bool):
    if is_user:
        say_hello()
    else:
        ask_to_sign_in()

main(is_user=True)

# Arguments

## Add default argument

In [18]:
def power(number, pow=1):
    """Raise number to the power of pow."""
    new_value = number ** pow
    return new_value

power(9) # function will use default as parameter 2

9

## Flexible arguments: *args

- if we aren't sure how many positional arguments will be passed

In [19]:
def add_all(*args):
    """ Sum all values in *args together."""
    
    # Initialize sum
    sum_all = 0
    
    # Accumulate the sum
    for num in args:
        sum_all += num
        
    return sum_all

In [20]:
add_all(5,10,15,20)

50

## Flexible arguments: **kwargs
- use a double star to pass an arbitrary number of keyword arguments, i.e. arguments preceded by identifiers.
- to write such a function, we use the parameter `kwargs` preceded by a double star. This turns the identifier-value pairs into a dictionary within the function body.

In [24]:
def print_all(**kwargs):
    """Print out key-value pairs in **kwargs."""
    
    # Print out the key-value pairs
    for key, value in kwargs.items():
        print(f"{key}: {value}") # an f-string also works when value is not a string

In [25]:
print_all(name='dumbledore', job='headmaster')

name: dumbledore
job: headmaster


## Lambda functions - One-Line Functions
- Lambda functions are a quick way to write simple functions in one line. They will come in handy, especially when you're writing and maintaining big programs.

In [26]:
raise_to_power = lambda x, y: x**y
raise_to_power(2,3)

8

### Lambda functions - Application: Anonymous functions
- The best use case for lambda functions, however, is when you want these simple functionalities to be anonymously embedded within larger expressions.
- The function `map()` takes two arguments: `map(func, seq)`.
- `map()` applies the function to ALL elements in the sequence.

In [1]:
nums = [48, 6, 9, 21, 1]

square_all = map(lambda num: num**2, nums)

# print(square_all) would only show a map object, not the values
print(list(square_all)) # <-- convert to a list to see the result

[2304, 36, 81, 441, 1]


### Lambda functions - Application: filtering & reduce
- The function `filter()` offers a way to filter out elements from a list that don't satisfy certain criteria.
- The `reduce()` function is useful for performing some computation on a list and, unlike `map()` and `filter()`, returns a single value as a result.

In [31]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'pippin', 'aragorn', 'boromir', 'legolas', 'gimli', 'gandalf']

# Use filter() to apply a lambda function over fellowship: result
result = filter(lambda member: len(member) > 6, fellowship)

# Convert result to a list: result_list
result_list = list(result)

# Print result_list
print(result_list)

['samwise', 'aragorn', 'boromir', 'legolas', 'gandalf']


Instead of this:

In [36]:
# Define function
def sol1(*args):
    """Concatenate strings in *args together."""
    result = ''
    for member in args:
        result += member
    return result

sol1('robb', 'sansa', 'arya', 'brandon', 'rickon')

'robbsansaaryabrandonrickon'

... We can use `reduce()`

In [39]:
# Import reduce from functools
from functools import reduce

# Create a list of strings: stark
stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

# Use reduce() to apply a lambda function over stark: result
result = reduce(lambda item1, item2: item1+item2 , stark)

# Print the result
print(result)

robbsansaaryabrandonrickon


### Decorator in Python
- Do you want to add the same block of code to different functions in Python? If so, try a decorator.

In [4]:
import time 

def time_func(func):
    def wrapper():
        print("This happens before the function is called")
        start = time.time()
        func() # <-- the function!
        end = time.time() # stop the clock before printing anything else
        print('This happens after the function is called')
        print('The duration is', end - start, 's')

    return wrapper

Now all I need to do is to add `@time_func` before the function `say_hello`

In [5]:
@time_func
def say_hello():
    print("hello")

say_hello()

This happens before the function is called
hello
This happens after the function is called
The duration is 1.71661376953125e-05 s


Decorators make the code clean and shorten repetitive code. If I want to track the time of another function, for example `func2()`, I can just use:

In [6]:
@time_func
def func2():
    pass
func2()

This happens before the function is called
This happens after the function is called
The duration is 9.775161743164062e-06 s
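
The wrapper above takes no arguments and discards the return value, so it only fits functions like `say_hello()`. A common, more general pattern is sketched below under the assumption that you want to time functions with arbitrary arguments; `time_any` and `add` are hypothetical names. It forwards `*args` and `**kwargs`, passes the return value through, and uses `functools.wraps` so that the decorated function keeps its own name and docstring.

```python
import time
from functools import wraps


def time_any(func):
    """Time a function that takes arbitrary arguments (illustrative variant)."""
    @wraps(func)                          # keep func's name and docstring
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)    # forward all arguments
        end = time.time()
        print('The duration is', end - start, 's')
        return result                     # pass the return value through
    return wrapper


@time_any
def add(a, b=0):
    """Add two numbers."""
    return a + b


print(add(2, b=3))
print(add.__name__)
```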


# Why Write Functions?
User-defined functions are important for improving the clarity of your code by
- separating different strands of logic
- facilitating code reuse

### Underscore(_): Ignore Values That Will Not Be Used
- When assigning the values returned from a function, you might want to ignore some values that are not used in future code. If so, assign those values to underscores `_`

In [7]:
def return_two():
    return 1, 2

_, var = return_two()
var

2

### Underscore “_”: Ignore The Index in Python For Loops
- If you want to repeat a loop a specific number of times but don’t care about the index, you can use `_`

In [8]:
for _ in range(5):
    print('Hello')

Hello
Hello
Hello
Hello
Hello


# Error handling

## Errors and exceptions

In [82]:
def sqrt(x):
    """Returns the square root of a number."""
    try:
        return x ** 0.5
    except TypeError: 
        print('x must be an int or float')

In [83]:
sqrt('hi')

x must be an int or float


## Raise an error

- say we don't want the user to pass negative numbers

In [47]:
def sqrt(x):
    """Returns the square root of a number."""
    try:
        # the comparison is inside try so that a non-numeric x is still caught
        if x < 0: 
            raise ValueError('x must be non-negative')
        return x ** 0.5
    except TypeError: 
        print('x must be an int or float')

In [48]:
sqrt(-2)

ValueError: x must be non-negative
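
An error that a function raises can in turn be handled by the calling code. The following is a minimal sketch; `safe_sqrt` is an illustrative variant of the function above, and the input values are made up. The loop catches the `ValueError` for the negative input and carries on instead of stopping.

```python
def safe_sqrt(x):
    """Return the square root of x, raising ValueError for negative input."""
    if x < 0:
        raise ValueError('x must be non-negative')
    return x ** 0.5


results = []
for value in [4, -2]:
    try:
        results.append(safe_sqrt(value))
    except ValueError as err:
        # the message passed to ValueError is available through err
        print(f'Skipping {value}: {err}')
        results.append(None)

print(results)
```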

# Describing a utility function
$$u(x_1,x_2)=x_1^{\alpha}x_2^{1-\alpha}$$

In [10]:
# x1 and x2 are positional arguments
# alpha is a keyword argument with default value

def u_func(x1,x2,alpha=0.5):
    return x1**alpha*x2**(1-alpha)

## Evaluations (Print to Screen)

In [11]:
x1 = 1
x2 = 3
u = u_func(x1,x2)

print(f'x1 = {x1:.3f}, x2 = {x2:.3f} -> u = {u:.3f}')

x1 = 1.000, x2 = 3.000 -> u = 1.732


### Print multiple evaluations

In [12]:
x1_list = [2,4,6,8]
x2 = 3 

for x1 in x1_list: 
    u = u_func(x1,x2, alpha=0.25)
    print(f'x1 = {x1:.3f}, x2 = {x2:.3f} -> u = {u:.3f}')

x1 = 2.000, x2 = 3.000 -> u = 2.711
x1 = 4.000, x2 = 3.000 -> u = 3.224
x1 = 6.000, x2 = 3.000 -> u = 3.568
x1 = 8.000, x2 = 3.000 -> u = 3.834


A nicer way to implement it:

In [13]:
for i, x1 in enumerate(x1_list):
    u = u_func(x1,x2, alpha=0.25)
    print(f'{i:1d}: x1 = {x1:<3d} x2 = {x2:<2d} -> u = {u:<6.3f}')
    
# {i:1d}: integer with a width of 1
# {x1:<3d}: integer with a width of 3 (<, left-aligned)
# {u:<6.3f}: float with a width of 6 and 3 decimals (<, left-aligned)

0: x1 = 2   x2 = 3  -> u = 2.711 
1: x1 = 4   x2 = 3  -> u = 3.224 
2: x1 = 6   x2 = 3  -> u = 3.568 
3: x1 = 8   x2 = 3  -> u = 3.834 


## Printing outputs to a file
Imagine you wanted to store outputs from your model in order to put them into a paper. Then you want them in a file.

1. Create a text file using the `with` statement. 
2. Write to the file in a loop using the reference variable `file_ref`:

In [14]:
with open('somefile.txt', 'w') as file_ref: # 'w' is short for 'write'
    
    for i, x1 in enumerate(x1_list):
        # Calculate utility at loop iteration
        u = u_func(x1,x2,alpha=0.25)
        
        # Create a formatted line of text
        text_line = f'{i+10:2d}: x1 = {x1:<6.3f} x2 = {x2:<6.3f} -> u = {u:<6.3f}'
        
        # Write the line of text to the file
        file_ref.write(text_line + '\n') 
    
# note: the with clause ensures that the file is properly closed afterwards    

You can also **read** from a file in the same manner, just using `r` instead of `w`.  
Open a text-file and read the lines in it and then print them:

In [15]:
with open('somefile.txt', 'r') as file_ref: # 'r' is for 'read'
    
    # load ALL file content into the object lines
    lines = file_ref.readlines()
    
    # Printing each loaded line by loop
    for line in lines:
        print(line,end='') # end='' removes the extra line break print() would add

10: x1 = 2.000  x2 = 3.000  -> u = 2.711 
11: x1 = 4.000  x2 = 3.000  -> u = 3.224 
12: x1 = 6.000  x2 = 3.000  -> u = 3.568 
13: x1 = 8.000  x2 = 3.000  -> u = 3.834 
