<a href="https://colab.research.google.com/github/gg5d/DS-1002/blob/main/17_functions_student_F23.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Functions: Built-In and User-Defined


### University of Virginia
### Programming for Data Science
---  

### PREREQUISITES
- data types
- variables
- `if` statement
- conditional operators
- `list comprehension` (not essential here)


### SOURCES
- Python built-in functions  
https://docs.python.org/3/library/functions.html


- Some good function examples and details  
https://www.w3schools.com/python/python_functions.asp


- Default arguments  
https://www.geeksforgeeks.org/default-arguments-in-python/  


- Details on the function `return` statement  
https://realpython.com/python-return-statement/#understanding-the-python-return-statement


- Global versus Local Variables  
https://www.geeksforgeeks.org/global-local-variables-python/?ref=lbp


### OBJECTIVES
- Explain the benefits of functions
- Illustrate how to use built-in functions
- Illustrate how to create and use your own (user-defined) functions
- Demonstrate the scope and lifetime of a variable
- Illustrate global and local nature of variables through functions
- Demonstrate function parameter use
- Provide recommendations on how to create and document functions
- Show how to print and write docstrings


### CONCEPTS

- functions
- built-in functions
- user-defined functions
- variable scope
- global versus local variables
- default arguments
- *args
- function call
- docstring

---

## I. Introduction to Functions

Functions take input and produce output. They contain a block of code to do their work.  
They make transformations happen...from simple to complex.

Python provides many built-in functions.  
There are also many packages to bring in additional functions.

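NOTE: function inputs are called both `parameters` and `arguments`. Strictly speaking, parameters are the names in the function definition and arguments are the values passed in the call, but the terms are often used interchangeably.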

**A SMALL EXAMPLE**

This small function is built in to Python: `bool`

Documentation: [Python built-in functions](https://docs.python.org/3/library/functions.html)  

Takes an argument $x$ and returns True or False.  

In [None]:
# set a variable and pass a comparison into bool()

x = 3
bool(x < 4)

In [None]:
bool(x >= 4)

In [None]:
help(bool)
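
`bool` is not limited to comparisons. As a small sketch (the sample values below are chosen for illustration and are not part of the lesson), it converts any object using Python's truth-testing rules: zero, `None`, and empty containers give False, while most other values give True.

```python
# Sample values are illustrative, not from the lesson.
samples = [0, 1, "", "text", [], [3, 4], None]

# Map each value's printed form to its truth value.
truthiness = {repr(value): bool(value) for value in samples}

for label, result in truthiness.items():
    print(label, "->", result)
```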

Let's get more elaborate, defining x to be a list of integers. This breaks.

In [None]:
x = [3,4]        # define a list of integers
thr = 4          # define an integer
bool(x >= thr)   # try to compare each value to an integer

This broke because the `>=` operator doesn't support comparing a list to an integer, so Python raises a `TypeError`.  
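
To inspect the failure without halting the notebook, the comparison can be wrapped in `try`/`except`. This is only a sketch; `error_name` is a name introduced here for illustration.

```python
x = [3, 4]
thr = 4

try:
    bool(x >= thr)          # list >= int is not supported
    error_name = None       # only reached if no error occurs
except TypeError as err:
    error_name = type(err).__name__
    print(error_name, ":", err)
```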

## II. Creating Functions

Let's write a function to compare the list against a threshold.

In [None]:
def vals_greater_than_or_equal_to_threshold(vals, thresh):
    '''
    PURPOSE: Given a list of values, compare each value against a threshold

    INPUTS
    vals    list of ints or floats
    thresh  int or float

    OUTPUT
    bools  list of booleans
    '''

    bools = [val >= thresh for val in vals]
    return bools

**Let's break down the components:**
- the function definition starts with `def`, followed by the function name, zero or more parameters in parentheses, and then a colon.
- next comes a `docstring` to provide annotation
- the function body follows
- lastly is a `return` statement

The `function call` allows for function use. It consists of function name and required arguments:

`vals_greater_than_or_equal_to_threshold(arg1, arg2)` where `arg1`, `arg2` are arbitrary names.

#### docstring

- A `docstring` is a string that occurs as the first statement in a module, function, class, or method definition
- Saved in `__doc__` attribute
- Needs to be indented
- ``` '''enclosed in triple quotes like this''' ```

**We gave this function a descriptive docstring to:**

- explain its purpose
- name each input and output, and give their data types

---

The function body used a `list comprehension` for the compare:

`[val >= thresh for val in vals]`
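
For readers new to list comprehensions, the same comparison can be written as an explicit `for` loop. The sketch below reuses the values from the earlier cell; `bools_comprehension` and `bools_loop` are names introduced here for illustration.

```python
vals = [3, 4]
thresh = 4

# list comprehension, as in the function body
bools_comprehension = [val >= thresh for val in vals]

# equivalent for loop that builds the list one element at a time
bools_loop = []
for val in vals:
    bools_loop.append(val >= thresh)

print(bools_comprehension)
print(bools_loop)
```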

**Let's test our function**

In [None]:
# validate that it works for ints

x = [3, 4]
thr = 4

vals_greater_than_or_equal_to_threshold(x, thr)

In [None]:
# validate that it works for floats

x = [3.0, 4.2]
thr = 4.2

vals_greater_than_or_equal_to_threshold(x, thr)

This gives correct results and does exactly what we want.  

**print the docstring**

`__doc__` is one of Python's special ("dunder") attributes. Python reserves names with leading and trailing double underscores for special attributes and methods, such as those used for operator overloading. This convention distinguishes them from user-defined names.

In [None]:
print(vals_greater_than_or_equal_to_threshold.__doc__)

**print the help**

In [None]:
help(vals_greater_than_or_equal_to_threshold)

---

**TRY FOR YOURSELF (UNGRADED EXERCISES)**

1) Write a function with these requirements:
- has a sensible name
- contains a docstring
- takes two inputs: a string, and an integer
- returns True if the string length is equal to the integer, else False

2) Call the function, passing inputs:  
- `"is this text the right length?"` for the string
- `30` for the integer

Verify the output is True.  
Try other combinations.

---


## III. Passing Parameters

**Functions need to be called with correct number of parameters**
  
This function requires two parameters, but the function call passes only one, so Python raises a `TypeError`.


In [None]:
# function requiring 2 arguments
def fcn_bad_args(x, y):
    return x+y

# function call with only 1 of the 2 arguments
fcn_bad_args(10)

**When calling a function, parameter order matters.**

In [None]:
x = 1
y = 2

# function with order of x then y
def fcn_swapped_args(x, y):
    out = 5 * x + y
    return out

# call function in correct order
print('fcn_swapped_args(x,y) =', fcn_swapped_args(x,y))

# call function in incorrect order
print('fcn_swapped_args(y,x) =', fcn_swapped_args(y,x))

Generally it's best to keep parameters in order.  

You can pass the arguments in any order by naming the parameters in the function call (keyword arguments).

In [None]:
x1 = 1
y1 = 2

# call parameter names in function call
fcn_swapped_args(y=y1, x=x1)

**Weirdness Alert**

Note that the same name can be used for a parameter and for the variable passed to it.

The names themselves have nothing to do with each other!

In other words, just because a function names a parameter `x`, \
the variable passed to it doesn't have to be named `x` or anything like it. \
The two can even share the same name; it does not matter.

In [None]:
foo = 1
bar = 2

fcn_swapped_args(foo, bar)

# works even though function was written as fcn_swapped_args(x, y)

## IV. Unpacking list-likes using `*args`

Putting `*` before a parameter name in a function definition lets the function collect its arguments without specifying them individually.

In [None]:
def show_arg_expansion(*models):

    print("models          :", models)
    print("input arg type  :",  type(models))
    print("input arg length:", len(models))
    print("-----------------------------")

    for mod in models:
        print(mod)



We can pass several separate values to the function, and they are collected into a tuple...

In [None]:

show_arg_expansion("logreg", "naive_bayes", "gbm")

You can pass a list to the function.

If you want the elements unpacked, put * before the list.

In [None]:
models = ["logreg", "naive_bayes", "gbm"]
show_arg_expansion(*models)

**This approach allows your function to accept an arbitrary number of arguments**

In [None]:
# unpack the list with * so each element becomes its own argument
show_arg_expansion(*'a b c d e f g'.split())
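
As a further sketch of accepting any number of arguments, the hypothetical function `join_labels` below (not part of the lesson) works with zero, one, or many values, because `*labels` always arrives as a tuple.

```python
def join_labels(*labels):
    """Join any number of string labels into one comma-separated string."""
    return ", ".join(labels)

no_labels = join_labels()                                 # empty tuple
one_label = join_labels("logreg")                         # tuple of length 1
three_labels = join_labels("logreg", "naive_bayes", "gbm")

print(repr(no_labels))
print(one_label)
print(three_labels)
```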

In [None]:
def arg_expansion_example(x, y):
    return x**y

**The reverse is true, too.**

You can use the `*` operator to pass list-like objects to a function that specifies its arguments.

In [None]:
my_args = [2, 8]
arg_expansion_example(*my_args)

But, the passed object must be the right length.

In [None]:
my_args2 = [2, 8, 5]
arg_expansion_example(*my_args2)

---

**TRY FOR YOURSELF (UNGRADED EXERCISES)**

3) Write a function with these requirements:
- takes *args for the input argument (like above)
- squares each argument, printing the value. You can use a `for` loop like above.
- returns None

Next, call the function, passing at least two integers

---


## V. Default Arguments

`default arguments` give a parameter a value to use when the function call leaves it unspecified.

In [None]:
def show_results(precision, printing=True):
    precision = round(precision, 2)

    if printing:
        print('precision =', precision)
    return precision

In [None]:
pr = 0.912
res = show_results(pr)

In [None]:
res

The function call didn't specify `printing`, so it defaulted to True.
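
The default can be overridden by passing the argument explicitly. The sketch below repeats the definition of `show_results` so that it runs on its own; `quiet` and `loud` are names introduced here for illustration.

```python
def show_results(precision, printing=True):
    precision = round(precision, 2)

    if printing:
        print('precision =', precision)
    return precision

quiet = show_results(0.912, printing=False)   # override the default: nothing is printed
loud = show_results(0.912)                    # default printing=True: prints the value
```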

Default arguments must follow non-default arguments. This causes trouble:

In [None]:
def show_results(precision, printing=True, uhoh):
    precision = round(precision, 2)

    if printing:
        print('precision =', precision)
    return precision
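
The definition above fails with a `SyntaxError`. One possible fix, sketched here, is to move the non-default parameter `uhoh` ahead of the default one. The parameter is unused, as in the original cell; the string passed for it is a placeholder.

```python
def show_results(precision, uhoh, printing=True):
    # uhoh is kept only to mirror the failing example; it is not used
    precision = round(precision, 2)

    if printing:
        print('precision =', precision)
    return precision

res = show_results(0.912, "placeholder", printing=False)
print(res)
```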

## VI. Returning Values

Functions are not required to have a return statement.
If there is no return statement, the function returns the `None` object.  

Functions can return no value (`None` object), one value, or many.  

Any Python object can be returned.  

In [None]:
# returns None, and prints.

def fcn_nothing_to_return(x, y):
    out = 'nothing to see here!'
    print(out)

In [None]:
x = 1
y = 2

fcn_nothing_to_return(x, y)

In [None]:
r = fcn_nothing_to_return(1, 1)
print(r)

In [None]:
# returns three values

def negate_coords(x, y, z):
    return -x, -y, -z

In [None]:
a,b,c = negate_coords(10,20,30)
print('a=', a)
print('b=', b)
print('c=', c)
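
When a function returns several values separated by commas, Python packs them into a single tuple, which the caller can then unpack. The sketch below repeats `negate_coords` so that it runs on its own; `result` is a name introduced here for illustration.

```python
def negate_coords(x, y, z):
    return -x, -y, -z

result = negate_coords(10, 20, 30)   # keep the return value whole
print(type(result))
print(result)

a, b, c = result                     # unpack the tuple into three names
```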

**If you don't need an output, use the dummy variable `_`**

In [None]:
d,e,_ = negate_coords(10,20,30)
print('d=', d)
print('e=', e)

**Note:** For clarity purposes, it's generally a good idea to include return statements, even if not returning a value.  
You can use `return` or `return None`.

**Functions can contain multiple return statements**

In [None]:
def absolute_value(num):
    if num >= 0:
        return num
    return -num

In [None]:
absolute_value(-4)

In [None]:
absolute_value(4)

For non-negative values, the first `return` is reached.  
For negative values, the second `return` is reached.

---

**TRY FOR YOURSELF (UNGRADED EXERCISES)**

4) Write a small function that returns two outputs. Verify it works properly.

5) Define a step function with these requirements:
- take one numeric value as input
- subtract 5 from the value
- return 1 if the difference is nonnegative, else return 0

Call the function on different values to test it.

---


## VII. Variable Scope

A variable's **scope** is the part of a program where it is **visible**.

Visible means available or usable.

If a variable is **in scope** to a function, it is visible to the function.

If it is **out of scope** to a function, it is not visible to the function.

When a variable is defined inside of a function, it is not visible outside of the function.

We say such variables are **local** to the function.

Local variables are also discarded when the function completes.

In [None]:
def show_scope(x):
    x = 10*x
    z = 4
    print('z inside function =', z)
    print('memory address of z inside function =', id(z))
    return x

In [None]:
show_scope(6)

In [None]:
print('z =', z)

`z` is recognized inside the function.

Referencing it from outside, where it isn't defined, raises a `NameError`.

If we define `z` and call the function, the update to `z` won't pass outside the function.

In [None]:
z = 2
print('z outside:', id(z))
out = show_scope(6)
print('z = ', z)

### Local versus Global Variables

It is helpful to have a good understanding of local versus global variables.  

Not having this understanding can lead to surprises and confusion.  

**Example 1: Variable defined outside function, used inside function**

In the code below:  

`x` is global and seen from inside the function.  
`t` is local to the function. Trying to print it outside the function raises a `NameError`.

In [None]:
x = 10

def fcn(t):
    out = x + t
    return out


In [None]:
print(fcn(6)) # works

In [None]:
print(t)      # fails

**Example 2: Variable defined outside function, updated and used inside function**

`fcn` uses the local version of `x`

In [None]:
x = 10

def fcn(t):
    x = 20
    total = x + t    # avoid the name `sum`, which shadows the built-in function
    print('x from fcn:', x)
    return total

print('fcn(6):', fcn(6))
print('x:', x)

**Example 3: Variable defined outside function. Inside function, print variable, update, and use**

This one may be confusing. It fails!  

Because `x` is assigned somewhere inside the function, Python treats every `x` in the function as the local `x`.  
The print() occurs before the local `x` is assigned, so Python raises an `UnboundLocalError`.

In [None]:
x = 10

def fcn(t):
    print('x from fcn, before update:', x)
    x = 20
    out = x + t
    print('x from fcn, after update:', x)
    return out

print('fcn(6):', fcn(6))
print('x:', x)

The error can be fixed by declaring `x` as `global` inside the function.

This is only necessary if we wish to reassign the variable inside the function.

It is also useful when we want several functions to operate on the same variable.

In [None]:
x = 10

def fcn(t):
    global x    # add this to reference global x outside function
    print('x from fcn, before update:', x)
    x = 20
    out = x + t
    print('x from fcn, after update:', x)
    return out

print('fcn(6):', fcn(6))
print('x:', x)
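
As a sketch of several functions operating on the same variable, the hypothetical functions `increment` and `reset` below both declare `counter` as `global`, so each one reassigns the same module-level name. None of these names appear in the lesson.

```python
counter = 0   # global variable shared by both functions

def increment():
    global counter          # reassign the global, not a new local
    counter = counter + 1

def reset():
    global counter
    counter = 0

increment()
increment()
after_two = counter         # value after two increments

reset()
after_reset = counter       # value after the reset

print(after_two, after_reset)
```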

---
**TRY FOR YOURSELF (UNGRADED EXERCISES)**

6) Define a function that creates and prints a variable.  
Show that calling this variable outside the function produces an error.

---

##  VIII. Function Design


Some good practices for creating and using functions:

- design a function to do one thing

Make them as simple as possible, which makes them:
- more comprehensible
- easier to maintain
- reusable

This helps avoid situations where a team has 20 variations of similar functions.

Give your function a good name  

- it should reflect the action it performs.
- be consistent in naming conventions
- a name like `compute_variances_sort_save_print` suggests the function is overworked!
- if the function `compute_variances` also produces plots and updates variables, it will cause confusion.  
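
The advice above can be sketched with two small functions, each doing one thing. The names `compute_variance` and `print_variance` and the sample data are assumptions made for illustration; the population variance is used here only as a simple example calculation.

```python
def compute_variance(values):
    """Return the population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def print_variance(values):
    """Print the variance; the calculation is delegated to compute_variance."""
    print('variance =', compute_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]
print_variance(data)
```

Keeping the calculation separate from the printing means `compute_variance` can be reused anywhere a number is needed, without side effects.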

Always give your function a docstring
- This is particularly important because Python does not require you to indicate data types.  
- As a side note, you can include this information by using `type annotation`.

https://docs.python.org/3/library/typing.html

In [None]:
def fun(x: str) -> list:
    return list(x)
fun("eclipse")
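
Type annotations are documentation for readers and tools; Python does not check them when the function runs. The sketch below uses a hypothetical function `double` to show that a call with a different type still executes, and that the annotations are stored in the `__annotations__` attribute.

```python
def double(x: int) -> int:
    return x * 2

print(double.__annotations__)   # the annotations are stored on the function

as_annotated = double(4)        # matches the annotation
not_checked = double("ab")      # a str is accepted too: annotations are not enforced

print(as_annotated)
print(not_checked)
```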

Function docstrings are stored in attribute `__doc__`; they can be shown like this:

In [None]:
print(bool.__doc__)

---