<div class="row">
    <div class="columm">
        <h1 style="position: absolute;font-size: 300%;">Vignette: User Defined Functions</h1>
    </div>
    <div class="column">
        <img src="https://www.bjss.com/wp-content/uploads/BJSS.svg"
             alt="BJSS Logo"
             align="right" 
             width = "200"
             style="margin: 0px 60px"
             />
    </div>
</div>

## Table of Contents

<a href="#Background"><font size="+1">Background</font></a>
* Learning objectives
* Motive
* Types of functions in python 

<a href="#Creating-and-Using-Functions"><font size="+1">Creating and Using Functions</font></a>
* Creating user defined functions
* Calling functions
* <a href="#Exercise-1:-Create-a-function-which...">Exercise 1: Create a function which...</a> 
* Anticipating special cases
    
<a href="#Deeper-Understanding"><font size ="+1">Deeper Understanding</font></a>
* Parameter vs argument
* Types of arguments 
* Scope 
* Writing a docstring

<a href="#Lambda-Functions"><font size ="+1">Lambda Functions</font></a>
* What are lambda functions?
* Lambda function syntax
* Advanced lambda functions
* <a href="#Exercise-2:-Create-a-Lambda-Function-which...">Exercise 2: Create a Lambda function which...</a> 
    
<a href="#Introduce-Applying-Functions-on-Pandas-Objects"><font size="+1">Introduce Applying Functions on Pandas Objects</font></a>
* Introducing pandas function application
    - <a href="#Series.map">`Series.map`</a>
    - <a href="#Series.apply">`Series.apply`</a>
    - <a href="#DataFrame.applymap">`DataFrame.applymap`</a>
    - <a href="#DataFrame.apply">`DataFrame.apply`</a>

<a href="#Consolidation"><font size ="+1">Consolidation</font></a>
* Recap of learning objectives

# Background

## Learning Objectives

By the end of the session, participants will be able to:
* Know what functions are and why they are useful.
* Distinguish between a function's parameters and arguments.
    * Including different types of arguments
* Understand the idea of 'scope'.
    * i.e. Global scope vs Local scope
* Write and apply user defined functions.
* Identify and use 'lambda' functions.
* Apply functions to pandas `DataFrame` and `Series` objects.

### Motive
Up until now we've been writing code line by line and running it sequentially. As effective as this has been, things can get messy as we scale up to larger problems.

We can avoid messy code by organising it into chunks of reusable code called functions. This allows us to create blocks of code dedicated to performing a specific procedure or task. We can then 'call' this function to perform its specific task as and when needed. Understanding why and how to write functions is an important part of writing code that is consistent, readable and reproducible.

### Types of Functions in Python  

In python we have three types of functions:
1. [Built-in functions](https://docs.python.org/3.6/library/functions.html)
> We have already used several of these, such as `help()`, `pow()`, `print()`, etc. These functions, as the name
suggests, are built into python and are always available.
2. User defined Functions
> These are functions created by us to carry out a specific task. They are declared using the `def` keyword.
3. Anonymous Functions (commonly referred to as 'lambda' functions)
> These functions are also user defined but are not declared using the keyword `def`. They are created using the keyword `lambda` and are generally one-line functions.

Functions are not executed until they are 'called'. As such, code inside a function can be syntactically correct and still contain errors that only occur at run time (for example, referring to a name that does not exist). You will not find out about these until the function is called and executed.
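
To illustrate this, here is a minimal sketch (the function name `broken_average` is made up for this example). Defining the function raises no error; the problem only surfaces when the function is called.

```python
def broken_average(values):
    """Return the mean of values (contains a deliberate bug)."""
    # 'total' is never defined, but python only discovers this when the line runs
    return total / len(values)

print("Function defined without any errors")

try:
    broken_average([1, 2, 3])
except NameError as error:
    print("Error raised only when called:", error)
```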

# Creating and Using Functions

### Creating User Defined Functions

To create a function in python, we:
<ol>
    <li>Declare it using the keyword <code>def</code></li>
    <li>Name the function, taking into account that names:
        <ul>
            <li>Must start with a letter or an underscore: _.</li>
            <li>Should be lowercase (by python convention).</li>
            <li>Can include numbers if required, just not as the first character.</li>
            <li>Can be any length, within reason, but short descriptive names are best.</li>
            <li>Cannot have the same name as a python keyword.</li>
        </ul>
    </li>
    <li>Brackets go on the end of the function name, containing any necessary parameters: <code>function_name(p1, p2)</code>.</li>
    <li>Conclude with a colon <code>:</code>, which is used to start an indented block of code. Only this code will execute when the function is called.</li>
</ol>

For example:

```python
def function_name(p1,p2):
    """
    this is the docstring to my function. Here is where I would describe what this function does. When I call the
    help function on this function, it will print the docstring out.
    """
    p1 = 'My first function!'
    p2 = 'Self Hi5'
    print(p1, p2)
    return (len(p1)+len(p2))
```

In the body of this function, we assign our parameters `p1` and `p2` to two text strings and then `print()` our parameters out.  

We then use the keyword `return`, which sends whatever value follows it back to the 'caller' (the parentheses around the returned value are optional). In this case we `return` the sum of the lengths of our parameters, using another built-in python function called `len`, which returns the number of items in an object such as a string, list or dictionary.

Notes: 
* Whenever the `return` statement is executed the function will stop its execution at that line and return to the caller with whatever given value. No code following the `return` statement is executed.
* Any function without a return statement will still return a value of `None`.
* We can return multiple values if we like by wrapping them in a tuple, list or dictionary.
* More broadly, functions can return any object of our choosing, including other functions (functions which return or accept other functions are called 'higher-order functions')!
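
The notes above can be sketched as follows. All three function names are invented for this illustration; they show returning several values as a tuple, the implicit `None`, and returning a function.

```python
def min_and_max(numbers):
    """Return the smallest and largest values as a tuple."""
    return min(numbers), max(numbers)

def say_hello(name):
    """Print a greeting; there is no return statement."""
    print("Hello", name)

def make_multiplier(factor):
    """Return a new function which multiplies its input by factor."""
    def multiply(x):
        return x * factor
    return multiply

lowest, highest = min_and_max([4, -2, 9])  # the tuple is unpacked into two names
print(lowest, highest)                     # -2 9

result = say_hello("world")
print(result)                              # None

double = make_multiplier(2)
print(double(21))                          # 42
```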

### Example
In the example below, our function is slightly pointless. We can pass any two arguments we like, however whatever we pass will simply be reassigned in the function body. `p1` will always end up as 'My first function!' and `p2` as 'Self Hi5'. Note also that the final `print` is never executed, because it comes after the `return` statement.

In [None]:
def function_name(p1,p2):
    """
    this is the docstring to my function. Here is where I would describe what this function does. When I call the
    help function on this function, it will print the docstring out.
    """
    p1 = 'My first function!'
    p2 = 'Self Hi5'
    print(p1, p2)
    return(len(p1)+len(p2))
    print('Print me out!')

### Calling Functions
Calling user defined functions works the same way as calling the built-in functions we've used so far. Simply type the function name followed by parentheses containing any required arguments.

In [None]:
# call the built in help function to get our function's docstring.
help(function_name)

In [None]:
# call our function (n.b. ; suppresses output in the notebook)
function_name('apple',36);

In [None]:
# Call our function, storing the returned value as a variable
ans = function_name('zebra',-51)
print('Value returned from function_name', ans)

### Exercise 1: Create a function which...
1. Converts from degrees Fahrenheit to degrees Celsius using the equation below. The function should print out a value rounded to 1 decimal place.

 $$C=\frac{5}{9}(F-32)$$

> Examples: 
> - `f_to_c(32)` prints `32F is 0.0C`
> - `f_to_c(11)` prints `11F is -11.7C`
> - `f_to_c(81.3)` prints `81.3F is 27.4C`

2. Given a number and a list of numbers, returns the index of the number in the list which produces the greatest difference from the given number. 

> Example:
- `max_diff(3, [5,7,3,-5])` returns `3`   

3. Given a list of numbers, returns a dictionary containing:
    1. the arithmetic mean of values in the list
    2. the index of the number in the passed list which produces the greatest difference from the mean
    3. the absolute value of the greatest difference from the mean. 

>Example
- `mean_info([5,7,3,-5,2])` returns `{'mean' : 2.4, 'max diff index' : 3, 'max diff' : 7.4}`

In [None]:
# Create your functions here


In [None]:
# Test your functions here
# Question 1
f_to_c(32)
f_to_c(11)
f_to_c(81.3)

#Question 2 
print(max_diff(3, [5,7,3,-5]))

#Question 3
print(mean_info([5,7,3,-5,2]))

In [None]:
%load ../Solutions/Functions/exercise1.py

### Anticipating Special Cases

In the solutions given, there is a special case where the function may not produce the expected behaviour. This is when the largest difference between a given number and two or more items in a given list is the same. The behaviour of the functions above relies on how `list.index()` works, which is to produce the _lowest_ index in `list` where the element you searched for appears. This means that the output is always the index of the first instance of that maximum difference, irrespective of whether that instance is unique or not.

This function behaviour may be entirely appropriate to our use case for the function, however it highlights that designing an effective and useful function requires some degree of critical thinking. It is important that you as the designer of the function have a sense of when and how the function should be used, and whether there are special cases that either need to be dealt with, or acknowledged.
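
As a sketch of this special case, here is one possible implementation of `max_diff` (not necessarily the one in the solutions file), alongside a hypothetical variant, `max_diff_all`, which reports every tied index rather than only the first.

```python
def max_diff(number, values):
    """Return the index of the value in values furthest from number."""
    diffs = [abs(value - number) for value in values]
    # list.index returns the lowest index at which the maximum appears
    return diffs.index(max(diffs))

def max_diff_all(number, values):
    """Return the indices of every value tied for the greatest difference."""
    diffs = [abs(value - number) for value in values]
    largest = max(diffs)
    return [i for i, diff in enumerate(diffs) if diff == largest]

print(max_diff(3, [5, 7, 3, -5]))    # 3
print(max_diff(3, [8, -2, 3]))       # 0, although index 1 is equally far away
print(max_diff_all(3, [8, -2, 3]))   # [0, 1]
```

Whether returning only the first index or all tied indices is 'correct' depends on how the function is meant to be used; either way, the docstring is a good place to state the choice.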

# Deeper Understanding

### Parameters vs Arguments

In short, parameters are the names listed in a function declaration and arguments are the actual values which are passed when we call the function. Argument values are mapped onto the parameter names when the function is called.

#### Example:
```python
# This is the function definition:
def func(parameter1, parameter2, parameter3):
    #Do some stuff
    print("Done!")

# Here is the actual function call:
func(argument1, argument2, argument3)
```

### Types of Arguments:
- **Required Arguments**: Like the name suggests, these are arguments which are required by a function and must be passed in the correct order.
    ```python
    def arithmetic(a,b):
        return pow(a,b)/b
    
    arithmetic(4,9)
    ```
    `a` and `b` are both required arguments. We would get a TypeError if we did not pass both arguments. 
    Also, `arithmetic(a,b) != arithmetic(b,a)` (in most cases!) 
    

- **Default arguments**: This is where, in the function declaration, the value of the parameter is preset. The function will retain this value unless it is changed when the function is called.
    ```python
    def arithmetic(a,b = 1):
        return pow(a,b)/b
    
    arithmetic(5)
    ```
    In the example above the function will take `a = 5` and use the default value `b = 1`. Note: in a function definition, a parameter with a default value cannot be followed by a parameter without one.  
    
    
- **Keyword arguments**: This refers to when we call a function and specify what each parameter should equal.
    ```python
    arithmetic(b = 9, a = 31)
    ```
    As you can see, the order is not important when we use keywords to pass parameter values. 
    
    
- **Variable Numbers of Arguments**: Using the characters `*` and `**` (followed by a parameter name) we can pass any number of arguments to a function. 
    
    - `*` is used to pass multiple non-keyword (positional) arguments. Common practice is to use `*args` as the parameter name.
        - In a function definition, these additional arguments are 'packed' into a tuple. In a function call, `*` does the reverse and unpacks a list or tuple into separate arguments.
    - `**` is used to pass multiple keyword arguments. Common practice is to use `**kwargs` as the parameter name.
        - In a function definition, these additional keyword arguments are packed into a dictionary. In a function call, `**` unpacks a dictionary, where dictionary keys are keywords and values are arguments.

In [None]:
# args example
def print_these_args(para1, *args):
    print("my first arg: ", para1)
    for arg in args:
        print("an arg from args: ", arg)

print_these_args('Making', 'my', 'way', 'downtown','...')

In [None]:
# same args example unpacking a list with *
extra_args = ['my', 'way', 'downtown','...']
# try removing the * and see what happens
print_these_args('Making', *extra_args)

In [None]:
# kwargs example
def print_these_kwargs(**kwargs):
    for key, value in kwargs.items():
        print("{0} = {1}".format(key, value))
        
print_these_kwargs(name = "Vanessa Carlton", song = "Thousand Miles")

In [None]:
# Dictionary of keyword: argument pairs unpacked with **
def arithmetic(a,b):
    return pow(a,b)/b

arg_dict = {'b':2,'a':10}
arithmetic(**arg_dict)
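
The argument types described in this section can be combined in a single function. The sketch below uses a made-up function, `describe_order`, with one required parameter, one default parameter, `*extras` and `**options`.

```python
def describe_order(item, quantity=1, *extras, **options):
    """Return a one-line summary of an order."""
    summary = "{0} x {1}".format(quantity, item)
    if extras:
        # extras is a tuple of any additional positional arguments
        summary += " with " + ", ".join(extras)
    for key, value in sorted(options.items()):
        # options is a dictionary of any additional keyword arguments
        summary += " [{0}={1}]".format(key, value)
    return summary

print(describe_order("coffee"))                              # 1 x coffee
print(describe_order("coffee", 2, "milk", "sugar"))          # 2 x coffee with milk, sugar
print(describe_order(quantity=3, item="tea", size="large"))  # 3 x tea [size=large]

# Leaving out the required argument raises a TypeError
try:
    describe_order()
except TypeError as error:
    print("TypeError:", error)
```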

### Scope

There are two types of scope that we will focus on here, _global_ and _local_. Variables that have _global_ scope are visible everywhere: in every function, loop, conditional, basically every part of the code. However, variables that have _local_ scope are only visible and hence can only be used within that local scope.

In python every time you create a new function, you effectively create a new local scope. All functions have their own local scope. 

So, when you define a variable in the indented code of a function, it is local to that function. Once the function is done computing, the names inside that local scope cease to exist, and any objects that are no longer referenced elsewhere (for example, because they were not returned) can be collected by python's garbage collector. A function variable can use the same name as a variable in other functions, because they are distinct to their specific local scopes. Confusingly, a function can also create a variable with the same name as a variable in the global scope without overwriting the global variable, however that global variable will not then be available to the function.

Any variable defined outside of a function is said to have global scope. Variables with global scope can be read inside a function. A function can also reassign a global variable if it first declares the name with the `global` keyword, and it can modify mutable global objects (such as lists) in place, but both should be avoided in most cases since they are considered bad programming practice. 

The way python searches for variables is interesting. It will start off by looking through the most local scope and then iterate outwards towards the highest scope ([Built-In](https://www.datacamp.com/community/tutorials/scope-of-variables-python#LEGB)). If it makes it all the way out to the highest scope without finding the variable name, we get a `NameError`. 
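
The scope rules above can be sketched with a few small functions. All names here are invented for the illustration.

```python
counter = 10              # defined outside any function: global scope

def read_global():
    # no local 'counter' exists, so python finds the global one
    return counter + 1

def shadow_global():
    counter = 0           # a new local variable; the global is untouched
    return counter

def change_global():
    global counter        # explicitly opt in to reassigning the global name
    counter = 99

print(read_global())      # 11
print(shadow_global())    # 0
print(counter)            # 10
change_global()
print(counter)            # 99

def make_local():
    only_here = "local"
    return only_here

make_local()
try:
    print(only_here)      # the local name no longer exists out here
except NameError as error:
    print("NameError:", error)
```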

### Writing a docstring

Docstrings are how programmers document the workings of python modules, functions, classes, and methods. A docstring is usually a multi-line string included at the very top of the object's definition. Ideally all functions will have docstrings so that users know what a given function does.

A simple one-liner docstring can be used for really obvious cases:
```python
def square_number(a):
    """Return the square of a."""
    return (a**2)
```
A more complicated function probably requires a multi-line docstring. It consists of a summary line (as in the one-line docstring), then a blank line, and then a more elaborate description. The docstring for a function or method should summarize its behaviour and document its arguments, return value(s), side effects, exceptions raised, and restrictions on when it can be called (all if applicable). Optional arguments should be indicated.

```python
def arithmetic(a = 0, b = 2):
    """ Return the value of a number raised to a power and divided by that power
    
    Keyword arguments:
    a -- the base (default 0)
    b -- the power and divisor (default 2)
    """
    return pow(a,b)/b
```

Full details of docstring conventions can be found [here](https://www.python.org/dev/peps/pep-0257/)

In [None]:
# define a simple function
def square_number(a):
    """Return the square of a."""
    return (a**2)

In [None]:
help(square_number)

In [None]:
def arithmetic(a = 0, b = 2):
    """ Return the value of a number raised to a power and divided by that power

    Keyword arguments:
    a -- the base (default 0)
    b -- the power and divisor (default 2)
    """
    return pow(a,b)/b

In [None]:
help(arithmetic)

# Lambda Functions

### What are Lambda Functions?
Lambda functions are simply a different way of creating functions, specifically simple, one-line functions. Rather than using the keyword `def` we use the keyword `lambda`.

Below you can see a function which takes a value `x` and returns `x + 1` and its lambda function equivalent.

```python
def increment(x):
    return x + 1
``` 
This function is equivalent to:
```python
increment = lambda x : x + 1
```

Both functions can be called using the function name and passing the necessary arguments in parentheses: `increment(12)`

### Lambda Function Syntax
On the left of the equals sign we have the function's name (`increment` in the example above). On the right of the equals sign, we start off with the keyword `lambda` followed by the function's parameters. After the colon `:` we have the expression. The expression is the bit of code which is executed when the function is called. In the example above, this is: `x + 1`.

The `lambda` expression creates a simple function, which here is assigned to a variable for reuse. Technically lambda functions do not have names, hence they are 'anonymous'; their name is essentially the name of the variable storing the function.

Note: lambda functions do not require a `return` statement. This is because in Lambda functions `return` statements are implicit. The expression (to the right of the `:`) in the lambda function is evaluated and returned to the function caller.
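
Because a lambda is just an expression, it can also be passed straight to another function without ever being given a name. The short sketch below uses made-up passenger data and the built-in `sorted` function, whose `key` argument accepts a function.

```python
# (name, age) pairs invented for this example
passengers = [("Allen", 29), ("Brown", 4), ("Carter", 61)]

# The lambda is passed directly as the 'key' argument and never given a name
by_age = sorted(passengers, key=lambda passenger: passenger[1])
print(by_age)

# lambdas can take several parameters, just like functions created with def
add = lambda a, b: a + b
print(add(2, 3))   # 5
```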

### Advanced Lambda Functions
We can also introduce control flow to our lambda functions.

Here is a function which checks whether a given number is even.
```python
def is_even(x):
    if(x % 2 == 0):
        return True
    else: 
        return False
```
Here is the Lambda equivalent:
```python
is_even = lambda x: True if x % 2 == 0 else False
```

The difference in syntax here is that the value to be returned comes first, followed by the condition in a similar one-line fashion: `value_if_true if condition else value_if_false`. One thing to note here: the body of a `lambda` function must be a single expression that always produces a value, so you must specify what to return in the case where the condition is false (using `else`).
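
As a further sketch, conditional expressions can be chained to handle more than two cases, although readability suffers quickly. The name `describe_sign` is made up for this example. Note also that when the condition itself is the answer, as with checking for even numbers, the `if`/`else` can be dropped entirely.

```python
# Chained conditional expression: evaluated left to right
describe_sign = lambda x: "negative" if x < 0 else "zero" if x == 0 else "positive"

print(describe_sign(-4))   # negative
print(describe_sign(0))    # zero
print(describe_sign(7))    # positive

# The comparison already evaluates to True or False, so it can be returned directly
is_even_short = lambda x: x % 2 == 0
print(is_even_short(10))   # True
```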

### Exercise 2: Create a Lambda function which...
1. Multiplies together two values passed as a tuple.
> Example:
    - `multiply_tuple((5,9))` returns `45`

2. Takes a number and subtracts 50. If the result is less than 0, the function should return 0, otherwise it returns 1.
    
> Examples:
- `sub50(27)` returns `0`
- `sub50(51)` returns `1`

3. Takes three strings and performs the following formatting:
    * Checks if the first string starts with `"Data"` (ignoring case),
    * Converts them all to lower case,
    * Combines them into one string, with each string separated by a space. 
    
> Examples:
- `str_manipulation("DaTA Science", 'FOR puBLIc', 'GOoD')` returns `data science for public good`
- `str_manipulation("DaT Science", 'FOR puBLIc', 'GOoD')` returns `First string does not start with 'Data'`

In [None]:
# Start your code here


In [None]:
# Test cases
# Question 1 
print(multiply_tuple((5,9)))

# Question 2 
print(sub50(27))
print(sub50(51))

# Question 3
print(str_manipulation("DaTA Science", 'FOR puBLIc', 'GOoD'))
print(str_manipulation("DaT Science", 'FOR puBLIc', 'GOoD'))

In [None]:
# %load ../Solutions/Functions/exercise2.py

# Introduce Applying Functions on Pandas Objects

Firstly, import the now (hopefully) familiar pandas library, and the titanic dataset that we've been working with. 

In [None]:
import pandas as pd

titanic = pd.read_csv("../data/titanic.csv")
titanic.head()

The `pandas` package has several methods dedicated to applying functions to both [`DataFrames`](http://pandas.pydata.org/pandas-docs/stable/reference/frame.html#function-application-groupby-window) and [`Series`](http://pandas.pydata.org/pandas-docs/stable/reference/series.html#function-application-groupby-window) objects. Of these, we will introduce the following:

   * `Series.map()`
   * `Series.apply()`
   * `DataFrame.apply()`
   * `DataFrame.applymap()`
   
Note: These are all relatively high-level functions which we will only be introducing. They are capable of doing a lot more than what we will see in this vignette.

### `Series.map`

- Works element-wise, that is, item-by-item along the `Series` object. 
- Used for substituting each value in a Series with another value, that may be derived from a function or a dictionary.
- Returns a `Series` object of the same length.
- Has a parameter (`na_action`) to ignore NA (missing) values.

We have only seen a glimpse of this so far which we will recap now. 

In [None]:
# First have a look at a sample of values from the embarked column
titanic['embarked'].sample(10, random_state = 33)

In [None]:
# Now map a dictionary onto that column, and take a sample of the resultant Series.
titanic['embarked'].map({'S':'Southampton','C':'Cherbourg','Q':'Queenstown'}).sample(10, random_state = 33)
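
As a small sketch of the other points above, the example below uses a hand-made `Series` (not the Titanic data) to show mapping with a function, the `na_action` parameter, and what happens to values that are missing from a mapping dictionary.

```python
import pandas as pd

# A small Series invented for this example, including one missing value
ports = pd.Series(["S", "C", None, "Q"])

# Mapping with a function; na_action='ignore' leaves missing values untouched
lowered = ports.map(lambda code: code.lower(), na_action="ignore")
print(lowered)

# Mapping with a dictionary; values not found in the dictionary become missing
names = ports.map({"S": "Southampton", "C": "Cherbourg"})
print(names)
```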

### `Series.apply`

- Used more often to apply more complex functions to `Series`.
- Also operates elementwise, *but can also accept functions which work on the entire `Series` object*.
- Allows us to pass parameters to the function we're applying.
- Will return a `Series` object of the same length.

In [None]:
titanic['age'].apply(round, ndigits = 2).head(10)

### `DataFrame.applymap`

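- This method applies a function that accepts and returns a scalar to every element (i.e. each cell) of a `DataFrame`.
- Works elementwise throughout the `DataFrame`.
- Only accepts a function (callable); unlike `Series.map`, it does not accept a dictionary.
- Older versions of pandas do not allow us to pass extra parameters to the function; more recent versions forward additional keyword arguments.
- Otherwise similar to `Series.map`. In recent versions of pandas this method has been renamed `DataFrame.map`, and `applymap` is deprecated.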
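
A minimal sketch of elementwise application on a `DataFrame` is shown below, using a tiny hand-made table rather than the Titanic data. It assumes that newer pandas versions expose this behaviour as `DataFrame.map`, and falls back to `applymap` where that method is not available.

```python
import pandas as pd

# Values invented for this example
fares = pd.DataFrame({"fare": [7.25, 71.2833], "age": [22.0, 38.0]})

# Use DataFrame.map where it exists, otherwise the older applymap
elementwise = getattr(fares, "map", None) or fares.applymap

# The function receives one cell at a time and returns one value
rounded = elementwise(lambda value: round(value, 1))
print(rounded)
```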

### `DataFrame.apply`

- Applies a function along a given axis, passing each column or row to the function as a `Series`.
- Axis defaults to 0. `{0 : applies function to each column, 1 : applies function to each row}`.
- Can also pass parameters to the function.
- Allows us to use `DataFrame` rows or columns as function inputs, e.g. row/column-wise sum.
- Could return a `Series` or an entire `DataFrame` depending on the function passed.
- Similar to `Series.apply`.

In [None]:
titanic.loc[:, ['age','sibsp', 'parch', 'fare']].apply(max, axis = 0)

In [None]:
# with axis = 1 the function receives one row at a time, so the parameter is named 'row'
family_count = lambda row: row['sibsp'] + row['parch']

In [None]:
titanic['family_members'] = titanic.apply(family_count, axis = 1)
titanic.head()
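
To make the role of `axis` and of extra parameters more concrete, here is a small sketch on a hand-made `DataFrame`. The function `family_size` and its `include_self` parameter are made up for this example.

```python
import pandas as pd

# Values invented for this example
df = pd.DataFrame({"sibsp": [1, 0, 3], "parch": [0, 0, 2]})

# axis=0: the function receives each column as a Series
column_ranges = df.apply(lambda column: column.max() - column.min(), axis=0)
print(column_ranges)

# axis=1: the function receives each row as a Series
def family_size(row, include_self=False):
    """Return the number of family members on board for one row."""
    return row["sibsp"] + row["parch"] + (1 if include_self else 0)

# Extra keyword arguments are forwarded to the function being applied
sizes = df.apply(family_size, axis=1, include_self=True)
print(sizes)
```

For a simple sum like this, adding the two columns directly (`df['sibsp'] + df['parch']`) gives the same result without calling a function on every row; `apply` is more useful when the logic cannot easily be expressed with column operations.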

# Consolidation

This vignette introduces functions, and in reality only scratches the surface. Functions are the kind of tool that make their usefulness apparent when you are actually coding in the wild. The core use case for a function becomes apparent if you find that you need to do the same thing many times. Rather than write (or copy-paste) the same code each time you need to do something, write a function instead and call that. This will save you time and effort, and if you ever need to change or tweak the behaviour of that piece of code you now only have to change a single function rather than masses of code. This is good for clarity, reproducibility, and error and sense checking. Additionally, in situations where it is important to abstract out a particular calculation from a block of code, using a function may be a good move. This could be helpful for Quality Assurance purposes.

Of particular interest are the `apply` and `map`/`applymap` methods on `Series` and `DataFrame` objects. These are a little confusing at first, but given a bit of time and experience they prove to be really useful and succinct ways of writing code.

## Learning Objectives

By the end of the session, participants will be able to:
* Know what functions are and why they are useful.
* Distinguish between a function's parameters and arguments.
    * Including different types of arguments
* Understand the idea of 'scope'.
    * i.e. Global scope vs Local scope
* Write and apply user defined functions.
* Identify and use 'lambda' functions.
* Apply functions to pandas `DataFrame` and `Series` objects.