# Functions


**Learning Objectives**:

- Learn to write custom functions
- Use `return` to produce values
- Combine basic operations to build functions
* * * * *

We have already used functions like `len()`, `sum()`, and `pd.DataFrame()` in our code. These are essentially shortcuts that save us from writing many lines of code to accomplish common tasks. For example, without `sum()`, we could still calculate a sum by using a for loop that accumulates a running total. One option is:

In [1]:
list_of_numbers =  [1,3,5,7]

#use the accumulator pattern
total = 0
for s in list_of_numbers:
    total = total + s #add the current value to the sum of the values
print('the sum is', total)

the sum is 16


We can turn this code into a function called `my_sum()` with some additional syntax. (Notice how we avoid overwriting the `sum()` function):

In [2]:
def my_sum(list_of_numbers):
    total = 0
    for s in list_of_numbers:
        total = total + s
    return(total)
    
list_of_numbers =  [1,3,5,7]

print("The sum with sum() is:",sum(list_of_numbers), "\n The sum with my_sum() is:", my_sum(list_of_numbers))

The sum with sum() is: 16 
 The sum with my_sum() is: 16


So functions save us a lot of time, but they aren't black boxes. Rather, we can think of functions as basic building blocks that we expect to use over and over again.

Using existing functions from packages, or built-in functions, is generally preferred because it saves time (and effort). For example, we wouldn't write a custom `sum()` function when one already exists. However, when a function doesn't already exist that performs the desired operation, we can write our own custom function.

Specifically, functions do four things:
   1. They name pieces of code the way variables name strings and numbers.
   2. They take arguments, the data that you want to operate on.
   3. Using 1 and 2, they let you make your own "mini-scripts" or "tiny commands."
   4. They return values that can be referred to in further operations.

The details are pretty simple, but this is one of those ideas where it's good to get lots of practice!
    

## Basic function syntax

*   Begin the definition of a new function with `def`.
*   Followed by the *name* of the function.
    *   Must obey the same rules as variable names.
*   The *parameters* are defined in parentheses.
    *   Empty parentheses if the function doesn't take any inputs.
    *   We will discuss this in detail in a moment.
*   Then a colon.
*   Then an indented block of code.
*   The final line should be a `return` statement with the value(s) to be returned from the function

**Note:** Arguments and variables created within the function only exist within the function and cannot be referred to unless returned by the function using the `return` statement.
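
As a minimal sketch of two points above (empty parentheses, and variables that exist only inside a function), consider the hypothetical functions `greet` and `area_of_square`. The names are made up for illustration:

```python
def greet():
    # Empty parentheses: this function takes no inputs
    return "hello"

def area_of_square(side):
    area = side * side  # `area` exists only while the function runs
    return area

print(greet())
result = area_of_square(3)
print(result)

# `area` was never created outside the function, so looking it up fails
try:
    print(area)
except NameError as err:
    print("NameError:", err)
```

The value computed inside `area_of_square` is only available afterwards because it was returned and saved in `result`.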


In [3]:
def feet_to_meters(x):
    # 0.304 is a rounded factor; 1 foot is exactly 0.3048 meters
    return x * 0.304

Notice that running the block of code above produces no output. This is because defining a function does not run it. You can think of it as assigning a value to a variable. The function needs to be *called* with appropriate arguments to execute the code it contains.

We save the result to a variable and display it.


In [4]:
answer = feet_to_meters(100)
answer

30.4

## Challenge 1: My First Function

Make a function that converts temperatures between Celsius and Fahrenheit.



In [40]:
#your function here

## Function arguments

A function is more useful when it is flexible. For example, we want to convert any number of feet to meters, not just one fixed value. Parameters make this possible.

*Parameters* are specified in the parentheses when defining a function, separated by commas.
*   They become variables when the function is executed.
*   They are assigned the arguments in the call (i.e., the values passed to the function).
*   We do operations based on the arguments and return the result.

In [41]:
def convert_feet_to_meters(feet):
    return(feet*.304)

print(convert_feet_to_meters(12))

print(convert_feet_to_meters(100))

3.6479999999999997
30.4


We can also include multiple arguments, separated by commas. The order of these arguments matters. For example, in the simple function below, what happens when you pass the arguments in a different order?

In [12]:
def divide(x,y):
    return(x/y)

print(divide(4,6))
print(divide(6,4))

0.6666666666666666
1.5


You can also use *keyword arguments*, where each argument is given a name. In this case, the order of the arguments doesn't matter, since each has a name associated with it. For example:

In [19]:
def divide(x, y):
    return(x/y)


print(divide(x=4,y=6))
print(divide(y=6,x=4))

0.6666666666666666
0.6666666666666666


Are the arguments named appropriately? What do `x` and `y` stand for? What could be clearer?

Generally, it's good practice to both use well-named arguments and use them in the same order. This is easier to read. 
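
For instance, one possible renaming makes each call self-explanatory. The function name `divide_named` and the parameter names `numerator` and `denominator` are only a suggestion for illustration:

```python
def divide_named(numerator, denominator):
    return numerator / denominator

# Keyword arguments listed in the same order as the definition
print(divide_named(numerator=6, denominator=4))
```

A reader no longer has to look up the definition to know which value is divided by which.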


## Challenge 2: Calling by Name

What does this short program print?

In [3]:
def print_date(year, month, day):
    joined = str(year) + '/' + str(month) + '/' + str(day)
    print(joined)

print_date(day=1, month=2, year=2003)

2003/2/1


We can also use keywords to give default arguments. If we modify the parameter `y` to default to `10`, the following calls are all equivalent:

In [4]:
def divide(x, y=10):
    return(x/y)


print(divide(x=4,y=10))
print(divide(x=4))
print(divide(4,10))
print(divide(4))

0.4
0.4
0.4
0.4


## Challenge 3: More Errors!
The following code gives an error. Why? 

**Hint**: Think about what happens inside the function, and how the arguments plug into the function.

In [5]:
divide(y='string')

TypeError: divide() missing 1 required positional argument: 'x'

There are many ways to combine arguments in a function call, so keeping them organized helps avoid errors like this one.

## Function writing

Writing functions can be error-prone. Here are some guidelines that can help minimize errors and make the process less painful:


1. Plan
    1. What is the overall goal of the function? Is there a function that exists already that does the same thing? 
    2. What is going to be the output of the function? (what datatype, how many items)
    3. What arguments will you need? (What pieces of the function do you need to control?)
    4. What are the general steps of the program? This can be written in bullet points or pseudocode
2. Write
    1. Write the code without the function wrapper
    2. Start small. Write small self-contained blocks of code and put the pieces together (you can also consider sub-functions)
    3. Test each part of the function as it is added. Especially track the input of the function and how it changes at each step. 
    4. Wrap the code in the function syntax. 
3. Test
    1. Take the function and test *several* cases
    2. Before running test cases, form an expectation of the result. 
    3. Test the function. Pay attention to both errors and strange results. Make adjustments to account for new cases.
    4. Integrate the function with the rest of the code. Are the input arguments the right type? Does the output flow into the rest of the code?
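
The testing steps above could look something like this sketch. It reuses `convert_feet_to_meters` from earlier (with the same rounded factor), and the test values are made up for illustration. Each expectation is written down before the function is called, and `math.isclose` is used because floating-point results are rarely exact:

```python
import math

def convert_feet_to_meters(feet):
    return feet * 0.304

# Form expectations first: (input, expected output)
test_cases = [(0, 0.0), (1, 0.304), (100, 30.4), (-10, -3.04)]

for feet, expected in test_cases:
    result = convert_feet_to_meters(feet)
    # isclose tolerates tiny floating-point differences
    assert math.isclose(result, expected, abs_tol=1e-9), (feet, result, expected)

print("all test cases passed")
```

If an assertion fails, the message shows the input, the actual result, and the expected value, which makes the strange case easy to track down.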
    
    
Let's go through an example of the function development process.

Let's say we have a list of filenames from an experiment. Each filename has two parts, a county and a year, separated by an underscore (e.g., `Alameda_2020.csv`). We are interested in parsing these names into a DataFrame with two columns, one containing the county (lowercase) and one containing the year.

1. Plan
    1. Goal: parse a list of filenames into two parts, county and year.
    2. Input: list of strings
    3. Output: a DataFrame with two columns, county (strings) and year (ints)
    4. Pseudocode can look like this:
    
    ```
    for file in filelist:
        split file into parts
        process each part
        append to list
    make a dataframe
    return
    ```

2. Write

Start with a single file inside the loop. First, we test the splitting step on a single file.

In [5]:
data = ['Alameda_2020.csv','Marin_2020.csv','Contra Costa_2020.csv','Alameda_2021.csv','San Francisco_2021.csv']

testfile = data[1]

fileparts= testfile.split('.')[0].split('_')
print(fileparts)


['Marin', '2020']


Once we have the parts parsed out, we can do the next step and process each part appropriately. For the county, we want to make it lowercase, and for the year, we can convert it to an integer.

In [6]:
fileparts= testfile.split('.')[0].split('_')
county = fileparts[0]
year = fileparts[1]

county = county.lower()
year = int(year)
print(county, type(year))

marin <class 'int'>


The next step is to wrap this bit of code in a for-loop.

In [31]:
data = ['Alameda_2020.csv','Marin_2020.csv','Contra Costa_2020.csv','Alameda_2021.csv','San Francisco_2021.csv']

county_list = []
year_list = []
for filename in data:
    fileparts= testfile.split('.')[0].split('_')
    county = fileparts[0]
    year = fileparts[1]

    county = county.lower()
    year = int(year)
    county_list.append(county)
    year_list.append(year)
print(county_list)
print(year_list)

['marin', 'marin', 'marin', 'marin', 'marin']
[2020, 2020, 2020, 2020, 2020]


What happened? How do we fix it? Once the loop runs over every filename, do you notice anything about the other county names? What might we want to change?

Once the full code works, we can do the final steps: convert the output to a DataFrame and place everything into a function.

In [37]:
import pandas as pd

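def parse_files(filelist):
    county_list = []
    year_list = []
    # loop over the argument, not the global variable `data`
    for filename in filelist:
        # drop the extension, then split county from year
        fileparts = filename.split('.')[0].split('_')
        county = fileparts[0]
        year = fileparts[1]

        # normalize: lowercase county without spaces, integer year
        county = county.lower().replace(' ', '')
        year = int(year)
        county_list.append(county)
        year_list.append(year)
    return pd.DataFrame({'county': county_list,
                         'year': year_list})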
    


data = ['Alameda_2020.csv','Marin_2020.csv','Contra Costa_2020.csv','Alameda_2021.csv','San Francisco_2021.csv']
output = parse_files(data)
output

|   | county | year |
|---|--------|------|
| 0 | alameda | 2020 |
| 1 | marin | 2020 |
| 2 | contracosta | 2020 |
| 3 | alameda | 2021 |
| 4 | sanfrancisco | 2021 |
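
The guidelines above also mention sub-functions. As one possible refactoring (a sketch, not the only way to structure this), the per-file logic could live in a small helper, here given the hypothetical name `parse_filename`, which is easy to test on its own before it is used in the loop:

```python
def parse_filename(filename):
    """Split a name like 'Contra Costa_2020.csv' into (county, year)."""
    # Drop the extension, then split county from year at the underscore
    county, year = filename.split('.')[0].split('_')
    # Normalize: lowercase county without spaces, integer year
    county = county.lower().replace(' ', '')
    return county, int(year)

print(parse_filename('Contra Costa_2020.csv'))
```

Inside `parse_files`, the loop body would then reduce to a single call to this helper followed by the two `append` calls.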


## Challenge 4: Advanced Conversion Function

Let's take the conversion function above and make it more flexible. Say we want to convert from feet to other units as well. Add a keyword argument `unit` that takes a string as input. Use this argument and if-statements in the function to give the appropriate output. For example, `convert_feet(value, unit='meters')` would convert from feet to meters, while `convert_feet(value, unit='inches')` would convert to inches.

We can follow these steps:
1. Plan your function. 
2. Write your function. 
3. Test the function.

**Bonus**: What if you wanted to convert several values at once? What if you wanted to be able to convert to multiple units at once? 

In [7]:
#original conversion function
def convert_feet_to_meters(feet):
    meters = feet*.304
    return(meters)


## your code here