# BUDS Report 12: Apply


### Table of Contents

1.  <a href='#section 1'>Apply</a>
    
    
2. <a href='#section 2'>Defining Functions</a>
<br><br> 

In [None]:
# run this cell
from datascience import *
import numpy as np
import math
import matplotlib.pyplot as plt
plt.style.use("fivethirtyeight")
%matplotlib inline

## 1. Apply <a id='section 1'></a>

Let's take a quick look at how the `apply` method works, starting with the image below.

<img src="images/apply.png" width = 500/>

The final line of the image shows the general format of a call to the `apply` method.

Let's say we have a function called `add_five`, which adds five to its input. For example, `add_five(12)` evaluates to `17`.

Now, let's say we want to use this function on a table of people, `people`, that was made five years ago. We might want to add five to every value in the "ages" column because the people in that table are five years older now. Instead of pulling the column out as an array and adding five to it ourselves, we can use the `apply` method.

Our code would look like this: `people.apply(add_five, "ages")`.

In plain English, we are telling the computer to go to the `people` table and `apply` the function `add_five` to each value in the `ages` column.
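Conceptually, `apply` calls the function once for each value in the column and collects the results. The plain-Python sketch below mimics that idea with a list instead of a table. The `ages` values are made up for illustration, and this is only a mental model, not how the `datascience` library is actually implemented.

```python
def add_five(x):
    return x + 5

# hypothetical stand-in for the "ages" column of the people table
ages = [20, 31, 47]

# roughly what people.apply(add_five, "ages") computes:
# call add_five on each value, one at a time, and keep every result
new_ages = [add_five(age) for age in ages]
print(new_ages)  # [25, 36, 52]
```

Note that the original `ages` list is unchanged. In the same way, `apply` hands back a new collection of results and leaves the table as it was.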

Let's try to replicate this process.

In [None]:
# create a table of people
data_2017 = Table().with_columns(
    'Person', make_array('Jim', 'Pam', 'Michael', 'Creed'),
    'Birth Year', make_array(1985, 1988, 1967, 1904),
    'Age', make_array(37, 34, 55, 118),
    'Car Purchase', make_array(2020, 2005, 2003, 2019),
    'House Purchase', make_array(2007, 2013, 1990, 2021))
data_2017

In [None]:
# this cell defines the add_five function
def add_five(x):
    return x + 5

<div class="alert alert-warning">
    <b>PRACTICE:</b> Use <code>apply</code> to convert each individual's birth year into a float value. What data type does <code>apply</code> return?
    </div>

In [None]:
...

<div class="alert alert-warning">
    <b>PRACTICE:</b> Now, try finding their new ages in the cell below. Create a new table that has these values under "Age" and call it <code>data_2022</code>. 
</div>

In [None]:
...

You can also use functions that take in multiple arguments. To do this, you simply list *multiple* column names instead of just one. Each column's values correspond to an argument in the function.
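As a rough plain-Python sketch of the multiple-argument case: the function is called once per row, with one value taken from each listed column. The function `rectangle_area` and the two lists are hypothetical and stand in for two table columns. They are not part of this notebook's data.

```python
def rectangle_area(width, height):
    return width * height

# hypothetical stand-ins for two columns, "Width" and "Height"
widths = [2, 3, 10]
heights = [5, 4, 1]

# roughly what a call like table.apply(rectangle_area, "Width", "Height") computes:
# pair up the values row by row and pass each pair to the function
areas = [rectangle_area(w, h) for w, h in zip(widths, heights)]
print(areas)  # [10, 12, 10]
```

The order of the column names matters: the first column listed supplies the first argument, the second column supplies the second argument, and so on.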

<div class="alert alert-warning">
    <b>PRACTICE:</b> Try finding the latest year that each person in the <code>data_2017</code> table made a large purchase.
</div>

In [None]:
...

Well done! Although these were simple functions, you can see how helpful `apply` can be in doing quick table manipulations.

## 2. Defining Functions <a id='section 2'></a>

Now, let's take a look at how we might define a function. Creating our own functions is a really helpful tool because we may want to perform different actions for different datasets, or when working with different companies or goals in mind. Defining functions gives us the freedom to build those actions ourselves, and it is a core practice in both computer science and data science.
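As a minimal sketch of what defining and then calling a function looks like, here is a small example. The function `fahrenheit_to_celsius` is hypothetical and is not used elsewhere in this notebook.

```python
def fahrenheit_to_celsius(temp_f):
    "this function converts a temperature from Fahrenheit to Celsius"
    return (temp_f - 32) * 5 / 9

# once defined, the function can be called on any number
boiling = fahrenheit_to_celsius(212)
freezing = fahrenheit_to_celsius(32)
print(boiling, freezing)  # 100.0 0.0
```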

<img src="images/define.png" width = 600/>

In [None]:
# run this cell to define this function
def at_least_1980(x):
    "this function returns the maximum of 1980 and the given year"
    return max(x, 1980)

Which part of this function is the "name" according to the image above?

_Written answer:_

What is the purpose of the quoted line directly below the `def` line? Do you remember what it's called?

_Written answer:_

<div class="alert alert-warning">
    <b>PRACTICE:</b> To see this function in action, try calling it on two different values.
</div>

In [None]:
# below 1980
...

In [None]:
# above 1980
...

<div class="alert alert-warning">
    <b>PRACTICE:</b> Our data seems to have an issue: one individual's age is 123 years. Let's say we know that our data system only includes people born in the year 1980 or later. We'll assume that this individual is as old as our system allows. Update the table's "Birth Year" column.
</div>

In [None]:
...

<div class="alert alert-warning">
    <b>PRACTICE:</b> Now, we can see that the ages are wrong. Create a function that takes in a birth year and returns the age of the person.
</div>

In [None]:
...

In [None]:
# check that it calculates your age in this cell
...

<div class="alert alert-warning">
    <b>PRACTICE:</b> Use <code>apply</code> and the function above to update the "Age" column in our table.
</div>

In [None]:
...

<div class="alert alert-warning">
    <b>PRACTICE:</b> Finally, let's create one more function. Suppose a car's yearly payment is \$5,000 regardless of when it was purchased, and a house's yearly payment is \$10,000 regardless of when it was purchased. Create <b>one</b> function that finds out how much an individual has paid since purchasing each item.
</div>

In [None]:
...

In [None]:
# call the function for an individual who ...
# purchased a car in 2002 and a house in 2021
# what number should you get?
...

<div class="alert alert-warning">
    <b>PRACTICE:</b> Now, use that function to add a column to our table called "Payments".
</div>

In [None]:
...

Congratulations! Now we know how to use the `apply` method and create a new column from existing data! Defining functions is a big concept, so don't worry if it doesn't feel intuitive at first.

### Downloading as PDF

Download this notebook as a PDF by clicking <b><code>File > Download as > PDF via LaTeX (.pdf)</code></b>. Submit the PDF to bCourses under the corresponding assignment.

Adapted from Data 8, Spring 2020 Lecture 9 and Lab 4