# User-Defined Functions and Apply #

In this notebook we will learn:

- How to define our own functions
- How to use `apply` to, well, apply a function to a column or columns of a table

In [29]:
from datascience import *
import numpy as np

%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')

## Functions ##

In [30]:
def name_of_function(argument):
    """Here's where you document what the function does."""
    step1 = argument * 10  # Complex functions can be broken into steps
    step2 = abs(step1)
    return step2           # Without a return statement, the function returns None

In [31]:
def triple(x):
    """This function triples whatever you give it."""
    return 3 * x

In [32]:
triple(3)

9

In [33]:
triple(make_array(1,2,3))

array([3, 6, 9], dtype=int64)

In [34]:
num = 4

In [35]:
triple(num)

12

In [36]:
triple(num * 5 + 1)

63
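The parts of a function definition used so far (the `def` line, a documentation string, a body, and a `return`) can be sketched in one place. This is a minimal illustration, not part of the original notebook; `triple_without_return` is a hypothetical name added only to show what happens when the `return` is left out.

```python
def triple(x):
    """Return three times x."""
    return 3 * x


def triple_without_return(x):
    """Compute 3 * x but never hand it back (hypothetical, for contrast)."""
    result = 3 * x  # computed, then discarded when the function ends


print(triple.__doc__)            # a triple-quoted docstring is stored on the function
print(triple(4))
print(triple_without_return(4))  # no return statement, so the call evaluates to None
```

A triple-quoted string on the first line of the body is what `help(triple)` displays, which is why it is the usual place for documentation.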

### Note About Scopes

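You have to define a variable *before* you use it. In a fresh session, running the cell below before `x` has been assigned raises a `NameError`.

The output shown here is `5` rather than an error because notebook cells run in whatever order you execute them: if you skip to the lower cell where the variable is defined, run it, then come back and run the higher cell, it works. Relying on this is bad practice, because the notebook no longer runs correctly from top to bottom.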

In [37]:
x

5

In [38]:
x = 5

In [39]:
triple(2 * x)

30

In [40]:
x

5
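The define-before-use rule from this section can be reproduced in a single script, where the order of execution is unambiguous. The sketch below is illustrative only; `undefined_value` is a hypothetical name chosen so that it cannot clash with anything defined earlier.

```python
try:
    total = undefined_value * 2   # hypothetical name that has not been assigned yet
    outcome = "worked"
except NameError as error:
    outcome = "NameError: " + str(error)
print(outcome)

undefined_value = 5               # define the name...
total = undefined_value * 2       # ...and the same expression now works
print(total)
```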

### Type Agnostic

Most functions don't care what type of value you pass in, as long as every operation and function used inside the body works on values of that type.

In [41]:
def add4(x):
    """This function adds 4."""
    return x+4

add4(3)

7

In [42]:
add4("ha")

TypeError: must be str, not int

In [43]:
triple('ha')

'hahaha'
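The difference between `add4("ha")` and `triple('ha')` comes down to which operators Python defines for strings. A small sketch of the three cases involved (the variable names are made up for illustration):

```python
repeated = 3 * "ha"      # int * str repeats the string
joined = "ha" + "4"      # str + str concatenates

try:
    mixed = "ha" + 4     # str + int is not defined, so Python raises TypeError
except TypeError as error:
    mixed = "TypeError: " + str(error)

print(repeated, joined, mixed, sep="\n")
```

The exact wording of the `TypeError` message varies between Python versions, so it may not match the message shown above.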

`triple` can also be applied to an array, even though it wasn't designed with arrays in mind.

In [44]:
triple(np.arange(4))

array([0, 3, 6, 9])
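Whether "type agnostic" gives the result you expect depends on how each type defines `*`. As a hedged illustration, the sketch below passes the same four numbers to `triple` once as a NumPy array and once as a plain Python list; the names `as_array` and `as_list` are introduced here for the example.

```python
import numpy as np


def triple(x):
    """Return three times x."""
    return 3 * x


as_array = triple(np.arange(4))    # NumPy multiplies each element by 3
as_list = triple([0, 1, 2, 3])     # a plain list is repeated three times instead

print(as_array)
print(as_list)
```

Both calls succeed, but only the array version triples the values, which is one reason arrays are used for arithmetic throughout this notebook.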

### Function meant for an array

Let's write a function that takes an array and computes, for each element, its percentage of the array's total.

In [45]:
def percent_of_total(array):
    """Take an array and return each element as a percent of the array's total."""
    return np.round(array / sum(array) * 100, 2)

In [46]:
percent_of_total(make_array(1,2,3,4))

array([10., 20., 30., 40.])

In [47]:
percent_of_total(make_array(1, 213, 38))

array([ 0.4 , 84.52, 15.08])
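A quick sanity check on a function like this is that the percentages add back up to about 100. The sketch below assumes NumPy only; it swaps the built-in `sum` for `np.sum`, which gives the same total for a one-dimensional array, and uses `np.array` in place of `make_array` so it runs without the `datascience` package.

```python
import numpy as np


def percent_of_total(array):
    """Return each element as a percent of the array's total, rounded to 2 places."""
    return np.round(array / np.sum(array) * 100, 2)   # np.sum is a swap-in for sum


percents = percent_of_total(np.array([1, 213, 38]))
print(percents)
print(np.sum(percents))   # close to 100; rounding each entry can shift the last digits
```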

### Multiple Arguments

$ h^2 = x^2 + y^2 \hspace{20pt} \Rightarrow \hspace{20pt} h = \sqrt{x^2 + y^2} $

In [48]:
def hypotenuse(x, y):
    """Return the hypotenuse of a right triangle with legs x and y."""
    hypot_squared = (x ** 2 + y ** 2)
    return hypot_squared ** 0.5

In [49]:
hypotenuse(np.arange(3,15,3), 12)

array([12.36931688, 13.41640786, 15.        , 16.97056275])

In [50]:
hypotenuse(12, 12)

16.97056274847714
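When one argument is an array and the other is a single number, the number is paired with every element of the array. One way to cross-check the results is against the library functions `math.hypot` and `np.hypot`, which compute the same quantity; the sketch below is only a check, and the variable name `legs` is introduced here.

```python
import math
import numpy as np


def hypotenuse(x, y):
    """Return the hypotenuse of a right triangle with legs x and y."""
    return (x ** 2 + y ** 2) ** 0.5


legs = np.arange(3, 15, 3)   # the values 3, 6, 9, 12

# Scalar inputs: compare with the standard library.
print(math.isclose(hypotenuse(12, 12), math.hypot(12, 12)))

# Array plus scalar: 12 is paired with each element of legs.
print(np.allclose(hypotenuse(legs, 12), np.hypot(legs, 12)))
```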

## Apply ##

In [51]:
ages = Table().with_columns(
    'Person', make_array('Jim', 'Pam', 'Michael', 'Creed'),
    'Birth Year', make_array(1985, 1988, 1967, 1904)
)
ages

| Person | Birth Year |
|---|---|
| Jim | 1985 |
| Pam | 1988 |
| Michael | 1967 |
| Creed | 1904 |


In [52]:
def cap_at_1980(x):
    return min(x, 1980)

In [53]:
cap_at_1980(1975)

1975

In [54]:
cap_at_1980(1991)

1980

In [55]:
ages.apply(cap_at_1980, 'Birth Year')

array([1980, 1980, 1967, 1904], dtype=int64)
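Why use `apply` instead of calling `cap_at_1980` on the whole column? The sketch below uses a plain NumPy array in place of the table column and assumes that `apply` calls the function once per value, which is consistent with the output above. The names `years`, `capped_one_by_one`, and `capped_vectorized` are made up for this example.

```python
import numpy as np


def cap_at_1980(x):
    """Return x, or 1980 if x is larger."""
    return min(x, 1980)


years = np.array([1985, 1988, 1967, 1904])

# Roughly what apply does: call the function once per value in the column.
capped_one_by_one = np.array([cap_at_1980(year) for year in years])

# Calling the function on the whole array fails, because min() tries to
# reduce an element-by-element comparison to a single True or False.
try:
    cap_at_1980(years)
    whole_array_worked = True
except ValueError:
    whole_array_worked = False

# NumPy's element-wise minimum gives the same values without a Python loop.
capped_vectorized = np.minimum(years, 1980)

print(capped_one_by_one, whole_array_worked, capped_vectorized)
```

Functions built only from arithmetic, such as `triple`, work on whole arrays directly; functions that rely on built-ins like `min` usually need `apply` or an element-wise equivalent.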

In [56]:
ages.with_column('Capped Year', ages.apply(cap_at_1980, 'Birth Year') )

| Person | Birth Year | Capped Year |
|---|---|---|
| Jim | 1985 | 1980 |
| Pam | 1988 | 1980 |
| Michael | 1967 | 1967 |
| Creed | 1904 | 1904 |


In [57]:
def name_and_age(name, year):
    age = 2021 - year  # 2021 is hard-coded as the current year
    return name + ' is ' + str(age)

In [58]:
ages.apply(name_and_age, 'Person', 'Birth Year')

array(['Jim is 36', 'Pam is 33', 'Michael is 54', 'Creed is 117'],
      dtype='<U13')
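With two column labels, the function receives one value from each column per row. A rough pure-Python equivalent pairs the columns with `zip`; this is a sketch of the idea, not the library's implementation. The `reference_year` parameter is a hypothetical addition that replaces the hard-coded 2021 while keeping it as the default.

```python
def name_and_age(name, year, reference_year=2021):
    """Describe a person's age in reference_year (a hypothetical extra parameter)."""
    age = reference_year - year
    return name + ' is ' + str(age)


people = ['Jim', 'Pam', 'Michael', 'Creed']
birth_years = [1985, 1988, 1967, 1904]

# Pair up the two columns row by row and call the function on each pair.
descriptions = [name_and_age(name, year) for name, year in zip(people, birth_years)]
print(descriptions)
```

The order of the labels passed to `apply` has to match the order of the function's arguments, just as the order inside `zip` does here.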

In [59]:
ages.with_column("How Old?", ages.apply(name_and_age, 'Person', 'Birth Year'))

| Person | Birth Year | How Old? |
|---|---|---|
| Jim | 1985 | Jim is 36 |
| Pam | 1988 | Pam is 33 |
| Michael | 1967 | Michael is 54 |
| Creed | 1904 | Creed is 117 |


In [60]:
ages.apply(str, 'Birth Year')

array(['1985', '1988', '1967', '1904'], dtype='<U4')
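The function handed to `apply` does not have to be one you wrote; any function that accepts a single value will do, including built-ins such as `str` and `len`. A minimal sketch with a plain list standing in for the column (the variable names are illustrative):

```python
birth_years = [1985, 1988, 1967, 1904]

as_text = [str(year) for year in birth_years]   # built-in str, one call per value
as_lengths = [len(text) for text in as_text]    # built-in len works the same way

print(as_text)
print(as_lengths)
```

Each resulting string is four characters long, which is what the `<U4` in the array's `dtype` above records.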