# Lecture 8: Functions 





In [None]:
from datascience import *
import numpy as np

%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')

## Histogram Review 

Let's review histograms as a tool for visualizing the distribution of numerical data.

We will revisit the movies data.

In [None]:
top_movies = Table.read_table('top_movies_2017.csv')
top_movies

Specifically, we can look at the "Age" of the movies. 

In [None]:
ages = 2024 - top_movies.column('Year')
top_movies = top_movies.with_column('Age', ages)

<br/>

**Exercise:** Explore the range of values in the `Age` column.

<details><summary>Click to Expand Solution</summary>
    

```python
print(min(top_movies.column('Age')))
print(max(top_movies.column('Age')))
```

</details>

<br/><br/>

**Exercise:** Use `np.arange` to create a histogram with regularly spaced bins (every 10 years) that cover the range of values.

<details><summary>Click to Expand Solution</summary>
    

```python
top_movies.hist('Age', bins = np.arange(0, 110, 10))
```

</details>
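
As a quick check on the bin edges, the sketch below uses Python's built-in `range`, which follows the same start/stop/step convention as `np.arange` for integer arguments: the stop value itself is excluded. The names `bin_edges` and `num_bins` are illustrative and not part of the lecture code.

```python
# Built-in range mirrors np.arange(0, 110, 10) for integer arguments:
# it starts at 0, steps by 10, and stops before reaching 110.
bin_edges = list(range(0, 110, 10))
print(bin_edges)    # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# n edges define n - 1 bins.
num_bins = len(bin_edges) - 1
print(num_bins)     # 10
```

Because the last edge is 100, it is worth comparing it against the maximum `Age` found in the previous exercise; if any movie is older than that, a larger stop value would be needed to cover it.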

<br/><br/>

**Exercise:** Let's specify the bins as in `my_bins`.  Split the "Age" column into these bins.  

In [None]:
my_bins = make_array(0, 5, 10, 15, 25, 40, 65, 105)
binned_data = ...

<details><summary>Click to Expand Solution</summary>
    

```python
my_bins = make_array(0, 5, 10, 15, 25, 40, 65, 105)
binned_data = top_movies.bin('Age', bins = my_bins)
binned_data
```

</details>

We can confirm that the bin counts add up to the number of rows in the original table `top_movies`.

In [None]:
num_movies = sum(binned_data.column("Age count"))
num_movies == top_movies.num_rows

<br/>

**Exercise:** Show the movies in each bin as a percent rather than a count. 

In [None]:
percents = ...
binned_data = binned_data.with_column('Percent', percents)
binned_data

<details><summary>Click to Expand Solution</summary>
    

```python
percents = binned_data.column('Age count')/num_movies * 100 
binned_data = binned_data.with_column('Percent', percents)
binned_data
```

</details>

<br/>

**Exercise:** Create a histogram of `Age` using the bins in `my_bins`, and specify the unit as "Year".

<details><summary>Click to Expand Solution</summary>
    

```python
top_movies.hist('Age', bins = my_bins, unit = 'Year')
```

</details>

<br/>

**Exercise:** What is the height of the [40, 65) bin?

*Recall:* Area of bar = % in bin = height × width of bin

height = % in bin / width
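
To make the formula concrete, here is a minimal sketch with made-up numbers (they are not taken from `top_movies`): suppose a bin spans [10, 15) and holds 20% of the data.

```python
# Hypothetical values chosen for illustration only.
percent_in_bin = 20                 # area of the bar, in percent
left_edge, right_edge = 10, 15
width = right_edge - left_edge      # 5 years

# height = area / width, measured in percent per year
height = percent_in_bin / width
print(height)   # 4.0
```

Multiplying the height back by the width recovers the 20% area of the bar.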

In [None]:
percent = ...
width = ...

height = ...
height

<details><summary>Click to Expand Solution</summary>
    

```python
percent = binned_data.where('bin', 40).column('Percent').item(0)
width = 65 - 40 

height = percent / width
height
```

</details>

<br><br><br><br>

---

<center> Return to Slides </center>

---

<br><br><br><br><br><br><br>

<br/>

# Defining Functions 

The purpose of defining a function is to give a name to a computational process that may be applied multiple times.

<br>

**Example:** Create a function that takes a numerical input and triples it: 
$\textsf{triple}(x) = 3x $

In [None]:
def triple(x):
    return 3 * x 

In [None]:
triple(3)

We can also assign a value to a name, and call the function on the name:

In [None]:
num = 4
triple(num)

In [None]:
triple(num * 5)

## The Anatomy of a Function
    
```python
def functionname(Arguments_Parameters_Expressions_or_Values):
    return return_expression
```

### Functions are Type-Agnostic 

In [None]:
triple(3)

In [None]:
triple(3.4)

In [None]:
triple('ha')

<br/> 

**Exercise:** Pass an array to the function `triple` to see what it produces.

In [None]:
np.arange(4)

In [None]:
triple(np.arange(4))

<details><summary>Click to Expand Solution</summary>
    

```python
triple(np.arange(4))
```

</details>
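
The results above all come from the same `*` operator, which behaves differently depending on the type it receives. As a sketch of one case worth contrasting with the array result, a plain Python list is repeated rather than multiplied element by element:

```python
def triple(x):
    return 3 * x

# Numbers are multiplied.
print(triple(3))                # 9

# Strings are repeated.
print(triple('ha'))             # hahaha

# Lists are also repeated, unlike NumPy arrays,
# where each element would be multiplied by 3.
print(triple([0, 1, 2, 3]))     # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
```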

<br><br><br><br>

---

<center> Return to Slides </center>

---

<br><br><br><br><br><br><br>

<br/><br/>

### Discussion Question 

```python
def f(s):
    return np.round(s / sum(s) * 100, 2)
```

In [None]:
def percent_of_total(s):
    return np.round(s / sum(s) * 100, 2)

In [None]:
first_four = make_array(1, 2, 3, 4)
first_four

In [None]:
percent_of_total(first_four)

<br> 
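
To see what `percent_of_total` computes step by step, here is a rough pure-Python equivalent. The name `percent_of_total_list` is hypothetical, and it works on a list rather than an array:

```python
def percent_of_total_list(values):
    """Return each value as a percentage of the sum, rounded to 2 places."""
    total = sum(values)
    return [round(v / total * 100, 2) for v in values]

result = percent_of_total_list([1, 2, 3, 4])
print(result)   # [10.0, 20.0, 30.0, 40.0]
```

The values sum to 10, so each entry is divided by 10 and scaled to a percentage.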

### Functions Can Take Multiple Arguments

**Example:** Calculate the Hypotenuse Length of a Right Triangle


Pythagoras's Theorem: If $x$ and $y$ denote the lengths of the right-angle sides, then the hypotenuse length $h$ satisfies:

$$ h^2 = x^2 + y^2 \qquad \text{which implies} \qquad h = \sqrt{x^2 + y^2} $$

In [None]:
def hypotenuse(x, y):
    hypot_squared = (x ** 2 + y ** 2)
    hypot = hypot_squared ** 0.5
    return hypot

<br>

*Note:* We could've typed the body all in one line. Do you find this more readable or less readable than the original version?

In [None]:
def hypotenuse(x, y):
    return (x ** 2 + y ** 2) ** 0.5

<br>
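
A quick way to check either version is the 3-4-5 right triangle. The sketch below also compares the result with `math.hypot` from Python's standard library, which computes the same quantity:

```python
import math

def hypotenuse(x, y):
    return (x ** 2 + y ** 2) ** 0.5

# The 3-4-5 right triangle: sqrt(9 + 16) = sqrt(25) = 5
print(hypotenuse(3, 4))     # 5.0

# Standard-library equivalent, used here only as a cross-check.
print(math.isclose(hypotenuse(3, 4), math.hypot(3, 4)))    # True
```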

**Example:**  Write a function that takes the year of birth of a person and produces their age in years.

In [None]:
def age(year):
    age = 2024 - year
    return age

In [None]:
age(1942)

<br>

**Exercise:** Now add some bells and whistles: take a person's name and year of birth (two arguments) and produce a sentence that states how old they are.

In [None]:
name_and_age('Joe', 1942)

<details><summary>Click to Expand Solution</summary>
    

```python
def name_and_age(name, year):
    return name + ' is ' + str(age(year)) + ' years old.'
```

</details>
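
The solution builds the sentence with `+` and `str`. As an alternative sketch (not the approach used in the lecture), an f-string produces the same sentence; the name `name_and_age_fstring` is made up for this comparison:

```python
def age(year):
    return 2024 - year

def name_and_age(name, year):
    return name + ' is ' + str(age(year)) + ' years old.'

def name_and_age_fstring(name, year):
    # f-strings convert the number to text automatically.
    return f'{name} is {age(year)} years old.'

print(name_and_age('Joe', 1942))            # Joe is 82 years old.
print(name_and_age_fstring('Joe', 1942))    # Joe is 82 years old.
```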

<br><br><br><br>

---

<center> Return to Slides </center>

---

<br><br><br><br><br><br><br>

# Apply 

Let's create a sample data set using characters from *The Office*.

In [None]:
staff = Table().with_columns(
    'Person', make_array('Jim', 'Pam', 'Michael', 'Dwight'),
    'Birth Year', make_array(1978, 1979, 1967, 1970)
)
staff

<br>

**Exercise:** Calculate the current age of each person using the `age` function created above. 

<details><summary>Click to Expand Solution</summary>
    

```python
staff.apply(age, 'Birth Year')
```

</details>

<br><br>

We can check this by running the `age` function for each item in the `Birth Year` column of the table. 

In [None]:
make_array(age(staff.column('Birth Year').item(0)),
           age(staff.column('Birth Year').item(1)),
           age(staff.column('Birth Year').item(2)),
           age(staff.column('Birth Year').item(3)))

<br><br> 
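
Conceptually, `apply` calls the function once per entry in the column and collects the results. The sketch below mimics that idea with a plain list and a list comprehension; it is an illustration of the behavior, not how the `datascience` library implements `apply`:

```python
def age(year):
    return 2024 - year

# Same birth years as in the staff table.
birth_years = [1978, 1979, 1967, 1970]

# Call age once per birth year and collect the results.
ages = [age(year) for year in birth_years]
print(ages)     # [46, 45, 57, 54]
```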

**Exercise:** Use the `name_and_age` function to produce the sentence for each Office character.

<details><summary>Click to Expand Solution</summary>
    

```python
staff.apply(name_and_age, 'Person', 'Birth Year')
```

</details>
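
When `apply` is given two column labels, the function receives one value from each column, row by row. A rough pure-Python analogue uses `zip` to pair the values; again, this only illustrates the idea and is not the library's implementation:

```python
def age(year):
    return 2024 - year

def name_and_age(name, year):
    return name + ' is ' + str(age(year)) + ' years old.'

people = ['Jim', 'Pam', 'Michael', 'Dwight']
birth_years = [1978, 1979, 1967, 1970]

# zip pairs each name with the matching birth year.
sentences = [name_and_age(n, y) for n, y in zip(people, birth_years)]
for sentence in sentences:
    print(sentence)
# Jim is 46 years old.
# Pam is 45 years old.
# Michael is 57 years old.
# Dwight is 54 years old.
```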