---
title: "User-Defined Functions"
toc: true
---

## `.apply(f, axis=0/1)`

A common operation in `pandas` is applying a function to each column or each row of a DataFrame.

The DataFrame `apply` method does exactly this.


<center><img src="../assets/vectorized2.png" width="30%" style="filter:invert(1)" /></center>


Suppose we want to count the number of unique values in each column. We can use `.apply` to answer that question:

In [None]:
def count_unique(col):
    return len(set(col))

elections.apply(count_unique, axis="index") # axis="index" is the same as axis=0: the function is passed one column at a time

Year             50
Candidate       132
Party            36
Popular vote    182
Result            2
%               182
dtype: int64
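
As a minimal sketch of how the `axis` argument can be spelled, the example below runs the same function with `axis="index"`, `axis=0`, and no `axis` at all. The `toy` DataFrame is hypothetical stand-in data, not the real `elections` table.

```python
import pandas as pd

# Hypothetical toy data standing in for `elections`
toy = pd.DataFrame({
    "Year": [2016, 2016, 2020, 2020],
    "Party": ["Democratic", "Republican", "Democratic", "Republican"],
    "Result": ["loss", "win", "win", "loss"],
})

def count_unique(col):
    # `col` is a single column of the DataFrame
    return len(set(col))

by_name = toy.apply(count_unique, axis="index")  # axis given by name
by_number = toy.apply(count_unique, axis=0)      # axis given by number
by_default = toy.apply(count_unique)             # axis=0 is the default

print(by_name)
```

All three calls return the same Series, with one entry per column.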

### Column-wise: `axis=0` (default)

`data.apply(f, axis=0)` applies the function `f` to <b><u>each column</u></b> of the DataFrame `data`. 

<center><img src="../assets/axis0b.png" width="100%" style="filter:invert(1)" /></center>

For example, to find the number of unique values in each column of the `elections` DataFrame, we could use the following code:


In [None]:
def count_unique(column):
    return len(column.unique())

elections.apply(count_unique, axis=0)

Year             50
Candidate       132
Party            36
Popular vote    182
Result            2
%               182
dtype: int64
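
To make the column-wise behavior concrete, here is a small sketch showing what `f` receives on each call: one column as a `pandas` Series whose `name` is the column label. The `toy` DataFrame and the `describe_input` helper are hypothetical, introduced only for illustration. The sketch also compares the result with the built-in `DataFrame.nunique` method, which computes the same counts for data without missing values.

```python
import pandas as pd

# Hypothetical toy data standing in for `elections`
toy = pd.DataFrame({
    "Party": ["Democratic", "Republican", "Democratic"],
    "Result": ["win", "loss", "loss"],
})

def describe_input(column):
    # Each call receives one column as a pandas Series
    return f"{type(column).__name__} named {column.name!r} with {len(column)} values"

received = toy.apply(describe_input, axis=0)

# Counting unique values with .apply, and with the built-in method
via_apply = toy.apply(lambda column: len(column.unique()), axis=0)
built_in = toy.nunique()

print(received["Party"])
print(via_apply)
```

When a built-in method such as `nunique` already exists, it is usually the simpler choice; `.apply` is most useful for custom logic that has no built-in equivalent.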

### Row-wise: `axis=1`

`data.apply(f, axis=1)` applies the function `f` to <b><u>each row</u></b> of the DataFrame `data`.

<center><img src="../assets/axis1b.png" width="100%" style="filter:invert(1)" /></center>

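For instance, suppose we want to estimate the total number of votes cast in each election.

Each row records a candidate's popular vote and the percentage of the total that it represents, so we can use `.apply` to solve the following relationship for the total: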

$$ \text{total} \times \frac{\%}{100} = \text{Popular vote} $$

In [None]:
def compute_total(row):
    # Solve total * (% / 100) = Popular vote for total; int() truncates the result
    return int(row['Popular vote']*100/row['%'])

elections.apply(compute_total, axis=1)

0         264413
1         264412
2        1143702
3        1143703
4        1287655
         ...    
177    135720167
178    158383403
179    158383403
180    158383401
181    158383402
Length: 182, dtype: int64
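
The sketch below repeats the row-wise computation on a hypothetical two-row `toy` DataFrame (made-up numbers, not the real `elections` data) so the arithmetic can be checked by hand. It also shows an equivalent column arithmetic version, which avoids calling a Python function once per row.

```python
import pandas as pd

# Hypothetical toy data: two candidates in one election with 1000 total votes
toy = pd.DataFrame({
    "Candidate": ["A", "B"],
    "Popular vote": [600, 400],
    "%": [60.0, 40.0],
})

def compute_total(row):
    # `row` is a single row; its values are looked up by column label
    return int(row["Popular vote"] * 100 / row["%"])

row_wise = toy.apply(compute_total, axis=1)

# Same formula written as arithmetic on whole columns
vectorized = (toy["Popular vote"] * 100 / toy["%"]).astype(int)

print(row_wise.tolist())
print(vectorized.tolist())
```

Both versions give 1000 for each row here. In the `elections` output above, rows that appear to belong to the same election differ by a vote or two; this is plausibly because the `%` column is rounded and `int()` truncates, so the totals are best read as estimates.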