# Using the `apply()` method in pandas

Sometimes, creating a calculated column in pandas is as simple as this:

```python
df['difference'] = df['first_column'] - df['second_column']
```

or this:

```python
df['date_fixed'] = pd.to_datetime(df['date'])
```

Other times, though, your needs are more complex -- you need to take each row of data in your data frame and do _several things_ to it. That's where [`apply()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html) comes in.

Given a function, `apply()` will, uh, _apply_ that function to every row (or every column, depending on how you call it) in the data frame. A common scenario for doing so would be to create a new column.

An example might make this idea a little clearer. Let's load up a CSV of Texas death row media witnesses.

In [None]:
import pandas as pd

In [None]:
df = pd.read_csv('../data/tx-death-row-media-list.csv', parse_dates=['execution_date'])

Now let's say we want to create a new column with the _month_ of the execution. [Given what we know about date objects](Date%20and%20time%20data%20types.ipynb), this should be simple, right?

So this might be my first guess:

In [None]:
df['month'] = df['execution_date'].month

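Womp womp. `df['execution_date']` is a whole column of dates (a pandas Series), not a single date, so it doesn't have a `.month` attribute of its own. One way around this is to create a _function_ that pulls the month out of a single row. Then we can _apply_ that function to each row.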

In [None]:
def get_month(row):
    '''Given a row of data, return the month of the execution date'''
    return row['execution_date'].month

... and now we can apply it. We also need to specify _how_ it's going to be applied. `axis=0` is the default and attempts to apply the function to each _column_. We want `axis=1`, which applies the function to each _row_ of data.

In [None]:
df['month'] = df.apply(get_month, axis=1)

In [None]:
df.head()
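To see the difference between the two `axis` settings, here's a minimal sketch on a tiny made-up data frame. The name `toy` and the columns `a` and `b` are invented for illustration and have nothing to do with the death row data.

```python
import pandas as pd

# Hypothetical toy data, just to show what the function receives
toy = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# axis=0 (the default): the function gets each *column* as a Series
col_totals = toy.apply(lambda col: col.sum())

# axis=1: the function gets each *row* as a Series,
# so we can look up values by column name
row_totals = toy.apply(lambda row: row['a'] + row['b'], axis=1)

print(col_totals)
print(row_totals)
```

With `axis=0` you get back one value per column; with `axis=1` you get back one value per row, which is the shape you need to fill a new column.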

We could also have dropped in a _lambda expression_ for the function -- in this case, it's simple enough to be readable:

In [None]:
df['month'] = df.apply(lambda x: x['execution_date'].month, axis=1)

In [None]:
df.head()

In [None]:
sorted(df.month.unique())
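If you don't have the CSV handy, you can try the same pattern on a small stand-in data frame. This is only a sketch: the `toy` data frame, the `name` column and the dates below are made up for illustration, and only the `execution_date` column name and the `get_month` function come from the example above.

```python
import pandas as pd

# Made-up stand-in for the CSV; these rows are invented, not real records
toy = pd.DataFrame({
    'name': ['A', 'B', 'C'],
    'execution_date': pd.to_datetime(['2001-03-07', '2005-11-15', '2010-03-30']),
})

def get_month(row):
    '''Given a row of data, return the month of the execution date'''
    return row['execution_date'].month

# axis=1 hands get_month one row at a time
toy['month'] = toy.apply(get_month, axis=1)

# The distinct months that show up, in order
print(sorted(toy['month'].unique()))
```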