In [2]:
import pandas as pd
import numpy as np

# A Variety of Functions

This subchapter introduces several more pandas functions that come in handy when working with data. First, let us load the same dataset we used in the first subchapter of the `Pandas` chapter. As a reminder, this dataset has beer sales across 50 continental states in the US. It is sourced from [_Salience and Taxation: Theory and Evidence_](https://www.aeaweb.org/articles?id=10.1257/aer.99.4.1145) by Chetty, Looney, and Kroft (AER 2010), and it includes 7 columns:
- `st_name`: the state abbreviation
- `year`: the year the data was recorded
- `c_beer`: the quantity of beer consumed, in thousands of gallons
- `beer_tax`: the ad valorem tax, as a percentage
- `btax_dollars`: the excise tax, represented in dollars per case (24 cans) of beer 
- `population`: the population of the state, in thousands
- `salestax`: the sales tax percentage


In [3]:
df = pd.read_csv('data/beer_tax.csv')
df

| Unnamed: 0 | st_name | year | c_beer | beer_tax | btax_dollars | population | salestax |
|---|---|---|---|---|---|---|---|
| 0 | AL | 1970 | 33098 | 72.341130 | 2.370 | 3450 | 4.0 |
| 1 | AL | 1971 | 37598 | 69.304600 | 2.370 | 3497 | 4.0 |
| 2 | AL | 1972 | 42719 | 67.149190 | 2.370 | 3539 | 4.0 |
| 3 | AL | 1973 | 46203 | 63.217026 | 2.370 | 3580 | 4.0 |
| 4 | AL | 1974 | 49769 | 56.933796 | 2.370 | 3627 | 4.0 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1703 | WY | 1999 | 12423 | 0.319894 | 0.045 | 492 | 4.0 |
| 1704 | WY | 2000 | 12595 | 0.309491 | 0.045 | 494 | 4.0 |
| 1705 | WY | 2001 | 12808 | 0.300928 | 0.045 | 494 | 4.0 |
| 1706 | WY | 2002 | 13191 | 0.296244 | 0.045 | 499 | 4.0 |


## Conditional Selection

A common task when looking at data is to keep only the rows that satisfy a certain condition. In other words, we want to extract all the rows **where** something is true, such as all rows for a particular state or all rows after a particular year. Conditional selection lets us do exactly this.
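As a minimal sketch of the idea, the cell below builds a small DataFrame from a few rows copied out of the output above (so it runs without the CSV file) and keeps only the rows where the state is Wyoming. The names `sample` and `wy_rows` are placeholders chosen for this illustration.

```python
import pandas as pd

# A few rows copied from the displayed output, with a subset of the columns
sample = pd.DataFrame({
    "st_name": ["AL", "AL", "AL", "WY", "WY", "WY"],
    "year": [1970, 1971, 1972, 2000, 2001, 2002],
    "c_beer": [33098, 37598, 42719, 12595, 12808, 13191],
})

# Keep only the rows where the state abbreviation is "WY"
wy_rows = sample[sample["st_name"] == "WY"]
print(wy_rows)
```

The same pattern should work on the full `df` loaded above, with `sample` replaced by `df`.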

In the [previous subchapter](loading-looking.ipynb), we briefly gave an example of how we can use `.loc` and `.iloc` to extract all rows that match certain values. To be more precise, the condition we pass to `.loc` or `.iloc` evaluates to an array of True/False values, and that array determines which rows are included. This is very similar to our prior discussion on [NumPy slicing](../02-prereqs/numpy.ipynb).
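To make the True/False array visible, here is a small sketch that stores the condition in a variable before using it. It again uses a few rows copied from the output above. The names `mask`, `large`, and `large_pops`, and the threshold of 1000 (one million people, since `population` is in thousands), are choices made for this illustration only.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "st_name": ["AL", "AL", "WY", "WY"],
    "year": [1970, 1971, 2001, 2002],
    "population": [3450, 3497, 494, 499],
})

# The comparison produces a Boolean Series with one True/False per row
mask = sample["population"] > 1000
print(mask.tolist())  # [True, True, False, False]

# .loc keeps exactly the rows where the mask is True
large = sample.loc[mask]
print(large)

# The NumPy analogue: index an array with a Boolean array of the same length
pops = sample["population"].to_numpy()
large_pops = pops[pops > 1000]
print(large_pops)
```

Writing the mask on its own line is optional, but it can make longer conditions easier to read and reuse.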