In [None]:
import pandas as pd
import requests

# Computing AQI

![equation](equation.gif)
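
As a worked example of the formula, assume a 24-hour PM<sub>2.5</sub> concentration of 20.0 µg/m<sup>3</sup> (an illustrative value, not a measurement). In the breakpoint table scraped further down, that concentration falls in the "Moderate" row, where 12.1-35.4 µg/m<sup>3</sup> maps to an AQI of 51-100:

$$I = \frac{100-51}{35.4-12.1}(20.0-12.1)+51 = \frac{49}{23.3}(7.9)+51 \approx 67.6$$

Rounded to the nearest integer, that gives an AQI of 68.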

In [2]:
df_list = pd.read_html(
    'https://en.wikipedia.org/wiki/Air_quality_index', header=0)

In [3]:
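# The breakpoint table sat at position 14 when this was written; the index
# may shift as the Wikipedia article is edited.
aqi_df = df_list[14].drop(0)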

In [4]:
# `n` must be passed by keyword in pandas 2.0 and later.
aqi_df[['min','max']] = aqi_df['AQI'].str.split('-', n=1, expand=True)

In [5]:
aqi_df.columns

Index(['O3 (ppb)', 'O3 (ppb).1', 'PM2.5 (µg/m3)', 'PM10 (µg/m3)', 'CO (ppm)',
       'SO2 (ppb)', 'NO2 (ppb)', 'AQI', 'AQI.1', 'min', 'max'],
      dtype='object')

In [6]:
aqi_df.rename(columns={'O3 (ppb).1': 'O3 (ppb) 1 hour', 'AQI.1': 'Category'}, inplace=True)

In [7]:
# The final value for "Category" is blank in the source table but should also
# be "Hazardous", so forward-fill it from the row above.
aqi_df.ffill(inplace=True)

In [8]:
aqi_df

| | O3 (ppb) | O3 (ppb) 1 hour | PM2.5 (µg/m3) | PM10 (µg/m3) | CO (ppm) | SO2 (ppb) | NO2 (ppb) | AQI | Category | min | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0-54 (8-hr) | - | 0.0-12.0 (24-hr) | 0-54 (24-hr) | 0.0-4.4 (8-hr) | 0-35 (1-hr) | 0-53 (1-hr) | 0-50 | Good | 0 | 50 |
| 2 | 55-70 (8-hr) | - | 12.1-35.4 (24-hr) | 55-154 (24-hr) | 4.5-9.4 (8-hr) | 36-75 (1-hr) | 54-100 (1-hr) | 51-100 | Moderate | 51 | 100 |
| 3 | 71-85 (8-hr) | 125-164 (1-hr) | 35.5-55.4 (24-hr) | 155-254 (24-hr) | 9.5-12.4 (8-hr) | 76-185 (1-hr) | 101-360 (1-hr) | 101-150 | Unhealthy for Sensitive Groups | 101 | 150 |
| 4 | 86-105 (8-hr) | 165-204 (1-hr) | 55.5-150.4 (24-hr) | 255-354 (24-hr) | 12.5-15.4 (8-hr) | 186-304 (1-hr) | 361-649 (1-hr) | 151-200 | Unhealthy | 151 | 200 |
| 5 | 106-200 (8-hr) | 205-404 (1-hr) | 150.5-250.4 (24-hr) | 355-424 (24-hr) | 15.5-30.4 (8-hr) | 305-604 (24-hr) | 650-1249 (1-hr) | 201-300 | Very Unhealthy | 201 | 300 |
| 6 | - | 405-504 (1-hr) | 250.5-350.4 (24-hr) | 425-504 (24-hr) | 30.5-40.4 (8-hr) | 605-804 (24-hr) | 1250-1649 (1-hr) | 301-400 | Hazardous | 301 | 400 |
| 7 | - | 505-604 (1-hr) | 350.5-500.4 (24-hr) | 505-604 (24-hr) | 40.5-50.4 (8-hr) | 805-1004 (24-hr) | 1650-2049 (1-hr) | 401-500 | Hazardous | 401 | 500 |
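
The pollutant columns hold each breakpoint range as a string with the averaging period appended, so the bounds have to be extracted before they can be used numerically. The following is a minimal sketch of one way to do that. `parse_breakpoints` is a hypothetical helper, not part of the notebook, and accepting an en dash as well as a hyphen is an assumption about how the source page might format ranges.

```python
import re

# Hypothetical helper (not from the notebook): pull the numeric bounds out of
# a breakpoint cell such as "12.1-35.4 (24-hr)". Accepting an en dash as the
# separator is an assumption; the output above shows plain hyphens.
_RANGE = re.compile(r'^\s*(\d+(?:\.\d+)?)\s*[-\u2013]\s*(\d+(?:\.\d+)?)')

def parse_breakpoints(cell):
    """Return (low, high) as floats, or None when the cell holds no range."""
    match = _RANGE.match(str(cell))
    if match is None:
        return None  # e.g. the "-" placeholder in the ozone columns
    return float(match.group(1)), float(match.group(2))

print(parse_breakpoints('12.1-35.4 (24-hr)'))  # (12.1, 35.4)
print(parse_breakpoints('-'))                  # None
```

Applied column by column (for example with `Series.map`), this would yield the numeric bounds for each pollutant; the averaging period in parentheses is simply discarded here.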


> Oregon’s index is based on three pollutants regulated by the federal Clean Air Act: ground-level ozone, particle pollution and nitrogen dioxide.<sup>2</sup>

Ozone: O<sub>3</sub> (ppb)

Particle pollution: PM<sub>2.5</sub> (µg/m<sup>3</sup>)

Nitrogen dioxide: NO<sub>2</sub> (ppb)

Note: AQI data typically comes in one of two forms: either the concentration and the AQI for a particular pollutant, *or* the AQI for a 24-hour period together with the "Defining parameter," i.e. the pollutant whose value drove the AQI for that day.

To do:

Use the above data to define functions to compute AQI and category whenever that data is absent.
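
One possible shape for such a function is sketched below, under some assumptions: the breakpoints are hard-coded for PM<sub>2.5</sub> only (copied from the table above rather than read from `aqi_df`), the result is rounded to the nearest integer, and the names `PM25_BREAKPOINTS` and `compute_aqi` are hypothetical. It applies the linear interpolation formula from the top of this notebook.

```python
# PM2.5 (24-hr) rows copied by hand from the table above, as
# (C_low, C_high, I_low, I_high, category). Hard-coding them is an
# assumption made to keep the sketch self-contained.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50, 'Good'),
    (12.1, 35.4, 51, 100, 'Moderate'),
    (35.5, 55.4, 101, 150, 'Unhealthy for Sensitive Groups'),
    (55.5, 150.4, 151, 200, 'Unhealthy'),
    (150.5, 250.4, 201, 300, 'Very Unhealthy'),
    (250.5, 350.4, 301, 400, 'Hazardous'),
    (350.5, 500.4, 401, 500, 'Hazardous'),
]

def compute_aqi(concentration, breakpoints=PM25_BREAKPOINTS):
    """Return (aqi, category), or (None, None) if no row covers the value."""
    for c_low, c_high, i_low, i_high, category in breakpoints:
        if c_low <= concentration <= c_high:
            # Linear interpolation between the index breakpoints.
            aqi = (i_high - i_low) / (c_high - c_low) * (concentration - c_low) + i_low
            return round(aqi), category
    return None, None

print(compute_aqi(20.0))   # (68, 'Moderate')
print(compute_aqi(600.0))  # (None, None)
```

A concentration that falls in the gap between two rows (12.05, say) also returns `(None, None)`; rounding or truncating the input to the table's precision first is left to the caller. To handle several pollutants, the same function could be called once per pollutant with that pollutant's breakpoint list.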

# County data

<hr>

# References

<sup>1</sup> https://en.wikipedia.org/wiki/Air_quality_index#Computing_the_AQI
  - Note that I'm using a gif of the equation here instead of the $\LaTeX$ I originally wrote because on GitHub -- but not locally -- it renders the definitions following the "where" on one line, in improbably small type. I have the code at the bottom of this notebook, and if you know what the deal is, feel free to submit a pull request!

<sup>2</sup> https://web.archive.org/web/20180822170335/https://www.oregon.gov/deq/aq/Pages/aqi.aspx

<hr>

$$I = \frac{I_{high}-I_{low}}{C_{high}-C_{low}}(C-C_{low})+I_{low}$$

where

$$
\begin{aligned}
I &= \textrm{the Air Quality Index,}\\
C &= \textrm{the pollutant concentration,}\\
C_{low} &= \textrm{the concentration breakpoint that is } \leq C,\\
C_{high} &= \textrm{the concentration breakpoint that is } \geq C,\\
I_{low} &= \textrm{the index breakpoint corresponding to } C_{low},\\
I_{high} &= \textrm{the index breakpoint corresponding to } C_{high}.
\end{aligned}
$$