Get today's Notebook at: www.eng.mu.edu/ccl/c4c

---
## Make a copy of this Notebook

- Go to `File > Save a copy in Drive`
  - You may want to change the default name to `your-name-Friday-project.ipynb`
- Now, follow the instructions and start coding!

---

## Goal of the project

Throughout the week, we talked about AQI and calculated some components of climate-related data using Python. Today we will use real monitoring data to calculate AQI for a time and place (in Wisconsin) of your choosing.

1. [Download data](https://colab.research.google.com/drive/1OYCup15Mtp-Mi0vn0nWgjpcBd0wN6zIx#scrollTo=riKJ6xB4O9da) for a day for a location in Wisconsin from the Department of Natural Resources website.
2. As in the earlier project, we have provided a `prepare_dnr_data_station_report()` function that cleans the data for your calculation.
  - The only input to this function is the filename.
3. Use the `numpy` module to extract the appropriate data from this DataFrame to calculate AQI **based on PM2.5 concentration**.
4. Write a function that calculates AQI using the formula [given below](https://colab.research.google.com/drive/1OYCup15Mtp-Mi0vn0nWgjpcBd0wN6zIx#scrollTo=V3jpCP_uDZcr)
  - The function must take a single value of the measured concentration ($C$) as input and return the calculated AQI.
5. Plot the AQI (using PM2.5 as pollutant) vs time using Matplotlib
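
As a rough illustration of step 5, the sketch below plots AQI against elapsed time. The function name `plot_aqi_vs_time` and the numbers are made up for this example; in your project, the hours and AQI values would come from the prepared data and your own AQI function.

```python
def plot_aqi_vs_time(hours, aqi_values):
    """Plot AQI against elapsed time in hours and return the Matplotlib axes."""
    # Imported inside the function so Matplotlib is only needed when plotting
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(hours, aqi_values, marker="o")
    ax.set_xlabel("Elapsed time [hrs]")
    ax.set_ylabel("AQI (PM2.5)")
    ax.set_title("Hourly AQI based on PM2.5")
    ax.grid(True)
    return ax


# Made-up stand-in values, NOT real monitoring data
hours = [0, 1, 2, 3, 4, 5]
aqi_values = [42.0, 48.5, 55.1, 61.3, 58.0, 50.2]
```

Calling `plot_aqi_vs_time(hours, aqi_values)` in a Colab cell should draw the figure below that cell.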

### Extra Practice
- Calculate and plot AQI using Ozone as pollutant
- Plot how the temperature and relative humidity varied hourly through the day
- Report the maximum and minimum temperature of the day.
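
One possible way to report the daily maximum and minimum is with NumPy's NaN-aware functions, sketched here on made-up temperatures. In the real data, the values would presumably come from a temperature column such as `T[F]`; the array below is only a placeholder.

```python
import numpy as np

# Made-up hourly temperatures in deg F; np.nan marks a missing reading
temperature_f = np.array([61.0, 59.5, np.nan, 63.2, 70.1, 74.8, 72.3])

t_max = np.nanmax(temperature_f)  # maximum, ignoring missing values
t_min = np.nanmin(temperature_f)  # minimum, ignoring missing values

print(f"Maximum temperature: {t_max} F")
print(f"Minimum temperature: {t_min} F")
```

Plain `np.max` and `np.min` would return `nan` here because one reading is missing, which is why the NaN-aware versions are used.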

---




## Calculating AQI
The Air Quality Index is calculated separately for each pollutant, based on the concentration of that specific pollutant. Remember we talked about common pollutants? For air quality, the ones that are used are: Ozone, PM2.5, PM10, CO (carbon monoxide), SO2 (sulfur dioxide), and NO2 (nitrogen dioxide). Among these, Ozone and PM are the most important.

In the final project, we will calculate AQI based on PM2.5 concentration for a day somewhere in Wisconsin.

### AQI Formula

The AQI is calculated using the formula below:

$$AQI = I_1 + \frac{I_2 - I_1}{B_2 - B_1}\times\left(C-B_1\right)$$

Here, **$C$ is the concentration of the pollutant** (usually averaged over a specified period).

**$I_1$, $I_2$, $B_1$, and $B_2$ are obtained from a table based on the value of $C$.** The tables for *PM2.5* and *Ozone* are listed below, in units that match the measurements available in the project data file.

#### For **PM2.5**


|Range| $B_1$ | $B_2$ | $I_1$ | $I_2$ |
|----|---|----|----|----|
|0 < $C$ <= 9| 0 | 9.0| 0|50|
|9 < $C$ <= 35.5| 9 | 35.5| 50|100|
|35.5 < $C$ <= 55.5|35.5 | 55.5| 100|150|
|55.5 < $C$ <= 125.5|55.5 | 125.5 | 150|200|
|125.5 < $C$ <= 225.5|125.5 | 225.5 | 200 |300|
|225.5 < $C$ |225.5 | 325.5 | 300 | 500 |
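
The PM2.5 table and the formula above can be combined into a single function, sketched below. This is one possible approach, not the required solution. The name `aqi_pm25` is made up for illustration, and two choices are assumptions: a concentration of exactly 0 is mapped to an AQI of 0, and any value above 325.5 is extrapolated with the last row, since that row has no upper limit on $C$.

```python
def aqi_pm25(c):
    """Return the AQI for a PM2.5 concentration c, using the table above."""
    # Each row is (B1, B2, I1, I2), copied from the PM2.5 table
    breakpoints = [
        (0.0, 9.0, 0, 50),
        (9.0, 35.5, 50, 100),
        (35.5, 55.5, 100, 150),
        (55.5, 125.5, 150, 200),
        (125.5, 225.5, 200, 300),
        (225.5, 325.5, 300, 500),
    ]
    if c < 0:
        raise ValueError("concentration cannot be negative")

    # Assumption: the open-ended last row is used for any c above 225.5
    b1, b2, i1, i2 = breakpoints[-1]
    for row in breakpoints:
        if c <= row[1]:  # first row whose upper bound B2 covers c
            b1, b2, i1, i2 = row
            break

    return i1 + (i2 - i1) / (b2 - b1) * (c - b1)


print(aqi_pm25(20.0))
```

For $C = 20$, the second row applies, so the AQI is $50 + \frac{100 - 50}{35.5 - 9} \times (20 - 9) \approx 70.75$.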


#### For **Ozone**

|Range| $B_1$ | $B_2$ | $I_1$ | $I_2$ |
|---|----|----|----|----|
|0 < $C$ <= 125| 0 | 125| 0|100|
|125 < $C$ <= 165| 125 | 165| 100|150|
|165 < $C$ <= 205|165 | 205 | 150|200|
|205 < $C$ <= 405|205 | 405 | 200 |300|
|405 < $C$  |405 | 604 | 300 | 500 |
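
For the Ozone extra practice, one option is to store the table as data and pass it to a general helper, as sketched below. The names `OZONE_TABLE` and `aqi_from_table` are made up for this example. As an assumption, values above the last $B_2$ are extrapolated with the last row, since that row has no upper limit on $C$.

```python
# Each row is (B1, B2, I1, I2), copied from the Ozone table above
OZONE_TABLE = [
    (0, 125, 0, 100),
    (125, 165, 100, 150),
    (165, 205, 150, 200),
    (205, 405, 200, 300),
    (405, 604, 300, 500),
]


def aqi_from_table(c, table):
    """Interpolate the AQI for concentration c using rows of (B1, B2, I1, I2)."""
    row = table[-1]  # assumption: the open-ended last row is the fallback
    for candidate in table:
        if c <= candidate[1]:  # first row whose upper bound B2 covers c
            row = candidate
            break
    b1, b2, i1, i2 = row
    return i1 + (i2 - i1) / (b2 - b1) * (c - b1)


# Made-up ozone concentrations, NOT real monitoring data
ozone_values = [62.5, 125, 145]
ozone_aqi = [aqi_from_table(c, OZONE_TABLE) for c in ozone_values]
print(ozone_aqi)
```

The same helper would work for PM2.5 if it were given the PM2.5 rows instead, which avoids writing the formula twice.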


---




## Getting the data

- Go to [Wisconsin Air Quality Monitoring Data Website](https://airquality.wi.gov/report/SingleStationReport).

*On left pane*
- Pick a county
- Pick an owner (preferably, *DNR - Active*)
- Select a station
   - Make sure that this station has PM2.5 data
- Select All monitors

*On right pane*
- Select *Excel* under report view
- Select *Daily* under period
- Choose a day by changing the *From date*
- Under type, select *Average*
- Select *1 Hour* for both from and to time base

Then click *DISPLAY*. This will download the data for the selected day.

Now use this downloaded data for your project.

---


## `prepare_dnr_data_station_report()` function





Run this block of code first. Do not modify it.

In [None]:
###_______________________________________###
###      RUN THIS BLOCK FIRST             ###
###_______________________________________###
### DO NOT MODIFY THIS CODE               ###
### THIS CODE CONTAINS THE FUNCTION NEEDED###
### TO PREPARE DATA                       ###
###_______________________________________###
###     RUN THIS BLOCK FIRST              ###
###_______________________________________###


# The function to prepare data for the project
# First load the downloaded file on your Colab runtime
# Then call this function

def prepare_dnr_data_station_report(file_path):

  import pandas as pd

  data = pd.read_excel(file_path)

  #
  # Drop row number 0
  data = data.drop(0).reset_index(drop=True)

  # Combine rows 0 and 1 to create a new header
  new_header = data.iloc[0] + ' (' + data.iloc[1] + ')'
  new_header = new_header.str.replace(r' \(nan\)', '', regex=True).str.strip() # Clean up ' (nan)' from combined header (raw string avoids an invalid-escape warning)
  data.columns = new_header

  # Drop the original header rows (now rows 0 and 1)
  data = data.drop([0, 1]).reset_index(drop=True)

  # Convert the first column to datetime objects, coercing errors
  data['DateTime'] = pd.to_datetime(data.iloc[:, 0], errors='coerce')

  data['Time'] = pd.to_datetime(data['DateTime'])
  # data['Hour'] = data['Time'].dt.hour
  # data['Minute'] = data['Time'].dt.minute
  # data['Second'] = data['Time'].dt.second

  # Calculate elapsed time in seconds
  start_time = data['Time'].iloc[0]
  data['ElapsedTime[s]'] = (data['Time'] - start_time).dt.total_seconds()
  data['ElapsedTime[hrs]'] = data['ElapsedTime[s]'] / 3600
  data = data.set_index('Time')

  # Keep only rows with valid datetime entries
  data = data.dropna(subset=['DateTime'])

  # Set 'DateTime' as the index
  data = data.set_index('DateTime')


  new_column_names = {
              'date time': 'Time',
              'NO (ppb)': 'NO[PPB]',
              'NO2 (ppb)': 'NO2[PPB]',
              'NOx (ppb)': 'NOx[PPB]',
              'Outdoor tempearture in deg (degF)': 'T[F]',
              'Ozone (ppb)': 'Ozone[PPB]',
              'SO2 (ppb)': 'SO2[PPB]',
              'Wind Direction (Degrees)': 'WindDir[Degrees]',
              'Wind Speed (mph)': 'WindSpeed[MPH]',
              'PM COARSE (10-2.5) (ug/m3 (LC))':'PM_coarse[ug/m3]',
              'PM10 (ug/m3 (25C))':'PM10[ug/m3]',
              'PM2.5 (ug/m3 (LC))':'PM2.5[ug/m3]',
              'CO (ppm)': 'CO[ppm]',
              'PM2.5 (ug/m3)':'PM2.5[ug/m3]',
              'PM10 (ug/m3)':'PM10[ug/m3]',
              'Relative Humidity (%RH)': 'RH',
              'NOy (ppb)': 'NOy[PPB]',
              'NOy - NO (ppb)': 'NOy-NO[PPB]',
              'PRECIPITATION - RAIN/MELT PCPT (inches)': 'Rain[inches]',
              'PRECIPITATION - SNOW PCPT (inches)': 'Snow[inches]',
              'OZONE (ppb)': 'Ozone[PPB]'
          }

  # Rename the columns present in the DataFrame
  data = data.rename(columns={k: v for k, v in new_column_names.items() if k in data.columns})

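  # Drop columns whose header is NaN; errors='ignore' avoids a KeyError when there are none
  data = data.drop(columns=[float('nan')], errors='ignore')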

  return data

---
## Now your code starts
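
A possible first step is to pull the PM2.5 column out of the prepared DataFrame as a NumPy array. The sketch below uses a small hand-made DataFrame in place of the real output of `prepare_dnr_data_station_report()`. The column names `PM2.5[ug/m3]` and `ElapsedTime[hrs]` are taken from that function, but the values are made up, and storing them as text is an assumption about how the Excel cells are read.

```python
import numpy as np
import pandas as pd

# Hand-made stand-in for the DataFrame returned by the prepare function
data = pd.DataFrame({
    "PM2.5[ug/m3]": ["7.2", "8.9", None, "12.4"],
    "ElapsedTime[hrs]": [0.0, 1.0, 2.0, 3.0],
})

# Convert text to numbers; anything that cannot be converted becomes NaN
pm25 = pd.to_numeric(data["PM2.5[ug/m3]"], errors="coerce").to_numpy()
hours = data["ElapsedTime[hrs]"].to_numpy()

# Keep only the hours that have a valid PM2.5 reading
valid = ~np.isnan(pm25)
pm25 = pm25[valid]
hours = hours[valid]

print(pm25)
print(hours)
```

With the real file, `data` would instead come from calling `prepare_dnr_data_station_report()` with the name of your downloaded file.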

In [None]:
# Final Project

