# Thomas Fire

Author: Patricia Park

Repository: https://github.com/p-park6/aqi-false-color.git

## Highlights in this notebook:
- Data tidying and filtering with `pandas`
- Statistical analysis of the data
- Geospatial data tidying with `geopandas`
- Visualizing data as a graph and a map

## About the Data
Three datasets were used:
1. The first dataset contains [Air Quality Index (AQI)](https://www.airnow.gov/aqi/aqi-basics/) data from the [US Environmental Protection Agency](https://www.epa.gov) website. We use it to visualize air quality over several months and to check for abnormal spikes during the [Thomas Fire](https://keyt.com/news/local-news/top-stories/2022/12/04/thomas-fire-5th-year-anniversary/) in 2017.

2. The second dataset includes a collection of bands (including red, green, blue, near-infrared and shortwave infrared). This data is from the [Landsat Collection 2 Surface Reflectance](https://www.usgs.gov/landsat-missions/landsat-collection-2-surface-reflectance), collected by the Landsat 8 satellite. The data was accessed and pre-processed in the Microsoft Planetary Computer to remove data outside land and coarsen the spatial resolution ([Landsat Collection in MPC](https://planetarycomputer.microsoft.com/dataset/landsat-c2-l2)). Data should be used for visualization purposes only.
- Additional information:
    - [Landsat Satellite homepage](https://www.usgs.gov/landsat-missions)


3. The third dataset is a shapefile of the fire perimeters in California during 2017. The [complete file can be accessed in the CA state geoportal](https://gis.data.ca.gov/datasets/CALFIRE-Forestry::california-fire-perimeters-all-1/about).

## Final outputs:
This notebook produces a graph of the AQI in Santa Barbara over several months and a map showing the area where the Thomas Fire took place.

## Import libraries

In [1]:
import os
import numpy as np
import pandas as pd

import geopandas as gpd
import xarray as xr
import rioxarray as rioxr
import matplotlib.patches as mpatches
from matplotlib.markers import MarkerStyle

from shapely import Point
from shapely import Polygon
from shapely import box
import matplotlib.pyplot as plt
import matplotlib.lines as mlines

### Read in necessary data

In [None]:
# read in daily AQI 2017 zip file from url
aqi_17 = pd.read_csv("https://aqs.epa.gov/aqsweb/airdata/daily_aqi_by_county_2017.zip")

# read in daily AQI 2018 zip file from url
aqi_18 = pd.read_csv("https://aqs.epa.gov/aqsweb/airdata/daily_aqi_by_county_2018.zip")

# Read in the Landsat bands dataset
# build the file path
ca_bands_fp = os.path.join(os.getcwd(), 'data', 'landsat8-2018-01-26-sb-simplified.nc')
# open the raster using the file path
ca_fires_bands_2017 = rioxr.open_rasterio(ca_bands_fp)

# Read in the California fire perimeters shapefile
ca_fires_perimeter_2017 = gpd.read_file(os.path.join(os.getcwd(), 'data', 'California_Fire_Perimeters_2017', 'California_Fire_Perimeters_2017.shp'))
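The AQI files above are read straight from `.zip` URLs. This works because `pd.read_csv` infers the compression from the file extension when the archive holds a single CSV. The sketch below shows the same behavior on a small local archive; the file name, column names, and values are made up for illustration and are not taken from the EPA data.

```python
import os
import tempfile
import zipfile

import pandas as pd

# Build a tiny stand-in for the EPA archive: a zip holding a single CSV.
# Column names and values are illustrative assumptions, not the real file.
csv_text = (
    "State Name,county Name,Date,AQI\n"
    "California,Santa Barbara,2017-12-06,120\n"
)

tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "toy_daily_aqi.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("toy_daily_aqi.csv", csv_text)

# compression="infer" is the default, so the .zip extension is enough
toy_aqi = pd.read_csv(zip_path)
print(toy_aqi)
```

If an archive contained more than one file, `pd.read_csv` would raise an error, and the CSV would need to be extracted first.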

## Prepare AQI data

### Data Exploration:

In [2]:
#look at the first 5 observations of aqi 17
aqi_17.head()

NameError: name 'aqi_17' is not defined

In [4]:
#look at the first 5 observations of aqi 18
aqi_18.head()

NameError: name 'aqi_18' is not defined

### Merge datasets together

To make the data easier to work with, we will stack the two yearly datasets into a single data frame with `pd.concat`.

In [5]:
#join the two datasets together
aqi = pd.concat([aqi_17, aqi_18])
#print to see if the two were joined together successfully
aqi

NameError: name 'aqi_17' is not defined
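One thing worth knowing about `pd.concat` is that, by default, each input keeps its own row labels, so the combined frame can contain repeated index values. The sketch below uses two small hypothetical frames (`toy_17` and `toy_18`, not the real AQI data) to show the default behavior next to the `ignore_index=True` option, which renumbers the rows.

```python
import pandas as pd

# Hypothetical stand-ins for the 2017 and 2018 AQI tables
toy_17 = pd.DataFrame({"Date": ["2017-12-30", "2017-12-31"], "AQI": [55, 60]})
toy_18 = pd.DataFrame({"Date": ["2018-01-01", "2018-01-02"], "AQI": [48, 52]})

# Default: each frame keeps its own 0..n-1 labels, so labels repeat
stacked = pd.concat([toy_17, toy_18])
print(stacked.index.tolist())

# ignore_index=True assigns a fresh 0..n-1 index to the combined frame
stacked_reset = pd.concat([toy_17, toy_18], ignore_index=True)
print(stacked_reset.index.tolist())
```

Repeated labels are harmless when rows are later selected by column values or when another column becomes the index, but label-based lookups such as `.loc[0]` would return more than one row.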

### Reformat columns

For easier access, we will convert the column names to lowercase and replace spaces with underscores.

In [6]:
# change column names to lowercase and replace spaces with '_'
aqi.columns = aqi.columns.str.lower().str.replace(' ','_')

NameError: name 'aqi' is not defined
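As a quick illustration of what the renaming step does, the sketch below applies the same string operations to an empty frame with a few hypothetical column names. The names are chosen to resemble a typical county-level table and are an assumption, not a listing of the actual AQI columns.

```python
import pandas as pd

# Hypothetical column names with mixed case and spaces
toy = pd.DataFrame(
    columns=["State Name", "county Name", "Date", "AQI", "Defining Parameter"]
)

# Same operations as in the cell above: lowercase, then spaces to underscores
toy.columns = toy.columns.str.lower().str.replace(" ", "_")
print(list(toy.columns))
```

Names without spaces can then be used with attribute access (for example `toy.state_name`) as well as with the usual bracket syntax.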