# EDS 220 - Homework 4
## Task 2 - Section 1: Thomas Fire perimeter data retrieval and selection

Author: Nicole Pepper

Link to GitHub repo: https://github.com/nicolelpepper/eds220-hwk4

### About this notebook:

This notebook subsets the 2017 Thomas Fire perimeter from the CAL FIRE historic fire boundary dataset and exports it as a shapefile.

### Highlights: 
- Data wrangling with `pandas`
- Working with geospatial data with `geopandas`
- Exporting and saving a geospatial file


### Dataset descriptions:

- The California fire perimeters dataset (`California_Fire_Perimeters.shp`) is provided by CAL FIRE. It contains historical boundaries for fires (both natural and prescribed) in the state of California. The dataset has a good record of past large fires but is not complete and may be missing some fires. It is stored in my data folder, and the `thomas_fire.shp` file exported at the end of this notebook is a subset of it. (Access Date: 11/20/24, [Link to data](https://catalog.data.gov/dataset/california-fire-perimeters-all-b3436/resource/6955eaf7-6452-4922-bc7d-bdac9091c538?inner_span=True))
- The `landsat` dataset is an image from Landsat Collection 2 Level-2, from the Microsoft Planetary Computer data catalogue. Landsat Collection 2 Level-2 Science Products consist of atmospherically corrected surface reflectance and surface temperature image data. Collection 2 Level-2 Science Products are available from August 22, 1982 to present. It is accessed through UCSB Workbench 2. (Access Date: 11/20/24, [Link to data](https://planetarycomputer.microsoft.com/dataset/landsat-c2-l2))

### Set Up Workspace

In [1]:
# Load Libraries
import os
import pandas as pd
import geopandas as gpd
from shapely.geometry import Polygon
from pyproj import CRS

### Import Data

In [2]:
# Point PROJ to the data directory of the Anaconda installation
os.environ['PROJ_LIB'] = '/opt/anaconda3/share/proj'

# Read in ca fire perimeter data 
ca_fires = gpd.read_file("data/ca_fire_perim/California_Fire_Perimeters.shp")

### Explore CAL FIRE data

In [3]:
# Display all columns in preview
pd.set_option("display.max_columns", None)

# Check out ca fire perimeter data
ca_fires.head()

Unnamed: 0,YEAR_,STATE,AGENCY,UNIT_ID,FIRE_NAME,INC_NUM,ALARM_DATE,CONT_DATE,CAUSE,C_METHOD,OBJECTIVE,GIS_ACRES,COMMENTS,COMPLEX_NA,IRWINID,FIRE_NUM,COMPLEX_ID,DECADES,geometry
0,2023,CA,CDF,SKU,WHITWORTH,4808,2023-06-17,2023-06-17,5,1,1,5.72913,,,{7985848C-0AC2-4BA4-8F0E-29F778652E61},,,2020,"POLYGON ((-13682443.000 5091132.739, -13682445..."
1,2023,CA,LRA,BTU,KAISER,10225,2023-06-02,2023-06-02,5,1,1,13.6024,,,{43EBCC88-B3AC-48EB-8EF5-417FE0939CCF},,,2020,"POLYGON ((-13576727.142 4841226.161, -13576726..."
2,2023,CA,CDF,AEU,JACKSON,17640,2023-07-01,2023-07-02,2,1,1,27.8145,,,{B64E1355-BF1D-441A-95D0-BC1FBB93483B},,,2020,"POLYGON ((-13459243.000 4621236.000, -13458968..."
3,2023,CA,CDF,AEU,CARBON,18821,2023-07-11,2023-07-11,9,1,1,58.7602,,,{CB41DB0A-E4B1-489D-A4EA-738F2CD6DB3B},,,2020,"POLYGON ((-13468077.000 4642260.000, -13467975..."
4,2023,CA,CDF,AEU,LIBERTY,18876,2023-07-11,2023-07-12,14,1,1,70.979,,,{F83F70A4-07A7-40B8-BD51-10CCC1C30D63},,,2020,"POLYGON ((-13468418.000 4614853.000, -13468428..."


In [4]:
# Explore CAL FIRE dimensions/shape
ca_fires.shape

(22261, 19)

In [5]:
# Explore data types
print(ca_fires.dtypes)

YEAR_            int64
STATE           object
AGENCY          object
UNIT_ID         object
FIRE_NAME       object
INC_NUM         object
ALARM_DATE      object
CONT_DATE       object
CAUSE            int64
C_METHOD         int64
OBJECTIVE        int64
GIS_ACRES      float64
COMMENTS        object
COMPLEX_NA      object
IRWINID         object
FIRE_NUM        object
COMPLEX_ID      object
DECADES          int64
geometry      geometry
dtype: object


In [6]:
# Explore date range (min & max)
print("Max Year =", ca_fires['YEAR_'].max())
print("Min Year =", ca_fires['YEAR_'].min())

Max Year = 2023
Min Year = 0
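
A minimum year of 0 is not a real fire year. One plausible reading, which is an assumption and not something the dataset documentation here confirms, is that 0 marks records with an unknown year. The sketch below uses a small toy series (illustrative values, not the CAL FIRE data) to show how such records could be counted and excluded before computing the range.

```python
import pandas as pd

# Toy stand-in for the YEAR_ column; the values are illustrative only
years = pd.Series([0, 1950, 2017, 2023], name="YEAR_")

# Assumption: 0 is a placeholder for an unknown year, not a real fire year
valid_years = years[years > 0]

print("Min year (excluding 0) =", valid_years.min())
print("Records with year 0 =", int((years == 0).sum()))
```

Applied to the notebook, the same filter would be `ca_fires[ca_fires["YEAR_"] > 0]`.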


In [7]:
# ---- Check crs ----

# Check if fire data is geographic
print("Is the ca_fires CRS geographic?", ca_fires.crs.is_geographic)

# Check if fire data is projected 
print("Is the ca_fires CRS projected?", ca_fires.crs.is_projected)

# Print the CRS name
print("The ca_fires CRS is:", CRS(ca_fires.crs).name)

Is the ca_fires CRS geographic? False
Is the ca_fires CRS projected? True
The ca_fires CRS is: WGS_1984_Web_Mercator_Auxiliary_Sphere


#### Summary of Data:
My preliminary data exploration found that the CAL FIRE data is a projected geospatial layer. Its CRS is WGS 1984 Web Mercator Auxiliary Sphere. I retrieved the data types for each of the columns. The data frame has a total of 19 columns and 22,261 observations (recorded fires). Some of the attributes that I think will be helpful for this study include the 'YEAR_', 'FIRE_NAME', and 'GIS_ACRES' columns.
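
Because the CRS is projected, the coordinates in the `geometry` column are in meters, not degrees. As a rough illustration, the sketch below applies the inverse spherical Web Mercator formula to the first vertex of the WHITWORTH perimeter from the preview. The function name is hypothetical, and the sphere radius of 6378137 m is the value Web Mercator uses. In practice `ca_fires.to_crs("EPSG:4326")` would do this conversion for every vertex.

```python
import math

# Sphere radius used by Web Mercator (the WGS 84 semi-major axis), in meters
EARTH_RADIUS_M = 6378137.0

def web_mercator_to_lon_lat(x, y):
    """Convert spherical Web Mercator meters to (longitude, latitude) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat

# First vertex of the WHITWORTH perimeter shown in the data preview
lon, lat = web_mercator_to_lon_lat(-13682443.000, 5091132.739)
print(f"lon = {lon:.2f}, lat = {lat:.2f}")
```

The result falls in far northern California, which is consistent with the SKU unit listed for that fire.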

### Subset CAL FIRE Data to 2017 Thomas Fire

In [8]:
# Select and subset data for year = 2017 and name = Thomas Fire 
thomas_fire = ca_fires[(ca_fires["YEAR_"] == 2017) & (ca_fires["FIRE_NAME"] == "THOMAS")]

thomas_fire

Unnamed: 0,YEAR_,STATE,AGENCY,UNIT_ID,FIRE_NAME,INC_NUM,ALARM_DATE,CONT_DATE,CAUSE,C_METHOD,OBJECTIVE,GIS_ACRES,COMMENTS,COMPLEX_NA,IRWINID,FIRE_NUM,COMPLEX_ID,DECADES,geometry
2654,2017,CA,USF,VNC,THOMAS,3583,2017-12-04,2018-01-12,9,7,1,281791.0,CONT_DATE based on Inciweb,,,,,2010,"MULTIPOLYGON (((-13316089.016 4088553.040, -13..."
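
The subset above combines two boolean conditions with `&`. The toy example below shows the same pattern on a small `pandas` table. The 1999 THOMAS row is hypothetical, added only to illustrate why filtering on the name alone might not be enough.

```python
import pandas as pd

# Toy table; the 1999 THOMAS row is hypothetical, added to show a name collision
fires = pd.DataFrame({
    "YEAR_": [2023, 2017, 1999],
    "FIRE_NAME": ["WHITWORTH", "THOMAS", "THOMAS"],
})

# Each condition needs its own parentheses because & binds tighter than ==
mask = (fires["YEAR_"] == 2017) & (fires["FIRE_NAME"] == "THOMAS")
subset = fires[mask]

print(subset)
print("Rows matched:", len(subset))
```

Checking `len(subset)` after filtering is a quick way to confirm that exactly one perimeter was selected.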


### Export Thomas Fire boundary as shapefile to data folder

I chose to save the Thomas Fire perimeter as a shapefile because it is a widely supported, easy-to-use geospatial format that I am comfortable working with.

In [9]:
# Create file path to data folder 
filepath = "/Users/npepper/meds/eds-220/eds220-2024-hw/eds220-hwk4/data/thomas_fire"

# Save Thomas Fire boundary as a shapefile
thomas_fire.to_file(f"{filepath}/thomas_fire.shp", driver='ESRI Shapefile')
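
The export above writes to an absolute path on my machine and assumes the folder already exists. As a possible alternative, the sketch below builds the output path with `pathlib` and creates the folder if it is missing. The helper name `prepare_output_path` is hypothetical, and the temporary directory is used only so the example runs anywhere.

```python
from pathlib import Path
import tempfile

def prepare_output_path(base_dir, folder, filename):
    """Create base_dir/folder if needed and return the full output path."""
    out_dir = Path(base_dir) / folder
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename

# Demonstrated in a temporary directory so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    out_path = prepare_output_path(tmp, "thomas_fire", "thomas_fire.shp")
    folder_exists = out_path.parent.is_dir()
    print(out_path.name, folder_exists)
    # In the notebook this would be followed by:
    # thomas_fire.to_file(out_path, driver="ESRI Shapefile")
```

Using a path relative to the repository, such as `data/thomas_fire`, would also let others run the notebook without editing the file path.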

### References:

CAL FIRE (2024) *California Fire Perimeters (all)* [Data file] Available from: https://catalog.data.gov/dataset/california-fire-perimeters-all-b3436 Access date: 11/20/24

Carmen Galaz García (2024) *UCSB MEDS - 220 - Working With Environmental Datasets* [Source of Homework Assignment]. Course Website: https://meds-eds-220.github.io/MEDS-eds-220-course/ Access date: 11/20/24