# EDS 220 - Assignment 4 - Fire Perimeter Selection
### Student author: Bailey Jørgensen 

Repository Link: https://github.com/jorb1/eds220-hwk4

## Task 2: Visualizing fire scars through false color

### Background:

The Thomas Fire, which burned over 280,000 acres in Ventura and Santa Barbara counties in December 2017, was one of California’s largest wildfires at the time. It caused widespread ecological damage, displaced communities, and left lasting environmental impacts.

In this task, I will find and isolate the perimeter of the Thomas Fire using open-source data. I can then use the perimeter in further analysis of the fire's effects on Santa Barbara ecology. I will save the perimeter as a file in my repository so that I can use it in another notebook to complete my analysis.

**About the data**

In this task I will use historical open-access data about fire perimeters in California. There are several datasets with this information online. The dataset that I found is from data.gov at this link: https://catalog.data.gov/dataset/california-fire-perimeters-all-b3436. The site was particularly useful because it offers the data in multiple file formats.


### First up in my Analysis: 
2. Fire perimeter data retrieval and selection

a. To begin, I will do some exploratory data analysis to get a sense of the dataset I am using. I will ensure that I know the CRS of the data, for use in further joining and analysis. 



In [1]:
# Load libraries

import os
import pandas as pd
import matplotlib.pyplot as plt
import geopandas as gpd

# Read in data 

fp_perimeter = os.path.join('data', 'California_Fire_Perimeters_(all).shp')
perimeter = gpd.read_file(fp_perimeter)

ERROR 1: PROJ: proj_create_from_database: Open of /opt/anaconda3/envs/eds220-env/share/proj failed


In [2]:
perimeter.head(3)

| | YEAR_ | STATE | AGENCY | UNIT_ID | FIRE_NAME | INC_NUM | ALARM_DATE | CONT_DATE | CAUSE | C_METHOD | OBJECTIVE | GIS_ACRES | COMMENTS | COMPLEX_NA | IRWINID | FIRE_NUM | COMPLEX_ID | DECADES | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2023 | CA | CDF | SKU | WHITWORTH | 4808 | 2023-06-17 | 2023-06-17 | 5 | 1 | 1 | 5.72913 | | | {7985848C-0AC2-4BA4-8F0E-29F778652E61} | | | 2020 | POLYGON ((-13682443.000 5091132.739, -13682445... |
| 1 | 2023 | CA | LRA | BTU | KAISER | 10225 | 2023-06-02 | 2023-06-02 | 5 | 1 | 1 | 13.6024 | | | {43EBCC88-B3AC-48EB-8EF5-417FE0939CCF} | | | 2020 | POLYGON ((-13576727.142 4841226.161, -13576726... |
| 2 | 2023 | CA | CDF | AEU | JACKSON | 17640 | 2023-07-01 | 2023-07-02 | 2 | 1 | 1 | 27.8145 | | | {B64E1355-BF1D-441A-95D0-BC1FBB93483B} | | | 2020 | POLYGON ((-13459243.000 4621236.000, -13458968... |


In [3]:
# Figure out the dimensions of the dataframe
print("Shape of the data:", perimeter.shape)

# Figure out if the columns are the expected datatypes
print("Data types:", perimeter.dtypes)

Shape of the data: (22261, 19)
Data types: YEAR_            int64
STATE           object
AGENCY          object
UNIT_ID         object
FIRE_NAME       object
INC_NUM         object
ALARM_DATE      object
CONT_DATE       object
CAUSE            int64
C_METHOD         int64
OBJECTIVE        int64
GIS_ACRES      float64
COMMENTS        object
COMPLEX_NA      object
IRWINID         object
FIRE_NUM        object
COMPLEX_ID      object
DECADES          int64
geometry      geometry
dtype: object


In [4]:
# Explore data CRS
perimeter.crs

<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World between 85.06°S and 85.06°N.
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In [5]:
# Check whether the CRS is projected or geographic
perimeter.crs.is_projected

True

From this data exploration, I learned that the dataset is much larger than I need, but that it contains useful information in addition to the geometries, such as acres burned and cause. I learned that the fire names are in all capitals and that the year values are stored as int64, so I can treat them as numeric values. Finally, I learned that the CRS is EPSG:3857 (WGS 84 / Pseudo-Mercator), which is a projected CRS with units of metres rather than a geographic one.
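Because the CRS is Pseudo-Mercator, the numbers in the geometry column are metres, not degrees. As a rough illustration of what that means, the sketch below applies the standard spherical Web Mercator inverse formulas to a coordinate pair. The function name `web_mercator_to_lonlat` is my own and is not part of geopandas; in the notebook itself, `perimeter.to_crs("EPSG:4326")` would be the usual way to do this conversion.

```python
import math

# Radius of the sphere used by Web Mercator (the WGS 84 semi-major axis), in metres
EARTH_RADIUS_M = 6378137.0


def web_mercator_to_lonlat(x, y):
    """Convert EPSG:3857 metres to (longitude, latitude) in degrees.

    Hypothetical helper for illustration only; it uses the spherical
    Web Mercator inverse formulas.
    """
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


# First vertex of the Thomas Fire geometry shown in the filtered output
lon, lat = web_mercator_to_lonlat(-13316089.016, 4088553.040)
print(round(lon, 2), round(lat, 2))
```

The result lands at roughly 119.6° W, 34.4° N, which is consistent with a fire in Ventura and Santa Barbara counties.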


b. From this fire perimeter data, select the Thomas Fire boundary. The fire occurred in 2017.

In [6]:
# First, make the column names easier to work with
perimeter.columns = perimeter.columns.str.lower()

# Filter data to only include the Thomas Fire boundary in 2017
thomas = perimeter[(perimeter['fire_name'] == "THOMAS") & (perimeter['year_'] == 2017)]

thomas

| | year_ | state | agency | unit_id | fire_name | inc_num | alarm_date | cont_date | cause | c_method | objective | gis_acres | comments | complex_na | irwinid | fire_num | complex_id | decades | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2654 | 2017 | CA | USF | VNC | THOMAS | 3583 | 2017-12-04 | 2018-01-12 | 9 | 7 | 1 | 281791.0 | CONT_DATE based on Inciweb | | | | | 2010 | MULTIPOLYGON (((-13316089.016 4088553.040, -13... |
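A name-and-year filter returns an empty table without complaint if the name is spelled or capitalized differently than expected. The sketch below shows one way to guard against that, using a hypothetical helper `select_fire` and a small made-up table in place of the real perimeter data. It is an illustration of the idea, not the method used in this notebook.

```python
import pandas as pd


def select_fire(df, name, year):
    """Return rows whose fire_name matches `name` (case-insensitive) in `year`.

    Hypothetical helper: raises instead of silently returning an empty table.
    """
    # Normalize names so stray whitespace or lowercase input still matches
    names = df["fire_name"].fillna("").str.strip().str.upper()
    match = df[(names == name.strip().upper()) & (df["year_"] == year)]
    if match.empty:
        raise ValueError(f"No fire named {name!r} found in {year}")
    return match


# Made-up rows for illustration; only the column names follow the dataset
demo = pd.DataFrame({
    "fire_name": ["THOMAS", "Thomas ", "JACKSON", None],
    "year_": [2017, 1999, 2023, 2017],
    "gis_acres": [281791.0, 12.5, 27.8145, 3.0],
})

result = select_fire(demo, "thomas", 2017)
print(len(result))
```

With the real data, `select_fire(perimeter, "Thomas", 2017)` should give the same single row as the filter above, assuming the column names have already been lowercased.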


c. Save only the 2017 Thomas Fire boundary as a geospatial file in the format of my choosing. The file should go into the data/ directory in my repository.

In [8]:
# Save the fire boundary as a file that can go into my repository
# Save the filtered GeoDataFrame as a GeoJSON file
path = 'data/thomas.geojson'
thomas.to_file(path, driver='GeoJSON')

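d. I chose the GeoJSON format for my perimeter boundary because it is a common and useful "open format for encoding vector points and their attributes". It comes as a single file, whereas a shapefile is split across several companion files (such as .shp, .shx, .dbf, and .prj) that must be kept together. One caveat: the current GeoJSON specification (RFC 7946) expects coordinates in WGS 84 longitude and latitude (EPSG:4326), while this dataset is in EPSG:3857, which shares the WGS 84 datum but is a projected CRS. For strict compliance, the boundary should be reprojected to EPSG:4326 before export.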

### Citations:

CAL Fire. “California Fire Perimeters (All).” Data.gov. Metadata created March 30, 2024, updated May 14, 2024. https://catalog.data.gov/dataset/california-fire-perimeters-all-b3436.

C. Galaz García, EDS 220 - Working with Environmental Datasets, Course Notes. 2024. [Online]. Available: https://meds-eds-220.github.io/MEDS-eds-220-course/book/preface.html