# Thomas Fire 2017 Fire Perimeter

### *Author*: Joshua Paul Cohen

#### *GitHub Repository*: https://github.com/silkieMoth/eds220-hwk4

## About

### Purpose

The goal is to extract the perimeter of the 2017 Thomas Fire, which burned in Ventura and Santa Barbara counties, from a statewide fire perimeter dataset.


### Analysis Highlights

- Preliminary exploration of the fire perimeter shapefile, followed by a summary.
    - Data contains `year` and `fire_name` columns.
    - CRS is WGS 1984 Web Mercator (Auxiliary Sphere).
    - CRS is projected, with units in meters.
- Saving subset of 2017 Thomas Fire to new shapefile.
    - Filtered for fire names containing 'thomas' and for the year 2017.
    - Only one row returned based on these selections.


### Dataset Description

- `California_Fire_Perimeters_(1950%2B).shp`
    - Data on fires dating back to 1878.
    - Contains columns for:
        - Year
        - State
        - Name of the fire
        - Agency responding to the fire
        - Fire discovery date
        - Containment date
        - Cause of fire
        - Perimeter data collection method
        - Fire response tactic
        - Perimeter area
        - Number according to historical numbering system for fires
    - Over 15000 entries.
    - Each row corresponds to a single polygon representing the perimeter of a given fire.


### References to Datasets

The *California Historical Fire Perimeters* dataset was acquired from [California Natural Resources Agency](https://gis.data.cnra.ca.gov/maps/CALFIRE-Forestry::california-historical-fire-perimeters/about).

    California Department of Forestry and Fire Protection (2024), California Historical Fire Perimeters [dataset]. California Natural Resources Agency.


## Data Loading

In [1]:
import geopandas as gpd
import pandas as pd
import os
import matplotlib.pyplot as plt

os.environ['PROJ_LIB'] = '/opt/anaconda3/share/proj'

fp = os.path.join('/', 
             'Users', 
             'jpcohen', 
             'EDS-220', 
             'eds220-hwk4', 
             'data', 
             'California_Fire_Perimeters_(1950+)')

fire_perim = gpd.read_file(fp)

## Cleaning

Removing NAs, simplifying column names, and making all string values lowercase makes subsetting easier and more reliable.

In [2]:
# Make columns lowercase
fire_perim.columns = fire_perim.columns.str.lower()

# Make string values lowercase (non-string values are left unchanged)
fire_perim = fire_perim.map(lambda x: x.lower() if isinstance(x, str) else x)

# Remove trailing underscore from year col
fire_perim = fire_perim.rename(columns = {'year_': 'year'})

# Make year col a nullable integer
fire_perim['year'] = fire_perim['year'].astype('Int64')

# Drop rows with NA in fire name
fire_perim = fire_perim.dropna(subset = 'fire_name')
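
The effect of these cleaning steps can be illustrated on a small table. This is a minimal sketch, not the notebook's actual data: the `raw` DataFrame and its values are hypothetical, and it lowercases only the `fire_name` column rather than every string value.

```python
import pandas as pd

# Hypothetical toy table that mimics the raw column names and mixed casing
raw = pd.DataFrame({
    'YEAR_': [2017, 2017, None],
    'FIRE_NAME': ['THOMAS', 'Creek', None],
})

# Before cleaning, a lowercase comparison finds nothing
matches_before = int((raw['FIRE_NAME'] == 'thomas').sum())

clean = raw.copy()
clean.columns = clean.columns.str.lower()            # lowercase column names
clean = clean.rename(columns={'year_': 'year'})      # drop trailing underscore
clean['fire_name'] = clean['fire_name'].str.lower()  # lowercase string values
clean['year'] = clean['year'].astype('Int64')        # nullable integer year
clean = clean.dropna(subset=['fire_name'])           # drop rows without a name

# After cleaning, the same comparison finds the record
match = clean[(clean['fire_name'] == 'thomas') & (clean['year'] == 2017)]
print(matches_before, len(match))
```

The nullable `Int64` dtype is what lets the `year` column hold integers while still allowing missing values.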

## Preliminary Data Exploration

#### Obtaining this information:
- Column names
- Data types
- Minimum year in dataset
- Total number of fires
- Number of rows that fulfil subset condition
- CRS
    - Value
    - Projected or geographic

In [3]:
fire_perim.head(3)

|   | objectid | year | state | agency | unit_id | fire_name | inc_num | alarm_date | cont_date | cause | ... | gis_acres | comments | complex_na | irwinid | fire_num | complex_id | decades | shape__are | shape__len | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2023 | ca | cdf | sku | whitworth | 4808 | 2023-06-17 | 2023-06-17 | 5.0 | ... | 5.729125 |  |  | {7985848c-0ac2-4ba4-8f0e-29f778652e61} |  |  | 2020.0 | 41407.839844 | 1247.166034 | POLYGON ((-13682443.000 5091132.739, -13682445... |
| 1 | 2 | 2023 | ca | lra | btu | kaiser | 10225 | 2023-06-02 | 2023-06-02 | 5.0 | ... | 13.60238 |  |  | {43ebcc88-b3ac-48eb-8ef5-417fe0939ccf} |  |  | 2020.0 | 93455.878906 | 1285.51455 | POLYGON ((-13576727.142 4841226.161, -13576726... |
| 2 | 3 | 2023 | ca | cdf | aeu | jackson | 17640 | 2023-07-01 | 2023-07-02 | 2.0 | ... | 27.81446 |  |  | {b64e1355-bf1d-441a-95d0-bc1fbb93483b} |  |  | 2020.0 | 183028.5 | 2697.587429 | POLYGON ((-13459243.000 4621236.000, -13458968... |


In [4]:
# Column numbers and names, null counts and dtypes
print('Dataset info: ')
fire_perim.info()  # info() prints its own report and returns None
print()

# Min value in year column
print('Earliest year in dataset: ', int(fire_perim['year'].min()), '\n')

# Number of unique values in fire name column
print('Number of fires: ', fire_perim['fire_name'].nunique(), '\n')

# Number of rows for Thomas fire in 2017
print('Number of rows for 2017 Thomas Fire: ', fire_perim[(fire_perim['fire_name'].str.contains('thomas')) & 
           (fire_perim['year'] == 2017)].shape[0], '\n')

# CRS info
print('CRS: ') 
print('Is projected?: ', fire_perim.crs.is_projected)
print('Is geographic?: ', fire_perim.crs.is_geographic, '\n')
fire_perim.crs

Dataset info: 
<class 'geopandas.geodataframe.GeoDataFrame'>
Index: 15672 entries, 0 to 22260
Data columns (total 22 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   objectid    15672 non-null  int64   
 1   year        15602 non-null  Int64   
 2   state       15672 non-null  object  
 3   agency      15667 non-null  object  
 4   unit_id     15653 non-null  object  
 5   fire_name   15672 non-null  object  
 6   inc_num     15005 non-null  object  
 7   alarm_date  15150 non-null  object  
 8   cont_date   9123 non-null   object  
 9   cause       15658 non-null  float64 
 10  c_method    9329 non-null   float64 
 11  objective   15498 non-null  float64 
 12  gis_acres   15672 non-null  float64 
 13  comments    2191 non-null   object  
 14  complex_na  596 non-null    object  
 15  irwinid     2695 non-null   object  
 16  fire_num    10696 non-null  object  
 17  complex_id  360 non-null    object  
 18  decades     15602 non-null  

<Projected CRS: PROJCS["WGS_1984_Web_Mercator_Auxiliary_Sphere",GE ...>
Name: WGS_1984_Web_Mercator_Auxiliary_Sphere
Axis Info [cartesian]:
- [east]: Easting (Meter)
- [north]: Northing (Meter)
Area of Use:
- undefined
Coordinate Operation:
- name: unnamed
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

### Fire Perimeter Data Summary

This California Fire Perimeters shapefile contains 22261 rows and 22 columns (including `geometry`); 15672 rows remain after dropping records without a fire name. The columns relevant to our analysis are `year` and `fire_name`. There are 9108 unique fire names, and the oldest fire dates to 1878. Its CRS is WGS 1984 Web Mercator (Auxiliary Sphere), a projected CRS with units in meters. There is only one row for the Thomas Fire in 2017.
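
As a rough sanity check on the CRS, the first vertex shown in the `geometry` preview above can be converted back to longitude and latitude with the standard spherical Web Mercator inverse formulas. This is a minimal sketch using only the standard library; the helper name `web_mercator_to_lonlat` is hypothetical, and in practice a library such as `pyproj` would normally handle the conversion.

```python
import math

# Radius in meters of the sphere used by Web Mercator (WGS 84 semi-major axis)
R = 6378137.0

def web_mercator_to_lonlat(x, y):
    """Invert spherical Web Mercator: meters -> (longitude, latitude) in degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat

# First vertex of the first polygon in the head() preview
lon, lat = web_mercator_to_lonlat(-13682443.000, 5091132.739)
print(round(lon, 2), round(lat, 2))
```

The result falls within California's longitude and latitude range, which is consistent with the coordinates being Web Mercator meters rather than degrees.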

## Subset and Save

Subset for the 2017 Thomas Fire record and save to `.shp` format.

In [5]:
# Subset for 2017 Thomas Fire
thomas_perim = fire_perim[(fire_perim['fire_name'].str.contains('thomas')) & 
           (fire_perim['year'] == 2017)]

# Make folder to save shapefile (no error if it already exists)
os.makedirs('data/thomas_perim', exist_ok=True)

# Save subset as .shp
thomas_perim.to_file('data/thomas_perim/thomas_perim.shp')
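
Because `str.contains('thomas')` is a substring match, it can also pick up fires whose names merely include "thomas". In this dataset the filter returned a single row, so the result is unambiguous, but the difference is worth keeping in mind. The sketch below uses a hypothetical `fires` table with made-up records to compare a substring match with an exact match.

```python
import pandas as pd

# Hypothetical cleaned records; only the first row is the fire of interest
fires = pd.DataFrame({
    'fire_name': ['thomas', 'thomas creek', 'thomas', 'jackson'],
    'year': [2017, 2017, 2009, 2017],
})

in_2017 = fires['year'] == 2017

# Substring match: also returns 'thomas creek'
substring = fires[fires['fire_name'].str.contains('thomas') & in_2017]

# Exact match: returns only the fire named exactly 'thomas'
exact = fires[(fires['fire_name'] == 'thomas') & in_2017]

print(len(substring), len(exact))
```

Checking that the subset has exactly one row before saving, as the exploration step does, guards against writing an unintended fire to the output file.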

#### Why save to `.shp`?

We only need a representation of the 2017 Thomas Fire. In other words, all we need is a polygon for the boundary along with its attributes. The shapefile format is a fast and simple way to store this. It is also sufficient for any analyses that cross-reference the perimeter with the Landsat NetCDF file.