## Fire Perimeter Data

The data come from `data.gov`, an official website of the United States government. The dataset contains perimeters for all recorded California fires. It is not complete: some historical records are missing, damaged, or lost. It is updated annually in the spring with fire data collected from the previous fire season.

## Step 1: Load libraries 

In [1]:
import os 
import pandas as pd
import matplotlib.pyplot as plt
import geopandas as gpd

## Step 2: Load data

In [2]:
perimeters = gpd.read_file('data/California_Fire_Perimeters/California_Fire_Perimeters_(all).shp')

ERROR 1: PROJ: proj_create_from_database: Open of /opt/anaconda3/envs/eds220-env/share/proj failed
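The message above suggests that PROJ could not open its database (`proj.db`) in the listed directory. The cell still ran, since `perimeters.crs` prints normally later on. As a rough diagnostic, the sketch below looks for `proj.db` in a few likely places. The helper `find_proj_db` is hypothetical, and the assumption that `PROJ_DATA` or `PROJ_LIB` is where your installation looks may not hold for every setup.

```python
import os

def find_proj_db(candidate_dirs):
    """Return the first directory that contains proj.db, or None."""
    for directory in candidate_dirs:
        # Skip unset environment variables (None) and empty strings
        if directory and os.path.isfile(os.path.join(directory, "proj.db")):
            return directory
    return None

candidates = [
    os.environ.get("PROJ_DATA"),  # read by newer PROJ releases
    os.environ.get("PROJ_LIB"),   # read by older PROJ releases
    "/opt/anaconda3/envs/eds220-env/share/proj",  # path from the message above
]

print(find_proj_db(candidates))
```

If this prints `None`, none of the candidate directories holds the database, and reinstalling or repairing the environment's PROJ package is a reasonable next step.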


## Step 3: Preliminary Exploration & Summary

After loading your libraries and data, the first step is to explore the data. Some questions to answer:
- What kind of data are you working with?
- How many columns are in your data?
- What is the type of each column?

Which view of the data is most useful depends on the question you want to answer. Here, we are interested in the 2017 Thomas Fire in Ventura County, so we expect to see at least a year, state, fire name, and geometry column.


In [19]:
# Check the head to view initial column names and cell outputs
perimeters.head()

Unnamed: 0,YEAR_,STATE,AGENCY,UNIT_ID,FIRE_NAME,INC_NUM,ALARM_DATE,CONT_DATE,CAUSE,C_METHOD,OBJECTIVE,GIS_ACRES,COMMENTS,COMPLEX_NA,IRWINID,FIRE_NUM,COMPLEX_ID,DECADES,geometry
0,2023,CA,CDF,SKU,WHITWORTH,4808,2023-06-17,2023-06-17,5,1,1,5.72913,,,{7985848C-0AC2-4BA4-8F0E-29F778652E61},,,2020,"POLYGON ((-13682443.000 5091132.739, -13682445..."
1,2023,CA,LRA,BTU,KAISER,10225,2023-06-02,2023-06-02,5,1,1,13.6024,,,{43EBCC88-B3AC-48EB-8EF5-417FE0939CCF},,,2020,"POLYGON ((-13576727.142 4841226.161, -13576726..."
2,2023,CA,CDF,AEU,JACKSON,17640,2023-07-01,2023-07-02,2,1,1,27.8145,,,{B64E1355-BF1D-441A-95D0-BC1FBB93483B},,,2020,"POLYGON ((-13459243.000 4621236.000, -13458968..."
3,2023,CA,CDF,AEU,CARBON,18821,2023-07-11,2023-07-11,9,1,1,58.7602,,,{CB41DB0A-E4B1-489D-A4EA-738F2CD6DB3B},,,2020,"POLYGON ((-13468077.000 4642260.000, -13467975..."
4,2023,CA,CDF,AEU,LIBERTY,18876,2023-07-11,2023-07-12,14,1,1,70.979,,,{F83F70A4-07A7-40B8-BD51-10CCC1C30D63},,,2020,"POLYGON ((-13468418.000 4614853.000, -13468428..."


In [20]:
# Check column names, types, and non-null values
perimeters.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 22261 entries, 0 to 22260
Data columns (total 19 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   YEAR_       22261 non-null  int64   
 1   STATE       22261 non-null  object  
 2   AGENCY      22208 non-null  object  
 3   UNIT_ID     22194 non-null  object  
 4   FIRE_NAME   15672 non-null  object  
 5   INC_NUM     21286 non-null  object  
 6   ALARM_DATE  22261 non-null  object  
 7   CONT_DATE   22261 non-null  object  
 8   CAUSE       22261 non-null  int64   
 9   C_METHOD    22261 non-null  int64   
 10  OBJECTIVE   22261 non-null  int64   
 11  GIS_ACRES   22261 non-null  float64 
 12  COMMENTS    2707 non-null   object  
 13  COMPLEX_NA  596 non-null    object  
 14  IRWINID     2695 non-null   object  
 15  FIRE_NUM    17147 non-null  object  
 16  COMPLEX_ID  360 non-null    object  
 17  DECADES     22261 non-null  int64   
 18  geometry    22261 non-null  geometry
dtypes: float64(1), geometry(1), int64(5), object(12)

In [21]:
# We are concerned with the Thomas Fire, but let's check how many distinct fire names are in the dataset
perimeters['FIRE_NAME'].nunique()

9108
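The three counts seen so far measure different things: 22261 records in total, 15672 with a non-null `FIRE_NAME`, and 9108 distinct names. A minimal sketch on a made-up table (the rows below are illustrative, not taken from the dataset) shows how `nunique()` skips missing names and counts a reused name only once. This is one reason the filter in Step 4 uses the year as well as the name.

```python
import pandas as pd

# Toy stand-in for the perimeters table; all values are made up
fires = pd.DataFrame({
    "YEAR_": [2017, 1995, 2017, 2023, 2023],
    "FIRE_NAME": ["THOMAS", "THOMAS", "CREEK", None, "JACKSON"],
})

n_rows = len(fires)                         # every perimeter record
n_named = fires["FIRE_NAME"].notna().sum()  # records that have a name
n_unique = fires["FIRE_NAME"].nunique()     # distinct names, missing values excluded

# Names attached to more than one record
name_counts = fires["FIRE_NAME"].value_counts()
reused = name_counts[name_counts > 1]

print(n_rows, n_named, n_unique)
print(reused)
```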

In [3]:
# When working with spatial data, it is important to know what CRS you are using
perimeters.crs

<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World between 85.06°S and 85.06°N.
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
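Because EPSG:3857 stores coordinates in metres, the numbers in the `geometry` column are hard to read as locations. The sketch below applies the standard inverse of the spherical Web Mercator projection to the first vertex shown in the `head()` output, as a quick sanity check that the point falls inside California. The function name `web_mercator_to_lonlat` is made up for this example; in practice `perimeters.to_crs("EPSG:4326")` would reproject the whole GeoDataFrame.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, the sphere radius used by EPSG:3857

def web_mercator_to_lonlat(x, y):
    """Convert EPSG:3857 metres to longitude/latitude in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(math.atan(math.sinh(y / EARTH_RADIUS_M)))
    return lon, lat

# First vertex of the first polygon in the head() output
lon, lat = web_mercator_to_lonlat(-13682443.000, 5091132.739)
print(round(lon, 2), round(lat, 2))
```

The result is roughly 122.9°W, 41.5°N, in far northern California. Note also that Web Mercator is not an equal-area projection, so areas computed directly from these geometries would be distorted.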

### Summary of data

Our preliminary exploration confirms that we have the columns we need to continue with the workflow: year (`YEAR_`), state (`STATE`), fire name (`FIRE_NAME`), and geometry (`geometry`). With these columns we can narrow the data down to the 2017 Thomas Fire.
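The same confirmation can be made explicit in code, so the notebook fails early if a future download renames a column. This is only a sketch: the helper `missing_columns` is hypothetical, and the column list is copied from the `head()` output. In the notebook you would pass `perimeters.columns` instead.

```python
REQUIRED = ["YEAR_", "STATE", "FIRE_NAME", "geometry"]

def missing_columns(columns, required=REQUIRED):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]

# Column names as shown in the head() output
columns = [
    "YEAR_", "STATE", "AGENCY", "UNIT_ID", "FIRE_NAME", "INC_NUM",
    "ALARM_DATE", "CONT_DATE", "CAUSE", "C_METHOD", "OBJECTIVE",
    "GIS_ACRES", "COMMENTS", "COMPLEX_NA", "IRWINID", "FIRE_NUM",
    "COMPLEX_ID", "DECADES", "geometry",
]

missing = missing_columns(columns)
if missing:
    raise KeyError(f"Missing expected columns: {missing}")
print("All required columns are present")
```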

## Step 4: Filter to Thomas Fire

Use the following code to filter the dataframe to the 2017 Thomas Fire.

In [4]:
# Select the Thomas Fire Boundary in 2017
thomas_perimeter = perimeters[(perimeters['FIRE_NAME'] == 'THOMAS') 
                              & (perimeters['YEAR_'] == 2017)]

# Check the head to confirm the filter returned the expected row
thomas_perimeter.head()

Unnamed: 0,YEAR_,STATE,AGENCY,UNIT_ID,FIRE_NAME,INC_NUM,ALARM_DATE,CONT_DATE,CAUSE,C_METHOD,OBJECTIVE,GIS_ACRES,COMMENTS,COMPLEX_NA,IRWINID,FIRE_NUM,COMPLEX_ID,DECADES,geometry
2654,2017,CA,USF,VNC,THOMAS,3583,2017-12-04,2018-01-12,9,7,1,281791.0,CONT_DATE based on Inciweb,,,,,2010,"MULTIPOLYGON (((-13316089.016 4088553.040, -13..."
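The filter returns exactly one row here, which is what we want. If the name or year were mistyped, the same code would quietly return an empty dataframe. One way to guard against that is sketched below, with a plain pandas `DataFrame` and made-up rows standing in for `perimeters`; the boolean indexing works the same way on a GeoDataFrame. The helper `select_fire` is hypothetical.

```python
import pandas as pd

def select_fire(df, name, year):
    """Return rows matching a fire name and year; raise if nothing matches."""
    mask = (df["FIRE_NAME"] == name) & (df["YEAR_"] == year)
    subset = df[mask].copy()  # copy so later edits do not touch the original
    if subset.empty:
        raise ValueError(f"No fire named {name!r} in {year}")
    return subset

# Made-up rows for illustration only
fires = pd.DataFrame({
    "YEAR_": [2017, 1995, 2017],
    "FIRE_NAME": ["THOMAS", "THOMAS", "CREEK"],
})

thomas = select_fire(fires, "THOMAS", 2017)
print(len(thomas))
```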


## Step 5: Export the newly created data frame for use

In [26]:
# Save only the 2017 Thomas Fire boundary as a geospatial file

thomas_perimeter.to_file('thomas_perimeter.shp')

I exported it as a shapefile (`.shp`) because that is the format I originally read the data in. Consistency matters when building a workflow for reproducibility: if we share our code with others, we want it to be as easy as possible to follow, which is why following best coding practices is so important. We have also been working with `shp` files much more than `geojson` files, so it makes sense to keep the format the same.
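One practical caveat of the shapefile format is that attribute names are stored in a dBASE table, which limits them to 10 characters. Longer names are typically truncated when the file is written. The sketch below checks the column names from this dataset against that limit before export. The helper `names_too_long` and the extra column `CONTAINMENT_DATE` are hypothetical.

```python
SHAPEFILE_FIELD_LIMIT = 10  # dBASE attribute names hold at most 10 characters

def names_too_long(columns, limit=SHAPEFILE_FIELD_LIMIT):
    """Return attribute names that exceed the shapefile field-name limit."""
    # The geometry column is stored in the .shp file itself, not as an attribute
    return [name for name in columns if name != "geometry" and len(name) > limit]

# Column names as shown in the head() output
columns = [
    "YEAR_", "STATE", "AGENCY", "UNIT_ID", "FIRE_NAME", "INC_NUM",
    "ALARM_DATE", "CONT_DATE", "CAUSE", "C_METHOD", "OBJECTIVE",
    "GIS_ACRES", "COMMENTS", "COMPLEX_NA", "IRWINID", "FIRE_NUM",
    "COMPLEX_ID", "DECADES", "geometry",
]

print(names_too_long(columns))                         # []
print(names_too_long(columns + ["CONTAINMENT_DATE"]))  # ['CONTAINMENT_DATE']
```

All of the existing names fit, so nothing is lost by exporting to `shp` here. Keep in mind that a shapefile is really a set of sidecar files (`.shp`, `.shx`, `.dbf`, and usually `.prj`), so all of them need to be shared together.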