Author: Yos Ramirez

# Fire perimeter data retrieval and selection (Thomas Fire)

## About the notebook

**Link to *my* Thomas Fire Analysis GitHub repository:**

https://github.com/YoselynR/Thomas-Fire-Analysis-YR/tree/main

### Purpose

The purpose of this notebook is to save a shapefile of the Thomas Fire perimeter. The Thomas Fire perimeter is selected from fire perimeter data available for all of California.

### Highlights
- Data Exploration: The notebook includes a basic exploration of the dataset, printing key metadata like the coordinate reference system (CRS), geometry type, and column information, which helps verify the data quality and prepare it for analysis.
- Data File Management: The notebook downloads a ZIP archive containing the shapefile, extracts it, and manages file storage by creating and cleaning up directories as necessary.
- Fire Perimeter Selection: The dataset contains fire perimeter data for all of California. The notebook filters the data to isolate the perimeter of the Thomas Fire, which occurred in 2017.
- Shapefile Export: After selecting the relevant fire perimeter data, the Thomas Fire boundary is saved as a new shapefile (.shp) for future use, such as in spatial analysis or mapping.


## About the data
This notebook uses one dataset, which contains perimeter data for fires in all of California.

### First dataset: Fire perimeter

The first dataset is [historical open-access data about fire perimeters in California](https://catalog.data.gov/dataset/california-fire-perimeters-all-b3436) from Data.gov.

U.S. Government. (n.d.). California fire perimeters (ALL) [Data set]. Data.gov. Retrieved November 23, 2024, from https://catalog.data.gov/dataset/california-fire-perimeters-all-b3436

## Fire perimeter data retrieval and selection

### Import libraries

In [None]:
import requests
import zipfile
import io
import os
import geopandas as gpd
import shutil

### Load in temporary data

In [None]:
# URL of the ZIP file
url = 'https://gis.data.cnra.ca.gov/api/download/v1/items/e3802d2abf8741a187e73a9db49d68fe/shapefile?layers=0'

# Extraction directory
extraction_dir = "CA_perimeter_dir"
os.makedirs(extraction_dir, exist_ok=True)

# Download the ZIP into memory and extract all files
response = requests.get(url, timeout=60)
response.raise_for_status()  # Stop early if the download failed
with zipfile.ZipFile(io.BytesIO(response.content)) as zip_ref:
    zip_ref.extractall(extraction_dir)

# List and print extracted files
print(f"Extracted files: {os.listdir(extraction_dir)}")

# Build the path to the shapefile (it is read into a GeoDataFrame in the data exploration step)
shapefile_base = "California_Fire_Perimeters_(all)"  # Base name of the shapefile
shapefile_path = os.path.join(extraction_dir, f"{shapefile_base}.shp")
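
The cell above hardcodes the base name of the shapefile. As a minimal sketch, assuming the archive holds exactly one `.shp` file, the path could instead be discovered after extraction. The helper names `extract_zip_bytes` and `find_shapefile` are hypothetical, and the in-memory archive is a stand-in with placeholder contents so the example runs without a network call.

```python
import io
import os
import tempfile
import zipfile


def extract_zip_bytes(zip_bytes, dest_dir):
    """Extract an in-memory ZIP archive into dest_dir and return the member names."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zip_ref:
        zip_ref.extractall(dest_dir)
        return zip_ref.namelist()


def find_shapefile(directory):
    """Return the path of the single .shp file in directory (hypothetical helper)."""
    matches = sorted(f for f in os.listdir(directory) if f.lower().endswith(".shp"))
    if len(matches) != 1:
        raise FileNotFoundError(f"Expected exactly one .shp file, found {len(matches)}")
    return os.path.join(directory, matches[0])


# Build a small stand-in archive in memory (placeholder bytes, not real GIS data)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for suffix in (".shp", ".shx", ".dbf", ".prj"):
        archive.writestr(f"demo_perimeters{suffix}", b"placeholder")

with tempfile.TemporaryDirectory() as tmp_dir:
    names = extract_zip_bytes(buffer.getvalue(), tmp_dir)
    demo_path = find_shapefile(tmp_dir)
    demo_name = os.path.basename(demo_path)

print(f"Found shapefile: {demo_name}")
```

In the notebook, `extract_zip_bytes` would receive the downloaded response content and `find_shapefile(extraction_dir)` would replace the hardcoded path, provided the one-shapefile assumption holds for the archive.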

### Dataset description
A brief summary of what I learned from the preliminary data exploration, including the CRS of the data and whether it is projected or geographic.

The data has a CRS of EPSG:3857 (Web Mercator), which has units in metres. The CRS type is projected. The data is a GeoDataFrame with 22,261 entries and its geometry type is polygon. Each entry corresponds to a fire, and the polygon geometry makes sense for mapping fire perimeters. The columns are YEAR_, STATE, AGENCY, UNIT_ID, FIRE_NAME, INC_NUM, ALARM_DATE, CONT_DATE, CAUSE, C_METHOD, OBJECTIVE, GIS_ACRES, COMMENTS, COMPLEX_NA, IRWINID, FIRE_NUM, COMPLEX_ID, DECADES, and geometry. The data exploration also outputs the data type and the non-null count of each column.
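
To illustrate why coordinates in a Web Mercator CRS are in metres rather than degrees, here is a minimal sketch of the spherical Web Mercator forward projection. The function name `lonlat_to_web_mercator` and the sample point are assumptions for illustration only; the notebook itself does not perform this conversion, and in practice a GeoDataFrame's `to_crs` method handles reprojection.

```python
import math

# Sphere radius used by Web Mercator (equal to the WGS 84 semi-major axis), in metres
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator x/y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Illustrative point in southern California (assumed, not taken from the dataset)
x, y = lonlat_to_web_mercator(-119.2, 34.4)
print(f"x = {x:.0f} m, y = {y:.0f} m")
```

The outputs are large metre values instead of small degree values, which is the practical difference between a projected and a geographic CRS.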

### Data exploration

In [None]:
# Read the shapefile into a GeoDataFrame and print info
gdf = gpd.read_file(shapefile_path)
# Save crs and geometry type
crs = gdf.crs
geometry_type = gdf.geometry.type.iloc[0]
# Print crs and geometry type
print(f"CRS: {crs}")
print(f"Geometry Type: {geometry_type}")
# Print info and head (info() prints directly and returns None, so it is not wrapped in print)
gdf.info()
print(gdf.head())
# Report whether the CRS is geographic or projected
if crs.is_geographic:
    crs_type = "Geographic (using degrees)"
else:
    crs_type = "Projected"
print(f"\nCRS Type: {crs_type}")

In [None]:
# Look at the unique fire names; .tolist() was removed to avoid a very long output
unique_fire_names = gdf['FIRE_NAME'].unique()
print(unique_fire_names)

### Fire perimeter selection
Selecting the boundary of the Thomas Fire, which occurred in 2017.

In [None]:
# Check whether the Thomas fire is in the FIRE_NAME column, since the list is too long to scan by eye
if "THOMAS" in gdf['FIRE_NAME'].unique():
    print("Thomas fire is present in the dataset.")
else:
    print("Thomas fire is not present in the dataset.")

In [None]:
# Select Thomas fire boundary in 2017
thomas_fire_gdf = gdf[(gdf['FIRE_NAME'] == "THOMAS") & (gdf['YEAR_'] == 2017)]
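
The selection above relies on an exact match against `FIRE_NAME`. As a hedged sketch, assuming names could vary in case or surrounding whitespace (this has not been verified for the dataset), the name column can be normalized before comparing. The records below are made up, and a plain pandas DataFrame stands in for the GeoDataFrame because boolean indexing works the same way on both.

```python
import pandas as pd

# Toy records standing in for the fire perimeter table (values are made up)
fires = pd.DataFrame(
    {
        "FIRE_NAME": ["THOMAS", "Thomas ", "THOMAS", "CREEK"],
        "YEAR_": [2017, 2017, 1995, 2017],
    }
)

# Normalize names so the comparison ignores case and surrounding whitespace
normalized_name = fires["FIRE_NAME"].str.strip().str.upper()
mask = (normalized_name == "THOMAS") & (fires["YEAR_"] == 2017)
selection = fires[mask]

print(f"Matched {len(selection)} of {len(fires)} records")
```

Checking the number of matched rows is a quick way to notice when a filter returns more or fewer perimeters than expected.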

### File management
Creating a `data/` directory inside my `eds220-hwk4` directory. Saving only the 2017 Thomas Fire boundary as a geospatial file in the shapefile format. The file is in the `data/` directory in my repository.


In [None]:
# Output path in the data folder; create the folder if it doesn't exist
shapefile_output = os.path.join("data", "Thomas_fire_perimeter.shp")
os.makedirs("data", exist_ok=True)

# Save the shapefile only if the selection is not empty
if not thomas_fire_gdf.empty:
    thomas_fire_gdf.to_file(shapefile_output)
    print(f"Shapefile for Thomas fire saved at: {shapefile_output}")
else:
    print("No Thomas fire records were selected, so nothing was saved.")

In [None]:
# Delete the extraction directory that contains the data for all CA fires
shutil.rmtree(extraction_dir)
print(f"Extraction directory '{extraction_dir}' has been deleted.")
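
As an alternative to deleting the extraction directory by hand, a temporary directory can clean itself up. This is a minimal sketch, assuming the full California dataset is only needed while the selection runs; the prefix and the placeholder file are hypothetical stand-ins for the extracted shapefile.

```python
import os
import tempfile

with tempfile.TemporaryDirectory(prefix="CA_perimeter_") as tmp_dir:
    # Stand-in for the extracted files (placeholder bytes, not real GIS data)
    placeholder = os.path.join(tmp_dir, "placeholder.shp")
    with open(placeholder, "wb") as f:
        f.write(b"placeholder")
    existed_inside = os.path.exists(placeholder)

# The directory and its contents are removed when the with-block exits
removed_after = not os.path.exists(tmp_dir)
print(f"Existed inside block: {existed_inside}, removed afterwards: {removed_after}")
```

With this pattern the read and selection steps would need to sit inside the `with` block, which is a trade-off against the cell-by-cell layout used in this notebook.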

### Format explanation
Briefly explaining my reasoning for selecting shapefile format.

The .shp file stores the boundary geometry of the Thomas Fire, which makes it possible to visualize the area of the fire. These geometries are overlaid on false color imagery in the next notebook.
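
A shapefile is a set of companion files that share one base name: `.shp` holds the geometry, `.shx` the index, and `.dbf` the attributes, with `.prj` (CRS) and `.cpg` (encoding) as optional extras. The sketch below checks which components sit next to a given `.shp` path. The helper `shapefile_components` is hypothetical, and the files it inspects here are empty placeholders rather than the real output.

```python
import os
import tempfile

REQUIRED = (".shp", ".shx", ".dbf")  # Geometry, index, attribute table
OPTIONAL = (".prj", ".cpg")          # CRS definition, text encoding


def shapefile_components(shp_path):
    """Map each shapefile extension to whether that companion file exists."""
    base, _ = os.path.splitext(shp_path)
    return {ext: os.path.exists(base + ext) for ext in REQUIRED + OPTIONAL}


with tempfile.TemporaryDirectory() as tmp_dir:
    # Create placeholder companions for a demo (not real GIS data)
    for ext in (".shp", ".shx", ".dbf", ".prj"):
        with open(os.path.join(tmp_dir, "Thomas_fire_perimeter" + ext), "wb") as f:
            f.write(b"placeholder")
    status = shapefile_components(os.path.join(tmp_dir, "Thomas_fire_perimeter.shp"))
    missing_required = [ext for ext in REQUIRED if not status[ext]]

print(status)
print(f"Missing required components: {missing_required}")
```

A check like this can be useful before committing the `data/` folder, since the perimeter cannot be reloaded if only the `.shp` file is kept.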

#### References 

Requests Software Foundation. (n.d.). Requests: HTTP for humans. Requests. Retrieved from https://requests.readthedocs.io

Python Software Foundation. (n.d.). zipfile — Zip archive reading and writing. In Python Standard Library. Retrieved from https://docs.python.org/3/library/zipfile.html

Python Software Foundation. (n.d.). io — Core tools for working with streams. In Python Standard Library. Retrieved from https://docs.python.org/3/library/io.html

Python Software Foundation. (n.d.). os — Miscellaneous operating system interfaces. In Python Standard Library. Retrieved from https://docs.python.org/3/library/os.html

GeoPandas Development Team. (2024). GeoPandas: Python tools for geographic data. GeoPandas. Retrieved from https://geopandas.org

Python Software Foundation. (n.d.). shutil — High-level file operations. In Python Standard Library. Retrieved from https://docs.python.org/3/library/shutil.html