In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("hwk4-task2-fire-perimeter.ipynb")

# Task 2: Visualizing fire perimeter for the Thomas fire

## Instructions

- First, update the following cell to have a link to *your* Homework 4 GitHub repository:

**UPDATE THIS LINK**
https://github.com/YoselynR/eds220-hwk4/tree/main


- Review the [complete rubric for this task](https://docs.google.com/document/d/1-Zm731BLVCs1MXHT5R1H9rr6uvcwHnMZQ2q8xkQ_fWM/edit?tab=t.0) before starting.

- **Meaningful commits should be made every time you finish a major step.** We'll check your repository and view the commit history.

- Comment mindfully in a way that enriches your code. Comments should follow best practices.

- **Do not update the top cell with the `otter` import**; it is used internally for grading.

- Delete all the comments initially included in this notebook (e.g., `# Your code here`).


## About the data
In this task you will use one dataset:

### First dataset: Fire perimeter

The first dataset is [historical open-access data about fire perimeters in California](https://catalog.data.gov/dataset/california-fire-perimeters-all-b3436) from Data.gov.

## 2. Fire perimeter data retrieval and selection

### Import libraries

In [11]:
import requests
import zipfile
import io
import os
import geopandas as gpd
import shutil

### Load in temporary data

In [12]:
# URL of the ZIP file
url = 'https://gis.data.cnra.ca.gov/api/download/v1/items/e3802d2abf8741a187e73a9db49d68fe/shapefile?layers=0'

# Extraction directory
extraction_dir = "CA_perimeter_dir"
os.makedirs(extraction_dir, exist_ok=True)

# Download the ZIP, extract all files, and print the extracted files
with zipfile.ZipFile(io.BytesIO(requests.get(url).content)) as zip_ref:
    zip_ref.extractall(extraction_dir)

# List and print extracted files
print(f"Extracted files: {os.listdir(extraction_dir)}")

# Build the path to the shapefile (it is read into a GeoDataFrame during data exploration)
shapefile_base = "California_Fire_Perimeters_(all)"  # Base name of the shapefile
shapefile_path = os.path.join(extraction_dir, f"{shapefile_base}.shp")

Extracted files: ['California_Fire_Perimeters_(all).prj', 'California_Fire_Perimeters_(all).shp.xml', 'California_Fire_Perimeters_(all).cpg', 'California_Fire_Perimeters_(all).shx', 'California_Fire_Perimeters_(all).dbf', 'California_Fire_Perimeters_(all).shp']
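The download cell above assumes the request succeeds and that the response is a valid ZIP archive. A slightly more defensive variant is sketched below. The helper name `extract_zip_bytes` and the 60 second timeout are assumptions made for illustration, not part of the original workflow, and the demonstration uses a tiny archive built in memory so it runs without network access.

```python
import io
import os
import tempfile
import zipfile


def extract_zip_bytes(content: bytes, dest_dir: str) -> list[str]:
    """Extract an in-memory ZIP archive into dest_dir and return the sorted member names."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(content)) as zip_ref:
        zip_ref.extractall(dest_dir)
        return sorted(zip_ref.namelist())


# Hypothetical usage with the real download (needs network access):
#   import requests
#   response = requests.get(url, timeout=60)
#   response.raise_for_status()  # stop on an HTTP error before zipfile sees the body
#   extract_zip_bytes(response.content, extraction_dir)

# Self-contained demonstration with placeholder files (not real shapefile data)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("demo.shp", b"placeholder")
    zf.writestr("demo.dbf", b"placeholder")

with tempfile.TemporaryDirectory() as tmp_dir:
    extracted = extract_zip_bytes(buffer.getvalue(), tmp_dir)
    on_disk = sorted(os.listdir(tmp_dir))
    print(f"Extracted files: {extracted}")
```

Calling `raise_for_status()` before unzipping means an HTTP error surfaces as a clear request error rather than as a `BadZipFile` exception.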


## a. Dataset description
Explore the data and write a brief summary of the information you obtained from the preliminary information. Your summary should include the CRS of the data and whether this is projected or geographic.

The data uses the CRS EPSG:3857 (Web Mercator), which is a projected CRS with units in metres. The data is a GeoDataFrame with 22,261 entries and 19 columns, and its geometry type is Polygon. Each entry corresponds to a fire, and polygon geometries make sense for mapping fire perimeters. The columns are YEAR_, STATE, AGENCY, UNIT_ID, FIRE_NAME, INC_NUM, ALARM_DATE, CONT_DATE, CAUSE, C_METHOD, OBJECTIVE, GIS_ACRES, COMMENTS, COMPLEX_NA, IRWINID, FIRE_NUM, COMPLEX_ID, DECADES, and geometry. The data exploration also outputs the data type and non-null count of each column.
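To make "projected, with units in metres" concrete, here is a minimal sketch of the spherical Web Mercator forward formula that EPSG:3857 is based on. The function name `lonlat_to_web_mercator` and the sample point are assumptions for illustration; the point is an approximate location in Ventura County and is not taken from the dataset. In practice a library such as pyproj would do this conversion.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis used by EPSG:3857, in metres


def lonlat_to_web_mercator(lon_deg: float, lat_deg: float) -> tuple[float, float]:
    """Convert longitude/latitude in degrees to Web Mercator x/y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Approximate, illustrative point in Ventura County (not from the dataset)
x, y = lonlat_to_web_mercator(-119.2, 34.4)
print(f"x = {x:.0f} m, y = {y:.0f} m")
```

Coordinates in the millions, rather than within ±180 and ±90, are a quick visual hint that a dataset is stored in a projected CRS such as this one.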

### Data exploration

In [13]:
# Read the shapefile into a GeoDataFrame and print info
gdf = gpd.read_file(shapefile_path)
# Save crs and geometry type
crs = gdf.crs
geometry_type = gdf.geometry.type.iloc[0]
# Print crs and geometry type
print(f"CRS: {crs}")
print(f"Geometry Type: {geometry_type}")
# Print info and head (info() prints directly and returns None, so it is not wrapped in print)
gdf.info()
print(gdf.head())
# If else statement to print whether projected or geographic
if crs.is_geographic:
    crs_type = "Geographic (using degrees)"
else:
    crs_type = "Projected"
print(f"\nCRS Type: {crs_type}")

CRS: EPSG:3857
Geometry Type: Polygon
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 22261 entries, 0 to 22260
Data columns (total 19 columns):
 #   Column      Non-Null Count  Dtype         
---  ------      --------------  -----         
 0   YEAR_       22261 non-null  int32         
 1   STATE       22261 non-null  object        
 2   AGENCY      22208 non-null  object        
 3   UNIT_ID     22194 non-null  object        
 4   FIRE_NAME   15672 non-null  object        
 5   INC_NUM     21286 non-null  object        
 6   ALARM_DATE  22261 non-null  datetime64[ms]
 7   CONT_DATE   22261 non-null  datetime64[ms]
 8   CAUSE       22261 non-null  int32         
 9   C_METHOD    22261 non-null  int32         
 10  OBJECTIVE   22261 non-null  int32         
 11  GIS_ACRES   22261 non-null  float64       
 12  COMMENTS    2707 non-null   object        
 13  COMPLEX_NA  596 non-null    object        
 14  IRWINID     2695 non-null   object        
 15  FIRE_NUM    17147 non-nu

In [14]:
# Show the unique fire names (without .tolist(), so the output stays abbreviated)
unique_fire_names = gdf['FIRE_NAME'].unique()
print(unique_fire_names)

['WHITWORTH' 'KAISER' 'JACKSON' ... 'BALD HILL' 'FREITAS RANCH'
 'GOLF CLUB']


## b. Fire perimeter selection
From your fire perimeter data, select the Thomas Fire boundary. The fire occurred in 2017.

In [15]:
# Check programmatically whether "THOMAS" appears in FIRE_NAME, since the list is too long to scan by eye
if "THOMAS" in gdf['FIRE_NAME'].unique():
    print("Thomas fire is present in the dataset.")
else:
    print("Thomas fire is not present in the dataset.")

Thomas fire is present in the dataset.


In [16]:
# Select Thomas fire boundary in 2017
thomas_fire_gdf = gdf[(gdf['FIRE_NAME'] == "THOMAS") & (gdf['YEAR_'] == 2017)]

## c. File management
Create a `data/` directory inside your `eds220-hwk4` directory. Save only the 2017 Thomas Fire boundary as a geospatial file in the format of your choosing. The file should go into the `data/` directory in your repository.


In [19]:
# Build the output path in the data/ folder, creating the folder if it doesn't exist
shapefile_output = os.path.join("data", "Thomas_fire_perimeter.shp")
os.makedirs("data", exist_ok=True)

# Save the Thomas fire boundary as a shapefile, but only if the selection is not empty
if not thomas_fire_gdf.empty:
    thomas_fire_gdf.to_file(shapefile_output)
    print(f"Shapefile for Thomas fire saved at: {shapefile_output}")

Shapefile for Thomas fire saved at: data/Thomas_fire_perimeter.shp


In [18]:
# Delete directory file that contains all CA fires
shutil.rmtree(extraction_dir)
print(f"Extraction directory '{extraction_dir}' has been deleted.")

Extraction directory 'CA_perimeter_dir' has been deleted.
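An alternative to creating the extraction directory by hand and deleting it with `shutil.rmtree` is to let `tempfile.TemporaryDirectory` handle the cleanup. The sketch below is one possible variant, not what this notebook does; the `CA_perimeter_` prefix and the placeholder file are assumptions standing in for the extracted statewide shapefile.

```python
import os
import tempfile

with tempfile.TemporaryDirectory(prefix="CA_perimeter_") as tmp_dir:
    # Stand-in for extracting the statewide archive into tmp_dir
    placeholder = os.path.join(tmp_dir, "placeholder.txt")
    with open(placeholder, "w") as f:
        f.write("temporary data")
    existed_inside = os.path.exists(placeholder)

# The directory and its contents are removed when the with block exits
exists_after = os.path.exists(tmp_dir)
print(f"Existed inside block: {existed_inside}, exists afterwards: {exists_after}")
```

With this approach, everything that needs the statewide data (reading, filtering, saving the Thomas Fire subset) has to happen inside the `with` block, and the large files never risk being committed to the repository.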


## d. Format explanation
Briefly explain your reasoning for selecting that specific file format.

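I chose the shapefile format (.shp) because it stores the polygon geometry of the Thomas Fire boundary together with its attributes, and it matches the format of the source data. The saved geometries are used in the next notebook, where the fire perimeter is plotted on top of false color imagery.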
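One practical consequence of this choice is that a shapefile is a set of files sharing a base name, and the `.shp`, `.shx`, and `.dbf` parts are all required for it to be read. The sketch below checks that those parts sit next to each other. The helper name `missing_shapefile_parts` is hypothetical, and the demonstration uses empty placeholder files in a temporary folder rather than the real output in `data/`.

```python
import tempfile
from pathlib import Path

REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")  # mandatory parts of a shapefile


def missing_shapefile_parts(shp_path: str) -> list[str]:
    """Return the required suffixes that have no file next to shp_path."""
    base = Path(shp_path)
    return [s for s in REQUIRED_SUFFIXES if not base.with_suffix(s).exists()]


# Demonstration with empty placeholder files (the .shx part is left out on purpose)
with tempfile.TemporaryDirectory() as tmp_dir:
    shp = Path(tmp_dir) / "Thomas_fire_perimeter.shp"
    shp.touch()
    shp.with_suffix(".dbf").touch()
    missing = missing_shapefile_parts(str(shp))
    print(f"Missing parts: {missing}")
```

Running a check like this on `data/Thomas_fire_perimeter.shp` before committing would confirm that all companion files are included in the repository along with the `.shp` file.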