# Process Data 
This notebook loads wildfire data from the combined wildfire dataset aggregated by the US Geological Survey, "USGS_Wildland_Fire_Combined_Dataset.json", downloaded on October 30, 2001 from the website https://www.sciencebase.gov/catalog/item/61aa537dd34eb622f699df81.

It processes the data by (1) excluding fires prior to 1960, which are outside the time period of interest (the relevant time period begins in 1963, but a few additional years were included for context); (2) converting the geographic information from the 'ESRI:102008' coordinate system used in the raw data to the 'EPSG:4326' coordinate system; (3) selecting, for each fire, the polygon with the largest perimeter and dropping the remaining perimeters (the code keeps the first ring of each feature, which it treats as the largest); and (4) estimating, for each fire, the distance between the city of Kearney, Nebraska and the closest point on the perimeter of the fire. Results are saved in a series of CSV files. Data is loaded and processed in batches.


# License
The code in this notebook was developed by Sue Boyd in response to Assignment 1 in DATA 512, a course in the UW MS Data Science degree program. The code in this notebook is provided under the MIT license located in the same repository as this notebook.

# Creative Commons Attribution
Some of the functions in this notebook are adapted from code provided by David W. McDonald in a notebook entitled "wildfire_geo_proximity_example.ipynb" for use in UW Course DATA 512. It is licensed under the Creative Commons CC-BY license (https://creativecommons.org/licenses/by/4.0/). Modifications made to the code are noted in context.

# Chat GPT Attribution
Selected functions or code blocks in this notebook were created with assistance from Chat GPT (https://chat.openai.com/). For any function or code block that was created with assistance from Chat GPT, the impacted code is isolated in a function or procedure and the use of Chat GPT is noted. Information on the prompts used to query Chat GPT is provided at the end of the file.


# Step 0 Prepare Notebook 

# IMPORTANT NOTE
In addition to importing some standard packages, you will need to have downloaded the Reader file available in the wildfire.zip module on the 512 course website at https://drive.google.com/drive/folders/1OJktGAx86hvMtirCUkGnS292r-FpPvLo.

In [1]:
import numpy as np
import pandas as pd
import geopandas as gpd
import geojson
import matplotlib.pyplot as plt
import time, json, folium
from folium.plugins import MarkerCluster
from geopy.geocoders import Nominatim
from pyproj import Transformer, Geod
from shapely.geometry import Polygon
from shapely.geometry import Point
from geopy.distance import geodesic
import math

import sys
sys.path.append("C:/Users/suetb/OneDrive-UW/Data512/ComonAnal") # Replace as necessary for your environment 
from Reader import Reader as WFReader

# Step 1 Define Reader and Load Sample Data
The next code block defines a function to import data using a reader object, the source file for which is in the wildfire.zip module described in Step 0.

The code blocks that follow load a small sample data set of 13 fires using the reader object and display aspects of the data for inspection. The sample data will be used to illustrate the cleaning/processing functionality developed in this notebook. Note that the code below will not run properly unless the file "Wildfire_short_sample.json" is in the same working directory as this notebook.

Define a reader function for importing the data

In [2]:
# ATTRIBUTION NOTE: This function is adapted from code provided
# by Professor McDonald in a notebook entitled "wildfire_geo_proximity_example.ipynb."
# It has been modified to operate as a callable function
# rather than freestanding code; to enable skipping forward
# in order to re-do batches that resulted in an error; and to specify a cut-off
# year below which no data should be loaded


# define a function that takes as arguments a wfreader object;
# MAX_FEATURE_LOAD, the max number of fires to examine at one time;
# skip, the number of fires to skip before beginning to load
# (used when re-running specific batches); and earliest_yr,
# the earliest Fire_Year to keep

def read_features(wfreader, MAX_FEATURE_LOAD, skip, earliest_yr): 
    feature_list = list()
    feature_count = 0
    
   
    # Now, read through each of the features, saving them as dictionaries into a list
    # But only saving data if Fire_Year >= earliest_yr
    feature = wfreader.next()
    skip_count = 0
    
    
    # skip to where you want to start 
    while skip_count < skip:
        feature = wfreader.next()
        skip_count += 1
        # if we're skippin a lot of features, print progress
        if (skip_count % 5000) == 0:
            print(f"Skipped {skip_count} features")
 
    while feature:
        fire_year = feature["attributes"]["Fire_Year"]
        if fire_year >= earliest_yr:
            feature_list.append(feature)
        feature_count += 1
        # if this is the first feature we've loaded, print start
        if feature_count == 1:
            ID = feature["attributes"]["OBJECTID"]
            print("This batch is starting with fire ID", ID, "in year: ", fire_year)
        # if we're loading a lot of features, print progress
        if (feature_count % 10000) == 0:
            print(f"Examined {feature_count} features")
        # loaded the max we're allowed then break
        if feature_count >= MAX_FEATURE_LOAD:
            break
        feature = wfreader.next()

    #    Print the number of items (features) we examined
    print(f"Examined a total of {feature_count} features")

    #    How many items were loaded into the list? Could be
    #    fewer than the number examined due to the cut-off year
    print(f"Variable 'feature_list' contains {len(feature_list)} features")

    return(feature_list)


Illustrate the use of the reader by loading a short sample file 

In [3]:
file_name = "Wildfire_short_sample.json"
wfreader = WFReader(file_name)
feature_list = read_features(wfreader, 15, 0, 1890)
#feature_list[0]["attributes"]["OBJECTID"]

This batch is starting with fire ID 4956 in year:  1932
Examined a total of 13 features
Variable 'feature_list' contains 13 features


Here is an example where we examined more features than we loaded, because we specified a cut-off year of 2010.

In [4]:
wfreader = WFReader(file_name)
feature_list_post_2010 = read_features(wfreader, 15, 0, 2010)

This batch is starting with fire ID 4956 in year:  1932
Examined a total of 13 features
Variable 'feature_list' contains 5 features
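
The skip and cut-off behavior shown above can also be checked without the real data file. The sketch below is illustrative only: `ListReader` is a hypothetical stand-in for the course `Reader` (it mimics only a `next()` method that returns `None` once the data runs out, which is an assumption about the real class), and `read_batch` is a condensed restatement of `read_features` with the progress printing removed.

```python
class ListReader:
    """Hypothetical stand-in for the course Reader: next() yields features, then None."""
    def __init__(self, features):
        self._it = iter(features)

    def next(self):
        return next(self._it, None)


def read_batch(reader, max_load, skip, earliest_yr):
    """Condensed version of read_features: returns (kept_features, number_examined)."""
    feature = reader.next()
    # skip forward the requested number of features
    for _ in range(skip):
        feature = reader.next()
    kept, examined = [], 0
    while feature:
        # keep only fires in or after the cut-off year
        if feature["attributes"]["Fire_Year"] >= earliest_yr:
            kept.append(feature)
        examined += 1
        if examined >= max_load:
            break
        feature = reader.next()
    return kept, examined


# Made-up features for illustration only
years = [1932, 1950, 1965, 2003, 2011, 2019]
features = [{"attributes": {"OBJECTID": i + 1, "Fire_Year": y}} for i, y in enumerate(years)]

# No skipping: all six features are examined, four are in or after 1960
kept, examined = read_batch(ListReader(features), max_load=10, skip=0, earliest_yr=1960)

# Skip three features, then examine at most two
kept_skip, examined_skip = read_batch(ListReader(features), max_load=2, skip=3, earliest_yr=1960)
print(examined, len(kept), [f["attributes"]["OBJECTID"] for f in kept_skip])
```

Note that the batch size limits the number of features examined, not the number kept, which is why a batch can come back with fewer features than the batch size.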


In [5]:
# Here is what an individual feature looks like 
feature_list[0]

{'attributes': {'OBJECTID': 4956,
  'USGS_Assigned_ID': 4956,
  'Assigned_Fire_Type': 'Wildfire',
  'Fire_Year': 1932,
  'Fire_Polygon_Tier': 1,
  'Fire_Attribute_Tiers': '1 (1), 3 (3)',
  'GIS_Acres': 219999.23754748085,
  'GIS_Hectares': 89030.53273921262,
  'Source_Datasets': 'Comb_National_NIFC_Interagency_Fire_Perimeter_History (1), Comb_National_USFS_Final_Fire_Perimeter (1), Comb_National_WFDSS_Interagency_Fire_Perimeter_History (1), Comb_State_California_Wildfire_Polygons (1)',
  'Listed_Fire_Types': 'Wildfire (3), Likely Wildfire (1)',
  'Listed_Fire_Names': 'MATILIJA (4)',
  'Listed_Fire_Codes': 'No code provided (4)',
  'Listed_Fire_IDs': '0 (3)',
  'Listed_Fire_IRWIN_IDs': '',
  'Listed_Fire_Dates': 'Listed Wildfire Discovery Date(s): 1932-09-07 (2) | Listed Other Fire Date(s): 1899-12-30 - REVDATE field (1), 1932-09-07 - DATE_CUR field (1)',
  'Listed_Fire_Causes': '9 - Miscellaneous (1)',
  'Listed_Fire_Cause_Class': 'Undetermined (3), Human (1)',
  'Listed_Rx_Reported_Ac

# Step 2 - Write Functions for Data Prep/Cleaning 
The following set of functions enables us to convert from the JSON format returned by the reader into a geopandas GeoDataFrame, and to convert to a different geographic coordinate system that will enable us to compute distances between points.

First, define a function that converts the feature list to a geopandas dataframe
including all attributes from the JSON file.  

In [6]:
# define a function that converts the feature list to a geopandas dataframe
# including all attributes from the JSON file 

# See Chat GPT Attribution Note at end of notebook    

def feature_list_to_gpd(feature_list):
    # extract the attributes 
    attributes_list = [feature['attributes'] for feature in feature_list]
    attribute_df = pd.DataFrame(attributes_list)
    # extract the geoseries data, and convert to a polygon
    # when multiple "rings" present, use the first, biggest ring 
    geometry_list = [Polygon(feature['geometry']["rings"][0]) for feature in feature_list]
    geometry_series = gpd.GeoSeries(geometry_list, crs="ESRI:102008")
    # combine into one df 
    fires_gpd = gpd.GeoDataFrame(attribute_df, geometry=geometry_series)
    return(fires_gpd)
    

We'll also create a slimmed down geopandas dataset, containing only Object ID and the geometry information.  We'll use this when pulling the big dataset to speed things up, and then rejoin it with a csv file of attributes joining on the Object ID column. 

In [7]:
# define a function that converts the feature list to a geopandas dataframe
# including only the OBJECTID attribute and the geometry  

# See Chat GPT Attribution Note at end of notebook  

def feature_list_to_gpd_slim(feature_list):
    # extract OBJECT ID as the only attribute  
    OBJECTID_list = [feature['attributes']["OBJECTID"] for feature in feature_list]
    attribute_df = pd.DataFrame(OBJECTID_list)
    
    # extract the geoseries data, and convert to a polygon
    # when multiple "rings" present, use the first, biggest ring 
    geometry_list = []
    for feature in feature_list: 
        try:
            next_geom = Polygon(feature['geometry']["rings"][0])
        except Exception:
            # features without a usable "rings" entry (e.g. curve rings) get a missing geometry
            print("Found a curve ring at OBJECTID", feature["attributes"]["OBJECTID"])   
            next_geom = float('nan')

        geometry_list.append(next_geom) 
    geometry_series = gpd.GeoSeries(geometry_list, crs="ESRI:102008")
    # combine into one df 
    fires_gpd = gpd.GeoDataFrame(attribute_df, geometry=geometry_series)
    fires_gpd = fires_gpd.set_index(fires_gpd.columns[0])
    fires_gpd.index.name = 'OBJECTID'
    return(fires_gpd)
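
Both conversion functions keep only `rings[0]` and rely on it being the largest ring; that assumption is not checked anywhere in this notebook. Below is a minimal sketch of one way to check it in plain Python, using the shoelace formula for ring area. The helper names `ring_area` and `largest_ring_index` are hypothetical, and the toy geometry is made up for illustration.

```python
def ring_area(ring):
    """Absolute planar area of a ring (list of [x, y] pairs) via the shoelace formula."""
    total = 0.0
    # pair each vertex with the next one, wrapping around to the first
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def largest_ring_index(geometry):
    """Index of the ring with the largest area, or None if there is no 'rings' entry."""
    rings = geometry.get("rings")
    if not rings:
        return None  # e.g. a curve-ring feature with no usable 'rings' entry
    areas = [ring_area(ring) for ring in rings]
    return areas.index(max(areas))


# Toy geometry: a 4 x 4 outer square followed by a 1 x 1 inner square
toy_geometry = {
    "rings": [
        [[0, 0], [4, 0], [4, 4], [0, 4], [0, 0]],
        [[1, 1], [2, 1], [2, 2], [1, 2], [1, 1]],
    ]
}
print(largest_ring_index(toy_geometry))
```

Running `largest_ring_index` over a batch and counting the features where the result is not 0 would show how often the first-ring assumption fails, if at all.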

Here's an example applying the first, "thick" feature_list_to_gpd function to our small sample set.

In [8]:
fires = feature_list_to_gpd(feature_list)

In [9]:
# display columns 
fires.columns

Index(['OBJECTID', 'USGS_Assigned_ID', 'Assigned_Fire_Type', 'Fire_Year',
       'Fire_Polygon_Tier', 'Fire_Attribute_Tiers', 'GIS_Acres',
       'GIS_Hectares', 'Source_Datasets', 'Listed_Fire_Types',
       'Listed_Fire_Names', 'Listed_Fire_Codes', 'Listed_Fire_IDs',
       'Listed_Fire_IRWIN_IDs', 'Listed_Fire_Dates', 'Listed_Fire_Causes',
       'Listed_Fire_Cause_Class', 'Listed_Rx_Reported_Acres',
       'Listed_Map_Digitize_Methods', 'Listed_Notes', 'Processing_Notes',
       'Wildfire_Notice', 'Prescribed_Burn_Notice', 'Wildfire_and_Rx_Flag',
       'Overlap_Within_1_or_2_Flag', 'Circleness_Scale', 'Circle_Flag',
       'Exclude_From_Summary_Rasters', 'Shape_Length', 'Shape_Area',
       'geometry'],
      dtype='object')

In [10]:
# display a subset of columns in the dataframe 
display_col = ["OBJECTID", "Fire_Year", "GIS_Acres","geometry"]
fires[display_col].head()

Unnamed: 0,OBJECTID,Fire_Year,GIS_Acres,geometry
0,4956,1932,219999.237547,"POLYGON ((-2007910.370 -382654.303, -2007886.8..."
1,46089,2003,271157.846172,"POLYGON ((-1847979.728 -641621.933, -1847975.9..."
2,46922,2003,29.356241,"POLYGON ((-1643028.688 578595.772, -1643029.70..."
3,47304,2003,1.146085,"POLYGON ((-2134635.364 254909.344, -2134634.63..."
4,52158,2007,240358.650507,"POLYGON ((-2048898.512 -355311.697, -2048888.4..."


And here's an example applying the feature_list_to_gpd_slim function to our small sample set.

In [11]:
fires_slim = feature_list_to_gpd_slim(feature_list)
fires_slim.head()

OBJECTID,geometry
4956,"POLYGON ((-2007910.370 -382654.303, -2007886.8..."
46089,"POLYGON ((-1847979.728 -641621.933, -1847975.9..."
46922,"POLYGON ((-1643028.688 578595.772, -1643029.70..."
47304,"POLYGON ((-2134635.364 254909.344, -2134634.63..."
52158,"POLYGON ((-2048898.512 -355311.697, -2048888.4..."


In [12]:
# See chat GPT attribution note at end of notebook 

# Set the initial CRS (ESRI:102008 in this case)
fires.crs = 'ESRI:102008'
# Convert the GeoDataFrame to EPSG:4326
fires_4326 = fires.to_crs(epsg=4326)

fires_4326[display_col].head()


Unnamed: 0,OBJECTID,Fire_Year,GIS_Acres,geometry
0,4956,1932,219999.237547,"POLYGON ((-119.30481 34.63155, -119.30416 34.6..."
1,46089,2003,271157.846172,"POLYGON ((-116.86173 32.78723, -116.86159 32.7..."
2,46922,2003,29.356241,"POLYGON ((-117.73172 43.29726, -117.73174 43.2..."
3,47304,2003,1.146085,"POLYGON ((-122.74336 39.59129, -122.74336 39.5..."
4,52158,2007,240358.650507,"POLYGON ((-119.83913 34.77197, -119.83897 34.7..."


In [13]:
#f = "fires_small.csv"
#fires_4326.to_csv(f)

# Step 3 - Map the Data; Make Sure It Looks Reasonable 

First, let's map some West Coast cities near where these fires are, 
plus I'll add the city of Kearney, Nebraska since I will be interested in that one later.   

In [14]:
# ATTRIBUTION NOTE: This dictionary of cities and their lat/long is 
# adapted from code provided by Professor McDonald 
# in a notebook entitled "wildfire_geo_proximity_example.ipynb."  It has been modified to
# add the city of Kearney, Nebraska (spelled 'Kearny' in the identifiers below)

CITY_LOCATIONS = {
    'anchorage' :     {'city'   : 'Anchorage',
                       'latlon' : [61.2176, -149.8997] },
    'ocean_shores' :  {'city'   : 'Ocean Shores',    
                       'latlon' : [47.0074, -124.1614] },
    'seaside' :       {'city'   : 'Seaside',
                       'latlon' : [45.9932, -123.9226] }, 
    'bend' :          {'city'   : 'Bend',
                       'latlon' : [44.0582, -121.3153] }, 
    'medford' :       {'city'   : 'Medford',
                       'latlon' : [42.3265, -122.8756] }, 
    'crescent_city' : {'city'   : 'Crescent City',
                       'latlon' : [41.7558, -124.2026] }, 
    'tomales' :       {'city'   : 'Tomales',
                       'latlon' : [38.2411, -122.9033] }, 
    'barstow' :       {'city'   : 'Barstow',
                       'latlon' : [34.8958, -117.0173] }, 
    'redding' :       {'city'   : 'Redding',
                       'latlon' : [40.5865, -122.3916] }, 
    'encinitas' :     {'city'   : 'Encinitas',
                       'latlon' : [33.0370, -117.2920]}, 
    'Kearny' :        {'city'   : 'Kearny',
                       'latlon' : [40.7017, -99.0825]}   # latlon from Wikipedia 
}

In [15]:
# Define the US map's center coordinates
us_map = folium.Map(location=[37.0902, -95.7129], zoom_start=4)
#us_map


In [16]:
# Add the cities to the map 
# See chat GPT attribution note at end of notebook 

for city, data in CITY_LOCATIONS.items():
    latlon = data['latlon']
    city_name = data['city']
    folium.Marker(location=latlon, popup=city_name).add_to(us_map)

# Save the map to an HTML file or display it in your Jupyter Notebook
#us_map.save("us_cities_map.html")

# Display in Notebook 
#us_map


In [17]:
# Add the fire locations on the map in red circles 

# See chat GPT attribution note at end of notebook 

# Iterate over each row in the fires_4326 GeoDataFrame
for index, row in fires_4326.iterrows():
    # Extract the polygon and Fire_Year from the current row
    polygon = row['geometry']
    fire_year = str(row['Fire_Year'])  # Convert Fire_Year to a string

    # Convert the shapely polygon to GeoJSON
    # (named fire_geojson so it does not shadow the imported geojson module)
    fire_geojson = gpd.GeoSeries([polygon]).__geo_interface__

    # Customize the style to make the data point larger
    style = {
        'color': 'red',
        'weight': 5,
        'fillColor': 'orange',
        'fillOpacity': 0.5,
    }
    
    # Create a GeoJSON representation on the map with a popup displaying Fire_Year
    folium.GeoJson(
        fire_geojson,
        name=fire_year,
        style_function=lambda x: style,
        popup=folium.Popup(fire_year, parse_html=True)
    ).add_to(us_map)

us_map


# Step 4 - Create Functions to Calculate Distances

First, we'll define a function that calculates the distance between two points

In [18]:
# ATTRIBUTION NOTE: This function is adapted from code provided by Professor McDonald 
# in a notebook entitled "wildfire_geo_proximity_example.ipynb."  It has been modified to
# perform a single calculation (whereas Prof. McDonald's code was a for loop) and to 
# be part of a defined function rather than a freestanding code block 


# define a function that takes 4 arguments, the long, and lat respectively
# of the first point, and the long, lat of the second point
# and returns the distance in miles between the two points. 
def calc_point_dist(p1_long, p1_lat, p2_long, p2_lat):
    geodcalc = Geod(ellps='WGS84') 
    distance = geodcalc.inv(p1_long,p1_lat,p2_long,p2_lat)
    d_meters = distance[2]
    d_miles = d_meters * 0.00062137 # constant to convert meters to miles
    return d_miles

Let's check to make sure the function accurately calculates the distance between Redding and Medford. 

In [19]:
# Calc distance between Redding and Medford using calc_point_dist function

city1 = CITY_LOCATIONS["redding"]
city2 = CITY_LOCATIONS["medford"]
p1_long = city1['latlon'][1]
p1_lat = city1['latlon'][0]
p2_long = city2['latlon'][1]
p2_lat = city2['latlon'][0]
d =calc_point_dist(p1_long, p1_lat, p2_long, p2_lat)
print("Distance between", city1["city"], "and", city2["city"], "is", round(d), "miles")

Distance between Redding and Medford is 123 miles


The calculated distance of 123 miles between Redding and Medford matches the result calculated in Prof. McDonald's notebook.
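
As a further, independent sanity check, the same distance can be approximated with the haversine formula using only the standard library. This is a sketch, not the method used in this notebook: it treats the Earth as a sphere (the mean radius of 6,371,008.8 m is an assumption of the sketch), whereas `calc_point_dist` uses the WGS84 ellipsoid through `Geod`, so small differences between the two are expected. The function name `haversine_miles` is hypothetical.

```python
import math


def haversine_miles(lon1, lat1, lon2, lat2, radius_m=6371008.8):
    """Great-circle distance in miles on a sphere (approximation, not WGS84 geodesic)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    meters = 2 * radius_m * math.asin(math.sqrt(a))
    return meters * 0.00062137  # same meters-to-miles constant as calc_point_dist


# (long, lat) pairs taken from CITY_LOCATIONS
redding = (-122.3916, 40.5865)
medford = (-122.8756, 42.3265)
d_check = haversine_miles(*redding, *medford)
print(round(d_check, 1))
```

The spherical estimate should land within about a mile of the 123 miles reported above; a much larger gap would suggest swapped latitude and longitude arguments.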

Now, we'll define a function that calculates the distance between a place (point) and a fire polygon. We'll use the method that calculates the shortest distance between the place and a point on the perimeter of the fire polygon.

In [20]:
# ATTRIBUTION NOTE: This function is adapted from code provided by Professor McDonald 
# in a notebook entitled "wildfire_geo_proximity_example.ipynb."  It has been modified to
# take as an argument a list of fire polygon exterior coordinates, rather than ring data.
# It is assumed that the fire polygon exterior coordinates are already in CRS EPSG:4326.
# It also takes the place long, lat coordinates as separate arguments.

#    Define a function that takes three parameters 
#        p1_long - the longitude of a place, from which distance to fire will be measured 
#        p1_lat -  the latitude  of a place, from which distance to fire will be measured 
#        exterior_coords - a list of the exterior coordinates of the fire polygon.  This is a list, 
#        where each element in the list is a tuple containing the (long, lat) in decimal degrees 

#    The function returns a list containing the shortest distance (in miles) to the perimeter
#    and the perimeter point where that distance occurs

def shortest_distance_from_place_to_fire_perimeter(p1_long, p1_lat, exterior_coords):
    geodcalc = Geod(ellps='WGS84')
    closest_point = list()
    # run through each point in the exterior_coords
    for coord in exterior_coords:
        c_long = coord[0]
        c_lat = coord[1]
        # Note that the 'inv()' function wants coordinates in Longitude,Latitude order by default
        d = geodcalc.inv(p1_long, p1_lat, c_long, c_lat)  
        # convert the distance to miles
        distance_in_miles = d[2]*0.00062137
        # if it's closer to the city than the point we have, save it
        if not closest_point:
            closest_point.append(distance_in_miles)
            closest_point.append(coord)
        elif closest_point and closest_point[0]>distance_in_miles:
            closest_point = list()
            closest_point.append(distance_in_miles)
            closest_point.append(coord)
    return closest_point            
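
The loop above is a linear scan for the minimum over the perimeter vertices. The sketch below restates that idea with Python's built-in `min()`. To stay self-contained it swaps in a spherical haversine distance instead of `Geod.inv`, so it illustrates the search logic only and is not a drop-in replacement; the names `haversine_miles` and `closest_vertex` are hypothetical.

```python
import math


def haversine_miles(lon1, lat1, lon2, lat2, radius_m=6371008.8):
    """Spherical stand-in for Geod.inv: great-circle distance in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a)) * 0.00062137


def closest_vertex(place_long, place_lat, exterior_coords, dist_fn=haversine_miles):
    """Return [distance_in_miles, (long, lat)] for the nearest vertex, or [] if no coords."""
    if not exterior_coords:
        return []  # mirrors the empty list returned by the original function
    best = min(exterior_coords, key=lambda c: dist_fn(place_long, place_lat, c[0], c[1]))
    return [dist_fn(place_long, place_lat, best[0], best[1]), best]


# Made-up (long, lat) vertices for illustration
toy_coords = [(1.0, 1.0), (0.5, 0.2), (2.0, 0.0)]
result = closest_vertex(0.0, 0.0, toy_coords)
print(result[1])
```

In both versions only the vertices are tested, so for a polygon with a long straight edge the true closest perimeter point may lie between two vertices and be somewhat nearer than the reported distance.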




Let's demonstrate the functionality by measuring the distance between Redding and a few of the wildfires.

In [21]:
city = CITY_LOCATIONS["redding"]
p1_long = city['latlon'][1]
p1_lat = city['latlon'][0]


for idx in range (3):
    row = fires_4326.iloc[idx]
    fire_poly = row["geometry"]
    fire_name = row['Listed_Fire_Names'].split(',')[0]
    exterior_coords = list(fire_poly.exterior.coords) 
    closest_point = shortest_distance_from_place_to_fire_perimeter(p1_long, p1_lat, exterior_coords)

    print("Distance between", fire_name, "and", city["city"], "was", round(closest_point[0], 2), "miles") 
    print("The closest perimeter point was at long, lat", closest_point[1])
    print("")



Distance between MATILIJA (4) and Redding was 441.12 miles
The closest perimeter point was at long, lat (-119.41551960560358, 34.64321607909176)

Distance between CEDAR (6) and Redding was 618.96 miles
The closest perimeter point was at long, lat (-116.86327594379371, 32.786728032123094)

Distance between CEDAR MOUNTAIN (4) and Redding was 304.12 miles
The closest perimeter point was at long, lat (-117.7380131593176, 43.298936850119055)



These calculated distances match calculated results in Prof. McDonald's notebook. 

# Step 5: Find Fires Within 1250 miles of your assigned city

In [22]:
# set some constants 
CITY = "Kearny"
KEARNY_LAT = 40.7017
KEARNY_LONG = -99.0825

In [23]:
def distance_to_my_city (exterior_coords):
    d = shortest_distance_from_place_to_fire_perimeter(KEARNY_LONG, KEARNY_LAT, exterior_coords)
    return d   
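
The heading for this step mentions a 1250-mile radius, but the cells in this notebook only compute distances; no filter is applied here. A minimal sketch of how such a filter might look on already-processed rows follows. It assumes rows are plain dictionaries, that the distance column is named "Distance to Kearny" as assigned in Step 6, and that the threshold is inclusive; the function name `within_range` and the sample rows are made up.

```python
import math

MAX_DISTANCE_MILES = 1250  # threshold taken from the Step 5 heading


def within_range(rows, max_miles=MAX_DISTANCE_MILES, column="Distance to Kearny"):
    """Keep rows whose distance is known (not NaN) and no more than max_miles."""
    kept = []
    for row in rows:
        distance = row[column]
        # NaN marks fires whose geometry could not be processed
        if not math.isnan(distance) and distance <= max_miles:
            kept.append(row)
    return kept


# Made-up rows for illustration only
sample_rows = [
    {"OBJECTID": 1, "Distance to Kearny": 300.5},
    {"OBJECTID": 2, "Distance to Kearny": 1400.0},
    {"OBJECTID": 3, "Distance to Kearny": float("nan")},
    {"OBJECTID": 4, "Distance to Kearny": 1250.0},
]
nearby = within_range(sample_rows)
print([row["OBJECTID"] for row in nearby])
```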

# Step 6 Load and Process the Big DataSet

In [24]:
batch_sz = 5000
file_name = "USGS_Wildland_Fire_Combined_Dataset.json"
wfreader = WFReader(file_name)
cut_off_yr = 1960
#cut_off_yr = 1800

In [25]:
def replace_geo_with_distance (dat):
    distances = []
    closest_lat = []
    closest_long = []
    for idx in range (dat.shape[0]):
        try: 
            fire_poly = dat.iloc[idx]["geometry"]
            exterior_coords = list(fire_poly.exterior.coords)
            d = shortest_distance_from_place_to_fire_perimeter(KEARNY_LONG, KEARNY_LAT, exterior_coords)
        except Exception:
            # missing or unusable geometry (e.g. a curve-ring feature): record NaN placeholders
            d = [float('nan'), [float('nan'), float('nan')]]
            
        distances.append(d[0])
        long = d[1][0]
        closest_long.append(long)
        lat = d[1][1]
        closest_lat.append(lat)
    dat["Distance to Kearny"] = distances
    dat["Closest long"] = closest_long
    dat["Closest lat"] = closest_lat
    dat = dat[dat.columns.drop("geometry")]
       
    return dat 

In [26]:
def run_one_batch(batch_no, batch_sz, cut_off_yr):
    feature_list = read_features(wfreader, batch_sz, 0, cut_off_yr)
    dat = feature_list_to_gpd_slim(feature_list)
    try: 
        dat2 =  dat.to_crs(epsg=4326)
        dat3 = replace_geo_with_distance(dat2)
        f = f"fires_distance_{batch_no}.csv"
        dat3.to_csv(f)
        print(f"Finished saving batch_{batch_no}")
        print("")
        return dat3
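    except Exception: 
        # Reached when processing fails, e.g. when the batch is empty because
        # none of its fires fall in or after cut_off_yr; nothing is returned in that case
        print(f"Error saving batch_{batch_no}, may be because batch had no fires before {cut_off_yr}")
        print("")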


In [27]:
# a re-run function when you want to start part way through the file 

def re_run_batch(batch_no, batch_sz, cut_off_yr, skip_to):
    # reset the reader at zero
    wfreader = WFReader(file_name)
    
    # make use of the skip argument of read_features to start part way through the file
    feature_list = read_features(wfreader, batch_sz, skip_to, cut_off_yr)
    
    dat = feature_list_to_gpd_slim(feature_list)
    try: 
        dat2 =  dat.to_crs(epsg=4326)
        dat3 = replace_geo_with_distance(dat2)
        f = f"fires_distance_{batch_no}.csv"
        dat3.to_csv(f)
        print(f"Finished saving batch_{batch_no}")
        print("")
        return dat3
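    except Exception: 
        # Reached when processing fails, e.g. when the batch is empty because
        # none of its fires fall in or after cut_off_yr; nothing is returned in that case
        print(f"Error saving batch_{batch_no}, may be because batch had no fires before {cut_off_yr}")
        print("")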
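
When re-running a single batch, the `skip_to` value has to line up with where that batch started in the original sequential run. The sketch below shows the arithmetic, assuming batches were read in order from a fresh reader starting at batch 0 with a constant batch size, and that OBJECTIDs are consecutive starting at 1 (the recorded output in this notebook is consistent with that, but it is an assumption). The helper names are hypothetical.

```python
def skip_for_batch(batch_no, batch_sz):
    """Number of features that precede the given batch in a sequential run."""
    return batch_no * batch_sz


def first_objectid_for_batch(batch_no, batch_sz):
    """Expected first OBJECTID of the batch, assuming IDs are consecutive from 1."""
    return skip_for_batch(batch_no, batch_sz) + 1


# Example: batch 21 with 5000 features per batch
skip_to = skip_for_batch(21, 5000)
expected_first_id = first_objectid_for_batch(21, 5000)
print(skip_to, expected_first_id)

# Hypothetical usage with the notebook's function (not executed here):
# dat21 = re_run_batch(21, batch_sz, cut_off_yr, skip_for_batch(21, batch_sz))
```

Comparing the expected first ID with the "This batch is starting with fire ID" message printed by `read_features` is a quick way to confirm the re-run picked up in the right place.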
   
   

In [28]:
dat0 = run_one_batch(0, batch_sz, cut_off_yr)

This batch is starting with fire ID 1 in year:  1860
Examined a total of 5000 features
Variable 'feature_list' contains 0 features
Error saving batch_0, may be because batch had no fires before 1960



In [29]:
for i in range(1,10):
    dat = run_one_batch(i, batch_sz, cut_off_yr)

This batch is starting with fire ID 5001 in year:  1932
Examined a total of 5000 features
Variable 'feature_list' contains 0 features
Error saving batch_1, may be because batch had no fires before 1960

This batch is starting with fire ID 10001 in year:  1950
Examined a total of 5000 features
Variable 'feature_list' contains 1924 features
Finished saving batch_2

This batch is starting with fire ID 15001 in year:  1965
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_3

This batch is starting with fire ID 20001 in year:  1976
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_4

This batch is starting with fire ID 25001 in year:  1984
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_5

This batch is starting with fire ID 30001 in year:  1989
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finishe

In [30]:
for i in range(10,20):
    dat = run_one_batch(i, batch_sz, cut_off_yr)

This batch is starting with fire ID 50001 in year:  2005
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_10

This batch is starting with fire ID 55001 in year:  2008
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_11

This batch is starting with fire ID 60001 in year:  2012
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_12

This batch is starting with fire ID 65001 in year:  2015
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_13

This batch is starting with fire ID 70001 in year:  2017
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_14

This batch is starting with fire ID 75001 in year:  2019
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_15

This batch is st

In [31]:
for i in range(20,25):
    dat = run_one_batch(i, batch_sz, cut_off_yr)

This batch is starting with fire ID 100001 in year:  2003
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_20

This batch is starting with fire ID 105001 in year:  2016
Examined a total of 5000 features
Variable 'feature_list' contains 4969 features
Found a curve ring at OBJECTID 109605
Finished saving batch_21

This batch is starting with fire ID 110001 in year:  2007
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Found a curve ring at OBJECTID 110224
Found a curve ring at OBJECTID 110639
Found a curve ring at OBJECTID 111431
Found a curve ring at OBJECTID 111776
Found a curve ring at OBJECTID 111897
Found a curve ring at OBJECTID 112410
Found a curve ring at OBJECTID 112415
Found a curve ring at OBJECTID 113411
Found a curve ring at OBJECTID 113665
Found a curve ring at OBJECTID 113738
Found a curve ring at OBJECTID 113766
Found a curve ring at OBJECTID 113805
Found a curve ring at OBJECTID 114309

In [32]:
dat = run_one_batch(25, batch_sz, cut_off_yr)

This batch is starting with fire ID 125001 in year:  2018
Examined a total of 5000 features
Variable 'feature_list' contains 4960 features
Found a curve ring at OBJECTID 125046
Found a curve ring at OBJECTID 125745
Found a curve ring at OBJECTID 127492
Finished saving batch_25



In [33]:
dat = run_one_batch(26, batch_sz, cut_off_yr)

This batch is starting with fire ID 130001 in year:  1986
Examined a total of 5000 features
Variable 'feature_list' contains 5000 features
Finished saving batch_26



In [34]:
dat = run_one_batch(27, batch_sz, cut_off_yr)

This batch is starting with fire ID 135001 in year:  2019
Examined a total of 61 features
Variable 'feature_list' contains 61 features
Finished saving batch_27
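
The batch files written above still need to be stacked into a single table before further analysis. The following is a minimal standard-library sketch of that step, assuming the `fires_distance_{batch_no}.csv` naming and the column names used in this notebook; the function name `combine_batches` is hypothetical, and the demonstration writes two tiny made-up files to a temporary folder rather than touching real output.

```python
import csv
import glob
import os
import tempfile


def combine_batches(folder, pattern="fires_distance_*.csv"):
    """Read every batch file in the folder and return all rows as a list of dicts."""
    rows = []
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        with open(path, newline="") as fh:
            rows.extend(csv.DictReader(fh))
    return rows


# Demonstration with made-up data in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    for batch_no, objectids in [(2, [10001, 10002]), (3, [15001])]:
        path = os.path.join(tmp, f"fires_distance_{batch_no}.csv")
        with open(path, "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["OBJECTID", "Distance to Kearny", "Closest long", "Closest lat"])
            for oid in objectids:
                writer.writerow([oid, 100.0, -100.0, 40.0])
    combined = combine_batches(tmp)

print(len(combined))
```

Values read this way are strings, and the files are visited in lexicographic rather than numeric order, so convert the columns and sort on OBJECTID afterwards if order matters. Batches 0 and 1 produced no file in the run above, so a gap in the file numbering is expected.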



# CHAT GPT ATTRIBUTION

The following function(s) or codeblock(s) contained in this notebook were written with assistance from Chat GPT available at: https://chat.openai.com/. In some cases, code suggested by Chat GPT was then further modified by the Notebook author, Sue Boyd.

***
For assistance in writing the "feature_list_to_gpd" and "feature_list_to_gpd_slim" functions, Chat GPT was given the following prompts:

"I have a list called feature_list.  Each element of the list is a JSON dictionary.  Write code to convert it to a geopandas dataframe"  

AND

"Each list is in this format: {Insert specific instance of output for the specific JSON file loaded in this notebook}. Write code to convert the list to a geopandas dataframe. 

***
For assistance in writing the code block used to convert coordinate systems, Chat GPT was given the following prompt: 

"I have a df in geopandas with geometries in the ESRI:102008 system.  Write code to convert it to EPSG:4326."

***
For assistance in writing the code block that displayed the city data on the US map, Chat GPT was given the following prompt: 

"I have the following data of city locations.  Write python code to map them onto a map of the US"

AND

"Sorry, here' s the data I want you to map 
CITY_LOCATIONS = {
    'anchorage' :     {'city'   : 'Anchorage',
                       'latlon' : [61.2176, -149.8997] }, . . . } 

***

For assistance in writing the code block that displayed fire data on the US map,  Chat GPT was given the following prompts:

"I have a geo json datapoint: {'type': 'FeatureCollection', 'features': {'id': '0', 'type': 'Feature', 'properties': {}, 'geometry': {'type': 'Polygon', 'coordinates': (((-119.30481002793371, 34.63154707144714), (-119.30415605226565, 34.63043240077834), (-119.30344631669736, 34.629981564085355),   Write code to disply that on a map."

AND 

"redo that code using folium"

AND

"Adust that code so that the datapoint is much bigger and there is a popup tool that dispalys the word "Fire"

***

