## Purpose

This notebook filters the raw wildfire data down to the records we need. Specifically, we limit the data to:
1. The last 60 years (1963-2023)
2. Fires within 1250 miles of Pahrump, Nevada (Nye County)
3. Fires which burned from May 1st through October 31st within the allotted years
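
As a rough illustration of the third criterion, the sketch below checks whether a fire record mentions a date from May through October. It assumes dates appear as `YYYY-MM-DD` strings inside the `Listed_Fire_Dates` attribute, as in the sample record printed later in this notebook. The helper name `in_fire_season` is hypothetical, and treating *any* listed date as sufficient is an assumption, not necessarily the rule this project applies.

```python
import re
from datetime import date

# Assumption: dates are embedded as ISO strings (YYYY-MM-DD) in Listed_Fire_Dates.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

def in_fire_season(listed_dates, first_month=5, last_month=10):
    """Return True if any listed date falls from May 1 through October 31."""
    for text in DATE_PATTERN.findall(listed_dates or ""):
        try:
            month = date.fromisoformat(text).month
        except ValueError:
            continue  # skip strings that look like dates but are not valid
        if first_month <= month <= last_month:
            return True
    return False

# Date string copied from the sample record shown later in this notebook.
sample = ("Listed Wildfire Discovery Date(s): 1963-08-06 (3) | "
          "Listed Wildfire Controlled Date(s): 1963-12-31 (3)")
print(in_fire_season(sample))
```

Because the month bounds cover whole months, no day-level comparison is needed for a May 1 to October 31 window.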


We will begin by loading basic Python libraries.

In [1]:
#Import python libraries
import json
import time
import urllib.parse
import requests
import pandas as pd

Next, we will load the combined fire dataset JSON from the raw data folder.

In [3]:
#Loading wildfire data to see what it looks like
#The with block closes the file once the dictionary has been loaded
with open('../../USGS_Wildland_Fire_Combined_Dataset.json') as wf_file:
    #Makes a dictionary from file
    wf_dict = json.load(wf_file)

We will remove all fires with a Fire_Year before 1963. Years after 2023 are not a concern, as the fire data only runs through 2020. We will save the reduced fire list to a file so the programmer can track it with version control software such as GitHub, or return to the data later.

In [53]:
'''#Creating a new wf dict
lim_wf = {}

#Parsing through wf_dict['features'] list for each attribute seeing fire year
#if in range, add to the lim_wf dict
for fire in wf_dict['features']:
    if fire['attributes']['Fire_Year'] >= 1963:
        lim_wf[fire['attributes']['OBJECTID']] = fire['attributes']
        
#Checking the range of dates
dates = []
for key in list(lim_wf.keys()):
    dates.append(lim_wf[key]['Fire_Year'])
lim_dates = set(dates)
sort_dates = list(lim_dates)
if min(sort_dates) < 1963:
    print("Wrong dates - included something smaller than 1963")
else: print("Fire dates start in {0}".format(min(sort_dates)))
if max(sort_dates) > 2023:
    print("Wrong dates - included something bigger than 2023")
else: print("Fire dates end in {0}".format(max(sort_dates)))
if len(sort_dates) == max(sort_dates)-min(sort_dates)+1:
    print("Fires happen every year between {0} and {1}".format(
        min(sort_dates), max(sort_dates)))
else:
    print("WARNING: fires don't happen every year")
    
#Making the fire dict a JSON object
fire_json_object = json.dumps(lim_wf, indent = 4) 

#Writing to the file
with open('../intermediate_data/limited_fires.json', 'w') as outfile:
    outfile.write(fire_json_object)'''



OLD - We will now load the intermediate data in the event the programmer chooses to come back to run the rest of the project.

In [31]:
'''#Loading wildfire data to see what it looks like
limited_fire_file = open('../intermediate_data/limited_fires.json')
 
#Makes a dictionary from the file
lim_fire_dict = json.load(limited_fire_file)'''

To limit the amount of data for distance processing (all fires must be within 1250 miles of Pahrump), we will keep only those which occurred in 1963 or later.

In [35]:
#Creating list of OBJECTIDs for fires which happened in 1963 or later
post_63_id = []

#Parsing through wf_dict['features'] list for each attribute seeing fire year
#if in range, add to the post_63_id list
for fire in wf_dict['features']:
    if fire['attributes']['Fire_Year'] >= 1963:
        post_63_id.append(fire['attributes']['OBJECTID'])

print(len(post_63_id)) #Should be 117578

117578


In [47]:
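#Copy the original fire dict (shallow copy; 'features' is replaced below)
filter_wf_dict = wf_dict.copy()

#Making the features blank
filter_wf_dict['features'] = []

#Set of wanted OBJECTIDs so each membership check is fast
post_63_id_set = set(post_63_id)

#Adding in only the features/fires I want
#The counter starts outside the loop so it is not reset on every fire
fire_count = 0
for fire in wf_dict['features']:
    if fire['attributes']['OBJECTID'] in post_63_id_set:
        filter_wf_dict['features'].append(fire)
        fire_count = fire_count + 1
        if fire_count % 1000 == 0:
            print("Added {0} fires to filtered list".format(fire_count))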

In [52]:
#Making the fire dict a JSON object
filter_wf_json_object = json.dumps(filter_wf_dict, indent = 4) 

#Writing to the file
with open('../intermediate_data/filtered_fires.json', 'w') as outfile:
    outfile.write(filter_wf_json_object)
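
One way to confirm that a write like this succeeded is to read the file back and compare it with the in-memory dictionary. The sketch below does this with a small made-up dictionary in a temporary directory, so it does not touch the real `filtered_fires.json`. The `toy` data is purely illustrative.

```python
import json
import os
import tempfile

# Toy stand-in for filter_wf_dict; the values are made up for illustration.
toy = {"features": [{"attributes": {"OBJECTID": 2, "Fire_Year": 1963}}]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filtered_fires.json")

    # Write the dictionary as indented JSON.
    with open(path, "w") as outfile:
        json.dump(toy, outfile, indent=4)

    # Read it back while the temporary directory still exists.
    with open(path) as infile:
        reloaded = json.load(infile)

# The reloaded data should match what was written.
print(reloaded == toy)
```

The same comparison on the full dataset would hold two copies in memory at once, so checking only `len(reloaded['features'])` may be a lighter alternative.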

In [54]:
len(filter_wf_dict['features'])

117578
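
The two-step approach above (collect the OBJECTIDs, then match features against them) could also be written as a single pass over the features. The following is a minimal sketch on toy data. The function name `filter_by_year` and the `note` key are hypothetical, and it assumes every record has an integer `Fire_Year`, as the cells above do.

```python
def filter_by_year(dataset, first_year=1963):
    """Return a shallow copy of dataset keeping only fires from first_year onward."""
    filtered = dict(dataset)  # shallow copy of the top-level keys
    filtered["features"] = [
        fire for fire in dataset["features"]
        if fire["attributes"]["Fire_Year"] >= first_year
    ]
    return filtered

# Toy stand-in for wf_dict; the values are made up for illustration.
toy = {
    "note": "toy data",
    "features": [
        {"attributes": {"OBJECTID": 1, "Fire_Year": 1950}},
        {"attributes": {"OBJECTID": 2, "Fire_Year": 1963}},
        {"attributes": {"OBJECTID": 3, "Fire_Year": 2001}},
    ],
}

result = filter_by_year(toy)
print([fire["attributes"]["OBJECTID"] for fire in result["features"]])
```

The original dictionary is left unchanged, since only the copy's `features` key is reassigned.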

In [56]:
filter_wf_dict['features'][0]

{'attributes': {'OBJECTID': 14299,
  'USGS_Assigned_ID': 14299,
  'Assigned_Fire_Type': 'Wildfire',
  'Fire_Year': 1963,
  'Fire_Polygon_Tier': 1,
  'Fire_Attribute_Tiers': '1 (1), 3 (3)',
  'GIS_Acres': 40992.45827111476,
  'GIS_Hectares': 16589.05930244248,
  'Source_Datasets': 'Comb_National_NIFC_Interagency_Fire_Perimeter_History (1), Comb_SubState_MNSRBOPNCA_Wildfires_Historic (1), Comb_SubState_BLM_Idaho_NOC_FPER_Historica_Fire_Polygons (1), Comb_National_BLM_Fire_Perimeters_LADP (1)',
  'Listed_Fire_Types': 'Wildfire (1), Likely Wildfire (3)',
  'Listed_Fire_Names': 'RATTLESNAKE (4)',
  'Listed_Fire_Codes': 'No code provided (4)',
  'Listed_Fire_IDs': '1963-NA-000000 (2)',
  'Listed_Fire_IRWIN_IDs': '',
  'Listed_Fire_Dates': 'Listed Wildfire Discovery Date(s): 1963-08-06 (3) | Listed Wildfire Controlled Date(s): 1963-12-31 (3)',
  'Listed_Fire_Causes': 'Unknown (3)',
  'Listed_Fire_Cause_Class': 'Undetermined (4)',
  'Listed_Rx_Reported_Acres': None,
  'Listed_Map_Digitize_Meth