<a href="https://colab.research.google.com/github/mnocerino23/Wildfire-Forecaster/blob/main/Elevation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this Jupyter notebook, I correct a few minor errors in the dataset and add the final feature: the elevation at which each fire took place.

In [125]:
import pandas as pd
from google.colab import drive
drive.mount('/content/drive')

# Read in the two datasets. The first contains over 110,000 fires from 2001-2015; the second has roughly 1,200 more recent, larger fires.
wildfire_set1 = pd.read_csv('/content/drive/MyDrive/Data_Science_Projects/Wildfires/wildfires1_w_snow.csv')
wildfire_set2 = pd.read_csv('/content/drive/MyDrive/Data_Science_Projects/Wildfires/wildfires2_w_snow.csv')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).




In [126]:
print(wildfire_set1.shape)
print(wildfire_set2.shape)

(114558, 38)
(1197, 38)


Before building our classifiers, I fix a few small issues and add one more feature. Inspecting the data showed that some rows have invalid coordinates (latitude = 0, longitude = 0), so the code below removes them:

In [127]:
# Drop rows whose coordinates are the invalid placeholder (latitude = 0, longitude = 0).
invalid1 = (wildfire_set1['Latitude'] == 0) & (wildfire_set1['Longitude'] == 0)
wildfire_set1.drop(wildfire_set1[invalid1].index, inplace = True)

invalid2 = (wildfire_set2['Latitude'] == 0) & (wildfire_set2['Longitude'] == 0)
wildfire_set2.drop(wildfire_set2[invalid2].index, inplace = True)

# reset_index() returns a new frame, shown here for display only. The result is not
# assigned, so wildfire_set2 itself keeps its original row labels.
wildfire_set2.reset_index()

Unnamed: 0.1,index,Unnamed: 0,Year,Name,AcresBurned,Fire Size Rank,Cause,SOURCE_REPORTING_UNIT_NAME,DaysBurn,Discovery Month,...,PRCP_6M,PRCP_RS,DX90_2M,DP10_2M,Receives Snow,Snow Station,River Basin,Mar_SP,Mar_WC,Mar_Dens
0,0,0,2016,Soberanes Fire,132127.0,G,,,83.0,Jul,...,14.11,21.42,0.0,1.0,0,,,0.0,0.0,0.00
1,1,1,2016,Erskine Fire,48019.0,G,,,18.0,Jun,...,4.68,4.88,15.0,4.0,1,mineral_king,Kaweah,36.0,16.0,0.44
2,2,2,2016,Chimney Fire,46344.0,G,,,24.0,Aug,...,2.52,8.09,43.0,0.0,0,,,0.0,0.0,0.00
3,3,3,2016,Blue Cut Fire,36274.0,G,,,7.0,Aug,...,3.41,6.45,43.0,0.0,0,,,0.0,0.0,0.00
4,4,4,2016,Gap Fire,33867.0,G,,,1.0,Aug,...,18.03,54.17,0.0,2.0,1,parks_creek,Shasta,77.0,34.0,0.44
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1151,1192,1192,2019,Eagle Fire,9.0,B,,,,Oct,...,0.49,12.66,48.0,0.0,0,,,0.0,0.0,0.00
1152,1193,1193,2019,Long Fire,2.0,B,,,,Jun,...,67.97,69.29,0.0,17.0,1,eureka_lake,Feather,110.0,48.0,0.44
1153,1194,1194,2019,Cashe Fire,,B,,,,Nov,...,3.29,21.47,13.0,0.0,0,,,0.0,0.0,0.00
1154,1195,1195,2019,Oak Fire,,B,,,,Oct,...,0.00,0.00,0.0,0.0,0,,,0.0,0.0,0.00


Comparing the shapes of the dataframes before and after shows that this removed 41 rows with invalid coordinates from the second dataset (1197 to 1156) and none from the first.

In [128]:
print(wildfire_set1.shape)
print(wildfire_set2.shape)

(114558, 38)
(1156, 38)


Add one final feature, Elevation. We make a POST request to the Open Elevation API (https://developer.mapquest.com/documentation/open/elevation-api/#:~:text=The%20Open%20Elevation%20API%20provides,by%20the%20lat%2Flng%20collection), which returns the elevation for a given latitude and longitude. Below, we build a dictionary whose key "locations" maps to a list of dictionaries, each holding the latitude and longitude of one fire. This is the format the API requires for POST requests, as described in its GitHub documentation (https://github.com/Jorl17/open-elevation/blob/master/docs/api.md).
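
To make the payload shape concrete, here is a minimal sketch that builds it from a small frame. The helper name `build_payload` and the sample coordinates are made up for illustration; only the `locations`, `latitude`, and `longitude` keys come from the format described above.

```python
import json

import pandas as pd


def build_payload(df, lat_col="Latitude", lon_col="Longitude"):
    """Return {"locations": [{"latitude": ..., "longitude": ...}, ...]} for df.

    Hypothetical helper; the column names default to the ones used in this notebook.
    """
    locations = [
        # float() turns NumPy scalars into plain Python floats for JSON encoding
        {"latitude": float(lat), "longitude": float(lon)}
        for lat, lon in zip(df[lat_col], df[lon_col])
    ]
    return {"locations": locations}


# Illustrative coordinates only, not rows taken from the wildfire datasets
sample = pd.DataFrame({"Latitude": [36.46, 35.62], "Longitude": [-121.90, -118.47]})
payload = build_payload(sample)
payload_json = json.dumps(payload)
print(payload_json)
```
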

In [129]:
import requests
import json

In [130]:
wildfire_set1['Has_Elevation'] = 0
wildfire_set2['Has_Elevation'] = 0

In [131]:
def batch_of_coordinates(df):
  # Use .loc to restrict the rows to those that do not have an elevation yet
  not_visited = df.loc[df['Has_Elevation'] == 0]
  coordinates = []
  for index, row in not_visited.iterrows():
    # Limit each query to 1500 coordinate pairs
    if len(coordinates) < 1500:
      d = {}
      d["latitude"] = df.at[index,"Latitude"]
      d["longitude"] = df.at[index,"Longitude"]
      coordinates.append(d)
      # Mark the row as visited so the next batch skips it
      df.at[index, 'Has_Elevation'] = 1
    else:
      break
  # Return a list of up to 1500 coordinate dictionaries to send to the API in a single POST request
  return coordinates
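
As an alternative sketch, the same batching can be done by slicing the frame by position, which avoids the `Has_Elevation` flag column. The generator name `iter_coordinate_batches` and the tiny demo frame are hypothetical; the batch size of 1500 mirrors the limit used above.

```python
import pandas as pd


def iter_coordinate_batches(df, batch_size=1500):
    """Yield lists of at most batch_size coordinate dictionaries, in row order."""
    for start in range(0, len(df), batch_size):
        chunk = df.iloc[start:start + batch_size]
        yield [
            {"latitude": float(lat), "longitude": float(lon)}
            for lat, lon in zip(chunk["Latitude"], chunk["Longitude"])
        ]


# Demo frame with made-up coordinates and a small batch size
demo = pd.DataFrame({
    "Latitude": [36.1, 36.2, 36.3, 36.4, 36.5],
    "Longitude": [-120.1, -120.2, -120.3, -120.4, -120.5],
})
batch_sizes = [len(batch) for batch in iter_coordinate_batches(demo, batch_size=2)]
print(batch_sizes)
```

Because the batches are produced in row order, appending each batch's elevations to one list keeps them aligned with the rows of the frame.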

In [132]:
elevations = []
final = {}

In [133]:
while (wildfire_set1['Has_Elevation'] == 0).any():
  coord = batch_of_coordinates(wildfire_set1)
  final["locations"] = coord

  # requests serializes the dictionary itself when it is passed via json=
  r = requests.post(url= 'https://api.open-elevation.com/api/v1/lookup', json= final, timeout = 30)
  # Stop on an HTTP error instead of trying to parse an error page as results
  r.raise_for_status()
  for item in r.json()['results']:
    elevations.append(item['elevation'])
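
The loop above assumes every batch returns one result per coordinate. A small sketch of how that assumption could be checked before appending, using a hand-written response instead of a live request. The function name `extract_elevations` and the fake response are illustrative; the `results` and `elevation` keys are the ones read in the loop.

```python
def extract_elevations(response_json, expected_count):
    """Return the elevations from a parsed response, or raise if any are missing."""
    results = response_json.get("results", [])
    if len(results) != expected_count:
        raise ValueError(
            f"expected {expected_count} results, got {len(results)}"
        )
    return [item["elevation"] for item in results]


# Hand-built stand-in for a parsed API response (not real API output)
fake_response = {
    "results": [
        {"latitude": 36.46, "longitude": -121.9, "elevation": 904},
        {"latitude": 35.62, "longitude": -118.47, "elevation": 1892},
    ]
}
batch_elevations = extract_elevations(fake_response, expected_count=2)
print(batch_elevations)
```
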

In [134]:
print(elevations)
print(len(elevations))

[904, 1892, 1053, 2365, 2316, 2507, 2020, 399, 869, 2052, 2000, 2554, 1663, 724, 1990, 1092, 1872, 1526, 1580, 924, 490, 328, 842, 1818, 1850, 379, 2321, 509, 1213, 2655, 824, 2467, 2288, 1003, 98, 388, 803, 2244, 2610, 2570, 2078, 1880, 1219, 1879, 127, 773, 1963, 768, 611, 718, 406, 910, 784, 784, 797, 2166, 786, 783, 2005, 792, 888, 2294, 2733, 811, 2937, 374, 1950, 468, 2279, 2290, 2321, 1736, 2394, 698, 456, 1169, 874, 410, 1012, 1571, 848, 492, 562, 1511, 1176, 401, 661, 1765, 405, 1752, 2599, 2546, 2049, 1573, 1740, 764, 2402, 815, 1897, 2460, 2125, 1007, 2239, 443, 638, 815, 798, 805, 2614, 2278, 2117, 2480, 2361, 2073, 2174, 1667, 2420, 2343, 2196, 2390, 2565, 2788, 1990, 1536, 1993, 2271, 1949, 2176, 2213, 1475, 1343, 76, 2188, 1096, 334, 1063, 2162, 704, 2686, 707, 529, 2123, 409, 431, 343, 1740, 959, 1152, 992, 1724, 1164, 659, 1038, 1180, 1112, 1602, 291, 670, 493, 473, 758, 579, 410, 979, 505, 1194, 1056, 1443, 1136, 707, 623, 697, 924, 1475, 1010, 466, 1355, 1029, 485, 2

In [135]:
elevations2 = []
final2 = {}

In [136]:
while (wildfire_set2['Has_Elevation'] == 0).any():
  coord = batch_of_coordinates(wildfire_set2)
  final2["locations"] = coord

  # requests serializes the dictionary itself when it is passed via json=
  r = requests.post(url= 'https://api.open-elevation.com/api/v1/lookup', json= final2, timeout = 30)
  # Stop on an HTTP error instead of trying to parse an error page as results
  r.raise_for_status()
  for item in r.json()['results']:
    elevations2.append(item['elevation'])

In [137]:
print(elevations2)
print(len(elevations2))

[293, 1033, 320, 1278, 989, 1867, 197, 1475, 701, 1606, 531, 472, 466, 466, 2384, 714, 909, 1607, 424, 244, 606, 1528, 1749, 1449, 321, 1917, 420, 516, 392, 501, 319, 290, 461, 697, 369, 1052, 2343, 117, 47, 65, 536, 384, 403, 333, 1511, 1246, 1438, 90, 195, 291, 504, 1150, 56, 129, 331, 576, 876, 123, 914, 99, 131, 526, 2198, 38, 589, 370, 903, 480, 369, 89, 595, 299, 368, 505, 233, 358, 546, 495, 139, 943, 254, 860, 253, 620, 21, 390, 86, 27, 73, 352, 714, 252, 99, 585, 113, 68, 526, 290, 103, 261, 421, 418, 249, 445, 73, 310, 359, 468, 965, 480, 41, 749, 244, 360, 531, 548, 159, 67, 200, 2268, 1127, 484, 172, 188, 289, 278, 107, 938, 62, 54, 330, 1606, 168, 132, 61, 197, 553, 177, 807, 156, 130, 635, 289, 246, 36, 886, 988, 957, 420, 1557, 2254, 1879, 834, 2336, 821, 305, 305, 1283, 499, 1325, 221, 221, 411, 411, 803, 1585, 201, 201, 623, 291, 177, 529, 491, 2279, 1721, 435, 437, 685, 1257, 1487, 215, 154, 117, 1857, 428, 2249, 1731, 514, 1559, 1489, 249, 390, 1374, 462, 305, 2319, 

Unit conversion: change elevation in meters to elevation in feet by multiplying by 3.2808

In [145]:
elevations_ft = []
elevations2_ft = []

for ele in elevations:
  elevations_ft.append(3.2808*ele)
for ele2 in elevations2:
  elevations2_ft.append(3.2808*ele2)
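
For comparison, a sketch of the same conversion done in one vectorized step with a pandas Series. The names `METERS_TO_FEET`, `elevations_m`, and `elevations_ft_series` are chosen for this example; the three input values are the first three elevations printed above.

```python
import pandas as pd

# Conversion factor used in this notebook (feet per meter)
METERS_TO_FEET = 3.2808

# First three elevations (in meters) from the printed output above
elevations_m = pd.Series([904, 1892, 1053])

# Multiplying a Series by a scalar converts every element at once
elevations_ft_series = elevations_m * METERS_TO_FEET
print(elevations_ft_series.round(1).tolist())
```
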

In [146]:
wildfire_set1['Elevation'] = elevations_ft
wildfire_set2['Elevation'] = elevations2_ft
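
One detail worth keeping in mind here: after rows are dropped, the frame's row labels are no longer contiguous. Assigning a plain list, as above, fills the column by position, while assigning a Series would align on labels instead. The toy frame below is made up to show the difference.

```python
import pandas as pd

# Toy frame whose labels have gaps, as happens after dropping rows
df = pd.DataFrame({"Latitude": [1.0, 2.0, 3.0]}, index=[0, 2, 5])

# A list is assigned by position, so every row receives a value
df["Elevation"] = [10.0, 20.0, 30.0]

# A Series with the default labels 0, 1, 2 is aligned on labels instead:
# label 2 receives the third value and label 5 receives NaN
df["Elevation_from_series"] = pd.Series([10.0, 20.0, 30.0])
print(df)
```
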

In [147]:
wildfire_set1.to_csv('elevations.csv', index = False)
wildfire_set2.to_csv('elevations2.csv', index = False)