# Process WFABBA FIgLib Join

<b>Summary:</b><br>
Reads in parsed WFABBA data from csv files created from 1_process_wfabba_merge_files.ipynb (WFABBA GOES-16 & WFABBA GOES-17 detections), smokeynet_test.json & smokeynet_valid.json (SmokeyNet predictions), and camera_metadata_hpwren.csv (contains locations of camera stations associated with SmokeyNet predictions). Join SmokeyNet detections with camera metadata to associate coordinates with every SmokeyNet prediction. For every camera station, join SmokeyNet predictions with potential WFABBA GOES-16 and WFABBA GOES-17 detections, then output results to csv files.<br>

- Read in parsed WFABBA data (outputted from 1_process_wfabba_merge_files.ipynb), SmokeyNet predictions, and camera metadata.
- Join SmokeyNet predictions with potential WFABBA GOES-16/GOES-17 detections.
- Output results to csv files

<b>Output:</b><br>
../..<br>
└── data<br>
&emsp;&emsp;&emsp;└── processed<br>
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;└── \<CAMERA_STATION_NAME\>_all_hard_voting_35.csv<br>

<b>Areas for Improvement:</b><br>
Need to further look into approaches to join SmokeyNet detections with WFABBA GOES-16/GOES-17 detections. Currently looking at joining SmokeyNet predictions with WFABBA detections by location proximity distances (default of 35 miles), camera direction, and whether the detections happen at the exact minute. May need to look into temporal joins if going with current implementation of joins. 
If considering another join approach instead: Currently considering each image as an independent event. May need to consider groupings of images as an event instead. Consider finding first instance of SmokeyNet, WFABBA GOES-16, WFABBA GOES-17 detections?

In [1]:
import pandas as pd
import urllib.request
import datetime as dt
import requests
from bs4 import BeautifulSoup
from datetime import datetime, timedelta
from haversine import haversine, Unit
from shapely.geometry import Point
import geopandas as gpd
from geopandas import GeoDataFrame
import pytz
from sklearn.metrics import accuracy_score
import numpy as np
pd.set_option('display.max_columns', None)

## 1) Read in WFABBA Data and Consolidate into WFABBA GOES-16/GOES-17

In [2]:
# definte the processed and raw data directories
processed_data_dir = "../../data/processed/wfabba/"
raw_data_dir = "../../data/raw/"

In [3]:
# read in GOES 16 inputs
wfabba_goes_16_2019_df = pd.read_csv(processed_data_dir + "GOES-16-2019.csv")
wfabba_goes_16_2020_df = pd.read_csv(processed_data_dir + "GOES-16-2020.csv")
wfabba_goes_16_jan_2021_df = pd.read_csv(processed_data_dir + "GOES-16-Jan-2021.csv")
wfabba_goes_16_2021_df = pd.read_csv(processed_data_dir + "GOES-16-2021.csv")
wfabba_goes_16_2022_df = pd.read_csv(processed_data_dir + "GOES-16-2022.csv")

In [4]:
# get rid of unnecessary columns including ones which contain the same values or all NaN
wfabba_goes_16_2019_df = wfabba_goes_16_2019_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"])
wfabba_goes_16_2020_df = wfabba_goes_16_2020_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"])
wfabba_goes_16_jan_2021_df = wfabba_goes_16_jan_2021_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"])
wfabba_goes_16_2021_df = wfabba_goes_16_2021_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"]) #2021 detections
wfabba_goes_16_2022_df = wfabba_goes_16_2022_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"]) #2022 detections

In [5]:
# filter out any January data in wfabba_goes_16_2021_df since it already exists in wfabba_goes_16_jan_2021_df
print(len(wfabba_goes_16_2021_df))
wfabba_goes_16_2021_df = wfabba_goes_16_2021_df[wfabba_goes_16_2021_df["Timestamp"] >= "2021-02-01"]
wfabba_goes_16_2021_df = wfabba_goes_16_2021_df.reset_index()
wfabba_goes_16_2021_df = wfabba_goes_16_2021_df.drop(columns=["index"])
print(len(wfabba_goes_16_2021_df))
wfabba_goes_16_2021_df

93784
92739


Unnamed: 0,Version,Timestamp,Satellite,FlightModel,ScanMode,ProductType,FileName,MissingValueCode,Latitude,Longitude,Code,FRP,Fire Size,Fire Temp,Pixel Size,Obs BT4,Obs BT11,Bkg BT4,Bkg BT11,SolZen,SatZen,RelAzi,Eco
0,6_6_001g,2021-02-01 23:16:11,GOES-16,FM?,GOES-16,GOES-16,f2021032231611.rev.6_6_001g.FDCC.GOES-16.txt,-9999,34.5628,-119.9396,15,30.3,-9999.0,-9999.0,10018348,294.403,269.097,288.986,286.090,67.554,62.232,-9999,26
1,6_6_001g,2021-02-01 23:21:11,GOES-16,FM?,GOES-16,GOES-16,f2021032232111.rev.6_6_001g.FDCC.GOES-16.txt,-9999,34.8464,-120.0688,15,22.7,-9999.0,-9999.0,10113770,292.327,272.043,287.418,281.058,68.429,62.486,-9999,91
2,6_6_001g,2021-02-01 23:40:14,GOES-16,FM?,GOES-16,GOES-16,f2021032234014.rev.6_6_001g.FDCF.GOES-16.txt,-9999,34.8464,-120.0688,15,32.7,-9999.0,-9999.0,10113154,292.929,271.184,286.603,282.059,71.431,62.486,-9999,91
3,6_6_001g,2021-02-01 23:41:11,GOES-16,FM?,GOES-16,GOES-16,f2021032234111.rev.6_6_001g.FDCC.GOES-16.txt,-9999,34.8464,-120.0688,15,31.3,-9999.0,-9999.0,10113770,292.657,272.489,286.602,282.154,71.593,62.486,-9999,91
4,6_6_001g,2021-02-01 23:41:11,GOES-16,FM?,GOES-16,GOES-16,f2021032234111.rev.6_6_001g.FDCC.GOES-16.txt,-9999,34.5600,-119.9001,15,25.7,-9999.0,-9999.0,10005460,292.216,273.886,287.452,282.747,71.528,62.199,-9999,91
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
92734,6_6_001g,2021-12-30 21:36:17,GOES-16,FM?,GOES-16,GOES-16,f2021364213617.rev.6_6_001g.FDCC.GOES-16.txt,-9999,32.2080,-114.8324,12,103.3,-9999.0,-9999.0,8377921,308.244,285.893,293.029,284.347,62.096,56.890,-9999,51
92735,6_6_001g,2021-12-30 21:40:20,GOES-16,FM?,GOES-16,GOES-16,f2021364214020.rev.6_6_001g.FDCF.GOES-16.txt,-9999,33.0938,-115.5449,12,84.0,-9999.0,-9999.0,8665165,307.792,287.315,293.792,288.115,62.984,57.974,-9999,37
92736,6_6_001g,2021-12-30 21:40:20,GOES-16,FM?,GOES-16,GOES-16,f2021364214020.rev.6_6_001g.FDCF.GOES-16.txt,-9999,32.2080,-114.8324,12,108.3,-9999.0,-9999.0,8377922,308.531,284.880,292.491,284.153,62.526,56.890,-9999,51
92737,6_6_001g,2021-12-30 21:41:17,GOES-16,FM?,GOES-16,GOES-16,f2021364214117.rev.6_6_001g.FDCC.GOES-16.txt,-9999,33.0938,-115.5449,10,104.0,2236.0,923.0,8665164,310.078,287.636,293.749,288.087,63.089,57.974,-9999,37


In [9]:
# join all GOES-16 dataframes into unified wfabba_goes_16_df
wfabba_goes_16_df = pd.concat([wfabba_goes_16_2019_df, wfabba_goes_16_2020_df, wfabba_goes_16_jan_2021_df, wfabba_goes_16_2021_df, wfabba_goes_16_2022_df])
wfabba_goes_16_df["timestamp_converted"] = pd.to_datetime(wfabba_goes_16_df["Timestamp"], infer_datetime_format=True, origin="unix", utc=True)
wfabba_goes_16_df = wfabba_goes_16_df.reset_index()
wfabba_goes_16_df = wfabba_goes_16_df.drop(columns=["index"])
wfabba_goes_16_df

Unnamed: 0,Version,Timestamp,Satellite,FlightModel,ScanMode,ProductType,FileName,MissingValueCode,Latitude,Longitude,Code,FRP,Fire Size,Fire Temp,Pixel Size,Obs BT4,Obs BT11,Bkg BT4,Bkg BT11,SolZen,SatZen,RelAzi,Eco,timestamp_converted
0,6_5_012g,2019-06-01 00:06:41,GOES-16,FM1,C,FDCC,f2019152000641.rev.6_5_012g.FDCC.GOES-16,-9999,34.8046,-119.1102,15,58.2,-9999.0,-9999.0,12645720,299.160,276.622,291.865,290.292,56.120,61.721,-9999,22,2019-06-01 00:06:41+00:00
1,6_5_012g,2019-06-01 00:06:41,GOES-16,FM1,C,FDCC,f2019152000641.rev.6_5_012g.FDCC.GOES-16,-9999,33.8161,-116.9279,12,58.0,-9999.0,-9999.0,11221792,305.075,285.306,298.966,293.278,58.010,59.470,-9999,46,2019-06-01 00:06:41+00:00
2,6_5_012g,2019-06-01 00:10:44,GOES-16,FM1,F,FDCF,f2019152001044.rev.6_5_012g.FDCF.GOES-16,-9999,34.8046,-119.1102,15,56.4,-9999.0,-9999.0,12645721,297.737,275.300,290.263,288.759,56.938,61.721,-9999,22,2019-06-01 00:10:44+00:00
3,6_5_012g,2019-06-01 00:11:41,GOES-16,FM1,C,FDCC,f2019152001141.rev.6_5_012g.FDCC.GOES-16,-9999,34.8046,-119.1102,15,57.2,-9999.0,-9999.0,12645720,297.646,275.515,290.003,288.553,57.143,61.721,-9999,22,2019-06-01 00:11:41+00:00
4,6_5_012g,2019-06-01 00:16:41,GOES-16,FM1,C,FDCC,f2019152001641.rev.6_5_012g.FDCC.GOES-16,-9999,34.6392,-113.5781,15,35.1,-9999.0,-9999.0,10160281,308.435,297.187,305.010,302.197,62.694,57.443,-9999,51,2019-06-01 00:16:41+00:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
314293,6_6_001g,2022-06-27 01:21:17,GOES-16,FM?,GOES-16,GOES-16,f2022178012117.rev.6_6_001g.FDCC.GOES-16.txt,-9999,34.7708,-116.3968,15,7.2,-9999.0,-9999.0,9141029,310.676,291.569,310.559,305.116,71.709,59.628,-9999,51,2022-06-27 01:21:17+00:00
314294,6_6_001g,2022-06-27 01:26:17,GOES-16,FM?,GOES-16,GOES-16,f2022178012617.rev.6_6_001g.FDCC.GOES-16.txt,-9999,32.7308,-113.9479,15,20.1,-9999.0,-9999.0,8284628,308.972,291.042,307.019,299.238,75.255,56.520,-9999,37,2022-06-27 01:26:17+00:00
314295,6_6_001g,2022-06-27 01:30:20,GOES-16,FM?,GOES-16,GOES-16,f2022178013020.rev.6_6_001g.FDCF.GOES-16.txt,-9999,32.7247,-114.6485,15,27.2,-9999.0,-9999.0,8420192,310.795,298.449,307.288,302.021,75.495,57.057,-9999,1,2022-06-27 01:30:20+00:00
314296,6_6_001g,2022-06-27 01:46:17,GOES-16,FM?,GOES-16,GOES-16,f2022178014617.rev.6_6_001g.FDCC.GOES-16.txt,-9999,32.7092,-113.5860,15,42.5,-9999.0,-9999.0,8211141,309.159,289.289,304.087,296.556,79.503,56.228,-9999,51,2022-06-27 01:46:17+00:00


In [10]:
#read in GOES 17 inputs
wfabba_goes_17_2019_df = pd.read_csv(processed_data_dir + "GOES-17-2019.csv")
wfabba_goes_17_2020_df = pd.read_csv(processed_data_dir + "GOES-17-2020.csv")
wfabba_goes_17_jan_2021_df = pd.read_csv(processed_data_dir + "GOES-17-Jan-2021.csv")
wfabba_goes_17_2021_df = pd.read_csv(processed_data_dir + "GOES-17-2021.csv")
wfabba_goes_17_2022_df = pd.read_csv(processed_data_dir + "GOES-17-2022.csv")

In [11]:
#get rid of unnecessary columns including ones which contain the same values or all NaN
wfabba_goes_17_2019_df = wfabba_goes_17_2019_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"])
wfabba_goes_17_2020_df = wfabba_goes_17_2020_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"])
wfabba_goes_17_jan_2021_df = wfabba_goes_17_jan_2021_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"])
wfabba_goes_17_2021_df = wfabba_goes_17_2021_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"]) #2021 detections
wfabba_goes_17_2022_df = wfabba_goes_17_2022_df.drop(columns = ["Unnamed: 0", "Algorithm","Instrument","DataSource", "DataCreationTimestamp", "NavProjSubPtLong", "ActualSatSubPtLong", "NumFire", "Line", "Element"]) #2022 detections

In [12]:
# filter out any January data in wfabba_goes_17_2021_df since it already exists in wfabba_goes_17_jan_2021_df
print(len(wfabba_goes_17_2021_df))
wfabba_goes_17_2021_df = wfabba_goes_17_2021_df[wfabba_goes_17_2021_df["Timestamp"] >= "2021-02-01"]
wfabba_goes_17_2021_df = wfabba_goes_17_2021_df.reset_index()
wfabba_goes_17_2021_df = wfabba_goes_17_2021_df.drop(columns=["index"])
print(len(wfabba_goes_17_2021_df))
wfabba_goes_17_2021_df

376689
334881


Unnamed: 0,Version,Timestamp,Satellite,FlightModel,ScanMode,ProductType,FileName,MissingValueCode,Latitude,Longitude,Code,FRP,Fire Size,Fire Temp,Pixel Size,Obs BT4,Obs BT11,Bkg BT4,Bkg BT11,SolZen,SatZen,RelAzi,Eco
0,6_6_001g,2021-02-01 20:24:25,GOES-17,FM?,GOES-17,GOES-17,f2021032202425.rev.6_6_001g.FDCM1.GOES-17.txt,-9999,34.9345,-119.7105,15.0,19.7,-9999.0,-9999.0,6156362,303.374,280.572,299.357,290.776,52.198,44.873,-9999,51
1,6_6_001g,2021-02-01 20:25:25,GOES-17,FM?,GOES-17,GOES-17,f2021032202525.rev.6_6_001g.FDCM1.GOES-17.txt,-9999,34.9345,-119.7105,15.0,10.7,-9999.0,-9999.0,6156362,303.526,279.576,301.773,291.597,52.212,44.873,-9999,51
2,6_6_001g,2021-02-01 20:26:17,GOES-17,FM?,GOES-17,GOES-17,f2021032202617.rev.6_6_001g.FDCC.GOES-17.txt,-9999,34.9345,-119.7105,15.0,19.8,-9999.0,-9999.0,6156361,303.602,277.521,299.613,291.022,52.227,44.873,-9999,51
3,6_6_001g,2021-02-01 20:26:25,GOES-17,FM?,GOES-17,GOES-17,f2021032202625.rev.6_6_001g.FDCM1.GOES-17.txt,-9999,34.9345,-119.7105,15.0,11.5,-9999.0,-9999.0,6156362,303.639,278.675,301.692,291.601,52.227,44.873,-9999,51
4,6_6_001g,2021-02-02 18:22:25,GOES-17,FM?,GOES-17,GOES-17,f2021033182225.rev.6_6_001g.FDCM1.GOES-17.txt,-9999,34.3942,-116.7416,15.0,11.7,-9999.0,-9999.0,6270828,299.477,274.375,297.268,287.651,56.353,45.776,-9999,51
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
334876,6_6_001g,2021-12-31 21:21:25,GOES-17,FM?,GOES-17,GOES-17,f2021365212125.rev.6_6_001g.FDCM1.GOES-17.txt,-9999,34.8448,-120.2181,15.0,18.2,-9999.0,-9999.0,6117177,296.506,280.503,291.394,285.431,61.135,44.560,-9999,91
334877,6_6_001g,2021-12-31 21:22:25,GOES-17,FM?,GOES-17,GOES-17,f2021365212225.rev.6_6_001g.FDCM1.GOES-17.txt,-9999,34.8448,-120.2181,15.0,17.5,-9999.0,-9999.0,6117177,296.313,280.264,291.413,285.475,61.207,44.560,-9999,91
334878,6_6_001g,2021-12-31 21:26:17,GOES-17,FM?,GOES-17,GOES-17,f2021365212617.rev.6_6_001g.FDCC.GOES-17.txt,-9999,34.5955,-120.3267,15.0,23.3,-9999.0,-9999.0,6080542,297.220,276.814,290.461,285.016,61.242,44.264,-9999,91
334879,6_6_001g,2021-12-31 21:27:25,GOES-17,FM?,GOES-17,GOES-17,f2021365212725.rev.6_6_001g.FDCM1.GOES-17.txt,-9999,34.5955,-120.3267,15.0,23.5,-9999.0,-9999.0,6078608,297.220,277.062,290.378,285.005,61.318,44.264,-9999,91


In [13]:
# join all GOES-17 dataframes into unified wfabba_goes_17_df
wfabba_goes_17_df = pd.concat([wfabba_goes_17_2019_df, wfabba_goes_17_2020_df, wfabba_goes_17_jan_2021_df, wfabba_goes_17_2021_df, wfabba_goes_17_2022_df])
wfabba_goes_17_df["timestamp_converted"] = pd.to_datetime(wfabba_goes_17_df["Timestamp"], infer_datetime_format=True, origin="unix", utc=True)
wfabba_goes_17_df = wfabba_goes_17_df.reset_index()
wfabba_goes_17_df = wfabba_goes_17_df.drop(columns=["index"])
wfabba_goes_17_df

Unnamed: 0,Version,Timestamp,Satellite,FlightModel,ScanMode,ProductType,FileName,MissingValueCode,Latitude,Longitude,Code,FRP,Fire Size,Fire Temp,Pixel Size,Obs BT4,Obs BT11,Bkg BT4,Bkg BT11,SolZen,SatZen,RelAzi,Eco,timestamp_converted
0,6_5_012g,2019-06-01 00:01:01,GOES-16,FM2,M2,FDCM2,f2019152000101.rev.6_5_012g.FDCM2.GOES-17,-9999,34.6913,-113.4925,15.0,2.2,-9999.0,-9999.0,6932942,311.355,299.558,311.023,307.003,59.600,47.798,-9999,51,2019-06-01 00:01:01+00:00
1,6_5_012g,2019-06-01 00:10:33,GOES-16,FM2,F,FDCF,f2019152001033.rev.6_5_012g.FDCF.GOES-17,-9999,34.7402,-113.5010,15.0,13.8,-9999.0,-9999.0,6939613,310.251,298.833,308.758,305.011,61.522,47.837,-9999,51,2019-06-01 00:10:33+00:00
2,6_5_012g,2019-06-01 00:11:19,GOES-16,FM2,C,FDCC,f2019152001119.rev.6_5_012g.FDCC.GOES-17,-9999,34.7403,-113.5010,15.0,13.1,-9999.0,-9999.0,6939925,310.069,297.604,308.684,304.963,61.725,47.837,-9999,51,2019-06-01 00:11:19+00:00
3,6_5_012g,2019-06-01 00:11:57,GOES-16,FM2,M2,FDCM2,f2019152001157.rev.6_5_012g.FDCM2.GOES-17,-9999,34.7403,-113.5010,15.0,12.7,-9999.0,-9999.0,6939730,310.038,297.722,308.726,305.003,61.827,47.837,-9999,51,2019-06-01 00:11:57+00:00
4,6_5_012g,2019-06-01 00:12:57,GOES-16,FM2,M2,FDCM2,f2019152001257.rev.6_5_012g.FDCM2.GOES-17,-9999,34.7403,-113.5010,15.0,10.7,-9999.0,-9999.0,6939730,309.669,296.126,308.658,304.858,62.030,47.837,-9999,51,2019-06-01 00:12:57+00:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1665531,6_6_001g,2022-06-09 15:35:25,GOES-17,FM?,GOES-17,GOES-17,f2022160153525.rev.6_6_001g.FDCM1.GOES-17.txt,-9999,32.9253,-113.3350,15.0,9.3,-9999.0,-9999.0,6343660,313.618,294.387,312.149,302.610,52.576,46.333,-9999,51,2022-06-09 15:35:25+00:00
1665532,6_6_001g,2022-06-09 15:36:17,GOES-17,FM?,GOES-17,GOES-17,f2022160153617.rev.6_6_001g.FDCC.GOES-17.txt,-9999,32.9495,-113.3269,15.0,10.8,-9999.0,-9999.0,6347058,313.618,293.873,312.369,301.972,52.358,46.359,-9999,51,2022-06-09 15:36:17+00:00
1665533,6_6_001g,2022-06-09 15:39:25,GOES-17,FM?,GOES-17,GOES-17,f2022160153925.rev.6_6_001g.FDCM1.GOES-17.txt,-9999,32.9485,-113.3524,15.0,12.8,-9999.0,-9999.0,6346612,313.781,295.380,312.286,302.549,51.752,46.343,-9999,51,2022-06-09 15:39:25+00:00
1665534,6_6_001g,2022-06-09 15:50:32,GOES-17,FM?,GOES-17,GOES-17,f2022160155032.rev.6_6_001g.FDCF.GOES-17.txt,-9999,32.8758,-113.3766,15.0,4.8,-9999.0,-9999.0,6335160,313.645,289.171,313.620,303.794,49.474,46.265,-9999,51,2022-06-09 15:50:32+00:00


In [15]:
# filter out any low probability fire detections in both GOES-16 and GOES-17 WFABBA data
print(wfabba_goes_16_df.shape)
print(wfabba_goes_17_df.shape)

wfabba_goes_16_df = wfabba_goes_16_df[(wfabba_goes_16_df["Code"] != 15) & (wfabba_goes_16_df["Code"] != 35)]
wfabba_goes_17_df = wfabba_goes_17_df[(wfabba_goes_17_df["Code"] != 15) & (wfabba_goes_17_df["Code"] != 35)]

print(wfabba_goes_16_df.shape)
print(wfabba_goes_17_df.shape)

(314298, 24)
(1665536, 24)
(193005, 24)
(902206, 24)


### Convert the coordinates of WFABBA GOES-17 and GOES-16 from EPSG 4326 to EPSG 3310 to allow for distance calculations down to the meter

In [16]:
#convert WFABBA GOES 16 coordinates from EPSG 4326 to EPSG 3310
coords = [Point(xy) for xy in zip(wfabba_goes_16_df['Longitude'], wfabba_goes_16_df['Latitude'])]
wfabba_goes_16_df = GeoDataFrame(wfabba_goes_16_df, crs = "EPSG:4326", geometry = coords) 
wfabba_goes_16_df = wfabba_goes_16_df.to_crs('EPSG:3310')
wfabba_goes_16_df[["Latitude","Longitude","geometry"]]

Unnamed: 0,Latitude,Longitude,geometry
1,33.8161,-116.9279,POINT (284426.435 -461954.924)
7,34.3607,-117.3335,POINT (245186.209 -402701.328)
10,34.3607,-117.3335,POINT (245186.209 -402701.328)
21,34.6340,-119.8882,POINT (10245.661 -375812.677)
22,34.6340,-119.8882,POINT (10245.661 -375812.677)
...,...,...,...
314259,34.9267,-118.3408,POINT (151477.338 -342000.817)
314263,34.9267,-118.3408,POINT (151477.338 -342000.817)
314266,34.9267,-118.3408,POINT (151477.338 -342000.817)
314267,34.9267,-118.3408,POINT (151477.338 -342000.817)


In [17]:
#convert WFABBA GOES 17 coordinates from EPSG 4326 to EPSG 3310
coords = [Point(xy) for xy in zip(wfabba_goes_17_df['Longitude'], wfabba_goes_17_df['Latitude'])]
wfabba_goes_17_df = GeoDataFrame(wfabba_goes_17_df, crs = "EPSG:4326", geometry = coords) 
wfabba_goes_17_df = wfabba_goes_17_df.to_crs('EPSG:3310')
wfabba_goes_17_df[["Latitude","Longitude","geometry"]]

Unnamed: 0,Latitude,Longitude,geometry
11,32.6012,-114.9570,POINT (473918.908 -588584.507)
13,32.6012,-114.9570,POINT (473918.908 -588584.507)
14,32.6012,-114.9570,POINT (473918.908 -588584.507)
15,32.6012,-114.9570,POINT (473918.908 -588584.507)
18,32.6012,-114.9570,POINT (473918.908 -588584.507)
...,...,...,...
1665507,32.3641,-114.9550,POINT (475502.723 -614784.940)
1665508,32.3402,-114.9622,POINT (474965.240 -617462.561)
1665509,32.3641,-114.9550,POINT (475502.723 -614784.940)
1665510,32.3641,-114.9550,POINT (475502.723 -614784.940)


## 2) Camera Metadata Processing

In [18]:
# read in camera metadata
camera_metadata_df = pd.read_csv("../../data/processed/camera_metadata_hpwren.csv")
camera_metadata_df

Unnamed: 0,camera_id,direction,camera_name,camera_abbrev,image_id,prev_image_ids,long,lat,elevation,geometry.type,geometry.coordinates,x_resolution,y_resolution,center_lat,center_long,center_angle,properties.description.url,intersections
0,hpwren0_unknown direction,unknown direction,,tje,tje-1-mobo-c,[''],-117.120000,32.550000,10.0,Point,"[-117.12, 32.55, 10]",,,,,,http://hpwren.ucsd.edu/cameras/TJE.html,
1,hpwren1_north,north,Big Black Mountain,bm,bm-n-mobo-c,[''],-116.808092,33.159927,4055.0,Point,"[-116.8081, 33.1599, 4055]",3072.0,2048.0,33.181599,-116.807554,-0.024816,http://hpwren.ucsd.edu/cameras/BBlackMtn.html,"[('bl', 'north'), ('bl', 'east'), ('bh', 'nort..."
2,hpwren1_east,east,Big Black Mountain,bm,bm-e-mobo-c,[''],-116.808092,33.159927,4055.0,Point,"[-116.8081, 33.1599, 4055]",3072.0,2048.0,33.158781,-116.790230,-0.064085,http://hpwren.ucsd.edu/cameras/BBlackMtn.html,"[('bh', 'east'), ('bh', 'south'), ('cp', 'nort..."
3,hpwren1_south,south,Big Black Mountain,bm,bm-s-mobo-c,[''],-116.808092,33.159927,4055.0,Point,"[-116.8081, 33.1599, 4055]",3072.0,2048.0,33.157932,-116.807962,0.065022,http://hpwren.ucsd.edu/cameras/BBlackMtn.html,"[('bl', 'east'), ('bl', 'south'), ('bh', 'sout..."
4,hpwren1_west,west,Big Black Mountain,bm,bm-w-mobo-c,[''],-116.808092,33.159927,4055.0,Point,"[-116.8081, 33.1599, 4055]",3072.0,2048.0,33.159091,-116.858706,0.016519,http://hpwren.ucsd.edu/cameras/BBlackMtn.html,"[('bl', 'north'), ('bl', 'east'), ('bl', 'sout..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
133,hpwren_missing7_south,south,SMER TCS9 HPWREN,smer,smer-tcs9-s-mobo-c,[''],-117.530000,33.710000,5669.0,,,,,,,,,"[('chino', 'east'), ('chino', 'south'), ('rm',..."
134,hpwren_missing7_west,west,SMER TCS9 HPWREN,smer,smer-tcs9-w-mobo-c,['smer-tcs9-mobo-c'],-117.530000,33.710000,5669.0,,,,,,,,,"[('chino', 'north'), ('chino', 'east'), ('chin..."
135,hpwren_missing8_west,east,SMER TCS3 HPWREN,smer,smer-tcs3-mobo-c,['smer-tcs3-mobo-c'],-117.180000,33.450000,1400.0,,,,,,,,,"[('bm', 'north'), ('bm', 'west'), ('bl', 'nort..."
136,hpwren_missing9_west,west,SMER TCS8 HPWREN,smer,smer-tcs8-mobo-c,['smer-tcs8-mobo-c'],-117.150000,33.460000,2063.0,,,,,,,,,"[('bl', 'north'), ('bl', 'west'), ('bh', 'west..."


In [19]:
# read in camera metadata
camera_metadata_df = pd.read_csv("../../data/processed/camera_image_id_mappings.csv")
camera_metadata_df

Unnamed: 0,camera_id,image_id,lat,long,direction
0,hpwren0_unknown direction,tje-1-mobo-c,32.550000,-117.120000,unknown direction
1,hpwren1_north,bm-n-mobo-c,33.159927,-116.808092,north
2,hpwren1_east,bm-e-mobo-c,33.159927,-116.808092,east
3,hpwren1_south,bm-s-mobo-c,33.159927,-116.808092,south
4,hpwren1_west,bm-w-mobo-c,33.159927,-116.808092,west
...,...,...,...,...,...
166,hpwren_missing5_north,sp-n-mobo-c,33.711172,-117.534115,north
167,hpwren_missing5_east,sp-e-mobo-c,33.711172,-117.534115,east
168,hpwren_missing5_south,sp-s-mobo-c,33.711172,-117.534115,south
169,hpwren_missing5_west,sp-w-mobo-c,33.711172,-117.534115,west


## 3) Matching WFABBA to SmokeyNet 

In [20]:
#Create dataframe for every minute of specified time period
times = []
start = datetime(2019, 6 , 1, 0, 0, 0, 0, pytz.UTC)
end = datetime(2021, 7, 11, 23, 59, 0, 0, pytz.UTC)

while start <= end:
    times.append(start)
    start = start + timedelta(minutes = 1)

minutes_df = pd.DataFrame(times, columns = ["timestamp"])
minutes_df

Unnamed: 0,timestamp
0,2019-06-01 00:00:00+00:00
1,2019-06-01 00:01:00+00:00
2,2019-06-01 00:02:00+00:00
3,2019-06-01 00:03:00+00:00
4,2019-06-01 00:04:00+00:00
...,...
1111675,2021-07-11 23:55:00+00:00
1111676,2021-07-11 23:56:00+00:00
1111677,2021-07-11 23:57:00+00:00
1111678,2021-07-11 23:58:00+00:00


In [21]:
# Create testing SmokeyNet df
df_test = pd.read_json(raw_data_dir + "smokeynet_test.json", orient="index").reset_index().rename(columns={"index":"filepath"})
df_test["type"] = "test"
df_test 

Unnamed: 0,filepath,camera_name,image_gt,tile_gt,image_pred,tile_pred,type
0,20191001_FIRE_lp-s-mobo-c/1569950465_-02281,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test
1,20191001_FIRE_lp-s-mobo-c/1569950525_-02221,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test
2,20191001_FIRE_lp-s-mobo-c/1569950585_-02161,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test
3,20191001_FIRE_lp-s-mobo-c/1569950645_-02101,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test
4,20191001_FIRE_lp-s-mobo-c/1569950705_-02041,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ...",test
...,...,...,...,...,...,...,...
4880,20161113_FIRE_bm-n-mobo-c/1479069033_+02100,20161113_FIRE_bm-n-mobo-c,1,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, ...",test
4881,20161113_FIRE_bm-n-mobo-c/1479069093_+02160,20161113_FIRE_bm-n-mobo-c,1,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, ...",test
4882,20161113_FIRE_bm-n-mobo-c/1479069153_+02220,20161113_FIRE_bm-n-mobo-c,1,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, ...",test
4883,20161113_FIRE_bm-n-mobo-c/1479069213_+02280,20161113_FIRE_bm-n-mobo-c,1,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, ...",test


In [22]:
#Create validating SmokeyNet df
df_valid = pd.read_json(raw_data_dir + "smokeynet_valid.json", orient="index").reset_index().rename(columns={"index":"filepath"})
df_valid["type"] = "valid"
df_valid

Unnamed: 0,filepath,camera_name,image_gt,tile_gt,image_pred,tile_pred,type
0,20200807_AppleFire-backfire-operation_hp-n-mob...,20200807_AppleFire-backfire-operation_hp-n-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
1,20200807_AppleFire-backfire-operation_hp-n-mob...,20200807_AppleFire-backfire-operation_hp-n-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
2,20200807_AppleFire-backfire-operation_hp-n-mob...,20200807_AppleFire-backfire-operation_hp-n-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
3,20200807_AppleFire-backfire-operation_hp-n-mob...,20200807_AppleFire-backfire-operation_hp-n-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
4,20200807_AppleFire-backfire-operation_hp-n-mob...,20200807_AppleFire-backfire-operation_hp-n-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
...,...,...,...,...,...,...,...
4906,20200813_Ranch2Fire_marconi-n-mobo-c/159736004...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
4907,20200813_Ranch2Fire_marconi-n-mobo-c/159736009...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
4908,20200813_Ranch2Fire_marconi-n-mobo-c/159736016...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
4909,20200813_Ranch2Fire_marconi-n-mobo-c/159736021...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid


In [23]:
#Join the SmokeyNet DFs together. For now just joining validation and test DFs
df_labels = pd.concat([df_test, df_valid]).reset_index().drop(columns = ["index"])
df_labels

Unnamed: 0,filepath,camera_name,image_gt,tile_gt,image_pred,tile_pred,type
0,20191001_FIRE_lp-s-mobo-c/1569950465_-02281,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test
1,20191001_FIRE_lp-s-mobo-c/1569950525_-02221,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test
2,20191001_FIRE_lp-s-mobo-c/1569950585_-02161,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test
3,20191001_FIRE_lp-s-mobo-c/1569950645_-02101,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test
4,20191001_FIRE_lp-s-mobo-c/1569950705_-02041,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ...",test
...,...,...,...,...,...,...,...
9791,20200813_Ranch2Fire_marconi-n-mobo-c/159736004...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
9792,20200813_Ranch2Fire_marconi-n-mobo-c/159736009...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
9793,20200813_Ranch2Fire_marconi-n-mobo-c/159736016...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid
9794,20200813_Ranch2Fire_marconi-n-mobo-c/159736021...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid


In [24]:
# set the date and year columns
df_labels["date"] = df_labels["camera_name"].str.split("_", n=1, expand=True)[0]
df_labels["year"] = df_labels["date"].str[:4]
df_labels["date"] = pd.to_datetime(df_labels["date"])
df_labels

Unnamed: 0,filepath,camera_name,image_gt,tile_gt,image_pred,tile_pred,type,date,year
0,20191001_FIRE_lp-s-mobo-c/1569950465_-02281,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019
1,20191001_FIRE_lp-s-mobo-c/1569950525_-02221,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019
2,20191001_FIRE_lp-s-mobo-c/1569950585_-02161,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019
3,20191001_FIRE_lp-s-mobo-c/1569950645_-02101,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019
4,20191001_FIRE_lp-s-mobo-c/1569950705_-02041,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ...",test,2019-10-01,2019
...,...,...,...,...,...,...,...,...,...
9791,20200813_Ranch2Fire_marconi-n-mobo-c/159736004...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020
9792,20200813_Ranch2Fire_marconi-n-mobo-c/159736009...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020
9793,20200813_Ranch2Fire_marconi-n-mobo-c/159736016...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020
9794,20200813_Ranch2Fire_marconi-n-mobo-c/159736021...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020


In [25]:
# keeping only entries from 2019-06-01 onwards
df_labels_filtered = df_labels[df_labels["date"] >= "2019-06-01"].reset_index().drop(columns=["index"])
df_labels_filtered

Unnamed: 0,filepath,camera_name,image_gt,tile_gt,image_pred,tile_pred,type,date,year
0,20191001_FIRE_lp-s-mobo-c/1569950465_-02281,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019
1,20191001_FIRE_lp-s-mobo-c/1569950525_-02221,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019
2,20191001_FIRE_lp-s-mobo-c/1569950585_-02161,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019
3,20191001_FIRE_lp-s-mobo-c/1569950645_-02101,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019
4,20191001_FIRE_lp-s-mobo-c/1569950705_-02041,20191001_FIRE_lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ...",test,2019-10-01,2019
...,...,...,...,...,...,...,...,...,...
8820,20200813_Ranch2Fire_marconi-n-mobo-c/159736004...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020
8821,20200813_Ranch2Fire_marconi-n-mobo-c/159736009...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020
8822,20200813_Ranch2Fire_marconi-n-mobo-c/159736016...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020
8823,20200813_Ranch2Fire_marconi-n-mobo-c/159736021...,20200813_Ranch2Fire_marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020


In [26]:
#create time, datetime, event_name, camera_name attributes
df_labels_filtered["time"] = df_labels_filtered["filepath"].str.split("/").str[1]
df_labels_filtered["time"] = df_labels_filtered["time"].str.split("_").str[0]
df_labels_filtered["datetime"] = pd.to_datetime(df_labels_filtered["time"], unit="s", origin="unix", utc=True)
df_labels_filtered["event_name"] = df_labels_filtered["filepath"].str.split("/").str[0]
df_labels_filtered["camera_name"] = df_labels_filtered["camera_name"].str.split("_").str[-1]
df_labels_filtered

Unnamed: 0,filepath,camera_name,image_gt,tile_gt,image_pred,tile_pred,type,date,year,time,datetime,event_name
0,20191001_FIRE_lp-s-mobo-c/1569950465_-02281,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950465,2019-10-01 17:21:05+00:00,20191001_FIRE_lp-s-mobo-c
1,20191001_FIRE_lp-s-mobo-c/1569950525_-02221,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950525,2019-10-01 17:22:05+00:00,20191001_FIRE_lp-s-mobo-c
2,20191001_FIRE_lp-s-mobo-c/1569950585_-02161,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950585,2019-10-01 17:23:05+00:00,20191001_FIRE_lp-s-mobo-c
3,20191001_FIRE_lp-s-mobo-c/1569950645_-02101,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950645,2019-10-01 17:24:05+00:00,20191001_FIRE_lp-s-mobo-c
4,20191001_FIRE_lp-s-mobo-c/1569950705_-02041,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ...",test,2019-10-01,2019,1569950705,2019-10-01 17:25:05+00:00,20191001_FIRE_lp-s-mobo-c
...,...,...,...,...,...,...,...,...,...,...,...,...
8820,20200813_Ranch2Fire_marconi-n-mobo-c/159736004...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360042,2020-08-13 23:07:22+00:00,20200813_Ranch2Fire_marconi-n-mobo-c
8821,20200813_Ranch2Fire_marconi-n-mobo-c/159736009...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360098,2020-08-13 23:08:18+00:00,20200813_Ranch2Fire_marconi-n-mobo-c
8822,20200813_Ranch2Fire_marconi-n-mobo-c/159736016...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360162,2020-08-13 23:09:22+00:00,20200813_Ranch2Fire_marconi-n-mobo-c
8823,20200813_Ranch2Fire_marconi-n-mobo-c/159736021...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360219,2020-08-13 23:10:19+00:00,20200813_Ranch2Fire_marconi-n-mobo-c


In [27]:
# join SmokeyNet data with camera metadata
df_labels_filtered = df_labels_filtered.merge(camera_metadata_df, left_on="camera_name", right_on="image_id", how="left")
df_labels_filtered

Unnamed: 0,filepath,camera_name,image_gt,tile_gt,image_pred,tile_pred,type,date,year,time,datetime,event_name,camera_id,image_id,lat,long,direction
0,20191001_FIRE_lp-s-mobo-c/1569950465_-02281,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950465,2019-10-01 17:21:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south
1,20191001_FIRE_lp-s-mobo-c/1569950525_-02221,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950525,2019-10-01 17:22:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south
2,20191001_FIRE_lp-s-mobo-c/1569950585_-02161,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950585,2019-10-01 17:23:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south
3,20191001_FIRE_lp-s-mobo-c/1569950645_-02101,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950645,2019-10-01 17:24:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south
4,20191001_FIRE_lp-s-mobo-c/1569950705_-02041,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ...",test,2019-10-01,2019,1569950705,2019-10-01 17:25:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8978,20200813_Ranch2Fire_marconi-n-mobo-c/159736004...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360042,2020-08-13 23:07:22+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north
8979,20200813_Ranch2Fire_marconi-n-mobo-c/159736009...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360098,2020-08-13 23:08:18+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north
8980,20200813_Ranch2Fire_marconi-n-mobo-c/159736016...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360162,2020-08-13 23:09:22+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north
8981,20200813_Ranch2Fire_marconi-n-mobo-c/159736021...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360219,2020-08-13 23:10:19+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north


In [28]:
# convert joined SmokeyNet-camera metadata dataframe's coordinates from EPSG 4326 to EPSG 3310
coords = [Point(xy) for xy in zip(df_labels_filtered['long'], df_labels_filtered['lat'])]
df_labels_filtered = GeoDataFrame(df_labels_filtered, crs = "EPSG:4326", geometry = coords) 
df_labels_filtered = df_labels_filtered.to_crs('EPSG:3310')
df_labels_filtered

Unnamed: 0,filepath,camera_name,image_gt,tile_gt,image_pred,tile_pred,type,date,year,time,datetime,event_name,camera_id,image_id,lat,long,direction,geometry
0,20191001_FIRE_lp-s-mobo-c/1569950465_-02281,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950465,2019-10-01 17:21:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981)
1,20191001_FIRE_lp-s-mobo-c/1569950525_-02221,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950525,2019-10-01 17:22:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981)
2,20191001_FIRE_lp-s-mobo-c/1569950585_-02161,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950585,2019-10-01 17:23:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981)
3,20191001_FIRE_lp-s-mobo-c/1569950645_-02101,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950645,2019-10-01 17:24:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981)
4,20191001_FIRE_lp-s-mobo-c/1569950705_-02041,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ...",test,2019-10-01,2019,1569950705,2019-10-01 17:25:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981)
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8978,20200813_Ranch2Fire_marconi-n-mobo-c/159736004...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360042,2020-08-13 23:07:22+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north,POINT (234115.176 -496381.326)
8979,20200813_Ranch2Fire_marconi-n-mobo-c/159736009...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360098,2020-08-13 23:08:18+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north,POINT (234115.176 -496381.326)
8980,20200813_Ranch2Fire_marconi-n-mobo-c/159736016...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360162,2020-08-13 23:09:22+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north,POINT (234115.176 -496381.326)
8981,20200813_Ranch2Fire_marconi-n-mobo-c/159736021...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360219,2020-08-13 23:10:19+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north,POINT (234115.176 -496381.326)


In [29]:
# rounds timestamps to nearest minute on the dot
def round_secs(x):
    x = x + timedelta(minutes = 1)
    x = x.replace(second=0)
    return x

In [30]:
# determines if a WFABBA detection is within the same direction as the camera
def is_in_camera_direction(camera_geometry_pt, direction, wfabba_geometry_pt):
    if direction == "north":
        return wfabba_geometry_pt.y >= camera_geometry_pt.y
    elif direction == "south":
        return wfabba_geometry_pt.y <= camera_geometry_pt.y
    elif direction == "east":
        return wfabba_geometry_pt.x >= camera_geometry_pt.x
    elif direction == "west":
        return wfabba_geometry_pt.x <= camera_geometry_pt.x
    else:
        # unknown or something else
        pass
    

In [31]:
# finds any matches with specified WFABBA dataset based off of 
# whether distance to camera is within specified radius & camera direction
def matches_distance_prox(camera_geometry, direction, radius_miles, wfabba_df):    
    wfabba_df["distance_m"] = wfabba_df["geometry"].distance(camera_geometry)
    wfabba_df["distance_mi"] = wfabba_df["distance_m"]/1609.344        
    match_results_df = wfabba_df[(wfabba_df["distance_mi"] <= radius_miles)].copy()
    
    #filter for detections within same direction
    match_results_df["is_in_direction"] = match_results_df.apply(
        lambda row: is_in_camera_direction(camera_geometry, direction, row["geometry"]), axis=1
    )
    match_results_df = match_results_df[match_results_df["is_in_direction"] == True]

    return match_results_df

In [32]:
# round the SmokeyNet timestamps to nearest minute on the dot
df_labels_filtered["datetime_rounded"] = df_labels_filtered["datetime"].apply(lambda x: round_secs(x))
df_labels_filtered

Unnamed: 0,filepath,camera_name,image_gt,tile_gt,image_pred,tile_pred,type,date,year,time,datetime,event_name,camera_id,image_id,lat,long,direction,geometry,datetime_rounded
0,20191001_FIRE_lp-s-mobo-c/1569950465_-02281,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950465,2019-10-01 17:21:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981),2019-10-01 17:22:00+00:00
1,20191001_FIRE_lp-s-mobo-c/1569950525_-02221,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950525,2019-10-01 17:22:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981),2019-10-01 17:23:00+00:00
2,20191001_FIRE_lp-s-mobo-c/1569950585_-02161,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950585,2019-10-01 17:23:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981),2019-10-01 17:24:00+00:00
3,20191001_FIRE_lp-s-mobo-c/1569950645_-02101,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",test,2019-10-01,2019,1569950645,2019-10-01 17:24:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981),2019-10-01 17:25:00+00:00
4,20191001_FIRE_lp-s-mobo-c/1569950705_-02041,lp-s-mobo-c,0,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ...",test,2019-10-01,2019,1569950705,2019-10-01 17:25:05+00:00,20191001_FIRE_lp-s-mobo-c,hpwren12_south,lp-s-mobo-c,32.701517,-116.764561,south,POINT (303757.711 -584899.981),2019-10-01 17:26:00+00:00
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8978,20200813_Ranch2Fire_marconi-n-mobo-c/159736004...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360042,2020-08-13 23:07:22+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north,POINT (234115.176 -496381.326),2020-08-13 23:08:00+00:00
8979,20200813_Ranch2Fire_marconi-n-mobo-c/159736009...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360098,2020-08-13 23:08:18+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north,POINT (234115.176 -496381.326),2020-08-13 23:09:00+00:00
8980,20200813_Ranch2Fire_marconi-n-mobo-c/159736016...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360162,2020-08-13 23:09:22+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north,POINT (234115.176 -496381.326),2020-08-13 23:10:00+00:00
8981,20200813_Ranch2Fire_marconi-n-mobo-c/159736021...,marconi-n-mobo-c,1,[],0,"[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...",valid,2020-08-13,2020,1597360219,2020-08-13 23:10:19+00:00,20200813_Ranch2Fire_marconi-n-mobo-c,hpwren30_north,marconi-n-mobo-c,33.519127,-117.480907,north,POINT (234115.176 -496381.326),2020-08-13 23:11:00+00:00


In [33]:
# get all unique cameras being considered
unique_cameras = df_labels_filtered["camera_name"].unique()
unique_cameras

array(['lp-s-mobo-c', 'pi-s-mobo-c', 'pi-n-mobo-c', 'ml-w-mobo-c',
       'lo-s-mobo-c', 'om-e-mobo-c', 'lp-n-mobo-c', 'mlo-s-mobo-c',
       'bh-w-mobo-c', 'sm-e-mobo-c', 'sm-s-mobo-c', 'om-w-mobo',
       'ml-s-mobo-c', 'om-s-mobo-c', 'bm-w-mobo-c', 'so-w-mobo-c',
       'hp-e-mobo-c', 'om-w-mobo-c', 'dwpgm-n-mobo-c', 'hp-n-mobo-c',
       'rm-e-mobo-c', 'cp-s-mobo-c', 'tp-s-mobo-c', 'smer-tcs8-mobo-c',
       'wc-e-mobo-c', 'wc-n-mobo-c', 'rm-w-mobo-c', 'smer-tcs9-mobo-c',
       'wc-s-mobo-c', 'sclm-e-mobo-c', 'mg-n-mobo-c', 'lp-w-mobo-c',
       'hp-s-mobo-c', 'mlo-n-mobo-c', 'pi-s-mobo', 'pi-e-mobo-c',
       'om-n-mobo-c', 'sp-n-mobo-c', 'om-s-mobo', 'sm-w-mobo-c',
       'rm-n-mobo-c', 'hp-w-mobo-c', 'tp-w-mobo-c', '69bravo-e-mobo-c',
       'pi-w-mobo-c', 'sjh-n-mobo-c', 'bl-n-mobo-c', 'lp-s-mobo',
       'syp-w-mobo-c', 'sm-n-mobo-c', 'vo-n-mobo-c', 'lp-e-mobo-c',
       'bm-e-mobo-c', 'bl-s-mobo-c', 'marconi-n-mobo-c'], dtype=object)

In [34]:
# if there are cameras that don't have associated directions, filter them out
unusable_cameras = df_labels_filtered[df_labels_filtered["direction"].isna()]["camera_name"].unique()
unique_cameras = np.setdiff1d(unique_cameras, unusable_cameras)
unique_cameras 

array(['69bravo-e-mobo-c', 'bh-w-mobo-c', 'bl-n-mobo-c', 'bl-s-mobo-c',
       'bm-e-mobo-c', 'bm-w-mobo-c', 'cp-s-mobo-c', 'dwpgm-n-mobo-c',
       'hp-e-mobo-c', 'hp-n-mobo-c', 'hp-s-mobo-c', 'hp-w-mobo-c',
       'lo-s-mobo-c', 'lp-e-mobo-c', 'lp-n-mobo-c', 'lp-s-mobo',
       'lp-s-mobo-c', 'lp-w-mobo-c', 'marconi-n-mobo-c', 'mg-n-mobo-c',
       'ml-s-mobo-c', 'ml-w-mobo-c', 'mlo-n-mobo-c', 'mlo-s-mobo-c',
       'om-e-mobo-c', 'om-n-mobo-c', 'om-s-mobo', 'om-s-mobo-c',
       'om-w-mobo', 'om-w-mobo-c', 'pi-e-mobo-c', 'pi-n-mobo-c',
       'pi-s-mobo', 'pi-s-mobo-c', 'pi-w-mobo-c', 'rm-e-mobo-c',
       'rm-n-mobo-c', 'rm-w-mobo-c', 'sclm-e-mobo-c', 'sjh-n-mobo-c',
       'sm-e-mobo-c', 'sm-n-mobo-c', 'sm-s-mobo-c', 'sm-w-mobo-c',
       'smer-tcs8-mobo-c', 'smer-tcs9-mobo-c', 'so-w-mobo-c',
       'sp-n-mobo-c', 'syp-w-mobo-c', 'tp-s-mobo-c', 'tp-w-mobo-c',
       'vo-n-mobo-c', 'wc-e-mobo-c', 'wc-n-mobo-c', 'wc-s-mobo-c'],
      dtype=object)

### Find all potential WFABBA GOES-16 and GOES-17 matches for SmokeyNet predictions for each camera station
Loop through each camera station. For each camera station's SmokeyNet predictions, match with potential WFABBA GOES-16 and WFABBA GOES-17 detections that are within a specified radius (default value of 35 miles), the same camera direction, and the same minute. Join these potentially matched SmokeyNet, WFABBA GOES-16, and WFABBA GOES-17 predictions together. To determine hard voting ensemble final prediction for each grouping of detections, add up how many individual models detected smoke, with a majority vote being declared a positive prediction. Then output the camera station's predictions into a csv file for future accuracy evaluation.

In [35]:
%%time

csv_suffix = "_all_hard_voting_35.csv"

# spatial radius of potential WFABBA matches
distance_miles = 35

#looping for each camera station
for camera in unique_cameras:
    
    print("Camera:",camera)
    camera_df = df_labels_filtered[df_labels_filtered["camera_name"].str.contains(camera)].copy()

    camera_instance = camera_df.iloc[0]
    
    #Find GOES-16 matches
    goes_16_dist_match_df = matches_distance_prox(camera_instance["geometry"], camera_instance["direction"], distance_miles, wfabba_goes_16_df)
    goes_16_dist_match_df["timestamp_converted_rounded"] = goes_16_dist_match_df["timestamp_converted"].apply(lambda x: round_secs(x))
    goes_16_dist_match_df = goes_16_dist_match_df.drop_duplicates(subset = ["timestamp_converted_rounded"])

    #Find GOES-17 matches
    goes_17_dist_match_df = matches_distance_prox(camera_instance["geometry"], camera_instance["direction"], distance_miles, wfabba_goes_17_df)
    goes_17_dist_match_df["timestamp_converted_rounded"] = goes_17_dist_match_df["timestamp_converted"].apply(lambda x: round_secs(x))
    goes_17_dist_match_df = goes_17_dist_match_df.drop_duplicates(subset = ["timestamp_converted_rounded"])

    #SmokeyNet_join
    test_df = minutes_df.merge(camera_df, left_on = "timestamp", right_on = "datetime_rounded",how="left")
    test_df = test_df.rename(columns = {"geometry":"HPWREN_Station_geometry", "lat":"HPWREN_lat", "long":"HPWREN_long", "datetime_rounded":"SmokeyNet_datetime_rounded"})
    print("joined SmokeyNet")
    
    #GOES-16 Join
    test_df = test_df.merge(goes_16_dist_match_df[["timestamp_converted_rounded", "geometry"]], left_on = "timestamp", right_on = "timestamp_converted_rounded",how="left")
    test_df = test_df.rename(columns = {"geometry":"WFABBA_GOES16_geometry", "timestamp_converted_rounded":"WFABBA_GOES16_timestamp_converted_rounded"})
    test_df = test_df[["timestamp","camera_name", "image_gt", "image_pred", "type", "WFABBA_GOES16_geometry"]]
    test_df.loc[test_df["WFABBA_GOES16_geometry"] != None,'goes16_pred'] = 1
    test_df.loc[test_df["WFABBA_GOES16_geometry"] == None,'goes16_pred'] = 0
    print("joined GOES16")

    #GOES-17 Join
    test_df = test_df.merge(goes_17_dist_match_df[["timestamp_converted_rounded", "geometry"]], left_on = "timestamp", right_on = "timestamp_converted_rounded",how="left")
    test_df = test_df.rename(columns = {"geometry":"WFABBA_GOES17_geometry", "timestamp_converted_rounded":"WFABBA_GOES17_timestamp_converted_rounded"})
    test_df = test_df[["timestamp","camera_name", "image_gt", "image_pred", "type", "WFABBA_GOES16_geometry", "goes16_pred", "WFABBA_GOES17_geometry"]]
    test_df.loc[test_df["WFABBA_GOES17_geometry"] != None,'goes17_pred'] = 1
    test_df.loc[test_df["WFABBA_GOES17_geometry"] == None,'goes17_pred'] = 0
    print("joined GOES17")

    #Get all votes and determine if smoke was detected by majority rule
    test_df["final_vote"] = test_df["image_pred"] + test_df["goes16_pred"] + test_df["goes17_pred"]
    test_df.loc[test_df["final_vote"] >= 2,'final_pred'] = 1
    test_df.loc[test_df["final_vote"] < 2,'final_pred'] = 0

    image_labels = test_df[~test_df["image_gt"].isna()]["image_gt"]
    smokeynet_preds = test_df[~test_df["image_gt"].isna()]["image_pred"]
    ensemble_preds = test_df[~test_df["image_gt"].isna()]["final_pred"]

    baseline_score = accuracy_score(image_labels, smokeynet_preds)
    ensemble_score = accuracy_score(image_labels, ensemble_preds)
    
    print("Baseline score:", baseline_score)
    print("Ensemble score:", ensemble_score)
    test_df[~test_df["image_gt"].isna()][["timestamp","image_gt", "image_pred", "goes16_pred", "goes17_pred", "final_pred","type"]]\
        .to_csv(processed_data_dir + camera + csv_suffix)
    print("=====================================================")

Camera: 69bravo-e-mobo-c
joined SmokeyNet
joined GOES16
joined GOES17
Baseline score: 0.9625
Ensemble score: 0.4875
Camera: bh-w-mobo-c
joined SmokeyNet
joined GOES16
joined GOES17
Baseline score: 0.9617834394904459
Ensemble score: 0.7006369426751592
Camera: bl-n-mobo-c
joined SmokeyNet
joined GOES16
joined GOES17
Baseline score: 0.75
Ensemble score: 0.4875
Camera: bl-s-mobo-c
joined SmokeyNet
joined GOES16
joined GOES17
Baseline score: 0.9230769230769231
Ensemble score: 0.6282051282051282
Camera: bm-e-mobo-c
joined SmokeyNet
joined GOES16
joined GOES17
Baseline score: 0.9358974358974359
Ensemble score: 0.5
Camera: bm-w-mobo-c
joined SmokeyNet
joined GOES16
joined GOES17
Baseline score: 0.8625
Ensemble score: 0.4875
Camera: cp-s-mobo-c
joined SmokeyNet
joined GOES16
joined GOES17
Baseline score: 0.9113924050632911
Ensemble score: 0.7531645569620253
Camera: dwpgm-n-mobo-c
joined SmokeyNet
joined GOES16
joined GOES17
Baseline score: 0.975
Ensemble score: 0.5
Camera: hp-e-mobo-c
joined Sm