## Kelowna Meteorological and Forest Fire Ignition Dataset 
This notebook summarizes the Kelowna Dataset. It also writes rows without ignitions (0) and rows with ignitions (1) to separate files.

The Kelowna Dataset contains meteorological and fire ignition data for the area surrounding Kelowna, British Columbia, Canada. It covers the years 1980 to 2020 at a temporal resolution of 1 hour. The months from October through March are excluded. This document contains summary histograms of the dataset, a numerical summary of each feature, a description of each feature, and a location map of the data points.

In [1]:
import pandas as pd

In [2]:
path = "./finals/kelowna_dataset/"

In [3]:
# create dataframes
features = pd.read_csv(path + "features_kelowna.csv")
targets = pd.read_csv(path + "targets_kelowna.csv")

In [4]:
# concatenate features and targets
df = pd.concat([features, targets], axis=1)
# remove index columns
# 'Unnamed: 0' appears in both files; naming the label once drops every column with that name
df = df.drop(columns=['Unnamed: 0', 'X.1', 'X'])
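
Columns named `Unnamed: 0` usually appear when a CSV was written together with its index. Assuming that is the case here, an alternative to dropping them afterward is to read that column back as the index. The sketch below uses small in-memory CSVs with made-up values; `features_demo`, `targets_demo`, and `df_demo` are hypothetical names, not part of the notebook.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two CSV files; the real files have many more columns.
features_csv = io.StringIO(",lon,lat\n0,-119.5,49.9\n1,-119.4,49.8\n")
targets_csv = io.StringIO(",ignition\n0,0\n1,1\n")

# index_col=0 turns the unnamed first column into the index instead of a data column.
features_demo = pd.read_csv(features_csv, index_col=0)
targets_demo = pd.read_csv(targets_csv, index_col=0)

# Concatenating on axis=1 aligns rows on the shared index.
df_demo = pd.concat([features_demo, targets_demo], axis=1)
print(df_demo.columns.tolist())
```

With this approach no `Unnamed: 0` column is created, so only the remaining index-like columns (`X`, `X.1`) would still need to be dropped.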

In [5]:
# select only rows with ignitions
df_ignitions = df.loc[df['ignition'] > 0]
df_ignitions.info()

<class 'pandas.core.frame.DataFrame'>
Index: 51918 entries, 3061 to 68437049
Data columns (total 26 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   date      51918 non-null  float64
 1   lon       51918 non-null  float64
 2   lat       51918 non-null  float64
 3   u10       51819 non-null  float64
 4   v10       51819 non-null  float64
 5   d2m       51819 non-null  float64
 6   t2m       51819 non-null  float64
 7   e         51819 non-null  float64
 8   cvh       51819 non-null  float64
 9   cvl       51819 non-null  float64
 10  skt       51819 non-null  float64
 11  stl1      51819 non-null  float64
 12  stl2      51819 non-null  float64
 13  stl3      51819 non-null  float64
 14  stl4      51819 non-null  float64
 15  slt       51819 non-null  float64
 16  sp        51819 non-null  float64
 17  tp        51819 non-null  float64
 18  swvl1     51819 non-null  float64
 19  swvl2     51819 non-null  float64
 20  swvl3     51819 non-null  float64

In [6]:
# select only rows without ignitions
df_non_ignitions = df.loc[df['ignition'] == 0]
df_non_ignitions.info()

<class 'pandas.core.frame.DataFrame'>
Index: 68385282 entries, 0 to 68437199
Data columns (total 26 columns):
 #   Column    Dtype  
---  ------    -----  
 0   date      float64
 1   lon       float64
 2   lat       float64
 3   u10       float64
 4   v10       float64
 5   d2m       float64
 6   t2m       float64
 7   e         float64
 8   cvh       float64
 9   cvl       float64
 10  skt       float64
 11  stl1      float64
 12  stl2      float64
 13  stl3      float64
 14  stl4      float64
 15  slt       float64
 16  sp        float64
 17  tp        float64
 18  swvl1     float64
 19  swvl2     float64
 20  swvl3     float64
 21  swvl4     float64
 22  month     int64  
 23  day       int64  
 24  hour      int64  
 25  ignition  int64  
dtypes: float64(22), int64(4)
memory usage: 13.8 GB
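
The two `info()` outputs show how unbalanced the classes are. As a quick check, the arithmetic can be written out using the entry counts reported above; the variable names are illustrative only.

```python
# Row counts taken from the two info() outputs.
n_ignitions = 51_918
n_non_ignitions = 68_385_282

n_total = n_ignitions + n_non_ignitions
ignition_fraction = n_ignitions / n_total

print(f"total rows: {n_total:,}")
print(f"ignition fraction: {ignition_fraction:.6f}")
print(f"non-ignition rows per ignition row: {n_non_ignitions / n_ignitions:.0f}")
```

Ignition rows make up well under 0.1% of the data, which is the motivation for subsampling the non-ignition rows.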


In [8]:
# draw a reproducible random sample of 100,000 non-ignition rows
df_non_ignitions_100000 = df_non_ignitions.sample(n=100000, random_state=42)
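
One possible use of such a sample is to combine it with the ignition rows into a smaller, less unbalanced table. The following is a minimal sketch on synthetic data, not the procedure used for this dataset; the frame `demo`, its sizes, and the `t2m` values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df: 1,000 rows, of which 10 are ignitions (illustrative only).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "t2m": rng.normal(290.0, 5.0, size=1000),
    "ignition": [1] * 10 + [0] * 990,
})

pos = demo.loc[demo["ignition"] > 0]
neg = demo.loc[demo["ignition"] == 0]

# Draw a fixed-size sample of the majority class; random_state makes it repeatable.
neg_sample = neg.sample(n=50, random_state=42)

# Stack both classes, then shuffle the rows so the classes are interleaved.
subset = (
    pd.concat([pos, neg_sample])
    .sample(frac=1, random_state=42)
    .reset_index(drop=True)
)
print(subset["ignition"].value_counts())
```

Fixing `random_state` means the same non-ignition rows are selected on every run, which keeps downstream results comparable.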

In [11]:
# write out the dataframes as csv
# (the two full exports are commented out; only the 100,000-row sample is written)
#df_non_ignitions.to_csv(path + "non_ignition_rows.csv", index=False)
#df_ignitions.to_csv(path + "ignition_rows.csv", index=False)
df_non_ignitions_100000.to_csv(path + "non_ignition_rows_100000.csv", index=False)
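
Since `info()` reports about 13.8 GB for the non-ignition rows, loading everything at once may not be possible on every machine. A hedged alternative is to read a CSV in chunks and keep only the rows of interest. The sketch below assumes a single file that already contains an `ignition` column and uses an in-memory CSV with made-up values.

```python
import io

import pandas as pd

# Small in-memory stand-in for a large CSV (hypothetical values).
csv_text = "t2m,ignition\n" + "".join(
    f"{280 + i},{int(i % 4 == 0)}\n" for i in range(10)
)

ignition_chunks = []
# chunksize makes read_csv return an iterator of DataFrames instead of one large frame.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=3):
    # Keep only the ignition rows from each chunk.
    ignition_chunks.append(chunk.loc[chunk["ignition"] > 0])

ignitions_demo = pd.concat(ignition_chunks, ignore_index=True)
print(len(ignitions_demo))
```

Only one chunk plus the filtered rows is held in memory at a time, at the cost of a single sequential pass over the file.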

The Kelowna Dataset contains modified Copernicus Climate Change Service information from 2020. We have reduced the number of features contained in the information, the geographic area represented, and the number of years available. Neither the European Commission nor ECMWF is responsible for any use that may be made of the Copernicus information or data it contains.

For more information refer to:

Copernicus Climate Change Service (C3S) (2023): ERA5 hourly data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). 10.24381/cds.adbb2d47 (Accessed on 07-MAR-2023)

Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., Thépaut, J-N. (2018): ERA5 hourly data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). 10.24381/cds.adbb2d47 (Accessed on 07-MAR-2023)