# 2. Quality Control Filtering

*Date: August 1, 2023*  
*Author: Alicia Larsen*     
*Institution: The Research Institute of Sweden (RISE)*   
*Contact: alicia.hh.larsen@gmail.com*  

This is the 3rd notebook of 7, in the series "RISE Wildfire Prediction Using Machine Learning"

##### Keywords: LST, LSR, Fire detection, MODIS, Python

## Reference
This notebook is based on the procedures in the reference notebook found at this [link](https://github.com/ornldaac/modis_restservice_qc_filter_Python/blob/master/modis_restservice_qc_filter_Python.ipynb). The reference notebook can also be found in /initial-eda/data-procurement/reference-notebook/download-modis-data-example-notebook.ipynb, on github.com:larsenalicia/RISE-wildfire-prediction.git

## Overview
Some pixels are missing, others have bad quality due to clouds or other factors. This notebook will filter out bad-quality data points.

## Prerequisites: 

* Python 3.6 or later (the code uses f-strings)
* Libraries: requests, json, datetime, pandas, numpy, matplotlib
* 1_data_procurement.ipynb has been run, and the resulting CSV files are in the directory /data/..

---

## Imports:

In [None]:
# Imports
import requests
import json
import pandas as pd

from globals.global_vars import url, header, coordinate_description, lat, lon, start_year, end_year, products, bands, above_below_left_right

## Land Surface Temperature (LST) 
In order to filter the pixels with questionable quality from our LST time series, we need to understand the QC bit layer. 

**MOD11A2** uses 8-bit unsigned integers to indicate the quality of each pixel. See the [MOD11A2](https://lpdaac.usgs.gov/products/mod11a2v061/) page for additional resources. 

That is, <code>df_lst_qc</code> takes values ranging from 0 to 255:

In [None]:
# Variable of interest
var = 'lst'
product = products[var]

# Load the datasets for LST, rename, and set new index
df_lst_data = pd.read_csv(f'data/procurement/{var}/{product}_{bands[product][0]}_{start_year}-{end_year}_{coordinate_description}.csv').rename(columns={'Unnamed: 0': 'date'}).set_index('date')
df_lst_qc = pd.read_csv(f'data/procurement/{var}/{product}_{bands[product][1]}_{start_year}-{end_year}_{coordinate_description}.csv').rename(columns={'Unnamed: 0': 'date'}).set_index('date')
df_lst_qc.head()

To make the classification more computationally efficient, we only classify the values that actually occur in the data.

Retrieve the unique values:

In [None]:
# Check for unique values in df_lst_qc
qcvals_lst = pd.unique(df_lst_qc.values.ravel())
qcvals_lst

Refer to the user-guide found at [MOD11A2](https://lpdaac.usgs.gov/products/mod11a2v061/) for the bit classifications. Note that bit 0 is the least significant bit in the table below:

| Bits | Long Name | Key |
|:--------:|:--------:|:--------|
|  1 & 0   |  Mandatory QA flags   |  <code>00</code>=LST produced, good quality, not necessary to examine more detailed QA, <br><code>01</code>=LST produced, other quality, recommend examination of more detailed QA, <br><code>10</code>=LST not produced due to cloud effects, <br><code>11</code>=LST not produced primarily due to reasons other than cloud |
|  3 & 2   |  Data quality flag   |  <code>00</code>=good data quality, <br><code>01</code>=other quality data, <br><code>10</code>=TBD, <br><code>11</code>=TBD |
|  5 & 4   |  Emis Error flag   |  <code>00</code>=average emissivity error <= 0.01, <br><code>01</code>=average emissivity error <= 0.02, <br><code>10</code>=average emissivity error <= 0.04, <br><code>11</code>=average emissivity error > 0.04   |
|  7 & 6   |  LST Error flag   |  <code>00</code>=average LST error <= 1K <br> <code>01</code>=average LST error <= 2K, <br> <code>10</code>=average LST error <= 3K, <br> <code>11</code>=average LST error > 3K   |

(No such table is provided for the remaining products in this notebook; refer to their user guides instead.)
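
As a complement to the table, the four 2-bit fields can also be pulled out of a QC value with shifts and masks. The sketch below is illustrative only: `decode_mod11a2_qc` is a hypothetical helper name, and it assumes the layout given in the table above (bit 0 is the least significant bit).

```python
def decode_mod11a2_qc(value: int) -> dict:
    """Split an 8-bit MOD11A2 QC value into its four 2-bit fields."""
    value = int(value)
    return {
        "mandatory_qa": value & 0b11,         # bits 1 & 0
        "data_quality": (value >> 2) & 0b11,  # bits 3 & 2
        "emis_error": (value >> 4) & 0b11,    # bits 5 & 4
        "lst_error": (value >> 6) & 0b11,     # bits 7 & 6
    }


# 65 is 0b01000001: mandatory QA = 01 (other quality), LST error = 01 (<= 2K)
print(decode_mod11a2_qc(65))
```

Each field is returned as an integer from 0 to 3, which corresponds to the keys `00` to `11` in the table.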

Now we can decide which QC filtering criteria satisfy our needs. In this study, we filter using two levels of restriction, each removing the pixels listed.

More restricted filtering:
* Pixels that were not produced (cloud cover or other reason)
* Pixels of 'other quality' that have an LST error > 2K

Less restricted filtering:
* Pixels that were not produced (cloud cover or other reason)

In [None]:
lst_qc_data: list = []

# Iterate through the list of 8-bit integers and populate QC table with bit definitions 
for integer in qcvals_lst:
    bits = list(map(int, list("{0:b}".format(integer).zfill(8))))
    
    # Describe each of the bits. The list is most significant bit first, so bits[7] == bit 0
    # Mandatory_QA bits description
    if (bits[6] == 0 and bits[7] == 0):
        Mandatory_QA = 'LST GOOD'
    if (bits[6] == 0 and bits[7] == 1):
        Mandatory_QA = 'LST Produced,Other Quality'
    if (bits[6] == 1 and bits[7] == 0):
        Mandatory_QA = 'No Pixel,clouds'
    if (bits[6] == 1 and bits[7] == 1):
        Mandatory_QA = 'No Pixel, Other QA'
        
    # Data_Quality bits description
    if (bits[4] == 0 and bits[5] == 0):
        Data_Quality = 'Good Data'
    if (bits[4] == 0 and bits[5] == 1):
        Data_Quality = 'Other Quality'
    if (bits[4] == 1 and bits[5] == 0):
        Data_Quality = 'TBD'
    if (bits[4] == 1 and bits[5] == 1):
        Data_Quality = 'TBD'
        
    # Emiss_Err bits description
    if (bits[2] == 0 and bits[3] == 0):
        Emiss_Err = 'Emiss Err <= .01'
    if (bits[2] == 0 and bits[3] == 1):
        Emiss_Err = 'Emiss Err <= .02'
    if (bits[2] == 1 and bits[3] == 0):
        Emiss_Err = 'Emiss Err <= .04'
    if (bits[2] == 1 and bits[3] == 1):
        Emiss_Err = 'Emiss Err > .04'
        
    # LST_Err bits description
    if (bits[0] == 0 and bits[1] == 0):
        LST_Err = 'LST Err <= 1K'
    if (bits[0] == 0 and bits[1] == 1):
        LST_Err = 'LST Err <= 2K'
    if (bits[0] == 1 and bits[1] == 0):
        LST_Err = 'LST Err <= 3K'
    if (bits[0] == 1 and bits[1] == 1):
        LST_Err = 'LST Err > 3K' 
    
    # Append this integer's bit values and descriptions to the list
    lst_qc_data.append([integer] + bits + [Mandatory_QA, Data_Quality, Emiss_Err, LST_Err])
    
# Convert QC bits and descriptions to pandas data frame
lst_qc_data = pd.DataFrame(lst_qc_data, columns=['Integer_Value', 'Bit7', 'Bit6', 'Bit5', 'Bit4', 'Bit3', 'Bit2', 'Bit1', 'Bit0', 'Mandatory_QA', 'Data_Quality', 'Emiss_Err', 'LST_Err'])
lst_qc_data

In [None]:
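# Select the QC values to filter out.
# Not produced: mandatory QA flag 10 or 11, i.e. Bit1 == 1.
not_produced = lst_qc_data['Bit1'] == 1
# 'Other quality' (mandatory QA flag 01) with an LST error > 2K (LST error flag 10 or 11, i.e. Bit7 == 1).
other_quality_high_error = (lst_qc_data['Bit1'] == 0) & (lst_qc_data['Bit0'] == 1) & (lst_qc_data['Bit7'] == 1)

lst_qc_data_hard = lst_qc_data.loc[not_produced | other_quality_high_error]
lst_qc_data_loose = lst_qc_data.loc[not_produced]

lst_qc_data_hard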
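
The two restriction levels can also be written as boolean masks computed directly from the QC values, without building a lookup table first. This is a minimal sketch on a toy frame: the column names `px0`/`px1` and the QC values are made up for illustration, and the bit layout is the one from the table above.

```python
import pandas as pd

# Toy QC frame standing in for df_lst_qc (values are illustrative only)
qc = pd.DataFrame({"px0": [0, 2, 65], "px1": [3, 129, 145]})

values = qc.to_numpy()
mandatory = values & 0b11          # bits 1 & 0
lst_error = (values >> 6) & 0b11   # bits 7 & 6

not_produced = mandatory >= 2                              # flag 10 or 11
other_quality_high_error = (mandatory == 1) & (lst_error >= 2)  # LST error > 2K

loose_mask = pd.DataFrame(not_produced, index=qc.index, columns=qc.columns)
hard_mask = pd.DataFrame(not_produced | other_quality_high_error,
                         index=qc.index, columns=qc.columns)
print(hard_mask)
```

The shift is done on the NumPy array because pandas objects do not support the `>>` operator.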

We can use the **pandas** function **mask()** to filter the remaining QC integer values from our LST time series:

In [None]:
# Apply the filters
filter_hard = lst_qc_data_hard['Integer_Value'].tolist()
filter_loose = lst_qc_data_loose['Integer_Value'].tolist()

lst_data_filt_hard = df_lst_data.mask(df_lst_qc.isin(filter_hard))
lst_data_filt_loose = df_lst_data.mask(df_lst_qc.isin(filter_loose))

lst_data_filt_hard.head()
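
To see what `mask()` does on a small scale, here is a toy example with made-up values. The frames `data` and `qc` are stand-ins for the LST data and QC frames. They share the same index and columns, which is what lets the QC frame act as a cell-by-cell condition.

```python
import pandas as pd

# Made-up raw values and QC integers with identical shape and labels
data = pd.DataFrame({"px0": [14500, 14600], "px1": [14700, 0]})
qc = pd.DataFrame({"px0": [0, 2], "px1": [65, 3]})

bad_values = [2, 3]  # QC integers to filter out

# Cells where the condition is True are replaced by NaN
filtered = data.mask(qc.isin(bad_values))
print(filtered)
```

Only the second row is masked here, because its QC values (2 and 3) are in `bad_values`.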

In [None]:
# Compare the number of masked values with the total number of data points
lst_data_points = df_lst_data.size
lst_hard_filtering: int = int(lst_data_filt_hard.isna().sum().sum())
lst_loose_filtering: int = int(lst_data_filt_loose.isna().sum().sum())

print(f"""
FILTERING EFFECT
------------------------------------------
number of original datapoints:      {lst_data_points}
percentage of datapoints removed: 
    loose:                          {round(100*lst_loose_filtering/lst_data_points, 2)}%
    hard:                           {round(100*lst_hard_filtering/lst_data_points, 2)}%
""")

Apply the scale factor for LST (0.02). We can retrieve the scale factor from a new subset request using the same global variables as the actual data request. The unit is Kelvin.

In [None]:
# arbitrary date, included by the product
date = 'A2010001'

# Join LST request parameters to URL string and submit request
response = requests.get("".join([
    url, products['lst'], "/subset?",
    "latitude=", str(lat),
    "&longitude=", str(lon),
    "&band=", str(bands[products['lst']][0]),
    "&startDate=", str(date),
    "&endDate=", str(date),
    "&kmAboveBelow=", str(above_below_left_right),
    "&kmLeftRight=", str(above_below_left_right)
]), headers=header)

In [None]:
# Finally, scale the dataframes using the meta-file with the response.
scale = json.loads(response.text)['scale']
lst_data_filt_scale_hard = lst_data_filt_hard*float(scale)
lst_data_filt_scale_loose = lst_data_filt_loose*float(scale)

lst_data_filt_scale_hard.head()
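
As a quick check of the scaling step, the toy example below multiplies made-up raw values by the LST scale factor of 0.02 mentioned above. It only illustrates the arithmetic and shows that masked (NaN) cells stay masked.

```python
import numpy as np
import pandas as pd

scale = 0.02  # LST scale factor stated above

# Made-up raw values; NaN marks a pixel removed by the QC filter
raw = pd.DataFrame({"px0": [14500.0, np.nan], "px1": [15000.0, 14000.0]})

kelvin = raw * scale
print(kelvin)
```

A raw value of 14500 becomes 290 K, and the NaN cell is unchanged.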

## Land Surface Reflectance (LSR)

In [None]:
# Variable of interest
var = 'lsr'
product = products[var]

# Load the datasets for LSR, rename, and set new index
df_nir = pd.read_csv(f'data/procurement/{var}/{product}_{bands[product][0]}_{start_year}-{end_year}_{coordinate_description}.csv').rename(columns={'Unnamed: 0': 'date'}).set_index('date')
df_swir = pd.read_csv(f'data/procurement/{var}/{product}_{bands[product][1]}_{start_year}-{end_year}_{coordinate_description}.csv').rename(columns={'Unnamed: 0': 'date'}).set_index('date')
df_lsr_qc = pd.read_csv(f'data/procurement/{var}/{product}_{bands[product][2]}_{start_year}-{end_year}_{coordinate_description}.csv').rename(columns={'Unnamed: 0': 'date'}).set_index('date')

In order to filter the pixels with questionable quality from our LSR time series, we need to understand the QC bit layer. 

**MOD09A1** uses 32-bit unsigned integers to indicate the quality of each pixel. See the [MOD09A1](https://lpdaac.usgs.gov/products/mod09a1v061/) page for additional resources. 

That is, <code>df_lsr_qc</code> takes values ranging from 0 to 4,294,967,295:

In [None]:
df_lsr_qc.head()

Run the cell below to see the unique values in the data. Only these will be classified.

In [None]:
# Get unique values of df_lsr_qc
qcvals_lsr = pd.unique(df_lsr_qc.values.ravel())
qcvals_lsr

Build a table describing the bits for the unique QC 32-bit integer values contained in <code>df_lsr_qc</code>:

In [None]:
# Create empty list to store QC bit information
lsr_qc_data: list  = []

# Iterate through the list of 32-bit integers and populate QC table with bit definitions
for integer in qcvals_lsr:
    bits = list(map(int, list("{0:b}".format(integer).zfill(32))))
    
    # Describe each of the bits. The list is most significant bit first, so bits[31] == bit 0
    
    # MODLAND QA bits
    if (bits[30] == 0 and bits[31] == 0):
        produced = 'ideal quality'
    if (bits[30] == 0 and bits[31] == 1):
        produced = 'less than ideal quality'
    if bits[30] == 1:
        produced = 'not produced'

    # -------------------------
    # Specific for the NIR band
    # -------------------------
    
    # band 2 data quality, four bit range (bits 6-9)
    if (bits[22] == 0 and bits[23] == 0 and bits[24] == 0 and bits[25] == 0):
        nir_quality = 'highest quality'
    if not (bits[22] == 0 and bits[23] == 0 and bits[24] == 0 and bits[25] == 0):
        nir_quality = 'insufficient quality'

    # --------------------------
    # Specific for the SWIR band
    # --------------------------
    
    # band 6 data quality, four bit range (bits 22-25)
    if (bits[6] == 0 and bits[7] == 0 and bits[8] == 0 and bits[9] == 0):
        swir_quality = 'highest quality'
    if not (bits[6] == 0 and bits[7] == 0 and bits[8] == 0 and bits[9] == 0):
        swir_quality = 'insufficient quality'


    # Append this integer's bit values and descriptions to the list
    lsr_qc_data.append([integer] + [produced, nir_quality, swir_quality])
    

# Convert QC bits and descriptions to pandas data frame
lsr_qc_data = pd.DataFrame(lsr_qc_data, columns=['integer_value', 'produced', 'nir_quality', 'swir_quality'])
lsr_qc_data.head()
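
The same fields can be extracted with shifts and masks instead of list indices. The sketch below uses hypothetical helper names and assumes the layout implied by the indices in the cell above: the MODLAND QA flag in bits 0-1, followed by one 4-bit data-quality field per band starting at bit 2. Under that assumption, `bits[22]` to `bits[25]` correspond to bits 6-9 (band 2) and `bits[6]` to `bits[9]` to bits 22-25 (band 6).

```python
def modland_qa(value: int) -> int:
    """Return the 2-bit MODLAND QA flag (bits 1 & 0)."""
    return int(value) & 0b11


def band_quality(value: int, band: int) -> int:
    """Return the 4-bit data-quality field of a band (1-7)."""
    shift = 2 + 4 * (band - 1)  # band 1 starts at bit 2, band 2 at bit 6, ...
    return (int(value) >> shift) & 0b1111


# Illustrative value: only the band 6 quality field is set, to 1000
value = 0b1000 << 22
print(modland_qa(value), band_quality(value, 2), band_quality(value, 6))
```

A band-quality result of 0 corresponds to 'highest quality' in the table built above; anything else is 'insufficient quality'.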

Now we can decide which QC filtering criteria satisfy our needs. In this study, we use a single filtering level and remove pixels:

* When the data points were not produced
* When the quality of the NIR or SWIR band is not the "highest"

Subset the QC table again to include only rows that represent QC criteria for pixels that we want to filter:

In [None]:
# Define the filtering criteria.
lsr_qc_data = lsr_qc_data.loc[
                      (lsr_qc_data['produced'] == 'not produced') |
                      (lsr_qc_data['nir_quality'] == 'insufficient quality') |
                      (lsr_qc_data['swir_quality'] == 'insufficient quality') ]

lsr_qc_data.head()

In [None]:
filter_refl = lsr_qc_data['integer_value'].tolist()

# Define the filter as a pandas-mask.
nir_data_filt = df_nir.mask(df_lsr_qc.isin(filter_refl))
swir_data_filt = df_swir.mask(df_lsr_qc.isin(filter_refl))

In [None]:
df_nir.head()

In [None]:
# Compare the number of masked values with the total number of data points
lsr_data_points = df_nir.size
lsr_filtering: int = int(nir_data_filt.isna().sum().sum())

print(f"""
FILTERING EFFECT
------------------------------------------
number of original datapoints:      {lsr_data_points}
percentage of datapoints removed:   {round(100*lsr_filtering/lsr_data_points, 3)}%
""")

Apply the scale factor for LSR. We can retrieve the scale factor from a new subset request using the same global variables as the actual data request. The unit is percentage of reflectance.

In [None]:
# arbitrary date, included by the product
date = 'A2010001'

# Join LSR request parameters to URL string and submit request
response = requests.get("".join([
    url, products['lsr'], "/subset?",
    "latitude=", str(lat),
    "&longitude=", str(lon),
    "&band=", str(bands[products['lsr']][0]),
    "&startDate=", str(date),
    "&endDate=", str(date),
    "&kmAboveBelow=", str(above_below_left_right),
    "&kmLeftRight=", str(above_below_left_right)
]), headers=header)

In [None]:
scale = json.loads(response.text)['scale']

# Finally: Scaling the data frames
nir_data_filt_scale_hard = nir_data_filt*float(scale)
swir_data_filt_scale_hard = swir_data_filt*float(scale)

nir_data_filt_scale_hard.head()

## Enhanced Vegetation Index (EVI)

In [None]:
# Variable of interest
var = 'evi'
product = products[var]

# Load the datasets for EVI, rename, and set new index
df_evi_data = pd.read_csv(f'data/procurement/{var}/{product}_{bands[product][0]}_{start_year}-{end_year}_{coordinate_description}.csv').rename(columns={'Unnamed: 0': 'date'}).set_index('date')
df_evi_qc = pd.read_csv(f'data/procurement/{var}/{product}_{bands[product][1]}_{start_year}-{end_year}_{coordinate_description}.csv').rename(columns={'Unnamed: 0': 'date'}).set_index('date')

Next, we will filter the EVI dataset.

**MOD13Q1** uses 16-bit unsigned integers to indicate the quality of each pixel. See the [MOD13Q1](https://lpdaac.usgs.gov/products/MOD13Q1v061/) page for additional resources.

That is, <code>df_evi_qc</code> takes values ranging from 0 to 65535:

In [None]:
df_evi_qc.head()

To make the classification more computationally efficient, we only classify the values that actually occur in the data.

Retrieve the unique values:

In [None]:
# Define a list of all the unique values in df_evi_qc
qcvals_evi = pd.unique(df_evi_qc.values.ravel())
len(qcvals_evi)

Build a table describing the bits of interest for the unique QC 16-bit integer values contained in <code>df_evi_qc</code>:

In [None]:
# Create empty list to store QC bit information
evi_qc_data = []

# Lookup tables keyed by bit pattern, written most significant bit first
vi_quality_key = {
    '00': 'VI produced with good quality',
    '01': 'VI produced, but check other QA',
    '10': 'Pixel produced, but most probably cloudy',
    '11': 'Pixel not produced due to other reasons than clouds',
}
vi_usefulness_key = {
    '0000': 'Highest quality',
    '0001': 'Lower quality',
    '1100': 'Lowest quality',
    '1101': 'Quality so low that it is not useful',
    '1110': 'L1B data faulty',
    '1111': 'Not useful for any other reason/not processed',
}

# Iterate through the list of 16-bit integers and populate QC table with bit definitions
for integer in qcvals_evi:
    # The bit string is most significant bit first, so bits[15] == bit 0
    bits = "{0:b}".format(int(integer)).zfill(16)

    # VI Quality: bits 1 & 0
    vi_quality = vi_quality_key[bits[14:16]]

    # VI Usefulness: bits 5-2. Patterns not listed above are labelled
    # 'Decreasing quality' (an assumption for patterns the original did not list)
    vi_usefulness = vi_usefulness_key.get(bits[10:14], 'Decreasing quality')

    # Append this integer's value and descriptions to the list
    evi_qc_data.append([integer, vi_quality, vi_usefulness])

# Convert QC bits and descriptions to pandas data frame
evi_qc_data = pd.DataFrame(evi_qc_data, columns=['integer_value', 'vi_quality', 'vi_usefulness'])
evi_qc_data.head()
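
For reference, the two fields of interest can also be read with shifts and masks. This is a minimal sketch with a hypothetical helper name. It assumes the standard MODIS convention that bit 0 is the least significant bit, with the VI quality flag in bits 0-1 and the VI usefulness field in bits 2-5.

```python
def decode_vi_quality(value: int) -> tuple:
    """Return (VI quality flag, VI usefulness) as integers."""
    value = int(value)
    vi_quality = value & 0b11              # bits 1 & 0
    vi_usefulness = (value >> 2) & 0b1111  # bits 5-2
    return vi_quality, vi_usefulness


# 2116 is 0b0000100001000100: quality flag 00, usefulness 0001
print(decode_vi_quality(2116))
```

Here a quality flag of 0 means 'VI produced with good quality' and a usefulness of 1 means 'Lower quality'.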

Now we can decide which QC filtering criteria satisfy our needs. In this study, we will filter using two levels of restrictions.

More restricted filtering:
* Pixel produced, but most probably cloudy
* Pixels that were not produced due to other reasons than clouds
* Lowest quality of pixel
* Pixels of 'quality so low that it is not useful', 'not useful for any other reason/not processed', 'L1B data faulty'

Less restricted filtering:
* Pixels that were not produced due to other reasons than clouds
* Pixels of 'quality so low that it is not useful', 'not useful for any other reason/not processed', 'L1B data faulty'

Subset the QC table again to include only rows that represent QC criteria for pixels that we want to filter:

In [None]:
# Define the filtering criteria
evi_qc_data_hard = evi_qc_data.loc[(evi_qc_data['vi_quality'] == 'Pixel produced, but most probably cloudy') |
                                (evi_qc_data['vi_quality'] == 'Pixel not produced due to other reasons than clouds') | 
                                (evi_qc_data['vi_usefulness'] == 'Lowest quality') |
                                (evi_qc_data['vi_usefulness'] == 'Quality so low that it is not useful') |
                                (evi_qc_data['vi_usefulness'] == 'L1B data faulty') |
                                (evi_qc_data['vi_usefulness'] == 'Not useful for any other reason/not processed')]   

evi_qc_data_loose = evi_qc_data.loc[(evi_qc_data['vi_quality'] == 'Pixel not produced due to other reasons than clouds') | 
                                (evi_qc_data['vi_usefulness'] == 'Quality so low that it is not useful') |
                                (evi_qc_data['vi_usefulness'] == 'L1B data faulty') |
                                (evi_qc_data['vi_usefulness'] == 'Not useful for any other reason/not processed')] 
                                
evi_qc_data_hard.head()

We can use the **pandas** function **mask()** to filter the remaining QC integer values from our EVI time series:

In [None]:
# Apply the filtering as a pandas-mask.
filter_hard = evi_qc_data_hard['integer_value'].tolist()
filter_loose = evi_qc_data_loose['integer_value'].tolist()

evi_data_filt_hard = df_evi_data.mask(df_evi_qc.isin(filter_hard))
evi_data_filt_loose = df_evi_data.mask(df_evi_qc.isin(filter_loose))

evi_data_filt_hard.head()


Compare how much data each filter removes:

In [None]:
# Compare the number of masked values with the total number of data points
evi_data_points = df_evi_data.size
evi_filtering_h: int = int(evi_data_filt_hard.isna().sum().sum())
evi_filtering_l: int = int(evi_data_filt_loose.isna().sum().sum())

print(f"""
FILTERING EFFECT
------------------------------------------
number of original datapoints:      {evi_data_points}
percentage of datapoints removed:   
    hard:                           {round(100*evi_filtering_h/evi_data_points, 2)}%
    loose:                          {round(100*evi_filtering_l/evi_data_points, 2)}%
""")

Apply the scale factor for EVI (0.0001):

In [None]:
# Finally, scale the dataframes.
scale = 0.0001
evi_data_filt_scale_hard = evi_data_filt_hard*float(scale)
evi_data_filt_scale_loose = evi_data_filt_loose*float(scale)

evi_data_filt_scale_hard.head()

## Fire detection

In [None]:
# Variable of interest
var = 'fire'
product = products[var]

# Load the dataset for fire detection, rename, and set new index
df_fire_data = pd.read_csv(f'data/procurement/{var}/{product}_{bands[product][0]}_{start_year}-{end_year}_{coordinate_description}.csv').rename(columns={'Unnamed: 0': 'date'}).set_index('date')

Although the MOD14A2 product has a separate QC dataset, the FireMask band itself already encodes the information we need.

The data is stored as integer classes ranging from 0 to 9:

- 0: not processed (missing input data)
- 1: not processed (obsolete; not used since Collection 1)
- 2: not processed (other reason)
- 3: non-fire water pixel
- 4: cloud (land or water)
- 5: non-fire land pixel
- 6: unknown (land or water)
- 7: fire (low confidence, land or water)
- 8: fire (nominal confidence, land or water)
- 9: fire (high confidence, land or water)


We can directly decide which filtering criteria satisfy our needs. In this study, we will filter:

* When the pixel is not processed (0-2)
* When the data is over water (3)
* When there is cloud (4)
* When the value is unknown (6)

No QC table is needed here; the filter is applied directly to the FireMask values.
See the [MOD14A2](https://lpdaac.usgs.gov/products/mod14a2v061/) page for additional resources. 

In [None]:
# Take a look at the data
df_fire_data.head()

We can use the **pandas** function **mask()** to filter these FireMask values from our fire time series:

In [None]:
# Define two filters for consistency with the other variables; they are identical here (the differentiation happens later)
filter_hard = [0, 1, 2, 3, 4, 6]
filter_loose = [0, 1, 2, 3, 4, 6]

fire_data_filt_hard = df_fire_data.mask(df_fire_data.isin(filter_hard))
fire_data_filt_loose = df_fire_data.mask(df_fire_data.isin(filter_loose))
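
A toy example shows which FireMask classes survive this filter. The frame below is made up for illustration; only the class codes and the filter list come from the text above.

```python
import pandas as pd

# Made-up FireMask values for two pixels over three dates
fire = pd.DataFrame({"px0": [5, 4, 9], "px1": [3, 7, 0]})

filter_values = [0, 1, 2, 3, 4, 6]  # not processed, water, cloud, unknown
filtered = fire.mask(fire.isin(filter_values))

# Collect the classes that remain after masking
remaining = sorted(pd.Series(filtered.to_numpy().ravel()).dropna().unique())
print(remaining)
```

Only the non-fire land class (5) and the fire classes (7-9) can remain. Note that the values become floats after masking, because NaN cannot be stored in an integer column.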

In [None]:
# Compare the number of masked values with the total number of data points
fire_data_points = df_fire_data.size
fire_filtering_h: int = int(fire_data_filt_hard.isna().sum().sum())
fire_filtering_l: int = int(fire_data_filt_loose.isna().sum().sum())

print(f"""
FILTERING EFFECT
------------------------------------------
number of original datapoints:      {fire_data_points}
percentage of datapoints removed:   
    hard:                           {round(100*fire_filtering_h/fire_data_points, 2)}%
    loose:                          {round(100*fire_filtering_l/fire_data_points, 2)}%
""")

In [None]:
fire_data_filt_hard.head()

## Data storage

In [None]:
# Save filtered data in csv files, in either data/filtered/hard or data/filtered/loose

# LST
lst_data_filt_scale_hard.to_csv(f'data/filtered/hard/lst_{start_year}-{end_year}_{coordinate_description}.csv')
lst_data_filt_scale_loose.to_csv(f'data/filtered/loose/lst_{start_year}-{end_year}_{coordinate_description}.csv')

# NIR & SWIR
nir_data_filt_scale_hard.to_csv(f'data/filtered/hard/nir_{start_year}-{end_year}_{coordinate_description}.csv')
swir_data_filt_scale_hard.to_csv(f'data/filtered/hard/swir_{start_year}-{end_year}_{coordinate_description}.csv')

# EVI
evi_data_filt_scale_hard.to_csv(f'data/filtered/hard/evi_{start_year}-{end_year}_{coordinate_description}.csv')
evi_data_filt_scale_loose.to_csv(f'data/filtered/loose/evi_{start_year}-{end_year}_{coordinate_description}.csv')

# FIRE
fire_data_filt_hard.to_csv(f'data/filtered/hard/fire_{start_year}-{end_year}_{coordinate_description}.csv')
fire_data_filt_loose.to_csv(f'data/filtered/loose/fire_{start_year}-{end_year}_{coordinate_description}.csv')
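
`to_csv` does not create missing folders, so the cell above fails if data/filtered/hard or data/filtered/loose does not exist yet. One way to guard against that is sketched below. `save_filtered` is a hypothetical helper, not part of the original notebook, and the demonstration writes to a temporary directory rather than to data/filtered.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_filtered(df: pd.DataFrame, root, level: str, name: str) -> Path:
    """Write df to <root>/<level>/<name>.csv, creating the folder if needed."""
    out_dir = Path(root) / level
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{name}.csv"
    df.to_csv(out_path)
    return out_path


# Demonstration in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = save_filtered(pd.DataFrame({"px0": [290.0]}), tmp, "hard", "lst_demo")
    print(path.exists())
```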

## Wrap-up
Now you should have filtered the datasets.

Have a nice day!

/ Alicia