## **Task 1**

##### **Chosen Data Model: Embedded**

The designed schema consists of one collection, *climate_historic* (populated from *climate_historic.csv*), in which the documents from *hotspot_historic.csv* have been embedded. Since a climate record for a particular date may have several hotspot records, each document in the *climate_historic* collection has a key called *hotspots* that holds an array of all the hotspot documents (fire occurrences) with the same date as the climate record. This one-to-many relationship is therefore modelled using embedded documents. The date field has been removed from each embedded hotspot document to prevent duplication.

The structure of the data model is as follows:

```
climate_document = {
    "_id": bson.objectid.ObjectId,
    "station": int,
    "date": str,
    "air_temperature_celcius": int,
    "relative_humidity": float,
    "windspeed_knots": float,
    "max_wind_speed": float,
    "precipitation": str,
    "GHI_w/m2": int,
    "hotspots": [
        {
            "latitude": float,
            "longitude": float,
            "datetime": str,
            "confidence": int,
            "surface_temperature_celcius": int
        }
    ]
}
```

**Example of climate document containing embedded corresponding hotspot documents:**
```
{'GHI_w/m2': 152,
 '_id': ObjectId('663d77f85178059695ff6188'),
 'air_temperature_celcius': 18,
 'date': '15/12/2023',
 'hotspots': [{'confidence': 92,
               'datetime': '2023-12-15 13:17:17',
               'latitude': -37.627,
               'longitude': 149.33,
               'surface_temperature_celcius': 42},
              {'confidence': 78,
               'datetime': '2023-12-15 13:17:17',
               'latitude': -37.658,
               'longitude': 149.339,
               'surface_temperature_celcius': 36},
              {'confidence': 51,
               'datetime': '2023-12-15 13:17:17',
               'latitude': -37.623,
               'longitude': 149.323,
               'surface_temperature_celcius': 38},
              {'confidence': 65,
               'datetime': '2023-12-15 01:16:23',
               'latitude': -38.038,
               'longitude': 142.986,
               'surface_temperature_celcius': 40}],
 'max_wind_speed': 14.0,
 'precipitation': '0.00I',
 'relative_humidity': 52.0,
 'station': 948702,
 'windspeed_knots': 7.1}
 ```

##### **Reasons for choosing the embedded data model:**
The embedded data model is chosen over the reference data model to optimise read operations by retrieving related hotspot and climate data in fewer database queries. The detailed reasons are:

- For the given use case, queries fetch related data from both the climate and hotspot documents more often than they fetch data from either dataset on its own. For example, studying the surface temperature (hotspot data) together with the air temperature (climate data) is a crucial part of analysing trends in fire occurrences. Many of the queries in the given use case (Task 2) combine data from both datasets, whereas fewer use data from only one. The embedded data model fetches related data in a single database operation, which makes querying efficient, whereas referencing requires multiple queries because the hotspot and climate collections would have to be queried separately and the references resolved. Hence, the embedded data model is the better fit for this case.

- The given climate and hotspot datasets are not very large, so the embedded model fits this case; referencing mainly pays off when the related data is large or grows without bound. Furthermore, even if the datasets were to expand, a very large number of fire occurrences on the same day is highly unlikely, so the 16MB document size limit is not expected to be a problem with these datasets. Since the embedded documents remain relatively small, query performance is not compromised.
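
The size argument can be sanity-checked with a rough sketch. It approximates a document's size by its JSON-encoded length, which is an assumption (MongoDB stores BSON, whose exact size differs somewhat), and the helper name `estimate_size_bytes` is hypothetical. The field values are taken from the example document above, while the count of 1,000 hotspots in one day is deliberately exaggerated.

```python
import json

MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's per-document size limit (16 MB)


def estimate_size_bytes(document):
    """Approximate a document's size by its JSON-encoded length (not exact BSON size)."""
    return len(json.dumps(document).encode("utf-8"))


# One embedded hotspot, with values taken from the example document
sample_hotspot = {
    "latitude": -37.627,
    "longitude": 149.33,
    "datetime": "2023-12-15 13:17:17",
    "confidence": 92,
    "surface_temperature_celcius": 42,
}

# Climate record for one day with an exaggerated 1,000 embedded hotspots
climate_document = {
    "station": 948702,
    "date": "15/12/2023",
    "air_temperature_celcius": 18,
    "relative_humidity": 52.0,
    "windspeed_knots": 7.1,
    "max_wind_speed": 14.0,
    "precipitation": "0.00I",
    "GHI_w/m2": 152,
    "hotspots": [sample_hotspot] * 1000,
}

size = estimate_size_bytes(climate_document)
print(f"approximate size: {size} bytes, limit: {MAX_BSON_DOCUMENT_BYTES} bytes")
```

Under this approximation, even a day with 1,000 fires stays well below the limit.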



## **Task 2.1**

**Set up mongo client and get/create collection**

In [4]:
from pymongo import MongoClient

#setup mongo client
ip_address = '10.192.68.151'

mongo_client = MongoClient(ip_address, 27017)

#create or get database 
db = mongo_client["fit3182_assignment_db"]

#create collection for climate data
climate_col = db.climates_historic

*Note: To clear the current data in the collection, uncomment and run the cell below before running the following cell, which populates the climate_col collection with climate data.*

In [4]:
#! uncomment the following line before running the next cell to clear current contents from collection
# climate_col.delete_many({})

**Feed climate data with embedded hotspots into created collection**

In [4]:
import pandas as pd
from pprint import pprint

#read csv files and extract rows
climate_data = pd.read_csv("climate_historic.csv").iterrows()
data = pd.read_csv("hotspot_historic.csv").iterrows()

#extract hotspot documents (one dictionary per csv row)
hotspots = [data_row.to_dict() for _, data_row in data]

#populate climate collection with documents
for _, data_row in climate_data:
    row_dict = data_row.to_dict()   #convert climate csv row to document
    row_dict["hotspots"] = []

    #find all fires which occurred on current climate record's date
    cur_climate_date = row_dict["date"]

    #embed all matching hotspots into climate document
    for hotspot in hotspots:
        if hotspot["date"] == cur_climate_date:

            #copy the hotspot and drop its date, which already exists in the climate document
            embed_hotspot = hotspot.copy()
            embed_hotspot.pop("date")

            row_dict["hotspots"].append(embed_hotspot)

    climate_col.insert_one(row_dict)

366
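
The nested loop in the cell above rescans every hotspot for each climate record. A minimal sketch of an alternative groups the hotspots by date once and then looks them up per climate record. It works on small in-memory lists rather than the CSV files, and the function name `embed_hotspots` is hypothetical; the sample values are taken from outputs shown in this notebook.

```python
from collections import defaultdict


def embed_hotspots(climate_rows, hotspot_rows):
    """Return climate documents with same-date hotspots embedded under 'hotspots'."""
    hotspots_by_date = defaultdict(list)
    for hotspot in hotspot_rows:
        # copy every field except date, which already lives on the climate document
        embedded = {key: value for key, value in hotspot.items() if key != "date"}
        hotspots_by_date[hotspot["date"]].append(embedded)

    documents = []
    for climate in climate_rows:
        document = dict(climate)
        # dates without fires get an empty list, matching the schema
        document["hotspots"] = hotspots_by_date.get(document["date"], [])
        documents.append(document)
    return documents


# Tiny illustrative input
climate_rows = [
    {"station": 948700, "date": "31/12/2022", "air_temperature_celcius": 19},
    {"station": 948702, "date": "12/12/2023", "air_temperature_celcius": 19},
]
hotspot_rows = [
    {"date": "12/12/2023", "latitude": -37.903, "longitude": 145.25,
     "datetime": "2023-12-12 00:45:38", "confidence": 53,
     "surface_temperature_celcius": 44},
]

documents = embed_hotspots(climate_rows, hotspot_rows)
```

The resulting list could then be written in a single call such as `climate_col.insert_many(documents)`, assuming the collection is set up as in the earlier cells.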


In [5]:
from pprint import pprint
for document in climate_col.find({}):
    pprint(document)

{'GHI_w/m2': 154,
 '_id': ObjectId('663d77f75178059695ff602c'),
 'air_temperature_celcius': 19,
 'date': '31/12/2022',
 'hotspots': [],
 'max_wind_speed': 11.1,
 'precipitation': '0.00I',
 'relative_humidity': 56.8,
 'station': 948700,
 'windspeed_knots': 7.9}
{'GHI_w/m2': 128,
 '_id': ObjectId('663d77f75178059695ff602d'),
 'air_temperature_celcius': 15,
 'date': '2/1/2023',
 'hotspots': [],
 'max_wind_speed': 13.0,
 'precipitation': '0.02G',
 'relative_humidity': 50.7,
 'station': 948700,
 'windspeed_knots': 9.2}
{'GHI_w/m2': 133,
 '_id': ObjectId('663d77f75178059695ff602e'),
 'air_temperature_celcius': 16,
 'date': '3/1/2023',
 'hotspots': [],
 'max_wind_speed': 15.0,
 'precipitation': '0.00G',
 'relative_humidity': 53.6,
 'station': 948700,
 'windspeed_knots': 8.1}
{'GHI_w/m2': 186,
 '_id': ObjectId('663d77f75178059695ff602f'),
 'air_temperature_celcius': 24,
 'date': '4/1/2023',
 'hotspots': [],
 'max_wind_speed': 14.0,
 'precipitation': '0.00I',
 'relative_humidity': 61.6,
 'stati

## **Task 2.2**

In [6]:
from pprint import pprint

##### **a) Climate data on 12th December 2023**


In [6]:
climate_12_dec = climate_col.find({"date": "12/12/2023"})  #filter query using find

#extract document from returned cursor object
for document in climate_12_dec:
    pprint(document)

{'GHI_w/m2': 156,
 '_id': ObjectId('663d77f85178059695ff6185'),
 'air_temperature_celcius': 19,
 'date': '12/12/2023',
 'hotspots': [{'confidence': 53,
               'datetime': '2023-12-12 00:45:38',
               'latitude': -37.903,
               'longitude': 145.25,
               'surface_temperature_celcius': 44}],
 'max_wind_speed': 12.0,
 'precipitation': '0.00I',
 'relative_humidity': 55.3,
 'station': 948702,
 'windspeed_knots': 6.2}


##### **b) Selected data for surface temperature (°C) between 65 °C and 100 °C**


In [7]:
#extract only required attributes from each matching hotspot document 

#to show hotspot attributes as distinct attributes rather than embedded inside a hotspot document, 
#the path to the attributes is given instead of a boolean value during projection
projected_fields = {
                     "_id": 0,
                     "latitude": "$hotspots.latitude",
                     "longitude": "$hotspots.longitude",
                     "surface_temperature_celcius": "$hotspots.surface_temperature_celcius",
                     "confidence": "$hotspots.confidence"
                    }


pipeline_stages = [{"$unwind": "$hotspots"},  #extract embedded hotspot documents
                   {"$match": {"hotspots.surface_temperature_celcius": {"$gte": 65, "$lte": 100}}},  #filter query on surface temperature
                   {"$project": projected_fields}  #use projection to retrieve selected attributes
                  ]

data = climate_col.aggregate(pipeline_stages)  #execute aggregation pipeline

for document in data:
    pprint(document)

{'confidence': 94,
 'latitude': -37.2284,
 'longitude': 147.9187,
 'surface_temperature_celcius': 73}
{'confidence': 97,
 'latitude': -37.6572,
 'longitude': 142.0703,
 'surface_temperature_celcius': 80}
{'confidence': 84,
 'latitude': -37.0193,
 'longitude': 148.1459,
 'surface_temperature_celcius': 71}
{'confidence': 100,
 'latitude': -37.4229,
 'longitude': 147.027,
 'surface_temperature_celcius': 99}
{'confidence': 80,
 'latitude': -37.0055,
 'longitude': 148.1582,
 'surface_temperature_celcius': 68}
{'confidence': 85,
 'latitude': -37.4128,
 'longitude': 147.0242,
 'surface_temperature_celcius': 98}
{'confidence': 90,
 'latitude': -34.357,
 'longitude': 141.5361,
 'surface_temperature_celcius': 67}
{'confidence': 93,
 'latitude': -34.3539,
 'longitude': 141.5547,
 'surface_temperature_celcius': 72}
{'confidence': 90,
 'latitude': -36.9939,
 'longitude': 148.2244,
 'surface_temperature_celcius': 68}
{'confidence': 95,
 'latitude': -36.9959,
 'longitude': 148.2118,
 'surface_tempera
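
To make the pipeline's behaviour concrete, here is a plain-Python sketch of what `$unwind`, `$match` and `$project` do to the documents. It only illustrates the semantics on two in-memory documents and is not how MongoDB executes the pipeline; the function name `hot_hotspots` is hypothetical, and the sample documents are cut-down versions of the 2/8/2023 and 31/12/2022 records shown elsewhere in this notebook.

```python
def hot_hotspots(climate_documents, low=65, high=100):
    """Mimic $unwind -> $match -> $project for the surface-temperature query."""
    results = []
    for climate in climate_documents:
        for hotspot in climate["hotspots"]:  # $unwind: one row per embedded hotspot
            temperature = hotspot["surface_temperature_celcius"]
            if low <= temperature <= high:  # $match with $gte / $lte
                results.append({  # $project: keep only the selected hotspot fields
                    "latitude": hotspot["latitude"],
                    "longitude": hotspot["longitude"],
                    "surface_temperature_celcius": temperature,
                    "confidence": hotspot["confidence"],
                })
    return results


sample_documents = [
    {"date": "2/8/2023", "hotspots": [
        {"latitude": -37.4796, "longitude": 141.9403, "confidence": 94,
         "datetime": "2023-08-02 03:45:40", "surface_temperature_celcius": 87},
        {"latitude": -37.491, "longitude": 141.936, "confidence": 54,
         "datetime": "2023-08-02 03:45:00", "surface_temperature_celcius": 40},
    ]},
    {"date": "31/12/2022", "hotspots": []},  # no fires: contributes no rows
]

matches = hot_hotspots(sample_documents)
```

As with `$unwind` in its default mode, a climate document with an empty `hotspots` array produces no output rows.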

##### **c) Selected data for 15th and 16th of December 2023**


In [10]:
#extract only required attributes from each climate document with embedded hotspot document 
projected_fields = {
                     "_id": 0,
                     "date": 1,
                     "air_temperature_celcius": 1,
                     "relative_humidity": 1,
                     "max_wind_speed": 1,
                     "surface_temperature_celcius": "$hotspots.surface_temperature_celcius"  #to show surface temperature as a distinct attribute rather than embedded inside a hotspot document
                    }

pipeline_stages = [{"$match": {"date": {"$in": ["15/12/2023", "16/12/2023"]}}},  #filter query based on given two dates
                   {"$project": projected_fields}  #use projection to retrieve selected attributes
                  ]

data = climate_col.aggregate(pipeline_stages)

for document in data:
    pprint(document)


{'air_temperature_celcius': 18,
 'date': '15/12/2023',
 'max_wind_speed': 14.0,
 'relative_humidity': 52.0,
 'surface_temperature_celcius': [42, 36, 38, 40]}
{'air_temperature_celcius': 18,
 'date': '16/12/2023',
 'max_wind_speed': 13.0,
 'relative_humidity': 53.7,
 'surface_temperature_celcius': [43,
                                 33,
                                 54,
                                 73,
                                 55,
                                 75,
                                 55,
                                 66,
                                 56,
                                 60,
                                 73,
                                 48,
                                 55,
                                 64,
                                 57]}


##### **d) Selected data for confidence between 80 and 100**


In [9]:
#extract only required attributes from each climate document with embedded hotspot document 

#to show hotspot attributes as distinct attributes rather than embedded inside a hotspot document, 
#the path to the attributes is given instead of a boolean value during projection
projected_fields = {
                     "_id": 0,
                     "datetime": "$hotspots.datetime", 
                     "air_temperature_celcius": 1,
                     "surface_temperature_celcius": "$hotspots.surface_temperature_celcius",
                     "confidence": "$hotspots.confidence" 
                    }

pipeline_stages = [{"$unwind": "$hotspots"},  
                   {"$match": {"hotspots.confidence": {"$gte": 80, "$lte": 100}}},  
                   {"$project": projected_fields} 
                  ]

data = climate_col.aggregate(pipeline_stages)  #execute aggregation pipeline
 
for document in data:
    pprint(document)


{'air_temperature_celcius': 20,
 'confidence': 87,
 'datetime': '2023-03-06 05:06:30',
 'surface_temperature_celcius': 62}
{'air_temperature_celcius': 20,
 'confidence': 85,
 'datetime': '2023-03-06 05:06:20',
 'surface_temperature_celcius': 59}
{'air_temperature_celcius': 19,
 'confidence': 88,
 'datetime': '2023-03-07 04:16:10',
 'surface_temperature_celcius': 64}
{'air_temperature_celcius': 23,
 'confidence': 86,
 'datetime': '2023-03-09 13:23:40',
 'surface_temperature_celcius': 41}
{'air_temperature_celcius': 19,
 'confidence': 100,
 'datetime': '2023-03-10 04:48:40',
 'surface_temperature_celcius': 105}
{'air_temperature_celcius': 19,
 'confidence': 100,
 'datetime': '2023-03-10 04:46:20',
 'surface_temperature_celcius': 109}
{'air_temperature_celcius': 19,
 'confidence': 94,
 'datetime': '2023-03-10 04:45:30',
 'surface_temperature_celcius': 73}
{'air_temperature_celcius': 19,
 'confidence': 97,
 'datetime': '2023-03-10 04:45:30',
 'surface_temperature_celcius': 80}
{'air_temper

##### **e) Top 10 records with the highest surface temperature (°C)**

In [10]:
pipeline_stages = [{"$unwind": "$hotspots"},  #unwind to be able to order all documents individually
                   {"$sort": {"hotspots.surface_temperature_celcius": -1}},  #sort in descending order based on surface temp
                   {"$limit": 10} #output only top 10 records
                  ]

data = climate_col.aggregate(pipeline_stages)  #execute aggregation pipeline

for document in data:
    pprint(document)

{'GHI_w/m2': 122,
 '_id': ObjectId('663d77f75178059695ff6097'),
 'air_temperature_celcius': 15,
 'date': '18/4/2023',
 'hotspots': {'confidence': 100,
              'datetime': '2023-04-18 04:52:00',
              'latitude': -38.1665,
              'longitude': 143.062,
              'surface_temperature_celcius': 124},
 'max_wind_speed': 9.9,
 'precipitation': '0.00I',
 'relative_humidity': 56.1,
 'station': 948701,
 'windspeed_knots': 5.1}
{'GHI_w/m2': 140,
 '_id': ObjectId('663d77f75178059695ff6089'),
 'air_temperature_celcius': 16,
 'date': '4/4/2023',
 'hotspots': {'confidence': 100,
              'datetime': '2023-04-04 04:32:50',
              'latitude': -36.343,
              'longitude': 142.1986,
              'surface_temperature_celcius': 123},
 'max_wind_speed': 12.0,
 'precipitation': '0.00I',
 'relative_humidity': 47.5,
 'station': 948701,
 'windspeed_knots': 5.4}
{'GHI_w/m2': 121,
 '_id': ObjectId('663d77f75178059695ff60a4'),
 'air_temperature_celcius': 14,
 'date': '

##### **f) Number of fires each day**


In [11]:
#project only date and number of fires
projected_fields = {
                     "_id": 0,
                     "date": 1, 
                     "number_of_fires": {
                         "$size": "$hotspots"  #number of fires each day = size of each hotspots list
                     }
                    }

pipeline_stages = [{"$project": projected_fields}]

data = climate_col.aggregate(pipeline_stages)  #execute aggregation pipeline

for document in data:
    pprint(document)


{'date': '31/12/2022', 'number_of_fires': 0}
{'date': '2/1/2023', 'number_of_fires': 0}
{'date': '3/1/2023', 'number_of_fires': 0}
{'date': '4/1/2023', 'number_of_fires': 0}
{'date': '5/1/2023', 'number_of_fires': 0}
{'date': '6/1/2023', 'number_of_fires': 0}
{'date': '7/1/2023', 'number_of_fires': 0}
{'date': '8/1/2023', 'number_of_fires': 0}
{'date': '9/1/2023', 'number_of_fires': 0}
{'date': '10/1/2023', 'number_of_fires': 0}
{'date': '11/1/2023', 'number_of_fires': 0}
{'date': '12/1/2023', 'number_of_fires': 0}
{'date': '13/1/2023', 'number_of_fires': 0}
{'date': '14/1/2023', 'number_of_fires': 0}
{'date': '15/1/2023', 'number_of_fires': 0}
{'date': '16/1/2023', 'number_of_fires': 0}
{'date': '17/1/2023', 'number_of_fires': 0}
{'date': '18/1/2023', 'number_of_fires': 0}
{'date': '19/1/2023', 'number_of_fires': 0}
{'date': '20/1/2023', 'number_of_fires': 0}
{'date': '21/1/2023', 'number_of_fires': 0}
{'date': '22/1/2023', 'number_of_fires': 0}
{'date': '23/1/2023', 'number_of_fires'

##### **g) Records of fires with confidence below 70**


In [12]:
#extract only embedded hotspot attributes from each climate document

#to show hotspot attributes as distinct attributes rather than embedded inside a hotspot document, 
#the path to the attributes is given instead of a boolean value during projection
projected_fields = {
                     "_id": 0,
                     "latitude": "$hotspots.latitude",
                     "longitude": "$hotspots.longitude",
                     "datetime": "$hotspots.datetime",
                     "confidence": "$hotspots.confidence",
                     "surface_temperature_celcius": "$hotspots.surface_temperature_celcius",
                    }

pipeline_stages = [{"$unwind": "$hotspots"},  #hotspots = fire records
                   {"$match": {"hotspots.confidence": {"$lt": 70}}},  #filter for confidence < 70
                   {"$project": projected_fields} 
                  ]

data = climate_col.aggregate(pipeline_stages)  #execute aggregation pipeline
 
for document in data:
    pprint(document)


{'confidence': 68,
 'datetime': '2023-03-08 04:51:00',
 'latitude': -37.7885,
 'longitude': 141.9352,
 'surface_temperature_celcius': 55}
{'confidence': 54,
 'datetime': '2023-03-09 03:57:00',
 'latitude': -37.7171,
 'longitude': 147.5866,
 'surface_temperature_celcius': 44}
{'confidence': 55,
 'datetime': '2023-03-10 04:43:00',
 'latitude': -36.2544,
 'longitude': 148.0353,
 'surface_temperature_celcius': 42}
{'confidence': 54,
 'datetime': '2023-03-10 04:42:30',
 'latitude': -37.2197,
 'longitude': 147.9621,
 'surface_temperature_celcius': 43}
{'confidence': 56,
 'datetime': '2023-03-13 23:58:50',
 'latitude': -37.0286,
 'longitude': 148.155,
 'surface_temperature_celcius': 42}
{'confidence': 52,
 'datetime': '2023-03-13 12:57:00',
 'latitude': -37.0316,
 'longitude': 148.1519,
 'surface_temperature_celcius': 35}
{'confidence': 52,
 'datetime': '2023-03-15 00:42:50',
 'latitude': -37.4075,
 'longitude': 147.0233,
 'surface_temperature_celcius': 43}
{'confidence': 66,
 'datetime': '20

##### **h) Average surface temperature (°C) of each day**

In [13]:

#project only date and average surface temperature
projected_fields = {
                     "_id": 0,
                     "date": 1,

                     #using conditionals to show average temperature only for dates with hotspots
                     "average_surface_temperature": {
                         "$cond": {
                             "if": {"$eq": ["$hotspots", []]},   #if no hotspots, surface temp = N/A
                             "then": "N/A",
                             "else": {"$avg": "$hotspots.surface_temperature_celcius"}
                         }
                     }
                    }

pipeline_stages = [{"$project": projected_fields}]

data = climate_col.aggregate(pipeline_stages)  #execute aggregation pipeline


for document in data:
    pprint(document)

{'average_surface_temperature': 'N/A', 'date': '31/12/2022'}
{'average_surface_temperature': 'N/A', 'date': '2/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '3/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '4/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '5/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '6/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '7/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '8/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '9/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '10/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '11/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '12/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '13/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '14/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '15/1/2023'}
{'average_surface_temperature': 'N/A', 'date': '16/1/2023'}
{'average_surface_temperature': 'N/A', 'date': 
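
The conditional average can be mirrored in plain Python as a quick cross-check. This is only a sketch of the logic, not the aggregation itself; the function name `average_surface_temperature` is hypothetical, and the four temperatures are the 15/12/2023 values shown earlier in this notebook.

```python
from statistics import mean


def average_surface_temperature(climate_document):
    """Return the mean hotspot surface temperature, or 'N/A' when there are no hotspots."""
    temperatures = [
        hotspot["surface_temperature_celcius"]
        for hotspot in climate_document["hotspots"]
    ]
    return mean(temperatures) if temperatures else "N/A"


with_fires = {"date": "15/12/2023", "hotspots": [
    {"surface_temperature_celcius": 42},
    {"surface_temperature_celcius": 36},
    {"surface_temperature_celcius": 38},
    {"surface_temperature_celcius": 40},
]}
without_fires = {"date": "31/12/2022", "hotspots": []}

print(average_surface_temperature(with_fires))     # mean of 42, 36, 38, 40
print(average_surface_temperature(without_fires))  # no hotspots on this date
```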

##### **i) Top 10 records with the lowest GHI** 

In [14]:
pipeline_stages = [{"$sort": {"GHI_w/m2": 1}},  #sort in ascending order based on GHI
                   {"$limit": 10} #output only top 10 records
                  ]

data = climate_col.aggregate(pipeline_stages)  #execute aggregation pipeline

for document in data:
    pprint(document)

{'GHI_w/m2': 47,
 '_id': ObjectId('663d77f85178059695ff6101'),
 'air_temperature_celcius': 5,
 'date': '2/8/2023',
 'hotspots': [{'confidence': 94,
               'datetime': '2023-08-02 03:45:40',
               'latitude': -37.4796,
               'longitude': 141.9403,
               'surface_temperature_celcius': 87},
              {'confidence': 54,
               'datetime': '2023-08-02 03:45:00',
               'latitude': -37.491,
               'longitude': 141.936,
               'surface_temperature_celcius': 40}],
 'max_wind_speed': 5.1,
 'precipitation': '0.00I',
 'relative_humidity': 38.6,
 'station': 948701,
 'windspeed_knots': 1.8}
{'GHI_w/m2': 48,
 '_id': ObjectId('663d77f75178059695ff60e0'),
 'air_temperature_celcius': 5,
 'date': '30/6/2023',
 'hotspots': [{'confidence': 78,
               'datetime': '2023-06-30 04:41:25',
               'latitude': -36.834,
               'longitude': 142.524,
               'surface_temperature_celcius': 44},
              {'confi

##### **j) Records with a 24-hour precipitation recorded between 0.20 and 0.35**

In [15]:
#since precipitation is stored in alphanumeric format, regex pattern matching is used to fetch required data
#since 24 hour precipitation record is required, the flags checked in regex are D(4*6=24), F(2*12=24) and G(1*24=24)
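#raw string avoids invalid-escape warnings; [DFG] matches exactly one flag ([D|F|G] would also accept a literal '|')
data = climate_col.find({"precipitation": {"$regex": r"^(0\.(2[0-9]|3[0-5]))[DFG]"}})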

for document in data:
    pprint(document)

{'GHI_w/m2': 157,
 '_id': ObjectId('663d77f75178059695ff6038'),
 'air_temperature_celcius': 19,
 'date': '13/1/2023',
 'hotspots': [],
 'max_wind_speed': 18.1,
 'precipitation': '0.31G',
 'relative_humidity': 54.1,
 'station': 948700,
 'windspeed_knots': 11.2}
{'GHI_w/m2': 146,
 '_id': ObjectId('663d77f75178059695ff6083'),
 'air_temperature_celcius': 17,
 'date': '29/3/2023',
 'hotspots': [{'confidence': 69,
               'datetime': '2023-03-29 00:48:40',
               'latitude': -34.2648,
               'longitude': 141.6325,
               'surface_temperature_celcius': 51}],
 'max_wind_speed': 21.0,
 'precipitation': '0.24G',
 'relative_humidity': 49.9,
 'station': 948701,
 'windspeed_knots': 12.2}
{'GHI_w/m2': 166,
 '_id': ObjectId('663d77f75178059695ff6099'),
 'air_temperature_celcius': 20,
 'date': '20/4/2023',
 'hotspots': [{'confidence': 84,
               'datetime': '2023-04-20 04:44:20',
               'latitude': -36.8871,
               'longitude': 145.1536,
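
The precipitation pattern can be checked locally with Python's `re` module before it is sent to MongoDB. This sketch assumes the two regex engines agree on this simple pattern. It uses the character class `[DFG]`; inside a character class `|` is a literal character, so a class written as `[D|F|G]` would also accept a `|`. Apart from the values that appear in the outputs of this notebook (`0.31G`, `0.24G`, `0.00I`, `0.02G`), the sample strings are made up for illustration.

```python
import re

# Amount from 0.20 to 0.35 followed by one of the three 24-hour flags
precipitation_pattern = re.compile(r"^0\.(2[0-9]|3[0-5])[DFG]")

samples = ["0.31G", "0.24G", "0.20D", "0.35F", "0.00I", "0.02G", "0.36G", "0.25A", "0.25|"]
matching = [value for value in samples if precipitation_pattern.search(value)]
print(matching)

# A class written with pipes treats '|' as one more accepted character
loose_pattern = re.compile(r"^0\.(2[0-9]|3[0-5])[D|F|G]")
print(bool(loose_pattern.search("0.25|")))
```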
         

## **Task 2.3**

##### **Creating Indexes**

In [16]:
from pymongo import ASCENDING, DESCENDING

#set index models for date and surface temp attributes
climate_col.create_index([("date", ASCENDING)], unique=True)
climate_col.create_index([("hotspots.surface_temperature_celcius", DESCENDING)])

print(list(climate_col.index_information()))

['_id_', 'date_1', 'hotspots.surface_temperature_celcius_-1']


##### **Rationale behind the used indexing**

Single-field indexes have been created on the `date` field of the climate documents and the `surface_temperature_celcius` field of the embedded hotspot documents. Because `hotspots` is an array, MongoDB builds the second index as a multikey index. Both fields have been indexed because they are the most frequently used in queries in the given use case: most of the queries use date to filter the resulting documents, while many access the surface temperature values. Indexing both fields should therefore improve query execution times on larger datasets, as data retrieval is faster for the most commonly accessed fields.

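The date index keys are sorted in ascending order so that records can be fetched in date order. Because the dates are stored as strings in day/month/year form, this order is lexicographic rather than strictly chronological; chronological ordering would require storing the dates as date values. The date index also has a unique constraint, which ensures that the collection holds at most one climate record for any particular day, thus ensuring consistency. Note that this constraint applies across all stations; enforcing one record per station per day would need a compound unique index on `station` and `date`. In contrast, the surface temperature index keys are sorted in descending order because the records with the highest surface temperatures should primarily be studied for fire predictions.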
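
Since the `date` values are stored as strings such as `'2/1/2023'`, it is worth checking how they sort. The sketch below compares plain string ordering with ordering by parsed dates; it assumes the index compares strings character by character, as Python does for these ASCII values, and uses a few dates in the same format as the dataset.

```python
from datetime import datetime

dates = ["2/1/2023", "10/1/2023", "31/12/2022", "3/1/2023"]

# Plain string comparison, character by character
lexicographic = sorted(dates)

# Parsing as day/month/year gives true chronological order
chronological = sorted(dates, key=lambda text: datetime.strptime(text, "%d/%m/%Y"))

print(lexicographic)
print(chronological)
```

If chronological range queries or sorts matter, one option is to store a parsed date value alongside (or instead of) the string and index that field.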