# FIT5148 - Distributed Databases and Big Data

# Assignment 2 - Task B Solution Workbook


**Student Details**
- Name: Pushan Mukerjee
- Student ID: 29052971


In [2]:
#Libraries 

import pymongo #library for connecting to MongoDb from Python
from pymongo import MongoClient #for connecting to mongo db server as the client
from pprint import pprint #for printing Mongodb output in the queries
import pandas as pd #for data load 
import json #for data load
from bson.son import SON #used for aggregate sorting

## Q1) Data Model

There is a 1:Many relationship between climate and fire. One climatic day contains 0 or many fires, whereas one fire belongs to only one climatic day, with Date being the join column.  

The model chosen was an **Embedding** model consisting of 2 collections:
* Climate
* Fire

The Climate collection contains all the climate records from ClimateData-Part1.csv.

The Fire collection contains all the fire records from FireData-Part1.csv, with each fire record holding an embedded climate object; this models the fact that each fire is associated with exactly one climatic day. To avoid data redundancy, the "Date" field was removed from the embedded climate object, as it already exists in the fire object.  

Since there are climatic days that don't have fires, not all climate objects in the climate collection (eg. 2016-12-31 and 2018-01-01) are embedded in a fire object (fire collection).

Sample of the model is below.

**Sample Climate Object:**

```
{'_id': ObjectId('5bb051a29343690c6ac71c12'),
 'Air_Temperature_Celcius': 19,
 'Date': '2016-12-31',
 'Max_Wind_Speed': 11.1,
 'Precipitation ': ' 0.00I',
 'Relative_Humidity': 56.8,
 'Station': 948700,
 'WindSpeed_knots': 7.9
}
```

**Note:** The order of the fields in a document is not significant here. When a document is printed in Python (for example with **pprint**), the keys are shown in alphabetical order.
Hence the above climate document may appear as per the below sample when read back from MongoDB: 

```
{'Air_Temperature_Celcius': 19,
 'Date': '2016-12-31',
 'Max_Wind_Speed': 11.1,
 'Precipitation ': ' 0.00I',
 'Relative_Humidity': 56.8,
 'Station': 948700,
 'WindSpeed_knots': 7.9,
 '_id': ObjectId('5bb051a29343690c6ac71c12')}

```



**Sample Fire Object with embedded Climate Object:**

```
{
    "_id" : ObjectId("5baa31d4fa3c2e011bb72abe"),
	"Latitude" : -37.651,
	"Longitude" : 149.345,
	"Surface_Temperature_Kelvin" : 337.8,
	"Datetime" : "2017-12-16T00:20:53",
	"Power" : 42.2,
	"Confidence" : 82,
	"Date" : "2017-12-16",
	"Surface_Temperature_Celcius" : 64,
	"Climate" : {"Station" : 948702,
			     "Air_Temperature_Celcius" : 18,
			     "Relative_Humidity" : 53.7,
			     "WindSpeed_knots" : 9,
			     "Max_Wind_Speed" : 13,
			     "Precipitation" : "0.00I"
		        }
}
```
**Note:** The order of the fields in a document is not significant here. When a document is printed in Python (for example with **pprint**), the keys are shown in alphabetical order.
Hence a fire document may appear as per the below sample when read back from MongoDB: 

```
{'Climate':  {'Air_Temperature_Celcius': 28,
              'Max_Wind_Speed': 15.9,
              'Precipitation ': ' 0.00I',
              'Relative_Humidity': 58.3,
              'Station': 948702,
              'WindSpeed_knots': 9.3
             },
 'Confidence': 78,
 'Date': '2017-12-27',
 'Datetime': '2017-12-27T04:16:51',
 'Latitude': -37.966,
 'Longitude': 145.051,
 'Power': 26.7,
 'Surface_Temperature_Celcius': 68,
 'Surface_Temperature_Kelvin': 341.8,
 '_id': ObjectId('5baf7d2893436908f141b36e')
}
```


**Justification for the model:**

The chosen model is an Embedding model that embeds the climate information into each fire document. Because of the 1:Many relationship between Climate and Fire, each fire holds exactly one embedded Climate object, so a fire document stays small and is in no danger of breaching the 16MB document limit. As a result, all information about a single fire can be retrieved in a single seek, which is faster than following a reference.   

However, not all days in the climate data set have fires, as there can be 0 or many fires for a climatic day. If we only had a fire collection, we would lose the climate data that is not associated with any fire (eg. '2016-12-31' or '2018-01-01'). Hence it is necessary to retain a separate climate collection consisting of all the climate documents. This also provides additional flexibility: queries involving climate data only can be performed against the climate collection; queries involving fire data only can be performed against the fire collection; and queries involving fire and related climate data can be performed against the fire collection alone. The model therefore eliminates the need for joins between collections, which avoids performance issues, since MongoDB is not well suited to applications requiring relational joins.        

Furthermore, in the fire collection, the "Date" field was removed from the embedded Climate object because it is redundant: the Date already exists in the fire object. It would not be needed even for a Mongo aggregation query joining the fire and climate collections, as the "Date" field in the fire collection would be joined with the "Date" field in the climate collection.  
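
If such a join were ever needed, it could be expressed with a `$lookup` aggregation stage on `Date`. The following is a minimal sketch only and is not part of the submitted solution; the helper name `build_fire_climate_lookup` and the output field `climate_docs` are illustrative assumptions.

```python
def build_fire_climate_lookup(date):
    """Return a pipeline that joins fire documents for one date to climate on Date.

    The helper name and the 'climate_docs' output field are hypothetical.
    """
    return [
        # Restrict to the fires recorded on the requested date.
        {"$match": {"Date": date}},
        # Join each fire to the climate document(s) with the same Date.
        {"$lookup": {
            "from": "climate",
            "localField": "Date",
            "foreignField": "Date",
            "as": "climate_docs",
        }},
    ]


pipeline = build_fire_climate_lookup("2017-12-15")
# With a live connection this could be run against the fire collection, e.g.:
#   for doc in db.fire.aggregate(pipeline):
#       pprint(doc)
```

With the embedded model this stage is unnecessary for the queries in this task, because each fire document already carries its climate record.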

Another option would have been to use a Referencing Model and reference the fire information within the Climate collection. The issue in this case is that a single fire cannot be identified by the joining column Date, so each climate object would have to store an array of fire datestamps in order to reference each fire. This is additional overhead for little benefit, since many of the queries in this task don't need joins and it is faster to retrieve all the information from a single document.

An embedded option, where all fire instances are embedded within a single climate object, is worse than the above two: there can be many fires on a particular climatic day, so the climate document grows with every fire and the chance of breaching the 16MB limit for a single document increases. 
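
To make the comparison concrete, the three document shapes discussed above can be sketched as plain Python dictionaries. This is an illustration only: the values are taken from the sample fire document earlier in this section (with most fields omitted), and the field names `fire_datetimes` and `fires` are hypothetical, since the rejected models were never implemented.

```python
# Chosen model: one fire document with its single climate record embedded.
fire_embedded = {
    "Date": "2017-12-16",
    "Datetime": "2017-12-16T00:20:53",
    "Surface_Temperature_Celcius": 64,
    "Climate": {"Station": 948702, "Air_Temperature_Celcius": 18},
}

# Rejected option 1 (referencing): the climate document points at its fires.
climate_referencing = {
    "Date": "2017-12-16",
    "Station": 948702,
    "fire_datetimes": ["2017-12-16T00:20:53"],  # hypothetical field name
}

# Rejected option 2: every fire of the day embedded in the climate document.
climate_with_fires = {
    "Date": "2017-12-16",
    "Station": 948702,
    "fires": [  # hypothetical field name; the array grows with each fire
        {"Datetime": "2017-12-16T00:20:53", "Surface_Temperature_Celcius": 64},
    ],
}
```

Only the first shape keeps each document at a fixed size regardless of how many fires occur on a day.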

## Q2) Loading the database

Create the client connection to MongoDB.

Then create a new database **as2TaskB** with 2 collections:
* fire
* climate

Load the datasets into the collections using the new model.

In [3]:
client = MongoClient() #defining the Mongodb client (defaults to localhost:27017).
result = client.drop_database('as2TaskB') #ensure that the as2TaskB database 
                                          #doesn't already exist

db = client.as2TaskB #defining the db

fireCollection = db.fire #define a new collection for fire data. 
                         #This will store fire data plus the 
                         #embedded climate data associated 
                         #with each fire

climateCollection = db.climate #define a new collection 
                               #for the climate data

result = fireCollection.drop() #Ensure that the fire collection 
                               #does not already exist in Mongo db

result = climateCollection.drop() #Ensure that the climate collection 
                                  #does not already exist in Mongo db
    
climateData = [] #climate data array to store in memory
fireData = [] #fire data array to store in memory

fireDf = pd.read_csv("FireData-Part1.csv") #Read the FireData-Part1.csv into a dataframe
climateDf = pd.read_csv("ClimateData-Part1.csv") #Read the ClimateData-Part1.csv into a dataframe

mergedDf = pd.merge(fireDf, climateDf, on='Date', how='inner') #Join fire and climate dataframes by Date

#Create a dataframe that stores the climate columns for each merged fire-climate record. 
#This dataframe will be used to create an embedded climate JSON Object
#Note, Date field left out of embedded climate object as its redundant
embeddedDf = mergedDf[['Station', 'Air_Temperature_Celcius', 'Relative_Humidity', 'WindSpeed_knots', 'Max_Wind_Speed', 'Precipitation ']]

#For each row of the fire dataframe, create a new column called 'Climate'.
#Create the embedded climate JSON objects from the embedded dataframe and
#store them in the Climate column of the fire dataframe.
#Note: the assignment is positional, so it relies on mergedDf having one row per fire
#in the same order as fireDf (an inner merge keeps the order of the left frame's keys).
fireDf["Climate"] = json.loads(embeddedDf.to_json(orient='records'))

#Convert the fire dataframe into a list of JSON objects, 
#with each object containing its embedded climate JSON object.
fireRecords = json.loads(fireDf.to_json(orient='records'))

#Convert the climate dataframe into a list of JSON objects 
#(plain climate records, nothing embedded).
climateRecords = json.loads(climateDf.to_json(orient='records'))

#Load the fireData JSON objects (with the embedded Climate object) 
#into the fire Collection in as2TaskB database
result = fireCollection.insert_many(fireRecords)

#Load the climateData JSON objects into the climate collection 
#in the as2TaskB database
result = climateCollection.insert_many(climateRecords)
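
The load cell above attaches the climate objects to the fire rows by position. A keyed lookup on `Date` makes the underlying assumption (exactly one climate row per fire date) explicit and fails loudly if it does not hold. The following is a minimal sketch in plain Python rather than pandas, not the method used in this workbook; `embed_climate` is a hypothetical helper, and the rows are abbreviated versions of documents shown elsewhere in this workbook.

```python
def embed_climate(fire_rows, climate_rows):
    """Attach to each fire the climate record for its Date, minus the Date field."""
    climate_by_date = {}
    for row in climate_rows:
        if row["Date"] in climate_by_date:
            raise ValueError("duplicate climate date: " + row["Date"])
        # Drop Date from the embedded object; the fire already carries it.
        climate_by_date[row["Date"]] = {k: v for k, v in row.items() if k != "Date"}

    fire_docs = []
    for fire in fire_rows:
        doc = dict(fire)  # copy so the input rows are left untouched
        # Raises KeyError if a fire has no climate row for its date.
        doc["Climate"] = dict(climate_by_date[fire["Date"]])
        fire_docs.append(doc)
    return fire_docs


climate_rows = [
    {"Date": "2016-12-31", "Station": 948700, "Air_Temperature_Celcius": 19},
    {"Date": "2017-12-27", "Station": 948702, "Air_Temperature_Celcius": 28},
]
fire_rows = [
    {"Date": "2017-12-27", "Datetime": "2017-12-27T04:16:51", "Confidence": 78},
]

fire_docs = embed_climate(fire_rows, climate_rows)
print(fire_docs[0]["Climate"])
```

The climate row for 2016-12-31 has no fire, so it is not embedded anywhere, which is why the separate climate collection is still loaded in full.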


### Verify the Load


Confirm the number of JSON documents inserted into both the 
**climate** and **fire** collections. 

(should be 366 climate records and 2668 fire records based on input csv files)

In [4]:
numClimateRecs = climateCollection.count_documents({}) #count the number of JSON documents
                                                       #inserted into the climate Collection
print("Number of Climate records:", numClimateRecs) #print the count. Should be 366.


numFireRecs = fireCollection.count_documents({}) #count the number of JSON documents 
                                                 #inserted into the fire Collection
print("Number of Fire records:", numFireRecs) #print the count. Should be 2668.


Number of Climate records: 366
Number of Fire records: 2668


Printing out a sample JSON document in the climate collection (first document) 

In [5]:
climateCollection.find()[0]

{'Air_Temperature_Celcius': 19,
 'Date': '2016-12-31',
 'Max_Wind_Speed': 11.1,
 'Precipitation ': ' 0.00I',
 'Relative_Humidity': 56.8,
 'Station': 948700,
 'WindSpeed_knots': 7.9,
 '_id': ObjectId('5bb072ee93436914703db35b')}

Printing out a sample JSON document in the fire collection (first document) 

In [6]:
fireCollection.find()[0]

{'Climate': {'Air_Temperature_Celcius': 28,
  'Max_Wind_Speed': 15.9,
  'Precipitation ': ' 0.00I',
  'Relative_Humidity': 58.3,
  'Station': 948702,
  'WindSpeed_knots': 9.3},
 'Confidence': 78,
 'Date': '2017-12-27',
 'Datetime': '2017-12-27T04:16:51',
 'Latitude': -37.966,
 'Longitude': 145.051,
 'Power': 26.7,
 'Surface_Temperature_Celcius': 68,
 'Surface_Temperature_Kelvin': 341.8,
 '_id': ObjectId('5bb072ee93436914703da8ef')}

## Q3) Querying the database 

### Task B2 - Find climate data on 15th December 2017

In [8]:
results = climateCollection.find({"Date":"2017-12-15"})                           

for result in results:
    pprint(result)

{'Air_Temperature_Celcius': 18,
 'Date': '2017-12-15',
 'Max_Wind_Speed': 14.0,
 'Precipitation ': ' 0.00I',
 'Relative_Humidity': 52.0,
 'Station': 948702,
 'WindSpeed_knots': 7.1,
 '_id': ObjectId('5bb072ee93436914703db4b7')}


### Task B3 - Find Latitude, Longitude, Surface Temperature and Confidence where Surface Temperature is between 65 and 100 degrees Celsius

In [17]:
results = fireCollection.find({"$and":[{"Surface_Temperature_Celcius":{"$gte":65}}, {"Surface_Temperature_Celcius":{"$lte":100}}]}, 
                              {"Latitude":1, "Longitude":1, "Surface_Temperature_Celcius":1, "Confidence":1}
                             )

for result in results:
    pprint(result)

{'Confidence': 78,
 'Latitude': -37.966,
 'Longitude': 145.051,
 'Surface_Temperature_Celcius': 68,
 '_id': ObjectId('5bb072ee93436914703da8ef')}
{'Confidence': 86,
 'Latitude': -35.543,
 'Longitude': 143.316,
 'Surface_Temperature_Celcius': 67,
 '_id': ObjectId('5bb072ee93436914703da8f2')}
{'Confidence': 93,
 'Latitude': -37.875,
 'Longitude': 142.51,
 'Surface_Temperature_Celcius': 73,
 '_id': ObjectId('5bb072ee93436914703da8f9')}
{'Confidence': 95,
 'Latitude': -37.613,
 'Longitude': 149.305,
 'Surface_Temperature_Celcius': 75,
 '_id': ObjectId('5bb072ee93436914703da8fb')}
{'Confidence': 90,
 'Latitude': -37.624,
 'Longitude': 149.314,
 'Surface_Temperature_Celcius': 66,
 '_id': ObjectId('5bb072ee93436914703da8fd')}
{'Confidence': 93,
 'Latitude': -38.057,
 'Longitude': 144.211,
 'Surface_Temperature_Celcius': 73,
 '_id': ObjectId('5bb072ee93436914703da900')}
{'Confidence': 92,
 'Latitude': -37.95,
 'Longitude': 142.366,
 'Surface_Temperature_Celcius': 70,
 '_id': ObjectId('5bb072ee

{'Confidence': 100,
 'Latitude': -37.3902,
 'Longitude': 148.2955,
 'Surface_Temperature_Celcius': 98,
 '_id': ObjectId('5bb072ee93436914703dad6f')}
{'Confidence': 100,
 'Latitude': -37.5027,
 'Longitude': 146.347,
 'Surface_Temperature_Celcius': 95,
 '_id': ObjectId('5bb072ee93436914703dad71')}
{'Confidence': 100,
 'Latitude': -37.5043,
 'Longitude': 146.3299,
 'Surface_Temperature_Celcius': 93,
 '_id': ObjectId('5bb072ee93436914703dad72')}
{'Confidence': 93,
 'Latitude': -36.1964,
 'Longitude': 144.5217,
 'Surface_Temperature_Celcius': 72,
 '_id': ObjectId('5bb072ee93436914703dad7d')}
{'Confidence': 91,
 'Latitude': -37.6397,
 'Longitude': 142.5968,
 'Surface_Temperature_Celcius': 68,
 '_id': ObjectId('5bb072ee93436914703dad8a')}
{'Confidence': 90,
 'Latitude': -37.6267,
 'Longitude': 142.9993,
 'Surface_Temperature_Celcius': 66,
 '_id': ObjectId('5bb072ee93436914703dad8f')}
{'Confidence': 93,
 'Latitude': -36.5871,
 'Longitude': 144.4958,
 'Surface_Temperature_Celcius': 72,
 '_id': 

 '_id': ObjectId('5bb072ee93436914703db15d')}
{'Confidence': 90,
 'Latitude': -36.1104,
 'Longitude': 145.9829,
 'Surface_Temperature_Celcius': 76,
 '_id': ObjectId('5bb072ee93436914703db15f')}
{'Confidence': 100,
 'Latitude': -37.782,
 'Longitude': 148.3844,
 'Surface_Temperature_Celcius': 99,
 '_id': ObjectId('5bb072ee93436914703db160')}
{'Confidence': 89,
 'Latitude': -36.9902,
 'Longitude': 141.879,
 'Surface_Temperature_Celcius': 65,
 '_id': ObjectId('5bb072ee93436914703db166')}
{'Confidence': 89,
 'Latitude': -37.745,
 'Longitude': 142.993,
 'Surface_Temperature_Celcius': 65,
 '_id': ObjectId('5bb072ee93436914703db167')}
{'Confidence': 92,
 'Latitude': -36.0157,
 'Longitude': 145.9414,
 'Surface_Temperature_Celcius': 70,
 '_id': ObjectId('5bb072ee93436914703db16b')}
{'Confidence': 99,
 'Latitude': -37.777,
 'Longitude': 148.4128,
 'Surface_Temperature_Celcius': 85,
 '_id': ObjectId('5bb072ee93436914703db16f')}
{'Confidence': 93,
 'Latitude': -37.5888,
 'Longitude': 148.5949,
 'Su
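
The `$and` filter used for this task can also be written more compactly, because MongoDB applies several operators listed for the same field together. The sketch below is an alternative formulation, not the submitted query; `range_filter` is a hypothetical helper, and `"_id": 0` is added to the projection on the assumption that the `_id` field is not wanted in the output.

```python
def range_filter(field, low, high):
    """Build an inclusive range filter on a single field (hypothetical helper)."""
    return {field: {"$gte": low, "$lte": high}}


# Same condition as the $and query: 65 <= Surface_Temperature_Celcius <= 100.
query = range_filter("Surface_Temperature_Celcius", 65, 100)

# Requested fields only; "_id": 0 suppresses the ObjectId in each result.
projection = {
    "_id": 0,
    "Latitude": 1,
    "Longitude": 1,
    "Surface_Temperature_Celcius": 1,
    "Confidence": 1,
}

# With a live connection:
#   results = fireCollection.find(query, projection)
```

The same helper would cover the Confidence range between 80 and 100, for example `range_filter("Confidence", 80, 100)`.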

### Task B4 - Find Surface Temperature, Air Temperature, Relative Humidity and Max Wind Speed on 15th and 16th Dec 2017

In [18]:
results = fireCollection.find({"Date":{"$in":["2017-12-15", "2017-12-16"]}}, 
                              {"Surface_Temperature_Celcius":1, "Climate.Air_Temperature_Celcius":1, "Climate.Relative_Humidity":1, "Climate.Max_Wind_Speed":1}
                             )

for result in results:
    pprint(result)

{'Climate': {'Air_Temperature_Celcius': 18,
             'Max_Wind_Speed': 13.0,
             'Relative_Humidity': 53.7},
 'Surface_Temperature_Celcius': 43,
 '_id': ObjectId('5bb072ee93436914703da8f6')}
{'Climate': {'Air_Temperature_Celcius': 18,
             'Max_Wind_Speed': 13.0,
             'Relative_Humidity': 53.7},
 'Surface_Temperature_Celcius': 33,
 '_id': ObjectId('5bb072ee93436914703da8f7')}
{'Climate': {'Air_Temperature_Celcius': 18,
             'Max_Wind_Speed': 13.0,
             'Relative_Humidity': 53.7},
 'Surface_Temperature_Celcius': 54,
 '_id': ObjectId('5bb072ee93436914703da8f8')}
{'Climate': {'Air_Temperature_Celcius': 18,
             'Max_Wind_Speed': 13.0,
             'Relative_Humidity': 53.7},
 'Surface_Temperature_Celcius': 73,
 '_id': ObjectId('5bb072ee93436914703da8f9')}
{'Climate': {'Air_Temperature_Celcius': 18,
             'Max_Wind_Speed': 13.0,
             'Relative_Humidity': 53.7},
 'Surface_Temperature_Celcius': 55,
 '_id': ObjectId('5bb072ee

### Task B5 - Find Datetime, Air Temperature, Surface Temperature and Confidence where Confidence is between 80 and 100

In [19]:
results = fireCollection.find({"$and":[{"Confidence": {"$gte":80}}, {"Confidence":{"$lte":100}}]}, 
                              {"Datetime":1, "Climate.Air_Temperature_Celcius":1, "Surface_Temperature_Celcius":1, "Confidence":1}
                             )

for result in results:
    pprint(result)

{'Climate': {'Air_Temperature_Celcius': 28},
 'Confidence': 82,
 'Datetime': '2017-12-27T00:02:15',
 'Surface_Temperature_Celcius': 63,
 '_id': ObjectId('5bb072ee93436914703da8f0')}
{'Climate': {'Air_Temperature_Celcius': 28},
 'Confidence': 86,
 'Datetime': '2017-12-27T00:02:14',
 'Surface_Temperature_Celcius': 67,
 '_id': ObjectId('5bb072ee93436914703da8f2')}
{'Climate': {'Air_Temperature_Celcius': 17},
 'Confidence': 80,
 'Datetime': '2017-12-25T04:29:08',
 'Surface_Temperature_Celcius': 54,
 '_id': ObjectId('5bb072ee93436914703da8f3')}
{'Climate': {'Air_Temperature_Celcius': 18},
 'Confidence': 94,
 'Datetime': '2017-12-16T15:38:39',
 'Surface_Temperature_Celcius': 43,
 '_id': ObjectId('5bb072ee93436914703da8f6')}
{'Climate': {'Air_Temperature_Celcius': 18},
 'Confidence': 93,
 'Datetime': '2017-12-16T04:35:13',
 'Surface_Temperature_Celcius': 73,
 '_id': ObjectId('5bb072ee93436914703da8f9')}
{'Climate': {'Air_Temperature_Celcius': 18},
 'Confidence': 84,
 'Datetime': '2017-12-16T0

 'Confidence': 83,
 'Datetime': '2017-05-15T04:26:20',
 'Surface_Temperature_Celcius': 56,
 '_id': ObjectId('5bb072ee93436914703daae2')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 82,
 'Datetime': '2017-05-15T04:26:20',
 'Surface_Temperature_Celcius': 55,
 '_id': ObjectId('5bb072ee93436914703daae6')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 85,
 'Datetime': '2017-05-15T04:26:20',
 'Surface_Temperature_Celcius': 66,
 '_id': ObjectId('5bb072ee93436914703daae8')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 90,
 'Datetime': '2017-05-15T04:26:20',
 'Surface_Temperature_Celcius': 78,
 '_id': ObjectId('5bb072ee93436914703daaea')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 86,
 'Datetime': '2017-05-15T04:26:20',
 'Surface_Temperature_Celcius': 60,
 '_id': ObjectId('5bb072ee93436914703daaed')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 82,
 'Datetime': '2017-05-15T04:26:20',
 'Surface_Temperature_Celcius': 55,

{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 97,
 'Datetime': '2017-05-04T04:44:40',
 'Surface_Temperature_Celcius': 80,
 '_id': ObjectId('5bb072ee93436914703daca4')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 86,
 'Datetime': '2017-05-04T04:44:40',
 'Surface_Temperature_Celcius': 61,
 '_id': ObjectId('5bb072ee93436914703daca6')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 88,
 'Datetime': '2017-05-04T00:28:30',
 'Surface_Temperature_Celcius': 73,
 '_id': ObjectId('5bb072ee93436914703daca9')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 87,
 'Datetime': '2017-05-04T00:25:30',
 'Surface_Temperature_Celcius': 71,
 '_id': ObjectId('5bb072ee93436914703dacab')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 100,
 'Datetime': '2017-05-04T00:25:10',
 'Surface_Temperature_Celcius': 87,
 '_id': ObjectId('5bb072ee93436914703dacac')}
{'Climate': {'Air_Temperature_Celcius': 10},
 'Confidence': 96,
 'Datetime': '2017-05-04T

 'Datetime': '2017-04-18T04:44:50',
 'Surface_Temperature_Celcius': 56,
 '_id': ObjectId('5bb072ee93436914703dae85')}
{'Climate': {'Air_Temperature_Celcius': 15},
 'Confidence': 80,
 'Datetime': '2017-04-18T04:44:50',
 'Surface_Temperature_Celcius': 54,
 '_id': ObjectId('5bb072ee93436914703dae86')}
{'Climate': {'Air_Temperature_Celcius': 15},
 'Confidence': 82,
 'Datetime': '2017-04-18T04:44:50',
 'Surface_Temperature_Celcius': 56,
 '_id': ObjectId('5bb072ee93436914703dae89')}
{'Climate': {'Air_Temperature_Celcius': 15},
 'Confidence': 95,
 'Datetime': '2017-04-18T04:44:50',
 'Surface_Temperature_Celcius': 76,
 '_id': ObjectId('5bb072ee93436914703dae8d')}
{'Climate': {'Air_Temperature_Celcius': 15},
 'Confidence': 81,
 'Datetime': '2017-04-18T04:44:50',
 'Surface_Temperature_Celcius': 54,
 '_id': ObjectId('5bb072ee93436914703dae91')}
{'Climate': {'Air_Temperature_Celcius': 15},
 'Confidence': 93,
 'Datetime': '2017-04-18T04:44:50',
 'Surface_Temperature_Celcius': 72,
 '_id': ObjectId('

 '_id': ObjectId('5bb072ee93436914703db064')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 91,
 'Datetime': '2017-04-13T04:26:30',
 'Surface_Temperature_Celcius': 69,
 '_id': ObjectId('5bb072ee93436914703db067')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 86,
 'Datetime': '2017-04-13T04:26:30',
 'Surface_Temperature_Celcius': 60,
 '_id': ObjectId('5bb072ee93436914703db06b')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 100,
 'Datetime': '2017-04-13T04:26:30',
 'Surface_Temperature_Celcius': 99,
 '_id': ObjectId('5bb072ee93436914703db071')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 88,
 'Datetime': '2017-04-13T04:26:30',
 'Surface_Temperature_Celcius': 63,
 '_id': ObjectId('5bb072ee93436914703db074')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 89,
 'Datetime': '2017-04-13T04:26:30',
 'Surface_Temperature_Celcius': 65,
 '_id': ObjectId('5bb072ee93436914703db076')}
{'Climate': {'Air_Temperature_Celcius': 16}

 '_id': ObjectId('5bb072ee93436914703db217')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 87,
 'Datetime': '2017-04-04T04:36:10',
 'Surface_Temperature_Celcius': 62,
 '_id': ObjectId('5bb072ee93436914703db218')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 90,
 'Datetime': '2017-04-04T04:36:00',
 'Surface_Temperature_Celcius': 76,
 '_id': ObjectId('5bb072ee93436914703db21a')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 100,
 'Datetime': '2017-04-04T04:35:00',
 'Surface_Temperature_Celcius': 95,
 '_id': ObjectId('5bb072ee93436914703db21d')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 83,
 'Datetime': '2017-04-04T04:34:50',
 'Surface_Temperature_Celcius': 57,
 '_id': ObjectId('5bb072ee93436914703db21e')}
{'Climate': {'Air_Temperature_Celcius': 16},
 'Confidence': 89,
 'Datetime': '2017-04-04T04:34:20',
 'Surface_Temperature_Celcius': 65,
 '_id': ObjectId('5bb072ee93436914703db220')}
{'Climate': {'Air_Temperature_Celcius': 16}

### Task B6 - Find top 10 records with highest surface temperature

In [12]:
results = fireCollection.find({}).sort([("Surface_Temperature_Celcius", pymongo.DESCENDING)]).limit(10) 

for result in results:
    pprint(result)

{'Climate': {'Air_Temperature_Celcius': 15,
             'Max_Wind_Speed': 9.9,
             'Precipitation ': ' 0.00I',
             'Relative_Humidity': 56.1,
             'Station': 948701,
             'WindSpeed_knots': 5.1},
 'Confidence': 100,
 'Date': '2017-04-18',
 'Datetime': '2017-04-18T04:52:00',
 'Latitude': -38.1665,
 'Longitude': 143.062,
 'Power': 239.8,
 'Surface_Temperature_Celcius': 124,
 'Surface_Temperature_Kelvin': 397.5,
 '_id': ObjectId('5bb072ee93436914703dad92')}
{'Climate': {'Air_Temperature_Celcius': 16,
             'Max_Wind_Speed': 12.0,
             'Precipitation ': ' 0.00I',
             'Relative_Humidity': 47.5,
             'Station': 948701,
             'WindSpeed_knots': 5.4},
 'Confidence': 100,
 'Date': '2017-04-04',
 'Datetime': '2017-04-04T04:32:50',
 'Latitude': -36.343,
 'Longitude': 142.1986,
 'Power': 233.4,
 'Surface_Temperature_Celcius': 123,
 'Surface_Temperature_Kelvin': 396.3,
 '_id': ObjectId('5bb072ee93436914703db23c')}
{'Climate':

### Task B7 - Find the number of fires in each day

In [15]:
results = fireCollection.aggregate([{"$group":{"_id":{"date":"$Date"}, "number_of_fires":{"$sum":1}}}])

for document in results:
    pprint(document)

{'_id': {'date': '2017-03-09'}, 'number_of_fires': 3}
{'_id': {'date': '2017-03-10'}, 'number_of_fires': 8}
{'_id': {'date': '2017-03-13'}, 'number_of_fires': 2}
{'_id': {'date': '2017-03-15'}, 'number_of_fires': 7}
{'_id': {'date': '2017-03-18'}, 'number_of_fires': 3}
{'_id': {'date': '2017-03-19'}, 'number_of_fires': 21}
{'_id': {'date': '2017-03-24'}, 'number_of_fires': 2}
{'_id': {'date': '2017-03-28'}, 'number_of_fires': 54}
{'_id': {'date': '2017-03-29'}, 'number_of_fires': 1}
{'_id': {'date': '2017-04-02'}, 'number_of_fires': 5}
{'_id': {'date': '2017-04-03'}, 'number_of_fires': 72}
{'_id': {'date': '2017-04-05'}, 'number_of_fires': 49}
{'_id': {'date': '2017-04-11'}, 'number_of_fires': 24}
{'_id': {'date': '2017-04-12'}, 'number_of_fires': 69}
{'_id': {'date': '2017-04-07'}, 'number_of_fires': 39}
{'_id': {'date': '2017-04-13'}, 'number_of_fires': 357}
{'_id': {'date': '2017-04-14'}, 'number_of_fires': 18}
{'_id': {'date': '2017-04-15'}, 'number_of_fires': 69}
{'_id': {'date': 

### Task B8 - Find the average Surface Temperature for each day

In [16]:
results = fireCollection.aggregate([{"$group":{"_id":{"date":"$Date"}, "avg_surface_temp":{"$avg":"$Surface_Temperature_Celcius"}}}])

for document in results:
    pprint(document)

{'_id': {'date': '2017-03-09'}, 'avg_surface_temp': 46.666666666666664}
{'_id': {'date': '2017-03-10'}, 'avg_surface_temp': 69.375}
{'_id': {'date': '2017-03-13'}, 'avg_surface_temp': 38.5}
{'_id': {'date': '2017-03-15'}, 'avg_surface_temp': 46.0}
{'_id': {'date': '2017-03-18'}, 'avg_surface_temp': 79.33333333333333}
{'_id': {'date': '2017-03-19'}, 'avg_surface_temp': 65.57142857142857}
{'_id': {'date': '2017-03-24'}, 'avg_surface_temp': 49.0}
{'_id': {'date': '2017-03-28'}, 'avg_surface_temp': 60.925925925925924}
{'_id': {'date': '2017-03-29'}, 'avg_surface_temp': 51.0}
{'_id': {'date': '2017-04-02'}, 'avg_surface_temp': 45.2}
{'_id': {'date': '2017-04-03'}, 'avg_surface_temp': 58.44444444444444}
{'_id': {'date': '2017-04-05'}, 'avg_surface_temp': 53.142857142857146}
{'_id': {'date': '2017-04-11'}, 'avg_surface_temp': 46.291666666666664}
{'_id': {'date': '2017-04-12'}, 'avg_surface_temp': 52.69565217391305}
{'_id': {'date': '2017-04-07'}, 'avg_surface_temp': 50.69230769230769}
{'_id':
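
The grouped results for the two per-day tasks come back in no particular order (in the output above, 2017-04-07 is listed after 2017-04-12). Adding a `$sort` stage after `$group` would return the days chronologically, since the dates are ISO-formatted strings. The following is a minimal sketch under that assumption, not the submitted queries; the helper names are hypothetical.

```python
def fires_per_day_pipeline():
    """Count fires per Date, sorted by date ascending (hypothetical helper)."""
    return [
        {"$group": {"_id": {"date": "$Date"}, "number_of_fires": {"$sum": 1}}},
        {"$sort": {"_id.date": 1}},
    ]


def avg_surface_temp_pipeline():
    """Average surface temperature per Date, sorted by date ascending."""
    return [
        {"$group": {"_id": {"date": "$Date"},
                    "avg_surface_temp": {"$avg": "$Surface_Temperature_Celcius"}}},
        {"$sort": {"_id.date": 1}},
    ]


# With a live connection:
#   for document in fireCollection.aggregate(fires_per_day_pipeline()):
#       pprint(document)
```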