# FIT 5148 - Distributed Databases and Big Data Assignment 2


### Task A. MongoDB Data Model

We used two datasets to decide on the data model:
>* hotspot_historic.csv (2668 rows)
>* climate_historic.csv (366 rows)

After exploring the datasets, we noticed that:
>- Dates are unique in the climate data.
>- The two datasets can be linked through the date column they share.
>- The cardinality between climate data and hotspot data is 1 to N.
>- This cardinality fits either a "one to few" or a "one to many" relationship: on average, one row of climate data maps to about 10 rows of hotspot data (about 100 climate rows have no hotspot data mapped to them, which further increases the number of hotspot rows per matched date).
>- For a "one to few" relationship we use the embedded model; for a "one to many" relationship (fewer than 1000) we prefer the reference model.
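
The cardinality observations above can be checked with a few lines of Python. This is a minimal sketch on made-up sample rows: the `sample_climate` and `sample_hotspot` lists and the `mapping_summary` helper are hypothetical and not part of the assignment data. On the real files, the same function could be applied to the rows read with `csv.DictReader`.

```python
from collections import Counter

def mapping_summary(climate_rows, hotspot_rows, key="date"):
    """Summarise how many hotspot rows map to each climate row via `key`."""
    hotspots_per_date = Counter(row[key] for row in hotspot_rows)
    climate_dates = [row[key] for row in climate_rows]
    # Climate dates that have at least one hotspot row
    matched = [d for d in climate_dates if hotspots_per_date[d] > 0]
    return {
        "dates_unique": len(climate_dates) == len(set(climate_dates)),
        "climate_rows_without_hotspot": len(climate_dates) - len(matched),
        "max_hotspots_per_date": max(hotspots_per_date.values(), default=0),
        "avg_hotspots_per_matched_date": (
            sum(hotspots_per_date[d] for d in matched) / len(matched)
            if matched else 0.0
        ),
    }

# Hypothetical sample rows, for illustration only.
sample_climate = [{"date": "27/12/2017"}, {"date": "26/12/2017"}, {"date": "25/12/2017"}]
sample_hotspot = [{"date": "27/12/2017"}, {"date": "27/12/2017"}, {"date": "25/12/2017"}]

summary = mapping_summary(sample_climate, sample_hotspot)
print(summary)
```

The maximum and the average per matched date are the two figures that matter when choosing between embedding and referencing.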

The situation is somewhat ambiguous, since both models would work. For instance, if the embedded model is used, query performance will be better because there is no need to query a separate collection. However, the primary focus of the task is to investigate fire incidents, so we may need to query the hotspot data directly, and embedding would make it hard to access those details as stand-alone entities.

Therefore, in this case, we have decided to use the **reference model**. It also ensures that independent entities can be searched and fetched easily and quickly. To keep the referencing simple, we introduce an integer id that acts as a reference to the `_id` of the hotspot data.

We have also incorporated **two-way referencing**, so that the hotspot data references the climate data as well, to cater for the possible need to trace a fire back to a particular sensor.

To enhance the performance of our model, we have also **denormalized** "surface_temperature_celcius" and "confidence" from the hotspot data into the climate data. This assumes that the task is analytical, with a high read-to-write ratio and no updates to the dataset (the need to update every copy being one of the major disadvantages of denormalization).
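
To illustrate the trade-off, the sketch below contrasts the two read paths on plain Python dictionaries shaped like the model's documents. The sample documents and both helper functions are hypothetical; they only mimic what the database would do and are not the queries used later.

```python
# Hypothetical in-memory documents shaped like the model (illustration only).
climate_doc = {
    "_id": 123,
    "date": "27/12/2017",
    "hotspot": [
        {"_id": 10001, "confidence": 78, "surface_temperature_celcius": 68},
        {"_id": 10002, "confidence": 82, "surface_temperature_celcius": 63},
    ],
}
hotspot_collection = {
    10001: {"_id": 10001, "latitude": "-37.966", "longitude": "145.051", "owner": 123},
    10002: {"_id": 10002, "latitude": "-35.541", "longitude": "143.311", "owner": 123},
}

def surface_temperatures(climate):
    """Denormalized read: answered from the climate document alone."""
    return [h["surface_temperature_celcius"] for h in climate["hotspot"]]

def hotspot_locations(climate, hotspots_by_id):
    """Referenced read: each reference is resolved in the hotspot collection."""
    return [
        (hotspots_by_id[h["_id"]]["latitude"], hotspots_by_id[h["_id"]]["longitude"])
        for h in climate["hotspot"]
    ]

print(surface_temperatures(climate_doc))
print(hotspot_locations(climate_doc, hotspot_collection))
```

Fields that are not copied into the climate document, such as the coordinates here, still require the second lookup.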


Below is an example of our model:

**Climate Data**

```
[
  {
    "_id": 123,
    "station": "948702",
    "date": "27/12/2017",
    "air_temperature_celcius": "28",
    "relative_humidity": "58.3",
    "windspeed_knots": "9.3",
    "max_wind_speed": "15.9",
    "precipitation": "0.00I",
    "hotspot": [
      { "_id": 10001, "confidence": "78", "surface_temperature_celcius": "68" },
      { "_id": 10002, "confidence": "82", "surface_temperature_celcius": "63" },
      { "_id": 10003, "confidence": "67", "surface_temperature_celcius": "53" },
      { "_id": 10004, "confidence": "86", "surface_temperature_celcius": "67" }
    ]
  }
]
```    
    
    
 **Hotspot Data**
```
[
  {
    "_id": 10001,
    "latitude": "-37.966",
    "longitude": "145.051",
    "datetime": "2017-12-27T04:16:51",
    "confidence": "78",
    "date": "27/12/2017",
    "surface_temperature_celcius": "68",
    "owner": 123
  },
  {
    "_id": 10002,
    "latitude": "-35.541",
    "longitude": "143.311",
    "datetime": "2017-12-27T00:02:15",
    "confidence": "82",
    "date": "27/12/2017",
    "surface_temperature_celcius": "63",
    "owner": 123
  }
]

```


Loading the climate historic file into a list of dictionaries

In [1]:
import csv

climate_historic = []

# Read each CSV row as a dictionary and attach an integer _id (101, 102, ...)
with open("climate_historic.csv", "r", newline="") as climate_infile:
    climate_reader = csv.DictReader(climate_infile)
    id_climate = 100
    for row in climate_reader:
        id_climate += 1
        row["_id"] = id_climate
        climate_historic.append(row)

# Reference: https://www.pythonforbeginners.com/csv/using-the-csv-module-in-python

Loading the hotspot historic file into a list of dictionaries

In [2]:
hotspot_historic = []

# Read each CSV row as a dictionary and attach an integer _id (10001, 10002, ...)
with open("hotspot_historic.csv", "r", newline="") as hotspot_infile:
    hotspot_reader = csv.DictReader(hotspot_infile)
    id_hotspot = 10000
    for row in hotspot_reader:
        id_hotspot += 1
        row["_id"] = id_hotspot
        hotspot_historic.append(row)

In [3]:
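final_hotspot = []
final_climate = []

# Climate -> hotspot: for every hotspot recorded on the same date, embed a
# reference (_id) plus the two denormalized fields.
# Note: the values copied here are still the CSV strings, because the integer
# conversion in the next loop runs after this one.
for each_climate in climate_historic:
    each_climate['hotspot'] = []
    for each_hotspot in hotspot_historic:
        if each_climate["date"] == each_hotspot["date"]:
            temp_dict = {}
            temp_dict["_id"] = each_hotspot["_id"]
            temp_dict["confidence"] = each_hotspot["confidence"]
            temp_dict["surface_temperature_celcius"] = each_hotspot["surface_temperature_celcius"]
            each_climate["hotspot"].append(temp_dict)
    final_climate.append(each_climate)

# Hotspot -> climate: convert the numeric fields to int (needed for the range
# queries below) and store the _id of the matching climate row as 'owner'.
for each_hotspot in hotspot_historic:
    each_hotspot['surface_temperature_celcius'] = int(each_hotspot['surface_temperature_celcius'])
    each_hotspot['confidence'] = int(each_hotspot['confidence'])
    for each_climate in climate_historic:
        if each_climate["date"] == each_hotspot["date"]:
            each_hotspot['owner'] = each_climate['_id']
    final_hotspot.append(each_hotspot)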
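
The nested loops above compare every climate row with every hotspot row. As an alternative, the sketch below groups the hotspot rows by date once and then links both sides through dictionary lookups. The `link_by_date` helper and the sample rows are hypothetical; the field names follow the ones used in this notebook. Unlike the cell above, it converts the numeric fields before copying them, so the embedded copies would hold integers too.

```python
from collections import defaultdict

def link_by_date(climate_rows, hotspot_rows):
    """Two-way link climate and hotspot rows on their shared `date` field."""
    climate_id_by_date = {c["date"]: c["_id"] for c in climate_rows}
    embedded_by_date = defaultdict(list)

    for h in hotspot_rows:
        # Convert numeric fields first so both copies share the same type.
        h["confidence"] = int(h["confidence"])
        h["surface_temperature_celcius"] = int(h["surface_temperature_celcius"])
        # Hotspot -> climate reference (only when a climate row exists).
        if h["date"] in climate_id_by_date:
            h["owner"] = climate_id_by_date[h["date"]]
        # Climate -> hotspot reference plus the denormalized fields.
        embedded_by_date[h["date"]].append({
            "_id": h["_id"],
            "confidence": h["confidence"],
            "surface_temperature_celcius": h["surface_temperature_celcius"],
        })

    for c in climate_rows:
        c["hotspot"] = embedded_by_date.get(c["date"], [])
    return climate_rows, hotspot_rows

# Hypothetical sample rows, for illustration only.
climate = [{"_id": 101, "date": "27/12/2017"}, {"_id": 102, "date": "26/12/2017"}]
hotspots = [
    {"_id": 10001, "date": "27/12/2017", "confidence": "78", "surface_temperature_celcius": "68"},
    {"_id": 10002, "date": "27/12/2017", "confidence": "82", "surface_temperature_celcius": "63"},
]
climate, hotspots = link_by_date(climate, hotspots)
print(climate[0]["hotspot"])
```

Each row is visited once, so the work grows with the number of rows rather than with the product of the two file sizes.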
        
            
            

Inserting rows into MongoDB

In [5]:
import pymongo
from pymongo import MongoClient

# Connect to MongoDB on the default host and port
client = MongoClient()

# Start from a clean database so that re-running the cell does not duplicate rows
client.drop_database('fit5148_assignment_db')
db = client['fit5148_assignment_db']
db['hotspot_historic'].insert_many(final_hotspot)
db['climate_historic'].insert_many(final_climate)





<pymongo.results.InsertManyResult at 0x7fe4aebbcf88>

###### Creating Index

In [6]:
db['climate_historic'].create_index([('date', pymongo.ASCENDING)], unique=True)
db['hotspot_historic'].create_index([('date', pymongo.ASCENDING)])

'date_1'

In [7]:
sorted(list(db['hotspot_historic'].index_information()))
sorted(list(db['climate_historic'].index_information()))

['_id_', 'date_1']

###### Task a: Climate data on 10th December 2017.


In [8]:
from pprint import pprint

result = db['climate_historic'].find({"date": "10/12/2017"})
for document in result:
    pprint(document)

{'_id': 444,
 'air_temperature_celcius': '17',
 'date': '10/12/2017',
 'hotspot': [{'_id': 10030,
              'confidence': '50',
              'surface_temperature_celcius': '38'},
             {'_id': 10031,
              'confidence': '67',
              'surface_temperature_celcius': '54'}],
 'max_wind_speed': '14',
 'precipitation ': ' 0.00I',
 'relative_humidity': '53.5',
 'station': '948702',
 'windspeed_knots': '7.3'}


###### Task b : latitude, longitude, surface temperature (°C), and confidence when the surface temperature (°C) was between 65 °C and 100 °C.


In [9]:
result = db['hotspot_historic'].find(
    {"surface_temperature_celcius": {"$gte": 65, "$lte": 100}},
    {"latitude": 1, "longitude": 1, "surface_temperature_celcius": 1, "confidence": 1}
)

for document in result:
    pprint(document)



{'_id': 10001,
 'confidence': 78,
 'latitude': '-37.966',
 'longitude': '145.051',
 'surface_temperature_celcius': 68}
{'_id': 10004,
 'confidence': 86,
 'latitude': '-35.543',
 'longitude': '143.316',
 'surface_temperature_celcius': 67}
{'_id': 10011,
 'confidence': 93,
 'latitude': '-37.875',
 'longitude': '142.51',
 'surface_temperature_celcius': 73}
{'_id': 10013,
 'confidence': 95,
 'latitude': '-37.613',
 'longitude': '149.305',
 'surface_temperature_celcius': 75}
{'_id': 10015,
 'confidence': 90,
 'latitude': '-37.624',
 'longitude': '149.314',
 'surface_temperature_celcius': 66}
{'_id': 10018,
 'confidence': 93,
 'latitude': '-38.057',
 'longitude': '144.211',
 'surface_temperature_celcius': 73}
{'_id': 10027,
 'confidence': 92,
 'latitude': '-37.95',
 'longitude': '142.366',
 'surface_temperature_celcius': 70}
{'_id': 10038,
 'confidence': 100,
 'latitude': '-36.282',
 'longitude': '146.157',
 'surface_temperature_celcius': 71}
{'_id': 10054,
 'confidence': 100,
 'latitude': '

 'confidence': 97,
 'latitude': '-36.4169',
 'longitude': '144.2994',
 'surface_temperature_celcius': 81}
{'_id': 10625,
 'confidence': 96,
 'latitude': '-36.8094',
 'longitude': '142.8885',
 'surface_temperature_celcius': 77}
{'_id': 10641,
 'confidence': 84,
 'latitude': '-37.8323',
 'longitude': '147.2232',
 'surface_temperature_celcius': 68}
{'_id': 10657,
 'confidence': 98,
 'latitude': '-38.2853',
 'longitude': '145.9519',
 'surface_temperature_celcius': 83}
{'_id': 10658,
 'confidence': 98,
 'latitude': '-38.2375',
 'longitude': '142.8363',
 'surface_temperature_celcius': 82}
{'_id': 10660,
 'confidence': 92,
 'latitude': '-37.4463',
 'longitude': '142.7829',
 'surface_temperature_celcius': 70}
{'_id': 10671,
 'confidence': 95,
 'latitude': '-37.6854',
 'longitude': '143.4543',
 'surface_temperature_celcius': 76}
{'_id': 10677,
 'confidence': 99,
 'latitude': '-38.0881',
 'longitude': '143.9391',
 'surface_temperature_celcius': 84}
{'_id': 10678,
 'confidence': 96,
 'latitude': 

 'latitude': '-36.3332',
 'longitude': '145.8594',
 'surface_temperature_celcius': 90}
{'_id': 11067,
 'confidence': 86,
 'latitude': '-37.6775',
 'longitude': '148.5131',
 'surface_temperature_celcius': 78}
{'_id': 11078,
 'confidence': 93,
 'latitude': '-36.9111',
 'longitude': '142.692',
 'surface_temperature_celcius': 72}
{'_id': 11084,
 'confidence': 99,
 'latitude': '-37.5115',
 'longitude': '143.14',
 'surface_temperature_celcius': 86}
{'_id': 11095,
 'confidence': 93,
 'latitude': '-37.769',
 'longitude': '147.94',
 'surface_temperature_celcius': 72}
{'_id': 11098,
 'confidence': 90,
 'latitude': '-36.1303',
 'longitude': '146.3606',
 'surface_temperature_celcius': 67}
{'_id': 11105,
 'confidence': 98,
 'latitude': '-36.1345',
 'longitude': '146.351',
 'surface_temperature_celcius': 84}
{'_id': 11112,
 'confidence': 99,
 'latitude': '-36.1586',
 'longitude': '145.6666',
 'surface_temperature_celcius': 85}
{'_id': 11116,
 'confidence': 96,
 'latitude': '-36.5847',
 'longitude': 

 'confidence': 89,
 'latitude': '-36.0295',
 'longitude': '143.6409',
 'surface_temperature_celcius': 65}
{'_id': 11656,
 'confidence': 100,
 'latitude': '-37.0761',
 'longitude': '141.0574',
 'surface_temperature_celcius': 89}
{'_id': 11658,
 'confidence': 93,
 'latitude': '-35.0616',
 'longitude': '141.4417',
 'surface_temperature_celcius': 73}
{'_id': 11659,
 'confidence': 100,
 'latitude': '-36.3999',
 'longitude': '144.737',
 'surface_temperature_celcius': 88}
{'_id': 11665,
 'confidence': 91,
 'latitude': '-36.1748',
 'longitude': '145.7582',
 'surface_temperature_celcius': 69}
{'_id': 11666,
 'confidence': 99,
 'latitude': '-36.0691',
 'longitude': '145.7797',
 'surface_temperature_celcius': 85}
{'_id': 11669,
 'confidence': 90,
 'latitude': '-35.1949',
 'longitude': '141.0622',
 'surface_temperature_celcius': 66}
{'_id': 11672,
 'confidence': 100,
 'latitude': '-36.91',
 'longitude': '141.2705',
 'surface_temperature_celcius': 96}
{'_id': 11675,
 'confidence': 89,
 'latitude': 

 'surface_temperature_celcius': 95}
{'_id': 12155,
 'confidence': 100,
 'latitude': '-37.8429',
 'longitude': '143.8366',
 'surface_temperature_celcius': 88}
{'_id': 12157,
 'confidence': 77,
 'latitude': '-37.8062',
 'longitude': '143.3598',
 'surface_temperature_celcius': 65}
{'_id': 12159,
 'confidence': 100,
 'latitude': '-36.9918',
 'longitude': '141.8667',
 'surface_temperature_celcius': 97}
{'_id': 12161,
 'confidence': 90,
 'latitude': '-36.1104',
 'longitude': '145.9829',
 'surface_temperature_celcius': 76}
{'_id': 12162,
 'confidence': 100,
 'latitude': '-37.782',
 'longitude': '148.3844',
 'surface_temperature_celcius': 99}
{'_id': 12168,
 'confidence': 89,
 'latitude': '-36.9902',
 'longitude': '141.879',
 'surface_temperature_celcius': 65}
{'_id': 12169,
 'confidence': 89,
 'latitude': '-37.745',
 'longitude': '142.993',
 'surface_temperature_celcius': 65}
{'_id': 12173,
 'confidence': 92,
 'latitude': '-36.0157',
 'longitude': '145.9414',
 'surface_temperature_celcius': 7
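
The range filter in Task b relies on `surface_temperature_celcius` having been converted to an integer before insertion. A minimal pure-Python sketch of why this matters: values read from a CSV are strings, and strings are ordered character by character rather than numerically. The sample values are made up. MongoDB applies its own comparison rules, under which a numeric range condition such as `{"$gte": 65}` matches only numeric values; the snippet only illustrates the Python side of the problem.

```python
raw_values = ["68", "100", "9", "65"]  # hypothetical CSV strings

# Character-by-character comparison: "65" sorts after "100",
# so the string range "65".."100" is empty.
as_strings = [v for v in raw_values if "65" <= v <= "100"]

# Numeric comparison after conversion gives the intended range.
as_ints = [int(v) for v in raw_values if 65 <= int(v) <= 100]

print(as_strings)
print(as_ints)
```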

###### Task c: Date, surface temperature (°C), air temperature (°C), relative humidity and max wind speed on 15th and 16th of December 2017.

In [10]:
# Surface temperature is read from the denormalized copies in the 'hotspot' array
result = db['climate_historic'].find(
    {"date": {"$in": ['15/12/2017', '16/12/2017']}},
    {"date": 1, "air_temperature_celcius": 1, "hotspot.surface_temperature_celcius": 1,
     "relative_humidity": 1, "max_wind_speed": 1}
)

for document in result:
    pprint(document)
    print('\n')



{'_id': 449,
 'air_temperature_celcius': '18',
 'date': '15/12/2017',
 'hotspot': [{'surface_temperature_celcius': '42'},
             {'surface_temperature_celcius': '36'},
             {'surface_temperature_celcius': '38'},
             {'surface_temperature_celcius': '40'}],
 'max_wind_speed': '14',
 'relative_humidity': '52'}


{'_id': 450,
 'air_temperature_celcius': '18',
 'date': '16/12/2017',
 'hotspot': [{'surface_temperature_celcius': '43'},
             {'surface_temperature_celcius': '33'},
             {'surface_temperature_celcius': '54'},
             {'surface_temperature_celcius': '73'},
             {'surface_temperature_celcius': '55'},
             {'surface_temperature_celcius': '75'},
             {'surface_temperature_celcius': '55'},
             {'surface_temperature_celcius': '66'},
             {'surface_temperature_celcius': '56'},
             {'surface_temperature_celcius': '60'},
             {'surface_temperature_celcius': '73'},
             {'surface_t

###### Task d: Datetime, air temperature (°C), surface temperature (°C) and confidence when the confidence is between 80 and 100.

In [11]:
results = db['hotspot_historic'].aggregate([
    # Filter first so that only the matching hotspots are joined
    {"$match": {"confidence": {"$gte": 80, "$lte": 100}}},
    # Join each hotspot with the climate document of the same date
    {"$lookup": {
        "from": "climate_historic",
        "localField": "date",
        "foreignField": "date",
        "as": "climate_historic"
    }},
    {"$project": {
        "datetime": 1,
        "surface_temperature_celcius": 1,
        "confidence": 1,
        "climate_historic.air_temperature_celcius": 1
    }}
])

for document in results:
    pprint(document)
    print('\n')

{'_id': 10002,
 'climate_historic': [{'air_temperature_celcius': '28'}],
 'confidence': 82,
 'datetime': '2017-12-27T00:02:15',
 'surface_temperature_celcius': 63}


{'_id': 10004,
 'climate_historic': [{'air_temperature_celcius': '28'}],
 'confidence': 86,
 'datetime': '2017-12-27T00:02:14',
 'surface_temperature_celcius': 67}


{'_id': 10005,
 'climate_historic': [{'air_temperature_celcius': '17'}],
 'confidence': 80,
 'datetime': '2017-12-25T04:29:08',
 'surface_temperature_celcius': 54}


{'_id': 10008,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 94,
 'datetime': '2017-12-16T15:38:39',
 'surface_temperature_celcius': 43}


{'_id': 10011,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 93,
 'datetime': '2017-12-16T04:35:13',
 'surface_temperature_celcius': 73}


{'_id': 10012,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 84,
 'datetime': '2017-12-16T04:34:58',
 'surface_temperature_celcius': 55}


{'_i

 'surface_temperature_celcius': 61}


{'_id': 10093,
 'climate_historic': [{'air_temperature_celcius': '24'}],
 'confidence': 81,
 'datetime': '2017-11-13T03:52:14',
 'surface_temperature_celcius': 53}


{'_id': 10101,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 89,
 'datetime': '2017-11-12T00:33:15',
 'surface_temperature_celcius': 69}


{'_id': 10103,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 80,
 'datetime': '2017-11-11T13:30:08',
 'surface_temperature_celcius': 37}


{'_id': 10104,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 100,
 'datetime': '2017-11-11T13:30:08',
 'surface_temperature_celcius': 59}


{'_id': 10105,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 86,
 'datetime': '2017-11-11T04:04:25',
 'surface_temperature_celcius': 60}


{'_id': 10109,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 85,
 'datetime': '2017-11-09T04:16:48',
 's

{'_id': 10214,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 100,
 'datetime': '2017-09-24T15:07:47',
 'surface_temperature_celcius': 65}


{'_id': 10217,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 90,
 'datetime': '2017-09-24T15:07:47',
 'surface_temperature_celcius': 41}


{'_id': 10218,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 99,
 'datetime': '2017-09-24T15:07:47',
 'surface_temperature_celcius': 61}


{'_id': 10221,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 94,
 'datetime': '2017-09-24T15:07:45',
 'surface_temperature_celcius': 43}


{'_id': 10222,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 100,
 'datetime': '2017-09-24T15:07:45',
 'surface_temperature_celcius': 61}


{'_id': 10223,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 100,
 'datetime': '2017-09-24T13:30:10',
 'surface_temperature_celcius': 47}


{

 'surface_temperature_celcius': 101}


{'_id': 10412,
 'climate_historic': [{'air_temperature_celcius': '17'}],
 'confidence': 91,
 'datetime': '2017-05-22T04:32:20',
 'surface_temperature_celcius': 68}


{'_id': 10414,
 'climate_historic': [{'air_temperature_celcius': '17'}],
 'confidence': 96,
 'datetime': '2017-05-22T04:32:20',
 'surface_temperature_celcius': 78}


{'_id': 10417,
 'climate_historic': [{'air_temperature_celcius': '17'}],
 'confidence': 98,
 'datetime': '2017-05-22T04:32:20',
 'surface_temperature_celcius': 82}


{'_id': 10420,
 'climate_historic': [{'air_temperature_celcius': '17'}],
 'confidence': 100,
 'datetime': '2017-05-22T04:32:20',
 'surface_temperature_celcius': 93}


{'_id': 10421,
 'climate_historic': [{'air_temperature_celcius': '17'}],
 'confidence': 89,
 'datetime': '2017-05-22T04:32:20',
 'surface_temperature_celcius': 65}


{'_id': 10422,
 'climate_historic': [{'air_temperature_celcius': '17'}],
 'confidence': 85,
 'datetime': '2017-05-22T00:15:00',
 '

 'surface_temperature_celcius': 77}


{'_id': 10627,
 'climate_historic': [{'air_temperature_celcius': '10'}],
 'confidence': 84,
 'datetime': '2017-05-10T04:16:10',
 'surface_temperature_celcius': 58}


{'_id': 10630,
 'climate_historic': [{'air_temperature_celcius': '10'}],
 'confidence': 88,
 'datetime': '2017-05-10T04:14:20',
 'surface_temperature_celcius': 64}


{'_id': 10633,
 'climate_historic': [{'air_temperature_celcius': '10'}],
 'confidence': 100,
 'datetime': '2017-05-10T04:14:10',
 'surface_temperature_celcius': 103}


{'_id': 10636,
 'climate_historic': [{'air_temperature_celcius': '10'}],
 'confidence': 87,
 'datetime': '2017-05-10T04:12:50',
 'surface_temperature_celcius': 63}


{'_id': 10637,
 'climate_historic': [{'air_temperature_celcius': '10'}],
 'confidence': 86,
 'datetime': '2017-05-10T04:12:50',
 'surface_temperature_celcius': 60}


{'_id': 10639,
 'climate_historic': [{'air_temperature_celcius': '10'}],
 'confidence': 83,
 'datetime': '2017-05-10T04:11:30',
 '



{'_id': 10801,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 84,
 'datetime': '2017-05-05T03:53:10',
 'surface_temperature_celcius': 57}


{'_id': 10802,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 95,
 'datetime': '2017-05-05T03:53:10',
 'surface_temperature_celcius': 76}


{'_id': 10804,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 82,
 'datetime': '2017-05-05T03:51:50',
 'surface_temperature_celcius': 55}


{'_id': 10810,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 82,
 'datetime': '2017-05-05T03:50:20',
 'surface_temperature_celcius': 55}


{'_id': 10814,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 82,
 'datetime': '2017-05-05T03:50:20',
 'surface_temperature_celcius': 56}


{'_id': 10815,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 97,
 'datetime': '2017-05-05T03:50:20',
 'surface_temperature_celcius': 80}


{'

 'confidence': 100,
 'datetime': '2017-05-01T04:14:20',
 'surface_temperature_celcius': 94}


{'_id': 11050,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 82,
 'datetime': '2017-05-01T04:14:20',
 'surface_temperature_celcius': 62}


{'_id': 11051,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 95,
 'datetime': '2017-05-01T04:14:20',
 'surface_temperature_celcius': 76}


{'_id': 11054,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 85,
 'datetime': '2017-05-01T04:14:20',
 'surface_temperature_celcius': 59}


{'_id': 11055,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 87,
 'datetime': '2017-05-01T04:14:20',
 'surface_temperature_celcius': 117}


{'_id': 11056,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 87,
 'datetime': '2017-04-29T04:33:00',
 'surface_temperature_celcius': 90}


{'_id': 11060,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 '

 'datetime': '2017-04-19T03:50:30',
 'surface_temperature_celcius': 72}


{'_id': 11168,
 'climate_historic': [{'air_temperature_celcius': '22'}],
 'confidence': 84,
 'datetime': '2017-04-19T03:50:30',
 'surface_temperature_celcius': 58}


{'_id': 11169,
 'climate_historic': [{'air_temperature_celcius': '22'}],
 'confidence': 85,
 'datetime': '2017-04-19T03:50:30',
 'surface_temperature_celcius': 59}


{'_id': 11175,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 86,
 'datetime': '2017-04-18T04:56:20',
 'surface_temperature_celcius': 61}


{'_id': 11176,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 82,
 'datetime': '2017-04-18T04:55:00',
 'surface_temperature_celcius': 56}


{'_id': 11180,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 91,
 'datetime': '2017-04-18T04:54:40',
 'surface_temperature_celcius': 68}


{'_id': 11185,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 90,
 'da

 'confidence': 92,
 'datetime': '2017-04-18T04:44:50',
 'surface_temperature_celcius': 70}


{'_id': 11379,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 100,
 'datetime': '2017-04-18T04:44:50',
 'surface_temperature_celcius': 98}


{'_id': 11380,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 81,
 'datetime': '2017-04-18T04:44:50',
 'surface_temperature_celcius': 54}


{'_id': 11382,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 100,
 'datetime': '2017-04-18T04:44:50',
 'surface_temperature_celcius': 96}


{'_id': 11385,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 80,
 'datetime': '2017-04-18T04:44:50',
 'surface_temperature_celcius': 53}


{'_id': 11387,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 86,
 'datetime': '2017-04-18T04:44:50',
 'surface_temperature_celcius': 60}


{'_id': 11393,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 '

 'surface_temperature_celcius': 54}


{'_id': 11621,
 'climate_historic': [{'air_temperature_celcius': '12'}],
 'confidence': 100,
 'datetime': '2017-04-15T04:14:20',
 'surface_temperature_celcius': 100}


{'_id': 11624,
 'climate_historic': [{'air_temperature_celcius': '13'}],
 'confidence': 82,
 'datetime': '2017-04-14T05:15:50',
 'surface_temperature_celcius': 55}


{'_id': 11628,
 'climate_historic': [{'air_temperature_celcius': '13'}],
 'confidence': 87,
 'datetime': '2017-04-14T05:09:10',
 'surface_temperature_celcius': 62}


{'_id': 11630,
 'climate_historic': [{'air_temperature_celcius': '13'}],
 'confidence': 84,
 'datetime': '2017-04-14T05:09:10',
 'surface_temperature_celcius': 79}


{'_id': 11632,
 'climate_historic': [{'air_temperature_celcius': '13'}],
 'confidence': 84,
 'datetime': '2017-04-14T05:09:10',
 'surface_temperature_celcius': 78}


{'_id': 11635,
 'climate_historic': [{'air_temperature_celcius': '13'}],
 'confidence': 86,
 'datetime': '2017-04-14T03:35:20',
 '

 'climate_historic': [{'air_temperature_celcius': '16'}],
 'confidence': 81,
 'datetime': '2017-04-13T04:26:30',
 'surface_temperature_celcius': 55}


{'_id': 11832,
 'climate_historic': [{'air_temperature_celcius': '16'}],
 'confidence': 85,
 'datetime': '2017-04-13T04:26:30',
 'surface_temperature_celcius': 59}


{'_id': 11834,
 'climate_historic': [{'air_temperature_celcius': '16'}],
 'confidence': 81,
 'datetime': '2017-04-13T04:26:30',
 'surface_temperature_celcius': 55}


{'_id': 11835,
 'climate_historic': [{'air_temperature_celcius': '16'}],
 'confidence': 86,
 'datetime': '2017-04-13T04:26:30',
 'surface_temperature_celcius': 60}


{'_id': 11836,
 'climate_historic': [{'air_temperature_celcius': '16'}],
 'confidence': 81,
 'datetime': '2017-04-13T04:26:30',
 'surface_temperature_celcius': 54}


{'_id': 11837,
 'climate_historic': [{'air_temperature_celcius': '16'}],
 'confidence': 94,
 'datetime': '2017-04-13T04:26:30',
 'surface_temperature_celcius': 75}


{'_id': 11839,
 'cl

 'surface_temperature_celcius': 55}


{'_id': 11994,
 'climate_historic': [{'air_temperature_celcius': '16'}],
 'confidence': 96,
 'datetime': '2017-04-13T00:07:10',
 'surface_temperature_celcius': 79}


{'_id': 11995,
 'climate_historic': [{'air_temperature_celcius': '16'}],
 'confidence': 95,
 'datetime': '2017-04-13T00:07:00',
 'surface_temperature_celcius': 76}


{'_id': 11998,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 93,
 'datetime': '2017-04-12T13:15:10',
 'surface_temperature_celcius': 43}


{'_id': 12002,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 100,
 'datetime': '2017-04-12T05:28:10',
 'surface_temperature_celcius': 106}


{'_id': 12004,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 84,
 'datetime': '2017-04-12T05:28:00',
 'surface_temperature_celcius': 58}


{'_id': 12009,
 'climate_historic': [{'air_temperature_celcius': '14'}],
 'confidence': 94,
 'datetime': '2017-04-12T05:21:40',
 '

 'surface_temperature_celcius': 115}


{'_id': 12210,
 'climate_historic': [{'air_temperature_celcius': '19'}],
 'confidence': 84,
 'datetime': '2017-04-06T04:21:00',
 'surface_temperature_celcius': 58}


{'_id': 12211,
 'climate_historic': [{'air_temperature_celcius': '19'}],
 'confidence': 100,
 'datetime': '2017-04-06T04:20:50',
 'surface_temperature_celcius': 95}


{'_id': 12212,
 'climate_historic': [{'air_temperature_celcius': '19'}],
 'confidence': 88,
 'datetime': '2017-04-06T04:20:50',
 'surface_temperature_celcius': 65}


{'_id': 12214,
 'climate_historic': [{'air_temperature_celcius': '19'}],
 'confidence': 83,
 'datetime': '2017-04-06T04:20:50',
 'surface_temperature_celcius': 63}


{'_id': 12216,
 'climate_historic': [{'air_temperature_celcius': '19'}],
 'confidence': 81,
 'datetime': '2017-04-06T04:20:40',
 'surface_temperature_celcius': 60}


{'_id': 12217,
 'climate_historic': [{'air_temperature_celcius': '19'}],
 'confidence': 83,
 'datetime': '2017-04-06T04:20:40',
 '

 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 83,
 'datetime': '2017-04-03T13:15:30',
 'surface_temperature_celcius': 40}


{'_id': 12409,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 100,
 'datetime': '2017-04-03T13:15:10',
 'surface_temperature_celcius': 62}


{'_id': 12410,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 96,
 'datetime': '2017-04-03T13:15:10',
 'surface_temperature_celcius': 45}


{'_id': 12412,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 87,
 'datetime': '2017-04-03T03:59:10',
 'surface_temperature_celcius': 96}


{'_id': 12414,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 93,
 'datetime': '2017-04-03T03:57:50',
 'surface_temperature_celcius': 72}


{'_id': 12416,
 'climate_historic': [{'air_temperature_celcius': '15'}],
 'confidence': 82,
 'datetime': '2017-04-03T03:56:20',
 'surface_temperature_celcius': 55}


{'_id': 12418,
 'c

 'confidence': 97,
 'datetime': '2017-03-28T04:30:10',
 'surface_temperature_celcius': 80}


{'_id': 12532,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 81,
 'datetime': '2017-03-28T04:28:30',
 'surface_temperature_celcius': 54}


{'_id': 12534,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 88,
 'datetime': '2017-03-28T04:28:30',
 'surface_temperature_celcius': 63}


{'_id': 12535,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 87,
 'datetime': '2017-03-28T04:28:30',
 'surface_temperature_celcius': 62}


{'_id': 12537,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 97,
 'datetime': '2017-03-28T04:27:40',
 'surface_temperature_celcius': 80}


{'_id': 12538,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 'confidence': 100,
 'datetime': '2017-03-28T04:27:20',
 'surface_temperature_celcius': 107}


{'_id': 12541,
 'climate_historic': [{'air_temperature_celcius': '18'}],
 '

 'datetime': '2017-03-12T04:29:50',
 'surface_temperature_celcius': 71}


{'_id': 12650,
 'climate_historic': [{'air_temperature_celcius': '21'}],
 'confidence': 100,
 'datetime': '2017-03-12T04:28:20',
 'surface_temperature_celcius': 99}


{'_id': 12651,
 'climate_historic': [{'air_temperature_celcius': '21'}],
 'confidence': 80,
 'datetime': '2017-03-12T04:28:00',
 'surface_temperature_celcius': 68}


{'_id': 12652,
 'climate_historic': [{'air_temperature_celcius': '21'}],
 'confidence': 85,
 'datetime': '2017-03-12T04:27:20',
 'surface_temperature_celcius': 98}


{'_id': 12653,
 'climate_historic': [{'air_temperature_celcius': '19'}],
 'confidence': 100,
 'datetime': '2017-03-10T04:48:40',
 'surface_temperature_celcius': 105}


{'_id': 12654,
 'climate_historic': [{'air_temperature_celcius': '19'}],
 'confidence': 100,
 'datetime': '2017-03-10T04:46:20',
 'surface_temperature_celcius': 109}


{'_id': 12655,
 'climate_historic': [{'air_temperature_celcius': '19'}],
 'confidence': 94,

###### Task e:  Top 10 records with the highest surface temperature (°C).

In [12]:
# Sort by surface temperature, highest first, and keep the first 10 documents
result = db['hotspot_historic'].find().sort("surface_temperature_celcius", pymongo.DESCENDING).limit(10)

for document in result:
    pprint(document)
    print('\n')



{'_id': 11188,
 'confidence': 100,
 'date': '18/04/2017',
 'datetime': '2017-04-18T04:52:00',
 'latitude': '-38.1665',
 'longitude': '143.062',
 'owner': 208,
 'surface_temperature_celcius': 124}


{'_id': 12382,
 'confidence': 100,
 'date': '4/04/2017',
 'datetime': '2017-04-04T04:32:50',
 'latitude': '-36.343',
 'longitude': '142.1986',
 'owner': 194,
 'surface_temperature_celcius': 123}


{'_id': 11046,
 'confidence': 100,
 'date': '1/05/2017',
 'datetime': '2017-05-01T04:14:20',
 'latitude': '-36.9318',
 'longitude': '143.0907',
 'owner': 221,
 'surface_temperature_celcius': 122}


{'_id': 12621,
 'confidence': 100,
 'date': '18/03/2017',
 'datetime': '2017-03-18T03:50:50',
 'latitude': '-37.017',
 'longitude': '148.1297',
 'owner': 177,
 'surface_temperature_celcius': 121}


{'_id': 10552,
 'confidence': 100,
 'date': '13/05/2017',
 'datetime': '2017-05-13T04:40:20',
 'latitude': '-34.9938',
 'longitude': '141.876',
 'owner': 233,
 'surface_temperature_celcius': 120}


{'_id': 113
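
As a cross-check that does not depend on the database, the same top-N selection can be sketched in memory with `heapq.nlargest`. The helper name and the sample rows below are hypothetical; on the real data, the list of hotspot dictionaries built earlier could be passed instead.

```python
import heapq

def top_n_by_surface_temperature(rows, n=10):
    """Return the `n` rows with the highest surface temperature, hottest first."""
    return heapq.nlargest(n, rows, key=lambda r: r["surface_temperature_celcius"])

# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"_id": 1, "surface_temperature_celcius": 68},
    {"_id": 2, "surface_temperature_celcius": 124},
    {"_id": 3, "surface_temperature_celcius": 97},
]
top_two = top_n_by_surface_temperature(sample_rows, n=2)
print([r["_id"] for r in top_two])
```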

###### Task f: Number of fires on each day. Only the total number of fires and the date are displayed in the output.


In [13]:
# Group by date: the group key (_id) holds the date, 'count' the number of fires
result = db['hotspot_historic'].aggregate([
    {"$group": {"_id": "$date", "count": {"$sum": 1}}}
])

for document in result:
    pprint(document)



{'_id': '6/03/2017', 'count': 2}
{'_id': '7/03/2017', 'count': 1}
{'_id': '9/03/2017', 'count': 3}
{'_id': '12/03/2017', 'count': 5}
{'_id': '13/03/2017', 'count': 2}
{'_id': '14/03/2017', 'count': 10}
{'_id': '18/03/2017', 'count': 3}
{'_id': '19/03/2017', 'count': 21}
{'_id': '28/03/2017', 'count': 54}
{'_id': '29/03/2017', 'count': 1}
{'_id': '31/03/2017', 'count': 22}
{'_id': '2/04/2017', 'count': 5}
{'_id': '4/04/2017', 'count': 89}
{'_id': '5/04/2017', 'count': 49}
{'_id': '7/04/2017', 'count': 39}
{'_id': '15/04/2017', 'count': 69}
{'_id': '17/04/2017', 'count': 38}
{'_id': '18/04/2017', 'count': 325}
{'_id': '19/04/2017', 'count': 50}
{'_id': '20/04/2017', 'count': 31}
{'_id': '23/04/2017', 'count': 19}
{'_id': '24/04/2017', 'count': 8}
{'_id': '25/04/2017', 'count': 3}
{'_id': '26/04/2017', 'count': 1}
{'_id': '29/04/2017', 'count': 3}
{'_id': '4/05/2017', 'count': 135}
{'_id': '3/04/2017', 'count': 72}
{'_id': '6/05/2017', 'count': 17}
{'_id': '7/05/2017', 'count': 3}
{'_id':
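
The `date` values are day/month/year strings, so the grouped results are not guaranteed to come back in chronological order, and sorting the strings directly would order them character by character. Below is a minimal sketch that sorts the per-day counts by parsed date on the client side. `sort_by_date` is a hypothetical helper, and the sample documents are taken from the output above.

```python
from datetime import datetime

def sort_by_date(docs, field="_id", fmt="%d/%m/%Y"):
    """Sort documents chronologically by a day/month/year string field."""
    return sorted(docs, key=lambda d: datetime.strptime(d[field], fmt))

counts = [
    {"_id": "4/05/2017", "count": 135},
    {"_id": "3/04/2017", "count": 72},
    {"_id": "29/04/2017", "count": 3},
]
for doc in sort_by_date(counts):
    print(doc["_id"], doc["count"])
```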

###### Task g: Average surface temperature (°C) for each day.
(The average surface temperature (°C) and the date are displayed in the output.)


In [14]:
# Group by date and average the surface temperature of that day's hotspots
result = db['hotspot_historic'].aggregate([
    {"$group": {"_id": "$date",
                "Average Temperature on the day": {"$avg": "$surface_temperature_celcius"}}}
])

for document in result:
    pprint(document)



{'Average Temperature on the day': 60.5, '_id': '6/03/2017'}
{'Average Temperature on the day': 64.0, '_id': '7/03/2017'}
{'Average Temperature on the day': 46.666666666666664, '_id': '9/03/2017'}
{'Average Temperature on the day': 88.2, '_id': '12/03/2017'}
{'Average Temperature on the day': 38.5, '_id': '13/03/2017'}
{'Average Temperature on the day': 65.6, '_id': '14/03/2017'}
{'Average Temperature on the day': 79.33333333333333, '_id': '18/03/2017'}
{'Average Temperature on the day': 65.57142857142857, '_id': '19/03/2017'}
{'Average Temperature on the day': 60.925925925925924, '_id': '28/03/2017'}
{'Average Temperature on the day': 51.0, '_id': '29/03/2017'}
{'Average Temperature on the day': 48.72727272727273, '_id': '31/03/2017'}
{'Average Temperature on the day': 45.2, '_id': '2/04/2017'}
{'Average Temperature on the day': 62.57303370786517, '_id': '4/04/2017'}
{'Average Temperature on the day': 53.142857142857146, '_id': '5/04/2017'}
{'Average Temperature on the day': 50.69230769230769, '_id'
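
The per-day averages can be cross-checked in memory without the database. The sketch below is a minimal illustration: the helper name and the sample rows are hypothetical, and the rows assume that the surface temperature has already been converted to an integer, as in the hotspot collection.

```python
from collections import defaultdict
from statistics import mean

def average_surface_temperature_by_date(rows):
    """Return a {date: average surface temperature} mapping."""
    temps = defaultdict(list)
    for r in rows:
        temps[r["date"]].append(r["surface_temperature_celcius"])
    return {date: mean(values) for date, values in temps.items()}

# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"date": "6/03/2017", "surface_temperature_celcius": 55},
    {"date": "6/03/2017", "surface_temperature_celcius": 66},
    {"date": "7/03/2017", "surface_temperature_celcius": 64},
]
averages = average_surface_temperature_by_date(sample_rows)
print(averages)
```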