# Workshop part 2: Data storage
---
## Assignments notebook
---

## Contents of the notebook

- Importing the required modules 
- Reading the datasets

- Creating the data models
- Creating the import functions 

- Connecting to the MongoDB datastores
- Using the import functions

## Assignments in notebook
1. Assignment 1: Complete the Tracker document with the correct FieldTypes.
2. Assignment 2: Complete the Transmission document with the correct FieldTypes.
3. Assignment 3: Complete the Signal document with the correct FieldTypes.
4. Assignment 4: Complete the Function for creating the Trail documents.
5. Assignment 5: Complete the Function for creating the Signal documents.


#### NOTE: To run a cell, select the cell and press the Run button at the top of the screen.

#### NOTE 2: For convenience, you can type the first letters of a variable name and press TAB to autocomplete it.
   

---
### Importing the required modules
---

In [None]:
import pandas as pd

from mongoengine import * 

from datetime import datetime

### Reading the Crane Datasets
---

In [None]:
Agnetha = pd.read_json('../../Datasets/JSON/Crane-Agnetha.json')
Frida = pd.read_json('../../Datasets/JSON/Crane-Frida.json')
Cajsa = pd.read_json('../../Datasets/JSON/Crane-Cajsa.json')

### Reading the GPS-Route Datasets
---

In [None]:
Biesbosch = pd.read_json('../../Datasets/JSON/Route-Biesbosch.json')
Zeeland_Car_1 = pd.read_json('../../Datasets/JSON/Route-Zeeland_Car_1.json')
Zeeland_Car_2 = pd.read_json('../../Datasets/JSON/Route-Zeeland_Car_2.json')

---

# Creating the Crane Data Model

#### Assignment 1: Complete the Tracker document with the correct FieldTypes.

The table below shows the column names and their datatypes, belonging to a Tracker document:

|Column|Type|
|--|--|
|name|string|
|study-name|string|
|individual-taxon-canonical-name|string|
|individual-local-identifier|int|

Use the content of this table to complete the Tracker document. 

---
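The column names in the table contain hyphens, while the field names in the document use underscores, because a Python attribute name cannot contain a hyphen. The sketch below illustrates that mapping. The helper `to_field_name()` is hypothetical and is not part of the workshop code or of MongoEngine.

```python
def to_field_name(column: str) -> str:
    """Turn a dataset column name into a valid Python attribute name.

    Hypothetical helper, for illustration only.
    """
    return column.strip().replace("-", "_")


# Column names taken from the Tracker table above.
columns = [
    "name",
    "study-name",
    "individual-taxon-canonical-name",
    "individual-local-identifier",
]

field_names = [to_field_name(column) for column in columns]

for column, field in zip(columns, field_names):
    print(column, "->", field)
```

Each resulting name matches an attribute of the Tracker class in the next cell.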

In [None]:
class Tracker(Document):
    
    name = StringField()

    study_name = #TODO
    
    individual_taxon_canonical_name = StringField()
     

    individual_local_identifier = #TODO

---
#### Assignment 2: Complete the Transmission document with the correct FieldTypes.

The table below shows the column names and their datatypes, belonging to a Transmission document:


|Column|Type|Description|
|--|--|--|
|event-id|int| |
|timestamp|datetime| |
|coord|point[]|An array of the longitude and latitude coordinates|
|alt|int| |
|speed|float| |
|heading|float| |

Use the content of this table to find the datatype of the fields that you need to complete.

##### NOTE: When creating a field you first declare the datatype (starting with a capital letter) followed by "Field()".

---

In [None]:
class Transmission(Document):
    
    event_id = IntField()
    
    timestamp = #TODO
    
    coord = PointField()
    
    alt = FloatField()
    
    speed =  #TODO
    
    heading = #TODO

    tracker = ReferenceField(Tracker)

---

# Creating the GPS-Route Data Model
---

In [None]:
class Trail(Document):

    name = StringField()


---
#### Assignment 3: Complete the Signal document with the correct FieldTypes.

The table below shows the column names and their datatypes, belonging to a Signal document:


|Column|Type|Description|
|--|--|--|
|timestamp|datetime| |
|coord|point[]|An array of the longitude and latitude coordinates|
|alt|float| |

Use the content of this table to find the datatype of the fields that you need to complete.

##### NOTE: When creating a field you first declare the datatype (starting with a capital letter) followed by "Field()".

You also have to create a reference to the Trail document. In the slides you learned how to do this.
In case you get stuck, you can always take a look at the Transmission document, which also uses a ReferenceField() containing a reference to a Tracker document.

---
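The coord column holds the longitude first and the latitude second. The sketch below shows one way to build such a pair and catch swapped values early. The helper `to_coord()` and the sample values are hypothetical and are not part of the workshop code.

```python
def to_coord(latitude: float, longitude: float) -> list:
    """Return a [longitude, latitude] pair, the order used for the coord field.

    Hypothetical helper, for illustration only.
    """
    if not -90 <= latitude <= 90:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180 <= longitude <= 180:
        raise ValueError(f"longitude out of range: {longitude}")
    return [longitude, latitude]


# Made-up sample position (latitude 51.74, longitude 4.77).
coord = to_coord(latitude=51.74, longitude=4.77)
print(coord)

# A latitude outside the valid range is rejected.
try:
    to_coord(latitude=120.0, longitude=4.77)
    rejected = False
except ValueError:
    rejected = True
```

Using keyword arguments for the latitude and longitude makes it harder to mix up the two values when calling the helper.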

In [None]:
class Signal(Document):
    
    timestamp = #TODO
    
    coord = PointField()
    
    alt = FloatField()
    
    trail = #TODO

---

---

## End of assignments 1, 2 and 3.
#### Go back to the slides for further instructions.


---

---

## Creating the import function for importing the Crane Data
#### This function is already complete and uses a feature called bulk import. For more information about the MongoDB bulk import feature, follow the Complete GeoStack Course.
---

In [None]:
def load_crane_data(df,name):
    
    # Create and save one Tracker document. The tracker metadata is read
    # from the first row (index 0) of the dataframe.
    tracker = Tracker(study_name = df.at[0,'study-name'],
                      individual_taxon_canonical_name = df.at[0,'individual-taxon-canonical-name'],
                      individual_local_identifier = df.at[0,'individual-local-identifier'],
                      name = name,).save()

    # Collect all Transmission documents in a list first, so that they can
    # be inserted into the database in one operation.
    transmissions = []

    for index,row in df.iterrows():
        transmissions.append(Transmission(event_id = row['event-id'],
                                          timestamp = row['timestamp'],
                                          coord = [row['location-long'],row['location-lat']],
                                          alt = row['height-above-ellipsoid'],
                                          speed = row['ground-speed'],
                                          tracker = tracker))
        
    # Insert all Transmission documents at once (bulk import).
    Transmission.objects.insert(transmissions,load_bulk=True)
    
    print("Done importing (Crane)Tracker: " + name)

---

## Creating the import function for importing the GPS-Route
---

In [None]:
def load_route_data(df,name):
    
    
    # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
    #                                                                                   #
    #       Assignment 4: Complete the code logic for creating the Trail documents.     #
    #                                                                                   #
    # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
    # 
    # To create a document, using an import function, you first have to declare which 
    # document you want to create (Create a new instance of this document).
    # 
    # Next you want to declare which Columns, from the dataframe related to the document, 
    # need to be assigned to the related fields.
    #
    # For this to be done correctly, you first need to know which fields are needed in
    # the document. For this you can check back to the code in which we defined the model 
    # of the Trail document. 
    #
    # TIP: The value of the Field:"name" in the Trail document is not provided by the 
    #      dataset. We will pass the value when we call the function:"load_route_data()".
    #
    # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
    
    trail =  #TODO

    trail.save() 
    
    for index,row in df.iterrows():
        
        # Below we assign the data of the time column to a variable called "time".
        # We divide the value of this column by 1000 because the column holds the
        # time in milliseconds, while datetime.fromtimestamp() expects seconds.
        # The result is a datetime without timezone info, in the local timezone.
        # For more information related to timezones you should follow the
        # complete GeoStack Course.
        #
        # NOTE: The variable "time" will be assigned to the correct field when we 
        #       add the code for the creation of a Signal document.
        time = datetime.fromtimestamp(row['time']/1000)
        

        # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
        #                                                                                       #
        #       Assignment 5: Complete the code logic for creating the Signal documents.        #
        #                                                                                       #
        # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
        #                                                                                       
        # To create a document, using an import function, you first have to declare which 
        # document you want to create (Create a new instance of this document).
        # 
        # Next you want to declare which Columns, from the dataframe related to the document, 
        # need to be assigned to the related fields.
        #          
        # To find out which fields belong to a Signal document you can check back to 
        # assignment 3 in which we created the model(template) of a Signal document.
        #
        # 
        # TIP: The data for the field "timestamp" has already been transformed and assigned to the 
        #      variable "time". You only have to assign this value to the correct field in the 
        #      Signal document.
        # 
        # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
        signal = #TODO
    
        signal.save()
        
    print("Done importing Trail (GPS-Route): " + name)


---
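The import function above divides the time value by 1000 before passing it to `datetime.fromtimestamp()`. The sketch below shows that conversion in isolation. It assumes the value is a number of milliseconds since 1 January 1970, and it deviates from the notebook by passing `tz=timezone.utc`, so that the result is the same on every machine. The helper `ms_to_datetime()` and the sample value are hypothetical.

```python
from datetime import datetime, timezone


def ms_to_datetime(milliseconds: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware datetime in UTC.

    Hypothetical helper, for illustration only.
    """
    # fromtimestamp() expects seconds, so convert milliseconds first.
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)


# Made-up sample value: 1,500,000,000,000 milliseconds.
example = ms_to_datetime(1_500_000_000_000)
print(example.isoformat())  # 2017-07-14T02:40:00+00:00
```

Without the `tz` argument, as in the import function, `fromtimestamp()` returns the same moment expressed in the local timezone of the machine and without timezone info attached.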

---
### Connecting to the Crane Database
#### The database will run on localhost:27017/Crane_Database_
---

In [None]:
disconnect('default')
connect('Crane_Database_')

---
#### Using the function: "load_crane_data()" to load the (Crane)Tracker data. We pass the dataframe and the name of the Crane as parameters.
---

In [None]:
load_crane_data(Agnetha,"Agnetha")
load_crane_data(Frida,"Frida")
load_crane_data(Cajsa,"Cajsa")

---
### Connecting to the GPS-Route (Trail) database
#### The database will run on localhost:27017/Trail_Database_
---

In [None]:
disconnect('default')
connect('Trail_Database_')

---
#### Using the function: "load_route_data()" to load the GPS-Route (Trail) data. We pass the dataframe and the name of the Route as parameters.
---

In [None]:
load_route_data(Biesbosch,"Biesbosch")
load_route_data(Zeeland_Car_1,"Zeeland Car 1")
load_route_data(Zeeland_Car_2,"Zeeland Car 2")