# Data analysis: Trails GPX

Date: 31-01-2020 <br>
Concept version: 1.0 <br>
Author: Pieter Lems  <br>

© Copyright 2019 Ministerie van Defensie

## Contents notebook
- Per trail
  - Import
  - Analyze
  - Visualize 
- Visualize all trails


## Datasets in notebook ("../../datasets/GPX/")
- Trails datasets 
   - ../../datasets/GPX/SEP-26-09 64311 Biesbosch.gpx : National Park Noord-Brabantse Biesbosch, Drimmelen, Boat & Hike trail  
   - ../../datasets/GPX/JAN-16-11 172053 Zeeland MNV.gpx : Zeeland, Brouwersdam Willemstad, Birdwatching Car route
   - ../../datasets/GPX/JUN-03-11 151845 BiesboschLibellen.gpx :  National Park Zuid-Hollandse Biesbosch, Merwelanden, Dragonfly Hiking trail
   - ../../datasets/GPX/OKT-25-09 164243 Hamert Fiets.gpx : National Park Maasduinen, de Hamert, Biking Trail
   - ../../datasets/GPX/SEP-25-09 182235 Hamert.gpx : National Park Maasduinen, de Hamert, Hiking trail 
  
  
### List of anomalies in datasets
|Set Name|Anomalies|
|--|--|
|Biesbosch|12|
|Biesbosch Libellen|9|
   
      

---

### Importing required modules
---

In [None]:
import gpxpy
import datetime
import numpy as np
import pandas as pd

import cartopy        
import cartopy.crs as ccrs 
import cartopy.feature as cfeature
import matplotlib.pyplot as plt

---
### Define generic functions

### Function: create_df()
This function creates a dataframe from the track points of a GPX file.

- parameter 1: The GPS track points to convert into a dataframe

In [None]:
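def create_df(data):

    # Collect one record per track point, then build the dataframe in one step.
    # (DataFrame.append was removed in pandas 2.0 and is slow inside a loop.)
    records = [{'lon': point.longitude,
                'lat': point.latitude,
                'alt': point.elevation,
                'time': point.time} for point in data]

    return pd.DataFrame(records, columns=['lon', 'lat', 'alt', 'time'])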

---
### Function: transform_to_JSON()
This function exports the dataframe to a JSON file.

- parameter 1: Dataframe to export
- parameter 2: Name of the output JSON file

In [None]:
def transform_to_JSON(df,name):
    
    output_path = "../../datasets/JSON/Trail_JSON/" + name
    
    df.to_json(output_path,orient='records')
    
    return ("Transformation of "  + str(name) + " complete!")
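With `orient='records'`, pandas writes one JSON object per dataframe row. The sketch below shows that shape on made-up sample values. It returns the JSON as an in-memory string rather than writing to the notebook's output path, and `date_format="iso"` is an optional extra that `transform_to_JSON` does not use.

```python
import json

import pandas as pd

# Made-up sample rows; only the column names match the dataframes in this notebook.
sample = pd.DataFrame({
    "lon": [4.80, 4.81],
    "lat": [51.72, 51.73],
    "alt": [1.0, 2.0],
    "time": pd.to_datetime(["2009-09-26 06:43:11", "2009-09-26 06:43:21"]),
})

# Without a path argument, to_json returns the JSON text instead of writing a file.
# date_format="iso" writes timestamps as ISO 8601 strings rather than epoch milliseconds.
json_text = sample.to_json(orient="records", date_format="iso")

# Parse the text back to inspect the structure: a list with one dict per row.
records = json.loads(json_text)
print(len(records))
print(sorted(records[0].keys()))
```

Without `date_format="iso"`, the `time` column is written as epoch milliseconds, which is what the notebook's exports contain.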
    


---
### Function: create_plot()
This function draws a plain Matplotlib line plot of the coordinates in the given dataframe (no map background).

- parameter 1: The dataframe you want to plot
- parameter 2: The longitude column in the dataframe
- parameter 3: The latitude column in the dataframe

In [None]:
def create_plot(df,lonColumn,latColumn):
    return plt.plot(df[lonColumn], df[latColumn])

---
### Function: init_cartopymap():
This function creates a new Cartopy map (axes with a PlateCarree projection) and returns the map instance.

In [None]:
def init_cartopymap():
    
    plt.figure(figsize = (20, 12))
    
    m = plt.axes(projection=ccrs.PlateCarree())

    m.coastlines(resolution='10m')
    
    m.add_feature(cfeature.LAND.with_scale('10m'), edgecolor='black', facecolor = "white")
    
    m.add_feature(cfeature.OCEAN)
    
    m.add_feature(cfeature.LAKES.with_scale('10m'), edgecolor = 'black')
    
    m.add_feature(cfeature.RIVERS.with_scale('10m')) 
    
    m.add_feature(cfeature.BORDERS.with_scale('10m'))

    return m 

---
### Function: dfs_on_cartopymap():
This function creates a map and plots the given dataframes on a Cartopy map.

- parameter 1: A list of dataframes (a list containing a single dataframe is also fine)
- parameter 2: The longitude column in the dataframe
- parameter 3: The latitude column in the dataframe
- parameter 4: The color of the datapoints
- parameter 5: The size of the datapoints

In [None]:
def dfs_on_cartopymap(dfList,lonColumn,latColumn,color,size):

    m = init_cartopymap()
    
    for df in dfList: 
        
        m.scatter(df[lonColumn], 
                  df[latColumn],
                  color=color, 
                  s = size)    
    return m
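If the map should be zoomed to the plotted trails, a bounding box can be computed from the coordinates first. This is a minimal sketch, not part of the original notebook: `bounding_extent` and its `margin` parameter are hypothetical names, and the sample coordinates are made up.

```python
def bounding_extent(lons, lats, margin=0.05):
    """Return (lon_min, lon_max, lat_min, lat_max), padded by `margin` degrees."""
    return (min(lons) - margin, max(lons) + margin,
            min(lats) - margin, max(lats) + margin)

# Made-up coordinates standing in for df['lon'] and df['lat'].
extent = bounding_extent([4.80, 4.85], [51.72, 51.75])
print(extent)

# With a map returned by init_cartopymap(), the extent could then be applied as:
#   m.set_extent(extent, crs=ccrs.PlateCarree())
```

The tuple order matches what Cartopy's `set_extent` expects: the two longitude bounds first, then the two latitude bounds.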

---
### Function: basic_analyses()
This function performs a basic analysis of a GPX input file, prints a summary, and returns a populated dataframe.

- parameter 1: The file location of the GPX file

In [None]:
def basic_analyses(input_file):
    
    # Open the file in a context manager so it is closed after parsing.
    with open(input_file, 'r') as file:
        parsed_file = gpxpy.parse(file)
    
    print('\n--------------------------------BASIC INFORMATION-------------------------------\n\n')
    
    print("File info: " + str(parsed_file.tracks)+'\n')
    
    print("Tracks in file:" + str(len(parsed_file.tracks))+'\n')
    
    print("Segments in track: " + str(len(parsed_file.tracks[0].segments))+'\n')
    
    print("DataPoints in track: " + str(len(parsed_file.tracks[0].segments[0].points))+'\n')
    
    data = parsed_file.tracks[0].segments[0].points
    
    print("Start position and date :" + str(data[0])+'\n')
    
    print("End position and date :"  + str(data[-1])+'\n')
    
    df = create_df(data)
    
    print('\n------------------------------COLUMNS AND DATA TYPES----------------------------\n\n')
    
    print(df.dtypes)
    
    print('\n-----------------------------DATA FRAME (first 2 rows)---------------------------\n\n')
    
    # Print the rows announced by the header above.
    print(df.head(2))
    
    return df
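`basic_analyses()` reads only the first segment of the first track. For files with several tracks or segments, the points could be collected with a nested loop over the same `tracks`, `segments` and `points` attributes. The sketch below is an assumption about how that might look: `all_points` is a hypothetical helper, and the demo uses stand-in objects instead of a parsed GPX file so that it runs on its own.

```python
from types import SimpleNamespace

def all_points(gpx):
    """Collect the points of every segment of every track, in file order."""
    return [point
            for track in gpx.tracks
            for segment in track.segments
            for point in segment.points]

# Hypothetical stand-in with the same tracks -> segments -> points nesting
# that basic_analyses() reads from the parsed GPX object.
fake_gpx = SimpleNamespace(tracks=[
    SimpleNamespace(segments=[SimpleNamespace(points=["p1", "p2"]),
                              SimpleNamespace(points=["p3"])]),
    SimpleNamespace(segments=[SimpleNamespace(points=["p4"])]),
])

print(len(all_points(fake_gpx)))
```

The resulting list could be passed to `create_df()` in place of `parsed_file.tracks[0].segments[0].points`.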

---
## Biesbosch
### National Park Noord-Brabantse Biesbosch, Drimmelen, Boat & Hike trail 

---

In [None]:
# Open the file in a context manager so it is closed after parsing.
with open('../../datasets/GPX/SEP-26-09 64311 Biesbosch.gpx', 'r') as Biesbosch_file:
    Biesbosch = gpxpy.parse(Biesbosch_file)

#### Print number of tracks in GPX file

In [None]:
len(Biesbosch.tracks)

#### Print number of segments in track

In [None]:
len(Biesbosch.tracks[0].segments)

#### Print number of points in the first segment

In [None]:
len(Biesbosch.tracks[0].segments[0].points)

#### Assign all track points to a variable and print the first one

In [None]:
Biesbosch_data = Biesbosch.tracks[0].segments[0].points
Biesbosch_data[:1]

#### Show start coordinates and DTG (date-time group)

In [None]:
Biesbosch_data[0]

#### Show end coordinates and DTG (date-time group)

In [None]:
Biesbosch_data[-1]

#### Create a dataframe

In [None]:
Biesbosch = create_df(Biesbosch_data)

#### Plot track using Matplotlib

In [None]:
plt.plot(Biesbosch[12:]['lon'], Biesbosch[12:]['lat'])
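The `[12:]` slice drops the first 12 rows, which matches the anomaly count listed for this trail. The notebook does not say how those rows were identified. One possible heuristic, shown here purely as an assumption, is to flag points that lie implausibly far from the previous point. `haversine_km`, `find_jumps` and the 1 km threshold are hypothetical, and the coordinates are made up.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def find_jumps(coords, max_step_km=1.0):
    """Return indices i where the step from point i-1 to point i exceeds max_step_km."""
    return [i for i in range(1, len(coords))
            if haversine_km(*coords[i - 1], *coords[i]) > max_step_km]

# Made-up (lon, lat) pairs: two stray leading points, then a plausible track.
coords = [(5.90, 52.10), (5.90, 52.10),
          (4.800, 51.720), (4.801, 51.721), (4.802, 51.722)]
print(find_jumps(coords))
```

For a dataframe from this notebook, the input could be built with `list(zip(df['lon'], df['lat']))`. A jump index reported near the start of the track would suggest where to begin the slice.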

#### Plot track using Matplotlib + Cartopy

In [None]:
dfs_on_cartopymap([Biesbosch[12:]],'lon','lat',"red",1)

#### Show data columns and their data types

In [None]:
Biesbosch.dtypes

#### Create table with columns and data types:

|Column|Type|Description|
|--|--|--|
|lon|float64|longitude coordinate|
|lat|float64|latitude coordinate|
|alt|float64|altitude (elevation)|
|time|Datetime|date and time of the track point|

#### Transform dataframe to JSON file format

In [None]:
transform_to_JSON(Biesbosch[12:],'Trail_Biesbosch.json')

---
## Biesbosch Libellen
### National Park Zuid-Hollandse Biesbosch, Merwelanden, Dragonfly Hike trail
---

#### Perform a basic analysis of the trail and assign the returned dataframe to a variable.

In [None]:
Biesbosch_Lib = basic_analyses('../../datasets/GPX/JUN-03-11 151845 BiesboschLibellen.gpx')
Biesbosch_Lib

#### Plot track using Matplotlib.

In [None]:
create_plot(Biesbosch_Lib[9:],'lon','lat') 

#### Plot track using Matplotlib on a Cartopy map.

In [None]:
dfs_on_cartopymap([Biesbosch_Lib[9:]],'lon','lat','red',1)

#### Transform dataframe to JSON file format.

In [None]:
transform_to_JSON(Biesbosch_Lib[9:],'Trail-Biesbosch-Libellen.json')

---
## Zeeland
### Zeeland, Brouwersdam Willemstad, Birdwatching car route
---

#### Perform a basic analysis of the trail and assign the returned dataframe to a variable.

In [None]:
Zeeland = basic_analyses('../../datasets/GPX/JAN-16-11 172053 Zeeland MNV.gpx')
Zeeland

#### Plot track using Matplotlib.

In [None]:
create_plot(Zeeland,'lon','lat') 

#### Plot track using Matplotlib on a Cartopy map.

In [None]:
dfs_on_cartopymap([Zeeland],'lon','lat','red',3)

#### Transform dataframe to JSON file format.

In [None]:
transform_to_JSON(Zeeland,'Trail_ZeelandMNV.json')

---
## Hamert Biking
### National Park Maasduinen, de Hamert, Biking Trail
---

#### Perform a basic analysis of the trail and assign the returned dataframe to a variable.

In [None]:
Hamert_Bike = basic_analyses('../../datasets/GPX/OKT-25-09 164243 Hamert Fiets.gpx')
Hamert_Bike 

#### Plot track using Matplotlib.

In [None]:
create_plot(Hamert_Bike,'lon','lat') 

#### Plot track using Matplotlib on a Cartopy map.

In [None]:
dfs_on_cartopymap([Hamert_Bike],'lon','lat','red',1)

#### Transform dataframe to JSON file format.

In [None]:
transform_to_JSON(Hamert_Bike,'Trail-Hamert-Bike.json')

---
## Hamert Hiking
### National Park Maasduinen, de Hamert, Hiking trail 
---

#### Perform a basic analysis of the trail and assign the returned dataframe to a variable.

In [None]:
Hamert_Hike = basic_analyses('../../datasets/GPX/SEP-25-09 182235 Hamert.gpx')
Hamert_Hike 

#### Plot track using Matplotlib.

In [None]:
create_plot(Hamert_Hike,'lon','lat') 

#### Plot track using Matplotlib on a Cartopy map.

In [None]:
dfs_on_cartopymap([Hamert_Hike],'lon','lat','red',1)

#### Transform dataframe to JSON file format.

In [None]:
transform_to_JSON(Hamert_Hike,'Trail-Hamert-Hike.json')

---
## Visualizing all GPS-tracks on one Cartopy Map
### Done by passing a list of all dataframes in this notebook to the function 
---

In [None]:
# Skip the leading anomalous rows, as in the per-trail cells above.
dfs_on_cartopymap([Zeeland, 
                   Biesbosch_Lib[9:], 
                   Biesbosch[12:], 
                   Hamert_Bike,
                   Hamert_Hike],'lon','lat','red',2)
