# MDA Lab Course - Chapter 6 - Personal Mobility Analysis


### Introductory question: What happened here?
* A look at the track shows circles, a mountainous region with lifts, a path that does not follow any roads, and an unclear altitude profile.
* Might be ... 

![Introductory track: What happened here?](images/header.png)

### What are we going to do, briefly?
In this course, we gather tools to examine personal mobility tracks in detail, so that we can make assumptions about specific tracks, about the recorded person's behavior, and about the locations and hotspots they visited.

![s](images/sol.jpg)

## Theory: GPX files in Python
### The theory of GPX files can be found on Wikipedia
[Link to Wikipedia](https://de.wikipedia.org/wiki/GPS_Exchange_Format)
### Working with GPX files in Python
GPX data is stored as XML. Python offers several ways to read XML files, but only a few of them are practical for GPX files.
In this lab course we use the Python library **gpxpy** to read the files, since it *understands* the GPX format natively, i.e., it interprets the XML structure the GPX way.

There are two versions of the GPX format. Version 1.0 was released in 2002, version 1.1 followed in 2004. The attribute *speed* no longer exists in GPX 1.1 and can only be stored there using extensions. 
Geodata is stored in a GPX file in one of the following ways:
* Waypoint: a single GPS point.
* Route: sorted list of waypoints that describe a route.
* Track: sorted list of points that form a line. This type is typically produced by GPS recorders.
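To make the three element types concrete, the following minimal sketch parses a small, made-up GPX 1.1 snippet with Python's standard `xml.etree.ElementTree` module. It only illustrates the underlying XML structure and is not how gpxpy works internally; all coordinates, names and timestamps in `GPX_SNIPPET` are invented for this example.

```python
import xml.etree.ElementTree as ET

# Invented GPX 1.1 content: one waypoint, one route, one track with one segment
GPX_SNIPPET = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <wpt lat="47.60" lon="11.20"><name>Summit</name></wpt>
  <rte>
    <rtept lat="47.60" lon="11.20"/>
    <rtept lat="47.61" lon="11.21"/>
  </rte>
  <trk>
    <trkseg>
      <trkpt lat="47.600" lon="11.200"><ele>1500.0</ele><time>2020-08-01T10:00:00Z</time></trkpt>
      <trkpt lat="47.601" lon="11.201"><ele>1490.0</ele><time>2020-08-01T10:00:10Z</time></trkpt>
    </trkseg>
  </trk>
</gpx>"""

# All GPX 1.1 elements live in this XML namespace
NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

root = ET.fromstring(GPX_SNIPPET)
waypoints = root.findall("gpx:wpt", NS)
route_points = root.findall("gpx:rte/gpx:rtept", NS)
track_points = root.findall("gpx:trk/gpx:trkseg/gpx:trkpt", NS)

print(len(waypoints), "waypoint(s),", len(route_points), "route point(s),", len(track_points), "track point(s)")

# Latitude and longitude are XML attributes, elevation and time are child elements
for pt in track_points:
    lat, lon = float(pt.get("lat")), float(pt.get("lon"))
    ele = float(pt.find("gpx:ele", NS).text)
    time = pt.find("gpx:time", NS).text
    print(time, lat, lon, ele)
```

gpxpy exposes the same hierarchy as `gpx.waypoints`, `gpx.routes` and `gpx.tracks`, so the namespace handling shown here is not needed when using the library.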

In addition to the geo-information, GPX can store data in attributes, e.g., **ele** for elevation, **time** for a timestamp or **desc** for a description.
Gpxpy can read and write all three types of data.
Since *tracks* are what is typically saved, we will look at reading a GPX file that contains tracks. First, we open the GPX file. This is done with Python's built-in function *open()*; gpxpy is not needed here.

In [None]:
gpx_file = open("data/Gleitschirm.gpx", 'r')

Next, we want to parse the content of the file and get a gpxpy object back. To do this, we first need to import the gpxpy package.

In [None]:
import gpxpy
gpx = gpxpy.parse(gpx_file)

Now that we have read the file and parsed it into the gpxpy structure, we can access its elements. To get all points, we loop over all tracks in the file, then over all segments of each track, and finally over the points of each segment. In this example, we save the lat/lon/time values of the points in separate lists and additionally print each point.

In [None]:
latitude = []
longitude=[]
time = []

for track in gpx.tracks:
    for segment in track.segments:
        for point in segment.points:
            longitude.append(point.longitude)
            latitude.append(point.latitude)
            time.append(point.time)
            print('{} Point at ({},{}). Elevation: {} m'.format(point.time, point.latitude, point.longitude, point.elevation))

As a next step, we save the points in a GeoDataFrame for further analysis. GeoPandas provides the function *gpd.points_from_xy(x, y)*, which takes the longitude values as x and the latitude values as y and turns the raw lists we saved previously into point geometries. The time list is used as the index. The first rows of the resulting GeoDataFrame are then shown with *geo_df.head()*.

In [None]:
import geopandas as gpd

geo_df = gpd.GeoDataFrame(geometry=gpd.points_from_xy(longitude, latitude), index=time)
geo_df.head()

# Task 1: GPS-File Import

## Task 1.1: GPX-File import function
In order to analyze the recorded profiles, we need the functionality to read GPX data. Write a function that reads a GPX file and returns a GeoDataFrame. Please sort the GeoDataFrame by `time`, set the timestamps as the index, and put `latitude`/`longitude` as a Point into the `geometry` column of the GeoDataFrame. Also add the `elevation` information to the GeoDataFrame.
________________________

#### Signature of the function
`gdf_raw = gpx_importer(gpx_file)`

**Inputs**: `gpx_file`: String containing a path to the GPX file

**Returns:** `gdf_raw`: GeoDataFrame containing the `latitude`/`longitude` points in `geometry`, the `elevation` data, and `time` set as index
________________________

In [None]:
import gpxpy
import geopy
import pandas as pd
import geopandas as gpd

def gpx_importer(gpx_file):
    
    # Preallocation of arrays to append the extracted data
    latitude = []
    longitude = []
    elevation = []
    time=[]
    
    # Open the handed over 'gpx_file' and parse data in by using the gpxpy.parse() function
    gpx_data = None
    # INSERT CODE HERE | BEGIN
    #<<solution>>
    with open(gpx_file, 'r') as gpx_fh:
        gpx_data = gpxpy.parse(gpx_fh)
    #<</solution>>
    # INSERT CODE HERE | END
    
    # Looping over data to access data regarding 'latitude', 'longitude', 'elevation' and 'time'
    # Hint: You may want to use a nested loop over 'tracks' --> 'segments' --> 'points' and use the .append() 
    # function to append the obtained data to the above preallocated lists.
    # Hint: You can, e.g., access the 'latitude' of a 'point' with point.latitude
    # INSERT CODE HERE | BEGIN
    #<<solution>>
    for track in gpx_data.tracks:
        for segment in track.segments:
            for point in segment.points:
                latitude.append(point.latitude)
                longitude.append(point.longitude)
                elevation.append(point.elevation)
                time.append(point.time)
    #<</solution>>
    # INSERT CODE HERE | END
                
    # Based on your extracted data, initialize a GeoDataFrame with 'geometry' and 'elevation' columns,
    # while the 'time' is set as index. Sort your GeoDataFrame by 'time'
    # Hint: How can you derive 'geometry' points from xy? You may also want to make use of the following
    # functions: gdf.sort_values(by=) and gdf.set_index()
    gdf_raw = None
    # INSERT CODE HERE | BEGIN
    #<<solution>>
    gdf_raw = gpd.GeoDataFrame(
        {"elevation":elevation, "time":time},
        geometry=gpd.points_from_xy(longitude, latitude)).sort_values(by="time").set_index("time")
    #<</solution>>
    # INSERT CODE HERE | END   
    
    return gdf_raw

### Test your function
Test your function with a GPX file (`GPX_FILE = "data/Gleitschirm.gpx"`, given in the cell below). Please print out the `head()` of the resulting GeoDataFrame. Make sure to test `gpx_importer` in a new cell.

In [None]:
# This is the file you want to read to validate the gpx_importer(gpx_file) function
GPX_FILE = "data/Gleitschirm.gpx"

# Read 'GPX_FILE' and show the head of the data. Did your implemented function work correctly?
my_mobility_data_single_file = None
# INSERT CODE HERE | BEGIN
#<<solution>>
my_mobility_data_single_file = gpx_importer(GPX_FILE)
my_mobility_data_single_file.head()
#<</solution>>
# INSERT CODE HERE | END  

## Task 1.2: Multi-GPX-File import function
As you have probably noticed, your GPX logging app may produce more than one file. The Android app [GPSLogger](https://play.google.com/store/apps/details?id=com.mendhak.gpslogger&hl=de), for example, saves each day in a separate file. To handle these files, we need a multi-file importer. Luckily, we can already import one GPX file, so it should be easy to apply this function to several files, right? Create a new function that imports multiple GPX files!

Use the following technologies/approach in the function:
* Use the python standard package `glob` to read in all the files inside a folder
* For-loop over all found, valid files
* Use the previously defined function `gpx_importer(GPX_FILE)` to read in a single GPX-file
* Appending the single, imported files to a common GeoDataFrame
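The file-discovery step of this approach can be sketched as follows. The sketch creates a temporary folder with invented file names (an assumption purely for illustration) and shows that building the pattern with `os.path.join` works whether or not the folder path ends with a slash, and that sorting the result gives a reproducible order.

```python
import glob
import os
import tempfile

# Create a throwaway folder with two (empty) GPX files and one unrelated file
with tempfile.TemporaryDirectory() as folder:
    for name in ("2020-08-02.gpx", "2020-07-31.gpx", "notes.txt"):
        with open(os.path.join(folder, name), "w") as fh:
            fh.write("")

    # os.path.join inserts the path separator if it is missing;
    # glob.glob returns the matches in arbitrary order, so we sort them
    gpx_paths = sorted(glob.glob(os.path.join(folder, "*.gpx")))
    gpx_names = [os.path.basename(path) for path in gpx_paths]

print(gpx_names)
```

Only the two `.gpx` files are returned; `notes.txt` does not match the pattern.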
________________________

### Signature of the function
`imported_files = gpx_multi_importer(folder_path)`

**Inputs**: `folder_path`: Path to a folder containing GPX-files

**Returns:** `imported_files`: Ordered GeoDataFrame of all GPX-files in the input folder path
________________________

In [None]:
# Do you need to import a new library before you can run the code below?
# INSERT CODE HERE | BEGIN
#<<solution>>
import glob
#<</solution>>
# INSERT CODE HERE | END 

def gpx_multi_importer(folder_path):
    
    # glob.glob returns a list of all paths matching the pattern (here: all files ending in ".gpx").
    # Note: 'folder_path' must end with a path separator for this pattern to work.
    gpx_file_list = glob.glob(folder_path + "*.gpx")
    
    # Use this GeoDataFrame to collect the separate files listed in 'gpx_file_list'
    imported_files = gpd.GeoDataFrame()
    
    # Loop over the 'gpx_file' entries in 'gpx_file_list' and append the obtained data to the above initialized
    # 'imported_files'. You might use an intermediate step to read 'gpx_file' using your gpx_importer(gpx_file) function.
    # Hint: DataFrame.append() was removed in pandas 2.0; pd.concat([...], sort=True) does the same job.
    # INSERT CODE HERE | BEGIN
    #<<solution>>
    for gpx_file in gpx_file_list:
        res = gpx_importer(gpx_file)
        imported_files = pd.concat([imported_files, res], sort=True)
    #<</solution>>
    # INSERT CODE HERE | END 
    
    # Sort the index of your file by using the gdf.sort_index() function
    # INSERT CODE HERE | BEGIN
    #<<solution>>
    imported_files = imported_files.sort_index()
    #<</solution>>
    # INSERT CODE HERE | END 
    
    return imported_files

### Test the function
Test the function in a new cell. Use the folder `folder = "data/lukas_31-7_bis_6-8/"` as input.

In [None]:
# This is the path to the folder you want to read to validate the gpx_multi_importer(folder) function
folder = "data/lukas_31-7_bis_6-8/"

# Read all files in 'folder' and show the head of the resulting GeoDataFrame. Did your implemented
# function work correctly? You should see 'time' as index and 'elevation' and 'geometry' (point) as columns.
# INSERT CODE HERE | BEGIN
#<<solution>>
my_mobility_data = gpx_multi_importer(folder)
my_mobility_data.head()
#<</solution>>
# INSERT CODE HERE | END 


## Task 2: Reuse the *lat_lon_2_m()* function
In a previous part of the lab course (Chapter 3 - Basic Methods and Visualization) you developed a function to calculate the distance between a pair of lat/lon values.

In [None]:
import numpy as np

# Given function: lat_lon_2_m from chapter 3
def lat_lon_2_m(latitude_1, longitude_1, latitude_2, longitude_2):
    
    # Radius of the earth in m
    radius_earth = 6371009

    d_latitude = np.deg2rad(latitude_2 - latitude_1)
    d_longitude = np.deg2rad(longitude_2 - longitude_1)
    latitude_1 = np.deg2rad(latitude_1)
    latitude_2 = np.deg2rad(latitude_2)

    a = (np.sin(d_latitude / 2)) ** 2 + np.cos(latitude_1) * np.cos(latitude_2) * (np.sin(d_longitude / 2)) ** 2
    c = 2 * np.arcsin(np.sqrt(a))
    distance = radius_earth * c

    return distance

Reuse the given function `lat_lon_2_m()` here and test it using the two given points:

**Point 1**: (48.5, 11.1)<br>
**Point 2**: (48.9, 11.6)

The return value should be approx. 57.66 km. 

In [None]:
# Test the lat_lon_2_m() function with the given points and print the result
# INSERT CODE HERE | BEGIN
#<<solution>>
d = lat_lon_2_m(*(48.5, 11.1), *(48.9, 11.6)) / 1000
print("Distance between point 1 and point 2:", round(d,4), "km")
#<</solution>>
# INSERT CODE HERE | END 

### Compare distance results of several functions
Compare this result to the output of the two geopy functions below. Use the above given coordinates as input.
* `geopy.distance.great_circle((lat,lon), (lat,lon))` and 
* `geopy.distance.geodesic((lat,lon), (lat,lon))` <br>
[Documentation](https://geopy.readthedocs.io/en/latest/#geopy.distance.great_circle)

How would you explain the difference between the return values of `lat_lon_2_m()` and `great_circle()` on the one hand and `geodesic()` on the other? Answer in a short note.

In [None]:
# Don't forget to import the geopy.distance functions
# INSERT CODE HERE | BEGIN
#<<solution>>
from geopy.distance import great_circle, geodesic
#<</solution>>
# INSERT CODE HERE | END 

# Compare to the results of geopy.distance functions.
# Use gc for the great_circle function, gd for the geodesic function
gc = None
gd = None
# INSERT CODE HERE | BEGIN
#<<solution>>
gc = great_circle((48.5, 11.1), (48.9, 11.6))
gd = geodesic((48.5, 11.1), (48.9, 11.6))
#<</solution>>
# INSERT CODE HERE | END 

print("Great Circle: ", round(gc.km, 4), "km")
print("Geodesic: ", round(gd.km, 4), "km")
print("Difference: ", round(abs(gc.m - gd.m), 2), "m")

## Task 3A (given): Speed and Acceleration
### Info
In Task 3, several functions are given that augment the GPS data imported before. Since there is not enough time to implement them on your own, they are provided. Please look at the specifications of the functions and familiarize yourself with the way they work. Use the dataset *my_mobility_data* you imported with the multi-importer.

### Given function from Chapter 3 - Basic Methods and Visualization
Function that calculates the distances between consecutive points using the `lat_lon_2_m` function.
________________________
#### Signature of the function
`distance = calculate_distance(latitude, longitude)`

**Inputs**: 
* `latitude`: Array containing latitude coordinates
* `longitude`: Array containing longitude coordinates

**Returns**: `distance`: np.array containing the distances between all consecutive points

In [None]:
# Given function from Chapter 3 - Basic Methods and Visualization
def calculate_distance(latitude, longitude):
    if len(latitude) != len(longitude):
        return np.nan
    else:
        # convert to plain arrays so that [i] is always positional (the inputs may be pandas Series)
        latitude = np.asarray(latitude, dtype=float)
        longitude = np.asarray(longitude, dtype=float)
        # initialize distance array
        distance = np.zeros(len(latitude))
        # loop latitude, longitude
        for i in range(0, len(latitude) - 1):
            # calculate distance between two consecutive points using lat_lon_2_m function
            d = lat_lon_2_m(
                latitude[i],
                longitude[i],
                latitude[i + 1],
                longitude[i + 1])
            # store the distance at the position of the second point
            distance[i + 1] = d
        return distance
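Because the haversine formula only uses NumPy functions, the loop over consecutive points can also be written with array slicing. The following is a minimal sketch of that idea, not a replacement for the given function; the names `haversine_m` and `consecutive_distances` are made up for this example, and the test points reuse the two coordinates from Task 2.

```python
import numpy as np

def haversine_m(lat1, lon1, lat2, lon2, radius=6371009):
    """Haversine distance in metres; accepts scalars or NumPy arrays (degrees)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return radius * 2 * np.arcsin(np.sqrt(a))

def consecutive_distances(latitude, longitude):
    """Distance from each point to its predecessor; the first entry is 0."""
    lat = np.asarray(latitude, dtype=float)
    lon = np.asarray(longitude, dtype=float)
    distance = np.zeros(len(lat))
    # Pair every point (all but the last) with its successor (all but the first)
    distance[1:] = haversine_m(lat[:-1], lon[:-1], lat[1:], lon[1:])
    return distance

# Two points from Task 2, followed by a repeated point (distance 0)
lat = [48.5, 48.9, 48.9]
lon = [11.1, 11.6, 11.6]
d = consecutive_distances(lat, lon)
print(np.round(d / 1000, 2))
```

For long tracks this avoids the Python-level loop, while producing the same layout as `calculate_distance()`: entry `i` holds the distance between point `i - 1` and point `i`.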

### Task 3.1 (given): Function that calculates the speed from the GPX-track

The function calculates the speed along a GPX track.<br>
Test the function on the GPX dataset. Add the columns *speed* and *acc* to the input dataframe. <br>

#### Signature of the function
`speed = calculate_speed(array_distances, array_deltatime)`

**Inputs**: 
* `array_distances`: Distances in m, calculated by the function *calculate_distance(latitude, longitude)* (from Chapter 3 - Basic Methods and Visualization)
* `array_deltatime`: Array of time differences in s between the timestamps of our input dataset. Use *diff()* to calculate it

**Returns**: `speed`: Array containing the speed in km/h

In [None]:
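# Given function from Task 3.1
def calculate_speed(array_distances, array_deltatime):
    if len(array_distances) != len(array_deltatime):
        print("input vector length does not match!")
        return np.nan
    else:
        # convert to plain arrays so that [i] is always positional (the inputs may be pandas Series)
        array_distances = np.asarray(array_distances, dtype=float)
        array_deltatime = np.asarray(array_deltatime, dtype=float)
        # preallocate the return variable
        speed = [0] * (len(array_distances)-1)
        for i in range(0, len(array_distances)-1):
            # m / s * 3.6 = km/h
            speed[i] = array_distances[i] / array_deltatime[i] * 3.6
        # Append a trailing 0, so that speed has the same length as the inputs
        speed = np.append(speed, [0])
        return speed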

### Task 3.2 (given):  Function that calculates the acceleration from the GPX-track

The function calculates the acceleration from a speed signal.<br>
Test the function on the GPX dataset. Add the column *acc* to the input dataframe. <br>
________________________
#### Signature of the function
`acceleration = calculate_acceleration(speed, array_deltatime)`

**Inputs**:
* `speed`: Input data, a vector containing speed signals
* `array_deltatime`: Array of delta time between the timestamps of our input dataset

**Returns**: `acceleration`: Acceleration as array

In [None]:
# Given function from Task 3.2
def calculate_acceleration(speed, array_deltatime):
    acceleration = np.diff(speed)
    # Add entry to acceleration, to have same length as speed
    acceleration = np.append(acceleration, [0]) / array_deltatime
    return acceleration
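The units involved in the two functions above are easy to mix up, so here is a small worked sketch with invented numbers. It assumes distances in m and time steps in s, as in this chapter, and uses a backward difference for the acceleration, which is a slightly different convention from the given `calculate_acceleration()` (that one uses `np.diff` with a trailing zero).

```python
import numpy as np

# Invented sample: distance to the previous point in m, time step in s
distances_m = np.array([0.0, 100.0, 200.0, 200.0])
delta_t_s = np.array([np.nan, 10.0, 10.0, 10.0])  # first entry has no predecessor

speed_ms = distances_m / delta_t_s   # m/s
speed_kmh = speed_ms * 3.6           # km/h

# Change of speed relative to the previous sample, divided by the time step
acc_kmh_per_s = np.diff(speed_kmh, prepend=np.nan) / delta_t_s  # (km/h)/s
acc_ms2 = acc_kmh_per_s / 3.6                                    # m/s^2

print(speed_kmh)      # speeds in km/h
print(acc_kmh_per_s)  # acceleration in km/h per second
print(acc_ms2)        # acceleration in m/s^2
```

Differentiating a speed given in km/h yields km/h per second; dividing by 3.6 converts this to m/s². This matters when choosing a threshold for plausible accelerations.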

### Task 3.3 (given): Function that corrects the speed data using acceleration

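Now that we have a raw speed value and the acceleration derived from it, we can look for abnormal accelerations and decide to ignore them together with the underlying speed values. This function loops through the acceleration vector, checks whether the absolute value of an acceleration is above a given threshold, e.g., *THRESHOLD_ACC = 10*, and sets the speed values around such a sample to NaN. Note that `calculate_speed()` returns km/h, so `calculate_acceleration()` yields values in km/h per second rather than m/s², and the threshold is compared against these values.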
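The idea can be sketched on a tiny, invented signal. The helper name `mask_outliers` and the parameter `half_window` are hypothetical and only serve this illustration; unlike the given function, the sketch checks every sample up to the end of the vector.

```python
import numpy as np

def mask_outliers(speed, acceleration, threshold, half_window=3):
    """Return a copy of speed with NaN around samples whose |acceleration| exceeds threshold."""
    speed_corr = np.array(speed, dtype=float)        # copy, the input stays untouched
    acc = np.asarray(acceleration, dtype=float)
    for i in np.flatnonzero(np.abs(acc) > threshold):
        start = max(i - half_window, 0)              # do not wrap around at the start
        speed_corr[start:i + half_window] = np.nan
    return speed_corr

# Invented signal with one implausible spike at index 3
speed = np.array([10.0, 11.0, 12.0, 90.0, 12.0, 11.0, 10.0, 10.0, 10.0, 10.0])
acc = np.array([0.1, 0.1, 7.8, -7.8, -0.1, -0.1, 0.0, 0.0, 0.0, 0.0])

corrected = mask_outliers(speed, acc, threshold=5)
print(corrected)
print(speed[3])  # the original vector still contains the spike
```

Working on a copy keeps the raw speed available for comparison, and clamping the window start with `max(..., 0)` avoids a negative slice index, which would otherwise select an empty range near the beginning of the vector.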
________________________
#### Signature of the function
`speed_corr = correct_speed(speed, acceleration, threshold)`

**Inputs**: 
* `speed:` Vector containing speed signal.
* `acceleration`: Vector containing the acceleration signal
* `threshold`: Int/float representing a threshold, when to "untrust" a value

**Returns**: `speed_corr`: Corrected speed as array

In [None]:
# Given function from Task 3.3
def correct_speed(speed, acceleration, threshold):
    if len(speed) != len(acceleration):
        print("input vector length does not match!")
        return np.nan
    else:
        # Work on a float copy, so that the original speed vector stays untouched
        speed_corr = np.array(speed, dtype=float)
        # Plain array, so that [i] is always positional (the input may be a pandas Series)
        acceleration = np.asarray(acceleration, dtype=float)
        # Loop over and replace the speeds around index i with NaN, if |acceleration| > threshold
        for i in range(0, len(speed)-3):
            if abs(acceleration[i]) > threshold:
                # max() keeps the window start from wrapping around to the end for i < 3
                speed_corr[max(i-3, 0):i+3] = np.nan
        return speed_corr

### Task 3.4 (given): Function that calculates a moving average of the speed data

The function calculates a moving-averaged speed signal to support later analyses.<br>
Add the column *speed_ma* (moving-averaged speed values) to the input dataframe. <br>
Details: [Theory and Implementation of Moving Averages in Python](https://waterprogramming.wordpress.com/2018/09/04/implementation-of-the-moving-average-filter-using-convolution/)
________________________
#### Signature of the function
`ma_values = movingaverage(values, window)`

**Inputs**: 
* `values`: signal to be averaged
* `window`: Window size of the moving-average 

**Returns**: `ma_values`: Averaged input-values

In [None]:
# Given function from Task 3.4
def movingaverage(values, window):
    # Equal weights that sum up to 1
    weights = np.repeat(1.0, window)/window
    # 'same' keeps the output as long as the input
    sma = np.convolve(values, weights, 'same')
    return sma
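Two properties of this convolution-based average are worth seeing on a small example: the edges are averaged against implicit zeros, and a single NaN spreads to every output sample whose window contains it. The sketch below repeats the function so that it runs on its own; the input values are invented.

```python
import numpy as np

def movingaverage(values, window):
    weights = np.repeat(1.0, window) / window
    return np.convolve(values, weights, 'same')

# Edge effect: the first and last window are padded with zeros
smooth = movingaverage([1.0, 2.0, 3.0, 4.0, 5.0], 3)
print(smooth)

# NaN propagation: one missing value affects `window` output samples
with_gap = movingaverage([1.0, 2.0, np.nan, 4.0, 5.0, 6.0, 7.0], 3)
print(with_gap)
print(int(np.isnan(with_gap).sum()), "NaN values in the output")
```

Since the corrected speed contains NaN wherever an outlier was removed, the moving-averaged speed will have correspondingly wider gaps.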

## Task 3B: Enrich *my_mobility_data* with acceleration and speeds
Now we use the given functions from Task 3A to calculate speed, speed_corrected, moving_averaged_speed (from speed_corrected) and acc. Augment the `my_mobility_data` DataFrame with the calculated data. <br>

### Add the new values to the dataset
Name the new columns in the dataframe:
- speed_corrected: `my_mobility_data["speed_corr"]`
- speed: `my_mobility_data["speed"]`
- acc: `my_mobility_data["acc"]`
- speed_moving_averaged: `my_mobility_data["speed_ma"]`

In [None]:
import matplotlib.pyplot as plt
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

# Given constant to threshold acceleration 
THRESHOLD_ACC = 10

# We calculate the distance and corresponding delta_times of my_mobility_data as you need these
# characteristics later on.
distances = calculate_distance(my_mobility_data.geometry.y, my_mobility_data.geometry.x) 
delta_times = my_mobility_data.index.to_series().diff().apply(lambda x: x.total_seconds())

# Calculate speed, acceleration and speed_corr
# Hint: Make use of the functions given in Task 3A 
speed = None
acc = None
speed_corr = None
# INSERT CODE HERE | BEGIN
#<<solution>>
speed = calculate_speed(distances, delta_times)
acc = calculate_acceleration(speed, delta_times)
speed_corr = correct_speed(speed, acc, THRESHOLD_ACC)
#<</solution>>
# INSERT CODE HERE | END 

# Add the new data back into the original GeoDataFrame my_mobility_data,
# e.g., speed_corr should be contained in the column my_mobility_data["speed_corr"]
# INSERT CODE HERE | BEGIN
#<<solution>>
my_mobility_data["speed_corr"] = speed_corr
my_mobility_data["speed"] = speed
my_mobility_data["acc"] = acc
#<</solution>>
# INSERT CODE HERE | END 

# Calculation of averaged speed
speed_moving_averaged = movingaverage(my_mobility_data['speed_corr'], 15)
my_mobility_data["speed_ma"] = speed_moving_averaged

# To get more detailed insight on the corrected speed data "speed_corr", print the .mean()
# and median() values of my_mobility_data["speed_corr"]
# INSERT CODE HERE | BEGIN
#<<solution>>
print("Mean speed_corr: {} km/h".format(round(my_mobility_data["speed_corr"].mean(), 2)))
print("Median speed_corr: {} km/h".format(round(my_mobility_data["speed_corr"].median(), 2)))
#<</solution>>
# INSERT CODE HERE | END 

### Plot the new values of the dataset
- Plot the corrected speed of the data
- Plot the moving averaged speed of the data
- Plot the acceleration of the data

In [None]:
# Plot speed data
# INSERT CODE HERE | BEGIN
#<<solution>>
fig, ax = plt.subplots(1)
ax.plot(my_mobility_data["speed_corr"], marker="*", linestyle="None")
ax.set_xlabel("time")
ax.set_ylabel("speed_corr in km/h")
ax.set_title("speed_corr")
#<</solution>>
# INSERT CODE HERE | END 

# Plot of speed moving average data
fig, ax = plt.subplots(1)
ax.plot(my_mobility_data["speed_ma"], marker=".")
ax.set_xlabel("time")
ax.set_ylabel("averaged speed in km/h")
ax.set_title("Averaged Speed")

# Plot of acceleration data
fig, ax = plt.subplots(1)
ax.plot(my_mobility_data["acc"], marker=".", linestyle="None")
ax.set_xlabel("time")
ax.set_ylabel("acceleration in km/h per s")
ax.set_title("Acceleration")
ax.set_ylim([-500,500])

## Task 4 (given): Creating a Dataset with Continuous Timestamps
The timestamps of the GPX measurement can have gaps and irregular time steps. For our further analysis, we want a dataset with a continuous time axis without gaps and with a constant time step. Missing data needs to be filled with NaNs.

We use the following methodology:
* Create a new GeoDataFrame `my_mobility_data_cont` with a datetime-index with the desired timestep, e.g., frequency of 10s, 20s or 30s
* Use the pandas method `merge_asof()` to merge the original GeoDataFrame with the new, resampled one.

After the merge, we need to reconfigure things a bit. The pandas function `merge_asof()` does not preserve our GeoDataFrame: the geometry attribute is now a *pandas.Series*, but we want it to be a *geopandas.GeoSeries*. 
* To re-convert back to the GeoDataframe, initialize a new GeoDataFrame with the output DataFrame from the merge process as input for the constructor.
* As a control to check if everything went right, please plot the `["speed_corr"]` attribute 
* As a second check, please print the type of the geometry-column like so: *type(myGeoDataFrame.geometry)*. This should produce the following output: `<class 'geopandas.geoseries.GeoSeries'>`

Please go through the code below and think it through. Because of time constraints, it is given.
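How `merge_asof()` maps irregular samples onto a regular grid can be sketched with plain pandas and a few invented timestamps. By default each grid point takes the last earlier sample, however old it is; the optional `tolerance` argument limits how far back the match may lie and leaves NaN otherwise. Whether a tolerance is wanted in this chapter is a design decision, the sketch only shows the difference.

```python
import pandas as pd

start = pd.Timestamp("2020-08-01 10:00:00")

# Irregular samples at +3 s, +31 s and +125 s (invented values)
raw = pd.DataFrame(
    {"speed_corr": [10.0, 12.0, 30.0]},
    index=pd.DatetimeIndex([start + pd.Timedelta(seconds=s) for s in (3, 31, 125)]).astype("datetime64[ns]"),
)

# Regular 30 s grid; both indexes are kept at the same resolution
grid = pd.DataFrame(index=pd.date_range(start, periods=6, freq="30s").astype("datetime64[ns]"))

# Default: take the last earlier sample, regardless of its age
filled = pd.merge_asof(grid, raw, left_index=True, right_index=True)

# With tolerance: only accept samples that are at most 30 s old
strict = pd.merge_asof(grid, raw, left_index=True, right_index=True,
                       tolerance=pd.Timedelta("30s"))

print(filled)
print(strict)
```

In `filled`, the value 12.0 is carried across the gap between +31 s and +125 s, whereas `strict` contains NaN for the grid points inside that gap.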

In [None]:
from IPython.core.debugger import set_trace

# Create continuous time vector from start to end
time_start = my_mobility_data.index[0]
time_end = my_mobility_data.index[-1]
time_range = pd.date_range(time_start, time_end, freq='30s')

# create new dataframe using the time_range as index
my_mobility_data_cont = gpd.GeoDataFrame({}, index=time_range)

# merge old data into the new frame using merge_asof
my_mobility_data_cont = pd.merge_asof(
    my_mobility_data_cont, my_mobility_data.sort_index(), left_index=True, right_index=True)

# Reconvert to GeoDataFrame again
my_mobility_data_cont = gpd.GeoDataFrame(my_mobility_data_cont)

# Plot the speed_corr from the continuous and resampled dataframe
fig, ax = plt.subplots(1)
ax.plot(my_mobility_data_cont["speed_corr"])
print("Length: {}".format(len(my_mobility_data_cont)))
print(type(my_mobility_data_cont["geometry"]))

## Task 5 (given): Plot the GPX File / Data

### Info

To visualize the data, we want to plot it on a map. Please familiarize yourself with the class *folium_plot*, which plots the geometry of the well-known `my_mobility_data` onto a map.
We use the package [**folium**](https://python-visualization.github.io/folium/) here for mapping.

#### Structure of the class

**Class name:** folium_plot <br>
**Member methods:**<br>
* init()
* add_to_folium(GeoDataFrame): Plot a GeoDataFrame holding Points as a PolyLine on a map
* add_many_to_folium(list(GeoDataFrame)): Plot several GeoDataFrames from a list on a map, each in a different color.
* add_geojson_to_folium(gdf_poly, tooltip): Plot a GeoDataFrame holding a single geometry object, e.g., a polygon or circle. This does not work for whole tracks (use add_to_folium or add_many_to_folium for that use case)
* add_marker_to_folium(lat, lon, popup): Add a marker at the given position to the map
* save_map(path_name): Save the map as an .html document to *path_name*
* show_map(): Needs to be the last statement of a cell. Triggers showing the map.

### Plot my_mobility_data
The code is already given in the cell below the plot-class definition.
Use the described class to plot the *my_mobility_data* dataset.

In [None]:
import folium
import os

# Given: Class to plot geo data into folium maps
class folium_plot():
    def __init__(self, **kwargs):
        
        self.width = kwargs.get("width", "100%")
        self.height = kwargs.get("height", "600")
        self.location = kwargs.get("location", [48.15, 11.57])
        self.f = folium.Figure(width=self.width, height=self.height)
        #self.m = folium.Map(location=self.location, tiles='stamentoner', zoom_start=11, crs="EPSG4326").add_to(self.f)
        self.m = folium.Map(location=self.location, tiles='stamentoner', zoom_start=11).add_to(self.f)

        self.color_map =  ['red', 'blue', 'green', 'purple', 'orange', 'darkred', 'darkblue', 'darkgreen', 'cadetblue', 'pink', 'lightblue', 'lightgreen',
                 'gray', 'lightgray']
    
    # For Points in geometry: adds them as a polyline
    def add_to_folium(self, my_mobility_data, color="red", tooltip=""):

        # Zip data together in a np-array
        data2 = np.array(list(zip(my_mobility_data['geometry'].y, my_mobility_data['geometry'].x)))
        folium.PolyLine(data2, color=color, weight=4.5, opacity=1, tooltip=tooltip).add_to(self.m)
        return self.m
    
    # give a list of geodataframes and plot each one as polyline
    def add_many_to_folium(self, my_mobility_data_list):
        for idx, my_mobility_data in enumerate(my_mobility_data_list):
            self.add_to_folium(my_mobility_data, color=self.color_map[idx % len(self.color_map)], tooltip=idx)
    
    # Add geometry like polygon, point from a geoDataFrame to the map
    def add_geojson_to_folium(self, gdf_poly, tooltip=""):
        folium.GeoJson(gdf_poly, tooltip=tooltip).add_to(self.m)
    
    # Add a marker at a given position to the map
    def add_marker_to_folium(self, lat, lon, popup=""):
        # folium expects locations as [latitude, longitude]
        folium.Marker([lat, lon], popup=popup).add_to(self.m)

    # save the map to a html-file
    def save_map(self, path_name):
        self.m.save(os.path.join(path_name))
    
    # call at the end of the cell: Showing the map
    def show_map(self):
        return self.f

In [None]:
# Use the plot map class to plot the dataset
m = folium_plot()
m.add_to_folium(my_mobility_data)
# Mark the first point of the track (geometry.y is the latitude, geometry.x the longitude)
m.add_marker_to_folium(lat=my_mobility_data.geometry.y.iloc[0], lon=my_mobility_data.geometry.x.iloc[0], popup="start")
m.show_map()

## Task 6: Basic statistical KPIs
For our further analyses, we need a function that calculates and plots statistical KPIs for a given dataset.

### Task 6.1 (given): Get total distance of track
Before coming to the function that deals with the KPIs, we need a helper function: Create a function that calculates the total distance of a dataset. You can re-use the `calculate_distance()` function from Chapter 3 and encapsulate it in the new function.
________________________
#### Signature of the function
`distance = total_distance(my_mobility_data)`

**Inputs**: `my_mobility_data`: Dataframe containing the trip

**Returns**: `distance`: Total distance of the dataset in **km**
________________________


In [None]:
def total_distance(my_mobility_data):
    
    # Total travel distance, allocated to 0
    total_distance = 0
    # Calculate vector of distances between the points
    distances = calculate_distance(my_mobility_data.geometry.y, my_mobility_data.geometry.x)
    # Sum up vector, divide by 1000 to get km
    total_distance = distances.sum() / 1000 
    
    return total_distance

### Task 6.2: Function calculating and plotting KPIs

Now that we have all the helper functions ready, we can create the KPIs function. The following KPIs should be calculated and printed / plotted from inside the function:

#### KPIs to be printed out:
* Total travel distance in km
* Total track time: Human readable time representation.
* Total track start: Human readable time representation.
* Total track end: Human readable time representation.
* Total moving time: Human readable time representation. Detect movement by observing the speed vector. Introduce an adjustable threshold to detect "zero speed"
* Average travel speed in km/h. Use speed_corr.
* Average travel speed, only moving time in km/h. Use speed_corr.
* Average elevation of track in m

#### KPIs to be plotted:
* Histogram of the travel speeds. Use fixed bins in the interval [0,100] km/h with a stepsize of 5 km/h.
* Elevation of the track
* speed trend of the track (Either speed_ma or speed_corr)
________________________
#### Signature of the function
`kpis = statistical_kpis(my_mobility_data)`

**Inputs**: `my_mobility_data`: Dataframe containing the trip

**Returns**: `{"av_speed": av_speed, "total_distance": dis}`: Dictionary containing average speed and distance
________________________
Please make sure that the KPIs are printed/plotted in a way, that they are clearly specified and accompanied by a meaningful unit, e.g., "Total traveled distance: 3000 km"
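Two of the KPIs above can be sketched on a tiny, invented speed series: the moving time and the fixed histogram bins. The sketch sums the actual time steps of all samples above the threshold, which is one possible way to compute the moving time and not necessarily the approach used in the function below; the variable names are assumptions for this example.

```python
import numpy as np
import pandas as pd

# Invented speed signal in km/h, sampled every 10 s
index = pd.date_range("2020-08-01 10:00:00", periods=5, freq="10s")
speed = pd.Series([np.nan, 0.0, 36.0, 72.0, 0.0], index=index)

THRESHOLD = 0.0  # km/h, everything above counts as moving

# Time step belonging to each sample in seconds (NaN for the first sample)
dt = speed.index.to_series().diff().dt.total_seconds()
moving = speed > THRESHOLD            # NaN compares as False
moving_time_s = dt[moving].sum()
print("Total moving time: {} s".format(moving_time_s))

# Bin edges 0, 5, ..., 100 km/h; the stop value of np.arange is exclusive
bin_edges = np.arange(0, 105, 5)
counts, _ = np.histogram(speed.dropna(), bins=bin_edges)
print(len(bin_edges) - 1, "bins, last edge at", bin_edges[-1], "km/h")
```

Note that `range(0, 100, 5)` would stop at 95 km/h, so speeds between 95 and 100 km/h would fall outside the histogram.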

In [None]:
# Import of geopy
from geopy import distance
import geopy
from IPython.core.debugger import set_trace

# Threshold to detect zero speed
THRESHOLD_DETECT_ZERO_SPEED = 0

# Function to calculate statistical kpis and plot basic values
def statistical_kpis(my_mobility_data):

    # Calculate the total track distance in km of the input data 'my_mobility_data' and
    # print the result rounded to 2 digits
    # Hint: Use the previously introduced function total_distance()
    dis = None
    # INSERT CODE HERE | BEGIN
    #<<solution>>
    dis = total_distance(my_mobility_data)
    print("Total travel distance: {} km,".format(round(dis, 2)))
    #<</solution>>
    # INSERT CODE HERE | END 
    
    # Add the index again to my_mobility_data as a column 'time'
    my_mobility_data["time"] = my_mobility_data.index
    
    # Calculate the total track time 'time_delta' by subtracting the start time from the 
    # end time of my_mobility_data["time"]
    time_start = None
    time_end = None
    time_delta = None
    # INSERT CODE HERE | BEGIN
    #<<solution>>
    time_start = my_mobility_data["time"].iloc[0]
    time_end = my_mobility_data["time"].iloc[-1]
    time_delta = time_end - time_start
    #<</solution>>
    # INSERT CODE HERE | END 
    
    # Total track time, start and end time
    print("Total track time: {} s. This is {} h,".format(
        time_delta.total_seconds(), round(time_delta.total_seconds() / 3600, 2)))    
    print("Total track start: {}".format(time_start))
    print("Total track end: {}".format(time_end))

    # Calculation of the total moving time (speed higher than the zero-speed threshold):
    # number of moving samples times the mean time step of the input data
    mean_dt = my_mobility_data["time"].diff().mean().total_seconds()
    dt_counter = \
        mean_dt*my_mobility_data["speed_corr"].loc[my_mobility_data["speed_corr"] > THRESHOLD_DETECT_ZERO_SPEED].size
    print("Total moving time: {} s. This is {} h".format(round(dt_counter, 2), round(dt_counter / 3600, 2)))

    # Calculate the average travel speed, based on the corrected speed my_mobility_data["speed_corr"]
    # Hint: You may want to use the np.nanmean() function which calculates the mean() of all non-nan values
    av_speed = None
    # INSERT CODE HERE | BEGIN
    #<<solution>>
    av_speed = np.nanmean(my_mobility_data["speed_corr"])
    #<</solution>>
    # INSERT CODE HERE | END 
    print("Average travel speed: {} km/h".format(round(av_speed, 2)))
    
    # Average travel speed, but only moving time
    av_speed = \
        np.nanmean(my_mobility_data["speed_corr"].loc[my_mobility_data["speed_corr"] > THRESHOLD_DETECT_ZERO_SPEED])
    print("Average travel speed, only moving time: {} km/h".format(round(av_speed, 2)))

    # Calculation of the average elevation
    average_elevation = np.nanmean(my_mobility_data['elevation'])
    print("Average elevation of track: {} m".format(round(average_elevation,2)))

    ################################ Plots ######################################
    
    # Travel speeds histogram
    bin_size_speed = range(0, 100, 5)
    # Plot a histogram of the corrected speed data my_mobility_data["speed_corr"]. Use the given bin size
    # and set the argument 'density=True' of the plt.hist() function.
    # INSERT CODE HERE | BEGIN
    #<<solution>>
    fig, ax = plt.subplots(1, figsize=(8, 2))
    ax.hist(my_mobility_data["speed_corr"], bins=bin_size_speed, density=True)
    #<</solution>>
    # INSERT CODE HERE | END 
    ax.set_title("Histogram travel speeds")
    ax.set_ylabel("Density")
    ax.set_xlabel("km/h bins")

    # Plot of elevation of the whole track
    fig, ax = plt.subplots(1, figsize=(8, 2))
    ax.plot(my_mobility_data['elevation'])
    ax.set_title("Elevation of whole track")
    ax.set_ylabel("Altitude in m")
    ax.set_xlabel("time")
    
    # Plot of corrected speed in comparison to moving averaged speed
    #my_mobility_data["speed_ma"]
    fig, ax = plt.subplots(1, figsize=(8, 2))
    ax.plot(my_mobility_data["speed_corr"], marker=".", label="corr_speed")
    ax.plot(my_mobility_data["speed_ma"], marker=".", label="speed_ma")
    ax.set_title("Corrected speed / moving averaged speed")
    ax.legend()
    
    return {"av_speed": av_speed, "total_distance": dis}

### Test your function
Use the function `statistical_kpis()` on the dataset `my_mobility_data_cont`
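
Before running the function on real data, it can help to check the moving-time and average-speed KPIs on a tiny series. The following is a minimal sketch with made-up speed values, assuming a constant sampling interval; `dt_s` and `threshold` are illustrative names, and plain NumPy arrays stand in for the DataFrame columns.

```python
import numpy as np

# Made-up corrected speeds in km/h, one sample every dt_s seconds (illustrative values only)
speed_corr = np.array([0.0, 0.0, 12.0, 18.0, np.nan, 24.0, 0.0, 6.0])
dt_s = 10.0
threshold = 0.0  # plays the role of THRESHOLD_DETECT_ZERO_SPEED

# A sample counts as "moving" if its speed is strictly above the threshold.
# NaN compares as False, so missing values are not counted as moving.
moving = speed_corr > threshold
moving_time_s = dt_s * np.count_nonzero(moving)

# Mean over all valid samples vs. mean over moving samples only
av_speed_all = np.nanmean(speed_corr)
av_speed_moving = np.nanmean(speed_corr[moving])

print("Moving time: {} s".format(moving_time_s))
print("Average speed (all samples): {} km/h".format(round(av_speed_all, 2)))
print("Average speed (moving only): {} km/h".format(round(av_speed_moving, 2)))
```

The two averages differ because standstill samples pull the overall mean down. This is why the function reports both values.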

In [None]:
# INSERT CODE HERE | BEGIN
#<<solution>>
kpis_dict = statistical_kpis(my_mobility_data_cont)
#<</solution>>
# INSERT CODE HERE | END 

## Task 7: Popular Places

To analyze the recorded mobility behavior, we need to cut the GPX track into pieces that correspond to the trips actually travelled. A trip is a movement from a point A to a point B, where the subject stays for a certain time at both A and B. Before we can start looking for trips, we need to figure out where our popular places are. To find them, we use ordinary [2D-histograms](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.hist2d.html) over lat/lon bins of the whole track. By counting the GPS data points in each location bin, we can see where we have been and for how long. We use the dataset with continuous time, so every row represents the same amount of time.
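
The counting idea can be tried on a handful of points first. The sketch below uses made-up coordinates and `np.histogram2d`, which returns counts and bin edges without drawing a plot; the 2 x 2 grid is chosen only to keep the example readable and is not the bin count used in the lab.

```python
import numpy as np

# Made-up positions in degrees; every row stands for one equally long time step
lat = np.array([48.10, 48.10, 48.11, 48.26, 48.26, 48.26, 48.26, 48.20])
lon = np.array([11.50, 11.51, 11.50, 11.66, 11.67, 11.66, 11.67, 11.60])

# Count the samples per lat/lon bin
counts, lat_edges, lon_edges = np.histogram2d(lat, lon, bins=[2, 2])

# Position of the most visited bin in the count matrix and the centre of that bin
i, j = np.unravel_index(np.argmax(counts), counts.shape)
centre_lat = (lat_edges[i] + lat_edges[i + 1]) / 2
centre_lon = (lon_edges[j] + lon_edges[j + 1]) / 2

print(counts)
print("Busiest bin: {} samples around ({:.3f}, {:.3f})".format(
    int(counts[i, j]), centre_lat, centre_lon))
```

Because every sample represents the same time step, the count of a bin is proportional to the time spent there.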

In the following, the function `get_popular_places()` is given to obtain the n most popular places based on an input dataset.
______________

#### Signature of the function
`gdf_hotspots_polygons = get_popular_places(my_mobility_data_cont)`

**Inputs**: 
* `my_mobility_data_cont`: Dataframe containing the trip
* Optional
    * `N_BIGGEST_HOTSPOTS`: Number of the most frequented bins to look for
    * `FACTOR_BINSIZE`: Factor for calculation of real binsize
    * `crs`: crs of returned GeoDataFrame

**Returns**: `gdf_hotspots_polygons`: List containing GeoDataFrames with the N_BIGGEST_HOTSPOTS hotspots


In [None]:
# Import packages
import matplotlib.colors as mcolors
from shapely.geometry.polygon import Polygon
from shapely.geometry import box, Point
from IPython.core.debugger import set_trace

def get_popular_places(
    my_mobility_data_cont, N_BIGGEST_HOTSPOTS = 20, FACTOR_BINSIZE = 8, crs = {'init': 'epsg:4326'}):
        
    # Calculating the range of latitude and longitude values 
    min_lat = min(my_mobility_data_cont.geometry.y)
    max_lat = max(my_mobility_data_cont.geometry.y)
    min_lon = min(my_mobility_data_cont.geometry.x)
    max_lon = max(my_mobility_data_cont.geometry.x)
    d_lat_km = geopy.distance.distance((min_lat,min_lon),(max_lat,min_lon)).kilometers
    d_lon_km = geopy.distance.distance((min_lat,min_lon),(min_lat,max_lon)).kilometers
    print("Latitude range:  {} - {}. {} km".format(min_lat, max_lat, d_lat_km))
    print("Longitude range: {} - {}. {} km".format(min_lon, max_lon, d_lon_km))

    # Binsize is based on the range of lat / lng and FACTOR_BINSIZE
    # Higher FACTOR_BINSIZE means smaller bins 
    binsize = [int(d_lat_km * FACTOR_BINSIZE), int(d_lon_km * FACTOR_BINSIZE)]
    print("Binsize is: {}".format(binsize))
    
    # Use the plt.hist2d() function to plot a histogram with previously calculated binsize. This function 
    # returns a bi-dimensional histogram of samples x and y
    fig, ax = plt.subplots(1)
    pop_places_matrix = ax.hist2d(my_mobility_data_cont.geometry.y, my_mobility_data_cont.geometry.x,
                                   bins=binsize, norm=mcolors.PowerNorm(0.1))
    ax.set_title("Histogram lat/lon, number of visits per bin")
    ax.set_ylabel("Longitude")
    ax.set_xlabel("Latitude")
    
    # Find the N_BIGGEST_HOTSPOTS
    hotspots = np.sort(pop_places_matrix[0].flatten())[-N_BIGGEST_HOTSPOTS:]

    # Everything bigger than or equal to the minimum of the N_BIGGEST_HOTSPOTS counts is considered a hotspot
    THRESHOLD_HISTO_DYN = np.min(hotspots)
    print("Threshold bins of interest: ", THRESHOLD_HISTO_DYN)

    # Based on the found THRESHOLD_HISTO_DYN, we iterate over the matrix structure and identify the hotspot entries
    gdf_hotspots_polygons = []
    for i in range(0, pop_places_matrix[0].shape[0]):  
        for j in range(0, pop_places_matrix[0].shape[1]): 

            if pop_places_matrix[0][i][j] >= THRESHOLD_HISTO_DYN:
                print("Found {} at i={}, j={}".format(pop_places_matrix[0][i][j], i, j))
                lat1 = pop_places_matrix[1][i]
                lat2 = pop_places_matrix[1][i+1]
                lon1 = pop_places_matrix[2][j]
                lon2 = pop_places_matrix[2][j+1]

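                # Centre of the bin. No abs() here, otherwise negative coordinates
                # (western or southern hemisphere) would be mirrored.
                circle = Point((lon1 + lon2) / 2, (lat1 + lat2) / 2).buffer(0.002)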

                gdf_hotspots_polygons.append(
                    gpd.GeoDataFrame({"n_visits": pop_places_matrix[0][i][j]},index=[0], crs=crs, geometry=[circle]))
                
    return gdf_hotspots_polygons

### Use the given function
Your task is now to obtain the 20 most popular places of the dataset `my_mobility_data_cont` using the given function `get_popular_places()`.

In [None]:
# Get the 20 most popular places 
gdf_hotspots_polygons = None
# INSERT CODE HERE | BEGIN
#<<solution>>
gdf_hotspots_polygons = get_popular_places(my_mobility_data_cont)
#<</solution>>
# INSERT CODE HERE | END 

## Task 8 (given): Plot the polygons of the hot spots on the map
Use the class `folium_plot` from Task 5 to plot the track and the polygons / circles on the map.
* For the track, use again the function `add_to_folium()`
* For the polygons / circles, iterate over the polygons list and use the `add_geojson_to_folium()` on each item

The code for the task is already given. If the code doesn't run, make sure you pass the correct kind of input data from the previous task. You should see the total trip in red and the hotspots marked as blueish ovals/circles.

In [None]:
# Task 8: Plot the polygons of the hot spots on the map
m2 = folium_plot()

m2.add_to_folium(my_mobility_data_cont)

for idx, gdf_poly in enumerate(gdf_hotspots_polygons):
    tooltip = "Nr: {}. n_visits: {}".format(idx, gdf_poly["n_visits"][0])
    m2.add_geojson_to_folium(gdf_poly.to_crs(epsg='4326').to_json(), tooltip)

m2.save_map("polygons.html")
m2.show_map()

## Task 9: Find trips between the hotspots

A core discipline in mobility data analysis is the detection of single trips between hotspots in a continuous GPS track. Now we want to find the trips between the previously discovered hotspots. To do so, implement a trip detector as described in the following:

Functions needed inside the detector:
* Given: `polygons_contain_geometry(polygons, geometry)`
* From Task 8: `total_distance()`

Key logic of the algorithm:
* We want to detect the start of a trip when we leave one of the hotspots
* We want to detect the stop of a trip when we arrive in a hotspot and speed is close to zero
* Record trips only, if they have a minimum length of 1 km
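
The key logic above can be tried out on a toy sequence before working with GeoDataFrames. The following is a simplified sketch, not the lab solution: each sample is a made-up pair of hotspot index (`-1` for "outside") and speed, `detect_trips` and `SPEED_STOP_KMH` are hypothetical names, and the minimum-length check is left out.

```python
# Each sample: (index of the hotspot containing the point or -1, speed in km/h). Made-up values.
samples = [
    (0, 0.0), (0, 1.0),        # waiting in hotspot 0
    (-1, 20.0), (-1, 35.0),    # on the road
    (1, 15.0), (1, 0.5),       # entering hotspot 1, then slowing down
    (1, 0.0),                  # waiting in hotspot 1
    (-1, 25.0), (0, 1.0),      # back to hotspot 0
]
SPEED_STOP_KMH = 2.0  # assumed limit for "speed is close to zero"


def detect_trips(samples):
    """Return a list of (start_index, end_index) pairs, one per detected trip."""
    trips = []
    started = False      # a trip candidate is currently being followed
    left_start = False   # the start hotspot has already been left
    idx_start = None
    for k, (hotspot, speed) in enumerate(samples):
        inside = hotspot >= 0
        # 1. Inside a hotspot and no trip started: remember a possible start
        if inside and not started:
            started, left_start, idx_start = True, False, k
        # 2. First sample outside the start hotspot: the trip really begins here
        if started and not inside and not left_start:
            left_start = True
            idx_start = k - 1  # last sample inside the start hotspot
        # 3. Back inside a hotspot at low speed: the trip ends
        if inside and started and left_start and speed < SPEED_STOP_KMH:
            trips.append((idx_start, k))
            started = False
    return trips


print(detect_trips(samples))
```

The second trip starts at the last standstill sample in hotspot 1, because the start index is reset when the hotspot is left.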

| Control variable and init-value | Description |
|:--------------------------------|:------------|
| flag_start_track = False        | Flag to mark if a track is started |
| current_start_polygon = -1      | Index of the current start-polygon. Returned from *polygons_contain_geometry()* |
| current_end_polygon = -1        | Index of the detected end-polygon. Returned from *polygons_contain_geometry()* |
| flag_left_start_polygon = None  | Equals **False** when a new track started. Flag set to **True** if we left the start-polygon. |
| trips = []                      | List holding the detected trips |
| counter_general = 0             | Counts every iteration of the for-loop. Index into *my_mobility_data* |

<img src="images/loop trip detector.png" alt="Drawing" style="width: 600px;"/>


In [None]:
import datetime
import pytz
import time

######################## Given ###########################################################################
def polygons_contain_geometry(polygons, geometry):
    '''
    Expects a list of GeoDataFrames containing shapes and tests for each of them whether
    the given geometry lies inside.
    Parameters:
    polygons: List of GeoDataFrames with geometry (polygon, circle...)
    geometry: One row of a GeoDataFrame whose geometry is tested against all polygons
    Return:
    - is_inside: True if a containing polygon was found, otherwise False
    - idx: Index of the polygon which contains the geometry, -1 if none was found
    '''
    if not isinstance(polygons, list):
        # Raising a plain string is not allowed in Python 3, so raise a proper exception
        raise TypeError("We need a list of GeoDataFrames containing polygons!")

    for idx, polygon in enumerate(polygons):
        is_inside = polygon.geometry.contains(geometry.geometry)[0]
        if is_inside:
            return is_inside, idx

    # No polygon contains the geometry
    return False, -1

### Implementation of trip detector
Implement the trip detector as shown in the diagram above. Please use the following set of control variables, which need to be initialized before entering the loop. Run the detector on the dataset and plot the detected trips using the `add_many_to_folium()` function of the plotting class. How many trips do you find?

**IMPORTANT NOTE:** The flow diagram above describes each step in detail and uses the correct variable names. It might be helpful to open the diagram in a separate window and to model your code on it with the same syntax.

In [None]:
# Const MINIMUM_TRIP_LENGTH_KM
MINIMUM_TRIP_LENGTH_KM = 1

def trip_detector(my_mobility_data, gdf_hotspot_polygons):
    
    # Initialize controller variables before starting loop (same variable names as shown in the flow diagram)
    flag_start_track=False
    current_start_polygon=-1
    current_end_polygon=-1
    flag_left_start_polygon = None
    trips = []
    counter_general = 0  # counting the for-loop runs, increment after every loop

    
    # Loop over the my_mobility_data
    for idx, row in my_mobility_data.iterrows():
        
        # Check if current rows point lies inside a hotspot
        is_inside, inside_idx = polygons_contain_geometry(gdf_hotspot_polygons, row)
        
        # 1. Find start point of trajectory ('Inside polygon and trip not started')
        # If the current point is within a hotspot polygon, we start looking for a trajectory.
        # We only do so if we are not already following one (flag_start_track)
        if is_inside and not flag_start_track:
            
            # START TRIP
            current_start_polygon = None 
            flag_left_start_polygon = None
            flag_start_track = None
            idx_start = None
            # INSERT CODE HERE | BEGIN
            #<<solution>>
            current_start_polygon = inside_idx
            flag_left_start_polygon = False
            flag_start_track = True  # set flag to true: we follow a trajectory now
            idx_start = counter_general  # memorize the start index of the current trajectory
            #<</solution>>
            # INSERT CODE HERE | END 

        # 2. Check if we already left one of the hotspots ('Check if we left hotspot')
        # Hint: You need to insert an IF statement and, if it is TRUE, 'Update Start Information' as
        # described in the flow chart
        
        # INSERT CODE HERE | BEGIN
        #<<solution>>
        if is_inside == False and flag_left_start_polygon == False:
            # Set the flag, that we left the start polygon
            flag_left_start_polygon = True
            # Also reset the start index to the previous sample (the last one inside the start polygon)
            idx_start = counter_general-1
        #<</solution>>
        # INSERT CODE HERE | END 
                
        # 3. Find end point of trajectory ('Check if we arrive in new hotspot')
        # We want to check if the track enters a hotspot again. This can only happen after we left the
        # start polygon and while we are following a trajectory (flag_start_track)
        if is_inside and flag_start_track and flag_left_start_polygon == True:
            
            # 4. Set conditions to end trip ('End-conditions')
            # Speed needs to be below 2 km/h and we must have been outside of the hotspots in between
            if row["speed_corr"] < 2:
                
                current_end_polygon = inside_idx
                flag_start_track = False  # we end the trajectory here
                idx_end = counter_general  # memorize the end index of the current trajectory

                # Slice trip
                trip = my_mobility_data.iloc[idx_start:idx_end+1, :]  # slice the found trip from the whole dataset

                # We only want to save the trip, 'if total_distance(trip) > MINIMUM_TRIP_LENGTH_KM'
                # Hint: You might want to append a 'trip' to 'trips' if statement is true
                # INSERT CODE HERE | BEGIN
                #<<solution>>
                if total_distance(trip) > MINIMUM_TRIP_LENGTH_KM:
                    trips.append(trip)
                #<</solution>>
                # INSERT CODE HERE | END 
        
        counter_general+=1
    
    return trips

### Identify trips
Use the trip detector `trip_detector()` implemented above to identify trips based on the dataset `my_mobility_data_cont` and the previously identified hotspots `gdf_hotspots_polygons`. Plot these trips using the `folium_plot()` class.

In [None]:
# Use the implemented trip detector 'trip_detector' to identify all trips between hotspots.
trips = None
# INSERT CODE HERE | BEGIN
#<<solution>>
trips = trip_detector(my_mobility_data_cont, gdf_hotspots_polygons)
#<</solution>>
# INSERT CODE HERE | END 

# How many trips can you identify?
# INSERT CODE HERE | BEGIN
#<<solution>>
print("The dataset contains {} trips between hotspots.".format(len(trips)))
#<</solution>>
# INSERT CODE HERE | END       

In [None]:
# Plot the trips and the hotspots
m3 = folium_plot()
m3.add_many_to_folium(trips)

for idx, gdf_poly in enumerate(gdf_hotspots_polygons):
    tooltip = "Nr: {}. n_visits: {}".format(idx, gdf_poly["n_visits"][0])
    m3.add_geojson_to_folium(gdf_poly, tooltip)

m3.save_map("trips.html")
m3.show_map()

## Task 10 (given): Analyze the Trips using Reverse Geocoding

#### Info
Now we want to analyze the trips in a bit more detail. As a tool, we can use the FTM GIS-API to perform reverse geocoding on our dataset, i.e., to look up the address that belongs to a coordinate. We want to reverse-geocode the start and end position of all trips.

Use the following sources: [Documentation GIS-API](https://wiki.tum.de/pages/viewpage.action?pageId=39591716) and [Python Requests to do HTTP-Requests](https://2.python-requests.org/en/master/)

**IMPORTANT NOTE**: If you are doing this at home, GIS-API is only reachable from the university network. Use the LRZ-VPN.

#### Task
* Define a function which takes all the trips and prints a summary like the example below for every trip. Use the requests package to send an HTTP request containing the start or end position of the trip to the GIS-API and process the response.

Print Example: <br>
`---- Trip 0 Geocoding --------------------------------------------
Start Time: 2019-07-31 16:22:51.722000+00:00
Start address: TUM Fakultät Mathematik und Informatik, 3, Boltzmannstraße, Hochschul- und Forschungszentrum, Garching,Landkreis München, Obb, Bayern, 85748, Deutschland
_-----
End Time: 2019-07-31 17:05:51.722000+00:00
End address: 6, Hörwarthstraße, Münchner Freiheit, Bezirksteil Münchner Freiheit, Stadtbezirk 04 Schwabing-West, München, Obb, Bayern, 80804, Deutschland
_-------------------------------------------------------------------
`

* Run the function on the trips
* Examine the result: Do you get meaningful information out of the API? Do you get reasonable start/end times?

________________________

#### Signature of the function
`reverse_geocode(trips)`

**Inputs**: `trips`: List of trips, each trip is a geoDataFrame

**Returns**: No returns
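
Since the GIS-API is only reachable from the university network, the two pure-Python steps of the task (building the request URL and reading the address out of the reply) can be tested offline. The following is a minimal sketch under assumptions: the helper names `build_reverse_url` and `extract_address` are hypothetical, the reply is assumed to be a GeoJSON-like object with a `display_name` property, and `sample_reply` is a hand-written stand-in, not real API output.

```python
import json

# URL pattern of the reverse geocoding endpoint used in this task
BASE_URL = "http://gis.ftm.mw.tum.de/reverse?coordinates={coordinate_list}"


def build_reverse_url(lon, lat):
    """Build the request URL; the coordinate list is ordered [lon, lat]."""
    return BASE_URL.format(coordinate_list=str([lon, lat]))


def extract_address(response_text):
    """Return the display name of the first feature, or None if there is none."""
    data = json.loads(response_text)
    features = data.get("features", [])
    if not features:
        return None
    return features[0].get("properties", {}).get("display_name")


# Hand-written stand-in for a server reply (assumed layout, not real API output)
sample_reply = json.dumps(
    {"features": [{"properties": {"display_name": "Example Street 1, Example Town"}}]}
)

print(build_reverse_url(11.62467, 48.2188))
print(extract_address(sample_reply))
print(extract_address('{"features": []}'))
```

Returning `None` for an empty feature list avoids an `IndexError` when the API finds no address for a coordinate.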

In [None]:
# Prototype: http://gis.ftm.mw.tum.de/reverse?coordinates=[11.62467, 48.2188]
import requests
import json

def reverse_geocode(trips):
    
    url = "http://gis.ftm.mw.tum.de/reverse?coordinates={coordinate_list}"

    for idx, trip in enumerate(trips):
        # Use .iloc for positional access; plain [0] / [-1] relies on a deprecated
        # positional fallback when the index is not an integer index
        start_point = trip.geometry.iloc[0]
        end_point = trip.geometry.iloc[-1]
        url_get_start = url.format(coordinate_list=str([start_point.x, start_point.y]))
        url_get_end = url.format(coordinate_list=str([end_point.x, end_point.y]))

        resp_start = requests.get(url=url_get_start)
        resp_end = requests.get(url=url_get_end)

        print("---- Trip {} Geocoding --------------------------------------------".format(idx))
        print("Start Time: {}".format(trip["time"].iloc[0]))
        print("Start address: ", end="")
        print(json.loads(resp_start.text)["features"][0]["properties"]["display_name"])
        print("---")
        print("End Time: {}".format(trip["time"].iloc[-1]))
        print("End address: ", end="")
        print(json.loads(resp_end.text)["features"][0]["properties"]["display_name"])
        print("-------------------------------------------------------------------")
    

# Sort trips by ascending start time
trips = sorted(trips, key=lambda x: x["time"].iloc[0])

# Apply the reverse geocode function to the trips
reverse_geocode(trips)

### Statistical KPIs of the single trips including map

Use the function calculating the KPIs and the reverse geocoding in one step on every trip and examine the results. Additionally, plot each trip on a small folium map, to examine only the trip. Put everything in a loop and let it run over all your trips.

**Hint:** You can call `folium_plot()` as shown below to adjust the size and the center of the map:

`mx = folium_plot(width="50%",height="200", location=[lat, lon])`

Answer these questions during the homework. Of course, you can already start thinking about them :)
* Can you estimate the mode choice of every trip?
* Can you find out about home / work place from the dataset?
* Can you deduce free-time activities of the person tracked?


In [None]:
# Statistical KPIs of the single trips including map
for idx, trip in enumerate(trips):
    print("---- Trip {} ------------------------------------------------------".format(idx))
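    # Centre the map on the middle point of the trip (positional access via .iloc)
    idx_mid = len(trip.geometry) // 2
    mx = folium_plot(width="50%", height="200",
                     location=[trip.geometry.y.iloc[idx_mid], trip.geometry.x.iloc[idx_mid]])
    mx.add_to_folium(trip)
    # Marker at the first point of the trip; both coordinates must come from the same row (index 0)
    mx.add_marker_to_folium(lon=trip.geometry.y.iloc[0], lat=trip.geometry.x.iloc[0],
                            popup="starttime: {}".format(trip["time"].iloc[0]))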
    display(mx.show_map())
    
    reverse_geocode([trip])
    
    statistical_kpis(trip)
    print("------------------------------------------------------------------")

In [None]:
# If warning: too many open plots, run this cell to close all the open plots
plt.close("all")