# Generalizing trajectories

<img align="right" src="https://anitagraser.github.io/movingpandas/assets/img/movingpandas.png">

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/anitagraser/movingpandas-examples/main?filepath=1-tutorials/7-generalizing-trajectories.ipynb)
[![IPYNB](https://img.shields.io/badge/view-ipynb-hotpink)](https://github.com/anitagraser/movingpandas-examples/blob/main/1-tutorials/7-generalizing-trajectories.ipynb)
[![HTML](https://img.shields.io/badge/view-html-green)](https://anitagraser.github.io/movingpandas-website/1-tutorials/7-generalizing-trajectories.html)

To reduce the size (number of points) of trajectory objects, we can generalize them, for example, using:

- Spatial generalization, such as the Douglas-Peucker algorithm
- Temporal generalization by down-sampling, i.e. increasing the time interval between records
- Spatiotemporal generalization, e.g. using the Top-Down Time Ratio algorithm

[Documentation](https://movingpandas.readthedocs.io/en/master/trajectorygeneralizer.html)

A closely related type of operation is trajectory smoothing, which is [covered in a separate notebook](./10-smoothing-trajectories.ipynb).
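
To make the idea of temporal down-sampling concrete, here is a minimal pure-Python sketch. It is not the MovingPandas implementation (the library offers `MinTimeDeltaGeneralizer` for this purpose). The function name `downsample_by_time`, the tuple layout, and the sample track are assumptions made for illustration only, and keeping both endpoints is a choice made in this sketch.

```python
from datetime import datetime, timedelta


def downsample_by_time(records, min_delta):
    """Keep a record only if at least `min_delta` has passed since the last kept one.

    `records` is a time-sorted list of (timestamp, x, y) tuples.
    The first and last records are always kept in this sketch.
    """
    if len(records) <= 2:
        return list(records)
    kept = [records[0]]
    for record in records[1:-1]:
        if record[0] - kept[-1][0] >= min_delta:
            kept.append(record)
    kept.append(records[-1])
    return kept


start = datetime(2017, 7, 5, 12, 0, 0)
# Hypothetical track: one record every 10 seconds, moving east
track = [(start + timedelta(seconds=10 * i), 0.001 * i, 0.0) for i in range(13)]
downsampled = downsample_by_time(track, timedelta(seconds=30))
print(len(track), '->', len(downsampled))
```

With a 30-second minimum interval, only the records at 0, 30, 60, 90 and 120 seconds remain.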

In [None]:
import sys

import geopandas as gpd
import matplotlib.pyplot as plt
import movingpandas as mpd
import pandas as pd
from holoviews import opts

from pymeos import pymeos_initialize, TGeogPointSeq, TGeogPointInst, TGeomPointInst, TGeomPointSeq, TPointSeq, TPointInst

pymeos_initialize()

plot_defaults = {'linewidth': 5, 'capstyle': 'round', 'figsize': (9, 3), 'legend': True}
opts.defaults(opts.Overlay(active_tools=['wheel_zoom'], frame_width=500, frame_height=400))

In [None]:
pdf = pd.read_csv('data/aisinput.csv')
gdf = gpd.GeoDataFrame(pdf.drop(['latitude', 'longitude'], axis=1),
                       geometry=gpd.points_from_xy(pdf.longitude, pdf.latitude), crs=4326)
traj_collection = mpd.TrajectoryCollection(gdf, 'mmsi', t='t')

In [None]:
original_traj = traj_collection.trajectories[1]
print(original_traj)

In [None]:
original_traj.plot(column='speed', vmax=20, **plot_defaults)

In [None]:
def create_point(row) -> TPointInst:
    # Build a MEOS instant from "<WKT point>@<timestamp>"; row.name is the row's timestamp index
    return TGeogPointInst(string=f"{row['geometry']}@{row.name}")


original_traj.df['MEOS Point'] = original_traj.df.apply(create_point, axis=1)

In [None]:
sequence = TGeogPointSeq(instant_list=original_traj.df['MEOS Point'], normalize=False)

## Spatial generalization (DouglasPeuckerGeneralizer)

In [None]:
dp_generalized = mpd.DouglasPeuckerGeneralizer(original_traj).generalize(tolerance=0.001)
dp_generalized.plot(column='speed', vmax=20, **plot_defaults)

In [None]:
dp_generalized_pymeos = sequence.simplify(synchronized=False, tolerance=0.001).to_trajectory()
dp_generalized_pymeos.plot(column='speed', vmax=20, **plot_defaults)

In [None]:
dp_generalized

In [None]:
dp_generalized_pymeos

In [None]:
print('Original length: %s' % (original_traj.get_length()))
print('Generalized length: %s' % (dp_generalized.get_length()))
print('Generalized PyMEOS length: %s' % (dp_generalized_pymeos.get_length()))
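
For readers who want to see what the Douglas-Peucker algorithm does under the hood, here is a minimal pure-Python sketch on plain `(x, y)` tuples. It is not the code used by MovingPandas or MEOS, and the helper names and the sample line are assumptions made for illustration. The tolerance is expressed in the same units as the coordinates.

```python
import math


def point_to_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (all (x, y) tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to the endpoints
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop points that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord between the endpoints
    distances = [point_to_segment_distance(p, points[0], points[-1]) for p in points[1:-1]]
    max_distance = max(distances)
    index = distances.index(max_distance) + 1
    if max_distance <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest point and simplify both halves recursively
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.05), (2, -0.05), (3, 2), (4, 0.05), (5, 0)]
simplified = douglas_peucker(line, tolerance=0.1)
print(simplified)
```

In this example the spike at `(3, 2)` survives, while the nearly collinear point `(1, 0.05)` is dropped. Note that timestamps play no role here: the decision is purely geometric.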

## Spatiotemporal generalization (TopDownTimeRatioGeneralizer)
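
Top-Down Time Ratio follows the same top-down splitting idea as Douglas-Peucker, but it measures how far a record is from the position the mover would have had at that record's timestamp if it travelled at constant speed between the two end records (often called the synchronized Euclidean distance). The following sketch illustrates only that distance measure. The function name, the `(t, x, y)` tuple layout, and the sample records are assumptions for illustration, not library code.

```python
import math


def synchronized_euclidean_distance(p, start, end):
    """Distance between p and where a constant-speed mover would be at p's time.

    Each argument is a (t, x, y) tuple with t in seconds.
    """
    t, x, y = p
    t0, x0, y0 = start
    t1, x1, y1 = end
    if t1 == t0:
        return math.hypot(x - x0, y - y0)
    # Fraction of the time interval that has elapsed at time t
    ratio = (t - t0) / (t1 - t0)
    expected_x = x0 + ratio * (x1 - x0)
    expected_y = y0 + ratio * (y1 - y0)
    return math.hypot(x - expected_x, y - expected_y)


# Hypothetical records: the mover lingers near the start, then hurries along a straight line
first = (0, 0.0, 0.0)
slow = (80, 1.0, 0.0)
last = (100, 10.0, 0.0)

sed = synchronized_euclidean_distance(slow, first, last)
print(sed)
```

The middle record lies exactly on the straight line between the other two, so a purely spatial measure would treat it as redundant. Its synchronized distance is large, however, because after 80% of the time only 10% of the way has been covered. This is why a spatiotemporal generalizer can retain points that mark changes in speed.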

In [None]:
tdtr_generalized = mpd.TopDownTimeRatioGeneralizer(original_traj).generalize(tolerance=0.001)

In [None]:
tdtr_generalized.df.head()

In [None]:
tdtr_generalized_pymeos = sequence.simplify(synchronized=True, tolerance=0.001).to_trajectory()

Let's compare this to the basic Douglas-Peucker result:

In [None]:
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(19, 7))
tdtr_generalized.plot(ax=axes[0][0], column='speed', vmax=20, **plot_defaults)
tdtr_generalized_pymeos.plot(ax=axes[0][1], column='speed', vmax=20, **plot_defaults)
dp_generalized.plot(ax=axes[1][0], column='speed', vmax=20, **plot_defaults)
dp_generalized_pymeos.plot(ax=axes[1][1], column='speed', vmax=20, **plot_defaults)

In [None]:
tdtr_generalized

In [None]:
tdtr_generalized_pymeos

In [None]:
from time import perf_counter


def average_runtime(func, times=10):
    # Mean wall-clock time (in seconds) of calling func() `times` times
    start = perf_counter()
    for _ in range(times):
        func()
    return (perf_counter() - start) / times


def speeds():
    # Time each generalizer with the same tolerance used above
    dp = average_runtime(
        lambda: mpd.DouglasPeuckerGeneralizer(original_traj).generalize(tolerance=0.001))
    dp_meos = average_runtime(
        lambda: sequence.simplify(synchronized=False, tolerance=0.001).to_trajectory())
    tdtr = average_runtime(
        lambda: mpd.TopDownTimeRatioGeneralizer(original_traj).generalize(tolerance=0.001))
    tdtr_meos = average_runtime(
        lambda: sequence.simplify(synchronized=True, tolerance=0.001).to_trajectory())
    return dp, dp_meos, tdtr, tdtr_meos

In [None]:
dp, dp_meos, tdtr, tdtr_meos = speeds()

In [None]:
print(f'Douglas-Peucker MovingPandas: {dp:0.3f}s')
print(f'Douglas-Peucker PyMEOS: {dp_meos:0.3f}s')
print(f'Douglas-Peucker speedup (MovingPandas / PyMEOS): {dp / dp_meos:0.2f}x')
print(f'Top-Down Time Ratio MovingPandas: {tdtr:0.3f}s')
print(f'Top-Down Time Ratio PyMEOS: {tdtr_meos:0.3f}s')
print(f'Top-Down Time Ratio speedup (MovingPandas / PyMEOS): {tdtr / tdtr_meos:0.2f}x')
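
Averaging over ten runs is a reasonable first estimate, but single slow runs (e.g. caused by other processes) can skew the mean. As an alternative, the standard library's `timeit` module can repeat the measurement several times so that we can report the best round. The sketch below uses a placeholder workload instead of the generalizers so that it runs on its own. The names `best_runtime` and `stand_in_workload` are hypothetical.

```python
import timeit


def best_runtime(func, repeat=5, number=10):
    """Return the best per-call time in seconds over `repeat` rounds of `number` calls."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number


def stand_in_workload():
    # Placeholder for a call such as a generalizer's generalize(); purely illustrative
    return sorted(range(1000), reverse=True)


runtime = best_runtime(stand_in_workload)
print(f'Best of 5 rounds: {runtime:0.6f}s per call')
```

To time the generalizers from this notebook instead, one could pass a zero-argument callable (for example a `lambda`) that wraps the corresponding call.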