# Handling trajectory data files (reading & writing)

<img align="right" src="https://movingpandas.github.io/movingpandas/assets/img/movingpandas.png">

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/movingpandas/movingpandas/main?filepath=tutorials/2-reading-data-from-files.ipynb)

**<p style="color:#e31883">This notebook demonstrates the current development version of MovingPandas.</p>**

For tutorials using the latest release, visit https://github.com/movingpandas/movingpandas-examples.


In [None]:
import pandas as pd
from geopandas import GeoDataFrame, read_file

import sys

# Add the parent directory (the repository root) to the import path.
sys.path.append("..")
import movingpandas as mpd

mpd.show_versions()

import warnings

# Suppress warnings to keep the notebook output readable.
warnings.simplefilter("ignore")

In [None]:
hvplot_defaults = {
    "tiles": "CartoLight",
    "frame_height": 400,
    "frame_width": 700,
    "cmap": "Viridis",
    "colorbar": True,
}

## Reading Geopackages

### with DatetimeIndex

In [None]:
%%time
gdf = read_file("data/demodata_geolife.gpkg")
gdf["t"] = pd.to_datetime(gdf["t"])
gdf = gdf.set_index("t")
tc = mpd.TrajectoryCollection(gdf, "trajectory_id")
print(tc)

### without DatetimeIndex

In [None]:
%%time
gdf = read_file("data/demodata_geolife.gpkg")
tc = mpd.TrajectoryCollection(gdf, "trajectory_id", t="t")
print(tc)

In [None]:
%%time
gdf = read_file("data/demodata_ais.gpkg")
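# Parse the day-first timestamp strings into datetimes.
gdf["t"] = pd.to_datetime(gdf["Timestamp"], format="%d/%m/%Y %H:%M:%S")
# Keep only records with a speed over ground (SOG) above zero.
gdf = gdf[gdf.SOG > 0]
# Discard trajectories shorter than min_length.
tc = mpd.TrajectoryCollection(gdf, "MMSI", min_length=100, t="t")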
print(tc)

Note that MovingPandas treats all times as local times. If you need to use a different time zone, you will have to convert your data before creating trajectories or trajectory collections.
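
One way to do such a conversion with plain pandas is sketched below. The sample DataFrame, the assumption that the raw timestamps were recorded in UTC, and the target zone `Europe/Vienna` are all made up for illustration; adapt them to your data.

```python
import pandas as pd

# Hypothetical sample data: timestamps assumed to be recorded in UTC.
df = pd.DataFrame(
    {
        "t": ["2019-02-18 06:45:00", "2019-02-18 06:46:00"],
        "X": [16.37, 16.38],
        "Y": [48.21, 48.22],
    }
)

# Parse the strings as UTC, convert to the target zone, then drop the
# zone information so the remaining naive timestamps hold local clock time.
utc_times = pd.to_datetime(df["t"], utc=True)
df["t"] = utc_times.dt.tz_convert("Europe/Vienna").dt.tz_localize(None)

print(df["t"].tolist())
```

The converted column can then be passed on as the time column (for example via `t="t"`) when building trajectories.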

## Reading CSVs

In [None]:
%%time
df = pd.read_csv("data/demodata_geolife.csv", delimiter=";")
tc = mpd.TrajectoryCollection(df, "trajectory_id", t="t", x="X", y="Y")
tc.hvplot(title=str(tc), line_width=5, **hvplot_defaults)

## Reading GPX files

In [None]:
%%time
gdf = read_file("data/304 to UL 2019-02-18 0745.gpx", layer="track_points").set_index(
    "time"
)
gdf.drop(
    columns=[
        "track_fid",
        "track_seg_id",
        "ele",
        "magvar",
        "geoidheight",
        "name",
        "cmt",
        "desc",
        "src",
        "link1_href",
        "link1_text",
        "link1_type",
        "link2_href",
        "link2_text",
        "link2_type",
        "sym",
        "type",
        "fix",
        "sat",
        "hdop",
        "vdop",
        "pdop",
        "ageofdgpsdata",
        "dgpsid",
    ],
    inplace=True,
)
traj = mpd.Trajectory(gdf, "2019-02-18 0745", obj_id="304")
traj.add_distance().add_speed(name="speed (kph)", units=("km", "h"))

In [None]:
traj.df

In [None]:
traj.plot()

In [None]:
traj.hvplot(
    c="speed (kph)",
    clim=(0, 60),
    line_width=7.0,
    title=f"Bus {traj.obj_id} departing {traj.id}",
    xlabel="Longitude",
    ylabel="Latitude",
    clabel="Speed (km/h)",
    tiles="CartoLight",
    cmap="RdYlGn",
    colorbar=True,
) * read_file("data/stops_304_to_ul.gpkg").hvplot(
    geo=True, size=140, marker="+", color="blue"
)

## Reading MovingFeatures JSONs (MF-JSON)
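
To give a rough idea of what such a file contains, the sketch below parses a small hand-written document with the standard library only. The document is hypothetical and is not the content of `data/movingfeatures.json`; its field names (`temporalGeometry`, `datetimes`, `coordinates`, `temporalProperties`) follow the general shape of the OGC Moving Features JSON encoding, and whether this exact document loads with `mpd.read_mf_json` has not been checked here.

```python
import json
from datetime import datetime

# Hand-written, hypothetical MF-JSON document for illustration only.
mf_json_text = """
{
  "type": "Feature",
  "properties": {"name": "demo"},
  "temporalGeometry": {
    "type": "MovingPoint",
    "datetimes": ["2019-02-18T07:45:00Z", "2019-02-18T07:46:00Z"],
    "coordinates": [[16.37, 48.21], [16.38, 48.22]]
  },
  "temporalProperties": [
    {
      "datetimes": ["2019-02-18T07:45:00Z", "2019-02-18T07:46:00Z"],
      "wind": {"type": "Measure", "values": [3.5, 4.0], "interpolation": "Linear"}
    }
  ]
}
"""

doc = json.loads(mf_json_text)
geom = doc["temporalGeometry"]

# Pair every timestamp with its coordinate; "Z" is rewritten as an explicit
# UTC offset so that datetime.fromisoformat accepts it on older Pythons.
points = [
    (datetime.fromisoformat(ts.replace("Z", "+00:00")), tuple(xy))
    for ts, xy in zip(geom["datetimes"], geom["coordinates"])
]

# Time-varying attributes are stored separately from the geometry.
wind = doc["temporalProperties"][0]["wind"]["values"]

for (when, (lon, lat)), speed in zip(points, wind):
    print(when.isoformat(), lon, lat, speed)
```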

In [None]:
%%time
file_name = "data/movingfeatures.json"
traj = mpd.read_mf_json(file_name)
traj

In [None]:
traj.plot()

In [None]:
traj.hvplot(
    title="Wind measure along trajectory", c="wind", line_width=5, **hvplot_defaults
)

## Writing as points

In [None]:
point_gdf = tc.to_point_gdf()
point_gdf.head()

In [None]:
point_gdf.to_file("temp.gpkg", layer="points", driver="GPKG")
read_file("temp.gpkg", layer="points").plot()

## Writing as lines

In [None]:
line_gdf = tc.to_line_gdf()
line_gdf.head()

In [None]:
line_gdf.to_file("temp.gpkg", layer="lines", driver="GPKG")
read_file("temp.gpkg", layer="lines").plot()

## Writing as trajectories

In [None]:
traj_gdf = tc.to_traj_gdf(wkt=True)
traj_gdf

In [None]:
traj_gdf.to_file("temp.gpkg", layer="trajectories", driver="GPKG")
read_file("temp.gpkg", layer="trajectories").plot()

## Writing as MF-JSON

In [None]:
mf_json = tc.to_mf_json()
mf_json

In [None]:
mpd.read_mf_dict(mf_json, traj_id_property="trajectory_id").plot()

## Error messages while reading

The following cells deliberately trigger errors to show what happens when required information is missing:

### Missing datetime info

In [None]:
gdf = read_file("data/demodata_geolife.gpkg")

try:
    mpd.TrajectoryCollection(gdf, "trajectory_id")
except TypeError as e:
    print(f"TypeError: {e}")

### Missing geometry info

In [None]:
df = pd.read_csv("data/demodata_geolife.csv", delimiter=";")

try:
    mpd.TrajectoryCollection(df, "trajectory_id", t="t")
except ValueError as e:
    print(f"ValueError: {e}")
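
Both situations can be spotted before a trajectory is constructed. The helper below is a minimal sketch using pandas only; `find_input_problems` is a hypothetical name, it is not part of MovingPandas, and it does not reproduce the library's own validation or error messages. It also assumes that geometries, if present, live in a column called `geometry`.

```python
import pandas as pd


def find_input_problems(df, t=None, x=None, y=None):
    """Hypothetical pre-check: list missing time or geometry information."""
    problems = []

    # Time information: either a DatetimeIndex or a named time column.
    has_time_index = isinstance(df.index, pd.DatetimeIndex)
    has_time_column = t is not None and t in df.columns
    if not (has_time_index or has_time_column):
        problems.append("missing datetime info")

    # Geometry information: either a geometry column (assumed name) or
    # a pair of named coordinate columns.
    has_geometry = "geometry" in df.columns
    has_xy = (
        x is not None and y is not None and x in df.columns and y in df.columns
    )
    if not (has_geometry or has_xy):
        problems.append("missing geometry info")

    return problems


# Made-up sample resembling a point table read from a CSV file.
sample = pd.DataFrame(
    {
        "trajectory_id": [1, 1],
        "t": ["2019-01-01 12:00:00", "2019-01-01 12:01:00"],
        "X": [116.3, 116.4],
        "Y": [39.9, 40.0],
    }
)

print(find_input_problems(sample))
print(find_input_problems(sample, t="t"))
print(find_input_problems(sample, t="t", x="X", y="Y"))
```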

## Continue exploring MovingPandas

1. [Getting started](1-getting-started.ipynb)
1. [Handling trajectory data files (reading & writing)](2-reading-data-from-files.ipynb)
1. [TrajectoryCollection aggregation (flow maps)](3-generalization-and-aggregation.ipynb)
1. [Stop detection](4-stop-detection.ipynb)
1. [Working with local coordinates](5-local-coordinates.ipynb)
1. [Computing trajectory metrics](6-trajectory-metrics.ipynb)
1. [Multithreading](7-multithreading.ipynb)
1. [OGC Moving Features](8-ogc-moving-features.ipynb)