<center>
<table>
  <tr>
    <td><img src="https://portal.nccs.nasa.gov/datashare/astg/training/python/logos/nasa-logo.svg" width="100"/> </td>
     <td><img src="https://portal.nccs.nasa.gov/datashare/astg/training/python/logos/ASTG_logo.png?raw=true" width="80"/> </td>
     <td> <img src="https://www.nccs.nasa.gov/sites/default/files/NCCS_Logo_0.png" width="130"/> </td>
    </tr>
</table>
</center>

        
<center>
<h2><font color= "blue" size="+3">PyCon 2024 Tutorial</font></h2>
</center>

---

<center>
    <h3>Python Workflows to Extract and Plot Satellite Data Products along Tracks</h3>
    <h2><font color="red" size="+3">Background</font></h2>
</center>

----
[Jules Kouatchou](mailto:Jules.Kouatchou@nasa.gov) • [Bruce Van Aartsen](mailto:bruce.vanaartsen@nasa.gov)
-
----

## <font color="red"> Objectives</font>

- We want to show the steps to take after collecting timeseries data (locations and fields) of a moving object.
- We use Pandas, Shapely, GeoPandas and MovingPandas to process the data, perform analyses and create visualizations.

In particular, we give a quick introduction to Pandas, GeoPandas and MovingPandas, the main packages we will use to track the movement of an object.

## <font color="red">Movement of planar objects</font>

- In this tutorial, we are interested in tracking the movement of objects in a two-dimensional space or plane.
   - We assume that an object is considered to be a single point.
- Over a time period, we want to collect data from the movement, where each data point contains:
   - The date/time
   - The location (latitude and longitude)
   - (optionally) Measurements at the location
- With the timeseries dataset, we can:
   - Compute parameters such as distance, speed, etc.
   - Compare measurements against model simulations.
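
As a rough illustration of the distance and speed computation mentioned above, here is a minimal pure-Python sketch. It assumes a spherical Earth (haversine formula, mean radius 6371 km); the two fixes and the helper name `haversine_km` are made up for this example and are not part of the tutorial data.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Two made-up fixes (time, latitude, longitude), six hours apart.
fix_a = (datetime(2005, 8, 28, 0), 25.0, -86.0)
fix_b = (datetime(2005, 8, 28, 6), 26.0, -86.0)

distance_km = haversine_km(fix_a[1], fix_a[2], fix_b[1], fix_b[2])
elapsed_h = (fix_b[0] - fix_a[0]).total_seconds() / 3600
speed_kmh = distance_km / elapsed_h  # average speed between the two fixes
print(f"distance: {distance_km:.1f} km, speed: {speed_kmh:.1f} km/h")
```

MovingPandas provides its own functions for these quantities; the sketch only shows what kind of arithmetic is involved.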
 
### <font color="blue">Examples</font>
- The eye of a hurricane
- Track the movement of a car
- Track the movement of a ship
- Movement of a total solar eclipse
- Movement of (basketball, soccer, football, etc.) players
- The International Space Station (ISS)
- Movement of a satellite

---

## <font color="red">Required Packages</font>

- __Matplotlib__: For basic plots.
- __Pandas__: For manipulation and exploratory data analysis of tabular data.
- __Shapely__: For manipulation and analysis of planar geometric objects.
- __GeoPandas__: Combines the capabilities of Pandas and Shapely for geospatial operations.
- __MovingPandas__: For handling the movement of geospatial objects.

----

In [None]:
import warnings
warnings.filterwarnings("ignore")

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
import numpy as np
import pandas as pd
import geopandas as gpd

In [None]:
from shapely import geometry as shpgeom
from shapely import wkt as shpwkt

In [None]:
import movingpandas as mpd

In [None]:
import holoviews as hv

In [None]:
import hvplot.pandas 

In [None]:
plot_defaults = {'linewidth':5, 'capstyle':'round', 'figsize':(9,3), 'legend':True}
hv.opts.defaults(hv.opts.Overlay(active_tools=['wheel_zoom'], 
                              frame_width=500, frame_height=400))
hvplot_defaults = {'tiles':None, 'cmap':'Viridis', 'colorbar':True}

In [None]:
mpd.show_versions()

![fig_pd](https://pandas.pydata.org/docs/_static/pandas.svg)

- Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with two-dimensional labeled data both easy and intuitive.
- It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python.

### <font color="blue">Example</font>

Consider the [forecast track of Hurricane Katrina (2005)](https://www.esl.lsu.edu/hurricanes/2005/KATRINA/), which consists of the following:

- The dates/times
- The locations (in terms of latitudes and longitudes)
- Three fields at each location
   - Pressure (mbar)
   - Max Winds (kts)
   - Max Gust (kts)

We can use the `read_csv` function to read the remote file:

In [None]:
url = "https://www.esl.lsu.edu/hurricanes/447/csv"

In [None]:
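import datetime

# Parse strings of the form "yymmdd HH": two-digit year, month, day, then the hour.
dateparse = lambda x: datetime.datetime.strptime(x, '%y%m%d %H')
# Note: `date_parser` is deprecated since pandas 2.0; this call may warn on newer versions.
df = pd.read_csv(url, 
                 skiprows=[1],            # skip the second line of the file
                 parse_dates={'1': [0]},  # parse column 0 into a new column named '1'
                 date_parser=dateparse,
                )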

df
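
Since `date_parser` is deprecated in recent pandas releases, one possible alternative is to read the time column as text and convert it afterwards with `pd.to_datetime` and an explicit format. The sketch below uses a small in-memory CSV; its rows and column names (`time`, `lat`, `lon`) are made up and may not match the layout of the remote file.

```python
import io

import pandas as pd

# Made-up rows in an assumed layout; the real file's columns may differ.
csv_text = """time,lat,lon
050828 00,25.0,-86.0
050828 06,26.0,-86.5
050828 12,27.0,-87.0
"""

# Read the time column as text so that leading zeros are preserved.
sample = pd.read_csv(io.StringIO(csv_text), dtype={"time": str})

# Two-digit year, month, day, a space, then the hour.
sample["time"] = pd.to_datetime(sample["time"], format="%y%m%d %H")
print(sample.dtypes)
```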

We can get basic information on the columns:

In [None]:
df.info()

We can rename the columns:

In [None]:
df.columns = ['t', 'latitude', 'longitude', 'pressure', 'max_winds', 'max_gust']
df

We want the values of the three fields to be floating point numbers:

In [None]:
df = df.astype({'pressure':'float64', 
                'max_winds':'float64', 
                'max_gust': 'float64'})
df

In [None]:
df.info()

#### Plot the locations

In [None]:
df.plot(kind="scatter", x='longitude', y='latitude');

![fig_gpd](https://geopandas.org/en/stable/_static/geopandas_logo_web.svg)

- A Python library for working with tabular data (as in Pandas) in which every row is associated with a geometry, for example data read from shapefiles.
- Designed to primarily work with vector data.
- Extends the capabilities of Pandas to enable spatial operations on geometric types.
  - Geometric operations are performed by Shapely.
- Includes new data types such as `GeoDataFrame` and `GeoSeries`, which are subclasses of the Pandas `DataFrame` and `Series` and enable efficient vector data processing in Python.

### <font color="blue">GeoDataFrame</font>
- A tabular data structure that contains a "geometry" column.
- The geometry column defines a point, line, or polygon associated with the rest of the columns. This column is a collection of `Shapely` objects. 
- The Coordinate Reference System (CRS) of the geometry column tells us where a point, line, or polygon lies on the Earth's surface. GeoPandas uses it to map a geometry onto the Earth's surface.
- The “geometry” column – no matter its name – can be accessed through the geometry attribute (`gdf.geometry`), and the name of the `geometry` column can be found by typing `gdf.geometry.name`.
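
The points above can be sketched on a tiny example. The positions are made up, and the choice of `EPSG:4326` (WGS 84 longitude/latitude) is an assumption that fits coordinates given in degrees.

```python
import geopandas as gpd
import pandas as pd

# Made-up positions, not the Katrina data.
df_demo = pd.DataFrame({"latitude": [25.0, 26.0], "longitude": [-86.0, -86.5]})

gdf_demo = gpd.GeoDataFrame(
    df_demo,
    # Points are built as (x, y) = (longitude, latitude).
    geometry=gpd.points_from_xy(df_demo["longitude"], df_demo["latitude"]),
    crs="EPSG:4326",  # assumed: WGS 84 longitude/latitude in degrees
)

print(gdf_demo.geometry.name)  # name of the active geometry column
print(gdf_demo.crs)            # CRS attached to the geometry column
```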


![fig_frame](https://geopandas.org/en/stable/_images/dataframe.svg)
Image Source: [GeoPandas](https://geopandas.org/en/stable/getting_started/introduction.html)

#### Add a `'geometry'` column to the Pandas DataFrame

Convert the positions (latitude and longitude) into Shapely `Point` objects, with longitude as x and latitude as y:

In [None]:
df['geometry'] = [shpgeom.Point(xy) for xy in zip(df['longitude'], df['latitude'])] 
df

#### Create a GeoDataFrame from the Pandas DataFrame

In [None]:
gdf = gpd.GeoDataFrame(df, geometry="geometry") 
gdf

#### Basic visualization

In [None]:
gdf.plot(figsize=(7,10));

In [None]:
gdf.hvplot(tiles='EsriTerrain', coastline=True, 
           hover_cols=["t", "pressure", "max_winds", "max_gust"])

#### Introduce a buffer along the track

In [None]:
track = shpgeom.LineString( [[a.x, a.y] for a in gdf.geometry.values] )
track

In [None]:
track_buffer = track.buffer(2)
track_buffer
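
The buffer distance is expressed in the units of the coordinates, so for longitude/latitude data `track.buffer(2)` means 2 degrees. The following sketch, on a made-up straight line, checks which points fall inside a buffer and compares its area with the expected shape (a rectangle plus two rounded end caps).

```python
import math

from shapely.geometry import LineString, Point

line = LineString([(0, 0), (10, 0)])  # made-up track, 10 units long
corridor = line.buffer(2)             # all points within 2 units of the line

inside = corridor.contains(Point(5, 1.5))   # 1.5 units away from the line
outside = corridor.contains(Point(5, 2.5))  # 2.5 units away from the line

# A 10 x 4 rectangle plus two half-discs of radius 2 (round caps).
expected_area = 10 * 4 + math.pi * 2**2
print(inside, outside, round(corridor.area, 1), round(expected_area, 1))
```

The buffer area is slightly smaller than the exact value because Shapely approximates the round caps with straight segments.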

In [None]:
pd.DataFrame({"geometry": 1}, index=[0])

In [None]:
df_track = pd.DataFrame({"geometry": track}, index=[0])
gdf_track = gpd.GeoDataFrame(df_track, geometry="geometry")
gdf_track

In [None]:
gdf_track_buffer = gpd.GeoDataFrame(geometry=gdf_track.buffer(0.5))

In [None]:
fig, ax = plt.subplots()
gdf_track_buffer.plot(ax=ax, color="pink", )
gdf_track.plot(ax=ax)

In [None]:
import cartopy
import cartopy.crs as ccrs

pc = ccrs.PlateCarree()
fig = plt.figure(figsize=(10, 9))
ax = fig.add_subplot(111, projection=pc)  # alternative: ccrs.LambertConformal()
ax.patch.set_visible(False)
ax.set_extent([-125, -67.5, 19, 50], pc)

ax.add_feature(cartopy.feature.LAND, facecolor='w')
ax.add_feature(cartopy.feature.OCEAN, facecolor='w')
ax.add_feature(cartopy.feature.STATES)

gdf_track_buffer.plot(ax=ax, color="pink")
gdf_track.plot(ax=ax)
ax.set_title('US States which intersect the track of Hurricane Katrina (2005)');
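
The map shows the buffer on top of the state boundaries. To determine programmatically which regions a buffered track touches, one option is Shapely's `intersects` predicate. The sketch below uses a made-up path and two hypothetical rectangular regions (`region_a`, `region_b`) in place of real state polygons.

```python
from shapely.geometry import LineString, box

track_demo = LineString([(-88, 24), (-89.5, 28), (-89, 31)])  # made-up path
corridor_demo = track_demo.buffer(0.5)                        # 0.5 degree buffer

# Hypothetical regions given as (min lon, min lat, max lon, max lat).
regions = {
    "region_a": box(-91, 29, -88, 33),
    "region_b": box(-85, 25, -80, 30),
}

# Keep the names of the regions that share at least one point with the buffer.
hit = [name for name, shape in regions.items() if shape.intersects(corridor_demo)]
print(hit)
```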

---

![fig_logo](https://movingpandas.github.io/movingpandas/assets/img/logo-wide.svg)

- A Python library (based on Pandas, GeoPandas and HoloViz) for handling the movement of geospatial objects.
- The key features of MovingPandas for movement data exploration are related to data import, visualization, and spatiotemporal analysis.
- Provides trajectory data structures and functions (such as length, duration, and speed computations) for movement data exploration and analysis.
- A trajectory is:
   - A time-ordered series of geometries. The geometries and associated attributes are stored in a GeoPandas GeoDataFrame.
   - __Can be seen as a sequence of points that specify the position of a moving object in space and time__.
   - A segment is a part of the trajectory that contains a list of episodes. 
       - Each episode has a starting and ending timestamp, a segmentation criterion (annotation type), and an episode annotation. 
       - For instance, an annotation type can be the “weather conditions”, and an episode annotation can be “a storm”, “heavy rain”, “extremely high waves”, etc.
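
To make the notions of "time-ordered" and "segment" concrete, here is a pandas-only sketch with made-up observations. It does not use the MovingPandas API; it only illustrates the underlying idea of ordering records by time and cutting out a part of the track.

```python
import pandas as pd

# Made-up observations, deliberately out of order.
obs = pd.DataFrame(
    {
        "t": pd.to_datetime(["2005-08-28 12:00", "2005-08-28 00:00",
                             "2005-08-28 06:00", "2005-08-28 18:00"]),
        "latitude": [27.0, 25.0, 26.0, 28.5],
        "longitude": [-87.0, -86.0, -86.5, -88.0],
    }
)

# A trajectory is time-ordered: index the records by time and sort them.
track_df = obs.set_index("t").sort_index()
assert track_df.index.is_monotonic_increasing

# A segment is a part of the trajectory, here the records in a time window.
segment = track_df.loc["2005-08-28 06:00":"2005-08-28 12:00"]
print(segment)
```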




#### Create a MovingPandas trajectory

In [None]:
mdf_traj = mpd.Trajectory(df, 
                          traj_id=1, 
                          x = "longitude", y="latitude", t="t")

mdf_traj

In [None]:
mdf_traj.df

#### Determine the start date, end date and duration of the trajectory

In [None]:
mdf_traj.get_start_time()

In [None]:
mdf_traj.get_end_time()

In [None]:
mdf_traj.get_duration()

#### Compute the sampling interval (median time difference between records)

In [None]:
mdf_traj.get_sampling_interval()
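
The "median time difference between records" can be reproduced by hand with pandas. The timestamps below are made up, with one irregular gap, to show that the median is not affected by a single outlier.

```python
import pandas as pd

times = pd.Series(pd.to_datetime([
    "2005-08-28 00:00", "2005-08-28 06:00", "2005-08-28 12:00",
    "2005-08-28 15:00", "2005-08-28 21:00",
]))

gaps = times.diff().dropna()       # time difference between consecutive records
sampling_interval = gaps.median()  # median of the gaps: 6 h, 6 h, 3 h, 6 h
print(sampling_interval)
```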

#### Compute the length of the trajectory

In [None]:
str(mdf_traj.get_start_location())

In [None]:
str(mdf_traj.get_end_location())

In [None]:
mdf_traj.get_length(units="mi")

#### Add the `distance` and the `speed` columns

- MovingPandas has built-in functions to compute the distance and the speed between consecutive records.

In [None]:
mdf_traj.add_distance(overwrite=True, units="mi")
mdf_traj.df

In [None]:
mdf_traj.add_speed(overwrite=True, 
              name="speed", units=("mi", "h"))

mdf_traj.df
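
Conceptually, a speed column is the distance travelled since the previous record divided by the elapsed time. The pandas sketch below shows that relation on made-up distances in miles. It is not the MovingPandas implementation, which may for instance treat the first row differently.

```python
import pandas as pd

legs = pd.DataFrame(
    {"distance": [0.0, 60.0, 90.0]},  # made-up miles travelled since the previous record
    index=pd.to_datetime(["2005-08-28 00:00", "2005-08-28 06:00", "2005-08-28 12:00"]),
)

# Hours elapsed since the previous record (NaN for the first row).
elapsed_hours = legs.index.to_series().diff().dt.total_seconds() / 3600

legs["speed"] = legs["distance"] / elapsed_hours  # miles per hour
print(legs)
```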

#### Create 2D timeseries interactive plots

In [None]:
# Each variable now plots the column its name refers to.
wind_plot = mdf_traj.df.hvplot(x='t', y='max_winds', color="green")
gust_plot = mdf_traj.df.hvplot(x='t', y='max_gust', color="red")
pres_plot = mdf_traj.df.hvplot(x='t', y='pressure')

In [None]:
wind_plot*gust_plot

In [None]:
pres_plot

#### Create maps

In [None]:
fig, ax = plt.subplots(1, figsize=(20,10))
gdf_track_buffer.plot(ax=ax, color="pink", )
mdf_traj.plot(ax=ax);

In [None]:
fig, ax = plt.subplots(1, figsize=(20,10))
mdf_traj.plot(ax=ax, legend=True, 
              capstyle='round',
              column="pressure", 
              linewidth=5,
              cmap='jet')
gdf_track_buffer.plot(ax=ax, color="pink");

In [None]:
pres_plot = mdf_traj.hvplot(c="pressure", 
                hover_cols=["max_winds", "max_gust"],
                cmap="jet")
path_plot = gdf_track_buffer.hvplot(color="pink")

pres_plot

We can select the background image by setting the `tiles` parameter to one of the following options:

   ‘CartoDark’, ‘CartoEco’, ‘CartoLight’, ‘CartoMidnight’, 
   ‘EsriImagery’, ‘EsriNatGeo’, ‘EsriReference’, ‘EsriTerrain’,
   ‘EsriUSATopo’, ‘OSM’, ‘StamenLabels’, ‘StamenTerrain’,
   ‘StamenTerrainRetina’, ‘StamenToner’, ‘StamenTonerBackground’,
   ‘StamenWatercolor’, ‘Wikipedia’ (default)


In [None]:
mdf_traj.hvplot(c="pressure", 
                tiles="EsriImagery",
                hover_cols=["max_winds", "max_gust"],
                width=900,
                height=600,
                xlim=(-100, -60), 
                ylim=(22, 43.5),
                cmap="jet")

## <font color="red"> Data Collection and Analyses</font>

To track the movement of an object, we need to collect timeseries data:

- The date/time
- The location (latitude/longitude)
- Fields at each location.

From there, we need to:

- Use `Pandas` to create a DataFrame with "date/time", "latitude", "longitude", "field_name1", "field_name2", etc. as columns.
- Use `GeoPandas` to create a GeoDataFrame based on the `Pandas` DataFrame.
- Use `MovingPandas` to create a trajectory object.
- Perform analyses and visualization.