# LT Toolbox Tutorial

Welcome to the Lagrangian Trajectories Toolbox tutorial! 

The LT Toolbox is a Python library dedicated to the post-processing, visualisation and analysis of Lagrangian water parcel trajectories. The toolbox uses [xarray](http://xarray.pydata.org/en/stable/#) multidimensional data arrays to store the attribute variables associated with trajectories (e.g. latitude, longitude and in-situ temperature).

In this tutorial, we will show how to:

+ **Store** the output of an example simulation of the [TRACMASS](https://tracmass.readthedocs.io/en/latest/) Lagrangian trajectory code in a trajectories object.

+ **Add** new variables, such as particle IDs and seeding levels, to your dataset.

+ **Filter** trajectories using any attribute variable contained in your dataset.

+ **Get** existing features, including trajectory start/end times, start/end locations and durations.

+ **Compute** metrics, such as distance travelled, particle displacements and the Lagrangian velocities from trajectories.

+ **Plot** trajectory data in the form of time series, temperature-salinity diagrams and more.

+ **Map** trajectories, properties and probability distributions onto the Earth's surface using [Cartopy](https://scitools.org.uk/cartopy/docs/latest/).

## Getting Started

Let us begin by importing the packages we need to get started with the LT Toolbox.

**Note**: Since lt_toolbox is still undergoing unit testing, the package is not yet available on PyPI, so we use a local development version.

In [None]:
import xarray as xr
import numpy as np

# Navigating to the local development version of lt_toolbox.
%cd /Users/ollietooth/Desktop/D.Phil./lt_toolbox/

from lt_toolbox.trajectories import trajectories
from lt_toolbox.compute_utils import haversine_dist

### Storing Trajectory Data

To explore the functionality of the LT Toolbox, we will use output from the example NEMO ORCA1 (3D) simulation contained in the TRACMASS [documentation](https://tracmass.readthedocs.io/en/latest/Examples.html). 

The simulation uses monthly mean velocity fields (24 months looped continuously to generate a 200-year simulation) and releases particles southward from \~68 N during each of the first 24 months (the seeding levels). Particles are terminated on reaching the equator or on re-encountering the seeding plane (\~68 N). The maximum lifetime of a trajectory in the simulation is 200 years.

TRACMASS outputs trajectories in .csv files, so we first used the export_tracmass_to_nc.py script to reformat the output data into a .nc file conforming to the standard NCEI trajectory [template](https://www.nodc.noaa.gov/data/formats/netcdf/v2.0/trajectoryIncomplete.cdl).

Below we load the resulting .nc file as an xarray Dataset and then create a trajectories object, traj.

In [None]:
# Navigate to the directory where the ORCA1 output data is stored.
%cd /Users/ollietooth/Desktop/D.Phil./lt_toolbox/lt_toolbox/

# Open the output .nc file as an xarray Dataset.
dataset = xr.open_dataset('ORCA1-N406_TRACMASS_complete.nc')

# Create a trajectories object from the Dataset.
traj = trajectories(dataset)

**What is a trajectories object?**

The trajectories object, traj, stores the original xarray Dataset, which is accessible through the traj.data attribute.

For convenience, each attribute variable (a data variable stored in our Dataset) can also be accessed directly with traj.{variable}.

The true value of a trajectories object lies in its built-in methods, designed specifically for the post-processing, visualisation and analysis of Lagrangian water parcel trajectories.
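
The attribute-style access described above can be illustrated with a small stand-in class. This is only a sketch of the general pattern, not the lt_toolbox implementation: the class name `TrajectoriesSketch` is hypothetical, and a plain dictionary plays the role of the xarray Dataset.

```python
class TrajectoriesSketch:
    """Hypothetical stand-in for a trajectories object (not lt_toolbox code)."""

    def __init__(self, data):
        # `data` maps variable names to values, playing the role of the Dataset.
        self.data = data

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so `self.data`
        # is found first and variable names are looked up second.
        try:
            return self.__dict__["data"][name]
        except KeyError:
            raise AttributeError(name) from None


sketch = TrajectoriesSketch({"lat": [68.0, 67.5], "temp": [2.1, 2.3]})
print(sketch.lat)   # the same object as sketch.data["lat"]
```

Forwarding unknown attribute names to the underlying container keeps the original data reachable in one place while still allowing the shorter `sketch.lat` form.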

### Exploring our trajectories object

By accessing the .data attribute of our trajectories object above, we can see that attribute variables are formatted with dimensions **traj** (trajectory) and **obs** (observation - represents one time-level).

**Note**: Since trajectories are of differing lengths, missing **obs** values for a given **traj** are filled as NaNs or NaTs (time).
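
To make the padding concrete, here is a minimal NumPy sketch that arranges row-per-observation records into a (traj, obs) array and fills the missing observations with NaN. The records and variable names are invented for illustration, and the export script used in this tutorial may work differently.

```python
import numpy as np

# Flat, row-per-observation records: (trajectory id, latitude).
# The values are made up for illustration.
rows = [(1, 68.0), (1, 67.5), (1, 67.1), (2, 68.0), (2, 67.8), (3, 68.0)]

ids = sorted({traj_id for traj_id, _ in rows})
lengths = {i: sum(1 for traj_id, _ in rows if traj_id == i) for i in ids}
n_obs = max(lengths.values())

# Start from an all-NaN (traj, obs) array and fill each row from the left.
lat = np.full((len(ids), n_obs), np.nan)
for row_index, i in enumerate(ids):
    values = [value for traj_id, value in rows if traj_id == i]
    lat[row_index, :len(values)] = values

print(lat)
```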

In [None]:
# To return details of our original Dataset.
print(traj.data)

# To access the trajectory attribute variable.
print(traj.trajectory)

# Using datetime64 format for time instead of timedelta64.
# Start date of simulation is 2000-01-01
traj = traj.use_datetime('2000-01-01')
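
The conversion from elapsed time to calendar dates can be illustrated with plain NumPy. This sketch assumes time is stored as the elapsed time since the start of the simulation; the values are invented and the code does not reproduce the internals of the toolbox.

```python
import numpy as np

# Elapsed time since the start of the run, one value per observation.
# NaT marks a missing observation, as in the padded (traj, obs) arrays.
days = [0, 31, 60]
elapsed = np.array([np.timedelta64(d, "D") for d in days]
                   + [np.timedelta64("NaT", "D")])

# Adding a start date turns every timedelta64 into a datetime64.
start = np.datetime64("2000-01-01")
time = start + elapsed

print(time)
```

Missing observations stay missing: adding a date to NaT yields NaT.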

### Adding new attribute variables to our trajectories object

To add a new attribute variable to a trajectories object, and append it to the original Dataset, use the **.add_variable()** method.

Two attribute variables that users commonly wish to add to their trajectories object, a unique trajectory ID and the seeding level in which each particle was released, are provided as separate methods: **.add_id()** and **.add_seed()**.

In [None]:
# Suppose we would like to add the id and seeding level of all of our
# particles for future analysis. 

# We can combine methods on a single line to return both.
traj = traj.add_id().add_seed()

# Let's look at our new attribute variables.
print(traj.id)
print(traj.seed_level)

In [None]:
# Consider now if we wanted to create a new attribute variable, temp_K,
# to store the in-situ temperature in Kelvin. 

# Using .values to access the numpy array storing the values of temp.
temp_k = traj.temp.values + 273.15

# Dictionary of attributes of our new variable.
attrs = {'long_name': 'in-situ temperature in Kelvin',
         'standard_name': 'temp_K',
         'units': 'Kelvin'
        }

# Add temp_k as a new attribute variable to our trajectories object.
traj = traj.add_variable(temp_k, attrs)

# Let's see if temp_K has been added!
print(traj.temp_K)

### Filtering trajectories using an attribute variable

Filtering our trajectories to retain only those complete trajectories for which a specific criterion is met is integral to Lagrangian analysis.

The LT Toolbox includes two fully vectorised methods, **.filter_between()** and **.filter_equal()**, which allow users to filter on any attribute variable contained in their trajectories object.

In [None]:
# Filtering all our trajectories to include only those released in 
# seeding level 1 and storing as a new trajectories object, traj_seed1.
traj_seed1 = traj.filter_equal('seed_level', 1, drop=False)

# Let's look at the data in traj_seed1 - only 864 particles were
# released in seeding level 1.
print(traj_seed1.data)

In [None]:
# Filtering all our trajectories to include only those which travel 
# between 0 N - 25 N and storing as a new trajectories object, 
# traj_sublat.
traj_sublat = traj.filter_between('lat', 0, 25, drop=False)

# Let's look at the data in traj_sublat - only 560 particles were
# found between 0-25 N at any time during their trajectories.
print(traj_sublat.data)
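
The whole-trajectory filtering used in the cells above can be sketched with a vectorised NumPy mask. The helper `keep_between` and the latitude values are hypothetical, and the sketch assumes that a trajectory is kept when at least one of its observations falls within the range.

```python
import numpy as np

# Latitude for 3 trajectories x 4 observations; NaN pads the shorter ones.
lat = np.array([
    [68.0, 50.0, 30.0, 10.0],     # reaches the 0-25 N band
    [68.0, 66.0, 67.0, np.nan],   # never leaves the subpolar region
    [68.0, 40.0, 24.0, np.nan],   # reaches the 0-25 N band
])


def keep_between(values, vmin, vmax):
    """Return a boolean mask over trajectories with any obs in [vmin, vmax]."""
    # Comparisons with NaN are False, so padded observations never match.
    in_range = (values >= vmin) & (values <= vmax)
    return in_range.any(axis=1)


mask = keep_between(lat, 0, 25)
lat_sub = lat[mask]   # complete trajectories are retained
print(mask)
```

Because the mask selects whole rows, every observation of a matching trajectory is kept, including those outside the range.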

### Filtering trajectories by time - A Special Case

Note that when the filter methods are called with a time attribute variable, only the observations (**obs**) meeting the specified criteria are returned, rather than complete trajectories.

In [None]:
# Defining the minimum and maximum time levels to return obs for.
tmin = np.datetime64('2000-01-31')
tmax = np.datetime64('2000-03-01')

# Filtering between tmin and tmax (both values are inclusive) and
# storing the result as a new trajectories object, traj_subtime.
traj_subtime = traj.filter_between('time', tmin, tmax, drop=False)

# Let's look at the data in traj_subtime - only 4 (116) obs are 
# included as expected.
print(traj_subtime.data.time)

In [None]:
# Defining the time level to return obs for.
tval = np.datetime64('2000-03-01')

# Filtering tval and storing as a new trajectories object,
# traj_subtime.
traj_subtime = traj.filter_equal('time', tval, drop=False)

# Let's look at the data in traj_subtime - only 1 (119) obs is 
# included as expected.
print(traj_subtime.data.time)
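
The difference from whole-trajectory filtering can be sketched as a mask over the **obs** dimension instead of the **traj** dimension. For simplicity, this sketch assumes that all trajectories share a single time axis, which may not hold for real output; the dates and temperatures are invented.

```python
import numpy as np

# One shared time axis for the observations (an assumption of this sketch).
time = np.array(["2000-01-31", "2000-03-01", "2000-03-31", "2000-04-30"],
                dtype="datetime64[D]")
temp = np.array([
    [2.0, 2.5, 3.0, 3.5],
    [1.0, 1.5, np.nan, np.nan],
])

tmin = np.datetime64("2000-01-31")
tmax = np.datetime64("2000-03-01")

# Select columns (obs) rather than rows (traj); both bounds are inclusive.
obs_mask = (time >= tmin) & (time <= tmax)
temp_sub = temp[:, obs_mask]

print(temp_sub)
```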

### Getting existing features from a trajectories object

The LT Toolbox includes a range of .get_ methods to extract important features from existing attribute variables in a trajectories object. 

In [None]:
# Get the times and locations when particles are released.
traj = traj.get_start_time().get_start_loc()

# Get the times and locations when particles are terminated.
traj = traj.get_end_time().get_end_loc()


In [None]:
# Get the maximum value of the temp variable for each trajectory. 
traj = traj.get_max('temp')

# Get the minimum value of the temp variable for each trajectory. 
traj = traj.get_min('temp')

# Get the value of the temp variable for each trajectory on 2000-01-01. 
traj = traj.get_value('temp', '2000-01-01')


In [None]:
# Get the duration of each trajectory, t_total. 
traj = traj.get_duration()

# Let's look at the data in traj.
print(traj.data)
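
The per-trajectory features extracted above reduce to NaN-aware reductions along the **obs** dimension. The following NumPy sketch uses invented values, with time expressed in days since the start of the run, and is not the toolbox's own code.

```python
import numpy as np

# Time in days since the start of the run and temperature, both (traj, obs).
time = np.array([[0.0, 30.0, 60.0], [30.0, 60.0, np.nan]])
temp = np.array([[2.0, 3.5, 3.0], [1.0, 1.5, np.nan]])

# Start and end times ignore the NaN padding of shorter trajectories.
t_start = np.nanmin(time, axis=1)
t_end = np.nanmax(time, axis=1)
t_total = t_end - t_start

# Extremes of a variable along each trajectory.
temp_max = np.nanmax(temp, axis=1)
temp_min = np.nanmin(temp, axis=1)

print(t_total, temp_max, temp_min)
```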

### Computing diagnostic metrics for trajectories

The toolbox can also compute diagnostic metrics along trajectories.

In particular, .compute_ methods are available for the distance travelled by particles along their trajectories, particle displacements (zonal/meridional/vertical) and Lagrangian velocities (zonal/meridional/vertical).
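
As an illustration of the distance-travelled metric, the great-circle distance between consecutive positions can be summed along a trajectory with the haversine formula. The functions below are a standalone sketch with hypothetical names and an assumed mean Earth radius; they are not the `haversine_dist` function imported from lt_toolbox earlier.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_travelled(lats, lons):
    """Sum the distances between consecutive positions along one trajectory."""
    points = list(zip(lats, lons))
    return sum(haversine(p[0], p[1], q[0], q[1])
               for p, q in zip(points[:-1], points[1:]))


print(distance_travelled([68.0, 67.0, 66.0], [-30.0, -30.0, -30.0]))
```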

Below we show how to combine **filter**, **compute** and **plot** methods to efficiently generate visualisations of output data.

In [None]:
# Creating a time series plot of the meridional displacement of the
# first 10 particles in our ORCA1 output data.
traj.filter_between('traj', 0, 9).compute_dy().plot_timeseries('dy')

In [None]:
# Creating a time series plot of the meridional velocity of the
# first 10 particles in our ORCA1 output data.
traj.filter_between('traj', 0, 9).compute_v().plot_timeseries('v')
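
The meridional displacement and velocity plotted above could be computed along the following lines. This sketch assumes a constant output interval and a fixed length for one degree of latitude; both constants and all values are illustrative, and the toolbox may define these quantities differently.

```python
import numpy as np

METRES_PER_DEG_LAT = 111_195.0   # approx. length of 1 degree of latitude (assumed)
DT_SECONDS = 30 * 86_400.0       # assumed constant output interval of 30 days

# Latitude as a (traj, obs) array; NaN pads the shorter trajectory.
lat = np.array([[68.0, 67.0, 65.5], [68.0, 67.5, np.nan]])

# Displacement between consecutive observations; NaN padding propagates.
dy = np.diff(lat, axis=1) * METRES_PER_DEG_LAT

# Lagrangian meridional velocity in metres per second (negative = southward).
v = dy / DT_SECONDS

print(dy)
print(v)
```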

### Further plotting of seawater properties

Another useful plot for examining the water mass properties of a given trajectory is the temperature-salinity (t-s) diagram.

We can produce a t-s diagram with the LT Toolbox using the **.plot_ts_diagram()** method.

In [None]:
# Creating a t-s plot for the particles released between 2000-4000m
# in seeding level 1 and colouring by the particle longitude of 
# release.
traj.filter_equal('seed_level', 1).filter_between('z', -4000, -2000).plot_ts_diagram(col_variable='lon')

In [None]:
# Plotting temperature of the particles released at the
# first seeding level on 2000-03-01 (3rd time level).
traj.plot_variable('temp', 'xz', 1, '2000-03-01')

### Mapping trajectories and probability distributions with Cartopy

The LT Toolbox uses the highly adaptable Cartopy geospatial visualisation package to map trajectories and their properties onto the Earth's surface.

Below we show how to use the **.map_trajectories()**, **.map_probability()** and **.map_property()** methods when analysing trajectory output data.

In [None]:
# Creating a map of the trajectories released in seeding level 2.
traj.filter_equal('seed_level', 2).map_trajectories()

In [None]:
# Creating a map of the trajectories of the first 20 particles released,
# coloured by the in-situ temperature.
traj.filter_between('traj', 0, 19).map_trajectories(col_variable='temp')

In [None]:
# Creating a map of the binned probability of particle pathways.
traj.map_probability(bin_res=1, prob_type='traj', cmap='viridis')

In [None]:
# Creating a map of the binned probability of particle positions.
traj.map_probability(bin_res=1, prob_type='pos', cmap='viridis')
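
One plausible reading of the two probability types is sketched below with NumPy histograms: the position probability is taken as the share of all observations falling in each bin, and the pathway probability as the share of trajectories visiting each bin at least once. These definitions are assumptions made for illustration and may differ from those used by **.map_probability()**; the positions are invented.

```python
import numpy as np

# Positions as (traj, obs) arrays; values are illustrative only.
lon = np.array([[-30.2, -30.7, -31.4], [-30.5, -29.4, np.nan]])
lat = np.array([[60.3, 60.8, 59.5], [60.1, 59.2, np.nan]])

bin_res = 1.0
lon_edges = np.arange(-32.0, -29.0 + bin_res, bin_res)   # -32, -31, -30, -29
lat_edges = np.arange(59.0, 61.0 + bin_res, bin_res)     # 59, 60, 61
bins = [lon_edges, lat_edges]

valid = ~np.isnan(lon) & ~np.isnan(lat)

# Position probability: share of all valid observations in each bin.
counts, _, _ = np.histogram2d(lon[valid], lat[valid], bins=bins)
prob_pos = counts / counts.sum()

# Pathway probability: share of trajectories visiting each bin at least once.
visited = np.zeros_like(counts)
for i in range(lon.shape[0]):
    ok = valid[i]
    c, _, _ = np.histogram2d(lon[i, ok], lat[i, ok], bins=bins)
    visited += c > 0
prob_traj = visited / lon.shape[0]

print(prob_pos)
print(prob_traj)
```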

In [None]:
# Creating a map of the binned mean temperature of particles.
traj.map_property(bin_res=1, variable='temp', statistic='mean')
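
A binned mean of this kind can be sketched with two NumPy histograms: one weighted by the property and one counting the observations in each bin. The positions, temperatures and bin edges are invented, and the sketch does not reproduce how **.map_property()** is implemented.

```python
import numpy as np

# Flattened particle positions and temperatures; values are illustrative.
lon = np.array([-30.2, -30.7, -31.4, -30.5])
lat = np.array([60.3, 60.8, 59.5, 60.1])
temp = np.array([3.0, 4.0, 2.0, 5.0])

lon_edges = np.array([-32.0, -31.0, -30.0])
lat_edges = np.array([59.0, 60.0, 61.0])
bins = [lon_edges, lat_edges]

# Number of observations and sum of temperature in each bin.
counts, _, _ = np.histogram2d(lon, lat, bins=bins)
sums, _, _ = np.histogram2d(lon, lat, bins=bins, weights=temp)

# Empty bins are left as NaN rather than dividing by zero.
mean_temp = np.full(counts.shape, np.nan)
np.divide(sums, counts, out=mean_temp, where=counts > 0)

print(mean_temp)
```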