# Introduction to PySPEDAS

**By Jim Lewis, Berkeley Space Sciences Lab - jwl@ssl.berkeley.edu**

- PySPEDAS: https://pyspedas.readthedocs.io/
- PyTplot: https://pyspedas.readthedocs.io/en/matplotlib-backend/


A little history: 
- `tplot` started as an IDL project in 1995, by Davin Larson, and is the core of SPEDAS (and now PySPEDAS)
- in 2017, some developers on the MAVEN team created an initial Python version, using Qt as a back-end for creating figures
- in late 2021, development began on a `matplotlib` version, which is what we're using in this notebook


We'll install a specific matplotlib version to address a version conflict in the HelioCloud environment.

In [None]:
!pip install matplotlib==3.6.2

We have preloaded the data we'll be using with PySPEDAS during PyHC Summer School, to avoid overloading the various data servers we would otherwise need to contact.  We'll set the SPEDAS_DATA_DIR environment variable to the data cache directory.

In [None]:
import os
os.environ["SPEDAS_DATA_DIR"] = "/home/jovyan/scratch_space/pyspedas_data"
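
If you run this notebook outside the summer school environment, the cache path above may not exist. Here is a minimal sketch of one way to handle that, assuming you want to fall back to a local directory. The helper name `choose_data_dir` and the fallback directory `pyspedas_data` are illustrative choices, not part of PySPEDAS. Setting the variable before importing pyspedas, as this notebook does, is the safe order.

```python
import os

def choose_data_dir(preferred, fallback="pyspedas_data"):
    """Return `preferred` if it exists; otherwise create and return `fallback`.

    Hypothetical helper for illustration only (not a PySPEDAS function).
    """
    if os.path.isdir(preferred):
        return preferred
    os.makedirs(fallback, exist_ok=True)
    return os.path.abspath(fallback)

data_dir = choose_data_dir("/home/jovyan/scratch_space/pyspedas_data")
os.environ["SPEDAS_DATA_DIR"] = data_dir
print("SPEDAS_DATA_DIR =", os.environ["SPEDAS_DATA_DIR"])
```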


### Example 1: A minimal PySPEDAS example ###

We'll import the top-level pyspedas module, along with a few frequently used tools (like tplot) to avoid repetition.

In [None]:
import pyspedas
from pyspedas import tplot

We'll set a time range corresponding to the date of the event we're studying.

In [None]:
trange = ['2023-03-24', '2023-03-25']
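
A time range is just a start string and a stop string. As a rough illustration of what this interval covers, the sketch below converts both ends to Unix seconds using only the standard library, assuming the dates are interpreted as UTC. The helper `to_unix` is hypothetical; PySPEDAS has its own time conversion tools (such as `time_string`, used later in this notebook).

```python
from datetime import datetime, timezone

trange = ['2023-03-24', '2023-03-25']

def to_unix(date_str):
    """Convert a 'YYYY-MM-DD' string to Unix seconds, treating it as UTC."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return dt.timestamp()

start, stop = (to_unix(t) for t in trange)
print("Duration in hours:", (stop - start) / 3600.0)
```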

Now we'll load some THEMIS-A Fluxgate Magnetometer (FGM) data and plot it.

In [None]:
fgm_vars = pyspedas.themis.fgm(probe='a',trange=trange)
print(fgm_vars)
tplot('tha_fgs_gse')

A few things to note about this example:

- Many of the PySPEDAS load routines follow the naming convention pyspedas.mission.instrument.
- Parameters to the load routines are fairly standardized, and usually have sensible defaults. In this case, we got all the relevant variables (including several choices of time resolution and coordinate systems) from the THEMIS-A L2 FGM data set.
- Most load routines return a list of the tplot variables loaded.
- We didn't have to set any axis titles, tick mark spacing, data ranges, or legends for the plot! Many of the plot options are taken from metadata in the underlying data files; other attributes are initialized with reasonable defaults.

Tplot variables are the main data structure in PySPEDAS.  A tplot variable is essentially a container for timestamps, data arrays, and metadata (including plot options).  Each tplot variable is referred to by a string name, and it is these names (or lists of names) that are passed around between PySPEDAS tools.
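
To make the "names map to containers" idea concrete, here is a toy model in plain Python. It is only a sketch of the concept: `ToyVariable`, `toy_store`, `toy_store_data`, and `toy_get_data` are hypothetical names invented for this illustration, and the real PySPEDAS internals are more elaborate.

```python
from collections import namedtuple

# Hypothetical stand-in for a tplot variable: timestamps, data, and metadata.
ToyVariable = namedtuple("ToyVariable", ["times", "y", "metadata"])

# Hypothetical registry mapping variable names (strings) to their contents.
toy_store = {}

def toy_store_data(name, times, y, metadata=None):
    """Save a variable under `name` and return the name, not the data."""
    toy_store[name] = ToyVariable(times, y, metadata or {})
    return name

def toy_get_data(name):
    """Look up the container that belongs to `name`."""
    return toy_store[name]

name = toy_store_data(
    "toy_b_gse",
    times=[0.0, 3.0, 6.0],
    y=[[1.0, 2.0, 3.0], [1.1, 2.1, 3.1], [1.2, 2.2, 3.2]],
    metadata={"ytitle": "B (nT)"},
)
var = toy_get_data(name)
print(name, len(var.times), var.metadata["ytitle"])
```

Because tools exchange only the name, any routine can retrieve the timestamps, data, or plot options it needs without the caller passing large arrays around.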

Let's take a closer look at some of the tplot variables produced by this load routine call.

We can see a list of loaded variables with the routine pyspedas.tplot_names():

In [None]:
pyspedas.tplot_names()
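
Since variable names are ordinary strings, standard Python tools are enough to pick out a subset. The sketch below uses the standard library's `fnmatch` on an illustrative list of names; in the notebook you would apply the same idea to the list returned by the load routine (for example `fgm_vars`). The names in `example_names` are assumed for illustration and may not match the loaded list exactly.

```python
from fnmatch import fnmatch

# Illustrative names only; substitute the list returned by the load routine.
example_names = ['tha_fgs_gse', 'tha_fgs_gsm', 'tha_fgs_dsl',
                 'tha_fgl_gse', 'tha_fgh_gse']

# Keep the GSE-coordinate variables at any time resolution.
# '?' matches exactly one character (here s, l, or h).
gse_names = [n for n in example_names if fnmatch(n, 'tha_fg?_gse')]
print(gse_names)
```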

To get access to the underlying timestamp or data arrays, or the metadata dictionary, 
we can use pyspedas.get_data().

pyspedas.time_string() is useful for generating human-readable timestamps.

In [None]:
from pyspedas import get_data, time_string

# By default, get_data returns a tuple with named fields 'times', 'y', 
# and possibly additional fields for spectrograms or higher dimensional data arrays.
fgs_dat = get_data('tha_fgs_gse')
print(time_string(fgs_dat.times[0:3]))
print(fgs_dat.y[0:3])

# get_data can also return a dictionary containing the variable's metadata, plot options, etc.
fgs_md = get_data('tha_fgs_gse', metadata=True)
print(fgs_md.keys())
print(fgs_md['plot_options']['yaxis_opt'])
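
Once you have the arrays, ordinary NumPy operations apply. For a vector quantity such as the magnetic field, `y` has one row per timestamp and one column per component, so the field magnitude is a row-wise norm. The sketch below uses a small synthetic array in place of `fgs_dat.y`; the values are made up for illustration.

```python
import numpy as np

# Synthetic stand-in for fgs_dat.y: one row per timestamp, columns are Bx, By, Bz (nT).
y = np.array([[3.0, 4.0, 0.0],
              [0.0, 0.0, 5.0],
              [1.0, 2.0, 2.0]])

# Magnitude of each row vector.
b_mag = np.linalg.norm(y, axis=1)
print(b_mag)
```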

To find the supported load routines and keywords, see our documentation: https://pyspedas.readthedocs.io/

You can also see the supported options by calling `help` on the load routine you're interested in:

In [None]:
help(pyspedas.themis.fgm)

# THEMIS ESA (Electrostatic Analyzer) data

Now we'll load some THEMIS-A data from the ESA instrument.   This is a good example of a spectrogram plot.

In [None]:
esa_vars = pyspedas.themis.esa(probe='a', trange=trange)
print(esa_vars)

# PEEF = ESA fast survey electrons, PEIF = ESA fast survey ions
tplot(['tha_peef_en_eflux','tha_peef_velocity_dsl','tha_peif_en_eflux', 'tha_peif_velocity_dsl'])

Let's take a look at one of the spectrogram variables, 'tha_peif_en_eflux'.

In [None]:
import numpy as np
esa_eflux_data = get_data('tha_peif_en_eflux')
print("Timestamps:")
print(pyspedas.time_string(esa_eflux_data.times[0:3]))
print("First data value(s):")
print(esa_eflux_data.y[0,:])
print("First v value(s):")
print(esa_eflux_data.v[0,:])
print("Shape of times array:",np.shape(esa_eflux_data.times))
print("Shape of data values array:", np.shape(esa_eflux_data.y))
print("Shape of v (bin values) array:",np.shape(esa_eflux_data.v))


There are 746 timestamps.  Each data point has 32 energy bins, with the bin values (in eV) along the Y axis. 
 
The data values in each energy bin are mapped to colors (what we call the "z-axis" for a spectrogram variable).  

The tuple returned by get_data for this variable has an extra component, 'v', representing the bin values along the Y axis. Note that v also has shape 746 x 32.  The bin values are allowed to vary over time, and PySPEDAS will render them correctly as long as the metadata in the underlying data file follows the standard conventions.
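
Because `v` has the same shape as `y`, selecting "the bin nearest a given energy" has to be done per timestamp. Here is a minimal sketch with synthetic arrays of the shapes quoted above. The log-spaced bin values and the random data are assumptions made for illustration, not actual ESA values.

```python
import numpy as np

n_times, n_bins = 746, 32  # shapes quoted above

# Synthetic stand-ins for the 'times', 'v', and 'y' fields of the returned tuple.
times = np.arange(n_times, dtype=float)
bins_1d = np.logspace(np.log10(5.0), np.log10(25000.0), n_bins)  # assumed bin values (eV)
v = np.tile(bins_1d, (n_times, 1))                  # bin values repeated for each timestamp
y = np.random.default_rng(0).random((n_times, n_bins))

# For each timestamp, find the energy bin closest to 1 keV and take its data value.
idx = np.argmin(np.abs(v - 1000.0), axis=1)
flux_near_1kev = y[np.arange(n_times), idx]
print(v.shape, y.shape, flux_near_1kev.shape)
```

In this synthetic case the bins never change, so `idx` is the same at every timestamp; with real data where the bins vary over time, the per-row lookup still picks the right column.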

# Orbit data for the ERG (Arase) satellite

Let's look at a different mission, JAXA's Arase probe.   We'll load and plot the orbit data
for the time range of interest.

In [None]:
erg_orb_vars = pyspedas.erg.orb(trange=trange)
tplot('erg_orb_l2_pos_gse')

In [None]:
erg_orb_vars


# OMNIWeb Solar Wind parameters

The OMNIWeb data set includes various solar wind and interplanetary magnetic field measurements that are especially useful as inputs to models of Earth's magnetic field, and as geomagnetic activity indexes.   Here we'll load some OMNI data for the date we're studying, and plot the proton density, flow speed, and dynamic pressure.

In [None]:
pyspedas.omni.data(trange=trange)
tplot(['proton_density', 'flow_speed', 'Pressure'])
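
The three quantities plotted here are related: dynamic pressure scales as density times speed squared. The sketch below computes a proton-only estimate, assuming density in cm^-3 and speed in km/s. The function name is hypothetical, and the OMNI pressure variable may include an alpha-particle contribution, so its values need not match this estimate exactly.

```python
PROTON_MASS_KG = 1.6726e-27

def proton_dynamic_pressure_npa(density_cm3, speed_km_s):
    """Proton-only dynamic pressure in nPa from density (cm^-3) and speed (km/s)."""
    density_m3 = density_cm3 * 1.0e6      # cm^-3 to m^-3
    speed_m_s = speed_km_s * 1.0e3        # km/s to m/s
    pressure_pa = PROTON_MASS_KG * density_m3 * speed_m_s ** 2
    return pressure_pa * 1.0e9            # Pa to nPa

p = proton_dynamic_pressure_npa(5.0, 400.0)
print(round(p, 3))
```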

# Ground Magnetometer Data #

PySPEDAS can load magnetometer data from several networks of ground stations in North America, Europe, Antarctica, and other locations.  Most of them can be accessed via the THEMIS GMAG load routine.   

In [None]:
gmag_vars = pyspedas.themis.gmag(sites=['fsmi', 'fykn', 'atha'], trange=trange)
tplot(gmag_vars)

Let's take a closer look at the FSMI data (Fort Smith, Northwest Territories, Canada).
The strength of the ambient field makes it hard to see small variations.  PyTplot has a "subtract_median" tool that will help visualize the variations.

In [None]:
pyspedas.subtract_median('thg_mag_fsmi',newname='thg_mag_fsmi_subtract_median')
tplot('thg_mag_fsmi_subtract_median')
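
The idea behind median subtraction can be sketched in a few lines of NumPy: compute the median of each component over time and subtract it, leaving only the variations. This is an illustration of the concept on made-up values, not the library's implementation; `subtract_median_sketch` is a hypothetical name.

```python
import numpy as np

def subtract_median_sketch(y):
    """Subtract each column's median over time, ignoring NaNs."""
    y = np.asarray(y, dtype=float)
    return y - np.nanmedian(y, axis=0)

# Made-up three-component field values (nT) with a large ambient offset.
field = np.array([[12000.0, 500.0, 58000.0],
                  [12010.0, 495.0, 58020.0],
                  [11990.0, 505.0, 57980.0]])
residual = subtract_median_sketch(field)
print(residual)
```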