# Notebook to visualize SWOT longitudinal profile data, and modify vertical datum and units

The SWOT satellite, launched in December 2022, measures water surface elevation, width, and slope. It sees nearly all global rivers and lakes. At higher-latitude locations such as Alaska, a given site is usually observed 3 or 4 times per 21-day cycle.

<img src="SWOT-Mission-Surface-Water-Ocean-Topography.jpg" alt="SWOT" width="500"/>

Key documents: https://podaac.jpl.nasa.gov/SWOT

This notebook was written by Mike Durand (durand.8@osu.edu), with contributions from Bidhya Yadav, Ohio State University. It was presented April 23, 2025.

This notebook will first present a little bit of background on SWOT data. Then we'll use SWOT data for a river reach you can select using this notebook. We'll pull data from NASA servers, and visualize the timeseries of river elevations at one reach on a river you choose. Then we'll look at the longitudinal profile of water elevations for that reach.

One note on using Jupyter notebooks: you can basically follow along with the presentation by pushing "Shift+Enter" when each cell is highlighted.  
* The blue bar on the left shows you which cell is highlighted 
* Many cells are just displaying information and graphics, while other cells are Python code
* Careful! If you run a cell with Python code more than once, it will repeat the code (which may apply an offset you only want to apply once, or re-run an operation that takes a little while, for example)
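
To see why re-running a cell matters, here is a minimal sketch with made-up numbers (the values and variable names are illustrative, not SWOT data). Updating a value in place gives a different answer each time the line runs, while writing the result to a new name gives the same answer no matter how often it runs.

```python
# Illustrative values only; not real SWOT data.
elevation_m = 100.0
offset_m = 2.5

# In-place update: every run of this line adds the offset again.
in_place = elevation_m
in_place = in_place + offset_m   # first run
in_place = in_place + offset_m   # second run: offset has now been applied twice

# Writing to a new name is safe to re-run: the input never changes.
corrected = elevation_m + offset_m
corrected = elevation_m + offset_m  # same result after a second run

print(in_place, corrected)
```

If you are unsure whether a cell has already been run, restarting the kernel and running the notebook from the top is the simplest way to get back to a known state.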

One final note: we are running today on CUAHSI's cloud computing resources: https://www.hydroshare.org/group/156. This lets us all work together in a shared environment, and gives anyone access to the tools we need for the workshop.

## Background on SWOT and its data products

There is a lot to know about data products, and we cannot cover all of it here. Please see this 15 minute video by Tamlin Pavelsky (UNC), the US Hydrology lead for SWOT, for a more detailed look: https://podaac.jpl.nasa.gov/animations/Hydrology-Data-Products-from-the-SWOT-Mission. 

From that video, here we'll cover just two data product types, both of which are in the "River Single Pass" dataset described in the video and at the "Key documents" link above. Specifically, in this workshop we'll just look at river reaches and nodes. But first, we'll take a brief look at one level lower, the pixel cloud.

### SWOT pixel cloud

The primary instrument on SWOT is the Ka-band Radar Interferometer (KaRIn). As an interferometric SAR, it measures radar backscatter intensity, phase, and coherence. These low-level quantities are processed into water surface elevation, mapped on an irregular grid, which we call the pixel cloud:

<img src="PixelCloud.png" alt="SWOT PixC" width="750"/>


### SWOT river data products: nodes and reaches

The pixel cloud measurements are then mapped to river nodes, which are points spaced every ~200 m along river centerlines. There are on the order of 7.5 million nodes on global rivers.

<img src="Nodes-and-reaches.png" alt="SWOT PixC" width="750"/>

River products are aggregated again from nodes to river reaches, which are approximately 10 km in length. There are on the order of 150,000 reaches on global rivers.
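
As a quick sanity check on these numbers, the arithmetic below uses only the approximate figures quoted above. It suggests that a typical reach should contain about 50 nodes, whether you estimate from the spacing or from the global counts.

```python
# Approximate figures quoted in the text above.
node_spacing_m = 200        # nodes are spaced roughly every 200 m
reach_length_m = 10_000     # reaches are roughly 10 km long
n_nodes_global = 7_500_000  # order-of-magnitude global node count
n_reaches_global = 150_000  # order-of-magnitude global reach count

nodes_per_reach_from_spacing = reach_length_m / node_spacing_m
nodes_per_reach_from_counts = n_nodes_global / n_reaches_global

print(nodes_per_reach_from_spacing, nodes_per_reach_from_counts)
```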

## Objectives:

The workflow in this notebook contains cells of Python code that do the following: 
1. Set up the compute environment by importing software packages
2. Identify the reachid you are interested in, using the "SWORD Explorer" website
3. Enter the period of interest and plot a timeseries of SWOT elevations for a reach
4. Convert the vertical datum from SWOT's native datum to a datum of your choice
5. Pull a longitudinal profile of SWOT data for the reach and time of your choice
6. Export data to csv, and download

## 1 Set up environment

The Python cells below need to be run each time the notebook is executed. They set up the libraries needed to run here in CUAHSI's JupyterHub cloud.

In [None]:
# We'll use the plotly library to show the data. Other libraries are imported in the Utilities.py file
import plotly.express as px

In [None]:
# These functions pull reach timeseries, pull long profiles, and compute the datum offset, respectively
from Utilities import PullReachTimeseries, PullLongitudinalProfile, ChangeDatum

## 2 Find reach of interest

To choose a reach to analyze, go to SWORD Explorer: https://www.swordexplorer.com

To view Alaska, you must first click on the "81" basin. Then you should see a map that looks like this. Zoom in, and click on the reach you are interested in, and you'll see the reachid pop up.

<img src="SWORD.png" alt="SWORD" width="500"/>


In [None]:
# Define reach
reachid='81246000021' # this is the Nenana River at Nenana

## 3 Define time period of interest

This will pull and show a timeseries of SWOT overpasses for a reach, by displaying a timeseries of water elevations.

The command below uses a wonderful service called Hydrocron, which lets you query SWOT data through API calls. You can read about it here: https://www.earthdata.nasa.gov/news/hydrocron-new-tool-swot-time-series-analysis

This operation takes a few seconds.
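
To give a feel for what an API query looks like, the sketch below only builds a query URL and does not send a request. The base URL and parameter names are assumptions based on my reading of the public Hydrocron documentation, and the time window is arbitrary, so check them against the documentation before relying on them. The internals of `PullReachTimeseries` live in Utilities.py and are not shown here, so this is not necessarily what that function does.

```python
from urllib.parse import urlencode

# ASSUMPTION: base URL and parameter names follow the public Hydrocron
# documentation as I understand it; verify before relying on them.
HYDROCRON_URL = "https://soto.podaac.earthdatacloud.nasa.gov/hydrocron/v1/timeseries"

def build_hydrocron_url(reach_id, start_time, end_time, fields):
    """Return a query URL for a reach timeseries (no request is sent)."""
    params = {
        "feature": "Reach",
        "feature_id": reach_id,
        "start_time": start_time,
        "end_time": end_time,
        "output": "csv",
        "fields": ",".join(fields),
    }
    # Keep ':' and ',' readable instead of percent-encoding them.
    return HYDROCRON_URL + "?" + urlencode(params, safe=":,")

example_url = build_hydrocron_url(
    "81246000021",            # Nenana River at Nenana (reach used in this notebook)
    "2023-01-01T00:00:00Z",   # illustrative time window
    "2025-01-01T00:00:00Z",
    ["reach_id", "time_str", "wse", "slope"],
)
print(example_url)
```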

In [None]:
# This command queries SWOT data using the Hydrocron service
df=PullReachTimeseries(reachid)

In [None]:
# Inspect downloaded data
df.head()

This is the timeseries displayed as a dataframe using the powerful Pandas library (https://pandas.pydata.org). Basically it's similar to an Excel table that lets us manipulate and plot the data. 

For description of each field, see the "Key documents" link above. In particular, check out the Handbook, and the Product Description Document.

SWOT allows measurement of many quantities. An interesting one is slope.

In [None]:
# Plot swot slope as a timeseries 
px.line(df,x='time_str',y='slope',   # this line just tells which columns in the dataframe we want to plot
       labels={"time_str": "",               # this one and the one below just provide the xlabel and ylabel
               "slope": "water surface slope: dimensionless "},
        markers=True)

This data has been filtered to include quality flags 0 and 1 (good and suspect). You can read more about the various data quality flags in the "Key documents" link at the top of this notebook.
* You should notice that some observations look a bit more suspect than others. 
* A more sophisticated filter might be able to remove datapoints such as the one on September 16, 2024, but it is shown here as a reminder to apply sanity checks at all times. SWOT is an experimental mission. 
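
As one possible version of a "more sophisticated filter", the sketch below flags points that sit far from the series median, measured in units of the median absolute deviation (MAD). The data, the dates, and the threshold of 5 are all invented for illustration; this is not a filter used by the SWOT project or by this notebook.

```python
import pandas as pd

# Toy series with one obvious outlier; values are invented for illustration.
toy = pd.DataFrame({
    "time_str": ["2024-07-01", "2024-07-08", "2024-07-15", "2024-07-22", "2024-07-29"],
    "wse": [105.2, 105.0, 112.9, 104.6, 104.4],
})

median = toy["wse"].median()
mad = (toy["wse"] - median).abs().median()   # median absolute deviation

# Flag points more than 5 MADs from the median (the threshold is an assumption).
toy["is_outlier"] = (toy["wse"] - median).abs() > 5 * mad
filtered = toy[~toy["is_outlier"]]

print(filtered)
```

A median-based rule is used here because a single large outlier barely shifts the median, whereas it can noticeably inflate a mean and standard deviation.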



## 4 Convert WSE to NAVD88
We will use NOAA's VDatum API (https://vdatum.noaa.gov/docs/services.html) to estimate the offset between SWOT's EGM08 datum and NAVD88.

The call to the NOAA API takes a few seconds to run.
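
The arithmetic in the next cell is simple enough to check by hand. In this sketch the elevation and offset are made-up numbers (the real offset comes from the NOAA API and varies by location). It applies zout = zin + offset and then converts meters to feet using the exact definition 1 ft = 0.3048 m.

```python
M_PER_FT = 0.3048  # exact by definition of the international foot

# Made-up values for illustration only.
wse_egm08_m = 105.00   # water surface elevation in SWOT's native datum
offset_m = 1.25        # hypothetical datum offset

wse_navd88_m = wse_egm08_m + offset_m      # zout = zin + offset
wse_navd88_ft = wse_navd88_m / M_PER_FT    # meters -> feet

print(round(wse_navd88_m, 2), round(wse_navd88_ft, 2))
```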

In [None]:
swot_offset, zout = ChangeDatum(df)  #retrieve offset value zout=zin+swot_offset
df['wse_NAVD88'] = df['wse'] + swot_offset #  this offset is applied in units of meters
df['wse_NAVD88_ft'] = df['wse_NAVD88']/0.3048 # convert meters to feet

In [None]:
# Plot swot data as a timeseries (feet)
px.line(df,x='time_str',y='wse_NAVD88_ft',   # this line just tells which columns in the dataframe we want to plot
       labels={"time_str": "",               # this one and the one below just provide the xlabel and ylabel
               "wse_NAVD88_ft": "water surface elevation: feet above NAVD88 "},
        markers=True)

This data has been filtered to include quality flags 0 and 1 (good and suspect). You can read more about the various data quality flags in the "Key documents" link at the top of this notebook.
* You should notice that some observations look a bit more suspect than others. 
* A more sophisticated filter might be able to remove datapoints such as the one on September 16, 2024, but it is shown here as a reminder to apply sanity checks at all times. SWOT is an experimental mission. 
* Notice too that I have not removed ice-flagged data; we are still learning exactly what SWOT sees when the water surface is frozen.
* There was a processing update in October 2024, and data since then has looked a bit better. 


Look at the plot above and choose a day with data for which to analyze the long profile.

From the plot above, I am interested in seeing the profile on August 4. It does not look abnormal, and it falls in an ice-free period. Let's also look at a lower-water time, in October.

In [None]:
# Save the day as one or two different times [yyyy-mm-dd,yyyy-mm-dd]. 
tlongs=['2024-08-04','2024-10-27'] # note - this can only be a series of one or two times

The notebook cells below will pull a long profile for each of these days.

## 5 Pull longitudinal profile

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
# This command retrieves SWOT data for times listed above (just the first and last in the list)
longdf=PullLongitudinalProfile(reachid,tlongs)

In [None]:
# Let's take a look at the data in tabular format
longdf

Note: by default, this is only pulling data with node_q of 0 and 1. You can optionally supply another input to the PullLongitudinalProfile function that allows you to specify which quality flags to include. 
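
If you do pull lower-quality nodes as well, you can also filter afterward in pandas. The sketch below uses an invented table and assumes the quality flag is stored in a column named `node_q`, as the note above suggests. It does not show the actual arguments of `PullLongitudinalProfile`, which are defined in Utilities.py.

```python
import pandas as pd

# Invented node table; real SWOT tables have many more columns.
nodes = pd.DataFrame({
    "node_id": [1, 2, 3, 4, 5],
    "wse": [104.1, 104.3, 110.0, 104.8, 105.0],
    "node_q": [0, 1, 3, 0, 2],
})

keep_flags = [0, 1]  # the flags kept by default, per the note above
good_nodes = nodes[nodes["node_q"].isin(keep_flags)]

print(good_nodes)
```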

In [None]:
# Convert to NAVD88. Run this cell only once: each re-run adds the offset again.
longdf["wse"] = longdf["wse"] + swot_offset

In [None]:
# Plot long profiles, color-coded by date
px.line(longdf,x='p_dist_out',y='wse',color='date',
       labels={"p_dist_out": "Distance to outlet [m]",
               "wse": "wse[m]"},
        markers=True)

The "distance to outlet" values (measured to the outlet point in the ocean) are not easy to interpret.

Let's plot distance as kilometers to Tanana confluence instead 

In [None]:
# This is based on the observation from the above graph that the minimum flow distance could be treated as a "zero" point.
# This same thing could be done for any reach. Just happens that this one ends at Tanana
longdf['dist_up_conf']=(longdf['p_dist_out']-longdf['p_dist_out'].min())/1000.

In [None]:
# Plot compared with distance to Tanana confluence
px.line(longdf,x='dist_up_conf',y='wse',color='date',
       labels={"dist_up_conf": "Distance to Tanana confluence [km]",
               "wse": "wse[m]"},
        markers=True)

In [None]:
# Convert to English units
longdf['wse [ft]'] = longdf['wse']/0.3048
longdf['dist_up_conf [mi]'] = longdf['dist_up_conf']/1.609344  # km to miles (1 mi = 1.609344 km)

In [None]:
# Plot in English units
px.line(longdf,x='dist_up_conf [mi]',y='wse [ft]',color='date',
       labels={"dist_up_conf [mi]": "Distance to Tanana confluence [mi]",
               "wse [ft]": "wse[feet]"},   # key must match the plotted column name
        markers=True)

This is really interesting: it seems to show that the upstream water elevation changes much less than the downstream elevation. It also looks as if the slope is flatter at high flow and steeper at low flow. We can check that by comparing the SWOT slope values, which I read manually from the slope graph above:
* August 4, 2024: 0.00162 ft/ft
* October 27, 2024: 0.00167 ft/ft 
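
Another way to check the slope is to fit a straight line to elevation versus distance along the profile. The sketch below does this on a synthetic profile built with a known slope of 0.0016, so the numbers are not SWOT data. With the real table you would pass the distance and elevation columns for one date at a time, both in the same length unit so that the slope is dimensionless.

```python
import numpy as np

# Synthetic profile: nodes every 200 m with a known slope of 0.0016.
distance_m = np.arange(0, 10_000, 200, dtype=float)
true_slope = 0.0016
wse_m = 100.0 + true_slope * distance_m

# Least-squares straight line: wse = slope * distance + intercept
slope, intercept = np.polyfit(distance_m, wse_m, 1)

print(f"fitted slope: {slope:.5f}")
```

Real profiles are noisy, so a fitted slope will not match the reach-level SWOT slope exactly; a large disagreement is a hint to look more closely at the node data.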

In [None]:
# Now let's look at width
longdf['width [ft]'] = longdf['width']/0.3048
px.line(longdf,x='dist_up_conf [mi]',y='width [ft]',color='date',
       labels={"dist_up_conf [mi]": "Distance to Tanana confluence [mi]",
               "width [ft]": "river width [feet]"},
        markers=True)

SWOT is clearly capturing the high and low points. However, several things are going on here. The times when SWOT width goes to zero in August are clearly wrong. This likely reflects the older processing: a major software update was applied in October 2024. All data will be reprocessed in calendar year 2025, but for now everything before October 2024 is subject to some errors, especially in width.
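
One simple way to keep the zero widths out of a plot or an average is to treat them as missing values. This is a sketch on an invented table, and the rule "a width of zero means missing" is an assumption for illustration, not an official SWOT quality flag.

```python
import numpy as np
import pandas as pd

# Invented widths in meters; zeros stand in for failed retrievals.
toy = pd.DataFrame({"width": [210.0, 0.0, 195.0, 0.0, 205.0]})

# Replace non-positive widths with NaN so plots and statistics skip them.
toy["width_clean"] = toy["width"].where(toy["width"] > 0, np.nan)

print(toy["width_clean"].mean())
```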

## 6 Export data to csv and download

This command writes the data to a csv file in the current working directory.

In [None]:
longdf.to_csv('example-long-profile-output.csv')

Now you can right-click the .csv file in the file list on the left and download it to your local computer!
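
By default, `to_csv` also writes the dataframe's row index as an unnamed first column. If you would rather not have that column in your spreadsheet, you can pass `index=False`. The sketch below shows this on an invented two-row table, writing to an in-memory buffer instead of a file so that it leaves nothing behind.

```python
import io
import pandas as pd

# Invented two-row table standing in for the long profile.
toy = pd.DataFrame({"dist_up_conf": [0.0, 0.2], "wse": [104.1, 104.3]})

buffer = io.StringIO()            # in-memory stand-in for a file on disk
toy.to_csv(buffer, index=False)   # index=False omits the row-index column
buffer.seek(0)

round_trip = pd.read_csv(buffer)
print(list(round_trip.columns))
```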