![](https://img.shields.io/badge/PO.DAAC-Contribution-%20?color=grey&labelColor=blue)

> From the PO.DAAC Cookbook, to access the GitHub version of the notebook, follow [this link](https://github.com/podaac/tutorials/blob/master/notebooks/datasets/SWOTHR_localmachine.ipynb).

# SWOT Hydrology Dataset Exploration on a local machine

## Accessing and Visualizing SWOT Datasets

### Requirement:
Local compute environment e.g. laptop, server: this tutorial can be run on your local machine.

### Learning Objectives:
- Access SWOT HR data products (archived in NASA Earthdata Cloud) within the AWS cloud, by downloading them to a local machine
- Visualize accessed data for a quick check

#### SWOT Level 2 KaRIn High Rate Version 2.0 Datasets:

1. **River Vector Shapefile** - SWOT_L2_HR_RIVERSP_2.0

2. **Lake Vector Shapefile** - SWOT_L2_HR_LAKESP_2.0

3. **Water Mask Pixel Cloud NetCDF** - SWOT_L2_HR_PIXC_2.0

4. **Water Mask Pixel Cloud Vector Attribute NetCDF** - SWOT_L2_HR_PIXCVec_2.0

5. **Raster NetCDF** - SWOT_L2_HR_Raster_2.0

6. **Single Look Complex Data product** - SWOT_L1B_HR_SLC_2.0

_This notebook has been slightly modified by the University of Sherbrooke and University Laval team to run smoothly in Google Colab for the June 10, 2024 training session. Original authors: Cassie Nickles, NASA PO.DAAC (Feb 2024) || Other Contributors: Zoe Walschots (PO.DAAC Summer Intern 2023), Catalina Taglialatela (NASA PO.DAAC), Luis Lopez (NASA NSIDC DAAC)_

_Last update: January 13, 2026_

  

### Libraries Needed

In [None]:
!pip install contextily
!pip install earthaccess
!pip install --upgrade holoviews hvplot
!pip install holoviews hvplot bokeh xarray
!pip install rioxarray
!pip install rasterio
!pip install shapely
!pip install geoviews
!pip install pyproj
#!pip install csrspy

#!pip install hvplot


import glob
import h5netcdf
import xarray as xr
import pandas as pd
import geopandas as gpd
import contextily as cx
import numpy as np
import matplotlib.pyplot as plt
import hvplot.xarray
import holoviews as hv
import zipfile
import earthaccess
import os
import rioxarray
from shapely.geometry import mapping
from shapely.geometry import Point
import csv
import shapefile
import geoviews as gvts
from pyproj import Proj
import logging
from shapely.geometry import box
from datetime import datetime
from tqdm import tqdm

#from csrspy.main import CSRSTransformer
#from csrspy.enums import CoordType, Reference, VerticalDatum
#from csrspy.utils import sync_missing_grid_files


Install modified csrspy library for transformations

In [None]:
# Download and extract the library
!wget -O csrspy_modifie.zip https://github.com/sfoucher/SWOT-Canada/raw/main/csrspy_modifie.zip
!unzip -o csrspy_modifie.zip -d csrspy_modifie

# Add the folder to the Python path
import sys
sys.path.append('/content/csrspy_modifie')

# Verify contents
!ls /content/csrspy_modifie

In [None]:

# Import the package
import sys
sys.path.append('/content/csrspy_modifie/csrspy_modifie')
import csrspy
from csrspy.main import CSRSTransformer
from csrspy.enums import CoordType, Reference, VerticalDatum
from csrspy.utils import sync_missing_grid_files
dir(csrspy)

### Earthdata Login

An Earthdata Login account is required to access data, as well as to discover restricted data, from the NASA Earthdata system. If you don't already have one, please visit https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. The account is free and only takes a moment to set up. We use `earthaccess` to authenticate your login credentials below.

In [None]:
auth = earthaccess.login()
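
To avoid typing credentials at every session, `earthaccess` can reuse a login stored in a `.netrc` file. The sketch below only checks whether such an entry exists, using the standard library; the helper name `has_earthdata_entry` and the demo credentials are hypothetical, and the host name is the one given in the text above.

```python
import netrc
import os
import tempfile

EARTHDATA_HOST = "urs.earthdata.nasa.gov"  # host named in the text above


def has_earthdata_entry(netrc_path):
    """Return True if the netrc file holds a login for the Earthdata host."""
    if not os.path.exists(netrc_path):
        return False
    entry = netrc.netrc(netrc_path).authenticators(EARTHDATA_HOST)
    return entry is not None


# Demonstration on a throwaway file with placeholder (fake) credentials
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_netrc")
    with open(demo_path, "w") as f:
        f.write(f"machine {EARTHDATA_HOST} login demo_user password demo_pass\n")
    found = has_earthdata_entry(demo_path)
    missing = has_earthdata_entry(os.path.join(tmp, "absent"))

print(found, missing)
```

On a real machine the path to check would typically be `~/.netrc`; that location is an assumption about your setup.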

### Single File Access

#### **1. River Vector Shapefiles**

The HTTPS access link can be found using the `earthaccess` data search. Since this collection contains both Reach and Node files, we select a single file type by filtering on the granule name. In this example we filter for 'Node' in the file name; use 'Reach' instead to retrieve the reach files.

Alternatively, Earthdata Search [(see tutorial)](https://nasa-openscapes.github.io/2021-Cloud-Workshop-AGU/tutorials/01_Earthdata_Search.html) can be used to search manually in a graphical interface.

For additional tips on spatial searching of SWOT HR L2 data, see also [PO.DAAC Cookbook - SWOT Chapter tips section](https://podaac.github.io/tutorials/quarto_text/SWOT.html#tips-for-swot-hr-spatial-search).




#### Search for the data of interest

---



In [None]:
# Retrieve granules with the desired characteristics using the `earthaccess.search_data` function
river_results = earthaccess.search_data(short_name = 'SWOT_L2_HR_RiverSP_D', # Enter 'SWOT_L2_HR_RiverSP_2.0' for the version C data
                                        #temporal = ('2024-02-01 00:00:00', '2025-11-03 23:59:59'), # can also be filtered based on the time range
                                        granule_name = '*Node*_214_NA*') # Here we filter by Node files (not Reach), by pass, and by continent
                                                                        # Specify 'Node' or 'Reach' depending on the desired file
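
To see what the wildcard in `granule_name` selects, the sketch below applies the same pattern to a few file names with shell-style matching from the standard library. It assumes that the search service treats `*` the way `fnmatch` does. The first name is the RiverSP Node granule referenced later in this notebook; the other two are hypothetical, and reading the three numeric/letter fields as cycle, pass, and continent is an assumption based on the naming pattern.

```python
from fnmatch import fnmatchcase

names = [
    # Name of the RiverSP Node granule used later in this notebook
    "SWOT_L2_HR_RiverSP_Node_044_214_NA_20260111T003728_20260111T004317_PID0_01.zip",
    # Hypothetical names built for illustration only
    "SWOT_L2_HR_RiverSP_Reach_044_214_NA_20260111T003728_20260111T004317_PID0_01.zip",
    "SWOT_L2_HR_RiverSP_Node_044_242_EU_20260112T003728_20260112T004317_PID0_01.zip",
]

pattern = "*Node*_214_NA*"
matches = [name for name in names if fnmatchcase(name, pattern)]

# Split the matching name into its underscore-separated fields
parts = matches[0].split("_")
cycle, pass_number, continent = parts[5], parts[6], parts[7]

print(matches)
print(cycle, pass_number, continent)
```

Only the Node file of pass 214 over North America matches; replacing `Node` with `Reach` in the pattern would select the second name instead.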


In [None]:
# Print the properties of the granules associated with the selected pass
print(river_results)

#### Download, unzip, and read the data

Let's download the selected data file! `earthaccess.download` has a list as the input format, so we need to put brackets around the single file we pass.
Here, we download the most recent file from the collection.



In [None]:
earthaccess.download([river_results[-1]], "./data_downloads")

The native format for this data is a .zip file, and we want the .shp file within the .zip file, so we must first extract the data to open it. First, we'll programmatically get the filename we just downloaded, and then extract all data to the `data_downloads` folder.

In [None]:
filename = earthaccess.results.DataGranule.data_links(river_results[-1], access='external')
filename = filename[0].split("/")[-1]
filename

In [None]:
with zipfile.ZipFile(f'data_downloads/{filename}', 'r') as zip_ref:
    zip_ref.extractall('data_downloads')
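
Before or after extracting, it can be useful to check which shapefile the archive actually contains. This is a minimal sketch using only the standard library; the helper `list_shapefile_members` and the demo archive are hypothetical stand-ins for a downloaded granule.

```python
import os
import tempfile
import zipfile


def list_shapefile_members(zip_path):
    """Return the names of the .shp members stored in a zip archive."""
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        return [name for name in zip_ref.namelist() if name.lower().endswith(".shp")]


# Build a small stand-in archive (hypothetical file names, empty contents)
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo_granule.zip")
    with zipfile.ZipFile(demo_zip, "w") as zf:
        for ext in (".shp", ".shx", ".dbf", ".prj"):
            zf.writestr("demo_granule" + ext, b"")
    shp_members = list_shapefile_members(demo_zip)

print(shp_members)
```

In the notebook, the same helper could be called on `f'data_downloads/{filename}'` to confirm the `.shp` name before opening it.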

Open the shapefile using `geopandas`

In [None]:
filename_shp = filename.replace('.zip','.shp')

In [None]:
SWOT_HR_shp1 = gpd.read_file(f'data_downloads/{filename_shp}')

#view the attribute table
SWOT_HR_shp1

#### Quickly plot the SWOT river data

Display water elevations (WSE, Water Surface Elevation)

In [None]:
# Define the bounding box coordinates (xmin, ymin, xmax, ymax)
bounding_box = box(-67.01, 45.82, -66.42, 46.4)

# Filter out fill values (-999999999999) in the 'wse' column
SWOT_HR_shp1_filtered = SWOT_HR_shp1[SWOT_HR_shp1['wse'] != -999999999999]

# Clip the shapefile to the bounding box
SWOT_HR_shp1_clipped = gpd.clip(SWOT_HR_shp1_filtered, bounding_box)

# Plot WSE after clipping and filtering out fill values
fig, ax = plt.subplots(figsize=(10,10))

# Color the features according to the 'wse' column
SWOT_HR_shp1_clipped.plot(ax=ax, column='wse', cmap='cool', legend=True,
                          legend_kwds={'label': "Water Surface Elevation (WSE)", 'orientation': "vertical"})

# Add a basemap
cx.add_basemap(ax, crs=SWOT_HR_shp1_clipped.crs, source=cx.providers.Esri.WorldStreetMap)

plt.show()


Display node quality

In [None]:
import matplotlib.colors as mcolors
# Create a list of unique values in node_q
unique_values = np.sort(SWOT_HR_shp1['node_q'].unique())

# Create a discrete colormap
cmap = plt.get_cmap('RdYlGn_r', len(unique_values))
norm = mcolors.BoundaryNorm(boundaries=np.arange(len(unique_values)+1)-0.5, ncolors=len(unique_values))

# Display with coloring based on node_q
fig, ax = plt.subplots(figsize=(10,10))

# Plot figure colored by node_q
SWOT_HR_shp1.plot(ax=ax, column='node_q', cmap=cmap, norm=norm, legend=False)

# Add legend
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
sm.set_array([])

cbar = plt.colorbar(sm, ax=ax, ticks=range(len(unique_values)))
cbar.ax.set_yticklabels(unique_values)
cbar.set_label('Node quality')

# Define area of interest boundaries
ax.set_ylim(45.82, 46.4)
ax.set_xlim(-67.01, -66.42)


# Add basemap
cx.add_basemap(ax, crs=SWOT_HR_shp1.crs, source=cx.providers.Esri.WorldStreetMap)

plt.show()

print("""0 = good
1 = suspect - may have large errors
2 = degraded - very likely to have large errors
3 = bad - may be nonsensical and should be ignored
""")

In [None]:
# Another way to plot geopandas dataframes is with `explore`, which also plots a basemap
#SWOT_HR_shp1.explore()
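
Alongside the node quality map, a quick count of nodes per quality class shows how much of the pass is usable. The sketch below uses the four flag meanings listed in this section; the helper name `summarize_quality` and the flag values are hypothetical.

```python
from collections import Counter

# Flag meanings as listed in this section
QUALITY_LABELS = {0: "good", 1: "suspect", 2: "degraded", 3: "bad"}


def summarize_quality(flags):
    """Count how many nodes fall in each quality class."""
    counts = Counter(flags)
    return {QUALITY_LABELS[code]: counts.get(code, 0) for code in sorted(QUALITY_LABELS)}


demo_flags = [0, 0, 1, 3, 2, 0, 1]  # synthetic node_q values
summary = summarize_quality(demo_flags)
print(summary)
```

With the data loaded above, passing `SWOT_HR_shp1['node_q']` instead of `demo_flags` should give the same kind of summary, assuming the column holds the integer codes 0 to 3.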

#### **2. Lake Vector Shapefiles**

The lake vector shapefiles can be accessed in the same way as the river shapefiles above.

For additional tips on spatial searching of SWOT HR L2 data, see also [PO.DAAC Cookbook - SWOT Chapter tips section](https://podaac.github.io/tutorials/quarto_text/SWOT.html#tips-for-swot-hr-spatial-search).

#### Search for data of interest

In [None]:
lake_results = earthaccess.search_data(short_name = 'SWOT_L2_HR_LAKESP_D',
                                        #temporal = ('2024-02-01 00:00:00', '2024-02-29 23:59:59'), # can also be filtered based on the time range
                                        granule_name = '*Prior*_214_NA*') # Here we filter files with 'Prior' (this collection has three options: Obs, Unassigned, and Prior), by pass, and by continent


In [None]:
#Print the granule characteristics associated with the selected pass.
print(lake_results)

Let's download the selected data file! For the purposes of this training, let's download a granule from the Halifax region. `earthaccess.download` takes a list as input, so we need to put brackets around the single file we pass. Here, we download the most recent file from the collection.


In [None]:
earthaccess.download([lake_results[-1]], "./data_downloads")

The native format for this data is a .zip file, and we want the .shp file within the .zip file, so we must first extract the data to open it. First, we'll programmatically get the filename we just downloaded, and then extract all data to the `data_downloads` folder.

In [None]:
filename2 = earthaccess.results.DataGranule.data_links(lake_results[-1], access='external')
filename2 = filename2[0].split("/")[-1]
filename2

In [None]:
with zipfile.ZipFile(f'data_downloads/{filename2}', 'r') as zip_ref:
    zip_ref.extractall('data_downloads')

Open the shapefile using `geopandas`

In [None]:
filename_shp2 = filename2.replace('.zip','.shp')
filename_shp2

In [None]:
SWOT_HR_shp2 = gpd.read_file(f'data_downloads/{filename_shp2}')

#view attribute table
SWOT_HR_shp2

#### Quickly plot the SWOT lakes data

Display water elevations (WSE, Water Surface Elevation)

In [None]:
# Filter out fill values and clip to the bounding box
SWOT_HR_shp2_clipped = gpd.clip(SWOT_HR_shp2.query('wse != -999999999999'), box(-67.10, 45.47, -66.82, 45.65)) # Oromocto Lake

# Display the clipped shapefile
fig, ax = plt.subplots(figsize=(7, 7))


# Plot the figure with colors based on the WSE column
SWOT_HR_shp2_clipped.plot(ax=ax, column='wse', cmap='cool', legend=True,
                          legend_kwds={'label': "Water Surface Elevation (WSE)", 'orientation': "vertical"})


# Label each lake on the plot with its WSE value
centroids = SWOT_HR_shp2_clipped.geometry.centroid
for x, y, label in zip(centroids.x, centroids.y, SWOT_HR_shp2_clipped['wse']):
    ax.text(x, y, f'{label:.2f}', fontsize=8, ha='center', va='center', color='black')

# Add basemap
cx.add_basemap(ax, crs=SWOT_HR_shp2_clipped.crs, source=cx.providers.Esri.WorldStreetMap)


plt.show()


Display the fraction of dark water.
Variable `dark_frac`: fraction of the total lake area (`area_total`) covered by dark water. This value ranges from 0 to 1, where 0 indicates no dark water and 1 indicates 100% dark water.
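
As a small worked example of this definition, the sketch below converts a dark water fraction into a percentage and into an area. The helper name `dark_water_stats` and the numbers are hypothetical; the area is returned in whatever units `area_total` uses.

```python
def dark_water_stats(area_total, dark_frac):
    """Return (dark-water area, dark-water percentage) for one lake."""
    if not 0.0 <= dark_frac <= 1.0:
        raise ValueError("dark_frac is expected to lie between 0 and 1")
    return area_total * dark_frac, dark_frac * 100.0


# Synthetic lake: total area of 12 (in the units of area_total), one quarter dark water
dark_area, dark_percent = dark_water_stats(area_total=12.0, dark_frac=0.25)
print(dark_area, dark_percent)
```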


In [None]:
# Filter out outliers and set the bounding box
SWOT_HR_shp2_clipped = gpd.clip(SWOT_HR_shp2.query('wse != -999999999999'),  box(-67.10, 45.47, -66.82, 45.65))

# Display the clipped shapefile
fig, ax = plt.subplots(figsize=(7, 7))


# Plot the figure with colors based on the dark_frac column
SWOT_HR_shp2_clipped.plot(ax=ax, column='dark_frac', cmap='RdYlGn_r', legend=True,
                          legend_kwds={'label': "Dark water fraction", 'orientation': "vertical"})

# Label each lake on the plot with its dark_frac value

centroids = SWOT_HR_shp2_clipped.geometry.centroid
for x, y, label in zip(centroids.x, centroids.y, SWOT_HR_shp2_clipped['dark_frac']):
    ax.text(x, y, f'{label:.2f}', fontsize=8, ha='center', va='center', color='black')

# Add basemap
cx.add_basemap(ax, crs=SWOT_HR_shp2_clipped.crs, source=cx.providers.Esri.WorldStreetMap)

plt.show()

Accessing the remaining files differs from accessing the shapefiles above. We do not need to extract anything from a zip file because the following SWOT HR collections are stored as **netCDF** files in the cloud. We open the rest of the products with `xarray`, not `geopandas`.

#### **3. Water Mask Pixel Cloud NetCDF**

#### Search for data collection and time of interest

In [None]:
pixc_results = earthaccess.search_data(short_name = 'SWOT_L2_HR_PIXC_D', # Enter 'SWOT_L2_HR_PIXC_2.0' for Version C (data before May 6, 2025)
                                        #temporal = ('2024-02-01 00:00:00', '2024-02-29 23:59:59'), # can also specify by time
                                        granule_name = '*_214_074L*') # pass number, tile number and swath side (R or L)
                                        #bounding_box = (-72.73,46.58,-72.60,46.62)) # filter by bounding box, to find your bounding box : http://bboxfinder.com/

In [None]:
#Print the granule characteristics associated with the selected pass, tile and swath.
print(pixc_results)

Let's download one data file! earthaccess.download has a list as the input format, so we need to put brackets around the single file we pass.

In [None]:
earthaccess.download([pixc_results[-1]], "./data_downloads")

#### Open data using xarray





The pixel cloud netCDF files are organized into three groups named "pixel_cloud", "tvp", and "noise" (more detail [here](https://podaac-tools.jpl.nasa.gov/drive/files/misc/web/misc/swot_mission_docs/pdd/D-56411_SWOT_Product_Description_L2_HR_PIXC_20200810.pdf)). To access the coordinates and variables within the file, a group must be specified when opening the file with `xarray`.

In [None]:
ds_PIXC = xr.open_mfdataset("data_downloads/SWOT_L2_HR_PIXC_*.nc", group = 'pixel_cloud', engine='h5netcdf') # If several PIXC files are present in the data_downloads folder, narrow the pattern to the file you want to display
ds_PIXC


#### Simple plot of the results

In [None]:
# This could take a few minutes to plot (approx. 5 min)
vmin, vmax = np.nanpercentile(ds_PIXC.height, [2, 98])

fig, ax = plt.subplots(figsize=(10, 10))

cax = ax.scatter(
    ds_PIXC.longitude,
    ds_PIXC.latitude,
    c=ds_PIXC.height,
    s=1,
    vmin=vmin,
    vmax=vmax,
    cmap='viridis'
)

cbar = fig.colorbar(cax, ax=ax, shrink=0.5)
cbar.set_label('Height (m)')

cx.add_basemap(ax, crs='EPSG:4326', source=cx.providers.Esri.WorldStreetMap)
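
The color limits in the plot above come from the 2nd and 98th percentiles, which keeps a few extreme heights from flattening the color scale. The sketch below reproduces the idea in plain Python with linear interpolation between ranks; the helper `nan_percentile` and the height values are hypothetical, and in the notebook `np.nanpercentile` does this job.

```python
import math


def nan_percentile(values, q):
    """Percentile (q between 0 and 100) with linear interpolation, ignoring NaN."""
    data = sorted(v for v in values if not math.isnan(v))
    if not data:
        return math.nan
    rank = (len(data) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return data[lower] * (1 - weight) + data[upper] * weight


# Synthetic heights: 0..99 m, one extreme outlier, and one missing value
heights = [float(i) for i in range(100)] + [5000.0, math.nan]

vmin = nan_percentile(heights, 2)
vmax = nan_percentile(heights, 98)
print(vmin, vmax)
```

The outlier at 5000 m would stretch a min/max color scale, whereas the percentile limits stay within the bulk of the data.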

#### Clip and convert PIXC to .shp for QGIS

In [None]:
# Suppress Fiona log messages below the ERROR level
logging.getLogger('fiona').setLevel(logging.ERROR)

# Bounding box coordinates
lat_min = 45.88
lat_max = 45.98
lon_min = -66.77
lon_max = -66.49

# Extract latitude and longitude
lat = np.asarray(ds_PIXC.latitude[:])
lon = np.asarray(ds_PIXC.longitude[:])
classif  = np.asarray(ds_PIXC.classification[:])

# Define the mask from the bounding box and the classification (classif > 2 and < 5 keeps classes 3 and 4)
mask = (lat > lat_min) & (lat < lat_max) & (lon > lon_min) & (lon < lon_max) & (classif>2) & (classif<5)

# Create a dictionary with the desired variables
data = {
    'height': np.asarray(ds_PIXC.height[:])[mask],
    'classif': np.asarray(ds_PIXC.classification[:])[mask],
    'latitude': lat[mask],
    'longitude': lon[mask]
}

# Convert the dictionary to a dataframe
df = pd.DataFrame(data)

# Create geometries & GeoDataFrames
points = [Point(x, y) for x, y in zip(df.longitude, df.latitude)]
gdf_out = gpd.GeoDataFrame(df, geometry=points, crs="EPSG:4326")

# Save as shapefile
out_shp = './data_downloads/PIXC_clipped.shp'
gdf_out.to_file(out_shp)
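
The selection rule used in the clipping cell can be written out for a single pixel, which makes the two conditions easier to read. This is a sketch with hypothetical pixels; it assumes that classes 3 and 4 correspond to water near land and open water, which should be confirmed in the PIXC product description.

```python
# Bounding box taken from the clipping cell
LAT_MIN, LAT_MAX = 45.88, 45.98
LON_MIN, LON_MAX = -66.77, -66.49

# Same selection as (classif > 2) & (classif < 5); class meanings are an assumption
WATER_CLASSES = {3, 4}


def keep_pixel(lat, lon, classif):
    """Return True if the pixel lies inside the box and has a selected class."""
    inside = LAT_MIN < lat < LAT_MAX and LON_MIN < lon < LON_MAX
    return inside and classif in WATER_CLASSES


# Synthetic pixels: (latitude, longitude, classification)
pixels = [
    (45.90, -66.60, 4),  # inside the box, class 4: kept
    (45.90, -66.60, 2),  # inside the box, class 2: dropped
    (46.10, -66.60, 4),  # outside the box: dropped
    (45.95, -66.50, 3),  # inside the box, class 3: kept
]
kept = [p for p in pixels if keep_pixel(*p)]
print(kept)
```

The notebook applies the same test to whole arrays at once with a boolean mask, which is much faster for millions of pixels.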

In [None]:
# Display your clipped PIXC data. You can also download them to visualize in QGIS.
vmin, vmax = np.nanpercentile(gdf_out.height, [2, 98])

fig, ax = plt.subplots(figsize=(10, 10))

cax = ax.scatter(
    gdf_out.longitude,
    gdf_out.latitude,
    c=gdf_out.height,
    s=1,
    vmin=vmin,
    vmax=vmax,
    cmap='viridis',
    rasterized=True
)

cbar = fig.colorbar(cax, ax=ax, shrink=0.5)
cbar.set_label('Height (m)')

cx.add_basemap(ax, crs='EPSG:4326', source=cx.providers.Esri.WorldStreetMap)

#### **4. Water Mask Pixel Cloud Vector Attribute NetCDF**

#### Search for data of interest

In [None]:
pixcvec_results = earthaccess.search_data(short_name = 'SWOT_L2_HR_PIXCVEC_D', # enter 'SWOT_L2_HR_PIXCVEC_2.0' for version C (data before May 6, 2025)
                                        #temporal = ('2024-02-01 00:00:00', '2024-02-29 23:59:59'), # can also specify by time
                                        granule_name = '*_214_074L*') # pass number, tile number and swath side (R or L)
                                        #bounding_box = (-72.73,46.58,-72.60,46.62)) # filter by bounding box, to find your bounding box : http://bboxfinder.com/


In [None]:
#Print the granule characteristics.
print(pixcvec_results)

Let's download one data file! `earthaccess.download` takes a list as input, so we need to put brackets around the single file we pass. Here, we download the most recent file from the collection.


In [None]:
earthaccess.download([pixcvec_results[-1]], "./data_downloads")

#### Open data using xarray

We open the file we just downloaded by matching its name with a wildcard pattern, then view it via `xarray`.

In [None]:
ds_PIXCVEC = xr.open_mfdataset("data_downloads/SWOT_L2_HR_PIXCVec_*.nc", decode_cf=False,  engine='h5netcdf')
ds_PIXCVEC

#### Simple plot

In [None]:
pixcvec_htvals = ds_PIXCVEC.height_vectorproc.compute()
pixcvec_latvals = ds_PIXCVEC.latitude_vectorproc.compute()
pixcvec_lonvals = ds_PIXCVEC.longitude_vectorproc.compute()

#Before plotting, we set all fill values to nan so that the graph shows up better spatially
pixcvec_htvals[pixcvec_htvals > 15000] = np.nan
pixcvec_latvals[pixcvec_latvals < 1] = np.nan
pixcvec_lonvals[pixcvec_lonvals > -1] = np.nan


In [None]:

vmin, vmax = np.nanpercentile(pixcvec_htvals, [2, 98])

fig, ax = plt.subplots(figsize=(10, 10))

cax = ax.scatter(
    pixcvec_lonvals,
    pixcvec_latvals,
    c=pixcvec_htvals,
    s=1,
    vmin=vmin,
    vmax=vmax,
    cmap='viridis',
    rasterized=True
)

cbar = fig.colorbar(cax, ax=ax, shrink=0.5)
cbar.set_label('Height (m)')

cx.add_basemap(ax, crs='EPSG:4326', source=cx.providers.Esri.WorldStreetMap)
plt.show()

#### **5. Raster NetCDF**

#### Search for data of interest

In [None]:
raster_results = earthaccess.search_data(short_name = 'SWOT_L2_HR_Raster_D', # enter 'SWOT_L2_HR_Raster_2.0' for version C (data before May 6, 2025)
                                        #temporal = ('2024-02-01 00:00:00', '2024-02-29 23:59:59'), # can also specify by time
                                        #bounding_box = (-72.73,46.58,-72.60,46.62), # filter by bounding box, to find your bounding box : http://bboxfinder.com/
                                        granule_name = '*100m*_214_037F*') # here we filter by files with '100m' in the name (This collection has two resolution options: 100m & 250m)



In [None]:
#Print the granule characteristics associated with the selected pass, scene and resolution.
print(raster_results)

Let's download one data file.  Here, we download the most recent file from the collection.

In [None]:
earthaccess.download([raster_results[-1]], "./data_downloads")

#### Open data with xarray

We open the file we just downloaded by matching its name with a wildcard pattern, then view it via `xarray`.

In [None]:
ds_raster = xr.open_mfdataset(f'data_downloads/SWOT_L2_HR_Raster*', engine='h5netcdf')
ds_raster

#### Quick interactive plot with `hvplot`

In [None]:
hv.extension('bokeh', 'matplotlib')
plot = ds_raster['wse'].hvplot.image(y='y', x='x')
hv.output(plot)



#### Mask a variable based on its quality indicator
Example for an L2_HR_Raster dataset, indicator "wse_qual":\
0 = good\
1 = suspect -  may have large errors\
2 = degraded - very likely do have large errors\
3 = bad -  may be nonsensical and should be ignored

In [None]:
variable_to_mask = ds_raster['wse']
mask_variable = ds_raster['wse_qual']


In [None]:
# Keep only pixels whose quality flag is below 3, i.e. hide data flagged as bad
mask_condition = mask_variable < 3
masked_variable = variable_to_mask.where(mask_condition)
masked_variable
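
The threshold in the mask condition decides how much data survives. The sketch below mimics the behaviour of `.where()` on plain lists (values that fail the condition become NaN) and compares a lenient and a strict threshold; the helper `mask_by_quality` and the values are hypothetical.

```python
import math


def mask_by_quality(values, quality, max_quality):
    """Keep values whose flag is strictly below max_quality; others become NaN."""
    return [v if q < max_quality else math.nan for v, q in zip(values, quality)]


wse = [10.2, 10.4, 11.0, 55.0]  # synthetic elevations (m)
wse_qual = [0, 1, 2, 3]         # synthetic quality flags

lenient = mask_by_quality(wse, wse_qual, 3)  # drops only "bad"
strict = mask_by_quality(wse, wse_qual, 1)   # keeps only "good"
print(lenient)
print(strict)
```

A stricter threshold gives cleaner values at the cost of coverage, so the right choice depends on the application.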


In [None]:
# Plot the masked variable
hv.extension('bokeh', 'matplotlib')
plot2 = masked_variable.hvplot.image(y='y', x='x')
hv.output(plot2)


#### Clip masked Raster NetCDF

In [None]:
# Create a folder for the clipped data
os.makedirs('./content/clip_data', exist_ok=True)

In [None]:
#Define region of interest
from shapely.geometry import box

ROI = box(-67.28,45.72,-66.22,46.20)
bbox_gdf = gpd.GeoDataFrame({'geometry': [ROI]}, crs='EPSG:4326')


In [None]:
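# Set the spatial dimensions of the dataset and the coordinate reference system (CRS)
# The EPSG code must match the UTM zone of the raster (here zone 19N, the same CRS used to reproject the region of interest)
masked_variable.rio.set_spatial_dims(x_dim="x", y_dim="y", inplace=True)
masked_variable.rio.write_crs("epsg:32619", inplace=True)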


In [None]:
# Clip the raster
# If needed, first reproject the region of interest to the same EPSG as the netCDF
bbox_gdf = bbox_gdf.to_crs("epsg:32619")
clipped = masked_variable.rio.clip(bbox_gdf.geometry.apply(mapping), drop=True)


hv.extension('bokeh', 'matplotlib')
plot2 = clipped.hvplot.image(y='y', x='x')
hv.output(plot2)
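
The EPSG code used for the reprojection depends on the UTM zone of the area. As a rough check, the zone can be derived from the longitude with the standard 6-degree rule. The helper below is a hypothetical sketch: it ignores the special zones around Norway and Svalbard, and the CRS metadata stored in the raster file remains the authoritative source.

```python
import math


def utm_epsg_from_lonlat(lon, lat):
    """Return the EPSG code of the WGS 84 / UTM zone containing a point."""
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    base = 32600 if lat >= 0 else 32700  # northern vs southern hemisphere
    return base + zone


# Center of the region of interest defined above: box(-67.28, 45.72, -66.22, 46.20)
roi_center_lon = (-67.28 + -66.22) / 2
roi_center_lat = (45.72 + 46.20) / 2
epsg = utm_epsg_from_lonlat(roi_center_lon, roi_center_lat)
print(epsg)
```

For this region of interest the rule gives UTM zone 19N, i.e. EPSG:32619, for the center as well as for both longitude edges of the box.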


In [None]:
# Delete 'grid_mapping' attribute
if 'grid_mapping' in clipped.attrs:
    del clipped.attrs['grid_mapping']

# Save clipped raster in a NetCDF file
clipped_path = './content/clip_data/clipped_raster.nc'
clipped.to_netcdf(clipped_path)

#### **6. Working with Hydrocron - Rivers**


Extract time series with Hydrocron for the reach or node of interest

In [None]:
import folium
import requests
from io import StringIO
import pandas as pd
import matplotlib.pyplot as plt

In [None]:

# You can choose a Node or Reach ID from the RiverSP product obtained in Section 1 or go here: https://www.swordexplorer.com/
# Caution: Node and Reach IDs are not the same depending on the downloaded version (C or D).
# Example for the Saint-Maurice River


#feature='Node'
#feature_id="71250300150401"
feature="Reach"
feature_id="72608300043"
start_time="2023-08-01T00:00:00Z"
end_time="2025-05-05T00:00:00Z"
collection_name = "SWOT_L2_HR_RiverSP_D"
#fields=reach_id,time_str,wse,width

parameters = (
    "https://soto.podaac.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?"
    + "feature=" + feature
    + "&feature_id=" + feature_id
    + "&start_time=" + start_time
    + "&end_time=" + end_time
    + "&collection_name=" + collection_name
    + "&output=geojson"
    + "&fields=reach_id,time_str,wse,width,cycle_id"
)

hydrocron_response = requests.get(
    parameters
).json()

hydrocron_response

# Extract the GeoJSON to display it on the map

geojson_data = hydrocron_response['results']['geojson']

geojson_data

# Set up the map using Folium (https://python-visualization.github.io/folium/latest/)
# The variable is named `m` so that it does not shadow Python's built-in `map`


m = folium.Map(zoom_start=13, tiles="cartodbpositron", width=700, height=700)


folium.GeoJson(geojson_data, name='SWOT River Reach').add_to(m)
folium.LayerControl().add_to(m)


m.fit_bounds(m.get_bounds(), padding=(5, 5))

m
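
The Hydrocron request URL in this cell is assembled by string concatenation. An alternative sketch, using only the standard library, builds the same query from a dictionary, which makes it harder to forget an `&` or a parameter name. The helper `build_hydrocron_url` is hypothetical; the endpoint, parameter names, and values are the ones used in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://soto.podaac.earthdatacloud.nasa.gov/hydrocron/v1/timeseries"


def build_hydrocron_url(feature, feature_id, start_time, end_time,
                        collection_name, output, fields):
    """Assemble a Hydrocron time series request URL from its parameters."""
    query = {
        "feature": feature,
        "feature_id": feature_id,
        "start_time": start_time,
        "end_time": end_time,
        "collection_name": collection_name,
        "output": output,
        "fields": ",".join(fields),
    }
    # safe=":," leaves the colons of the timestamps and the commas of the field list unescaped
    return BASE_URL + "?" + urlencode(query, safe=":,")


url = build_hydrocron_url(
    feature="Reach",
    feature_id="72608300043",
    start_time="2023-08-01T00:00:00Z",
    end_time="2025-05-05T00:00:00Z",
    collection_name="SWOT_L2_HR_RiverSP_D",
    output="geojson",
    fields=["reach_id", "time_str", "wse", "width", "cycle_id"],
)
print(url)
```

The resulting string is identical to the concatenated one and can be passed to `requests.get` in the same way.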


Display a water level time series for a node of interest

In [None]:
# Saint John River

feature='Node'
feature_id="72608300050213" # near Fredericton
start_time="2023-08-01T00:00:00Z"
end_time="2026-01-21T00:00:00Z"
collection_name = "SWOT_L2_HR_RiverSP_D"
#fields=reach_id,time_str,wse,width

#parameters = "https://soto.podaac.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?feature="+feature+"&feature_id="+ feature_id +"&start_time=2024-01-01T00:00:00Z&end_time=2024-06-14T00:00:00Z&output=csv&fields=reach_id,node_id,time_str,node_q,wse,width,cycle_id"

parameters = (
    "https://soto.podaac.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?"
    + "feature=" + feature
    + "&feature_id=" + feature_id
    + "&start_time=" + start_time
    + "&end_time=" + end_time
    + "&collection_name=" + collection_name
    + "&output=csv"
    + "&fields=reach_id,node_id,time_str,node_q,wse,width,cycle_id"
)

hydrocron_response = requests.get(
    parameters
).json()

hydrocron_response
csv_str = hydrocron_response['results']['csv']
df = pd.read_csv(StringIO(csv_str))

# Drop rows without data before building the quality mask, so that the mask and the data stay aligned
df = df[df['time_str'] != 'no_data']
df.time_str = pd.to_datetime(df.time_str, format='%Y-%m-%dT%H:%M:%SZ')
ind = df.node_q < 3  # keep good, suspect, and degraded observations

fig = plt.figure(figsize=(15,5))
plt.plot(df.time_str[ind], df.wse[ind], marker='o', linestyle='None')

plt.ylabel('Water surface elevation (m)')
plt.xlabel('SWOT observation date')
plt.title('Water Surface Elevation from Hydrocron for Node: ' + str(df.node_id.iloc[0]))

# Save image
plt.savefig('/content/WSE_Hydrocron_river.png')
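
To make the filtering steps of this cell explicit, the sketch below parses a small CSV string with the standard library, skips the `no_data` rows, drops observations flagged as bad, and converts the timestamps. The CSV content is synthetic and only imitates the requested fields; real Hydrocron responses may contain additional columns.

```python
import csv
from datetime import datetime
from io import StringIO

# Synthetic CSV imitating the requested fields; all values are made up
csv_str = (
    "reach_id,node_id,time_str,node_q,wse,width,cycle_id\n"
    "72608300053,72608300050213,2024-05-01T10:20:30Z,0,5.12,410.0,14\n"
    "72608300053,72608300050213,no_data,3,-999999999999.0,-999999999999.0,15\n"
    "72608300053,72608300050213,2024-06-12T03:04:05Z,3,7.80,395.0,16\n"
    "72608300053,72608300050213,2024-07-03T08:09:10Z,1,5.40,402.0,17\n"
)

rows = []
for record in csv.DictReader(StringIO(csv_str)):
    if record["time_str"] == "no_data":
        continue  # the pass did not observe the node
    if int(record["node_q"]) >= 3:
        continue  # observation flagged as bad
    when = datetime.strptime(record["time_str"], "%Y-%m-%dT%H:%M:%SZ")
    rows.append((when, float(record["wse"])))

for when, wse in rows:
    print(when.isoformat(), wse)
```

Two of the four synthetic rows remain: the `no_data` row and the row with `node_q` equal to 3 are discarded.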


#### Extract profile between two nodes

In [None]:
# Create a folder for the data (same path as the `path` variable used for the download)
os.makedirs('/content/profil_node', exist_ok=True)

Data search and download

In [None]:
# Function to search for and download the data
def download_data(pass_numbers, continent_code, path, temporal_range):
    links_list = []
    for pass_num in pass_numbers:
        river_results = earthaccess.search_data(
            short_name='SWOT_L2_HR_RIVERSP_D',
            temporal=temporal_range,
            granule_name=f"*Node*_{pass_num}_{continent_code}*"
        )
        links_list.extend([earthaccess.results.DataGranule.data_links(result, access='external')[0]
                           for result in river_results])

    earthaccess.download(links_list, path)
    return links_list

Extract ZIPs and load shapefiles

In [None]:
# Function to extract ZIPs
def extract_files(links_list, path):
    filenames = [link.split("/")[-1] for link in links_list]
    for filename in filenames:
        with zipfile.ZipFile(f"{path}/{filename}", 'r') as zip_ref:
            zip_ref.extractall(path)
    return filenames

# Function to load shapefiles
def load_shapefiles(filenames, path):
    filename_shps = [filename.replace('.zip', '.shp') for filename in filenames]
    return gpd.GeoDataFrame(pd.concat([gpd.read_file(f"{path}/{shp}") for shp in filename_shps], ignore_index=True))

Node filtering and distance calculation

In [None]:
# Function for node filtering
def filter_data_by_nodes(SWOT_HR_df, up_node, dn_node, date=None):
    filtered = SWOT_HR_df[
        (SWOT_HR_df['node_id'] >= dn_node) &
        (SWOT_HR_df['node_id'] < up_node) &
        (SWOT_HR_df['wse'] != -999999999999)
    ]
    if date:
        filtered = filtered[filtered['time_str'].str.contains(date)]
    return filtered.sort_values(['node_id'])

# Function for distance calculation

def calculate_distances(SWOT_HR_profil):
    """
    Calculates the cumulative distance along the profile from the 'p_length' column of SWOT_HR_profil.

    """
    delta = SWOT_HR_profil.p_length
    # Initialize a zero array to store cumulative distances
    dist_1=np.zeros((len(SWOT_HR_profil),1))

    # Cumulative distance calculation
    for n in range(len(SWOT_HR_profil)-1):
      dist_1[n+1]=dist_1[n]+delta.iloc[n]

    return dist_1
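
The cumulative distance computed by the loop above is a running sum of node lengths that starts at zero for the first node. The sketch below shows the same calculation on a plain list with `itertools.accumulate`; the helper name and the lengths are hypothetical, and unlike the function above it returns a flat list rather than a column array.

```python
from itertools import accumulate


def cumulative_distances(lengths):
    """Distance of each node from the first one: 0, l0, l0 + l1, ..."""
    lengths = list(lengths)
    if not lengths:
        return []
    # The last length is not needed: it would give the distance to a node past the end
    return [0.0] + list(accumulate(float(x) for x in lengths[:-1]))


node_lengths = [200.0, 210.0, 190.0, 205.0]  # synthetic p_length values (m)
dist = cumulative_distances(node_lengths)
print(dist)
```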

Profile plot

In [None]:
# Function to plot the profile
def plot_profile(profile, dist, label):
    fig, ax = plt.subplots(figsize=(15, 5))
    ax.plot(dist, profile['wse'], marker='o', linestyle='None', label=label)
    ax.set_xlabel('Cumulative distance (m)')
    ax.set_ylabel('Water surface elevation (wse)')
    ax.legend()
    # Save the image before calling plt.show(), otherwise the saved figure may be empty
    fig.savefig('/content/profil.png')
    plt.show()

Variables to modify as needed

In [None]:
# Directory path where the data will be saved and variables to identify
path = '/content/profil_node'
pass_number    = ["214"]  # pass number
continent_code = "NA"  # Continent code (NA for North America)
temporal_range = '2025-05-24 00:00:00', '2025-05-27 23:59:59'
dn_node = 72608300270021 # Downstream node ID ex: Nashwaak River
up_node = 72608300280071 # Upstream node ID ex : Nashwaak River

Complete execution

In [None]:
# Data download
links_list = download_data(pass_number, continent_code, path, temporal_range)
filenames = extract_files(links_list, path)

# Load shapefiles
SWOT_HR_df = load_shapefiles(filenames, path)
SWOT_HR_df['node_id'] = SWOT_HR_df['node_id'].astype(float)


# Filter the profile for May 26, 2025
profile_1 = filter_data_by_nodes(SWOT_HR_df, up_node, dn_node, date='2025-05-26')

# Calculate cumulative distances
dist_1 = calculate_distances(profile_1)

# Plot the profile
plot_profile(profile_1, dist_1, 'May 26, 2025')

#### **7. Working with Hydrocron - Lake**

NOTE: Due to the size of the original polygon (L2_HR_LakeSP), only the lake's central point is returned. This is intended to facilitate compliance with GeoJSON specifications. The positions of the central points should not be considered exact.


In [None]:
feature="PriorLake"
feature_id="7260055912" # Oromocto Lake
start_time="2025-05-06T00:00:00Z"
end_time="2025-10-12T00:00:00Z"
collection_name = "SWOT_L2_HR_LakeSP_D"  # specify the desired version

parameters = (
    "https://soto.podaac.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?"
    f"feature={feature}&feature_id={feature_id}"
    f"&start_time={start_time}&end_time={end_time}"
    f"&collection_name={collection_name}"
    "&output=geojson&fields=lake_id,time_str,wse,area_total"
)

hydrocron_response = requests.get(
    parameters
).json()

hydrocron_response

# Extract GeoJSON for map display

geojson_data = hydrocron_response['results']['geojson']

geojson_data

valid_features = [
    feature for feature in geojson_data['features']
    if float(feature['properties']['wse']) > 0 and  # Select valid WSE values
       -90 <= feature['geometry']['coordinates'][1] <= 90 and  # Select valid latitude values
       -180 <= feature['geometry']['coordinates'][0] <= 180  # Select valid longitude values
]

# Create a geojson with valid features
filtered_geojson = {
    'type': 'FeatureCollection',
    'features': valid_features
}


filtered_geojson


# Set up the map using Folium (https://python-visualization.github.io/folium/latest/)
# The variable is named `m` so that it does not shadow Python's built-in `map`

m = folium.Map(tiles="cartodbpositron", width=700, height=700)

# Add the GeoJSON from Hydrocron to the map
folium.GeoJson(filtered_geojson, name='SWOT Prior Lake').add_to(m)
folium.LayerControl().add_to(m)


# Center on the lake
m.fit_bounds(m.get_bounds(), padding=(5, 5))

m




Show the water level time series for a lake of interest

In [None]:
import requests
import pandas as pd
import matplotlib.pyplot as plt
from io import StringIO

# Parameters definition
feature = "PriorLake"
feature_id = "7260055912"
start_time = "2023-08-31T00:00:00Z"
end_time = "2026-01-21T00:00:00Z"
collection_name = "SWOT_L2_HR_LakeSP_D"
parameters = (
    "https://soto.podaac.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?"
    + "feature=" + feature
    + "&feature_id=" + feature_id
    + "&start_time=" + start_time
    + "&end_time=" + end_time
    + "&collection_name=" + collection_name
    + "&output=csv"
    + "&fields=lake_id,time_str,wse,area_total,quality_f,dark_frac"
)


hydrocron_response = requests.get(parameters).json()

# Extract CSV and create DataFrame
csv_str = hydrocron_response['results']['csv']
df = pd.read_csv(StringIO(csv_str))

# Select data where 'time_str' is not equal to 'no_data'
df = df[df['time_str'] != 'no_data']
# Select data where 'quality_f' is equal to 0 (good)
df = df[df['quality_f'] == 0]
# Select data where 'dark_frac' is less than 50%
df = df[df['dark_frac'] < 0.5]
# Convert 'time_str' to datetime format
df['time_str'] = pd.to_datetime(df['time_str'], format='%Y-%m-%dT%H:%M:%SZ')

# Create figure
fig = plt.figure(figsize=(15, 5))
plt.plot(df['time_str'], df['wse'], marker='o', linestyle='None')


plt.ylabel('Water surface elevation (m)')
plt.xlabel('SWOT observation date')
plt.title('Water Surface Elevation from Hydrocron for Lake: ' + str(df['lake_id'].iloc[0]))

# Save image
plt.savefig('/content/WSE_Hydrocron_Lake.png')
plt.show()

#### **8. Transform reference systems with the modified csrspy library**

The SWOT reference system is not the same as Canada's. It is therefore necessary to convert the data.
**CAUTION: Each Canadian provincial geodetic agency may use a different vertical reference system and epoch!**
In this section, the transformation is performed to the vertical reference system of New Brunswick.

Clip the data using your area of interest (necessary because each province may adopt a different epoch and vertical reference system).

**Change the RiverSP file name to match the one you want and the data acquisition date.**


In [None]:
# Load the shapefile to clip (geopandas can read it directly from the .zip archive)

shapefile_path = '/content/data_downloads/SWOT_L2_HR_RiverSP_Node_044_214_NA_20260111T003728_20260111T004317_PID0_01.zip'

points_gdf = gpd.read_file(shapefile_path)

# Define bbox
bounding_box = box(-67.28,45.72,-66.22,46.20)

# Keep the points that fall inside the bounding box
gdf_subset = points_gdf[points_gdf.geometry.within(bounding_box)].copy()


In [None]:

# Parameter to change

acquisition_date = "2026-01-11"  # Enter the date of acquisition of the data to be converted here to ensure the correct epoch.

# Convert the date to a decimal year
def decimal_year(dt):
    year_start = datetime(dt.year, 1, 1)
    year_end = datetime(dt.year + 1, 1, 1)
    year_length = (year_end - year_start).total_seconds()
    seconds_passed = (dt - year_start).total_seconds()
    return dt.year + seconds_passed / year_length

s_epoch = decimal_year(datetime.strptime(acquisition_date, "%Y-%m-%d"))
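
As a worked check of the conversion above: January 11 falls 10 days into a 365-day year, so the epoch should be 2026 + 10/365, about 2026.0274. The sketch below repeats the same function in compact form and evaluates it for that date and for a mid-year date in a leap year (chosen for illustration).

```python
from datetime import datetime


def decimal_year(dt):
    """Convert a datetime to a decimal year, accounting for leap years."""
    year_start = datetime(dt.year, 1, 1)
    year_end = datetime(dt.year + 1, 1, 1)
    elapsed = (dt - year_start).total_seconds()
    return dt.year + elapsed / (year_end - year_start).total_seconds()


epoch_jan = decimal_year(datetime(2026, 1, 11))  # acquisition date used above
epoch_mid = decimal_year(datetime(2024, 7, 2))   # 183 days into a 366-day year
print(round(epoch_jan, 4), epoch_mid)
```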


In [None]:
# Use the filtered GeoDataFrame
gdf_nodes = gdf_subset.copy()

# Download or synchronize missing grids
sync_missing_grid_files()

# Create column 'h' = WSE_SWOT + geoid_hght
gdf_nodes['h'] = gdf_nodes['wse'] + gdf_nodes['geoid_hght']

# Prepare the list of results
transformed_results = [None] * len(gdf_nodes)

# The transformer parameters do not depend on the node, so it is created once, outside the loop
transformer = CSRSTransformer(
    t_ref_frame=Reference.NAD83CSRS,
    s_ref_frame=Reference.ITRF14,
    s_coords=CoordType.GEOG,
    t_coords=CoordType.GEOG,
    s_epoch=s_epoch,
    t_epoch=2010,
    epoch_shift_grid='ca_nrc_NAD83v70VG.tif',
    s_vd=VerticalDatum.WGS84,
    t_vd=VerticalDatum.CGG2013
)

# Transformation
for idx, row in enumerate(tqdm(gdf_nodes.itertuples(), total=len(gdf_nodes), desc="Transformation")):
    # Source coordinates
    coords = [(row.lon, row.lat, row.h)]

    # Transformation (convert map to list)
    out_coords = list(transformer(coords))

    transformed_results[idx] = out_coords[0][2]

# Add the transformed column to the GeoDataFrame
gdf_nodes['wse_transformed'] = transformed_results

# Create the folder
output_dir = '/content/ref_change'
os.makedirs(output_dir, exist_ok=True)

# Save shapefile
gdf_nodes.to_file(os.path.join(output_dir, 'nodes_transformed.shp'))

#### **9. Download all your data to your local computer at once**





You can download files one by one by clicking on the three small dots to the right of the file name and then on “Download”. To download all the data at once, run the following cells. Execution and download times may be long.

In [None]:
# Create a zip file with all the data; change the folder path to choose what to compress
!zip -r /content/data.zip /content/data_downloads
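
The `!zip` shell command works in Colab, but it may not be available on every system. A portable alternative, sketched below with the standard library, uses `shutil.make_archive`; the helper `zip_folder` and the demo folder are hypothetical, and in the notebook the folder would be `/content/data_downloads`.

```python
import os
import shutil
import tempfile
import zipfile


def zip_folder(folder, archive_base):
    """Zip `folder` into `archive_base`.zip and return the archive path."""
    folder = os.path.abspath(folder)
    return shutil.make_archive(
        archive_base, "zip",
        root_dir=os.path.dirname(folder),   # paths in the archive are relative to this
        base_dir=os.path.basename(folder),  # only this folder is included
    )


# Demonstration on a temporary folder with a placeholder file
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = os.path.join(tmp, "data_downloads")
    os.makedirs(demo_dir)
    with open(os.path.join(demo_dir, "example.txt"), "w") as f:
        f.write("placeholder")
    archive_path = zip_folder(demo_dir, os.path.join(tmp, "data"))
    with zipfile.ZipFile(archive_path) as zf:
        archived = sorted(zf.namelist())

print(os.path.basename(archive_path), archived)
```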

In [None]:
# Download data zip file
from google.colab import files
files.download("/content/data.zip")
