<a href="https://colab.research.google.com/github/SteveCoss/SWOTdawgDISTRO/blob/main/ExploreDatasets.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# How to download
The notebooks and the datasets, including an up-to-date version of swotdawgviz, can be downloaded using this link: https://filesender.renater.fr/?s=download&token=2dee442e-4b37-41be-83d6-ae22aa8378a2

# Notebook to explore SWOT algorithm input datasets

The input datasets are: 1) the SWORD dataset (the prior database); 2) the SWOT data; and 3) the so-called SWORD of Science (SoS).
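Before running the cells, it can help to confirm that the three datasets are where the notebook expects them. The sketch below assumes the folder layout implied by the paths used later in this notebook; `EXPECTED_SUBDIRS` and `missing_inputs` are hypothetical names introduced here for illustration.

```python
from pathlib import Path

# Sub-folders read by the cells in this notebook, relative to the distribution root
# (assumed from the paths used further down).
EXPECTED_SUBDIRS = [
    "swot/timeseries",
    "swot/orbit",
    "swot/RiverSP",
    "sword/shp/NA",
    "sword/netcdf",
    "sos/constrained",
    "sos/unconstrained",
    "sos/gages",
]


def missing_inputs(root="."):
    """Return the expected sub-folders that are absent under `root`."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not root.joinpath(sub).is_dir()]


missing = missing_inputs(".")
if missing:
    print("Missing input folders:", ", ".join(missing))
else:
    print("All expected input folders found.")
```

Run it after the working directory has been set to the distribution folder; an empty result suggests the download unpacked as expected.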

In [None]:
import os, sys
from google.colab import drive
drive.mount('/content/drive')
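nb_path = '/content/notebooks'
# create the symlink only once so the cell can be re-run without a FileExistsError
if not os.path.islink(nb_path):
    os.symlink('/content/drive/My Drive/DAWGnotebooks/Path_files', nb_path)
sys.path.insert(0, nb_path)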

In [None]:
#reset working directory to distro folder
!pwd
import os
os.chdir("/content/drive/My Drive/DAWGnotebooks/dist_4.1")
!pwd

In [None]:
import os,sys

import json
from pathlib import Path
import geopandas as gpd
from netCDF4 import Dataset
import numpy as np
import folium

# Register pandas converters for matplotlib
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

import matplotlib.pyplot as plt

This notebook relies on the swotdawgviz library. Use either the version included in this package or an up-to-date version from https://github.com/klarnier/swotdawgviz.
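If you switch between the two versions often, one option is to try the embedded module path first and fall back to the installed one. This is only a sketch: `import_first_available` is a hypothetical helper, and the two module paths are taken from the import cell below.

```python
import importlib


def import_first_available(candidates):
    """Import and return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"None of these modules could be imported: {candidates}")


# Assumed usage, mirroring the two import styles in the cell below:
# sdvio = import_first_available(["swotdawgviz.swotdawgviz.io", "swotdawgviz.io"])
# sdvm = import_first_available(["swotdawgviz.swotdawgviz.maps", "swotdawgviz.maps"])
```

The explicit imports in the next cell remain the simplest choice when you know which version you have.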

In [None]:
# Using embedded version of swotdawgviz
from swotdawgviz.swotdawgviz import io as sdvio
from swotdawgviz.swotdawgviz import maps as sdvm

# # Using installed version of swotdawgviz
# from swotdawgviz import io as sdvio
# from swotdawgviz import maps as sdvm

## Explore SWORD

Here we'll check out the SWORD data. We'll create a quick map of the domain showing the reach centerlines, then create a map with reaches colored by width. 
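Coloring reaches by an attribute comes down to mapping each value onto a color scale between two limits. The sketch below shows one simple way to do that with linear interpolation and clipping. It is not how swotdawgviz does it internally; `value_to_hex`, the two end colors, and the sample widths are all assumptions for illustration.

```python
def value_to_hex(value, vmin, vmax, low=(255, 255, 178), high=(189, 0, 38)):
    """Linearly interpolate between two RGB colors, clipping `value` to [vmin, vmax]."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # position of the (clipped) value between the limits, from 0 to 1
    t = (min(max(value, vmin), vmax) - vmin) / (vmax - vmin)
    rgb = [round(lo + t * (hi - lo)) for lo, hi in zip(low, high)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Made-up reach widths in meters, for illustration only.
for width in (50, 300, 2000):
    print(width, value_to_hex(width, vmin=100, vmax=1000))
```

Values outside the limits take the end colors, which is why choosing sensible limits matters when a few very large reaches would otherwise dominate the scale.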

In [None]:
InputDir=Path('.')
swotdir=InputDir.joinpath('swot')
swot_nc_dir=swotdir.joinpath('timeseries')

sword_dir=InputDir.joinpath('sword')
sword_shp_dir=sword_dir.joinpath('shp').joinpath('NA')

collection = sdvio.SwotObservationsCollection(swot_nc_dir)

sword_hb74_reaches = sdvio.SwordShapefile(sword_shp_dir.joinpath("na_sword_reaches_hb74_v11.shp"),
                                          reaches_list=collection.reaches_list)


In [None]:
rmap = sdvm.ReachesMap(sword_hb74_reaches.dataset)
ridmap = rmap.get_centerlines_map()
ridmap

In [None]:
# the swotdawgviz library is also set up to create maps with reaches (or nodes) colored by attributes
widthmap = rmap.get_centerlines_map(varname="width")
widthmap

In [None]:
# in fact,  swotdawgviz library can even create polygons for the reaches, with width shown by polygon size
# zoom in to check out what's going on!
widthpolymap = rmap.get_polygons_map(varname="width", width_attribute="width")
widthpolymap

In [None]:
# you can visualize pretty much any SWORD characteristic this way
faccmap = rmap.get_centerlines_map(varname="facc",varlimits=[11000,1e6])
faccmap

## Explore SWOT Data

First, show an example swath in the domain, and then look at some of the data products from that swath.
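The RiverSP file name used later in this section appears to encode the feature type, cycle, pass, and continent. The sketch below splits the name on underscores to pull those out. The field order is an assumption inferred from that one file name, and `parse_riversp_name` is a hypothetical helper.

```python
from pathlib import Path


def parse_riversp_name(filename):
    """Split a RiverSP file name into its underscore-separated fields."""
    parts = Path(filename).stem.split("_")
    # Assumed layout: SWOT_L2_HR_RiverSP_<feature>_<cycle>_<pass>_<continent>_
    #                 <start>_<end>_<crid>_<counter>
    if len(parts) != 12 or parts[:4] != ["SWOT", "L2", "HR", "RiverSP"]:
        raise ValueError(f"Unexpected RiverSP file name: {filename}")
    return {
        "feature": parts[4],
        "cycle": int(parts[5]),
        "pass": int(parts[6]),
        "continent": parts[7],
        "start": parts[8],
        "end": parts[9],
    }


info = parse_riversp_name(
    "SWOT_L2_HR_RiverSP_node_1_428_NA_20100830T061857_20100830T061920_PGA2_03.shp"
)
print(info["cycle"], info["pass"])
```

Under that assumption the file read below is cycle 1, pass 428, which matches the guess in the cell comment.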

In [None]:
# show an example swath in the domain

#read orbit data for one pass, 175, and show coverage
orbitdir=swotdir.joinpath('orbit')
orbitfile=orbitdir.joinpath('swot_science_orbit_sept2015-v2_10s_swath.shp')
df = gpd.read_file(orbitfile)

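swathmap = folium.Map(
    location=[38.5, -85],
    # 'Stamen Toner' is no longer bundled with recent folium releases;
    # swap it back in if your folium version still supports it
    tiles='CartoDB positron',
    zoom_start=6.5)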

# keep only the swath polygons for pass 175
for _, r in df[df['ID_PASS'] == 175].iterrows():
    sim_geo = gpd.GeoSeries(r['geometry']).simplify(tolerance=0.001)
    geo_j = folium.GeoJson(data=sim_geo.to_json(),
                           style_function=lambda x: {'fillColor': 'orange'})
    # folium.Popup expects a string, so convert the pass number
    folium.Popup(str(r['ID_PASS'])).add_to(geo_j)
    geo_j.add_to(swathmap)

swathmap

In [None]:
# show the node WSE measurements for first cycle of this pass which I _think_ is 428 in the new convention 

obsfile=swotdir.joinpath('RiverSP').joinpath('SWOT_L2_HR_RiverSP_node_1_428_NA_20100830T061857_20100830T061920_PGA2_03.shp')
dfobs = gpd.read_file(obsfile)

nmap = sdvm.NodesMap(dfobs)
obsnodes = nmap.get_map(varname="wse", add_to_map=swathmap)

# folium.GeoJson(data=dfobs['geometry'],
#                marker=folium.CircleMarker(location=None, radius = 3, # Radius in pixels
#                                            weight = 0, #outline weight
#                                            fill_color = '#0000FF', 
#                                            fill_opacity = 1),).add_to(swathmap)

swathmap

In [None]:
# plot up all the node WSE data for the Ohio River, from this pass
# TODO change axis labels to make pretty etc. make pdistout human readable

# read in file with all reaches in the domain
reach_json=InputDir.joinpath('reaches.json')
with open(reach_json) as json_file:
    reaches = json.load(json_file)

#extract reach ids
domain_reachids=list()
for reach in reaches:
    domain_reachids.append(reach['reach_id'])

# read data from sword file
swordfile=sword_dir.joinpath('netcdf').joinpath('na_sword_v11.nc')
sword_dataset=Dataset(swordfile)

sword_point_reachids=sword_dataset['centerlines/reach_id'][0,:][:]
swordx=sword_dataset['centerlines/x'][:]
swordy=sword_dataset['centerlines/y'][:]

swordreachids=sword_dataset["reaches/reach_id"][:].tolist()
sword_names=sword_dataset['reaches/river_name'][:]
sword_drainage_area=sword_dataset['reaches/facc'][:]
sword_swot_orbits=sword_dataset['reaches/swot_orbits'][:]

# create sword data dictionary for domain
domain_reach_data={}

for reach in domain_reachids:
    
    # deal with points
    indxs=np.argwhere(sword_point_reachids.data==reach)   
    indxs=indxs[:,0]
    points=[]
    for indx in indxs:           
        points.append(tuple([swordy[indx],swordx[indx]]))    
        
    # deal with reaches
    indx = swordreachids.index(reach)
    
    domain_reach_data[reach]={}
    domain_reach_data[reach]['clpoints']=points
    domain_reach_data[reach]['river_name']=sword_names[indx]
    domain_reach_data[reach]['drainage_area_km2']=sword_drainage_area[indx]
    domain_reach_data[reach]['swot_orbits']=sword_swot_orbits[:,indx]

# grab river name for each node
node_river_name=list()
for _, r in dfobs.iterrows():
    if int(r['reach_id']) in domain_reach_data:
        node_river_name.append(domain_reach_data[int(r['reach_id'])]['river_name'])
    else:
        node_river_name.append('N/A')
        
dfobs['river_name']=node_river_name

# keep Ohio River nodes, dropping those where wse is -9999 (treated here as missing)
dfobs_ohio = dfobs.loc[(dfobs['river_name'] == 'Ohio River') & (dfobs['wse'] != -9999)]

dfobs_ohio

In [None]:
rids=dfobs_ohio.reach_id.unique()

import random

fig,ax=plt.subplots()

for rid in rids:
    hexcolor = ["#"+''.join([random.choice('ABCDEF0123456789') for i in range(6)])]
    
    reach_data = dfobs_ohio.loc[dfobs_ohio.reach_id==rid]
    ax.scatter(reach_data["p_dist_out"] * 0.001, reach_data["wse"], c=hexcolor, label=rid)
    
plt.legend(loc='center left', bbox_to_anchor=(1.0, .5))
ax.set_xlabel("outlet distance (km)")
ax.set_ylabel("WSE (m)")
plt.tight_layout()
plt.show()


## Explore SoS
First, take a look at the flow duration curves and compare the WBM and GRADES priors. There is a lot of information in the SoS, so also take a look at the fluvial geomorphology quantities. Then take a look at the gage locations.
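A flow duration curve ranks discharges by how often they are exceeded. The sketch below builds one from a plain list of values. It uses made-up discharges rather than SoS data, and the Weibull plotting position (rank divided by n + 1) is one common choice assumed here; `flow_duration_curve` is a hypothetical helper.

```python
def flow_duration_curve(discharges):
    """Return (exceedance_probability, discharge) pairs, largest flow first."""
    flows = sorted(discharges, reverse=True)
    n = len(flows)
    # Weibull plotting position: rank / (n + 1), with rank 1 for the largest flow
    return [(rank / (n + 1), q) for rank, q in enumerate(flows, start=1)]


# Made-up discharge values in m^3/s, for illustration only.
sample_q = [120.0, 340.0, 95.0, 510.0, 230.0, 180.0, 75.0, 410.0, 150.0]
for p, q in flow_duration_curve(sample_q):
    print(f"{100 * p:5.1f}% of the time Q >= {q:.0f} m^3/s")
```

Plotting discharge against exceedance probability, usually with a log-scaled discharge axis, gives the familiar curve and makes it easy to overlay two priors for the same reach.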

In [None]:
sosdir=InputDir.joinpath('sos')
sosfile_con=sosdir.joinpath('constrained').joinpath('na_sword_v11_SOS_priors.nc')
sosfile_uncon=sosdir.joinpath('unconstrained').joinpath('na_sword_v11_SOS_priors.nc')
sos_con = sdvio.SosNetCDF(sosfile_con)
sos_uncon = sdvio.SosNetCDF(sosfile_uncon)

In [None]:
sos_con.dataset

In [None]:
list(sos_con.dataset.columns)

In [None]:
# create a map of GRADES two-year return Q

# add GRADES two-year discharge to the rmap object
rmap._dataset['GRADES_2yr']=-1.

for reachid in domain_reachids:
    
    reach_data = sos_con.dataset.loc[sos_con.dataset['reach_id']==reachid]
    
    if "grades_two_year_return_q" in reach_data.columns:
        #--------------------------------
        # EMBEDDED VERSION OF swotdawgviz
        #--------------------------------
        grades_two_year_return_q = reach_data['grades_two_year_return_q'].values[0]

    else:
        #--------------------------------
        # UP-TO-DATE VERSION OF swotdawgviz
        #--------------------------------
        grades_two_year_return_q = reach_data['model_two_year_return_q'].values[0]
        
    
    rmap._dataset.loc[rmap._dataset['reach_id'].astype(str)==str(reachid),['GRADES_2yr']]=grades_two_year_return_q
    

rmap._json_dataset = rmap._dataset.to_json()    



In [None]:
Grades2yr_map = rmap.get_centerlines_map(varname="GRADES_2yr",varlimits=[1000,15000])
Grades2yr_map

In [None]:
# create a map of A0 prior
rmap._dataset['A0prior']=-1.

for reachid in domain_reachids:
    
    reach_data = sos_con.dataset.loc[sos_con.dataset['reach_id']==reachid]
    gbpriors_logA0_hat = reach_data['gbpriors_logA0_hat'].values[0]
    
    rmap._dataset.loc[rmap._dataset['reach_id'].astype(str)==str(reachid),['A0prior']]=np.exp(gbpriors_logA0_hat)
    

rmap._json_dataset = rmap._dataset.to_json()    

A0hat_map = rmap.get_centerlines_map(varname="A0prior",varlimits=[10,1000])
A0hat_map

In [None]:
rmap._dataset

In [None]:
# create a map of GRADES meanQ

# add GRADES mean discharge to the rmap object
rmap._dataset['grades_mean_q']=-1.

for reachid in domain_reachids:
    rmap._dataset.loc[rmap._dataset['reach_id'].astype(str)==str(reachid),['grades_mean_q']]=sos_con.dataset.loc[sos_con.dataset['reach_id']==reachid]['model_mean_q'].values[0]
    

rmap._json_dataset = rmap._dataset.to_json()    



In [None]:
GradesmeanQ_map = rmap.get_centerlines_map(varname="grades_mean_q",varlimits=[500,15000])
GradesmeanQ_map

In [None]:
# create a map of WBM meanQ

# add WBM mean discharge to the rmap object
rmap._dataset['WBM_mean_q']=-1.

for reachid in domain_reachids:
    rmap._dataset.loc[rmap._dataset['reach_id'].astype(str)==str(reachid),['WBM_mean_q']]=sos_uncon.dataset.loc[sos_uncon.dataset['reach_id']==reachid]['model_mean_q'].values[0]
    

rmap._json_dataset = rmap._dataset.to_json()    
WBMmeanQ_map = rmap.get_centerlines_map(varname="WBM_mean_q",varlimits=[500,15000])
WBMmeanQ_map

In [None]:
# plot up gage locations
nrtfile=InputDir.joinpath('sos/gages/NRT_V3.csv')

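nrtdf = gpd.read_file(nrtfile)
# build point geometries from the X/Y columns and declare them as WGS84 lon/lat
nrtdf = gpd.GeoDataFrame(nrtdf, geometry=gpd.points_from_xy(nrtdf.X, nrtdf.Y), crs='EPSG:4326')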
nrtdfCAL=nrtdf[nrtdf.CAL.astype('int32')==1]
nrtdfVAL=nrtdf[nrtdf.CAL.astype('int32')==0]


#--------------------------------
# EMBEDDED VERSION OF swotdawgviz
#--------------------------------
# folium.GeoJson(data=nrtdf['geometry'],
#                marker=folium.CircleMarker(radius = 3, # Radius in pixels
#                                            weight = 0, #outline weight
#                                            fill_color = '#0000FF', 
#                                            fill_opacity = 1),).add_to(ridmap)

#----------------------------------
# UP-TO-DATE VERSION of swotdawgviz
#----------------------------------
# Use circle shape for performance !
gmapC = sdvm.GagesMap(nrtdfCAL)
gmapV = sdvm.GagesMap(nrtdfVAL)
gagesmapC = gmapC.get_map(varname_id=None, shape="circle", add_to_map=ridmap)
gagesmapV = gmapV.get_map(varname_id=None, shape="marker", add_to_map=ridmap)
ridmap