<a href="https://colab.research.google.com/github/rivuletsteph/TIAER/blob/main/streamflow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Streamflow from the National Water Model CONUS Retrospective Dataset channel output files in Zarr format
---
By Stephanie Brady and Emad Ahmed  
TIAER @ Tarleton State University    
March 13, 2023

Credits: 

[Rich Signell](https://github.com/rsignell-usgs)  
[Explore the National Water Model Reanalysis](https://nbviewer.org/gist/rsignell-usgs/78a4ce00360c65bc99764aa3e88a2493)

[James McCreight](https://github.com/jmccreight)  
[NWM v2.1 Retrospective Zarr Usage Example](https://github.com/NCAR/rechunk_retro_nwm_v21/blob/main/notebooks/usage_example_streamflow_timeseries.ipynb)

The purpose of this notebook is to acquire National Water Model channel outputs ([NWM model output data version 2.1 in Zarr format](https://registry.opendata.aws/nwm-archive/)) for one reach, including:
* streamflow = River Flow (m³ s⁻¹)

## To Begin
Visit the [OWP National Water Model Interactive Map](https://water.noaa.gov/map) to find the Reach ID of the stream segment of interest.  
This example uses Reach ID `5512664`, which is the Brazos River near Glen Rose, Texas.  
Coordinates: 32.2404, -97.7120 

Install and import packages

In [None]:
!pip install s3fs
!pip install zarr


import os
import pandas as pd
import numpy as np
import s3fs
import zarr
import fsspec
import xarray as xr
import plotly.graph_objects as go
from plotly.subplots import make_subplots


Create notebook output folder

In [None]:
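# Build the output path first, then create the folder (no error if it already exists)
folder = os.path.join(os.getcwd(), 'output')
os.makedirs(folder, exist_ok=True)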

Identify the Reach ID (Refer to the section "To Begin").  
The Reach ID can be changed here.


In [None]:
reach_id=5512664

Set up Dask.distributed the [Easy Way](https://distributed.dask.org/en/stable/quickstart.html#setup-dask-distributed-the-easy-way)

In [None]:
from dask.distributed import Client, progress
client = Client()
client

Point to the Amazon Web Services (AWS) S3 bucket where the data is stored in Zarr format.

In [None]:
url = 's3://noaa-nwm-retrospective-2-1-zarr-pds/chrtout.zarr'

Print the [CPU Time and Wall Time](https://ipython.readthedocs.io/en/stable/interactive/magics.html?highlight=%25time#magic-time)  
Load and decode a dataset from the Zarr [store](https://docs.xarray.dev/en/stable/generated/xarray.open_zarr.html)  
Usually takes about 5 seconds

In [None]:
%%time
nwm_ds_hourly = xr.open_zarr(fsspec.get_mapper(url, anon=True), consolidated=True)

Print the dataset description

In [None]:
nwm_ds_hourly

Identify the reach or reaches for which NWM streamflow will be downloaded.

For multiple reaches, use the syntax `reach = np.sort(np.array([x1, x2, ...]))`.

In [None]:
reach = np.sort(np.array([reach_id]))

Print the wall time.  
Extract the streamflow for the specified reach over the full period of record.  
Typically takes about 3 minutes.

In [None]:
%%time
nwm_ds_hourly_subset = nwm_ds_hourly.streamflow.sel(feature_id=reach).compute()
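The selection above pulls every time step for the reach. If only part of the record is needed, a time slice can be added to the same `.sel()` call before `.compute()`, which should reduce the amount of data read. The sketch below illustrates the pattern on a small synthetic array rather than the real store; the feature IDs, dates, and values are made up for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the NWM streamflow variable: 48 hourly steps x 3 reaches.
# The feature IDs (101, 102, 103) and the flow values are hypothetical.
times = pd.to_datetime("2020-01-01") + pd.to_timedelta(np.arange(48), unit="h")
feature_ids = np.array([101, 102, 103])
demo = xr.DataArray(
    np.random.default_rng(0).random((48, 3)),
    coords={"time": times, "feature_id": feature_ids},
    dims=("time", "feature_id"),
    name="streamflow",
)

# Select one reach and one day in a single call; both slice endpoints are inclusive.
one_day = demo.sel(
    feature_id=[102],
    time=slice("2020-01-01 00:00", "2020-01-01 23:00"),
)
print(one_day.shape)
```

On the real dataset the same idea would presumably read `nwm_ds_hourly.streamflow.sel(feature_id=reach, time=slice(start, end)).compute()`, where `start` and `end` are placeholder date strings chosen by the user.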

In [None]:
nwm_ds_hourly_subset

Convert the DataArray into a pandas DataFrame.

In [None]:
raw_nwm_df= nwm_ds_hourly_subset.to_pandas()

In [None]:
raw_nwm_df
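For a two-dimensional DataArray, `to_pandas()` returns a DataFrame whose index is the first dimension and whose columns are the second. The small synthetic example below (hypothetical feature IDs and values) shows the resulting layout, which is why the streamflow column ends up named after the integer reach ID.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical 3-hour record for three made-up reaches.
times = pd.to_datetime(["2020-01-01 00:00", "2020-01-01 01:00", "2020-01-01 02:00"])
feature_ids = np.array([101, 102, 103])
flows = np.arange(9, dtype=float).reshape(3, 3)

streamflow = xr.DataArray(
    flows,
    coords={"time": times, "feature_id": feature_ids},
    dims=("time", "feature_id"),
    name="streamflow",
)

# Selecting with a one-element array keeps the feature_id dimension,
# so the result is still 2-D and converts to a DataFrame.
demo_df = streamflow.sel(feature_id=np.array([102])).to_pandas()

print(demo_df.index.name)     # name of the row index
print(list(demo_df.columns))  # column labels are the integer feature IDs
```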

Create Hydrograph for the Raw NWM data (hourly)

In [None]:
raw_nwm_df.plot(figsize=(14,5), title= "NWM Reach "+ str(reach_id), xlabel="Time", ylabel="Discharge (cms)");

Reset the index so that the time index becomes a regular column

In [None]:
nwm_df=raw_nwm_df.reset_index()

Update the column names

In [None]:
nwm_df.rename({'time': 'Date_Time', reach_id: 'NWM_cms'}, axis=1, inplace=True)

Calculate discharge in cubic feet per second (cfs)

In [None]:
nwm_df["NWM_cfs"] = nwm_df["NWM_cms"]*35.314666212661
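The conversion factor follows from the definition of the international foot (1 ft = 0.3048 m): one cubic meter is 1 / 0.3048³ cubic feet. The sketch below derives the factor that way; the names `FOOT_IN_METERS`, `CMS_TO_CFS`, and `cms_to_cfs` are illustrative and not part of the notebook. The derived value agrees with the hard-coded constant used above to within about 1e-6, a difference that is negligible for streamflow.

```python
# Derive the m3/s -> ft3/s factor from the international foot definition.
FOOT_IN_METERS = 0.3048
CMS_TO_CFS = 1.0 / FOOT_IN_METERS ** 3


def cms_to_cfs(flow_cms):
    """Convert a discharge from cubic meters per second to cubic feet per second."""
    return flow_cms * CMS_TO_CFS


print(round(CMS_TO_CFS, 6))        # approximately 35.314667
print(round(cms_to_cfs(10.0), 3))  # approximately 353.147
```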

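Convert the timestamps from UTC to US Central time (`America/Chicago`) and drop the time zone information

In [None]: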
nwm_df['Date_Time'] = nwm_df['Date_Time'].dt.tz_localize('UTC').dt.tz_convert('America/Chicago').dt.tz_localize(None)
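One side effect of this conversion is worth knowing about. Because the final `tz_localize(None)` drops the UTC offset, the hour that repeats when daylight saving time ends appears twice in the local series. The sketch below reproduces this with four made-up hourly timestamps around the 1 November 2020 transition; it is an illustration of the behavior, not a recommendation to change the notebook.

```python
import pandas as pd

# Four consecutive hourly UTC timestamps spanning the end of daylight saving
# time in US Central time (1 November 2020, 07:00 UTC).
utc_times = pd.Series(pd.to_datetime([
    "2020-11-01 05:00",
    "2020-11-01 06:00",
    "2020-11-01 07:00",
    "2020-11-01 08:00",
]))

# Same chain of calls as in the notebook cell.
local_times = (
    utc_times.dt.tz_localize("UTC")
    .dt.tz_convert("America/Chicago")
    .dt.tz_localize(None)
)

print(local_times.tolist())
print("duplicated local timestamps:", int(local_times.duplicated().sum()))
```

If unique timestamps matter downstream (for example when merging with another series), keeping the time-zone-aware column, or the original UTC column, alongside the local one is one way to avoid the ambiguity.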

Display the wrangled NWM dataframe

In [None]:
nwm_df

Export the wrangled NWM dataframe to a CSV file

In [None]:
nwm_df.to_csv(folder +'/NWM_Data_for_' + str(reach_id)  + '.csv')

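When the exported file is read back later, the integer index written by `to_csv` comes back as an unnamed first column and the timestamps come back as plain strings unless they are parsed. The sketch below shows one way to restore the original layout, using an in-memory buffer and a two-row made-up table in place of the real file.

```python
import io

import pandas as pd

# Hypothetical two-row table with the same column names as the notebook output.
demo_df = pd.DataFrame({
    "Date_Time": pd.to_datetime(["2020-01-01 00:00", "2020-01-01 01:00"]),
    "NWM_cms": [1.5, 2.0],
})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
demo_df.to_csv(buffer)
buffer.seek(0)

# index_col=0 restores the saved index; parse_dates restores the datetime dtype.
restored = pd.read_csv(buffer, index_col=0, parse_dates=["Date_Time"])
print(restored.dtypes)
```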

Create the Plotly graph  
Typically takes about 10 seconds

In [None]:
flow_fig = go.Figure()

#flow_fig.add_trace(go.Scatter(x=nwm_df['Date_Time'], y=nwm_usgs_df['USGS_cfs'], name="USGS Station "+ str(gage_id),
            #             line = dict(color='blue', width=1.5 )))
flow_fig.add_trace(go.Scatter(x=nwm_df['Date_Time'], y=nwm_df['NWM_cfs'], name="NWM Reach "+str(reach_id),
                         line = dict(color='blue', width=1.5)))

# The series is hourly, so label it as such
flow_fig.update_layout(title="Hourly streamflow at NWM Reach " + str(reach_id),
                  yaxis_title='Discharge (cfs)')

flow_fig.show()