## Retrieving NWM retrospective streamflow simulations and USGS observations

* This code retrieves streamflow from the National Water Model (NWM) retrospective dataset (v2.1 or v3.0) for a defined set of stream reaches, together with the corresponding USGS streamflow observations. Outputs are saved as CSV files.
* The NWM retrospective data are hosted at: https://registry.opendata.aws/nwm-archive/



In [None]:
import os
import pandas as pd
import xarray
import s3fs
from hydrotools.nwis_client.iv import IVDataService

**Define the station IDs and COMIDs of interest**

In [None]:
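# Input and output folders
sitesPath = './Input/'
savePath = './Output/'
os.makedirs(savePath, exist_ok=True)  # create the output folder if it does not exist yet

# Read the sites of interest; 'site_no' is kept as a string so leading zeros in USGS IDs are preserved
sites_loc = pd.read_csv(f'{sitesPath}SelStn_Q.csv', dtype={'site_no': 'string'})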
stations = sites_loc['site_no'].values.tolist()
reaches = sites_loc['comid'].values.tolist()
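
The `dtype={'site_no': 'string'}` argument matters because USGS site numbers often start with a zero, which is lost if pandas parses the column as an integer. The sketch below illustrates this on a small in-memory table. The two rows are hypothetical placeholders (not real gauge/reach pairs), and it assumes the input file has exactly the `site_no` and `comid` columns used above.

```python
import io
import pandas as pd

# Hypothetical two-row stand-in for SelStn_Q.csv (placeholder IDs, not real pairs)
csv_text = "site_no,comid\n01234567,1001\n09876543,1002\n"

# Reading 'site_no' as a string keeps the leading zeros
as_string = pd.read_csv(io.StringIO(csv_text), dtype={'site_no': 'string'})

# Default parsing turns 'site_no' into integers and drops the leading zeros
as_default = pd.read_csv(io.StringIO(csv_text))

demo_stations = as_string['site_no'].tolist()    # strings, zeros preserved
demo_reaches = as_string['comid'].tolist()       # integers
demo_broken = as_default['site_no'].tolist()     # integers, zeros lost

print(demo_stations, demo_reaches, demo_broken)
```

The `comid` column is left as an integer on purpose, since it is later matched against the integer `feature_id` coordinate of the NWM dataset.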

**Retrieve NWM retrospective streamflow data**

In [None]:
# S3 path on Amazon Web Services (AWS) where the NWM retrospective dataset is stored.
# Uncomment the line for the version you want to use.
s3_path = 's3://noaa-nwm-retrospective-2-1-zarr-pds/chrtout.zarr'  # v2.1
#s3_path = 's3://noaa-nwm-retrospective-3-0-pds/CONUS/zarr/chrtout.zarr'  # v3.0

s3 = s3fs.S3FileSystem(anon=True)
store = s3fs.S3Map(root=s3_path, s3=s3, check=False)

# Open the CHRTOUT Zarr store lazily (no streamflow values are downloaded yet) as 'ds_zarr'
ds_zarr = xarray.open_zarr(store=store, consolidated=True)

In [None]:
%%time
# Get NWM retrospective data
#------------------------------------------
# NOTE: if no time range is given, the full retrospective period is retrieved.
# Define the period to retrieve (label-based slices include both endpoints)
timerange = slice('1996-01-01', '1997-01-01')

# Select the reaches and period, then pull the streamflow values into memory
NWM_retro = ds_zarr.sel(feature_id=reaches, time=timerange).streamflow.persist()
NWM_retro = NWM_retro.to_pandas()
NWM_retro.index.name = 'value_time'
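
To make the shape of `NWM_retro` concrete, here is a minimal sketch of the same `sel` and `to_pandas` pattern on a tiny synthetic dataset. The reach IDs, timestamps, and values are made up for illustration. Only the dimension names (`time`, `feature_id`) and the `streamflow` variable name are taken from the code above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the CHRTOUT store: 3 time steps x 3 reaches (made-up values)
times = pd.to_datetime(['1996-01-01 00:00', '1996-01-01 01:00', '1996-01-02 00:00'])
ids = [101, 102, 103]
ds = xr.Dataset(
    {'streamflow': (('time', 'feature_id'), np.arange(9.0).reshape(3, 3))},
    coords={'time': times, 'feature_id': ids},
)

# Select two reaches (in the requested order) and every time step on 1996-01-01
subset = ds.sel(feature_id=[103, 101], time=slice('1996-01-01', '1996-01-01')).streamflow

# A 2-D DataArray converts to a wide DataFrame: one row per time, one column per reach
wide = subset.to_pandas()
wide.index.name = 'value_time'
print(wide)
```

The columns follow the order of the requested IDs, so the CSV written in the next cell has one column per entry in `reaches`, in that order.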

In [None]:
# Save NWM retrospective streamflow as CSV
NWM_retro.to_csv(f'{savePath}NWM_Qretro.csv')

**Retrieve USGS observations**

In [None]:
%%time
# Get USGS observations for the same period
#----------------
service = IVDataService(
    value_time_label="value_time"
)
obs_data = service.get(
    sites=stations,
    startDT='1996-01-01',
    endDT='1997-01-01')

obs_data['value_cms'] = obs_data['value'] * (0.3048)**3 # Convert ft3/s to cms
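
The factor `0.3048**3` comes from the definition of the international foot (1 ft = 0.3048 m exactly), so 1 ft³/s = 0.3048³ m³/s ≈ 0.0283168 m³/s. A small worked check, with the helper name `cfs_to_cms` and the constant names chosen here purely for illustration:

```python
FT_TO_M = 0.3048            # metres per foot (exact, by definition)
CFS_TO_CMS = FT_TO_M ** 3   # cubic metres per cubic foot


def cfs_to_cms(q_cfs):
    """Convert a discharge from cubic feet per second to cubic metres per second."""
    return q_cfs * CFS_TO_CMS


# 1000 ft3/s is roughly 28.3 m3/s
print(round(cfs_to_cms(1000.0), 3))
```

Keeping the factor in a named constant makes it easier to see that both CSV outputs end up in the same units (m³/s).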

In [None]:
# Save USGS observations as CSV
obs_data.to_csv(f'{savePath}USGS_Qretro.csv')