# Visualizing CONUS404 and reference data 
 
 Author: Hannah Podzorski, USGS
 Date: 2024-04-03
 
<img src='../../../doc/assets/Eval_Viz.svg' width=600>
Visualization notebooks present the data graphically so that patterns and differences between datasets are easy to see.

<details>
  <summary>Guide to pre-requisites and learning outcomes...&lt;click to expand&gt;</summary>
  
  <table>
    <tr>
      <td>Pre-Requisites
      <td>To get the most out of this notebook, you should already have an understanding of these topics: 
        <ul>
        <li>pre-req one
        <li>pre-req two
        </ul>
    <tr>
      <td>Expected Results
      <td>At the end of this notebook, you should be able to: 
        <ul>
        <li>outcome one
        <li>outcome two
        </ul>
  </table>
</details>

In [10]:
# library imports
import os
import cf_xarray
import dask
from dask.distributed import LocalCluster, Client
import fsspec 
os.environ['USE_PYGEOS'] = '0'
import geopandas as gpd
# import hvplot.pandas
import hvplot.xarray
import intake
import math
import numpy as np
import pandas as pd
import pygeohydro
import sparse 
import warnings
import xarray as xr

from shapely.geometry import Polygon

warnings.filterwarnings('ignore')


## **Start a Dask client** 
This is an optional step, but can speed up data loading significantly, especially when accessing data from the cloud.

In [9]:
if "client" in locals():
    print("Shutting down existing Dask cluster.")
    cluster.close()
    client.close()

cluster = LocalCluster()
client = Client(cluster)
print(f"The Dask dashboard link is {client.dashboard_link}")

Shutting down existing Dask cluster.
The Dask dashboard link is http://127.0.0.1:8787/status


## Accessing already prepared CONUS404 data from OSN using `intake`

Datasets are loaded into the notebook in two steps.

First, open the HyTEST intake catalog and select the tutorial sub-catalog (`conus404_drb_cat`). Second, index an entry in that sub-catalog by name (for example, `prism-drb-OSN`) and call its `to_dask` method, which loads the data from the catalog entry as a Dask-backed dataset.

In [11]:
# connect to HyTEST catalog
url = 'https://raw.githubusercontent.com/hytest-org/hytest/main/dataset_catalog/hytest_intake_catalog.yml'
cat = intake.open_catalog(url)

# access tutorial catalog
conus404_drb_cat = cat["conus404-drb-eval-tutorial-catalog"]
list(conus404_drb_cat)

['c404-ceres-drb-desc-stats-OSN',
 'c404-crn-drb-desc-stats-OSN',
 'c404-drb-zonal-OSN',
 'c404-hcn-drb-desc-stats-OSN',
 'c404-prism-drb-desc-stats-OSN',
 'ceres-drb-OSN',
 'ceres-drb-zonal-OSN',
 'conus404-drb-OSN',
 'crn-drb-OSN',
 'crn-drb-point-OSN',
 'hcn-drb-OSN',
 'hcn-drb-point-OSN',
 'prism-drb-OSN',
 'prism-drb-zonal-OSN']

Let's print the description of each dataset in the catalog.

In [20]:
for item in list(conus404_drb_cat):
    descr = conus404_drb_cat[item].description
    print(f"{item}: {descr}\n")

c404-ceres-drb-desc-stats-OSN: Descriptive statistics for the comparison of CONUS404 to CERES-EBAF

c404-crn-drb-desc-stats-OSN: Descriptive statistics for the comparison of CONUS404 to CRN

c404-drb-zonal-OSN: CONUS404 zonal statistics of Delware River Basin

c404-hcn-drb-desc-stats-OSN: Descriptive statistics for the comparison of CONUS404 to HCN

c404-prism-drb-desc-stats-OSN: Descriptive statistics for the comparison of CONUS404 to PRISM

ceres-drb-OSN: CERES-EBAF Delaware River Basin subset, 40 years of monthly data for CONUS404 forcings evaluation

ceres-drb-zonal-OSN: CERES-EBAF zonal statistics of Delware River Basin

conus404-drb-OSN: CONUS404 Delaware River Basin subset, 40 years of monthly data for CONUS404 forcings evaluation

crn-drb-OSN: Climate Reference Network subset, 40 years of monthly data for CONUS404 forcings evaluation

crn-drb-point-OSN: CRN and CONUS404 point statistics of Delware River Basin

hcn-drb-OSN: Historical Climate Network subset, 40 years of monthly 
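
The entry names and descriptions above suggest a naming pattern: a suffix of `-desc-stats-OSN`, `-zonal-OSN`, or `-point-OSN` marks derived statistics, and the remaining entries are data subsets. The sketch below groups the entries on that basis. The pattern is inferred from the listing rather than documented by the catalog, and `classify` is a hypothetical helper, not part of `intake`.

```python
# Entry names copied from the catalog listing above.
entries = [
    "c404-ceres-drb-desc-stats-OSN",
    "c404-crn-drb-desc-stats-OSN",
    "c404-drb-zonal-OSN",
    "c404-hcn-drb-desc-stats-OSN",
    "c404-prism-drb-desc-stats-OSN",
    "ceres-drb-OSN",
    "ceres-drb-zonal-OSN",
    "conus404-drb-OSN",
    "crn-drb-OSN",
    "crn-drb-point-OSN",
    "hcn-drb-OSN",
    "hcn-drb-point-OSN",
    "prism-drb-OSN",
    "prism-drb-zonal-OSN",
]


def classify(name: str) -> str:
    """Guess the kind of entry from its name (assumed naming convention)."""
    if name.endswith("-desc-stats-OSN"):
        return "descriptive statistics"
    if name.endswith("-zonal-OSN"):
        return "zonal statistics"
    if name.endswith("-point-OSN"):
        return "point statistics"
    return "data subset"


# Collect entry names under each inferred kind, preserving listing order.
groups = {}
for name in entries:
    groups.setdefault(classify(name), []).append(name)

for kind, names in groups.items():
    print(f"{kind}: {names}")
```

Grouping the names this way makes it easier to pick matching pairs, such as a zonal entry for CONUS404 and the zonal entry for a reference dataset.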

In [21]:
conus404_drb_zonal = conus404_drb_cat['c404-drb-zonal-OSN'].read()
conus404_drb_zonal

| | huc6 | time | PREC_NC_ACC | RNET | TK |
|---|---|---|---|---|---|
| 0 | 020401 | 1980-01 | 51.166572 | 8.634619 | 267.390965 |
| 1 | 020401 | 1980-02 | 39.551063 | 37.497082 | 265.023723 |
| 2 | 020401 | 1980-03 | 180.614316 | 71.697241 | 271.726642 |
| 3 | 020401 | 1980-04 | 133.649421 | 117.075991 | 279.958756 |
| 4 | 020401 | 1980-05 | 50.195687 | 167.817963 | 287.487071 |
| ... | ... | ... | ... | ... | ... |
| 1023 | 020402 | 2022-06 | 98.177848 | 191.787350 | 295.226971 |
| 1024 | 020402 | 2022-07 | 78.151746 | 180.768256 | 299.089277 |
| 1025 | 020402 | 2022-08 | 96.719828 | 159.768490 | 298.274521 |
| 1026 | 020402 | 2022-09 | 63.366276 | 102.287688 | 293.287686 |
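
The zonal table is an ordinary tabular dataset with one row per HUC6 and month, so standard `pandas` operations apply. The following is a minimal sketch on a few rows copied from the output above, not on the full table. It assumes that `TK` is temperature in kelvin (the values are consistent with that, but the units are not stated here); the `T_degC` column name is made up for illustration.

```python
import pandas as pd

# A few rows copied from the table above; huc6 is kept as a string
# so that the leading zero is preserved.
sample = pd.DataFrame(
    {
        "huc6": ["020401", "020401", "020401", "020402", "020402"],
        "time": ["1980-01", "1980-02", "1980-03", "2022-06", "2022-07"],
        "PREC_NC_ACC": [51.166572, 39.551063, 180.614316, 98.177848, 78.151746],
        "RNET": [8.634619, 37.497082, 71.697241, 191.787350, 180.768256],
        "TK": [267.390965, 265.023723, 271.726642, 295.226971, 299.089277],
    }
)

# Parse the year-month strings into timestamps (first day of each month).
sample["time"] = pd.to_datetime(sample["time"], format="%Y-%m")

# Assumption: TK is in kelvin, so subtracting 273.15 gives degrees Celsius.
sample["T_degC"] = sample["TK"] - 273.15

# Mean of each variable per HUC6 over the sampled months.
per_basin = sample.groupby("huc6")[["PREC_NC_ACC", "RNET", "TK"]].mean()
print(per_basin)
```

The same steps should carry over to `conus404_drb_zonal` itself, provided its `huc6` and `time` columns are stored as shown in the output.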


In [13]:
prism_drb = conus404_drb_cat['prism-drb-OSN'].to_dask()
prism_drb

Dask array summaries from the dataset display:

| Data type | Shape | Chunk shape | Array bytes | Chunk bytes | Dask graph |
|---|---|---|---|---|---|
| float64 numpy.ndarray | (495, 92, 50) | (492, 92, 48) | 17.37 MiB | 16.58 MiB | 4 chunks in 2 graph layers |
| float32 numpy.ndarray | (495, 92, 50) | (492, 92, 48) | 8.69 MiB | 8.29 MiB | 4 chunks in 2 graph layers |
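
The sizes and chunk counts reported above follow directly from the array shape, the chunk shape, and the bytes per element. As a quick check, the arithmetic can be reproduced in plain Python; `array_mib` and `n_chunks` are hypothetical helpers written for this sketch, not Dask functions.

```python
import math


def array_mib(shape, itemsize):
    """Size in MiB of a dense array with this shape and bytes per element."""
    return math.prod(shape) * itemsize / 2**20


def n_chunks(shape, chunks):
    """Number of blocks when each dimension is split into pieces of at most `chunks` elements."""
    return math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))


shape = (495, 92, 50)   # full array shape from the output above
chunks = (492, 92, 48)  # largest chunk shape from the output above

print(f"float64 array: {array_mib(shape, 8):.2f} MiB")
print(f"float64 chunk: {array_mib(chunks, 8):.2f} MiB")
print(f"float32 array: {array_mib(shape, 4):.2f} MiB")
print(f"float32 chunk: {array_mib(chunks, 4):.2f} MiB")
print(f"number of chunks: {n_chunks(shape, chunks)}")
```

The first and last dimensions are each split in two (492 + 3 and 48 + 2 elements), which gives the four chunks shown in the output. Arrays this small fit comfortably in memory, so the chunking matters mainly when the same pattern is applied to larger datasets.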
