# <ins>DOE WPTO Wave Hindcast Data Access</ins>

This notebook provides examples of accessing the 32-year Wave Hindcast dataset, with brief examples of using the data within MHKiT and the SAMs tool. The data was generated through a collaboration between Pacific Northwest National Laboratory, Sandia National Laboratories, and the National Renewable Energy Laboratory, and is hosted on Amazon Web Services using the HDF Group's Highly Scalable Data Service (HSDS) for public access.

This dataset is continually growing and will contain data for all US territorial waters when complete.

## <ins>Dataset Description</ins>

The currently available data spans the US West Coast with a spatial resolution of >200 m. The spatial domain extends out to the EEZ boundary (200 nautical miles) and has been separated into federal and state controlled regions, accessible through the metadata. The time step for this spatial data is 3 hours, and the data spans 32 years, 01/01/1979 - 12/31/2010.

Virtual buoy datasets are also available at specific locations within the larger spatial domain. These virtual buoys span the same 32 years as the spatial dataset, but with a shorter time step of 1 hour. In addition to the shorter time step, these datasets include directional spectrum data.
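
As a rough sanity check on the size of the time axis, the number of records implied by the stated time steps can be counted with the standard library. This is a minimal sketch: it assumes every time step from 00:00 on 01/01/1979 through the end of 12/31/2010 is present, which this notebook does not confirm, and `count_steps` is a hypothetical helper, not part of rex or MHKiT.

```python
from datetime import datetime, timedelta

def count_steps(start, end_exclusive, step_hours):
    """Number of evenly spaced timestamps in [start, end_exclusive)."""
    return (end_exclusive - start) // timedelta(hours=step_hours)

# Assumed bounds: first step at 1979-01-01 00:00, last step before 2011-01-01 00:00.
start = datetime(1979, 1, 1)
end_exclusive = datetime(2011, 1, 1)

spatial_steps = count_steps(start, end_exclusive, 3)  # 3-hour spatial dataset
buoy_steps = count_steps(start, end_exclusive, 1)     # 1-hour virtual buoys
print(spatial_steps, buoy_steps)
```

Comparing these counts with the length of `time_index` summed over all years is one way to spot missing records.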

The variables available in the spatial and virtual buoy datasets are listed below.

## <ins>Dataset Variables</ins>

### <ins>Spatial Dataset:</ins>

Spatial datasets are located on AWS with the path: "/nrel/US_wave/US_wave_${year}.h5"

#### West Coast Region:

This spatial dataset is indexed by time and latitude/longitude coordinates ('time_index','coordinates'):

- directionality_coefficient
- energy_period
- maximum_energy_direction
- mean_absolute_period
- mean_zero-crossing_period
- omni-directional_wave_power
- peak_period
- significant_wave_height
- spectral_width
- water_depth

### <ins>Virtual Buoy Dataset</ins>

Virtual Buoy datasets are located on AWS with the path: "/nrel/US_wave/virtual_buoy/US_virtual_buoy_${year}.h5"

#### West Coast Region:

These virtual buoys are indexed by time and buoy latitude/longitude coordinates ('time_index','coordinates'). Directional variables have the extra indices of direction and frequency ('time_index','frequency','direction','coordinates').

- directional_wave_spectrum
- directionality_coefficient
- energy_period
- maximum_energy_direction
- mean_absolute_period
- mean_wave_direction
- mean_zero-crossing_period
- omni_directional_wave_power
- peak_period
- significant_wave_height
- spectral_width
- water_depth


## <ins>Installing and Configuring the HSDS service</ins>

The packages needed for this example can be installed via pip:

pip install -r requirements.txt

Next you'll need to configure h5pyd to access the data on HSDS:

hsconfigure

and enter at the prompt:

hs_endpoint = https://developer.nrel.gov/api/hsds   
hs_username = None   
hs_password = None   
hs_api_key = 3K3JQbjZmWctY0xmIfSYvYgtIcM3CN0cb1Y2w9bf    

The example API key here is for demonstration and is rate-limited per IP. To get your own API key, visit https://developer.nrel.gov/signup/

You can also add the above contents to a configuration file at ~/.hscfg
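
A configuration file with those contents could be generated along the following lines. This is only a sketch: it assumes the file uses the same `key = value` lines shown at the prompt above, `render_hscfg` is a hypothetical helper, `YOUR_API_KEY` is a placeholder, and the file is written to a temporary directory so that an existing ~/.hscfg is never touched.

```python
import tempfile
from pathlib import Path

def render_hscfg(endpoint, api_key, username="None", password="None"):
    """Return config text in the same 'key = value' form as the prompt above."""
    lines = [
        f"hs_endpoint = {endpoint}",
        f"hs_username = {username}",
        f"hs_password = {password}",
        f"hs_api_key = {api_key}",
    ]
    return "\n".join(lines) + "\n"

# Write to a temporary directory rather than the real home directory.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / ".hscfg"
    cfg_path.write_text(
        render_hscfg("https://developer.nrel.gov/api/hsds", "YOUR_API_KEY")
    )
    written = cfg_path.read_text()

print(written)
```

To use it for real, the same text would be written to `Path.home() / ".hscfg"` instead of the temporary path.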

Finally, you can use Jupyter Notebook or JupyterLab to view the example notebooks, depending on your preference.

## <ins>Access the HSDS server</ins>

rex enables efficient and scalable extraction of, manipulation of, and computation with NREL's flagship renewable resource datasets: the Wind Integration National Dataset (WIND Toolkit) and the National Solar Radiation Database (NSRDB). Development of functionality for the WPTO Wave Hindcast dataset is ongoing. MHKiT also leverages rex to access the WPTO Wave data, enabling streamlined integration into that analysis ecosystem.

The WPTO Wave Hindcast Dataset is provided in annual .h5 files.

Each year can be accessed from /nrel/US_wave/US_wave_${year}.h5
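
Because the path uses a `${year}` placeholder, the per-year file names can be built with `string.Template` from the standard library. The sketch below is illustrative only: `wave_file` is a hypothetical helper, and the year bounds are taken from the 1979-2010 span stated in the dataset description rather than checked against the server.

```python
from string import Template

# Templates copied from the paths quoted in this notebook.
SPATIAL = Template("/nrel/US_wave/US_wave_${year}.h5")
VIRTUAL_BUOY = Template("/nrel/US_wave/virtual_buoy/US_virtual_buoy_${year}.h5")

FIRST_YEAR, LAST_YEAR = 1979, 2010  # span stated in the dataset description

def wave_file(year, template=SPATIAL):
    """Return the HSDS path for one year, rejecting years outside the stated span."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year {year} is outside {FIRST_YEAR}-{LAST_YEAR}")
    return template.substitute(year=year)

print(wave_file(1990))
print(wave_file(1995, VIRTUAL_BUOY))
```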

To open the desired year of WPTO Wave Hindcast data, the server endpoint, username, and password are read from the configuration file.



In [None]:
# Quick view of metadata, time_index, and coordinates
from rex import WaveX

waveFile = '/nrel/US_wave/US_wave_1990.h5'

with WaveX(waveFile, hsds=True) as waves:
    meta = waves.meta  # meta contains location information for each data point
    print("meta =", meta)
    time_index = waves.time_index  # time_index contains all of the timestamps within the file
    print("time_index =", time_index)
    coordinates = waves.coordinates  # coordinates contains all the lat/lon pairs within the dataset
    print("coordinates =", coordinates)
    

# <ins>Basic Usage of the Spatial Dataset</ins>

The following examples use NREL-rex to access and download parts of the WPTO Wave Hindcast dataset.

## <ins>Accessing the Datasets</ins> 

Below are some examples of how you can access data using the NREL-rex Python package.
Datasets are returned as Pandas objects. See the Pandas [documentation](https://pandas.pydata.org/pandas-docs/stable/) 
for more information about working with Pandas objects.  

### Extracting data from a single site:
A single lat/lon pair can be given to extract data nearest to that location. 

In [None]:
# Extract the timeseries of significant_wave_height closest to a coordinate pair
from rex import WaveX

with WaveX(waveFile, hsds=True) as waves:
    lat_lon = (44.624076,-124.280097)
    parameters = 'significant_wave_height'
    swh_single = waves.get_lat_lon_df(parameters, lat_lon)
swh_single
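
To make "nearest to that location" concrete, the sketch below picks the closest grid point by great-circle distance using a brute-force search. It is an illustration only: the coordinates are made up, `nearest_site` is a hypothetical helper, and rex's internal search may use a different method and distance measure.

```python
import math

def nearest_site(target, coordinates):
    """Index of the (lat, lon) pair closest to target by great-circle distance."""
    def haversine_km(a, b):
        lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
        h = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371.0 * math.asin(math.sqrt(h))  # mean Earth radius in km

    return min(range(len(coordinates)),
               key=lambda i: haversine_km(target, coordinates[i]))

# Made-up grid points standing in for the `coordinates` array.
coordinates = [(44.60, -124.30), (44.63, -124.28), (43.49, -125.15)]
idx = nearest_site((44.624076, -124.280097), coordinates)
print(idx, coordinates[idx])
```

Since the returned series belongs to the nearest grid point rather than the requested point, it can be worth checking how far apart the two are.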

### Extracting data from multiple sites:
A list of latitude/longitude pairs can be passed to extract data from multiple sites.

In [None]:
# Extract the timeseries of significant_wave_height closest to multiple coordinate pairs
from rex import WaveX

with WaveX(waveFile, hsds=True) as waves:
    lat_lon = [(44.624076,-124.280097),(43.489171,-125.152137)] # set lat_lon to a list of lat/lon pairs
    parameters = 'significant_wave_height'
    swh_multi = waves.get_lat_lon_df(parameters, lat_lon)
swh_multi

### Extract all data within a specified region: 

Data can be extracted from a particular region based on jurisdiction regions found in the meta object. 

In [None]:
# Extract the timeseries of significant_wave_height in a region specified in the meta data
from rex import WaveX

with WaveX(waveFile, hsds=True) as waves:
    region = 'Oregon'                     # Specify Oregon as the jurisdiction region 
    region_col = 'jurisdiction'           # specify jurisdiction as the meta object column you are searching against
    variables = 'significant_wave_height'
    swh_map = waves.get_region_df(variables, region, region_col=region_col)
swh_map
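
Conceptually, a region query selects every site whose value in the chosen meta column equals the requested region. The sketch below shows that selection on a few made-up rows; the `gid` field, the coordinates, and the jurisdiction labels other than 'Oregon' are assumptions for illustration, and `sites_in_region` is a hypothetical helper rather than a rex function.

```python
# Made-up rows standing in for the meta table; real jurisdiction labels may differ.
meta_rows = [
    {"gid": 0, "latitude": 44.62, "longitude": -124.28, "jurisdiction": "Oregon"},
    {"gid": 1, "latitude": 46.90, "longitude": -124.40, "jurisdiction": "Washington"},
    {"gid": 2, "latitude": 43.49, "longitude": -125.15, "jurisdiction": "Federal"},
    {"gid": 3, "latitude": 45.10, "longitude": -124.05, "jurisdiction": "Oregon"},
]

def sites_in_region(rows, region, region_col="jurisdiction"):
    """Return the site ids whose region_col value equals region."""
    return [row["gid"] for row in rows if row[region_col] == region]

oregon_gids = sites_in_region(meta_rows, "Oregon")
print(oregon_gids)
```

Printing the unique values of the jurisdiction column in the real `meta` object is the safest way to find the exact region names to pass.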


### Extracting Data over Multiple Years:
Data can be extracted over a number of years and concatenated directly using NREL-rex functionality.
The new multi-year object has the same functionality as a single-year dataset. The datasets are concatenated into a single timeseries for convenience.

In [None]:
# Concatenate yearly significant wave height datasets into a single timeseries
from rex import MultiYearWaveX  # Yearly concatenation uses the MultiYearWaveX class

# Wildcards select which files to load; lists of file names for specific years are also supported
multi_year_waves = '/nrel/US_wave/US_wave_199*.h5'

with MultiYearWaveX(multi_year_waves,hsds=True) as mYears:
    lat_lon = (44.624076,-124.280097)
    parameters = 'significant_wave_height'
    swh_multi_year = mYears.get_lat_lon_df(parameters, lat_lon)
    
swh_multi_year
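
To see which years a wildcard such as `199*` would select, the pattern can be tested against the candidate file names with `fnmatch`. This sketch assumes shell-style wildcard semantics and builds the candidate list from the 1979-2010 span stated earlier; how rex resolves wildcards against the HSDS server is not described in this notebook.

```python
from fnmatch import fnmatch

pattern = "/nrel/US_wave/US_wave_199*.h5"

# Candidate file names for the years stated in the dataset description.
candidates = [f"/nrel/US_wave/US_wave_{year}.h5" for year in range(1979, 2011)]

matched = [path for path in candidates if fnmatch(path, pattern)]
matched_years = [int(path[-7:-3]) for path in matched]  # the four digits before ".h5"
print(matched_years)
```

Under these assumptions the pattern selects the ten files for 1990 through 1999, so the concatenated timeseries covers one decade.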
    

This functionality extends to multiple locations and regions.

In [None]:
# Concatenate yearly significant wave height datasets for multiple locations into a single timeseries
from rex import MultiYearWaveX 

multi_year_waves = f'/nrel/US_wave/US_wave_199*.h5' 

with MultiYearWaveX(multi_year_waves, hsds=True) as waves:
    lat_lon = [(44.624076,-124.280097),(43.489171,-125.152137)] # set lat_lon to a list of lat/lon pairs
    parameters = 'significant_wave_height'
    swh_multi = waves.get_lat_lon_df(parameters, lat_lon)

swh_multi

# <ins>Basic Usage of the Virtual Buoys</ins>

Virtual buoys were created during the larger UnSWAN model runs and contain 1-hour time step data, including directional spectrum data.

Virtual buoy dataset file names follow the convention: "/nrel/US_wave/virtual_buoy/US_virtual_buoy_${year}.h5"

In [None]:
# View the metadata and time_index for the virtual buoy datasets
from rex import WaveX

vBuoy = f'/nrel/US_wave/virtual_buoy/US_virtual_buoy_1995.h5'

with WaveX(vBuoy, hsds=True) as waves:
    meta = waves.meta          ## meta is an object that contains location information for each data point
    print("meta =", meta)
    time_index = waves.time_index  # time_index contains all of the timestamps within the file
    print("time_index =",time_index)
    

#### The Virtual Buoy datasets use the same functions that are used to select data from the spatial dataset, so the previous examples also apply to the virtual buoys.
#### An example of gathering multi-year data for a single buoy is provided below.

In [None]:
# Download and concatenate multiple years of significant wave height data from the virtual buoy dataset
from rex import MultiYearWaveX

multi_year_vBuoy = f'/nrel/US_wave/virtual_buoy/US_virtual_buoy_199*.h5'

with MultiYearWaveX(multi_year_vBuoy,hsds=True) as mYears:
    lat_lon = (44.624076,-124.280097)
    parameters = 'significant_wave_height'
    swh_buoy = mYears.get_lat_lon_df(parameters, lat_lon)
    
swh_buoy

# <ins>Integration with MHKiT</ins>

Functionality to read the WPTO hindcast data is currently being integrated into [MHKiT](https://mhkit-software.github.io/MHKiT/) and will be available to users soon. The following examples show how you will be able to use MHKiT to access the dataset once the functions are integrated. In addition to the Python functions demonstrated below, MHKiT-MATLAB functions with the same functionality will also be available.

Note: The following examples currently will NOT run on your local machine. 

In [1]:
# Importing local development version of MHKiT. This code will NOT work on local machines. 
import sys
sys.path.insert(1, '/mnt/c/Users/abharath/Documents/Projects/MHKit/')

### Extracting data from a single site:
A single lat/lon pair can be given to extract data nearest to that location. 

In [None]:
# Extract the timeseries of significant_wave_height closest to a coordinate pair
from mhkit.wave import io

single_year_waves = f'/nrel/US_wave/US_wave_1995.h5'
lat_lon = (44.624076,-124.280097)
parameters = 'significant_wave_height'

wave_singleYear = io.read_US_wave_dataset(single_year_waves,parameters,lat_lon)

wave_singleYear

### Extracting data from multiple sites:
A list of latitude/longitude pairs can be passed to extract data from multiple sites.

In [None]:
# Extract the timeseries of significant_wave_height closest to multiple coordinate pairs
from mhkit.wave import io

single_year_waves = f'/nrel/US_wave/US_wave_1995.h5'
lat_lon = [(44.624076,-124.280097),(43.489171,-125.152137)]
parameters = 'significant_wave_height'

wave_multiLocal = io.read_US_wave_dataset(single_year_waves,parameters,lat_lon)

wave_multiLocal

### Extracting Data over Multiple Years:
Data can be extracted over a number of years and concatenated directly with the MHKiT tools.
The new multi-year object has the same functionality as a single-year dataset. The datasets are concatenated into a single timeseries for convenience.

In [None]:
# Extract and concatenate a multi-year timeseries of significant_wave_height closest to a coordinate pair
from mhkit.wave import io

multi_year_waves = f'/nrel/US_wave/US_wave_199*.h5'
lat_lon = (44.624076,-124.280097)
parameters = 'significant_wave_height'

wave_multiYear = io.read_US_wave_dataset(multi_year_waves,parameters,lat_lon)

wave_multiYear

As in the previous virtual buoy examples using the rex package, virtual buoy data can be accessed through MHKiT-Python simply by selecting the correct file location.

# <ins>Working with Dataset Bulk Parameters</ins>

The following example provides a simple way to view data from the WPTO Wave Hindcast datasets.

In [2]:
# pull datasets from AWS
from mhkit.wave import io

dataset = f'/nrel/US_wave/US_wave_1995.h5'
lat_lon = (44.624076,-124.280097)
swh_name, owp_name = 'significant_wave_height', 'omni-directional_wave_power'

# First we use the MHKiT loading function to grab the datasets from AWS 
swh = io.read_US_wave_dataset(dataset,swh_name,lat_lon)
owp = io.read_US_wave_dataset(dataset,owp_name,lat_lon)

In [5]:
%config InlineBackend.figure_format ='retina'
%matplotlib widget

from mpl_toolkits.axes_grid1 import host_subplot
import matplotlib.pyplot as plt

host = host_subplot(111)
par = host.twinx()

host.set_xlabel("Time")
host.set_ylabel(swh_name)
par.set_ylabel(owp_name)

p1, = host.plot(swh.index, swh.values, label=swh_name)
p2, = par.plot(owp.index, owp.values, label=owp_name)

leg = plt.legend()

host.yaxis.get_label().set_color(p1.get_color())
leg.texts[0].set_color(p1.get_color())

par.yaxis.get_label().set_color(p2.get_color())
leg.texts[1].set_color(p2.get_color())

plt.title('WPTO Wave Hindcast Data for 1995')
plt.tight_layout()
plt.show()


# <ins>Accessing and Working with the Virtual Buoy Directional Wave Spectrum</ins>



In [7]:
# pull datasets from AWS
from mhkit.wave import io

dataset = f'/nrel/US_wave/virtual_buoy/US_virtual_buoy_1995.h5'
lat_lon = (44.624076,-124.280097)
spc_name = 'directional_wave_spectrum'

# First we use the MHKiT loading function to grab the datasets from AWS 
spc = io.read_US_wave_dataset(dataset,spc_name,lat_lon)
spc

Exception: Data must be 1-dimensional
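
The exception above is consistent with a table constructor being handed a multi-dimensional block, since the directional spectrum is indexed by time, frequency, and direction at each site. That reading is an interpretation and is not confirmed by this notebook. One possible workaround, sketched here with made-up axes and values and plain nested lists, is to flatten the block into long format with one row per (time, frequency, direction) combination.

```python
from itertools import product

# Tiny made-up axes; the real axis values come from the file.
times = ["1995-01-01 00:00", "1995-01-01 01:00"]
frequencies = [0.05, 0.10, 0.15]          # Hz
directions = [0.0, 90.0, 180.0, 270.0]    # degrees

# spectrum[t][f][d]: nested lists standing in for one site's 3-D block.
spectrum = [[[t + f / 10 + d / 100 for d in range(len(directions))]
             for f in range(len(frequencies))]
            for t in range(len(times))]

# Long format: one row per (time, frequency, direction) combination.
rows = [
    (times[t], frequencies[f], directions[d], spectrum[t][f][d])
    for t, f, d in product(range(len(times)),
                           range(len(frequencies)),
                           range(len(directions)))
]
print(len(rows), rows[0], rows[-1])
```

Each row then carries a single value, so the result fits in a one-dimensional column keyed by the three axes.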

In [None]:
spc.to_pickle('../spc_example_dataset.pkl')

#### Capture Length and Power Matrices Calculations with MHKiT

The WPTO hindcast data and MHKiT can be used to compute capture length and power matrices. The following example demonstrates this workflow using a synthetically generated power dataset. 

In [None]:
from pandas import Series
from numpy.random import seed, normal

owp = 'omni-directional_wave_power'
J = io.read_US_wave_dataset(single_year_waves,owp,lat_lon)

# Set the random seed, to reproduce results
seed(1)                                               
# Generate random power values
P = Series(normal(200, 40, J.shape[0]),index = J.index)

In [None]:
from mhkit.wave.performance import wave_energy_flux_matrix, capture_length, capture_length_matrix, power_matrix
from numpy import arange

Te = io.read_US_wave_dataset(single_year_waves,'energy_period',lat_lon)
Hm0 = io.read_US_wave_dataset(single_year_waves,'significant_wave_height',lat_lon)

# Calculate capture length
L = capture_length(P, J) 

# Generate bins for Hm0 and Te, input format (start, stop, step_size)
Hm0_bins = arange(0, Hm0.values.max() + .5, .5)    
Te_bins = arange(0, Te.values.max() + 1, 1)

# Create capture length matrix using a mean
LM = capture_length_matrix(Hm0, Te, L, 'mean', Hm0_bins, Te_bins)
# Create wave energy flux matrix using mean
JM = wave_energy_flux_matrix(Hm0, Te, J, 'mean', Hm0_bins, Te_bins)

# Create power matrix using mean
PM = power_matrix(LM, JM)
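
The arithmetic behind these steps can be sketched without MHKiT. Capture length is power divided by wave energy flux, L = P / J; the matrices are per-bin means over (Hm0, Te); and the power matrix is the element-wise product of the capture length and energy flux matrices. The helpers below are hypothetical stand-ins on made-up samples, and they label each bin by its lower edge, which may differ from MHKiT's labeling.

```python
from collections import defaultdict
from math import floor

def capture_length(power, flux):
    """Capture length L = P / J for paired samples."""
    return [p / j for p, j in zip(power, flux)]

def binned_mean(hm0, te, values, hm0_step=0.5, te_step=1.0):
    """Mean of values in each (Hm0 bin, Te bin); bins are keyed by their lower edge."""
    cells = defaultdict(list)
    for h, t, v in zip(hm0, te, values):
        key = (floor(h / hm0_step) * hm0_step, floor(t / te_step) * te_step)
        cells[key].append(v)
    return {key: sum(vs) / len(vs) for key, vs in cells.items()}

# Made-up samples: power P [W], energy flux J [W/m], Hm0 [m], Te [s].
P = [200.0, 240.0, 180.0]
J = [10000.0, 12000.0, 30000.0]
Hm0 = [1.2, 1.4, 2.1]
Te = [8.3, 8.9, 10.5]

L = capture_length(P, J)           # [m]
LM = binned_mean(Hm0, Te, L)       # mean capture length per sea-state bin
JM = binned_mean(Hm0, Te, J)       # mean energy flux per sea-state bin
PM = {cell: LM[cell] * JM[cell] for cell in LM}  # power matrix [W]
print(PM)
```

The first two samples fall into the same bin, so their values are averaged before the product is taken.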

In [None]:
from mhkit.wave.graphics import plot_matrix
# Plot the mean power matrix
ax = plot_matrix(PM)
