Add WW3 Stations Spec reader #92

mpiannucci · 2023-08-21T01:07:16Z

This PR adds support for reading WW3 Station Spectral Wave Files, which NOAA distributes with GFS Wave output. An example location is https://noaa-gfs-bdp-pds.s3.amazonaws.com/index.html#gfs.20230820/18/wave/station/bulls.t18z/ (the files are also available from NOMADS and GCP).

Because these files are available from cloud providers, the reader accepts a file-like object instead of a filename.

Example usage with fsspec:

import wavespectra as ws
import fsspec

fs = fsspec.filesystem("s3", anon=True)
with fs.open('s3://noaa-gfs-bdp-pds/gfs.20230820/18/wave/station/bulls.t18z/gfswave.44097.spec') as file:
    ds = ws.read_ww3_station(file)

ds.spec


rafa-guedes · 2023-08-21T21:19:05Z

This looks great @mpiannucci thanks for this PR!

A couple of minor comments before merging. Shall we also support files specified by strings rather than file objects, in case they are available locally? In fact, I think it would be nice to set this up more broadly for other file types as well, as you suggested in #91. Also, could you please run black on the new code you have added? We use it to keep formatting consistent across the codebase.

mpiannucci · 2023-08-21T22:00:43Z

Thanks for the feedback!

I will make those changes and then take a stab at making the changes to support fsspec more broadly.
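As a rough illustration of the path-or-file-object support discussed above, a reader could normalise its input with a small context manager. This is only a sketch: `open_path_or_fileobj` is a hypothetical name, not a function from wavespectra, and it relies on the standard library alone.

```python
import os
from contextlib import contextmanager


@contextmanager
def open_path_or_fileobj(path_or_fileobj, mode="rb"):
    """Yield a file object from a path or an already open file-like object.

    Hypothetical helper for illustration; not part of wavespectra.
    """
    if isinstance(path_or_fileobj, (str, os.PathLike)):
        # We opened the file here, so we are responsible for closing it.
        with open(path_or_fileobj, mode) as fileobj:
            yield fileobj
    else:
        # The caller owns this object (for example a handle returned by
        # fsspec), so it is passed through and left open.
        yield path_or_fileobj
```

A reader built on a helper like this would accept both a local path and the `fs.open(...)` handle from the fsspec example in the PR description.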


mpiannucci · 2023-08-22T01:07:55Z

I took care of the low-hanging fruit, but the methods that expect file globs for ASCII data are a fair bit harder.

rafa-guedes · 2023-08-22T04:24:48Z

wavespectra/input/__init__.py

+            dset = xr.open_mfdataset(
+                filename_or_fileglob, chunks=_chunks, combine="by_coords"
+            )
+        except ValueError:


I get an IndexError rather than a ValueError when trying to open a file-like object with xr.open_mfdataset - do you get ValueError in wavespectra?

mpiannucci · 2023-08-22

This gives me a ValueError saying that it does not know which engine to use, because only a file-like object is passed in:

import xarray as xr

with open('./sample_files/swanfile.nc', 'rb') as file:
    xr.open_mfdataset(file, chunks={}, combine="by_coords")

If I specify the engine, that seems to give me a ValueError as well:

with open('./sample_files/swanfile.nc', 'rb') as file:
    xr.open_mfdataset(file, chunks={}, combine="by_coords", engine="netcdf4")

gives:

ValueError: can only read bytes or file-like objects with engine='scipy' or 'h5netcdf'

Switching to h5netcdf also throws a ValueError for me:

ValueError: can't open netCDF4/HDF5 as bytes try passing a path or file-like object

That does not entirely make sense, but throwing an error does. I am unable to get an IndexError; can you show me the code to reproduce it? I'm also happy to just catch any exception, but it is better to be explicit.
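To make the explicit-versus-catch-all trade-off concrete, here is a minimal sketch of a fallback that only swallows the exception types named in this thread. The function name and its `primary` and `fallback` parameters are hypothetical stand-ins, not xarray or wavespectra APIs; in practice the two callables could wrap whichever open functions the loader tries in turn.

```python
def open_with_fallback(source, primary, fallback, expected=(ValueError, IndexError)):
    """Try `primary(source)`; on an expected error, try `fallback(source)`.

    Hypothetical helper: `primary` and `fallback` are any callables that
    take the source and return a dataset-like object.
    """
    try:
        return primary(source)
    except expected:
        # Only the listed exception types trigger the fallback; anything
        # else (such as a KeyError from a genuine bug) still propagates.
        return fallback(source)
```

Passing `expected=(Exception,)` would give the catch-all behaviour instead, at the cost of hiding unrelated failures behind the fallback.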

mpiannucci · 2023-08-24T03:41:11Z

I am not sure whether the error differs between versions of xarray or netCDF, but for now I just changed it to Exception. If that is unacceptable, please let me know.

Otherwise, I think this is as far as I want to refactor for this PR. It does not remove any existing functionality, but adds flexibility to many of the readers.


mpiannucci added 7 commits August 20, 2023 10:29

Add first pass at parsing gfs wave station output without dataset extraction

c518102

Working spectral parse

27f7760

Get parsing working but spec values are incorrect

0f63cee

Working ww3 station reader

bba3eac

Add more tests

6ff4e7d

Cut down sample file size

d30834e

Appease flake8

a387978

mpiannucci changed the title ~~Add WW3 Stations reader~~ Add WW3 Stations Spec reader Aug 21, 2023

mpiannucci added 4 commits August 21, 2023 18:26

Run black

3b04a22

Change netcdf reading method to support fileobjs

4faebb9

Get netcdf working with fileobjs or string inputs

9bdbac1

Add helper method to check input type

fd80e75


mpiannucci added 2 commits August 21, 2023 20:31

Update netcdf loading method, update extra dependencies, format with black

34b20f5

More file read refactor, run black

061aa0e

mpiannucci added 3 commits August 21, 2023 21:40

fsspeccify octopus input

e7dde42

Cleanup

a8f8812

Add doc for fsspec capability with netcdf files

d1448d2

rafa-guedes reviewed Aug 22, 2023


Change exception to catch all in nc loader

549ecc8

Generic open functions (#1)

2b1b1c9

* Split open_netcdf into a different function so it can be reused, only raise ValueError and IndexError
* New read_ascii_or_binary function to read str or fileobj, change input functions to use these generic openers when possible

rafa-guedes merged commit 7ce49c6 into wavespectra:master Aug 28, 2023

rafa-guedes mentioned this pull request Aug 28, 2023

Improve compatibility with fsspec #91

Closed
