2022-12-09 Xpublish & ZarrDAP meeting notes #138

Open
abkfenris opened this issue Dec 9, 2022 · 5 comments

@abkfenris
Member

abkfenris commented Dec 9, 2022

On 2022-12-09 we met to discuss various Xarray-based data server projects. Discussion post announcing meeting

Purpose: Meetup to discuss progress and plans for OpenDAP, WMS and other API layers on top of the Xarray/Dask (aka Pangeo) Python stack, such as:

  • Xpublish
    • xreds built on top of Xpublish
  • ZarrDAP
    • Implements OPeNDAP and a custom HTML ZarrDAP Catalog, from which it generates an Intake catalog.

Attendees:

  • Rich Signell / USGS / @rsignell-usgs
  • Alex Kerney / Gulf of Maine Research Institute & NorthEast Regional Association of Coastal and Ocean Observing Systems / @abkfenris
  • Anthony Aufdenkampe / LimnoTech / @aufdenkampe
    • Helping USGS NHGF to configure pygeoapi-edr (+ZarrDAP or Xpublish) against the same stac to document XYZT zarr data in S3
  • Joe Hamman / Earthmover / @jhamman
    • started Xpublish
  • Filipe Fernandes / IOOS / @ocefpaf
  • Don Setiawan / UW OOI Regional Cabled Array / @lsetiawan
  • Jonathan Joyce / RPS Group / @jonmjoyce
  • Matthew Iannucci / RPS Group / @mpiannucci
  • Dave Blodgett / USGS Water /
  • Andrew Buddenberg / NOAA/NCEI
    • thinks he's in charge of ZarrDAP now
  • Shane Mill / NOAA/NWS / @ShaneMill1
  • Steve Olson / NOAA/NWS / @solson-nws
    • Implementing EDR
  • Jon Blower / National Oceanography Centre, UK / @jonblower
  • Chad Whitney / NOAA/NCEI
  • Paul Tomasula / LimnoTech / @ptomasula
  • Sarah Jordan / LimnoTech / @sjordan29
  • Xavier Nogueira / LimnoTech / @xaviernogueira
  • Dave Stuebe
  • Micah Wengren / IOOS / @mwengren
  • Patrick Tripp / RPS Group / @patrick-tripp

Agenda & Notes

  • Intros
    • (Go around by order in attendee list, probably 1-3 min each)
    • who are you, where do you work, background in the space.
  • Why are you/your org interested in working on Discussion & Python
    • Xpublish (Matt): need a replacement for THREDDS data servers (not cloud-ready)
    • ZarrDAP
      • Chad: Andrew just open-sourced ZarrDAP, but introduced a bug that they need to fix
      • Andrew: We're tired of THREDDS
      • Dave B: THREDDS team is well aware of these issues.
        • THREDDS team is taking it apart to build microservices from all THREDDS functionality
        • Key issue with THREDDS is cost of S3 egress fees
        • We need ...
    • PyGeoAPI-EDR
      • Shane building AWS scaling capabilities, which he wants to contribute to PyGeoAPI-EDR
        • AWS API Gateway + Lambda & Fargate, reaching out to ECS.
    • Xpublish update from Joe.
      • Very open to others working on it, such as Benoit Bovy
      • Could still benefit from more active developers
      • We need example architectures that use Xpublish
      • Perhaps a router plugin interface would be useful
    • (similar round robin)
  • What are folks working on?
    • (we can start round robin, but this can move into more of a discussion, we will want to keep moving so we don't get bogged down in any one avenue of work)
    • Demos?
  • How can we work together, rather than duplicate each other's efforts?
    • Can Xpublish & ZarrDAP efforts or codebases be "merged"?
      • Matt: interesting to see that Xpublish & ZarrDAP seem to have almost identical approaches for accessing the data despite being developed totally independently
    • Alex's vision for Xpublish
      • Make it modular. Maybe a core/plugin/distro interface
      • Andrew: Are you suggesting that ZarrDAP be rewritten as a plugin to Xpublish? Alex: maybe...
        • Alex: I've made a very alpha OpenDAP Xpublish router ( https://github.com/gulfofmaine/xpublish-opendap ), but you've tested it much more. I'm thinking that you refactor onto Xpublish and adapt your data loading into xpublish.get_dataset. It also means that as we create new Xpublish routers, you can get those for free
    • Caching discussion
      • Dave S: demo of real-time Forecast Model Run Collection (FMRC) for HRRR, with caching using fsspec's 'simplecache' caching filesystem
      • Will post PR for adding the core parts of the HRRR aggregation to https://github.com/asascience-open/nextgen-dmac
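
For readers unfamiliar with the caching approach mentioned above: whole-file caching means copying a remote file to local storage on first access and reading the local copy afterwards. Below is a minimal stdlib-only sketch of that idea. It is not fsspec's implementation and not the code from the demo; `WholeFileCache`, `fake_fetch`, and the bucket URL are hypothetical names made up for illustration.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import BinaryIO, Callable


class WholeFileCache:
    """Cache whole remote files on local disk, keyed by a hash of the URL."""

    def __init__(self, cache_dir: str, fetch: Callable[[str], bytes]) -> None:
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch  # hypothetical downloader: URL -> file contents
        self.misses = 0  # number of times the remote store was actually hit

    def local_path(self, url: str) -> Path:
        # Hash the URL so any remote path maps to a safe local filename.
        return self.cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()

    def open(self, url: str) -> BinaryIO:
        path = self.local_path(url)
        if not path.exists():
            # First access: download the entire file once.
            self.misses += 1
            path.write_bytes(self.fetch(url))
        # Every access, including the first, reads from the local copy.
        return path.open("rb")


def fake_fetch(url: str) -> bytes:
    """Stand-in for a real object-store download (assumption for the demo)."""
    return f"bytes for {url}".encode("utf-8")


cache = WholeFileCache(tempfile.mkdtemp(), fake_fetch)
url = "s3://example-bucket/hrrr/forecast.grib2"  # hypothetical URL

with cache.open(url) as f:
    first = f.read()
with cache.open(url) as f:
    second = f.read()
```

With fsspec itself, this behaviour is usually requested by chaining the cache onto the remote URL, for example a path of the form `simplecache::s3://...`; the exact configuration used in the demo was not recorded in these notes.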

Action items

  • Move conversation to the Xpublish repo, which is followed by a bunch of additional people not on this call.
  • Try to get a regular meeting going. Possibly under the Pangeo umbrella?
@abkfenris
Member Author

I added my thoughts on how we could build an Xpublish ecosystem, with Xpublish as the core, routers and data loaders as plugins, and opinionated collections as distros, over in the discussions: #139
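
To make the core/plugin/distro split concrete, here is a rough stdlib-only sketch. It is not the Xpublish API and not the design proposed in #139; `Core`, `add_loader`, `add_router`, `build_distro`, and the demo loader and routers are all hypothetical names, and a plain dict stands in for an Xarray dataset.

```python
from typing import Callable, Dict, List, Optional

Dataset = Dict[str, str]  # stand-in for an xarray.Dataset in this sketch
Loader = Callable[[str], Optional[Dataset]]  # data loader plugin
Router = Callable[[Dataset], str]  # router plugin: dataset -> response


class Core:
    """Hypothetical core: knows nothing about formats, only about plugins."""

    def __init__(self) -> None:
        self.loaders: List[Loader] = []
        self.routers: Dict[str, Router] = {}

    def add_loader(self, loader: Loader) -> None:
        self.loaders.append(loader)

    def add_router(self, prefix: str, router: Router) -> None:
        self.routers[prefix] = router

    def get_dataset(self, dataset_id: str) -> Dataset:
        # Ask each loader in turn; the first one that knows the ID wins.
        for loader in self.loaders:
            dataset = loader(dataset_id)
            if dataset is not None:
                return dataset
        raise KeyError(dataset_id)

    def handle(self, prefix: str, dataset_id: str) -> str:
        # Every router receives datasets through the same loading path.
        return self.routers[prefix](self.get_dataset(dataset_id))


def demo_loader(dataset_id: str) -> Optional[Dataset]:
    """Data loader plugin backed by an in-memory catalog (assumption)."""
    catalog = {"sst": {"title": "Sea surface temperature"}}
    return catalog.get(dataset_id)


def info_router(dataset: Dataset) -> str:
    return f"info: {dataset['title']}"


def opendap_router(dataset: Dataset) -> str:
    return f"dds for {dataset['title']}"


def build_distro() -> Core:
    """A 'distro' is an opinionated bundle of loaders and routers."""
    core = Core()
    core.add_loader(demo_loader)
    core.add_router("info", info_router)
    core.add_router("opendap", opendap_router)
    return core


server = build_distro()
info_response = server.handle("info", "sst")
opendap_response = server.handle("opendap", "sst")
```

The point of the split is that a new router works with every existing data loader, and a new loader is immediately served by every installed router.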

@abkfenris
Member Author

We had a few similar demos, so I didn't share this so we could keep the discussion going. Here is a new model viewer that I've been working on for NERACOOS that's focused on informal users: think fishermen, sailors, surfers. It's currently using Xpublish and an EDR router for time series data, and THREDDS for WMS (though I intend to try swapping in @mpiannucci's router). It uses STAC to identify what layers are available, and what services.

I also experimented with using Xpublish to add new data services to existing servers. https://xpublish.onrender.com/docs & https://github.com/abkfenris/xpublish-erddap use the awesome-erddap server list to add EDR and zarr endpoints to most existing ERDDAP datasets out there that are served by GridDAP.

@rsignell-usgs
Contributor

@abkfenris thanks so much for suggesting this meeting and for the great notes! That model viewer is very cool, and I look forward to seeing use of the WMS service that @mpiannucci demonstrated today.

Did you use xstac to create the STAC records for your model data? The construction of that viewer would make a nice blog post!

@abkfenris
Member Author

@rsignell-usgs other folks took most of the notes, I just made the structure and cleaned it up.

I've been manually generating STAC items so far, though it looks like I might be able to replace some of the more problematic parts of my workflow with xstac.

@abkfenris
Member Author

It sounds like folks have made a bunch of cool things since we last met, so let's find a time to meet again: #172
