WIP: ssebop #262

Open
wants to merge 9 commits into
base: master

Conversation

thodson-usgs

@thodson-usgs thodson-usgs commented Nov 16, 2023


name: Recipe
about: Demonstrating pangeo forge pipeline to USGS.
title: SSEBOP

Dataset

SSEBOP is an evapotranspiration dataset covering CONUS at 1 km² spatial and daily temporal resolution.

@thodson-usgs
Author

Hmm, I was following the instructions from the README, but perhaps I should have created the issue before the PR.

@thodson-usgs
Author

So far I've been unable to run this using pangeo-forge-runner --prune with the Direct Runner.
The run produces a lot of output, ending with grpc.FutureTimeoutError.
I'm not sure if this is a problem with the recipe, my environment, or my hardware.

@norlandrhagen
Contributor

Hey @thodson-usgs, taking a look at running your recipe locally. Are you on the ESIP slack?

@thodson-usgs
Author

@norlandrhagen,
Yes, I'll follow up with you there.

For the record, you identified an error in my recipe; however, my run crashes before that point so I probably need to take a closer look at my configuration file.

Thanks!

@norlandrhagen
Contributor

Nice fix! I'm now running into:

AttributeError: 'ZipExtFile' object has no attribute 'size' [while running 'Create|OpenURLWithFSSpec|Preprocess|StoreToZarr/OpenURLWithFSSpec/MapWithConcurrencyLimit/open_url (max_concurrency=1)']
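
For readers following along, the attribute error itself can be reproduced with the standard library alone. The sketch below is only an illustration of the mismatch, not the pipeline's actual code path: it assumes that something downstream expects a file object exposing `.size`, as fsspec's buffered files do, while a member opened from a zip archive is a `zipfile.ZipExtFile` that does not. The archive contents and the member name `et.tif` are made up for the example.

```python
import io
import zipfile

# Build a tiny in-memory zip archive with one (fake) TIFF member.
payload = b"not a real GeoTIFF"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("et.tif", payload)

# Open the member the way a zip-aware opener might hand it downstream.
with zipfile.ZipFile(buffer) as archive:
    with archive.open("et.tif") as member:
        member_type = type(member).__name__   # class of the opened member
        has_size = hasattr(member, "size")    # the attribute the error complains about
    # The uncompressed size lives on the ZipInfo record instead.
    info_size = archive.getinfo("et.tif").file_size

print(member_type, has_size, info_size)
```

If the caller needs the size, it has to come from `ZipInfo.file_size` (or the file has to be wrapped in an object that provides `.size`), which may be why the same recipe behaves differently depending on how the zip is opened.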

@thodson-usgs
Author

Interesting,
I'd been getting an error about opening the zip, but not that one.
In general, I've been testing on the several environments I have on hand: Ubuntu on WSL2, ESIP-nebari, and HPC. Each one gives a unique error... smells like an environment issue.

Next steps:

  1. I'll set max_concurrency=1 and, if that fails, avoid fsspec entirely and open the zip URL directly with rioxarray.
  2. Try this all on a clean Ubuntu machine.
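
As a rough sketch of what "open the zip URL directly with rioxarray" could look like: GDAL can read a file inside a remote zip by chaining its `/vsizip/` and `/vsicurl/` virtual file system prefixes, and `rioxarray.open_rasterio` accepts such a path. Whether the recipe used this exact approach is an assumption; the helper names `gdal_zip_url` and `open_zipped_tif`, the example URL, and the member name are all hypothetical.

```python
def gdal_zip_url(zip_url: str, member: str) -> str:
    """Build a GDAL virtual path to `member` inside a remote zip archive.

    /vsicurl/ streams the archive over HTTP(S); /vsizip/ reads inside it.
    """
    return f"/vsizip//vsicurl/{zip_url}/{member}"


def open_zipped_tif(zip_url: str, member: str):
    """Open one zipped GeoTIFF as an xarray object, bypassing fsspec.

    rioxarray is imported lazily so the path helper works without it.
    """
    import rioxarray  # requires rioxarray (and GDAL via rasterio)

    return rioxarray.open_rasterio(gdal_zip_url(zip_url, member))


# Hypothetical URL and member name, for illustration only.
example_path = gdal_zip_url("https://example.com/data/et_day.zip", "et_day.tif")
print(example_path)
```

Calling `open_zipped_tif` needs network access and a working GDAL install, so only the path construction is exercised here.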

One question, what type of system are you testing with?

And thank you again, @norlandrhagen

@norlandrhagen
Contributor

Ah, strange! Happy to help further. I'm on an M1 Mac. I'm creating a conda/mamba env and installing pangeo-forge-recipes + rioxarray there.

@thodson-usgs
Author

Progress,
I don't understand why OpenURLWithFSSpec failed (this all worked fine when I tested with fsspec), but I can open the zipped TIFs directly from rioxarray.

Now I get
AttributeError: 'Dataset' object has no attribute 'encode' [while running 'Create|Preprocess|StoreToZarr/Preprocess/Map(_preproc)']

Maybe it's time to wade a bit deeper into Beam...

@thodson-usgs
Author

thodson-usgs commented Nov 26, 2023

I changed one line and now the recipe runs without error.

def _preproc(item: Indexed[T]) -> Indexed[T]:
to
def _preproc(item: Indexed[T]) -> Indexed[xr.Dataset]:
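
A one-line annotation change can matter because return annotations are ordinary runtime data that a framework can inspect, and Beam, as far as I understand it, uses function annotations as type hints when deciding how to handle a transform's output. The sketch below only shows that the two signatures expose different information; it does not reproduce Beam's behavior. `Index`, `Indexed`, and `Dataset` here are simplified stand-ins (assumptions) for the pangeo-forge-recipes types and `xr.Dataset`.

```python
from typing import Dict, Tuple, TypeVar, get_args, get_type_hints

T = TypeVar("T")
Index = Dict[str, int]        # hypothetical stand-in for the recipe's Index type
Indexed = Tuple[Index, T]     # assumed shape: an (index, payload) pair


class Dataset:
    """Stand-in for xr.Dataset so the example has no dependencies."""


def preproc_generic(item: Indexed[T]) -> Indexed[T]:
    return item


def preproc_concrete(item: Indexed[T]) -> Indexed[Dataset]:
    return item


# Frameworks can read these annotations at runtime.
generic_return = get_type_hints(preproc_generic)["return"]
concrete_return = get_type_hints(preproc_concrete)["return"]

# The generic version leaves the payload type as an unbound TypeVar;
# the concrete version names the payload class explicitly.
print(get_args(generic_return)[1], get_args(concrete_return)[1])
```

With the unbound `T`, a framework has nothing concrete to go on for the payload and may fall back to a default guess, which is one possible explanation for the `encode` error going away once the return type names the dataset class.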

At the next pangeo-forge meeting I'll follow up on why fsspec didn't work.

@ranchodeluxe ranchodeluxe changed the title WIP: ssebop Unknown: ssebop Jan 8, 2024
@ranchodeluxe ranchodeluxe changed the title Unknown: ssebop WIP: ssebop Jan 8, 2024
@ranchodeluxe

sorry about that title change foobar ☝️ @thodson-usgs 😆 I am going to try to run this on my cluster as a data point and was creating a ticket of a similar name in a different tab

@thodson-usgs
Author

@ranchodeluxe, this recipe was a bit of a test point for us as well. USGS has a legacy of zipping tiffs, and I was demonstrating that pangeo-forge could handle that pattern. We did get it working, but it might have exposed another bug (pangeo-forge/pangeo-forge-recipes#659). And then I got sidetracked working on the flink runner. Feel free to close this.

3 participants