Update xesmf to versions >= 0.4.0 #2728
@remi-kazeroni, could you possibly give the changed cmorizer a spin in an updated environment?
I have just created a new environment based on this branch in which I have:
# Name          Version  Build         Channel
pangeo-xesmf    0.6.3    pypi_0        pypi
xesmf           0.6.3    pyhd8ed1ab_1  conda-forge
but the cmorizer command for CDS-UERRA ( I have also tried the cmorizer with a fresh environment created from the main branch (i.e. with
that's no bueno, the env now contains two essentially identical packages, but with different names - although the conda-forge one should be the only one we want 🤦‍♂️
You are right, @valeriupredoi, that is quite unfortunate. I have just opened an issue on the xesmf feedstock for it: conda-forge/xesmf-feedstock#27. The other thing that @remi-kazeroni brought up is simply a bug. Sorry about that, will fix soon.
good man, Klaus! I built the env locally too and just noticed the pangeo-xesmf unpacks and installs on top of the already-installed conda xesmf, inside the from-conda xesmf install location - that's a mess - I'll go 👍 your issue there 🍺
Do you expect to be able to fix this by Friday?
Possibly. I don't think it's very difficult, but the right people need to respond in time. If it's not fixed by tomorrow evening, it will not happen. In that case, I propose to go ahead with pinning to <=0.3.0.
What about the bug that @remi-kazeroni reported?
Working on it. It's not difficult, but downloading the raw data locally takes forever. I am trying to test on Levante now.
Sorry, I realize that I never updated the raw data for CDS-UERRA when working on the new cmorizer interface. Because of a minor permission issue, I can't change that right now. I have downloaded all the raw data in /work/bd0854/b309192/runs/obs_PR/RAWOBS/Tier3/CDS-UERRA and cmorized everything in /work/bd0854/b309192/runs/obs_PR/OBS/Tier3/CDS-UERRA. Feel free to use these data and sorry for the inconvenience!
No worries. Thanks for the heads up, that should help!
Ok, the bug should be fixed and I tested it on the data that @remi-kazeroni mentioned. @remi-kazeroni, could you please have a second look and confirm that the data is as it should be? I'll then try to sort out the xesmf mess tomorrow before we merge this.
To make a decision here, we need a brief insight into how PyPI/pip/setuptools manage metadata vs how conda/mamba/conda-forge do it.
Setuptools keeps metadata about installed packages (from PyPI, via pip, or otherwise) in dist-info directories inside the normal site-packages directory, for example
Conda keeps its metadata in two places, namely inside the environment directory in the conda-meta sub-directory as a collection of json files and in the package in the info subdirectory, for example
To achieve some interoperability, conda-forge includes dist-info directories in the conda packages.
The problem with xesmf is that the upstream project releases source distributions and wheels on PyPI with a distname of
Conda-forge builds its packages from the GitHub releases and consequently puts
Since we install everything from conda-forge, I think this is the best way forward now, though we should definitely press upstream to resolve this situation.
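As an illustration of the two metadata stores described in this comment, here is a minimal sketch (not part of the PR) of how one might inspect them from Python. The function names `shared_import_names` and `conda_records` are hypothetical helpers invented for this example. It assumes Python 3.10 or newer for `importlib.metadata.packages_distributions()`, and it assumes the JSON records in `conda-meta` carry `name` and `version` keys.

```python
import json
from importlib import metadata
from pathlib import Path


def shared_import_names(mapping=None):
    """Return {import_name: [distribution names]} for import names that
    more than one installed distribution claims to provide."""
    if mapping is None:
        # Built from the dist-info metadata in site-packages (Python 3.10+).
        mapping = metadata.packages_distributions()
    shared = {}
    for import_name, dist_names in mapping.items():
        unique = sorted(set(dist_names))
        if len(unique) > 1:
            shared[import_name] = unique
    return shared


def conda_records(env_prefix):
    """Return {package name: version} from <env_prefix>/conda-meta/*.json.

    Assumption: each record is a JSON object with "name" and "version" keys.
    """
    records = {}
    for path in sorted(Path(env_prefix, "conda-meta").glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        if "name" in record:
            records[record["name"]] = record.get("version")
    return records
```

In an environment like the one reported above, `shared_import_names()` would presumably list `xesmf` as provided by both `xesmf` and `pangeo-xesmf`, while `conda_records(sys.prefix)` would only know about the conda-forge package. Namespace packages can legitimately appear in the first result, so a hit is a hint to investigate rather than proof of a conflict.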
Can I ask why the regridding is using xesmf instead of the esmvalcore regridder? I am assuming this was written a long time ago. Doesn't the updated module use xesmf or something like that anyway? Maybe it would be worth it to update the cmorizer to use our routines and forget about these dependency nightmares.
Thanks for your efforts @zklaus. Note that I have moved the CDS-UERRA data to the shared RAWOBS and OBS on Levante to ease the work here. The regridding part seems to work fine indeed but the CMORization returns an error:
Traceback (most recent call last):
  File "/work/bd0854/b309192/soft/mambaforge/envs/test_xesmf/lib/python3.10/site-packages/iris/__init__.py", line 343, in load_cube
    cube = cubes.merge_cube()
  File "/work/bd0854/b309192/soft/mambaforge/envs/test_xesmf/lib/python3.10/site-packages/iris/cube.py", line 386, in merge_cube
    raise ValueError("can't merge an empty CubeList")
ValueError: can't merge an empty CubeList
I did 2 tests on Levante:
In both cases regridded data are produced in
I'm not sure I understand why the CMORization only works for regridded data in case 2 (
cheers @remi-kazeroni - and thanks a lot to @zklaus and @sloosvel - sorry, was away the second part of last week. I suspect xesmf is used because the regridding is done before the CMORization, where the esmvalcore/iris regridding can't be used: the wonky non-CF coordinates make iris complain. This is, however, not an optimal approach, so I'd reckon we could get rid of xesmf for good by regridding the data after CMORization - do you think this flow would affect the final data output?
I'd be in favor of getting rid of xesmf here. I wonder if we should regrid in the cmorizer at all. People might want to have the original data anyway, and when they want it regridded, they might have opinions on how the regridding should be done, so why not leave it to the preprocessor in the recipe?
Great idea, that would be an even simpler solution
Note that this applies potentially to several cmorizers that are regridding right now, even though they might not use xesmf for this. Would that be good practice in general? What would be an appropriate forum to make that kind of decision?
I like the idea of not regridding in cmorizers and only having that done by preprocessors in recipes. But this would impact other cmorizers for which the data are then used in public recipes. Perhaps it would be useful to open a separate issue on the topic and discuss with the development team. Regarding the
Hi esmvaltool team :) Yes, taking the regridding out of the cmorizer for CDS-UERRA is fine with me.
Thanks for your input @bascrezee! @zklaus, would you have the time to take care of removing the regridder from this CMORizer? If not, I could take a look.
Sorry, @remi-kazeroni, today is my last working day before vacation and I'll be back only in September. I can have a look then, but not earlier.
Enjoy your vacation @zklaus! I'll try to have a look next month then.
good hols (hopefully on a 🏖️ ) @zklaus 🍸 @remi-kazeroni you can count on me too, am going nowhere nice 🇬🇧
Ping @zklaus @remi-kazeroni @valeriupredoi It looks like this was forgotten after the summer holidays. Is this something we would like to include in the upcoming v2.8 release?
Indeed, I completely forgot about that PR. I remember trying to get rid of the regridder step of the UERRA CMORizer but did not succeed in doing so because of the structure of the raw data... Do we still have an issue with that version of the xesmf package that should be addressed before v2.8?
It's getting a bit old so it might be inconvenient for users who want to use ESMValTool and xesmf in the same environment, but otherwise, there is no issue that I'm aware of.
just realized, while opening conda-forge/esmvaltool-suite-feedstock#18 that we are still using xesmf=0.3.0 which is absolutely decrepit - we should have looked at this before heading over to release v2.8.0 - do we have time to accept it maybe?
I really lost track of this and don't have time to work on it myself for this release. It would be great if someone could take over the review and push it to the finish line, if possible. If I remember correctly, the
Oh my. Pangeo has initiated recovery of the pypi project, see pypi/support#2425, but probably removing the regridding is still a good idea and it seems that the recovery is likely to take a few more weeks. Pangeo opened the issue back in November, but PyPI got to it only mid-February and says the process might take six weeks.
setup.py
@@ -64,7 +64,7 @@
         'seawater',
         'shapely<2.0.0',  # github.com/ESMValGroup/ESMValTool/issues/2965
         'xarray',
-        'xesmf==0.3.0',
+        'xesmf>=0.4.0',
@zklaus maybe leave this unpinned so the pip installation can go ahead without the "version not found" error?
Upstream has recovered the PyPI project and published the latest 0.7.1 there. Once this has made it to Conda-forge (see conda-forge/xesmf-feedstock#30), we can pin
Great to see progress on this! It looks like the linked pull request on conda-forge was merged.
this now looks very good to me, apart from the small observation about the if/else block, but all the deps and versions look clean now, cheers @zklaus 🍺 Incidentally, xesmf is only needed for this particular cmorizer, is there any way we can replace it with say, esmpy or anything else? It's a proper overkill to have such a fussy dependency for one script only
great! cheers @zklaus and @bouweandela and @remi-kazeroni 🍺 x3
Description
This PR updates xesmf to version 0.4.0 or newer. To do that, it updates the UERRA cmorizer to use the newer API, which means providing a filename for the weights explicitly. It also changes the name under which xesmf is requested from PyPI to pangeo-xesmf.
With this, we should be able to unpin xesmf and avoid some trouble with recent ABI incompatibilities.
Before you get started
Checklist
It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.
New or updated data reformatting script
To help with the number of pull requests: