Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added option to explicitly not use fx variables in preprocessors #1416

Merged
merged 7 commits into from
Jan 13, 2022

Conversation

schlunma
Copy link
Contributor

@schlunma schlunma commented Dec 15, 2021

Description

This PR allows the use of fx_variables: {} (or fx_variables: null or fx_variables: []) to not use any fx variables in the preprocessor.

Deprecations

In addition, it deprecates the option always_use_ne_masks=True for the mask_landsea preprocessor. The same behavior can now be achieved by using fx_variables: {}.

Closes #1415

Link to documentation:


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the πŸ›  Technical or πŸ§ͺ Scientific review.


To help with the number pull requests:

@codecov
Copy link

codecov bot commented Dec 15, 2021

Codecov Report

Merging #1416 (66b4452) into main (b070a4b) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1416      +/-   ##
==========================================
+ Coverage   88.86%   88.87%   +0.01%     
==========================================
  Files         196      196              
  Lines        9965     9971       +6     
==========================================
+ Hits         8855     8862       +7     
+ Misses       1110     1109       -1     
Impacted Files Coverage Ξ”
esmvalcore/_recipe.py 94.55% <100.00%> (+0.12%) ⬆️
esmvalcore/exceptions.py 100.00% <100.00%> (ΓΈ)
esmvalcore/preprocessor/_mask.py 85.47% <100.00%> (+0.25%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Ξ” = absolute <relative> (impact), ΓΈ = not affected, ? = missing data
Powered by Codecov. Last update b070a4b...66b4452. Read the comment docs.

@schlunma schlunma marked this pull request as ready for review December 15, 2021 16:53
Copy link
Contributor

@valeriupredoi valeriupredoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should introduce a backwards incompatible change, the behaviour of regular masks should not change - afaik for the previous implementation for mask_landsea/mask_seaice if there were no fx_variables specified by the user it'd try sflf/sftof/sftgif and if those are not found then it'd go straight to NE masks - if this is not the case then I have a big beef if it's choosing its own system fx variables

esmvalcore/_recipe.py Show resolved Hide resolved
esmvalcore/preprocessor/_mask.py Show resolved Hide resolved
@@ -247,6 +247,13 @@ available tables of the specified project.
a given dataset) fx files are found in more than one table, ``mip`` needs to
be specified, otherwise an error is raised.

.. note::
To explicitly **not** use any fx variables in a preprocessor, use either
``fx_variables: null``, ``fx_variables: {}`` or ``fx_variables: []``. While
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it confusing to have multiple options for the same functionality - decide which one single one you like and use only that, you can be fancy and add a catchment in the recipe parser eg if fx_variables == {} or fx_variables == []: raise RecipeError("I see you are trying to be smart, use null for fx_variables, smartypants")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now we allow dicts and lists for fx variables (see here), so I think at least an empty list and an empty dict should be allowed. For me it feels natural to also inlcude fx_variables=None, but I can remove that if necessary πŸ˜„

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am more in favour of Null or None only - think from a user's perspective and not from the developer's perspective - how nice is it when you read the Numpy docs and they simply say set this to Null if you don't want to have it rather than you can call your grandfather Grampa, Gramps, as well as Granddad, as well as Pops etc... UX first, ma man

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we allow all of them but just mention fx_variables: null in the doc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds very good to me 🍺

@schlunma
Copy link
Contributor Author

schlunma commented Dec 16, 2021

I don't think this should introduce a backwards incompatible change, the behaviour of regular masks should not change - afaik for the previous implementation for mask_landsea/mask_seaice if there were no fx_variables specified by the user it'd try sflf/sftof/sftgif and if those are not found then it'd go straight to NE masks - if this is not the case then I have a big beef if it's choosing its own system fx variables

Here's a summary of the old/new behavior:

Option Old New
fx_variables option is not specified load default fx variables, e.g., sftlf/sftof for mask_landsea same as old
fx_variables: {} (or [] or null) load default fx variables, e.g., sftlf/sftof for mask_landsea do not use any fx files, e.g. for masking go straight to NE
fx_variables: areacello Only use areacello, ignoring defaults same as old

As you can see, only the behavior of using fx_variables: {} (which is not used in any recipe at the moment, btw) is not backwards compatible. Apart from that I didn't change anything, for example mask_landsea still automatically selects sftlf and sftof if nothing else is specified and falls back to NE masks if no files are found.

EDIT:

The main change in this PR is moving from a if not step_settings.get('fx_variables'): to if 'fx_variables' not in step_settings:. The first condition is true for any empty dict, empty list, None, 0, etc.; the latter is only true if the options is not explicitly set by the user.

@valeriupredoi
Copy link
Contributor

I don't think this should introduce a backwards incompatible change, the behaviour of regular masks should not change - afaik for the previous implementation for mask_landsea/mask_seaice if there were no fx_variables specified by the user it'd try sflf/sftof/sftgif and if those are not found then it'd go straight to NE masks - if this is not the case then I have a big beef if it's choosing its own system fx variables

Here's a summary of the old/new behavior:
Option Old New
fx_variables option is not specified load default fx variables, e.g., sftlf/sftof for mask_landsea same as old
fx_variables: {} (or [] or null) load default fx variables, e.g., sftlf/sftof for mask_landsea do not use any fx files, e.g. for masking go straight to NE
fx_variables: areacello Only use areacello, ignoring defaults same as old

As you can see, only the behavior of using fx_variables: {} (which is not used in any recipe at the moment, btw) is not backwards compatible. Apart from that I didn't change anything, for example mask_landsea still automatically selects sftlf and sftof if nothing else is specified and falls back to NE masks if no files are found.

EDIT:

The main change in this PR is moving from a if not step_settings.get('fx_variables'): to if 'fx_variables' not in step_settings:. The first condition is true for any empty dict, empty list, None, 0, etc.; the latter is only true if the options is not explicitly set by the user.

nice summary! so then there is no actual case of backwards incompatibility but rather the existence of a hidden feature in the case of mask_landsea/seaice that we don't want to promote since we want the previous behavior to stay and the user should not complicate things with empty fx_variables settings

@zklaus
Copy link
Contributor

zklaus commented Dec 16, 2021

What if only a few data sources are missing the files? If I add the new option, will all fx files be ignored, even those that are present?

@valeriupredoi
Copy link
Contributor

What if only a few data sources are missing the files? If I add the new option, will all fx files be ignored, even those that are present?

yep

@zklaus
Copy link
Contributor

zklaus commented Dec 16, 2021

What if only a few data sources are missing the files? If I add the new option, will all fx files be ignored, even those that are present?

yep

Do we want that?

@schlunma
Copy link
Contributor Author

nice summary! so then there is no actual case of backwards incompatibility but rather the existence of a hidden feature in the case of mask_landsea/seaice that we don't want to promote since we want the previous behavior to stay and the user should not complicate things with empty fx_variables settings

Well technically it is backwards incompatible, but the feature that changes has never been used before, so I guess it shouldn't be a big deal. However, if one needs a common land/sea mask for a bunch of datasets (previously using the always_use_ne_masks option, this can be achieved now with fx_variables: {})

What if only a few data sources are missing the files? If I add the new option, will all fx files be ignored, even those that are present?

Yes, if you explicitly add the new option, all fx files are ignored. I don't think it's a problem since the user actively has to change that and if a preprocessor needs fx files in order to work (e.g., volumne_statistics or area_statistics with irregular grids) then it will give an error. In addition, I added to the doc that it is not encouraged to use that feature.

@valeriupredoi
Copy link
Contributor

What if only a few data sources are missing the files? If I add the new option, will all fx files be ignored, even those that are present?

yep

Do we want that?

not really no, but that's up to the user to decide methinks - they do a first run without fx_variables: Null then they decide to plug it in since their analysis is all over the shoppe

@schlunma
Copy link
Contributor Author

What if only a few data sources are missing the files? If I add the new option, will all fx files be ignored, even those that are present?

yep

Do we want that?

We allow the user to choose custom fx_variables for the preprocessor right now (I could use e.g. volcello for masking, which doesn't make sense at all) and the preprocessor would run completely fine (using NE masks), so why don't we give the user the possibility to not use any fx file?

@zklaus
Copy link
Contributor

zklaus commented Dec 16, 2021

I am thinking of the following. Say you want to plot tas in maps for a bunch of models, but only over land, so you use mask_landsea. For good measure, you throw in ICON or some satellite product that doesn't have a mask. ESMValTool barfs with missing fx file and you add the new option. But suddenly all your nice working model masks are gone and you get artifact galore close to the coasts.

@valeriupredoi
Copy link
Contributor

Well technically it is backwards incompatible, but the feature that changes has never been used before, so I guess it shouldn't be a big deal. However, if one needs a common land/sea mask for a bunch of datasets (previously using the always_use_ne_masks option, this can be achieved now with fx_variables: {})

Yeah see I don't like this - this is a recipe for inconsistent results, useless panicking of the UX group (moar incompatible changes!), and something that is not that helpful in the case of those preprocessors anyway since if sftlf/of/gif data is missing people have already gotten used to the machinery switching to NE masks. To be clear, all I want for Christmas is you (to remove those backwards incompatible change messages, no actual change of the functionality) 😁

@valeriupredoi
Copy link
Contributor

For good measure, you throw in ICON or some satellite product that doesn't have a mask. ESMValTool barfs with missing fx file and you add the new option. But suddenly all your nice working model masks are gone and you get artifact galore close to the coasts

Very good point @zklaus ! yes, NE masks only for ICON and sft's for models - another good argument to remove the messages about the option for mask_* preprocessors

@schlunma
Copy link
Contributor Author

I am thinking of the following. Say you want to plot tas in maps for a bunch of models, but only over land, so you use mask_landsea. For good measure, you throw in ICON or some satellite product that doesn't have a mask. ESMValTool barfs with missing fx file and you add the new option. But suddenly all your nice working model masks are gone and you get artifact galore close to the coasts.

I completely understand that sometimes this can be problematic (that's why I added the "not encouraged" in the doc). However, as I already mentioned in the description of this PR, I don't want to make a case for using/not using fx files. I just want a consistent behavior of the tool, which is not the case at the moment imho. Why would fx_variables: {} include fx files? And why would we allow arbitrary fx files in the first place if we don't allow not using any?

@schlunma
Copy link
Contributor Author

Well technically it is backwards incompatible, but the feature that changes has never been used before, so I guess it shouldn't be a big deal. However, if one needs a common land/sea mask for a bunch of datasets (previously using the always_use_ne_masks option, this can be achieved now with fx_variables: {})

Yeah see I don't like this - this is a recipe for inconsistent results, useless panicking of the UX group (moar incompatible changes!), and something that is not that helpful in the case of those preprocessors anyway since if sftlf/of/gif data is missing people have already gotten used to the machinery switching to NE masks. To be clear, all I want for Christmas is you (to remove those backwards incompatible change messages, no actual change of the functionality) grin

Haha, yes, done πŸ˜„

@zklaus
Copy link
Contributor

zklaus commented Dec 16, 2021

And why would we allow arbitrary fx files in the first place [...]?

Agreed. We really should think a bit about what we really should be doing here before we throw on a few more layers of confusion.

@schlunma
Copy link
Contributor Author

schlunma commented Dec 16, 2021

To be honest I think that this change rather removes a layer of confusion. When I read the doc about fx variables in preprocessors and found something like:

Alternatively, the fx_variables argument can also be specified as a list:

fx_variables: ['areacello', 'volcello']

I thought: "hey, if I use fx_variables: [], surely this will lead to ESMValTool not using any fx variable!". However, it did not, and that's what I'm trying to solve here. That's the main (and sole) purpose of this PR.

Copy link
Contributor

@valeriupredoi valeriupredoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cheers for the work, Manu! I'm ok with it as it is now, so am off on baby Jesus holidays starting in 30min so approving not to be the cork in the bottle, you settle if there's other things with Klaus, I'll see you guys back in January, happy holidays!! πŸŽ„ πŸŽ…

@schlunma
Copy link
Contributor Author

@zklaus is there anything I can change to find a compromise here?

@zklaus
Copy link
Contributor

zklaus commented Dec 20, 2021

Oh well, I suppose it doesn't make the situation much worse. At some point, we'll have to overhaul this situation, but then let's focus on the details of this PR for now.

One thing that sticks out to me is the deprecation exception. Probably that should at least go to esmvalcore.exceptions? I also wonder if this is a departure from the way we have handled deprecations up to now. If so, we should probably explain why, and perhaps document the procedure. I just tried looking up an example for a deprecation and found it in #808. Perhaps have a look. It seems customary to do deprecations as warnings (which still are part of the exceptions hierarchy).

@schlunma
Copy link
Contributor Author

Oh well, I suppose it doesn't make the situation much worse. At some point, we'll have to overhaul this situation, but then let's focus on the details of this PR for now.

Sounds good to me!

One thing that sticks out to me is the deprecation exception. Probably that should at least go to esmvalcore.exceptions? I also wonder if this is a departure from the way we have handled deprecations up to now. If so, we should probably explain why, and perhaps document the procedure. I just tried looking up an example for a deprecation and found it in #808. Perhaps have a look. It seems customary to do deprecations as warnings (which still are part of the exceptions hierarchy).

Yes, I will move it to esmvalcore.exceptions, that makes much more sense (I wasn't aware of that module).

I basically copy-pasted this from here

# Custom warning, because DeprecationWarning is hidden by default
class ESMValToolDeprecationWarning(UserWarning):
"""Configuration key has been deprecated."""

and here

https://github.com/ESMValGroup/ESMValTool/blob/0d983b1e694157474801e71d33e5e9c1c33426bf/esmvaltool/__init__.py#L5-L6

Apparently DeprecationWarnings do not get printed by default, that's why there is an extra print(f"Warning: {msg}") in #808. Apart from that, I think both cases are pretty similar; in both cases warnings.warn is used to issue the warning.

I think the approach of this PR is easier (since it doesn't rely on the extra print), so would it be okay to leave it like this? Currently there is also no other deprecation in ESMValCore, so there is no need to unify things.

@zklaus
Copy link
Contributor

zklaus commented Dec 20, 2021

Hm. I long term, as part of our backwards compatibility efforts, I think we want to catch all deprecations from ourselves and our dependencies. That will mean automatic collections of all deprecations during the test runs in the CI context of ESMValTool. If our own approach to deprecation handling aligns with the rest of the ecosystem, it should be much easier to get a quick overview. At least I'd like to hear some other opinions before we introduce a completely new mechanism for deprecations.

@ESMValGroup/esmvaltool-coreteam, any views on deprecation by custom exception?

@schlunma
Copy link
Contributor Author

Just for reference: iris uses a very similar custom deprecation warning.

If this is an issue, I can just use the "old" approach. As I said, I just copy-pasted this from other parts of our code and thought it will be fine.

Nevertheless, I don't really understand the problem here. We don't have this automatic collection of all deprecations in place yet and I don't really think it will be a problem if we use a custom warning here that (1) inherits from UserWarning and (2) is issued with the default python warnings.warn framework.

@valeriupredoi
Copy link
Contributor

as long as we don't have a standard deprecation handler yet (we should have one soon!) I am fine with this - it's verbosy and user-friendly enough to me; one thing I'd suggest is not to take what iris does as bullet-proof but rather to question their implementations first and decide if the same stuff is good for us too after 🍺

@schlunma
Copy link
Contributor Author

Any further comments on this? I really would like to get this in, soon. Cheers!

@valeriupredoi
Copy link
Contributor

@zklaus maybe some final (or not so final 😁 ) remarks so we can get this over the hill? 🍺

@zklaus
Copy link
Contributor

zklaus commented Jan 12, 2022

Actually, I have to apologize a little bit. I read the Warning as a pure Exception or Error outside of the warning system. Must have been the end-of-year and deliverable frenzy. Sorry!

So go ahead. Perhaps it's a good idea to open an issue about adding a "canonical" approach to deprecations to the documentation?

@schlunma
Copy link
Contributor Author

No worries, thanks for the review! I will open an issue about that πŸ‘

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deprecated feature preprocessor Related to the preprocessor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Explicitly **not** using any fx variable is not possible in preprocessor
3 participants