Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the option to weight variable groups in ClimWIP #1856

Merged
merged 20 commits into from
Nov 10, 2020

Conversation

lukasbrunner
Copy link
Contributor

@lukasbrunner lukasbrunner commented Oct 13, 2020

Tasks

  • Create an issue to discuss what you are going to do, if you haven't done so already (and add the link at the bottom)
  • Give this pull request a descriptive title that can be used as a one line summary in a changelog
  • Make sure your code is composed of functions of no more than 50 lines and uses meaningful names for variables
  • Circle/CI tests pass. Status can be seen below your pull request. If the tests are failing, click the link to find out why.
  • Preferably Codacy code quality checks pass, however a few remaining hard to solve Codacy issues are still acceptable. Status can be seen below your pull request. If there is an error, click the link to find out why. If you suspect Codacy may be wrong, please ask by commenting.
  • Please use yamllint to check that your YAML files do not contain mistakes

Modified recipe/diagnostic

  • Update documentation for the recipe to the doc/sphinx/source/recipes folder
  • Update provenance information if needed
  • Assign the author(s) of the affected recipe(s) as reviewer(s)

Closes #1853

@lukasbrunner
Copy link
Contributor Author

lukasbrunner commented Oct 13, 2020

This first try runs for me. I'm doing some output checks now.

@Peter9192: I decided to split the variable group loop in the main function, I hope that this does not create any unforeseen problems. My main intention there was to make sure that every element in the *_contributions parameters is included and have if raise an error if the group was not in the pre-processor. But I can definitely also solve this differently if you prefer.
(In addition I might add that for our work the overlap between performance and independence groups is normally rather small.)
(And it saves reading the observations for the independence only groups)

@lukasbrunner lukasbrunner marked this pull request as ready for review October 15, 2020 09:44
Copy link
Contributor

@ruthlorenz ruthlorenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested this and the recipes fail when they should and work fine when everything is correct. But I cannot judge if having two loops over the variable_groups has any unforeseen implications.

@lukasbrunner
Copy link
Contributor Author

Hey @Peter9192 !
I know you are probably still busy but if you have a couple of minutes to look at this it would be great. Maybe this pull request is actually easier to get through as it does not introduce any additional package? Also @ruthlorenz needs the functionality introduced here for some thing else.

@Peter9192
Copy link
Contributor

Hey @lukasbrunner I was actually waiting with this one until #1864 was merged, to avoid any conflicts. But if it's completely independent, I can have a look now.

Copy link
Contributor

@Peter9192 Peter9192 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @lukasbrunner ,
Personally I would like it if we calculate performance and independence for all variable groups, and then only apply the contributions weighting later, when we accumulate all variable groups. I can even imagine that you would change the relative contributions based on the intermediate results.

That said, I find your points between brackets quite convincing, so if you prefer to do it like this, that's fine with me. I tested the code and it seems to work fine. You can ignore my comments about the splitted loop then.

esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/recipes/recipe_climwip.yml Show resolved Hide resolved
@lukasbrunner
Copy link
Contributor Author

Hey @Peter9192 Thanks for the review!! I think I addressed all your comments. I was a bit unsure how exactly to handle the complete skipping of either performance or independence weighting. For now I allow either not setting it at all or setting it to empty. I'm not sure it this creates any unforeseen problems but to me it seemed most intuitive like this. But I'm happy for any feedback.

@Peter9192
Copy link
Contributor

@lukasbrunner thanks for the changes. Before I have another look, could you please merge master into this branch? Now that #1864 is merged, we should solve any merge conflicts that might occur.

Copy link
Contributor

@Peter9192 Peter9192 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @lukasbrunner ,
I think your approach is fine, and it works as far as I tested. Since the code changed quite a bit, the review has become bigger than I had expected. But it's considerably improved from the last round, so let's see if we can improve it still a bit more.

First, I left a couple of small suggestions. But then I came up with an idea and added some more suggestions. The point I'm trying to address is that, in the main function, you now have a couple of if statements that all use different criteria, and you loop over different lists. I think it would be nicer to have a single 'flag' or 'control variable' for those things, and I think you can use the performance_contributions and independence_contributions for that. I hope my idea is clear from the suggestions. If not, let me know.

PS I see the order of my comments is a bit messed up in the conversation view. Suggest to use the review tab to check them.

@@ -7,6 +7,7 @@
import os
from collections import defaultdict
from datetime import datetime
from typing import Union
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add this as a direct dependency if we import it explicitly? @bouweandela
Also not sure if we want to go down this road with type hints. Any thoughts on this @bouweandela or @stefsmeets ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if it is over the top... Although it IS nice to know that None explicitly is allowed. I'm fine with it either way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's up to @lukasbrunner, I'm fine with whatever works for you. The typing module is part of the standard library, so there is no need to add it as a dependency. I would recommend running a test with mypy to make sure that you got all the types right.

esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
lukasbrunner and others added 3 commits November 4, 2020 18:23
Co-authored-by: Peter Kalverla <peter.kalverla@gmx.com>
Co-authored-by: Peter Kalverla <peter.kalverla@gmx.com>
@lukasbrunner
Copy link
Contributor Author

Thanks again @Peter9192 ! I find these suggestions super helpful to improve my coding style, highly appreciated! I hope I did not overlook anything I could not accept all you your suggestion directly

Copy link
Contributor

@Peter9192 Peter9192 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @lukasbrunner thanks again, I think you got everything.
1 small suggestion, and 1 more idea (only if you still feel like it):

Would it make sense to parse the sigmas together with the contributions? Then you catch the assertionerror from line 401 earlier in the diagnostic. Also it makes it easier for the reader, because these parsers let them view all the requirements for the recipe settings in one place (especially since these two are tightly connected).

esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
esmvaltool/diag_scripts/weighting/climwip.py Outdated Show resolved Hide resolved
lukasbrunner and others added 3 commits November 5, 2020 12:14
Co-authored-by: Peter Kalverla <peter.kalverla@gmx.com>
Co-authored-by: Peter Kalverla <peter.kalverla@gmx.com>
@lukasbrunner
Copy link
Contributor Author

Nice suggestion, thx!

@Peter9192
Copy link
Contributor

Okay cool, thanks for accepting all my changes. @bouweandela can you please do the bot magic again? I think I'm not allowed to.. or am I?

@Peter9192
Copy link
Contributor

@esmvalbot recipe_climwip.yml

@lukasbrunner
Copy link
Contributor Author

Okay one more quick fix. If one does not want to use performance weighting it is now possible to:

  1. set all contributions to zero
  2. not set any contributions but still keep the performance_contributions parameter (commenting out the last three lines below)
  3. not set the performance_contributions parameter at all (commenting out all the lines below)
    performance_contributions:
    tas_CLIM: 1
    pr_CLIM: 2
    psl_CLIM: 1

This Is this correct or should one of these cases not be allowed?

In addition, if either of 1)-3) is true this line can be either commented out or left without a value:

performance_sigma: 0.5

@bouweandela
Copy link
Member

@esmvalbot recipe_climwip.yml

@esmvalbot
Copy link

esmvalbot bot commented Nov 6, 2020

Starting recipe recipe_climwip.yml, output will be generated here

@esmvalbot
Copy link

esmvalbot bot commented Nov 6, 2020

Recipe recipe_climwip.yml ran OK, output has been generated here

@Peter9192
Copy link
Contributor

We're on a roll! 💪

@Peter9192
Copy link
Contributor

@bouweandela ping 🔔

@bouweandela bouweandela merged commit 5f0d04a into master Nov 10, 2020
@bouweandela bouweandela deleted the climwip_variable_groups_update branch November 10, 2020 11:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
diagnostic EUCP www.eucp-project.eu
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow different relative contributions of variable groups in ClimWIP
4 participants