Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add recipe and diagnostics for Schlund et al., JGR: Biogeosciences, 2020 #1860

Merged
merged 21 commits into from Oct 19, 2020

Conversation

schlunma
Copy link
Contributor

@schlunma schlunma commented Oct 14, 2020

This PR adds the actual diagnostics and recipes necessary for the analysis of Schlund et al., JGR: Biogeosciences, 2020.

I'm terribly sorry for this monster PR - I started the development a long time ago and could only upload it to the public repository when my paper was accepted (a week ago, unfortunately it has not been published online yet). Since I used this exact code to do the analysis I don't think that a scientific review is necessary. I also do not ask you to do a full review since these are way to many lines of code. Maybe you can just have a look at the documentation and run the recipes?

Since I was not able to include the correct citation in this PR (no DOI available yet) I will open an issue to remind myself of doing that. It would be really nice to include this in v2.1 so that the code availability section of the paper can be adapted accordingly.

Thanks for your patience 馃槃


Tasks

  • Create an issue to discuss what you are going to do, if you haven't done so already (and add the link at the bottom)
  • Give this pull request a descriptive title that can be used as a one line summary in a changelog
  • Make sure your code is composed of functions of no more than 50 lines and uses meaningful names for variables
  • Circle/CI tests pass. Status can be seen below your pull request. If the tests are failing, click the link to find out why.
  • Preferably Codacy code quality checks pass, however a few remaining hard to solve Codacy issues are still acceptable. Status can be seen below your pull request. If there is an error, click the link to find out why. If you suspect Codacy may be wrong, please ask by commenting.
  • Please use yamllint to check that your YAML files do not contain mistakes
  • (Only if really necessary) Add any additional dependencies needed for the diagnostic script to setup.py, esmvaltool/install/R/r_requirements.txt or esmvaltool/install/Julia/Project.toml (depending on the language of your script) and also to package/meta.yaml for conda dependencies (includes Python and others, but not R/Julia)
  • If new dependencies are introduced, check that the license is compatible with Apache2.0

New recipe/diagnostic

  • Add documentation for the recipe to the doc/sphinx/source/recipes folder and add a new entry to index.rst
  • Add provenance information

If you need help with any of the tasks above, please do not hesitate to ask by commenting in the issue or pull request.


Closes #1844.

@valeriupredoi
Copy link
Contributor

@schlunma 16k added lines and 14 deletions - phew, am relieved, 14 deletions only - piece of cake to review 馃榿 Dude, can you maybe break this down into smaller chunks - there are a lot of documentation (rst) files that can definitely go in a separate PR, together with the png pictures; we can deal with the python code in one PR I am guessing, albeit a big PR, but at least someone else can look at the documentation (maybe someone who actually knows the science?). Cheers 馃嵑

@schlunma
Copy link
Contributor Author

Cheers V., thanks for taking a look into this! As I said, I'm really really sorry for this monster PR!! Unfortunately, separating the doc from the actual code is not possible since most of the doc is created using .. automodule::.

As I already said, I really don't think that a scientific review is necessary since this code went basically through full peer review!

(P.S. the failing test is probably not related to this PR: Exited with code exit status 137. Do we have an issue for this?)

@valeriupredoi
Copy link
Contributor

valeriupredoi commented Oct 14, 2020

aah right, completely forgot about the review from the paper (congrats on the paper, bro!). OK I will start the code review tomorrow, I can do this all - can take the burden off @bouweandela since he's reviewed a lot elsewhere so far anyway. Exited with code exit status 137 is the CI machine running out of memory since the free account is limited to 4G only - no worries about it, try triggering the Github Action build test on push on this PR if you are afraid the conda build will not work due to some dependency that's introduced here, if you aren't introducing any new dependencies then it's not needed 馃嵑

@schlunma schlunma removed the request for review from bouweandela October 14, 2020 19:22
@valeriupredoi
Copy link
Contributor

OK I promise I will start revieweing tomorrow (really can't start it now at 5pm 馃榿 ) but before I do that - so we make a smooth transition to iris3 - have a look at ESMValGroup/ESMValCore#819 and specifically at https://scitools-iris.readthedocs.io/en/v3.0.0rc0/whatsnew/3.0.html#features (stuff you've already reviewed anyway) and was wondering if there's anything in all this code that may be an issue when switching to iris3. And I promise I'm not stalling the review 馃榿

@axel-lauer
Copy link
Contributor

Tests of recipe_schlund20jgr_gpp_abs_rcp85.yml are still running, but the ones of recipe_schlund20jgr_gpp_change_1pct.yml and recipe_schlund20jgr_gpp_change_rcp85.yml finished successfully. Here a few things I noticed so far:

  • The image files in doc/sphinx/source/recipes/figures/schlund20jgr/ have a quite high resolution and large file size. I think it would be good to reduce that a little bit so the pdf of the user's guide does not get unnecessarily big.
  • Within the output directory "plot", quite some subdirectories named "preprocess*" are created. Those are all empty. Is that on purpose?
  • The two recipes tested so far do not provide any provenance output for the plots, only for the nc files in work.

@schlunma
Copy link
Contributor Author

Thanks for testing Axel!

* The image files in doc/sphinx/source/recipes/figures/schlund20jgr/ have a quite high resolution and large file size. I think it would be good to reduce that a little bit so the pdf of the user's guide does not get unnecessarily big.

I will re-upload them with reduced size.

* Within the output directory "plot", quite some subdirectories named "preprocess*" are created. Those are all empty. Is that on purpose?

That is automatically done by ESMValTool (which creates the subdirectories plot, run, and work for every diagnostic). Since the preprocess diagnostic does not write plots, the plot directory stays empty.

* The two recipes tested so far do not provide any provenance output for the plots, only for the nc files in work.

For every plot file I created an .nc file. The provenance output is attached to that (the plot is attached to that with attribute plot_file). When I wrote these scripts this was the "official" way to do this (at least it was done like this by the example python diagnostic), but this is also changing now (see discussion in #1827). I really don't want to change that since it took me 2 days to add the provenance tracking to all diagnostics in the first place...

@schlunma
Copy link
Contributor Author

OK I promise I will start revieweing tomorrow (really can't start it now at 5pm grin ) but before I do that - so we make a smooth transition to iris3 - have a look at ESMValGroup/ESMValCore#819 and specifically at https://scitools-iris.readthedocs.io/en/v3.0.0rc0/whatsnew/3.0.html#features (stuff you've already reviewed anyway) and was wondering if there's anything in all this code that may be an issue when switching to iris3. And I promise I'm not stalling the review grin

Thanks V.! I just checked the link; I'm not using any of the functions listed in the incompatible changes/deprecations. So (in theory) the code should run in iris3 馃槃

@bouweandela
Copy link
Member

I will re-upload them with reduced size.

It might be nice to rebase, because then the commits containing the large files will not end up in the master branch, so they will be garbage collected at some point and the repo will stay small.

Copy link
Contributor

@axel-lauer axel-lauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last test just finished, looks all fine to me.

Copy link
Contributor

@valeriupredoi valeriupredoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stopped right before mmm.py - will resume later 馃憤

esmvaltool/diag_scripts/mlr/__init__.py Show resolved Hide resolved
esmvaltool/diag_scripts/mlr/__init__.py Show resolved Hide resolved
esmvaltool/diag_scripts/mlr/__init__.py Show resolved Hide resolved
esmvaltool/diag_scripts/mlr/__init__.py Show resolved Hide resolved
esmvaltool/diag_scripts/mlr/__init__.py Show resolved Hide resolved
esmvaltool/diag_scripts/mlr/__init__.py Show resolved Hide resolved
esmvaltool/diag_scripts/mlr/__init__.py Show resolved Hide resolved
esmvaltool/diag_scripts/mlr/custom_sklearn.py Show resolved Hide resolved
esmvaltool/diag_scripts/mlr/custom_sklearn.py Show resolved Hide resolved
esmvaltool/diag_scripts/mlr/main.py Show resolved Hide resolved
@bouweandela
Copy link
Member

bouweandela commented Oct 16, 2020

You might want to have a closer look at the documentation by clicking the 'Details' link behind the readthedocs build below or building the documentation locally. I noticed several problems, especially in the API documentation:

  • links are not correctly formatted in several places
  • the data type seems to be concatenated to the name of the setting that can be used in the diagnostic section of the recipe
  • there are many functions documented that may be of little interest to the user (and the documentation is hardly informative), for example here, the recipe options are useful, but the rest (documentation of classes and functions in the module) does not really help much. I do not see anyone re-using this as a Python module based on the information provided here.
  • maybe you need to pin autodocsumm < 1.2, because I can see that the documentation is suffering from this issue: Regression bug from 0.1.13 to 0.2.x ?聽Chilipp/autodocsumm#33

@schlunma

This comment has been minimized.

@bouweandela

This comment has been minimized.

@schlunma

This comment has been minimized.

@bouweandela

This comment has been minimized.

@bouweandela

This comment has been minimized.

@schlunma

This comment has been minimized.

@schlunma

This comment has been minimized.

@bouweandela

This comment has been minimized.

@schlunma
Copy link
Contributor Author

schlunma commented Oct 19, 2020

Alright, I made some more progress:

@bouweandela I fixed all issues now by pinning autodocsum to <0.2.0 and removing spaces between colons to fix the formatting of the configuration options.

@valeriupredoi I added the comment about the custom sklearn functions, see here.

Any more comments from your side?

Copy link
Contributor

@valeriupredoi valeriupredoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me and helluva lot of work, well done @schlunma

@bouweandela
Copy link
Member

Thanks, the documentation looks much better now, though it's still not exactly an easy read.

Apart from the minor comment that the provenance is still not quite right, I think this can be merged soon. I'm not going to look at the code since @valeriupredoi already did that.

I'm running the conda build locally now, just to make sure it's still working.

@bouweandela bouweandela merged commit e61e5cb into master Oct 19, 2020
5 of 7 checks passed
@bouweandela bouweandela deleted the add_schlund20jgr branch October 19, 2020 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add recipe for Schlund et al., JGR: Biogeosciences, 2020
4 participants