Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added recipe filler. #1707

Merged
merged 101 commits into from
Mar 16, 2021
Merged

Added recipe filler. #1707

merged 101 commits into from
Mar 16, 2021

Conversation

ledm
Copy link
Contributor

@ledm ledm commented Jul 9, 2020

Added a script that automates filling out the recipe with available data, it's called recipe_filler.py, in utils.

You write your diagnostic in the unfilled recipe like this:

    thetao_global: # Temperature ocean 3D
        short_name: thetao
        preprocessor: prep_global
        mip: Omon
        exp: historical
        project: CMIP6
        dataset: SAM0-UNICON
        ensemble: r1i1p1f1
        start_year: 1995
        end_year: 1999

and it will add an additional_diagnostics section:

    thetao_global: # Temperature ocean 3D
        additional_datasets:
          - {dataset: SAM0-UNICON, project: CMIP6, exp: historical, mip: Omon, ensemble: r1i1p1f1, grid: gn, start_year: 1995, end_year: 1999, }
        short_name: thetao
        preprocessor: prep_global
        exp: historical
        project: CMIP6
        dataset: SAM0-UNICON
        ensemble: r1i1p1f1
        start_year: 1995
        end_year: 1999

So, it works okay. This example is fairly trivial. The more specific your list of requirements, the faster it runs and the fewer datasets it requires. If you wanted to see all datasets, just remove this line from your unfilled recipe:

        dataset: SAM0-UNICON

The problems with this tool:

  • The additional_datasets is added at the top of the preprocessor, which is ugly.
    @valeriupredoi fixed this: it is now added under each variable using correct yaml syntax
  • you still need to edit the recipe to remove the repeated lines (ie dataset, ensemble, exp in the example above).
  • it doesn't understand yml anchors.
    @valeriupredoi fixed this by using yaml dump and save appropriately
  • you most likely only want CMIP6 data from one grid, so there's no way to have a preference.
    @valeriupredoi fixed this - you can ask for the mother of all grids now
  • Hard wired path to the configurtation file ./config-user.yml.
    @valeriupredoi fixed this too: you can input your own file or you can not and the default ~/.esmvaltoon/config-user.yml will be used instead
  • It doesn't check whether the requested years are available.
    @valeriupredoi fixed this too, start_year and end_year are mandatory fields that will allow file filtering
  • I added a hashable dictionary class and made a dictionary where the index was another dictionary. lol. very cool and very xzibit/pimp my ride. This was from before christmas 2019, so who knows why I did that!
    @valeriupredoi has totally unpimped your ride because it's not needed 馃榿

Before you start, please read our contribution guidelines.

Please discuss your idea with the development team before getting started, to avoid disappointment later. The way to do this is to open a new issue on GitHub. If you are planning to modify an existing functionality, please discuss it with the original author(s) by tagging them in the issue.


Tasks

  • Create an issue to discuss what you are going to do, if you haven't done so already (and add the link at the bottom)
  • Give this pull request a descriptive title that can be used as a one line summary in a changelog
  • Make sure your code is composed of functions of no more than 50 lines and uses meaningful names for variables
  • Circle/CI tests pass. Status can be seen below your pull request. If the tests are failing, click the link to find out why.
  • Preferably Codacy code quality checks pass, however a few remaining hard to solve Codacy issues are still acceptable. Status can be seen below your pull request. If there is an error, click the link to find out why. If you suspect Codacy may be wrong, please ask by commenting.
  • Please use yamllint to check that your YAML files do not contain mistakes
  • (Only if really necessary) Add any additional dependencies needed for the diagnostic script to setup.py, esmvaltool/install/R/r_requirements.txt or esmvaltool/install/Julia/Project.toml (depending on the language of your script) and also to package/meta.yaml for conda dependencies (includes Python and others, but not R/Julia)
  • If new dependencies are introduced, check that the license is compatible with Apache2.0

If you need help with any of the tasks above, please do not hesitate to ask by commenting in the issue or pull request.


Closes #671
Closes #1170

EDIT by @valeriupredoi on 28/09/2020

Cheers @ledm for starting this and for the code you wrote! I have taken your draft and rewritten most of it, but the core functionality remains the same; I have fixed a number of outstanding issues which you nicely pointed out in this message and edited the message accordingly. I think it's in an okay shape now, although it still needs testing in anger and actual unit tests for each of the functions plus most of the functions will have to become private (for I am lazy and I really don't want to write a lot of api documentation 馃榿 ). Can I please ask you and @SarahAlidoost @bouweandela @jvegasbsc @nielsdrost @hb326 @mattiarighi and anyone else who wants to test this and report any issues they encounter, please 馃嵑 I will then fix the issues, write more documentation and tests and then, Bouwe-willing, we can merge :)

Copy link
Contributor

@SarahAlidoost SarahAlidoost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ledm thank you for this utility. I had a look at the script, pull request checklist, and its related issues. It seems that this pull request is still a work in progress. Could you please either fix the remaining issues or change it to a draft pull request? Also, to review this task, I run the script on recipe_python using the following diagnostics settings:

pr:
        preprocessor: preprocessor1
        mip: Amon
        start_year: 2000
        end_year: 2002
        dataset: CanESM2
        exp: historical
        ensemble: r1i1p1
        project: CMIP5

It results in an error TypeError: sequence item 0: expected str instance, list found related to the function get_rootpath(). Because there is a list of CMIP5: [~/cmip5_inputpath1, ~/cmip5_inputpath2] in rootpath section in my config_user.yml file. This needs to be fixed somehow.

@SarahAlidoost
Copy link
Contributor

@ledm I am wondering if you have had a chance to look at my review comment? It would be great if we could update the status of the pull request. Thanks.

@ledm
Copy link
Contributor Author

ledm commented Sep 3, 2020

Hi Sarah, hope you're well! I won't really have time to work on this for a while, but I believe that @valeriupredoi talked about taking over this utility. Guess he's pretty busy too though!

@valeriupredoi
Copy link
Contributor

I am a bit less busy these days so I can defo have a stab at this next week 馃憤

@valeriupredoi
Copy link
Contributor

OK I started working on this, more tomorrow 馃憤 馃嵑

@valeriupredoi valeriupredoi marked this pull request as draft September 8, 2020 16:45
@valeriupredoi
Copy link
Contributor

Maybe the situation will improve once #1934 is merged, because then the image we use for ci/circleci: test job will be more up to date (it's now based on a 1 month old version of the master branch).

yeh man am liking what I'm seeing in #1934 馃憤

@valeriupredoi
Copy link
Contributor

valeriupredoi commented Feb 18, 2021

@bouweandela @stefsmeets I would very much like to merge this asap (today or tomorrow) since I would be very happy to include this in the release: Stef, I took care of your suggestions - the ones that dealt with possible changes (like the string formatting etc) and I staled your review to buy you time to just approve this, Bouwe - I got a lot of feedback from both Sarah and Ruth and they both tested it heavily. This is a very useful bit of tool for users, so that's why it'd be very nice to have it released. The only snag is that ruamel is not in the test environment, still waiting on develop to finish when I am writing this but it should be fine 馃憤

@bouweandela
Copy link
Member

@valeriupredoi Can you have a look at the failing tests? I noticed you forgot to add the ruamel.yaml package as a dependency to to setup.py, maybe that will already solve some of the issues? The private imports are causing the other failing tests.

@bouweandela
Copy link
Member

It would be really great if we could get this merged soon so diagnostic developers can start using it, but maybe it's not needed for the release, because diagnostic developers will almost always use a development installation of ESMValTool.

@valeriupredoi
Copy link
Contributor

It would be really great if we could get this merged soon so diagnostic developers can start using it, but maybe it's not needed for the release, because diagnostic developers will almost always use a development installation of ESMValTool.

sounds good, lemme fix the private imports and the tests, then I'll merge and leave it for the next release, there may be bugs to fix in it anyway, well- not bugs - but lack of features, that users may unearth by using it in anger

@valeriupredoi
Copy link
Contributor

@valeriupredoi Can you have a look at the failing tests? I noticed you forgot to add the ruamel.yaml package as a dependency to to setup.py, maybe that will already solve some of the issues? The private imports are causing the other failing tests.

OK @bouweandela man, I fixed all tests, migrated code from private esmvalcore to here, and the env solves well both on my machine and the CI one, the only caveat is that when the new config module will get in the next core release the config developer file path needs to change here, but we'll bridge that bridge when we get to it. You wouldn't object too much about merging would you? 馃嵑

@bouweandela
Copy link
Member

You wouldn't object too much about merging would you? beer

That would be fine with me, but could you please also fix the other private import?

@bouweandela
Copy link
Member

Thanks for fixing the private imports @valeriupredoi. It would be fine with me to merge this now.

@valeriupredoi
Copy link
Contributor

terrific, cheers muchly @bouweandela and the other hard working reviewers, finalmente this one's in 馃榿

@valeriupredoi valeriupredoi merged commit 1bf77b2 into master Mar 16, 2021
November 2020 automation moved this from In Review to Merged Mar 16, 2021
@valeriupredoi valeriupredoi deleted the dev_recipe_filler_671 branch March 16, 2021 10:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

New tool to fill out recipes More flexibility in recipe datasets
8 participants