Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic support for variable mappings #1124

Merged
merged 50 commits into from Jun 9, 2021

Conversation

zklaus
Copy link
Contributor

@zklaus zklaus commented May 12, 2021

Description

Closes #1157

Link to documentation: https://esmvaltool--1124.org.readthedocs.build/projects/ESMValCore/en/1124/develop/extra_facets.html#


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number pull requests:

esmvalcore/_recipe.py Outdated Show resolved Hide resolved
esmvalcore/_recipe.py Outdated Show resolved Hide resolved
@zklaus
Copy link
Contributor Author

zklaus commented May 18, 2021

I think this is functionally ready. It is still missing documentation and tests, plus the checks fail.

On the failing checks, I think all of them are spurious. In detail:

Codacy complains about two things, namely that _get_variable_details_for_project is too long a name, and that I disable one error locally. The local disabling is the right thing to do because it is complaining about a missing import precisely where I am testing for the right python version to allow the import.

The commit checks fail as follows:

  • conda-build exceeds time limit
  • documentation: success
  • test: The tests succeed, but the uploading of the coverage fails due to missing CODACY_PROJECT_TOKEN, presumably because this branch lives in my fork.
  • install and develop: these fail due to what looks like a typo in the test for a cmip5 fix:
>      self.assertEqual(dates[0].strftime('%Y%m%d%H%M'), ' 30001161200')
E       AssertionError: '30001161200' != ' 30001161200'

Questions:
@bouweandela, are you happy with the new naming?
@senesis, I think I addressed all your comments. Sorry for not responding directly to them. Do you think this is ok now?

@bouweandela
Copy link
Member

install and develop: these fail due to what looks like a typo in the test for a cmip5 fix:

This looks like #876. I think the test could be improved so it uses the actual date instead of a string formatted version.

Copy link
Member

@bouweandela bouweandela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @zklaus!

esmvalcore/_config/_config.py Outdated Show resolved Hide resolved
esmvalcore/_config/_config.py Show resolved Hide resolved
esmvalcore/_config/_config.py Outdated Show resolved Hide resolved
esmvalcore/cmor/_fixes/fix.py Outdated Show resolved Hide resolved
@senesis
Copy link
Contributor

senesis commented May 18, 2021

@senesis, I think I addressed all your comments. Sorry for not responding directly to them. Do you think this is ok now?

I inserted one comment as a review (see above).

I also suggested (maybe in your private repository's branch) to make a change in cmor/_fixes/fix.py, to write

    def get_cube_from_list(self, cubes, short_name=None, mapping_key=None):
    ....
        if short_name is None:
            short_name = self.vardef.short_name
        short_name=self.var_mapping.get(mapping_key,short_name)

in order to add a capacity to this function (with backward compatibility), which allows fix method to easily access cubes using project's specific variable name (by providing project's specific key for this variable name)

Copy link
Contributor

@senesis senesis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's look fine, except for the review comment below and maybe an extended version of get_cube_from_list (see other comment)

I was able to run a basic test with a recipe using simultaneously IPSL data in its two formats (mono-var and multi-var files) by defining two projects for that. For the time being, I cannot use project native6 for such a simultaneous use of two data formats (as long as the DRS cannot be chosen on a dataset instance basis)

esmvalcore/_config/_config.py Outdated Show resolved Hide resolved
@senesis senesis linked an issue May 22, 2021 that may be closed by this pull request
@zklaus zklaus added this to the v2.3.0 milestone May 23, 2021
@senesis senesis linked an issue May 24, 2021 that may be closed by this pull request
@senesis senesis mentioned this pull request May 31, 2021
10 tasks
@zklaus
Copy link
Contributor Author

zklaus commented Jun 3, 2021

@stefsmeets, to make the tests work, I had to add validation for a new option in config_user.yml to the experimental config stuff. Apart from the simple addition, which permits essentially a single path or a list of paths, I needed to move from list to tuple in order to make the result hashable, which I need since it is used in an lru_cached function as an argument. So I added a _tuplify_validator alongside the _listify_validator. Could you please have a look at that and let me know what you think? Perhaps it's generally a good idea to move to tuples for these options?

@zklaus
Copy link
Contributor Author

zklaus commented Jun 3, 2021

@senesis, I have added your suggestion for a mapping_key to the get_cube_from_list method. I have also implemented a config option for the extra_facets_dir, see discussion above.

Do you have any other comments?

@zklaus zklaus requested a review from jvegreg June 3, 2021 13:54
@senesis
Copy link
Contributor

senesis commented Jun 3, 2021

Do you have any other comments?

@zklaus No , that's fine, thanks. I would like to update PR #1143 for this changes, so let me know when you're fully done, and also if you know the best way for updating my PR : locally rebasing my branch on yours and then pushing ?

@stefsmeets
Copy link
Contributor

stefsmeets commented Jun 4, 2021

So I added a _tuplify_validator alongside the _listify_validator. Could you please have a look at that and let me know what you think?

Perhaps it is not necessary to copy the entire function, see my comment for a possible different approach! Can you also add some tests for the new function?

Perhaps it's generally a good idea to move to tuples for these options?

That depends on what is needed from the config. yaml arrays are loaded as lists in python. If they are needed as tuples in the code, then that could be changed in this part of the code using your validator.

@jvegreg
Copy link
Contributor

jvegreg commented Jun 4, 2021

Just a concept question. ¿Why you decided to keep the extra_facets separated instead of updating the var_info with them if not present? That's the way we do with things like institutes and activities for CMIP and has the advantage keeping most of the functions completely agnostic of the change

@jvegreg
Copy link
Contributor

jvegreg commented Jun 4, 2021

Just a concept question. ¿Why you decided to keep the extra_facets separated instead of updating the var_info with them if not present? That's the way we do with things like institutes and activities for CMIP and has the advantage keeping most of the functions completely agnostic of the change

I just see that this is indeed what you are doing, but then why have the extra parameter for the fix function?

@zklaus
Copy link
Contributor Author

zklaus commented Jun 4, 2021

@jvegasbsc, thanks for looking at this. I'm sorry, but I am not sure I understand. I see no var_info around in the code that we are talking about, but let me ask it this way around: If we don't pass extra_facets, how does a fix get access to this information in, say, fix_metadata?

Copy link
Member

@bouweandela bouweandela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quickly scanned the PR now, just two more comments on the documentation. I noticed that several of the tests are failing, but that looks unrelated to this pull requests. Trying to fix that in the fix-tests branch now..

doc/develop/fixing_data.rst Outdated Show resolved Hide resolved
doc/quickstart/find_data.rst Outdated Show resolved Hide resolved
@zklaus zklaus added the enhancement New feature or request label Jun 9, 2021
@bouweandela
Copy link
Member

Ideally all tests would need to be green before merging this.

@zklaus
Copy link
Contributor Author

zklaus commented Jun 9, 2021

Assuming the fixes coming in from main took care of the circleci stuff (as yet to be seen), this leaves us with a couple of codacy issues. They are:

  • locally disabling no-name-in-module. This should be ignored because this disabling is inside a guard that distinguished between the built-in module that's available from Python 3.9 onwards and the otherwise installed version.
  • the suggestion to re-raise using from. I suggest to ignore this as well since raise ... from is a new Python feature and this is in a part of the code in which this PR only improves the formatting.
  • A small number of formatting complaints in docstrings, that come about because of conflicting formatting standards in local and ci configuration (I think). Can also be ignored imho.

@zklaus zklaus merged commit 1235fb5 into ESMValGroup:main Jun 9, 2021
zklaus added a commit that referenced this pull request Jun 11, 2021
* Add basic support for variable mappings

* Add first era5 mapping

* Find files for CMIP6 DCPP startdates (#771)

* First attempte

* Do not require start and end years, add them later

* Correct condition

* Avoid key error in fx variables

* Consider two possible paths

* Fix function name

* Fix variable name

* Avoid duplicates in filename

* Add test for startdate expansion

* Add test for the replace tags method

* Rename tag

* Add documentation

* Allow to load subexps per timerange or as a whole

* Fix condition

* Remove 'all_years' functionality

* Fix conditions

* Fix flake

* Remove whitespace

Co-authored-by: Javier Vegas-Regidor <javier.vegas@bsc.es>

* Skip regridding if the target grid is almost identical to the source grid (#507)

Co-authored-by: Bouwe Andela <b.andela@esciencecenter.nl>
Co-authored-by: Stef Smeets <s.smeets@esciencecenter.nl>

* Fixes for sos and siconc of BCC models (#1090)

* sos and siconc fixed

* tests added

* test fixed

* fix flake8

* fix flake8

* fix codacy issue

* Pin cf-units and fix tests (cf-units>=2.1.5) (#1140)

* pin cf-units

* pin cf-units

* fix test

* fix test

* Handle IPSL-CM6 (the feature won't actually work without #1124)

* class Huss inherits from cass Tas. Also : Fix codacy diags.

* Replace os.system() by subprocess.run()

* Fix flake8 diags

* var_mapping -> extra_facets

* Rename _config/variable_details to _config/extra_facets

* Fix doc re. lack of 'output_file as a dict', and choice of native6

* Fix codacy diags in ipsl_cm6.py

* Use project IPSLCM to handle IPSL-CM6

* Implement changes according to Bouwe's review, 2021/06/07 (except unit tests)

* Add unit tests for _fixes/ipslcm/ipsl_cm6.py

* delete esmvalcore/cmor/_fixes/native6/ipsl_cm6.py

* Delete old file esmvalcore/_config/extra_facets/native6-ipsl-cm6-mappings.yml

* Restore main versions for _recipe.py and cmor_fixes/fix.py

* Restore main version for _recipe.py

* Delete extraneous era5-mappings.yml

* Avoid using mapping_key when calling fix.get_cube_from_list()

* Empty change in fix.py for forcing codacy to re-scan it

* Polish doc

* Polish doc again

* Again...

* and again ...

* Fix typo in comment

* Fixes according to @zklaus review

* Reduce formatting changes

* Update doc/develop/fixing_data.rst

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>

* Update doc/develop/fixing_data.rst

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>

* Update doc/develop/fixing_data.rst

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>

* Update doc/develop/fixing_data.rst

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>

* Update doc/develop/fixing_data.rst

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>

* Update doc/develop/fixing_data.rst

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>

* Update doc/develop/fixing_data.rst

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>

* Update doc/quickstart/find_data.rst

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>

* Update doc/quickstart/find_data.rst

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>

* Minor formatting improvements

* Organize mapping file in each realm in two sections (CMIP6 and IPSL)

Co-authored-by: Klaus Zimmermann <klaus.zimmermann@smhi.se>
Co-authored-by: sloosvel <45196700+sloosvel@users.noreply.github.com>
Co-authored-by: Javier Vegas-Regidor <javier.vegas@bsc.es>
Co-authored-by: Benjamin Müller <b.mueller@iggf.geo.uni-muenchen.de>
Co-authored-by: Bouwe Andela <b.andela@esciencecenter.nl>
Co-authored-by: Stef Smeets <s.smeets@esciencecenter.nl>
Co-authored-by: Rémi Kazeroni <70641264+remi-kazeroni@users.noreply.github.com>
Co-authored-by: Valeriu Predoi <valeriu.predoi@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support extra information for additional datasets Variable 'derive' flag should be settable at project level
5 participants