Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix provenance of NCL figures created using the log_provenance function #2279

Merged
merged 1 commit into from
Sep 7, 2021

Conversation

bouweandela
Copy link
Member

@bouweandela bouweandela commented Sep 2, 2021

Description

Logging provenance using the plot_file entry of the corresponding NetCDF file is a buggy feature of ESMValCore. It is also not supported by the functionality that creates the index.html page. This pull request improves the NCL log_provenance function so it writes a separate entry for image files and they are nice shown in the resulting webpage.


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the πŸ›  Technical or πŸ§ͺ Scientific review.


To help with the number of pull requests:

@schlunma
Copy link
Contributor

schlunma commented Sep 2, 2021

General question on this before the review to make sure we are on the same page. What do you mean by

Logging provenance using the plot_file entry of the corresponding NetCDF file is a buggy feature of ESMValCore.

? I always use the attribute plot_file in my (python) diagnostics as it was recommended by an early version of the python example diagnostic. In addition, also the documentation advertises the use of plot_file. I'm also pretty sure that the intended way (at least in earlier times) was to always write a netCDF file for a plot and add the provenance to this. So what is the desired way to implement provenance?

  • Write it to the .nc file and add the corresponding plot with plot_file, i.e.
provenance_logger['plot_file`] = 'path_to_plot.png'
with ProvenanceLogger(cfg) as provenance_logger:
      provenance_logger.log('path_to_dataset.nc', provenance_record)
  • Write it to the .png file, i.e.
with ProvenanceLogger(cfg) as provenance_logger:
      provenance_logger.log('path_to_plot.png', provenance_record)
  • Write it to both separately
with ProvenanceLogger(cfg) as provenance_logger:
      provenance_logger.log('path_to_dataset.nc', provenance_record)
with ProvenanceLogger(cfg) as provenance_logger:
      provenance_logger.log('path_to_plot.png', provenance_record)

?

@bouweandela
Copy link
Member Author

bouweandela commented Sep 6, 2021

Good questions @schlunma and my apologies for the confusion. The only working way of writing provenance at the moment is the last option you describe: log the provenance for every file separately. The documentation that describes that lives here: https://docs.esmvaltool.org/en/latest/community/diagnostic.html#provenance-items-provided-by-the-diagnostic-script. Unfortunately, it looks like I forgot to update the documentation that you linked, I'll update that too.

Indeed using plot_file as an attribute was the recommended way to do it until I changed it about a year ago in #1827, but did you ever notice that it does not work? The only thing it does is embed the provenance of the associated NetCDF file in the .png, provided that you happen to select .png as an output format. If you select another output format, e.g. .pdf, no provenance is available at all for the figure. Because embedding provenance is giving our users problems, I'm considering removing the embedded provenance feature completely and only writing the provenance to the _provenance.xml file: ESMValGroup/ESMValCore#1148.

@bouweandela
Copy link
Member Author

I created a pull request with updated documentation here: ESMValGroup/ESMValCore#1305

Copy link
Contributor

@schlunma schlunma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good and (two) test recipes ran successfully πŸŽ‰

Before this change, NCL diagnostics only output provenance for the nc files; now, they also output provenance information for the plot files without any change in the diagnostics themselves. Great job!

One question: Would it be possible to implement a similar feature for the python provenance logging to support the "deprecated" interface? I think this would be really helpful. There are so many diagnostic that use this old version (at least 10+ diagnostics that I have written) and I guess most developers (e.g., I) do not have the time to overhaul these diagnostics.

@bouweandela
Copy link
Member Author

I'll have a look at the Python interface and see if it's easy to fit something in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants