Enhancement/yaml diagnostics #227
Merged
mnlevy1981
merged 40 commits into
marbl-ecosys:master
from
mnlevy1981:enhancement/YAML-diagnostics
Feb 13, 2018
I've put a few diagnostics into default_diagnostics.yaml, but still have lots more to add. I've also converted YAML -> JSON (but still need to write a script to read the JSON file), which required skipping the consistency check in yaml_to_json.py. I want to update yaml_to_json.py to take a list of YAML files and convert them all to JSON (so user will specify a directory for the output but the script will come up with file names on its own), but haven't done that yet.
Still need to finish adding diagnostics to default_diagnostics.yaml and also clearly define the schema being used (it will be less complicated than the schema in default_settings.yaml), but now I can also rough out a script to generate something similar to a tavg contents file.
Descriptions in both yaml_to_json.py and MARBL_generate_settings_file.py were out of date
This is just the shell that will contain a new class in MARBL_tools. It still needs hooks to the default_diagnostics.json file, plus a lot of other stuff...

Tested with:

    # (1) import MARBL tools
    import sys
    sys.path.append($MARBLROOT)
    import MARBL_tools

    # (2) create a settings object
    settings_obj = MARBL_tools.MARBL_settings_class('../autogenerated_src/default_settings.json')

    # (3) create a diagnostics object
    diags_obj = MARBL_tools.MARBL_diagnostics_class('../autogenerated_src/default_diagnostics.json', settings_obj, input_diagnostics_file=[test file])

The format for the input_diagnostics_file will be

    diagnostic_name : frequency

and '#' will be treated as the start of a comment. Comma separated frequencies => output the same variable at multiple frequencies.
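A reader for the input_diagnostics_file format described above might look roughly like this. It is a minimal sketch, not the code in MARBL_tools: the function name `parse_diagnostics_overrides`, the sample diagnostic names, and the frequency strings are all hypothetical, and it assumes multiple frequencies are comma separated.

```python
def parse_diagnostics_overrides(lines):
    """Return {diagnostic_name: [frequency, ...]} from 'name : frequency' lines.

    Hypothetical helper for illustration; not part of MARBL_tools.
    """
    overrides = {}
    for raw_line in lines:
        # Everything after '#' is treated as a comment
        line = raw_line.split('#', 1)[0].strip()
        if not line:
            continue
        name, sep, freqs = line.partition(':')
        if not sep:
            raise ValueError("expected 'diagnostic_name : frequency', got %r" % raw_line)
        # Several frequencies => the same variable is output at each of them
        overrides[name.strip()] = [f.strip() for f in freqs.split(',') if f.strip()]
    return overrides


# Hypothetical file contents (names and frequencies are placeholders)
sample = [
    "# diagnostics I want in this run",
    "DIAG_A : medium",
    "DIAG_B : medium, high  # output at two frequencies",
]
print(parse_diagnostics_overrides(sample))
```

Returning a list even for a single frequency keeps the calling code uniform, since it never has to distinguish one frequency from many.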
Also continued work on MARBL_diagnostics_file_class.py, which now reads frequencies from JSON and creates a dictionary (key is diagnostic name, value is recommended frequency). Still need to add support for reading a text file to change frequency from default in cases where user wants non-standard output.
Unlike generate_settings_file(), which uses --settings_file_in to provide a text file with individual settings file overrides, generate_diagnostics_file() will not allow values from the JSON file to be overridden via text file input. It will be up to the GCM to provide a way for the user to change the diagnostics from the default (CESM will allow users to put marbl_diagnostics into SourceMods/src.pop). Also added some comments to the top of the diagnostic file output.
Want generate_diagnostics_file() to take MARBL_diagnostics_class object as an argument so that it can be called cleanly from the GCM.
no need to have these two fields in default_diagnostics.yaml / json; I think STF_O2 will be removed from MARBL entirely, and FG_ALT_CO2 will be added back in once a flag for "provide _ALT_CO2 fields" exists.
Surface tracer fluxes are available in the driver if the GCM wants to save them; no need to use the marbl diagnostic framework as well.
Allowable operator values are none, average, instantaneous, minimum, and maximum; MARBL_utils:diagnostics_dictionary_is_consistent now actually includes some checks (such as "are frequency and operator valid?"), and the MARBL_generate_diagnostics_file.py script now returns text formatted as DIAGNOSTIC : frequency_operator (where before, the _operator was not included).
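The validity check and the output format from this commit can be sketched as below. This is an illustration only: `format_diagnostic_line` is a hypothetical name, the operator list is the one given in this commit message, and the frequency list is an assumption because the thread does not enumerate the valid frequencies.

```python
# Operators allowed as of this commit message
VALID_OPERATORS = ("none", "average", "instantaneous", "minimum", "maximum")
# ASSUMPTION: only "never" appears in the thread; the other values are placeholders
ASSUMED_FREQUENCIES = ("never", "low", "medium", "high")


def format_diagnostic_line(name, frequency, operator):
    """Validate one diagnostic and return 'DIAGNOSTIC : frequency_operator'."""
    if frequency not in ASSUMED_FREQUENCIES:
        raise ValueError("%s: invalid frequency %r" % (name, frequency))
    if operator not in VALID_OPERATORS:
        raise ValueError("%s: invalid operator %r" % (name, operator))
    return "%s : %s_%s" % (name, frequency, operator)


# DIAG_A is a placeholder diagnostic name
print(format_diagnostic_line("DIAG_A", "medium", "average"))
```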
Script on GCM side will determine whether or not to include them in final lists of diagnostics.
Unless ciso_on = .true. in MARBL settings, the diagnostics associated with carbon isotopes should not be included in the diagnostic file. I added a key to the diagnostic dictionary ("module"), and diagnostics where module = ciso are only included if ciso_on = .true. I added a few ciso surface diagnostics to test this out but have lots more to add in the next commit.
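The filtering described above could be sketched as follows. The dictionary layout, the diagnostic names, and the function name `filter_by_module` are assumptions made for illustration; only the "module" key, the value ciso, and the ciso_on = .true. test come from the commit message.

```python
def filter_by_module(diagnostics, settings):
    """Drop diagnostics whose module is 'ciso' unless ciso_on is '.true.'.

    Hypothetical helper; the real dictionary layout may differ.
    """
    ciso_on = settings.get("ciso_on", ".false.") == ".true."
    return {
        name: meta
        for name, meta in diagnostics.items()
        if meta.get("module") != "ciso" or ciso_on
    }


# Placeholder entries, not real MARBL diagnostics
diagnostics = {
    "DIAG_BASE": {"module": "base", "frequency": "medium"},
    "DIAG_CISO": {"module": "ciso", "frequency": "medium"},
}
print(sorted(filter_by_module(diagnostics, {"ciso_on": ".false."})))
print(sorted(filter_by_module(diagnostics, {"ciso_on": ".true."})))
```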
Setting "operator = none" is confusing, so if frequency = never then operator just has to be any valid operator ("instantaneous", "average", "minimum", or "maximum").
default_settings.json now contains _tracer_list instead of _tracer_cnt; it is a dictionary that is used in conjunction with the rest of settings_dict to determine which tracers are being requested. get_tracer_count now returns the length of the list of all requested tracers (and there is a get_tracer_list routine to provide the tracers themselves as well).
Diagnostics need tracer long-name and units
The YAML and JSON files now have a '_tracers' key, and the consistency check in MARBL_utils.py knows what to do with it. Still does not produce any tracer-specific diagnostics, though (that needs to go into MARBL_diagnostics_file_class.py)
Diagnostics are going to need to know tracer long name and units in order to get the tracer-specific metadata right.
This is kludgy -- ideally, MARBL_diagnostics_file_class::_process_diagnostic_frequency would be able to determine if a tracer is listed in tracer_restore_vars entirely based on logic outlined in the JSON file... but instead __init__ has a block of code under the comment "# Special treatment for tracers, PFTs, etc" that, in part, passes information about whether the tracer was found in tracer_restore_vars or not.
Carbon tracers are [auto_name]13C and [auto_name]14C; the original entry used [auto_name]C13 and [auto_name]C14, which are not valid tracer names.
Based on discussions at Jan 23rd MARBL meeting, a lot of re-writing code that processes default_diagnostics.json. Some highlights:

1. instead of a 'module' key, an optional 'dependencies' key is used to specify diagnostics that are not always defined
2. Better use of templating for per-tracer diagnostics (still need to expand to PFTs)

The process is now:

1. default_diagnostics.json is read into self._diagnostics
2. self._diagnostics is "processed" into self._resolved_diagnostics:
   i. diagnostics that are not available are removed
   ii. all templates are filled
   iii. if 'frequency' is a dict, proper value is determined
   iv. frequency / operator are converted to lists even if just single value
3. self.diagnostics_dict is determined from self._resolved_diagnostics -- I don't know if we really want this to be part of the diagnostics class, or if this should really be part of the generate_diagnostics script. If the latter, I will rename self._resolved_diagnostics to self.diagnostics or something similar.
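Steps 2.i and 2.iv of the process above can be sketched as below. This is a hedged illustration, not the class method itself: `resolve_diagnostics` is a hypothetical name, and it assumes 'dependencies' is a dictionary mapping a setting name to the value that setting must have. Template filling (2.ii) and frequency dictionaries (2.iii) are left out because their schema is not described in this thread.

```python
def resolve_diagnostics(diagnostics, settings):
    """Return a copy of diagnostics with unavailable entries removed and
    frequency / operator normalized to lists.

    ASSUMPTION: 'dependencies' maps setting name -> required value.
    """
    resolved = {}
    for name, meta in diagnostics.items():
        dependencies = meta.get("dependencies", {})
        # (i) skip diagnostics whose dependencies are not satisfied
        if any(settings.get(key) != required for key, required in dependencies.items()):
            continue
        entry = {key: value for key, value in meta.items() if key != "dependencies"}
        # (iv) store frequency / operator as lists even for a single value
        for key in ("frequency", "operator"):
            if not isinstance(entry[key], list):
                entry[key] = [entry[key]]
        resolved[name] = entry
    return resolved


# Placeholder diagnostics; names and frequencies are hypothetical
raw = {
    "DIAG_A": {"frequency": "medium", "operator": "average"},
    "DIAG_CISO": {"frequency": "medium", "operator": "average",
                  "dependencies": {"ciso_on": ".true."}},
}
print(resolve_diagnostics(raw, {"ciso_on": ".false."}))
```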
No need to store the frequency_operator string in diagnostics class, it can be constructed on the fly by MARBL_generate_diagnostics_file.py
Needed to rework the logic in _expand_template_value() because the location of the autotroph metadata is fundamentally different from corresponding tracer metadata: settings_dict['autotrophs(auto_ind)%sname'] vs settings_dict['_tracer_dict'][tracer_name]['short_name']. So we need to be able to loop over either auto_ind or tracer_name depending on what template we are expanding. Also noted where in the code per-autotroph dependencies will be checked (i.e. some diagnostics only apply to autotrophs that are calcifiers or silicifiers)
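The two lookup paths mentioned above can be illustrated with a small sketch. The settings_dict keys follow the two forms quoted in this commit message; everything else is an assumption made for illustration, including the function name `expand_template`, the `((autotroph_sname))` and `((tracer_short_name))` markers, the `autotroph_cnt` argument, and the sample diagnostic names.

```python
def expand_template(template, settings_dict, autotroph_cnt):
    """Expand one templated name into a list of concrete names.

    Hypothetical sketch: loops over auto_ind for autotroph templates and
    over tracer_name for tracer templates.
    """
    if "((autotroph_sname))" in template:
        names = []
        for auto_ind in range(1, autotroph_cnt + 1):
            # Autotroph metadata lives in flat keys such as 'autotrophs(1)%sname'
            sname = settings_dict["autotrophs({})%sname".format(auto_ind)]
            names.append(template.replace("((autotroph_sname))", sname))
        return names
    if "((tracer_short_name))" in template:
        names = []
        for tracer_name in sorted(settings_dict["_tracer_dict"]):
            # Tracer metadata lives in a nested dictionary keyed by tracer name
            short_name = settings_dict["_tracer_dict"][tracer_name]["short_name"]
            names.append(template.replace("((tracer_short_name))", short_name))
        return names
    # Not a template: return the name unchanged
    return [template]


settings_dict = {
    "autotrophs(1)%sname": "sp",
    "autotrophs(2)%sname": "diat",
    "_tracer_dict": {"PO4": {"short_name": "PO4"}, "NO3": {"short_name": "NO3"}},
}
print(expand_template("((autotroph_sname))_EXAMPLE", settings_dict, 2))
print(expand_template("EXAMPLE_((tracer_short_name))", settings_dict, 2))
```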
To support some autotroph metadata, I reworked how some of the dependency logic was determined. "template_fill_dict" is turning out to be extremely useful, as it can now play a role in determining the diagnostic name (and values of some diagnostic metadata) as well as in the dependency check and determining the proper frequency of diagnostic output.
Also started to add support for per-zooplankton diags
With this commit, I believe all diagnostics are now defined in the YAML and properly processed by MARBL_generate_diagnostics_file.py
default_settings.yaml had been using //lname// to denote something to be replaced by either the autotroph long name or the zooplankton long name; to be consistent with default_diagnostics.yaml it should use ((autotroph_lname)) and ((zooplankton_lname))
Regardless of whether users prefer ".true. / .false.", "True / False", or "T / F" when setting a logical parameter via an input file, MARBL needs to be consistent internally. There are lots of checks for [variable] = ".true." in MARBL's Python so that's the format other values get converted to.
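A conversion along these lines might look like the sketch below. The function name `normalize_logical` is hypothetical, and treating the input case-insensitively is an assumption; the accepted spellings and the ".true." / ".false." target format come from the commit message.

```python
def normalize_logical(value):
    """Convert '.true.', 'True', 'T' (and the false equivalents) to the
    Fortran-style strings '.true.' / '.false.'.

    ASSUMPTION: matching is case-insensitive and ignores surrounding whitespace.
    """
    lowered = str(value).strip().lower()
    if lowered in (".true.", "true", "t"):
        return ".true."
    if lowered in (".false.", "false", "f"):
        return ".false."
    raise ValueError("%r is not a recognized logical value" % (value,))


for user_value in (".true.", "True", "T", "F"):
    print(user_value, "->", normalize_logical(user_value))
```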
Clarified / added comments, and also cleaned up how template_fill_dict is set up
Instead of sorting tracers by module and PFT properties, now use dependencies dictionary (much like the diagnostics YAML file). Also made the tracers dictionary a separate object in MARBL_settings rather than an entry in settings_dict[].
* renamed _array_size to _array_shape in settings YAML
* cleaned up docstring for _get_array_info()
* replaced "isinstance(__,unicode)" with "isinstance(__,type(u''))" (latter also works with python3)
* renamed _get_dim_size -> _get_value [it's a generic routine even if it's currently only used to get shape of arrays]
* replaced an enumerate with a zip()
* Cleaned up error message when yaml_to_json can't import PyYAML
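The isinstance change in the list above works because `type(u'')` evaluates to `unicode` under Python 2 and to `str` under Python 3, so the name `unicode` never has to be referenced. A minimal sketch, with `is_unicode_string` as a hypothetical helper name:

```python
def is_unicode_string(value):
    """True for text strings on both Python 2 (unicode) and Python 3 (str)."""
    # type(u'') is `unicode` on Python 2 and `str` on Python 3
    return isinstance(value, type(u''))


print(is_unicode_string(u"ciso_on"))
print(is_unicode_string(42))
```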
1. Only define autotroph _Qp diagnostic if running with variable P:C
2. Provide some documentation in default_diagnostics.yaml
Introduced MARBL_share.py to contain routines used by the two file_class.py files -- also cleaned up some of the code in each of those two files to allow as much as possible to be moved to MARBL_share.py. Subroutines in MARBL_share.py are accessed via MARBL_tools.
This pull request sets up infrastructure (scripts and more YAML / JSON files) to allow MARBL to define what diagnostics are available without building / running any Fortran code. The script MARBL_tools/MARBL_generate_diagnostics_file.py creates a text file of the format DIAGNOSTIC : frequency_operator, and each GCM will need to convert it to whatever format it uses for setting up diagnostic output. The corresponding POP changes are in place, but I still need to update how the POP-owned MARBL diagnostics (tracer states, etc) are included in the tavg file before bringing this branch onto master.