Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only allow GCM to request certain output fields if base_bio_on is True #453

Merged
merged 14 commits into from
Feb 26, 2024

Conversation

mnlevy1981
Copy link
Collaborator

It doesn't make sense for MARBL to pass chlorophyll (surface or 3D), CO2 flux, O2 flux, or NHx flux if the base bio tracer module is inactive. Besides updating marbl_interface_public_types.F90 to check the value of base_bio_on, changes were made to the stand-alone driver so that the call_compute_subroutines test could still run with marbl_with_abio_only.settings.

alabaster 0.7.16 failed the CI; the last passing test used 0.7.13 so I am
pinning that version
There are also updates to the driver so that the call_compute_subroutines test
still runs when abio_dic_on = T and base_bio_on = F. I suspect there will be
baseline failures in that case due to changes in the output variables
@mnlevy1981
Copy link
Collaborator Author

Two additional notes:

  1. c64e5b7 has expected failures because there are output fields in the call_compute_subroutines.history_with_abio_only.nc baseline that are no longer in the history file
  2. I also modified the pip requirements file for the documentation to pin alabaster; the latest version of that package is not compatible with the version of Sphinx we are using

output_for_GCM_flux_co2, output_for_GCM_total_Chl, and
output_for_GCM_total_surfChl are no longer generated by the
call_compute_subroutines test when using marbl_with_abio_only.settings so they
have been removed from the baseline
@mnlevy1981 mnlevy1981 marked this pull request as ready for review January 29, 2024 18:07
@mnlevy1981
Copy link
Collaborator Author

@klindsay28 and I had a conversation about this PR at lunch, and we both agree think it would be beneficial to have the initialization routines register what outputs can be provided (under what conditions; e.g. CO2 flux requires base_bio_on = True), and then marbl_single_output_constructor() can compare against what has been registered rather than using the clunky select case currently in the code.

This would make the output_to_GCM part of the code look more like the settings and diagnostics part of the code, though we definitely want to register all the possible outputs every run so that error messages can say "flux_co2" requires base_bio_on = True rather than "flux_co2" is not a valid output_for_GCM.

@mnlevy1981
Copy link
Collaborator Author

Followup to the last comment -- introducing a registration for the output_for_GCM variables with an error_message argument could set up something like

if (base_bio_on) then
  error_message = ""
else
  error_message = "flux_co2 can only be provided when base_bio_on = True"
end if
call ???%register_output('flux_co2', error_message)

And then, when the GCM requests an output, we can loop through the registry. If len(trim(error_message)) == 0 then proceed with otherwise abort with error_message (and if an unregistered field is requested, abort with a generic {variable} not a valid output_to_GCM message)

The registry makes it easier to add new outputs for the GCM or to add more
restrictions on when certain outputs are available while maintaining useful
error messages.

Also cleaned up some white space / alignment issues
@mnlevy1981 mnlevy1981 marked this pull request as draft February 12, 2024 18:44
@mnlevy1981
Copy link
Collaborator Author

After chatting with @klindsay28 there are two tasks remaining

  1. Instead of returning some fields in marbl_instance%surface_flux_output and other fields via marbl_instance%get_output_for_GCM(), use a unified marbl_instance%output_for_GCM that can allocate memory for either a 2D or 3D field depending on what is requested. Basically, rename surface_flux_output to output_for_GCM and then make sure we can use the type to return full depth chlorophyll in addition to 2D fields like O2 flux, CO2 flux, etc
  2. Add a page to the documentation. Note that there are already a few pages that mention surface_flux_output and surface fluxes in the GCM. The first link will need to be updated to reflect the new setup, and the second link could use more detail under "2. Surface forcing fields, which may be requested by the GCM". The new documentation page for adding an output could link to one of those two pages under the heading "GCM Code Changes."

This required a lot of small changes.

MARBL library:

1. Rename surface_flux_output* -> output_for_gcm*
2. Add total_Chl index to indexing type, and rename sfo_ind -> ofg_ind
3, Remove marbl_instance%get_output_for_GCM()
4. Add field_source to registry type as well as marbl_single_output_type (so
   GCM knows whether the field should be copied after surface_flux_compute() or
   after interior_tendency_compute())
5. Generalize registry definition, and make more specific error messages
   (flux_o2 requires both base_bio_on and lflux_gas_o2 to be true)
6. Pass output_for_GCM to interior_tendency_compute so it can populate
   total_Chl (if requested by GCM)

Stand-alone driver:

1. rename total_Chl -> interior_tendency_output, add num_vars dimension (this
   is the 3D equivalent to surface_flux_output). Note that there is an
   assumption that all outputs_for_GCM from surface_flux_compute are 2D, while
   all from interior_tendency_compute are 3D. This is fine for now, but we may
   need to introduce datatypes if future mods contradict that assumption
2. Rely on field_source in output_for_GCM type to know what to copy out of the
   type / when to copy it
3. Total Chl is no longer a special case in marbl_io_mod.F90, it comes through
   the output_for_GCM object
@mnlevy1981
Copy link
Collaborator Author

In d572db6 there's a typo in an error message, so requesting flux_o2 when it is not available references flux_co2 rather than flux_o2 (the error message is Can not add flux_co2 to outputs without base biotic tracers and lflux_gas_o2); I'll fix this when I add a documentation page, otherwise the first task from #453 (comment) has been addressed.

Put surface_flux_output back on the interface, and also created
interior_tendency_output. This change lets us drop field_source from the output
type, and add_output_for_GCM() does all the error checking I had introduced
into marbl_single_output_constructor (making sure the variable is registered
and available).

Note that add_output_for_GCM() now returns field_source as well, so the GCM can
keep the surface flux output and interior tendency output separate when copying
to local memory (without knowing what fields come from surface_flux_compute()
vs interior_tendency_compute())
In surface_flux_mod.F90, I had changed the argument to
marbl_surface_flux_compute() from output_for_gcm to surface_flux_output (this
is what it was called before I tried to integrate the outputs for
surface_flux_compute and interior_tendency_compute; I've split those outputs
apart again, so the name change is not necessary)
1. Better comments in create_registry
2. The registry error message is now prepended with the field name in
   add_output_for_GCM()
3. create_registry can use base_bio_on, doesn't need it as an intent(in)
4. better comment in add_output_for_GCM (logs an error message, doesn't return
   one)
5. If ofg_ind is a target, components of output_for_gcm_type don't need to be
   pointers (registry%id => ofg_ind%* works)

Still have a few slightly bigger changes to make in the next commit
@mnlevy1981
Copy link
Collaborator Author

Reviewed with @klindsay28 -- addressed most of the small concerns in 89cf5b9 but two big tasks remain:

  • make it easier to add a new output for the GCM to the registry (linked list? re-allocatable array?)
  • documentation

@mnlevy1981 mnlevy1981 marked this pull request as ready for review February 15, 2024 21:56
Replace marbl_output_for_GCM_single_register_type with
marbl_output_for_GCM_linked_list_type; array of former becomes a pointer to
latter. This will make it much easier to add a new output_for_GCM, simply copy
format of a block of code in create_registry() [and add entry to
marbl_output_for_GCM_indexing_type]
it is called from create_registry, which is already in the
marbl_output_for_GCM_registry_type class
1. Typo in marbl_interface.F90 -- abort if registered_output is not associated
   (was checking this%output_for_gcm_registry%registered_outputs by accident)
2. Add comment about why create_registry() resets err_message to "" before each
   entry (and clean up errant white-space misalignment)
3. Also, blocks in create_registry() are easier to read if we set err_message
   to "" and then use a continutation line for the if () rather than using
   if-else
4. For adding to the registry linked list, order doesn't matter but the logic
   is simpler (and code is faster) if we prepend entries rather than looping
   through to the end of the list every time
Provides a mechanism to see what output_for_GCM fields are available for a
given configuration. Note that output_for_gcm_registry is now a public member
of the MARBL interface class so the standalone test can access it.
A new page explains what additional output is available from the GCM, and a
second page shows how to add a new output. The new available_output test is
also documented on the regression tests page.
@mnlevy1981
Copy link
Collaborator Author

I don't know why it isn't showing up here, but 1c59f37 passed CI tests on my fork

@mnlevy1981 mnlevy1981 merged commit d5862c6 into marbl-ecosys:development Feb 26, 2024
1 check passed
@mnlevy1981 mnlevy1981 deleted the chl_requires_base_bio branch March 1, 2024 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

With introduction of base_bio_on, users may request fields that MARBL can not provide
1 participant