Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential for unit mismatch between formulations providing variables with the same name #829

Open
donaldwj opened this issue Jun 3, 2024 · 2 comments

Comments

@donaldwj
Copy link
Contributor

donaldwj commented Jun 3, 2024

In the current code there is nothing preventing a scenario where two Formulations both provide an output variable with the same name.
This is an issue because the netcdf output code creates variables for each output variable defined by any formulation in a layer. This means the variable would be created when first located in a formulation and ignored thereafter. However the units for later instance of the output variable name are not guaranteed to be the same, which can lead to either incorrect values in output or program termination.

##Example

Formulation A --> (stream-flow-excess cm^3/s float )
Formulation B --> (stream-flow-excess m^3/h double)
Formulation C --> (stream-flow-excess 'none' string)

The variable stream flow excess will only be created once with the type and units set by the first encountered formulation (we can not create separate variables for each formulation as there is currently no way to determine what formulation is bound to a BMI_Formulation object.)

Proposed Fix

Allow formulation name to be retrievable this would allow per formulation variables with maps used to combine compatible definitions.

@donaldwj
Copy link
Contributor Author

donaldwj commented Jun 3, 2024

For example if we have 'topmodel' and 'cfe' variables names of

  1. cfe-infiltration-excess
  2. topmodel-infiltration-excess

could be generated. This names would be checked to see if there was a known conversion for the output that would allow the name to be changed to 'infiltration-excess' when variable names match expected units.

@robertbartel
Copy link
Contributor

So, as I see things:

  • The framework needs a source for globally unique variable identifiers for what is going into the ngen netcdf output
  • BMI module themselves cannot fulfill this (implicitly or explicitly), because of the nature of BMI; problematic examples include:
    • hypothetically, two independent TOPMODEL implementations get written and someone wants to use them both in ngen
    • someone wants to use two different versions/variants/compilations of a module in different places within a run (I want to say this is already a practical possibility with CFE)
  • that basically leaves the runtime configuration to be this source
  • the framework must create a scope for such identifiers and the entities for managing it, with these being responsible for either:
    • all relevant BMI module variables, or
    • all relevant BMI modules themselves (which can then also handle the variables)

In either case, something dedicated to explicitly defining things within this scope will have to be added to the realization config, either centrally or in each formulation.

We might could leverage the variable names mapping capabilities, but additional restrictions would need to be enforced, on a different scope than those currently in place for it. It'd probably be better to just add something similar strictly for aliasing variables within the netcdf output scope.

We could also leverage model_type_name from realization configs, but we'd need to add a way to ensure two modules with the same values for this really are the same thing, and also perhaps that two of the same module usages receive the same model_type_name.

Along those lines, while this risks being premature optimization, it might be wise to:

  • Centralize the definitions of all loaded BMI modules within the realization config
    • Include things like a globally-unique name/identifier and package/library file here
    • Do not move things like variable name mapping or init config path here
    • Potentially add support for other things at this level (now or later), like equivalences and associations between output variables from different modules, automatic conversions to desired units, etc.
  • Use reference to the BMI module definitions within formulations, alongside formulation specific details like init config paths

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants