Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix minor bugs with default values and specificity in ngen.cal and ngen.config #34

Merged
merged 22 commits into from
Jul 25, 2023

Conversation

aaraney
Copy link
Member

@aaraney aaraney commented Nov 14, 2022

Fixes minor issues in cal and config's pydantic model fields. This does not introduce any breaking changes.

fixes #31, #32, #33

Ngen Cal [0.2.1] - 2023-06-24

Changes

  • ngen.cal Objective enum now properly subclasses str. This fixes pydantic models' json schemas that use Objective as a field type. See ngen.cal Objective Enum needs type constraints #31 for impacts.
  • ngen.cal calibration strategy types: NgenExplicit, NgenIndependent, and NgenUniform, now have default strategy field values.

Ngen Config [0.1.9] - 2023-06-24

Changes

  • Fixed field discriminator issue in ngen.config's Formulation model params field.
    Note, during Formulation object creation if the params field is deserialized (e.g. params's value is a dictionary), the name field is required. If name is not 'bmi_multi', the model_type_name field is also required. Neither are required if a concrete known formulation instance is provided (see Typo in formulation model constraint #32).

Testing

  1. Added tests to verify NgenExplicit, NgenIndependent, and NgenUniform strategy field default value.

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Testing checklist

Target Environment support

  • Linux
  • MacOS

@aaraney
Copy link
Member Author

aaraney commented Nov 14, 2022

It looks like by fixing the typo in Formulation now the tests are failing. Looking into a resolution to this now.

@aaraney
Copy link
Member Author

aaraney commented Nov 14, 2022

So, the issue is in the Formulation pydantic model on the params field. Now that the discriminator field is model_name (alias is model_type_name), pydantic wants that field too be a Literal instead of a string. This should be a constant to for Topmod, CFE, NoahOWP, and LSTM but not MultiBMI, correct @hellkite500? Or can we also make it a constant, i.e. a Literal for MultiBMI as well?

@hellkite500
Copy link
Member

The multi BMI should be constant at the top level, and the sub modules would be constant in the recursive modules under there

@aaraney
Copy link
Member Author

aaraney commented Nov 14, 2022

The multi BMI should be constant at the top level, and the sub modules would be constant in the recursive modules under there

So it is safe to tie model_name to a Literal? For concreteness and example, LSTM.model_name should always equal LSTM, @hellkite500?

@hellkite500
Copy link
Member

For now I think that will work. That was the intent of the original implementation at least...

@aaraney
Copy link
Member Author

aaraney commented Nov 14, 2022

For now I think that will work. That was the intent of the original implementation at least...

Just so we are 1000% on the same page, this means that all MultiBMI formulations will have a model_name of BMIMulti.

@hellkite500
Copy link
Member

hellkite500 commented Nov 18, 2022

Just so we are 1000% on the same page, this means that all MultiBMI formulations will have a model_name of BMIMulti

As long as that only applies to the container, not the sub formulations, then I don't see a problem with that. (though we can decide if it should be BMIMulti or MultiBMI...haha)

@aaraney
Copy link
Member Author

aaraney commented Mar 30, 2023

@robertbartel was having some related issues over here which made me revisit this.

The multi BMI should be constant at the top level, and the sub modules would be constant in the recursive modules under there

In re-reading this, @hellkite500, I think I was being a little pedantic in my use of the word 'constant'. To re-phrase, is it okay to enforce that all MultiBMI models have the same model_name field value? Right now, this is not the case model_name is being used to often convey the formulations in the modules field (i.e. "model_type_name": "bmi_multi_noahowp_cfe").

@robertbartel
Copy link
Contributor

robertbartel commented Mar 31, 2023

@aaraney, @hellkite500, I'm only just starting to dig into this, so I may not quite understand everything correctly yet. But ngen-config functionality seems to depend on using specific strings for the model_name of a Formulation object (i.e., the model_type_name element(s) in realization configs).

E.g., a BMI multi-formulation must be configured with something like this in the realization config file:

   "model_type_name": "BMIMulti",

Based on the first few example realization configs I have looked at after beginning to review this (see here), that appears to be a condition that not necessarily followed in practice or in principle, at least in the provided examples. I.e., model_type_name is used to provide a summary description, like bmi_multi_noahowp_cfe.

Additionally, the ngen documentation on BMI formulations in realization configs only indicates that some model_type_name value is required. It doesn't express a need for any specific values. It actually does something closer to the opposite, conveying that the particular value isn't even always important, despite the field itself always being required. I haven't yet checked to see if the ngen code is consistent with the documentation in that manner; perhaps @mattw-nws or @hellkite500 might be able to confirm without having to do a lot of digging.

This probably warrants further design and dependencies discussion, although a verification of my assessments is probably needed first.

@aaraney
Copy link
Member Author

aaraney commented Mar 31, 2023

@aaraney, @hellkite500, I'm only just starting to dig into this, so I may not quite understand everything correctly yet. But ngen-config functionality seems to depend on using specific strings for the model_name of a Formulation object (i.e., the model_type_name element(s) in realization configs).
...
This probably warrants further design and dependencies discussion, although a verification of my assessments is probably needed first.

@robertbartel, your assessment is correct. ngen.config uses the model_name name field to determine the appropriate model variant for a given formulation (e.g. PET, NoahOWP). From my understanding, this is purely a pragmatic solution as there is not another constant unique field across all models.

@robertbartel
Copy link
Contributor

@aaraney, I rebased this branch, then fixed a few other non-Literal model_type_name fields, and I was able to get tests passing locally for the related dmod#316. I have that in my fork, under https://github.com/robertbartel/ngen-cal/tree/pr/34/rebased, if that's of any use to this issue.

@aaraney
Copy link
Member Author

aaraney commented Apr 4, 2023

@robertbartel, thanks for the breadcrumbs. Ill have a look to see what changes I might need to make in this PR.

@aaraney
Copy link
Member Author

aaraney commented Apr 4, 2023

@robertbartel, I see. This PR predates the SLOTH module, so it was not added in this PR. That makes sense why it was not working for you.

@aaraney
Copy link
Member Author

aaraney commented Apr 4, 2023

I added support for SLOTH and PET plus fixed a test that was failing.

@aaraney
Copy link
Member Author

aaraney commented Jul 24, 2023

The tests should fail because of #62.

@aaraney
Copy link
Member Author

aaraney commented Jul 24, 2023

The only tests are that failing are because of #62. Ill get in a quick PR to fix that, we can rebase this once that fix is in and re-review.

…pe_name fields

Note, during object creation if the `params` field is deserialized (e.g. `params`'s value is a
dictionary), the `name` field is required. If `name` *is not* 'bmi_multi', the `model_type_name`
field is also required. Neither are required if a concrete known formulation instance is
provided.
@@ -22,7 +23,7 @@ class MultiBMI(BMIParams, smart_union=True):
# NOTE this is derived from the list of modules
main_output_variable: Optional[str]
#NOTE aliases don't propagate to subclasses, so we have to repeat the alias
model_name: Optional[str] = Field(alias="model_type_name")
model_name: str = Field(alias="model_type_name")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this still be optional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. In the super class it is a str. I think the most sensible thing is to either set a default empty string or leave it as is. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well the validator build_model_name was using the optional None in order to build the name from the named nested modules. We can set a default empty string "" but would need to ensure that validator recognizes that and substitutes the built name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is one of my gripes with pydantic. There is not a semantic way to signify to the caller that something is computed if it is not provided. The way that we have been doing this in DMOD is to set the default of a non-optional field to None. I guess in a perfect world, we would create a constant, say COMPUTED and use that as the default value.

Here, I just went with setting the default value to None and updated the root validator to always set the model name to something if it is not provided.

@@ -6,7 +6,7 @@
"name": "bmi_multi",
"params": {
"name": "bmi_multi",
"model_type_name": "NoahOWP_CFE",
"model_type_name": "BMIMulti",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we revert this if "BMIMulti" isn't the required literal now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine with me. Good catch

hellkite500
hellkite500 previously approved these changes Jul 25, 2023
@aaraney aaraney merged commit 9d44a00 into NOAA-OWP:master Jul 25, 2023
13 checks passed
@aaraney aaraney deleted the pydantic_fixes branch July 25, 2023 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ngen.cal Objective Enum needs type constraints
3 participants