Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated model API #354

Merged
merged 27 commits into from
Jul 27, 2023
Merged

Updated model API #354

merged 27 commits into from
Jul 27, 2023

Conversation

Peter9192
Copy link
Collaborator

@Peter9192 Peter9192 commented Jun 13, 2023

This PR rewrites the core eWaterCycle model class, such that

  • There is a base implementation for a LocalModel (mostly for development purposes)
  • There is a base implementation for a ContainerizedModel (intended for production)

Together with the new with the new plugin architecture (#336, #347), this should make it easier to add new models to eWaterCycle.

To do:

  • Fix typeerror
  • Use the new model interface for Wflow
  • Use the new model interface for Marrmot
  • Use the new model interface for Lisflood
  • Use the new model interface for Hype

moved to #359:

  • Update tests for abstractmodel to test the new API (once crystallized)
  • Update documentation
  • Remove new_model_api.py from repo.
  • Nice to have: Extend containerimage class to check version
  • Nice to have: Extend containerimage class to automatically pull docker image and convert to singularity

@Peter9192
Copy link
Collaborator Author

@BSchilperoort @sverhoeven I'm gonna call it a (holi)day. Tried not to make too big of a mess.. Hope you can pick it up from here.

@BSchilperoort
Copy link
Contributor

I discussed the API with Peter, and we've decided to (for now) pin the container version in the model class.
Model plugin developers will have to release a new version of when a new docker version is released.

E.g.:

class Wflow(ContainerizedModel):
    bmi_image: ContainerImage("ewatercycle/wflow-grpc4bmi:2020.1.3")

Setting the version to "latest" will have reproducibility issues & managing lists of valid versions is a lot of bookkeeping and hassle.
Of course, as bmi_image is a kwarg, a user can overwrite this.

Additionally, I believe this will remove the need for ParameterSet.supported_model_versions, as the model's parameter set is packaged with the model class in the plugin.

@BSchilperoort
Copy link
Contributor

@Peter9192 I have moved over Hype, and fixed the Hype tests (/hype/test_model.py). Could you have a look before I move all the rest over and fix their tests?
I also had to make some modifications to /base/model.py

Copy link
Collaborator Author

@Peter9192 Peter9192 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Bart, nice work so far. I have a few comments, we can discuss them if anything is unclear.

src/ewatercycle/base/model.py Outdated Show resolved Hide resolved
src/ewatercycle/plugins/hype/model.py Outdated Show resolved Hide resolved
in CFG.output_dir
"""
cfg_path = super()._make_cfg_dir(
cfg_dir=cfg_dir, folder_prefix="hype", **kwargs
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do this for every model now only to add the folder prefix? Would it be possible to get that from self.__class__ or self.__name__?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using self.__class__.__name__.lower() would probably be a nice way of retrieving the folder prefix. It would usually remove the need to define _make_cfg_dir

Comment on lines +54 to +63
_model_name: str = PrivateAttr("m_01_collie1_1p_1s")
_parameters: List[float] = PrivateAttr([1000.0])
_store_ini: List[float] = PrivateAttr([900.0])
_solver: Solver = PrivateAttr(Solver())
_model_start_time: str | None = PrivateAttr()
_model_end_time: str | None = PrivateAttr()

def __init__(self, version: str, forcing: MarrmotForcing): # noqa: D107
super().__init__(version, forcing=forcing)
self._parameters = [1000.0]
self.store_ini = [900.0]
self.solver = Solver()
self._check_forcing(forcing)
_forcing_filepath: str = PrivateAttr()
_forcing_start_time: datetime.datetime = PrivateAttr()
_forcing_end_time: datetime.datetime = PrivateAttr()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense to store all these as attributes separately on the model, or whether we could store them all in a single config object (can be a simple dict). Ideally we could use the models' native configuration files for that. So on creation you'd parse the template config file, update it in memory as necessary, then in setup save a copy with the latest changes and return that one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if it matters too much if they're separate private attributes or items of a private attr dictionary.

Up to now I've mostly focused on getting everything moved over to the new API and working again. Seeing as these are all private attributes (i.e. not user accessible) we can change the internal workings (e.g., by rearranging them in a dict) without breaking things.

@Peter9192 Peter9192 mentioned this pull request Jul 26, 2023
@Peter9192
Copy link
Collaborator Author

Additionally, I believe this will remove the need for ParameterSet.supported_model_versions, as the model's parameter set is packaged with the model class in the plugin.

This only holds for the example parameter sets. However, we envision that researchers could share their parametersets as a standalone dataset, and that you could then load that by means of a configuration file. Ideally datasets would be shared via zenodo or other repos and retrieved through their doi. So then, you'd need to add sufficient metadata about compatibility.

@BSchilperoort
Copy link
Contributor

BSchilperoort commented Jul 26, 2023

I have now moved over all models, and attempted to standardize them a bit more. All tests except the lisflood ones (and AbstractModel one) are now passing. The lisflood ones are a bit difficult to work on because a large download takes place every time the test is run.

Working with pydantic is a bit frustrating as it doesn't seem to fit our use completely. In v1 for example, a post_init method is missing, and the @root_validator decorator is not a very good replacement (as it takes cls and not self). At least that's fixed in v2.

Also, pydantic seems to have an attribute "parameters" which can get a bit confusing for us/our users.

@Peter9192
Copy link
Collaborator Author

Working with pydantic is a bit frustrating as it doesn't seem to fit our use completely.

We should definetely start using v2 asap. But if that's still frustrating, we could consider using "normal" dataclasses.

@Peter9192 Peter9192 mentioned this pull request Apr 6, 2023
5 tasks
lat: Latitudinal value
lon: Longitudinal value
def get_parameters(self):
self._post_init()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this post_init is not defined, should it call _parse_config instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah you're right. I guess the tests didn't catch any of that... I think renaming the _parse_config is best.

Doing these operations in a post_init in all models would make them look more similar.

def get_latlon_grid(self, name: str) -> tuple[Any, Any, Any]:
raise NotImplementedError("Hype coordinates cannot be mapped to grid")

def get_parameters(self) -> Iterable[Tuple[str, Any]]:
Copy link
Collaborator Author

@Peter9192 Peter9192 Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BSchilperoort why did you change the parameters from a property called parameters to a method called get_parameters? Is this to do with what you said about pydantic?

Note that we want to use parameter akin to pymt: https://pymt.readthedocs.io/en/latest/usage.html#model-setup

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now see that *Model.parameters is defined as a variable property in BaseModel. I think it conflicted with defining the parameters as @property in the specific model implementations.

I guess it's best to change how the BaseModel implements that.

Side note: I've found it a bit confusing that our BaseModel has the same class name as the pydantic BaseModel 😕

class BaseModel(pydantic.BaseModel, abc.ABC):
    ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to eWaterCycleModel in 998b226

@BSchilperoort
Copy link
Contributor

Working with pydantic is a bit frustrating as it doesn't seem to fit our use completely.

We should definetely start using v2 asap. But if that's still frustrating, we could consider using "normal" dataclasses.

If we do at some point decide to not use pydantic, a good alternative is probably typing's Protocol, and use @runtime_checkable. Although those checks are not as robust as Pydantic.

@Peter9192
Copy link
Collaborator Author

Pff merge concluded.. 😅

@Peter9192 Peter9192 mentioned this pull request Jul 27, 2023
8 tasks
@Peter9192 Peter9192 marked this pull request as ready for review July 27, 2023 13:05
Copy link
Member

@sverhoeven sverhoeven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better then before. Still lots to do.

@Peter9192 Peter9192 merged commit 92cd616 into main Jul 27, 2023
2 of 3 checks passed
@Peter9192 Peter9192 deleted the defaultmodels branch July 27, 2023 13:07
@Peter9192
Copy link
Collaborator Author

Peter9192 commented Jul 28, 2023

Better then before. Still lots to do.

See #359

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants