Driver for configurations using CMEPS with the CESM driver #333

Merged
aidanheerdegen merged 15 commits into payu-org:master from dougiesquire:cesm_cmeps
May 26, 2023

dougiesquire
Collaborator

@dougiesquire dougiesquire commented Mar 8, 2023

I'm not exactly sure what to call this driver since in theory it could be used/extended to run any model configurations that use CMEPS with the CESM driver, though the primary use case at the moment is for an ACCESS-OM3 configuration (currently it's called Cesm, which is probably not ideal).

In trying to make this general to all CMEPS configs, I've added another optional field to the input config.yaml called components, where users specify which model components are included in the configuration being run (e.g. see https://github.com/dougiesquire/gmom_jra_wd/blob/main/config.yaml). This may not be a desired change, in which case one option would be to make the driver specific to OM3. This would also solve the naming ambiguity, but it would make the driver less general (e.g. it could no longer be used to run https://github.com/dougiesquire/d_jra_wd).

Interested to hear people's thoughts

  • Get rid of components specification in config file
  • Add ability to collate mom output
  • Do some test runs, including testing collation
  • Update configs for MOM6-CICE6-WW3, MOM6-CICE6 and CICE6-WW3

@dougiesquire dougiesquire marked this pull request as draft March 8, 2023 00:29
@micaeljtoliveira
Contributor

What about later on adding an ACCESS-OM3 driver that would be a child class of the Cesm model?

@coveralls

coveralls commented Mar 8, 2023

Coverage Status

Coverage: 41.267% (-1.0%) from 42.245% when pulling e1bbe27 on dougiesquire:cesm_cmeps into 0109f2f on payu-org:master.

@aidanheerdegen
Collaborator

I'm not exactly sure what to call this driver since in theory it could be used/extended to run any model configurations that use CMEPS with the CESM driver, though the primary use case at the moment is for an ACCESS-OM3 configuration (currently it's called Cesm, which is probably not ideal)

Would it be more accurate to call it Cmeps? Or would it also work with CESM?

I've added another optional field to the input config.yaml called components, where users specify which model components are included in the configuration being run

This is interesting. Previously coupled models used a sub-model approach, e.g. ACCESS-OM2

https://github.com/payu-org/payu/blob/master/payu/models/accessom2.py

but I guess the CMEPS architecture effectively removes the idea of separate models? Certainly makes writing a driver a fair bit simpler.

AFAICT none of the components utilise the existing drivers of the individual models, is that correct? I can't see how collate is going to work for the FMS based models if this is the case. Or have I missed something?

@dougiesquire
Collaborator Author

dougiesquire commented Mar 8, 2023

Would it be more accurate to call it Cmeps? Or would it also work with CESM?

Cmeps_cesm is maybe the most accurate, since it expects the CESM driver. Though, tbh, I'm not actually sure how different things would be if a different driver were used. Maybe I should look into this.

This is interesting. Previously coupled models used a sub-model approach, e.g. ACCESS-OM2

https://github.com/payu-org/payu/blob/master/payu/models/accessom2.py

but I guess the CMEPS architecture effectively removes the idea of separate models? Certainly makes writing a driver a fair bit simpler.

Yeah, the components aren't really sub-models since they don't have their own executables. I'm sure there's a way around having components in the config.yaml (which is confusing). I'll try to get back to this this week.

AFAICT none of the components utilise the existing drivers of the individual models, is that correct? I can't see how collate is going to work for the FMS based models if this is the case. Or have I missed something?

That's correct. Everything is handled by the CMEPS driver and the output is quite specific to the driver (e.g. restarts from each component all get named consistently and output to the run directory). Re collation... I'm not sure... For the test runs I've done, MOM6 output does not need collation.

@aidanheerdegen
Collaborator

Cmeps_cesm is maybe the most accurate, since it expects the CESM driver. Though, tbh, I'm not actually sure how different things would be if a different driver were used. Maybe I should look into this.

I'm confused. Isn't this the CESM driver? Or does driver in this context mean the model code itself?

Yeah, the components aren't really sub-models since they don't have their own executables. I'm sure there's a way around having components in the config.yaml (which is confusing). I'll try to get back to this this week.

I don't dislike the design, just trying to get a better understanding of the design process/limitations.

Everything is handled by the CMEPS driver and the output is quite specific to the driver (e.g. restarts from each component all get named consistently and output to the run directory).

Into their own "namespace"?

Re collation... I'm not sure... For the test runs I've done, MOM6 output is collated.

In the TWG meeting it was noted that the io_layout = 1,1 was potentially a limiting factor to scalability, so you're definitely going to want to be able to collate outputs, which is already something that is handled by the fms driver. I don't think you want to reimplement that. A couple of options spring to mind:

  1. Split the collating code out to some sort of 'tools' module and import it in both drivers
  2. Do some sort of fancy-dancy python multiple inheritance which I have zero knowledge of to grab the collate functionality from the fms driver (not even sure that is possible, pure speculation)

Maybe you or @marshallward or @angus-g have an opinion or some knowledge about the best way to implement that.

@dougiesquire
Collaborator Author

I'm confused. Isn't this the CESM driver? Or does driver in this context mean the model code itself?

Yeah sorry, I was a bit fast and loose with my language in my previous comment. There're two "drivers" in these discussions:

  • a Payu driver (the content of this PR)
  • the NUOPC/CMEPS driver (I think we've settled on using the version of this driver that was written for CESM - see here)

The Payu driver in this PR is intended to run model configurations that use NUOPC/CMEPS along with the CESM (NUOPC/CMEPS) driver. What I meant to say was that I'm not sure whether it would work (or could be easily extended to work) with other NUOPC/CMEPS drivers.

Hmmm... reading that back, I'm not sure it's any clearer

Re your other comments/questions, I'll try to get my head back into this later this week.

@micaeljtoliveira
Contributor

Cmeps_cesm is maybe the most accurate, since it expects the CESM driver. Though, tbh, I'm not actually sure how different things would be if a different driver were used. Maybe I should look into this.

From what I could see, different drivers have different ways to specify which components to use at runtime. So some things would definitely need to be different for a different driver.

"drof_in",
"drof.streams.xml"
]
},
Contributor

@dougiesquire payu dies with KeyError: 'docn' when running https://github.com/COSIMA/CICE6-WW3 so I guess we need something like this here?

    "docn": {
        "realm": "ocn",
        "config_files": [
            "docn_in",
            "docn.streams.xml",
        ],
    },

Collaborator Author

Yes, sorry, I had the change locally but hadn't pushed it. Pip installing from this PR again should include the required change.
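
A bare KeyError like the one reported above can be made easier to diagnose by validating component names before they are used. The following is only a minimal sketch, assuming a registry shaped like the dict in the diff context; the names COMPONENT_INFO and get_component_info, and the "rof" realm for drof, are illustrative assumptions and not taken from the PR:

```python
# Hypothetical registry, shaped like the entries shown in the diff context.
COMPONENT_INFO = {
    "drof": {
        "realm": "rof",  # assumed realm name, not confirmed by the thread
        "config_files": ["drof_in", "drof.streams.xml"],
    },
    "docn": {
        "realm": "ocn",
        "config_files": ["docn_in", "docn.streams.xml"],
    },
}


def get_component_info(name):
    """Return the registry entry for a component, with a readable error."""
    try:
        return COMPONENT_INFO[name]
    except KeyError:
        known = ", ".join(sorted(COMPONENT_INFO))
        # Replace the bare KeyError with a message listing valid names.
        raise ValueError(
            f"Unknown component '{name}'. Known components: {known}"
        ) from None
```

With this, a configuration naming an unregistered component fails with a message that lists the supported names rather than only the missing key.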

Contributor

great, thanks!

@aekiss
Contributor

aekiss commented Apr 11, 2023

There are some rough edges that should be fixed - e.g. payu sweep doesn't work reliably (sometimes removes link but not dir that was linked to)

@dougiesquire
Collaborator Author

Thanks for reporting this @aekiss. I need to find some time to implement this properly

@dougiesquire
Collaborator Author

dougiesquire commented May 21, 2023

Finally coming back to this. Before spending too much time on it, I want to check in with @aekiss and @micaeljtoliveira on ACCESS-OM3 development.

Is it still looking like the configuration set-up (config/input files structure etc) and output from ACCESS-OM3 will be the same/similar as CESM-CMEPS? I.e., is it looking like the Payu driver in this PR could end up being what is used by everyone to run ACCESS-OM3, or is it likely just to be used during development of ACCESS-OM3 (for comparing ACCESS-OM3 executables to CIME-built executables)?

If the latter, we probably want to keep this Payu driver out of main and we should start writing a dedicated ACCESS-OM3 driver.

@aekiss
Contributor

aekiss commented May 21, 2023

Thanks for looking at this. Good question - I think it will definitely be needed for development of ACCESS-OM3 for comparing to CIME/CESM, but it's too early to say what config the final production version will use. The inputs will all be different, but maybe the directory and file structure can be retained.

But if we call it the cesm driver then we should be able to put it into main and then adapt or duplicate it for an accessom3 driver as needed, right?

@dougiesquire
Collaborator Author

But if we call it the cesm driver then we should be able to put it into main and then adapt or duplicate it for an accessom3 driver as needed, right?

Yes... I think. But I think the cesm driver possibly needs a bit of rethinking.

@dougiesquire
Collaborator Author

In the TWG meeting it was noted that the io_layout = 1,1 was potentially a limiting factor to scalability, so you're definitely going to want to be able to collate outputs, which is already something that is handled by the fms driver. I don't think you want to reimplement that. A couple of options spring to mind:

  1. Split the collating code out to some sort of 'tools' module and import it in both drivers
  2. Do some sort of fancy-dancy python multiple inheritance which I have zero knowledge of to grab the collate functionality from the fms driver (not even sure that is possible, pure speculation)

Possibly an FmsCollate mixin?

@aidanheerdegen
Collaborator

But I think the cesm driver possibly needs a bit of rethinking.

If it works I'd merge and work on improvements with follow up PRs, unless we're talking a radical rethink.

Possibly an FmsCollate mixin?

Yeah, that is what I was thinking of with "fancy-dancy python multiple inheritance".
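
For readers unfamiliar with the pattern, a mixin of this kind could look roughly like the sketch below. It is not the payu implementation: Model, FmsCollateMixin, and the body of collate are hypothetical stand-ins that only show how two otherwise unrelated drivers can share one collation method through multiple inheritance.

```python
class Model:
    """Stand-in for a payu model driver base class (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def collate(self):
        # By default a driver has nothing to collate.
        return []


class FmsCollateMixin:
    """Hypothetical mixin holding the shared collation logic."""

    def collate(self):
        # A real implementation would locate tiled output and run a
        # collation tool; this sketch only records that it was called.
        return [f"collated FMS output for {self.name}"]


class Fms(FmsCollateMixin, Model):
    """FMS driver: picks up collate() from the mixin."""


class AccessOm3(FmsCollateMixin, Model):
    """CMEPS-based driver reusing the same collation code."""
```

Listing the mixin first in the bases places it ahead of Model in the method resolution order, so its collate overrides the default without either driver inheriting from the other.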

@dougiesquire
Collaborator Author

If it works I'd merge and work on improvements with follow up PRs, unless we're talking a radical rethink.

What I'm thinking is that we might want to have a different approach than specifying the CESM model components in the config.yaml. This only makes sense for CESM-CMEPS models and it's maybe a bit confusing to users how components is different than submodels.

As suggested by @micaeljtoliveira, I'm thinking a CesmCmepsBase class that specific configurations inherit from (e.g. AccessOm3).
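
A rough sketch of that idea follows, under the assumption that each concrete driver fixes its own component list. Only the class names CesmCmepsBase and AccessOm3 come from this thread; the components attribute, the describe method, and the MODEL_INDEX lookup are made up for illustration.

```python
class CesmCmepsBase:
    """Hypothetical base class for CESM-CMEPS configurations."""

    # Each concrete driver hard-codes the components in its executable,
    # so users do not need a `components` key in config.yaml.
    components = ()

    def describe(self):
        return f"{type(self).__name__}: {', '.join(self.components)}"


class AccessOm3(CesmCmepsBase):
    # Illustrative component names only.
    components = ("mom6", "cice6", "ww3")


# Hypothetical lookup from the `model:` value in config.yaml to a driver.
MODEL_INDEX = {"access-om3": AccessOm3}

driver = MODEL_INDEX["access-om3"]()
print(driver.describe())
```

The trade-off is the one raised above: the config file stays the same shape as for other payu models, at the cost of one small driver class per supported combination of components.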

@dougiesquire
Collaborator Author

There are some rough edges that should be fixed - e.g. payu sweep doesn't work reliably (sometimes removes link but not dir that was linked to)

@aekiss, I've been unable to reproduce this. Can you provide any more details about what happened?

@aekiss
Contributor

aekiss commented May 23, 2023

Just ignore my comment - it was very sporadic. If it happens enough to be a problem I'll make an issue for it.

@dougiesquire dougiesquire marked this pull request as ready for review May 23, 2023 06:23
@dougiesquire
Collaborator Author

@aidanheerdegen are you the right person to ping for a review?

There is now an AccessOm3 driver that inherits from a new CesmCmepsBase. The AccessOm3 driver will collate mom output. This can be used to run the config at:

(after a couple of small tweaks to the config that I haven't pushed yet). 

It would also be trivial to create drivers from CesmCmepsBase to run the following configs:

but I personally don’t think this is worthwhile. It will just clutter the payu.models module with drivers that probably no one will ever use. Instead, I’d suggest we archive those config repos and make it clear in their READMEs that those CESM-CMEPS configs are not supported in the main branch of payu. We can always add the drivers later if we decide we want them.

Once this PR is merged, I’ll update the configs above.

@aekiss
Contributor

aekiss commented May 23, 2023

Awesome, thanks @dougiesquire, I like this inheritance approach.

We should retain these configs, as these combinations will actually be used a lot, and so we will need payu support for them as part of the staged development plan
https://github.com/COSIMA/MOM6-CICE6
https://github.com/COSIMA/CICE6-WW3

@dougiesquire
Collaborator Author

dougiesquire commented May 23, 2023

We should retain these configs, as these combinations will actually be used a lot, and so we will need payu support for them as part of the staged development plan
https://github.com/COSIMA/MOM6-CICE6
https://github.com/COSIMA/CICE6-WW3

Sure. Any thoughts on what to call the payu models for these (ie what's entered in the config.yaml)? Perhaps something like "cesm-mom6-cice6" and "cesm-cice6-ww3"? If we go this route then perhaps we should rename "access-om3" to "cesm-mom6-cice6-ww3" for consistency (at least until things are more bedded down), although that's a bit of a mouthful...

@aekiss
Contributor

aekiss commented May 23, 2023

How about just "mom6-cice6", "cice6-ww3" and "mom6-cice6-ww3"? Or do we expect we will need different drivers for the access-om2 configs, in which case the cesm prefix would help differentiate them?

@dougiesquire
Collaborator Author

Or do we expect we will need different drivers for the access-om2 configs, in which case the cesm prefix would help differentiate them?

I don't understand sorry. There's already an AccessOm2 driver (model: access-om2) right? Are we expecting to need more access-om2 drivers?

@micaeljtoliveira
Contributor

About the naming scheme, I would actually propose something different. I would call all of those models ACCESS-OM3, even though that's incorrect if one considers that ACCESS-OM3 is always MOM6+CICE6+WW3. The point here is that we would like to have a compact and flexible way to swap some of the components for a data model in the payu config. So I would still allow one to specify the submodels in the payu config, but only for the ocean, waves and sea-ice components. All the rest (atm, etc) should be fixed. This last point is what would distinguish the access-om3 model in payu from the cesm model: some of the components would be hard-coded.

Does this make sense?

@aekiss
Contributor

aekiss commented May 24, 2023

Apologies @dougiesquire - I meant ACCESS-OM3, not 2

@dougiesquire
Collaborator Author

Does this make sense?

It does, but now we're back to the original problem of how to list the ACCESS-OM3 submodels in the payu config. We can't list them under submodels, as this is for when each submodel has its own executable. I originally added a components key, but I went off this approach since it means that the config set-up is different from that of all the other payu models. I'm quite possibly overthinking this...

A components key would mean the payu configs look like this, for example:

...

model: access-om3
components:
  - mom6
  - cice6
  - ww3
 
...

@micaeljtoliveira
Contributor

Couldn't we get the components list from the nuopc.runconfig file instead of getting it from the config file?
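
As a rough illustration of this suggestion, the sketch below pulls the component names out of run-config text. It assumes the file contains lines of the form `ATM_model = datm`; the sample excerpt and the function name read_components are assumptions for illustration and should be checked against a real nuopc.runconfig.

```python
# Illustrative excerpt only; a real nuopc.runconfig has many more entries
# and its exact layout should be confirmed before relying on this.
SAMPLE_RUNCONFIG = """\
component_list: MED ATM ICE OCN ROF
ALLCOMP_attributes::
     ATM_model = datm
     ICE_model = cice
     OCN_model = mom
     ROF_model = drof
::
"""


def read_components(text):
    """Map realm (lower case) to model name from `<REALM>_model = <name>` lines."""
    components = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        # Only keep assignments whose key ends in "_model".
        if sep and key.endswith("_model"):
            realm = key[: -len("_model")].lower()
            components[realm] = value.strip()
    return components


print(read_components(SAMPLE_RUNCONFIG))
```

Reading the list this way would keep nuopc.runconfig as the single source of truth, so config.yaml would not need a separate components key.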

@aekiss
Contributor

aekiss commented May 24, 2023

All the rest (atm, etc) should be fixed.

Actually I expect it will be common for users to want to modify these components, e.g. perturbing the forcing or runoff, or replacing it with a different product (eg ERA5) so some flexibility should be retained here too. These would always be data models but the data sources should be easy to modify.

@aidanheerdegen
Collaborator

Yes, but have been flat out! Sorry. Will look tomorrow.

@aidanheerdegen
Collaborator

We can't list them under submodels, as this is for when each submodel has its own executable.

So the mapping of CPUs between different sub-domains is done internally?

@aekiss
Contributor

aekiss commented May 26, 2023

I think so...?

@dougiesquire
Collaborator Author

So the mapping of CPUs between different sub-domains is done internally?

Yup, this is done in the nuopc.runconfig file. At the moment, for all the configs I set up, each component just runs sequentially and is allocated 48 cores.
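
To make the sequential layout concrete, here is a small sketch of how per-component task counts and root PEs translate into core ranges. The dictionary structure, the ntasks and rootpe field names, and both helper functions are assumptions for illustration, and the sketch ignores threading and PE strides.

```python
def pe_ranges(layout):
    """Return {component: (first_pe, last_pe)} from ntasks/rootpe settings."""
    return {
        comp: (s["rootpe"], s["rootpe"] + s["ntasks"] - 1)
        for comp, s in layout.items()
    }


def total_cores(layout):
    """Cores needed: one more than the highest PE index in use."""
    return max(last for _, last in pe_ranges(layout).values()) + 1


# Every component starts at PE 0 with 48 tasks, so they all share the same
# cores and take turns (the sequential layout described above).
sequential = {c: {"ntasks": 48, "rootpe": 0} for c in ("atm", "ocn", "ice")}

# Hypothetical concurrent layout: the ocean gets its own block of cores.
concurrent = dict(sequential, ocn={"ntasks": 48, "rootpe": 48})

print(total_cores(sequential), total_cores(concurrent))
```

In the sequential case the job needs only 48 cores in total, whereas giving one component its own block doubles the request but lets it run alongside the others.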

@aidanheerdegen aidanheerdegen merged commit a771fe7 into payu-org:master May 26, 2023
@dougiesquire dougiesquire deleted the cesm_cmeps branch May 26, 2023 07:10
@dougiesquire dougiesquire mentioned this pull request May 30, 2023