Create an 'EC-Earth CMIP6 data request' json for each MIP experiment #253
Comments
I think it is easiest to create this file with |
…s (but nemo only) in order to use this as a data request file by ece2cmor when cmorizing the result of test-all. It is kind of similar thing as asked for in #253 but then for this specific case.
Hi @treerink, what is the situation here? At SMHI we are in the process of settling on on-the-fly generated xlsx files for the data request. Basically we want to use
where we change the experiment, of course, but keep That means we need one data request file per experiment, regardless of the involved mips, multiplied by the configurations. I guess we should make a decision one way or the other (perhaps in the TWG?) and then document this so that everyone can approach this in the same way. |
@zklaus the original idea would be the most convenient: produce a json variant of the data request that includes only the variables which are requested for a certain experiment AND which can be produced by the EC-Earth3 model configuration in use, and archive it in the control output subdirectories for each experiment. The difficulty here, which kept us from implementing this quickly, is again the "preference" issue (also referred to here as the double-counting issue). The whole bunch of original xlsx CMIP6 data request files are of course produced by |
I've been thinking about this issue and we also discussed the xlsx files here at BSC. It would be nice to have the xlsx tables and/or the .json files that should be used by ece2cmor3 to cmorize each one of the MIPs in the ctrl folder. We could use this file as a reference for that MIP, assuming it was generated by the Data Request and has the correct variables. Right now our idea was to have the ppt/xml files in runtime/ctrl and the tables somewhere else, but I'm not sure this is the best approach. If we had a reliable table inside each folder, for DCPP, piControl, OMIP, etc., we can just point ece2cmor.py to that file. |
See also the discussion in #224. The solution of this issue to provide json data request files depends on a solution for the double-counting variables with a preference file. |
We just discussed the general design: whether and how we will create the json data request file, and where it will be archived. We noted that for a joint data request like for the Core MIP experiments run by the AOGCM version (the joined request of these 10 MIPs) the There will be created an additional script (which will be called for each experiment by The control output files themselves won't be made preference (i.e. Earth3 model configuration) specific, in order to keep the design clear, at the cost of a tiny bit of additional (useless) output. |
This sounds good. Indeed, the
Sounds good.
Wrt the configurations, note that this is CMIP6 controlled vocabulary as
Note the capitalization, the presence of the 3, the absence of an explicit
I'm not sure I understand how treating the CMIP experiments differently from the others simplifies things, but I guess you are in the better position to judge that. |
Subtasks for this issue
|
When running:
I get the following additions when changing from e46cc12 to the latest version 3ae9a71:
|
Hi Gijs, I also get quite a few differences in the output of |
Hi @treerink yes I changed the task loader, so this is expected to impact the genecec script. I do expect that it generates more 'double counted' variables, because the realm check was there to prevent such variables. I inserted a new warning whenever a duplicate variable is encountered:
so searching for this message may pinpoint where the script is behaving differently... |
…nly for the "first" model configuration #253.
…h used EC-Earth3 model configuration #253.
Hi @treerink ,
and then
No output is produced. (As a side note, the IFS job still goes on doing all the time-consuming grib filtering.) |
Hi Uwe, you aren't doing anything wrong; this is a sign that our "preference" script is incomplete, since it doesn't make a choice between ifs and lpjg for e.g. mrsos. I will make the preferences complete and add a check for ifs variables before entering the grib filtering. |
Hi @ufladrich or @tommibergman can you post the list of duplicate variables that were reported? |
I got these: mrsos Some of them are doubly mentioned through different tables, but maybe that doesn't matter. |
Ok I committed a fix in which the above variables will be removed from the lpjguess variable list. |
…previous varlist.json), and then adding the production of the new varlist.json by adding a call to drq2varlist #253.
…alled now drqlist.json while the ones created by drq2varlist will get the name varlist.json #253.
…to the new type varlist.json files #253.
I have yet to understand what "preference" means in the context of this issue. @goord when you say above that the "preference script is incomplete", do you mean |
There are two more duplicated targets:
and some duplicated output names:
I'm not sure what to think about the latter, according to the CMIP6-CMOR tables the duplication is okay. |
It means that the |
Hi @ufladrich the preference script is here. It is just a python function that determines which variables to keep for which configurations and which to dismiss. Yes the preference logic is called from drq2varlist. This script gathers all variables that any EC-Earth component could produce, and then runs all of them through the preference function that determines whether to keep it or not. This procedure is supposed to yield a unique set of variables for all data requests and all EC-Earth configurations. Whenever you call ece2cmor with the --drq option, it does a drq2varlist first and then a cmorization with the component-wise variable set. It performs a check on the latter to ensure there are no duplicates, because that may give rise to files being overwritten. BTW whenever calling ece2cmor with --drq option or drq2varlist, it is best to give also a target EC-Earth configuration (use --help to get a list of those), because that can be used to determine the preference and hence reduces the chance of ending up with duplicates. |
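The procedure described above (gather every candidate variable, run each through a preference function, then check the result for duplicates) can be sketched roughly as follows. This is a minimal illustration, not the actual ece2cmor3 code: the names `CANDIDATES`, `CONFIG_COMPONENTS`, `keep_variable` and `select`, the configuration labels, and the single "prefer ifs over lpjg" rule are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical candidates: (component, table, variable) triples that the
# EC-Earth components could produce for some data request.
CANDIDATES = [
    ("ifs", "Lmon", "mrsos"),
    ("lpjg", "Lmon", "mrsos"),
    ("ifs", "Amon", "tas"),
    ("nemo", "Omon", "tos"),
    ("lpjg", "Lmon", "lai"),
]

# Hypothetical mapping from configuration label to its active components.
CONFIG_COMPONENTS = {
    "EC-EARTH-AOGCM": {"ifs", "nemo"},
    "EC-EARTH-Veg": {"ifs", "nemo", "lpjg"},
}


def keep_variable(component, table, variable, offered_by, active):
    """Return True if this component should produce the variable."""
    # Components outside the target configuration never produce output.
    if component not in active:
        return False
    # Illustrative preference rule: if ifs and lpjg both offer the same
    # variable in the same table, keep the ifs one.
    if component == "lpjg" and "ifs" in offered_by and "ifs" in active:
        return False
    return True


def select(candidates, config):
    """Apply the preference function and report remaining duplicates."""
    active = CONFIG_COMPONENTS[config]
    offered = {}
    for comp, table, var in candidates:
        offered.setdefault((table, var), set()).add(comp)
    kept = [
        c for c in candidates
        if keep_variable(*c, offered[(c[1], c[2])], active)
    ]
    # A (table, variable) pair kept more than once could overwrite a file.
    counts = Counter((table, var) for _, table, var in kept)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    return kept, duplicates


if __name__ == "__main__":
    kept, duplicates = select(CANDIDATES, "EC-EARTH-Veg")
    print(kept)
    print("duplicates:", duplicates)
```

If the preference rule is removed from `keep_variable`, the `mrsos` pair shows up in `duplicates`, which mirrors the incomplete-preferences case reported earlier in this thread.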
The duplication of ua, ua7h etc. is a problem because it will cause overwritten variables since the output file names for these variables are identical (see issue #334 ). I believe they have different priorities, and we should decide which ones to keep. |
Hi @treerink, the biggest change is the removal of variables for components that are not in the EC-Earth configuration. I figured that e.g. AOGCM experiments should not be bothered with duplicates from e.g. land-surface or tm5, right? I guess this will remove a lot of variables; I would expect entire blocks of component variables to be removed for certain configurations. |
Hi @goord, OK, that indeed seems to be the case. I show just one example below; can you check this?
71a72
> "evspsblsoi",
116c117,199
< "lpjg": {},
---
> "lpjg": {
> "Amon": [
> "fco2antt",
> "fco2nat"
> ],
> "Emon": [
> "cSoil",
> "mrsol",
> "treeFracNdlDcd",
> "treeFracBdlEvg",
> "treeFracBdlDcd",
> "grassFracC3",
> "grassFracC4",
> "pastureFracC3",
> "pastureFracC4",
> "nep",
> "fLuc",
> "cWood",
> "nwdFracLut",
> "fracLut",
> "vegFrac",
> "treeFracNdlEvg",
> "cropFracC3",
> "cropFracC4"
> ],
> "Eyr": [
> "treeFrac",
> "grassFrac",
> "shrubFrac",
> "cropFrac",
> "vegFrac",
> "baresoilFrac",
> "fracOutLut",
> "fracInLut",
> "fracLut"
> ],
> "Lmon": [
> "mrsos",
> "mrso",
> "mrros",
> "mrro",
> "prveg",
> "evspsblveg",
> "evspsblsoi",
> "tran",
> "tsl",
> "treeFrac",
> "grassFrac",
> "shrubFrac",
> "cropFrac",
> "pastureFrac",
> "baresoilFrac",
> "residualFrac",
> "cVeg",
> "cLitter",
> "cProduct",
> "lai",
> "gpp",
> "ra",
> "npp",
> "rh",
> "fFire",
> "fGrazing",
> "fHarvest",
> "nbp",
> "fVegLitter",
> "fLitterSoil",
> "cLeaf",
> "cRoot",
> "cCwd",
> "cLitterAbove",
> "cLitterBelow",
> "cSoilFast",
> "cSoilMedium",
> "cSoilSlow",
> "landCoverFrac",
> "rGrowth",
> "rMaint"
> ],
> "day": [
> "mrso"
> ]
> },
282c365,378
< "tm5": {}
---
> "tm5": {
> "AERmon": [
> "abs550aer",
> "od550aer"
> ],
> "Amon": [
> "o3",
> "o3Clim",
> "ch4",
> "ch4Clim",
> "ch4global",
> "ch4globalClim"
> ]
> } |
So this is for the AOGCM configuration I assume? Yes |
So @ufladrich and @tommibergman if you run drq2varlist or ece2cmor with the --drq option and you don't want to be bothered with duplicates from other submodels than your targeted EC-Earth configuration, you have to provide your configuration, e.g.
to remove all variables not in ifs or nemo. |
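To make the effect of passing a configuration concrete, here is a small sketch that empties every component block not belonging to the target configuration, which is the pattern visible in the diff above (`"lpjg": {}` and `"tm5": {}`). The function name `restrict_to_components` and the `ifs`/`nemo` entries in the example data are assumptions for illustration; the real command-line option is not reproduced here.

```python
import json


def restrict_to_components(varlist, active):
    """Keep the tables of active components; empty all other components."""
    return {
        comp: (tables if comp in active else {})
        for comp, tables in varlist.items()
    }


# Example varlist in the component -> table -> variables layout.
# The ifs and nemo entries are made up for this sketch.
varlist = {
    "ifs": {"Amon": ["tas"]},
    "nemo": {"Omon": ["tos"]},
    "lpjg": {"Lmon": ["mrsos", "lai"], "day": ["mrso"]},
    "tm5": {"AERmon": ["od550aer"]},
}

# An AOGCM-like configuration is assumed to consist of ifs and nemo only.
aogcm = restrict_to_components(varlist, {"ifs", "nemo"})
print(json.dumps(aogcm, indent=4, sort_keys=True))
```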
@treerink I removed tsl from lpjguess in the prefs.py and fixed a bug concerning EC-EARTH-CC so you may want to regenerate the json files... |
I had |
Done, the current latest version of the control output files in the |
I think we can (nearly) close this issue. The only sub-issue that I am not sure has been solved by now is the one about the "duplication of ua, ua7h etc.", which is a problem because variables will be overwritten. |
A separate issue is created in #422 for the last sub issue mentioned above. Closing this issue. |
With 'EC-Earth CMIP6 data request' I mean the subset of CMIP6 requested variables for a certain MIP experiment which indeed can be produced by EC-Earth3.
If this 'EC-Earth CMIP6 data request' is written to a json file, it can easily be used as the data request file at the time of cmorization, it can easily be diffed, and it can be copied into the namelist subdir of each MIP experiment and thus archived in the EC-Earth svn repository. The latter wouldn't be a good idea with the *.xlsx data request files.
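As an illustration of the "easily diffed" point, the sketch below compares two json data requests variable by variable instead of line by line. It assumes the component -> table -> variables layout seen elsewhere in this thread; the function names `flatten` and `diff_varlists` are hypothetical, and in practice the two dictionaries would come from `json.load` on the two files being compared.

```python
def flatten(varlist):
    """Turn a component -> table -> [variables] dict into a set of triples."""
    return {
        (comp, table, var)
        for comp, tables in varlist.items()
        for table, names in tables.items()
        for var in names
    }


def diff_varlists(old, new):
    """Return (added, removed) variable triples between two varlists."""
    old_set, new_set = flatten(old), flatten(new)
    return sorted(new_set - old_set), sorted(old_set - new_set)


old = {"ifs": {"Lmon": ["mrsos"]}, "lpjg": {}}
new = {
    "ifs": {"Lmon": ["mrsos"]},
    "lpjg": {"Amon": ["fco2antt", "fco2nat"]},
}

added, removed = diff_varlists(old, new)
for comp, table, var in added:
    print(f"+ {comp} {table} {var}")
for comp, table, var in removed:
    print(f"- {comp} {table} {var}")
```

Comparing sets of triples ignores ordering within a table, so a reordered but otherwise identical file produces no differences.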