
Using C‐ESM‐EP with libIGCM

Stéphane Sénési edited this page Oct 14, 2024 · 119 revisions

Shortcuts

The full list of config card Post section parameters.
The Outputs.
The slices example.

Content

Pre-requisites :

  • some basic knowledge of C-ESM-EP (i.e. organization of a C-ESM-EP home directory, and concept of a 'comparison')
  • how to run a libIGCM simulation

Introduction

When running a simulation (at TGCC on Irene and Irene-rome, at IDRIS on Jean-Zay and at IPSL mesocenter on Spirit) you can trigger the computation of a C-ESM-EP atlas based on one of the output types (Seasonal, TimeSeries, packed Output, or even Output on scratch for e.g. a TEST simulation). This works when using a libIGCM version from trunk with a release number >= 1633.

The atlas shows a series of the most recent time slices of the running simulation. Slices period, number and duration are tunable.

You just have to set one or more parameters in the simulation config.card. At IDRIS, you also have to go (once and for all) through an init phase.

You can also run C-ESM-EP on a simulation that already ran; see Using C-ESM-EP on a pre-existing simulation.

Basic use

Init phase

At IDRIS you must tell singularity where to find a relevant environment, by interactively issuing a few commands:

module load singularity
idrcontmgr cp /gpfswork/rech/psl/commun/Tools/cesmep_environment/<file>

where <file> is the most recent .sif file there (ask your C-ESM-EP guru in case of trouble). You must check using idrcontmgr ls that you have only one registered file; use idrcontmgr rm ... to discard any other file.

This init phase should be done only once by each user. The path above is correct at the time of writing but may have changed.

On all centers you also have to make sure that you are allowed to use the Thredds. At IDRIS and TGCC, please test it using command thredds_cp. On Spirit, try touch /thredds/ipsl/<username>

Settings:

Computing a C-ESM-EP atlas is triggered using parameter Cesmep in section Post of the config.card, which can be set to :

  • TRUE, for using the most efficient output type (among those which are generated by the simulation, so SE else TS else packed Outputs).
  • SE, TS, or Pack, for selecting explicitly an output type (and its time slices duration as CesmepPeriod); that output type must be activated independently
  • AtEnd for running the atlas only at simulation end, from Output type data (this works also for TEST simulations with output on scratch)
  • FALSE for deactivating any C-ESM-EP atlas (which is the default)
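The selection rule above can be summarised by the following Python sketch. It only illustrates the documented behaviour and is not the libIGCM code: the function name `select_output_type`, the `activated` argument and the choice to raise an error when an explicitly requested type is not activated are assumptions made for the example.

```python
def select_output_type(cesmep, activated):
    """Return the output type feeding the atlas, or None when it is disabled.

    cesmep    : value of parameter Cesmep (TRUE, SE, TS, Pack, AtEnd or FALSE)
    activated : set of output types produced by the simulation, e.g. {"TS", "Pack"}
    """
    if cesmep == "FALSE":
        return None
    if cesmep == "AtEnd":
        # Atlas computed once, at simulation end, from Output type data
        return "Output"
    if cesmep == "TRUE":
        # Most efficient type first: SE, else TS, else packed Outputs
        for candidate in ("SE", "TS", "Pack"):
            if candidate in activated:
                return candidate
        return None
    if cesmep in ("SE", "TS", "Pack"):
        if cesmep not in activated:
            # Assumption of this sketch: reject a type that is not activated
            raise ValueError(f"output type {cesmep} is not activated")
        return cesmep
    raise ValueError(f"unexpected Cesmep value: {cesmep}")
```

For instance, `select_output_type("TRUE", {"TS", "Pack"})` picks `"TS"` because no seasonal output is activated.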

Note : at the time of writing, SE data are ill-formed regarding time origin metadata and cannot be processed by the C-ESM-EP

In same section:

  • parameter CesmepReferences sets one or more reference simulation(s) and their periods. The simulation(s) must be IGCM_OUT simulation(s). The parameter value should be a comma-separated list of the full paths of their output directories, up to the e.g. TS_MO part, each suffixed by the period to process; the 'DIR' part of the path (e.g. 'ATM', 'OCE'...) can be the wildcard '*'; see the example. You can also use keyword default to use default climatologies (this is implicit if parameter CesmepReferences is missing). NEMO components do not yet support multiple references and will use only the first one.
  • parameter CesmepSlices sets the number of time slices to show in the atlas (it defaults to 8)
  • parameter CesmepPeriod sets the time interval between each slice, with the form <nyears>Y.
  • parameter CesmepSlicesDuration sets the duration of each slice, with the form <nyears>Y.
  • parameter CesmepInputFrequency should be set to daily or yearly if C-ESM-EP has to use simulation outputs at that frequency

Plot slices end at intervals regularly spaced by CesmepPeriod years (counted from the simulation start year), and each covers a duration of CesmepSlicesDuration (backward).

Slices example

Simulation first year = 1850

CesmepPeriod = 10Y

CesmepSlicesDuration = 3Y (Dur)

CesmepSlices = 2

First slice = [1857,1860[ (or [1857,1859])

    Simulation
       Start
        |
      1850+-+-+-+-+-+-+-+-1860+-+-+-+-+-+-+-+-1870+-+-+-+-+-+-+-+-1880+-+-+-+-+-+-+-+-1890
        | 1 2 3 4 5 6 7 8 9 | 1 2 3 4 5 6 7 8 9 | 1 2 3 4 5 6 7 8 9 | 1 2 3 4 5 6 7 8 9 |
        |                   |                   |                   |                   |
        [<--CesmepPeriod--->[<--CesmepPeriod--->[<--CesmepPeriod--->[<--CesmepPeriod--->[
        |                   |                   |                   |                   |
        |             [<Dur>[             [<Dur>[             [<Dur>[             [<Dur>[
        |             +-+-+-+             +-+-+-+             +-+-+-+             +-+-+-+
        |             [Slice[             [Slice[             [Slice[             [Slice[
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

If the simulation post-processing step has processed data ending at year 1879, two plots will show: [1867-1869] and [1877-1879]. No newer plot will show until a post-processing delivers data for 1889.

Note : if CesmepPeriod is longer than the simulation duration, you won't get any plot, except if Cesmep = AtEnd
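The slice scheme of this example can be reproduced with a few lines of Python. This is a minimal sketch of the rule as described on this page, not the code actually used by C-ESM-EP; the function name `atlas_slices` and its arguments are hypothetical, and all durations are given as integer numbers of years.

```python
def atlas_slices(start_year, period, duration, n_slices, data_end):
    """Return the (first_year, last_year) pairs, both inclusive, of the slices shown.

    start_year : first year of the simulation
    period     : CesmepPeriod, in years
    duration   : CesmepSlicesDuration, in years
    n_slices   : CesmepSlices
    data_end   : last year for which post-processed data is available
    """
    # Number of complete periods covered by the data available so far
    complete = (data_end + 1 - start_year) // period
    # Keep the n_slices most recent boundaries, never going before the first one
    first = max(1, complete - n_slices + 1)
    slices = []
    for k in range(first, complete + 1):
        end_exclusive = start_year + k * period
        slices.append((end_exclusive - duration, end_exclusive - 1))
    return slices


print(atlas_slices(1850, 10, 3, 2, 1879))  # [(1867, 1869), (1877, 1879)]
print(atlas_slices(1850, 10, 3, 2, 1855))  # [] : no period is complete yet
```

With data available up to 1888 the result is unchanged; the slice [1887-1889] only appears once data for 1889 has been delivered.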

Example of config.card minimal content (in section 'Post'):

#D- Activate C-ESM-EP atlas by setting Cesmep to TRUE 
Cesmep=TRUE

See here for an example showing the full list of settings.

How it works

When installing a simulation using ins_job, the C-ESM-EP code (from a reference code version) is partially copied to $SUBMIT_DIR/cesmep_lite/, which becomes the root C-ESM-EP directory for that simulation.

The C-ESM-EP comparison that is run by default is run_comparison and, in directory cesmep_lite/, that comparison name is further prefixed by your JobName (this matters when looking for outputs, see below).

The atlas is computed each time a batch of output is available for the selected output type, provided it allows processing a new time slice. You can also launch an atlas computation manually, see Re-running C-ESM-EP for a given period.

The end years of time slices are aligned with the simulation start date and comply with the values provided for parameters CesmepSlices and CesmepPeriod (see example above).

The account used for C-ESM-EP jobs is the one used by libIGCM for the simulation. If you wish to change it, please edit file cesmep_lite/settings.py accordingly (after execution of ins_job and before the end of first simulation period).

Outputs

  • The standard output of the last C-ESM-EP launch is available in $SUBMIT_DIR/cesmep_lite/libIGCM_post.out
  • The output for each component run is located, as for all C-ESM-EP runs, in the component directory
  • The atlas main page is available on thredds/work, as for other C-ESM-EP simulations. For example, if JobName is 'piCesmep5' on Irene, the atlas main index could be found at https ://thredds-su.ipsl.fr/thredds/fileServer/tgcc_thredds/work/senesis/C-ESM-EP/IPSLCM6/DEVT/piControl/piCesmep5/Output/piCesmep5_run_comparison/AtlasExplorer/atlas_AtlasExplorer_piCesmep5_run_comparison.html (this link is not active; you can have a look here at an example of C-ESM-EP atlas, for comparison standard_comparison, which however was not generated using libIGCM)

The actual location value for your simulation can be found in the file quoted above, $SUBMIT_DIR/cesmep_lite/libIGCM_post.out

  • You can receive mails for the completion of each new atlas slice by setting CesmepMail=TRUE in config.card, and by providing your email address (either in config.card using parameter MailName in section UserChoices, or through the content of file ~/.forward). Depending on the content of file $SUBMIT_DIR/cesmep_lite/settings.py (see there variable one_mail_per_component), you will get a mail for each component's job, or a mail for the set of jobs. Please pay attention to the value of exit code in the mail subject line.

Advanced use

C-ESM-EP version used

By default, the C-ESM-EP code used is a shared, reference one (whose location is shown below); this can be changed using parameter CesmepCode in the Post section of config.card.

The reference C-ESM-EP code locations are :

  • at TGCC : ~igcmg/Tools/cesmep
  • at IDRIS : /gpfswork/rech/psl/commun/Tools/cesmep
  • on spirit: /net/nfs/tools/Users/SU/jservon/cesmep_installs/cesmep_for_libIGCM

C-ESM-EP 'comparison', 'components', and their parameters

The C-ESM-EP 'comparison' can be chosen using config.card's Post parameter CesmepComparison. Its default value is run_comparison, which includes a limited number of components, and is not suited to an Orchidee Offline run (please use standard_comparison for that).

The comparison 'components' are activated based on the simulation physical components; their list can be changed manually after running ins_job by editing file $SUBMIT_DIR/cesmep_lite/libIGCM_post.param. See file's description here

At that stage, you may also change component parameters in component directories in $SUBMIT_DIR/cesmep_lite/.

Additionally, you may make changes to the datasets_setup.py source to customize the datasets to use; for that, you can make use of the variables available in file libIGCM_fixed_settings.py of the comparison directory, for example:

root           = '/ccc/store/cont003/gen0826'
Login          = 'senesis'
TagName        = 'IPSLCM6'
SpaceName      = 'DEVT'
ExperimentName = 'piControl'
JobName        = 'piCesmep'
OUT            = 'Analyse'
frequency      = 'monthly'
DateBegin      = '18500101'
CesmepSlices   = 4
CesmepSlicesDuration   = 4
CesmepPeriod   = 1

These names are self-explanatory in C-ESM-EP and libIGCM contexts, except DateBegin, which is the simulation start date.

CliMAF cache used

The location for CliMAF cache is dedicated to the simulation and under a root path chosen by C-ESM-EP :

${root}/cesmep_climaf_caches/${OUT}_${TagName}_${SpaceName}_${ExperimentName}_${JobName}

With:

  • on Irene, root=${CCCSCRATCHDIR}
  • on Jean-Zay, root=$SCRATCH
  • on Spirit, root=/scratchu/$user
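As an illustration, the cache location pattern can be written as a small Python helper. This is only a sketch of the naming rule given above: the function name `climaf_cache_dir` is hypothetical, and the login `jdoe` in the usage line is a placeholder.

```python
import posixpath


def climaf_cache_dir(root, OUT, TagName, SpaceName, ExperimentName, JobName):
    """Build the CliMAF cache directory dedicated to one simulation."""
    # Last path segment: the five identifiers joined by underscores
    leaf = "_".join([OUT, TagName, SpaceName, ExperimentName, JobName])
    return posixpath.join(root, "cesmep_climaf_caches", leaf)


# On Spirit, for a hypothetical user 'jdoe' and the settings shown earlier
print(climaf_cache_dir("/scratchu/jdoe", "Analyse", "IPSLCM6", "DEVT",
                       "piControl", "piCesmep"))
# /scratchu/jdoe/cesmep_climaf_caches/Analyse_IPSLCM6_DEVT_piControl_piCesmep
```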

You may change the cache location by editing file $SUBMIT_DIR/cesmep_lite/libIGCM_post.param. See file's description here. However, this would disturb its automated cleaning by libIGCM house-keeping script(s).

Example of config.card full content

Example:

#D- Activate C-ESM-EP atlas by setting Cesmep to TRUE, or to SE, TS, Pack or AtEnd. 
#D- This defines the atlas period, except case TRUE (defaults to FALSE)
Cesmep=TRUE

#D- Name of C-ESM-EP 'comparison' to run (defaults to run_comparison)
CesmepComparison=run_comparison

#D- Tell where the C-ESM-EP source code is (defaults to a center-dependent value)
CesmepCode=/ccc/cont003/home/igcmg/igcmg/Tools/cesmep/

#D-If C-ESM-EP has to use daily or yearly simulation outputs, state it here.
CesmepInputFrequency=monthly  # 'monthly' also fits for seasonal outputs

#D- How many time slices in C-ESM-EP atlas (defaults to 8, use 'max' for covering the whole simulation)
CesmepSlices=4

#D- Period of data slices in C-ESM-EP atlas (defaults to the period of the selected output type)
CesmepPeriod=20Y

#D- Duration of each time slice in C-ESM-EP atlas (defaults to CesmepPeriod)
CesmepSlicesDuration=5Y

#D- Send mail for each Cesmep Period (either a single one, or one per 
#D- component, depending on settings.py)(defaults to FALSE)
CesmepMail=TRUE

#D- Paths for reference simulation(s) (up to segment OUT, and with a period suffix) 
#D- You can also use value 'default'
#D- Optional. Default is to use C-ESM-EP default observation dataset for each variable
CesmepReferences=/ccc/store/cont003/gencmip6/lurtont/IGCM_OUT/IPSLCM6/PROD/historical/CM61-LR-hist-01/*/Analyse/TS_MO/1980_1989,/ccc/store/cont003/gencmip6/lurtont/IGCM_OUT/IPSLCM6/PROD/ssp245/CM61-LR-scen-ssp245-r6/ATM/Analyse/TS_MO/2015_2025,default
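To make the structure of a CesmepReferences value explicit, here is a minimal Python sketch that splits it into its entries. It is not the parser used by libIGCM or C-ESM-EP: the function name `parse_references` and the returned dictionaries are hypothetical, and the period suffix is assumed to always have the form `<begin>_<end>` in years, as in the example above.

```python
def parse_references(value):
    """Split a CesmepReferences value into a list of reference descriptions."""
    references = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        if item == "default":
            # Keyword selecting the default climatologies
            references.append({"kind": "default"})
            continue
        # The last path segment is the period, e.g. 1980_1989
        path, period = item.rsplit("/", 1)
        begin, end = period.split("_")
        references.append({"kind": "simulation", "path": path,
                           "begin": int(begin), "end": int(end)})
    return references


value = ("/ccc/store/cont003/gencmip6/lurtont/IGCM_OUT/IPSLCM6/PROD/historical/"
         "CM61-LR-hist-01/*/Analyse/TS_MO/1980_1989,default")
for reference in parse_references(value):
    print(reference)
```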

Re-running C-ESM-EP for a given period

You can easily re-run C-ESM-EP by executing:

$SUBMIT_DIR/cesmep_lite/libIGCM_post.sh <begin> <end> [<cesmep_component>]

This is the script that libIGCM calls at the end of the type of post-processing chosen (in config.card) to feed data to C-ESM-EP (or at the end of simulation for case Cesmep=AtEnd). It submits one job per component.

Parameters begin and end normally describe (in years) the data period whose post-processing just completed. The script diagnoses which is the last time slice to show in the atlas, namely the one which ends in the given period (slice scheme here); note that this slice may end before end. C-ESM-EP will show the requested number of slices, ending with that one.

You may change CesmepPeriod, CesmepSlices and CesmepSlicesDuration in file libIGCM_fixed_settings.py of the comparison directory to finely drive the choice of slices shown. An alternate method is to modify the parameters in config.card, followed by a re-run of ins_job (see next paragraph).

If you want to process only one C-ESM-EP component, you may pass its name as argument #3
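If you prefer to drive such re-runs from Python, the command line above can be assembled as follows. This is a sketch under assumptions: the helper name `rerun_command` is hypothetical, `begin` and `end` are taken to be plain years, and nothing here is provided by libIGCM itself.

```python
import posixpath


def rerun_command(submit_dir, begin, end, component=None):
    """Return the argument list for libIGCM_post.sh, without executing it."""
    script = posixpath.join(submit_dir, "cesmep_lite", "libIGCM_post.sh")
    command = [script, str(begin), str(end)]
    if component is not None:
        # Optional third argument: process a single C-ESM-EP component
        command.append(component)
    return command


print(rerun_command("/path/to/submit_dir", 1870, 1879))
```

The returned list can then be passed to `subprocess.run(command, check=True)` on the computing center, where `/path/to/submit_dir` is replaced by the value of `$SUBMIT_DIR`.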

You may change the reference simulation(s) by modifying file libIGCM_references.py in the comparison directory

Also, after any first run of C-ESM-EP (either launched by a post-processing or as explained just above), you can run it interactively as explained in that page. You may then wish to tune the last year of the last atlas slice by modifying the value of data_end in file cesmep_lite/<comparison>/libIGCM_settings.py

Using C-ESM-EP on your pre-existing simulation

You can install and run C-ESM-EP for a simulation which already ran, by :

  • making sure that the libIGCM release for your simulation is OK (see the introduction)
  • modifying its config card as indicated above (see the parameters)
  • running again ins_job (this will install the cesmep_lite directory and relevant setup files)
  • running C-ESM-EP as described just above

Using C-ESM-EP on somebody's simulation

If you have access, for somebody's simulation, to the configuration directory and to the output data, you can run C-ESM-EP on it :

  • create a fake configuration dir where you copy the config.card, change it to add relevant Cesmep parameters (and the JobName if you want), and run ins_job.
  • manually modify the last three lines of file cesmep_lite/*_run_comparison/libIGCM_fixed_settings.py in order to provide the root, Login and JobName values which match the corresponding segments of the location of the original simulation's output data. Leave untouched the similar values listed near the top of the file. See the comments at the end of the file, which are also reproduced below.
  • and go on as explained above using libIGCM_post.sh

Excerpt of libIGCM_fixed_settings.py:

# Next lines will allow to build the simulation output data path. This 
# is needed when creating a fake simulation for computing an atlas, and e.g.
# the user or the project used when running ins_job for C-ESM-EP (which 
# shows above) is not the same as when running the simulation (which shows 
# in the data path)
# Each such parameter should be specified

# DataPathRoot =   # e.g. '/ccc/store/cont003/gen0826'
# DataPathLogin =   # user login showing in the data path
# DataPathJobName =    # needed only if you changed w.r.t.the initial config.card
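To see which path segments these three parameters stand for, the sketch below rebuilds an output data path from its parts. The layout is an assumption inferred from the reference paths quoted in the config.card example on this page (root, login, IGCM_OUT, TagName, SpaceName, ExperimentName, JobName, component, OUT, output type); the function name `output_data_path` is hypothetical and this is not code from C-ESM-EP.

```python
import posixpath


def output_data_path(root, login, tag_name, space_name, experiment_name,
                     job_name, component="*", out="Analyse", kind="TS_MO"):
    """Compose a simulation output path from its segments (assumed layout)."""
    return posixpath.join(root, login, "IGCM_OUT", tag_name, space_name,
                          experiment_name, job_name, component, out, kind)


# Segments taken from the first reference path of the config.card example
print(output_data_path("/ccc/store/cont003/gencmip6", "lurtont", "IPSLCM6",
                       "PROD", "historical", "CM61-LR-hist-01"))
```

In this reading, DataPathRoot, DataPathLogin and DataPathJobName correspond to the `root`, `login` and `job_name` arguments.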

For power users

Directory cesmep_lite/ does not include all files of a standard C-ESM-EP root directory, in order to save inodes (this is achieved thanks to the PYTHONPATH set by libIGCM for running C-ESM-EP, and by symbolic links for some other files). If you wish to modify such files to further customize your run, just copy them to cesmep_lite/ and change them the way you like. This should be done after the ins_job call and before submitting the simulation job.

Using an alternate CliMAF version

The CliMAF version used by the C-ESM-EP is set in file cesmep_lite/setenv_C-ESM-EP.sh, in the section relevant for each computing center. At TGCC and IDRIS, this is done through a prefix of PYTHONPATH, and the default value is a version located in account igcmg. On Spirit, this is done through a load module command, but you can also add a prefix to PYTHONPATH after that command.

Using an alternate container

At TGCC, containers for running the C-ESM-EP are stored at /ccc/work/cont003/igcmg/igcmg/climaf_python_docker_archives/. When launching the C-ESM-EP using libIGCM_post.sh, you can use any alternate container located there by first setting export CESMEP_CONTAINER=ipsl:some_other_container. The default value is ipsl:cesmep_container. The list of available containers is the list of entries in subdirectory pcocc-rs/images/meta.

The containers repository location is set in cesmep_lite/setenv_C-ESM-EP.sh by PCOCC_CONFIG_PATH=/ccc/work/cont003/igcmg/igcmg/climaf_python_docker_archives/.config/pcocc, which leads to file repositories.yaml there. You may change this config path to point to another repositories.yaml file in order to use another repository.

Syntax of file libIGCM_post.param

This file ($SUBMIT_DIR/cesmep_lite/libIGCM_post.param) is read on each call of libIGCM_post.sh, so by default for each batch of output delivered by the simulation. Its content drives the corresponding C-ESM-EP run. Its fields are:

  • Cesmep code location
  • comparison name
  • simulation start date
  • cache location
  • components list