Obtaining input data

ESMValTool supports input data from climate models participating in CMIP6, CMIP5, CMIP3, and CORDEX as well as observations, reanalysis, and any other data, provided that it adheres to the CF conventions and the data is described in a CMOR table as used in the various Climate Model Intercomparison Projects.

Note

CORDEX support is still a work in progress. Contributions, in the form of pull request reviews <reviewing> or pull requests <esmvalcore:contributing>, are most welcome. We are particularly interested in contributions from people with a good understanding of the CORDEX project and its standards.

This section provides an introduction to getting (access to) climate data for use with ESMValTool.

Because the amount of data required by ESMValTool is typically large, it is recommended that you use the tool on a compute cluster where the data is already available, for example because it is connected to an ESGF node. Examples of such compute clusters are Levante and Jasmin, but many more exist around the world.

If you do not have access to such a facility through your institute or the project you are working on, you can request access by applying for the ENES Climate Analytics Service or, if you need longer term access or more computational resources, the IS-ENES3 Trans-national Access call.

If the options above are not available to you, ESMValTool also offers a feature to make it easy to download CMIP6, CMIP5, CMIP3, CORDEX, and obs4MIPs data from ESGF. ESMValTool also provides support for downloading some observational datasets from their source.

The chapter in the ESMValCore documentation on finding data <esmvalcore:findingdata> explains how to configure ESMValTool so it can find locally available data and/or download it from ESGF if it isn't available locally yet.

Models

If you do not have access to a compute cluster with the data already mounted, ESMValTool can automatically download any required data that is available on ESGF. This is the recommended approach for first-time users to obtain some data for running ESMValTool. For example, run

esmvaltool run --search_esgf=when_missing examples/recipe_python.yml

to run the default example recipe and automatically download the required data to the directory ~/climate_data. The data only needs to be downloaded once; every following run will re-use previously downloaded data stored in this directory. See esmvalcore:config-esgf for a more in-depth explanation and the available configuration options.

Alternatively, you can use an external tool called Synda to maintain your own collection of ESGF data.

Observations

Observational and reanalysis products in the standard CF/CMOR format used in CMIP and required by ESMValTool are available via the obs4MIPs and ana4mips projects at the ESGF (e.g., https://esgf-data.dkrz.de/projects/esgf-dkrz/). Their use is strongly recommended, when possible.

Other datasets not available in these archives can be obtained by the user from the respective sources and reformatted to the CF/CMOR standard. ESMValTool currently supports two ways to perform this reformatting (aka 'CMORization'):

  1. Using a CMORizer script: The first is to use a CMORizer script to generate a local pool of reformatted data that can readily be used by ESMValTool. This method is described in detail below.
  2. Using fixes for on-the-fly CMORization: The second way is to implement specific 'fixes' <esmvalcore:fixing_data> for your dataset. In that case, the reformatting is performed 'on the fly' during the execution of an ESMValTool recipe (note that one of the first preprocessor tasks is 'CMOR checks and fixes'). Details on this second method are given at the end of this chapter <inputdata_native_datasets>.

Using a CMORizer script

ESMValTool comes with a set of CMORizers readily available. The CMORizers are dataset-specific scripts that can be run once to generate a local pool of CMOR-compliant data. The necessary information to download and process the data is provided in the header of each CMORizing script. These scripts also serve as templates for creating new CMORizers for datasets not yet included. Note that datasets CMORized for ESMValTool v1 may not work with v2, due to the much stronger constraints on metadata set by the iris library.

ESMValTool provides the esmvaltool data command line tool, which can be used to download and format datasets.

To list the available commands, run

esmvaltool data --help

It is also possible to get help on specific commands, e.g.

esmvaltool data download --help

The list of datasets supported by ESMValTool through a CMORizer script can be obtained with:

esmvaltool data list

Datasets for which auto-download is supported can be downloaded with:

esmvaltool data download --config_file [CONFIG_FILE] [DATASET_LIST]

Note that all Tier3 and some Tier2 datasets for which auto-download is supported require authentication. In such cases, enter your credentials in your ~/.netrc file as explained here.

An entry to the ~/.netrc should look like:

machine [server_name] login [user_name] password [password]

Make sure that the permissions of the ~/.netrc file are set so only you and administrators can read it, i.e.

chmod 600 ~/.netrc
ls -l ~/.netrc

The latter command should show -rw-------.
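The permission check and the entry format above can also be verified from Python. The following is a minimal sketch using only the standard library; the helper names, the server name data.example.org, and the credentials are hypothetical and only serve as an illustration. It assumes a POSIX system.

```python
import netrc
import os
import stat
import tempfile


def is_private(path):
    """Return True if group and others have no permissions on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


def read_credentials(path, server_name):
    """Return (login, password) for server_name, or None if there is no entry."""
    entry = netrc.netrc(path).authenticators(server_name)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


def make_demo_netrc():
    """Write a throwaway netrc-style file with one hypothetical entry."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as handle:
        handle.write("machine data.example.org login alice password s3cret\n")
    os.chmod(path, 0o600)  # same effect as `chmod 600`
    return path


if __name__ == "__main__":
    demo = make_demo_netrc()
    print(is_private(demo))
    print(read_credentials(demo, "data.example.org"))
    os.remove(demo)
```

To check your real file, pass os.path.expanduser("~/.netrc") and the server name used in your entry.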

For other datasets, downloading instructions can be obtained with:

esmvaltool data info [DATASET]

To CMORize one or more datasets, run:

esmvaltool data format --config_file [CONFIG_FILE] [DATASET_LIST]
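When several datasets need to be processed, the download and format commands can be generated by a small script. The sketch below only builds and prints the command lines; it does not run them. It assumes that a single dataset name is a valid [DATASET_LIST] and that auto-download is supported for the datasets passed in, which it does not check. The function names and the file name config-user.yml are hypothetical.

```python
import shlex


def plan_cmorization(config_file, datasets):
    """Return the download and format commands for each dataset, in order."""
    commands = []
    for dataset in datasets:
        for step in ("download", "format"):
            commands.append(
                ["esmvaltool", "data", step, "--config_file", config_file, dataset]
            )
    return commands


def render(commands):
    """Join each command into a shell-safe line for review before running."""
    return "\n".join(shlex.join(command) for command in commands)


if __name__ == "__main__":
    # Dataset names taken from the table of supported datasets.
    print(render(plan_cmorization("config-user.yml", ["GPCC", "CRU"])))
```

Each returned list could be passed to subprocess.run once the plan has been reviewed.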

The path to the raw data to be CMORized must be specified in the user configuration file <config-user> as RAWOBS. Within this path, the data are expected to be organized in subdirectories corresponding to the data tier: Tier2 for freely available datasets (other than obs4MIPs and ana4mips) and Tier3 for restricted datasets (i.e., datasets that require registration to be retrieved or that are provided upon request to the respective contact or PI). The CMORization follows the CMIP5 CMOR tables for the OBS project and the CMIP6 CMOR tables for the OBS6 project. The resulting output is saved in the output_dir, again following the Tier structure. The output file names follow the definition given in the config-developer file <esmvalcore:config-developer> for the OBS project:

[project]_[dataset]_[type]_[version]_[mip]_[short_name]_YYYYMM_YYYYMM.nc

where project may be OBS (CMIP5 format) or OBS6 (CMIP6 format), and type may be sat (satellite data), reanaly (reanalysis data), ground (ground observations), clim (derived climatologies), or campaign (aircraft campaign).
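The naming pattern can be illustrated with a short sketch that builds and parses such file names. This is not the code ESMValTool uses internally; the function names are hypothetical, and the parser assumes that only the dataset name may contain underscores. The version value in the example is made up.

```python
import re

PROJECTS = ("OBS", "OBS6")
TYPES = ("sat", "reanaly", "ground", "clim", "campaign")

# Anchoring on the known type values lets the dataset name contain underscores.
_PATTERN = re.compile(
    r"^(?P<project>OBS6?)_(?P<dataset>.+)_(?P<type>" + "|".join(TYPES) + r")"
    r"_(?P<version>[^_]+)_(?P<mip>[^_]+)_(?P<short_name>[^_]+)"
    r"_(?P<start>\d{6})_(?P<end>\d{6})\.nc$"
)


def build_obs_filename(project, dataset, type_, version, mip, short_name, start, end):
    """Assemble a file name; start and end are YYYYMM strings."""
    if project not in PROJECTS:
        raise ValueError(f"project must be one of {PROJECTS}")
    if type_ not in TYPES:
        raise ValueError(f"type must be one of {TYPES}")
    return f"{project}_{dataset}_{type_}_{version}_{mip}_{short_name}_{start}_{end}.nc"


def parse_obs_filename(filename):
    """Split a file name into its facets, returned as a dict of strings."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a recognised OBS file name: {filename}")
    return match.groupdict()


if __name__ == "__main__":
    name = build_obs_filename(
        "OBS6", "ERA-Interim", "reanaly", "1", "Amon", "tas", "199001", "199012"
    )
    print(name)
    print(parse_obs_filename(name))
```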

At the moment, esmvaltool data format supports Python and NCL scripts.

Supported datasets for which a CMORizer script is available

A list of the datasets for which a CMORizer is available is provided in the following table.


| Dataset | Variables (MIP) | Tier | Script language |
| --- | --- | --- | --- |
| APHRO-MA | pr, tas (day), pr, tas (Amon) | 3 | Python |
| AURA-TES | tro3 (Amon) | 3 | NCL |
| BerkeleyEarth | tas, tasa (Amon), sftlf (fx) | 2 | Python |
| CALIPSO-GOCCP | clcalipso (cfMon) | 2 | NCL |
| CALIPSO-ICECLOUD | cli (Amon) | 3 | NCL |
| CDS-SATELLITE-ALBEDO | bdalb (Lmon), bhalb (Lmon) | 3 | Python |
| CDS-SATELLITE-LAI-FAPAR | fapar (Lmon), lai (Lmon) | 3 | Python |
| CDS-SATELLITE-SOIL-MOISTURE | sm (day), sm (Lmon) | 3 | NCL |
| CDS-UERRA | sm (E6hr) | 3 | Python |
| CDS-XCH4 | xch4 (Amon) | 3 | NCL |
| CDS-XCO2 | xco2 (Amon) | 3 | NCL |
| CERES-EBAF | rlut, rlutcs, rsut, rsutcs (Amon) | 2 | Python |
| CERES-SYN1deg | rlds, rldscs, rlus, rluscs, rlut, rlutcs, rsds, rsdscs, rsus, rsuscs, rsut, rsutcs (3hr); rlds, rldscs, rlus, rlut, rlutcs, rsds, rsdt, rsus, rsut, rsutcs (Amon) | 3 | NCL |
| CLARA-AVHRR | clt, clivi, lwp (Amon) | 3 | NCL |
| CLOUDSAT-L2 | clw, clivi, lwp (Amon) | 3 | NCL |
| CowtanWay | tasa (Amon) | 2 | Python |
| CRU | tas, pr (Amon) | 2 | Python |
| CT2019 | co2s (Amon) | 2 | Python |
| Duveiller2018 | albDiffiTr13 | 2 | Python |
| E-OBS | tas, tasmin, tasmax, pr, psl (day, Amon) | 2 | Python |
| Eppley-VGPM-MODIS | intpp (Omon) | 2 | Python |
| ERA5 [1] | cl, clt, evspsbl, evspsblpot, mrro, pr, prsn, ps, psl, ptype, rls, rlds, rlns, rlus [2], rsds, rsns, rsus [3], rsdt, rss, uas, vas, tas, tasmax, tasmin, tdps, ts, tsn (E1hr/Amon), orog (fx) | 3 | n/a |
| ERA5-Land [4] | pr | 3 | n/a |
| ERA-Interim | cl, cli, clivi, clt, clw, clwvi, evspsbl, hfds, hur, hus, lwp, orog, pr, prsn, prw, ps, psl, rlds, rlut, rlutcs, rsds, rsdt, rss, rsut, rsutcs, sftlf, ta, tas, tasmax, tasmin, tauu, tauv, tdps, tos, ts, tsn, ua, uas, va, vas, wap, zg | 3 | Python |
| ERA-Interim-Land | sm (Lmon) | 3 | Python |
| ESACCI-AEROSOL | abs550aer, od550aer, od550aerStderr, od550lt1aer, od870aer, od870aerStderr (aero) | 2 | NCL |
| ESACCI-CLOUD | clivi, clt, cltStderr, lwp, rlut, rlutcs, rsut, rsutcs, rsdt, rlus, rsus, rsuscs (Amon) | 2 | NCL |
| ESACCI-FIRE | burntArea (Lmon) | 2 | NCL |
| ESACCI-LANDCOVER | baresoilFrac, cropFrac, grassFrac, shrubFrac, treeFrac (Lmon) | 2 | NCL |
| ESACCI-LST | ts (Amon) | 2 | Python |
| ESACCI-OC | chl (Omon) | 2 | Python |
| ESACCI-OZONE | toz, tozStderr, tro3prof, tro3profStderr (Amon) | 2 | NCL |
| ESACCI-SEA-SURFACE-SALINITY | sos (Omon) | 2 | Python |
| ESACCI-SOILMOISTURE | dos, dosStderr, sm, smStderr (Lmon) | 2 | NCL |
| ESACCI-SST | ts, tsStderr (Amon) | 2 | NCL |
| ESACCI-WATERVAPOUR | prw (Amon) | 3 | Python |
| ESDC | tas, tasmax, tasmin (Amon) | 2 | Python |
| ESRL | co2s (Amon) | 2 | NCL |
| FLUXCOM | gpp (Lmon) | 3 | Python |
| GCP2018 | fgco2 (Omon [5]), nbp (Lmon [6]) | 2 | Python |
| GCP2020 | fgco2 (Omon [7]), nbp (Lmon [8]) | 2 | Python |
| GHCN | pr (Amon) | 2 | NCL |
| GHCN-CAMS | tas (Amon) | 2 | Python |
| GISTEMP | tasa (Amon) | 2 | Python |
| GLODAP | dissic, ph, talk (Oyr) | 2 | Python |
| GPCC | pr (Amon) | 2 | Python |
| GPCP-SG | pr (Amon) | 2 | Python |
| GRACE | lweGrace (Lmon) | 3 | Python |
| HadCRUT3 | tas, tasa (Amon) | 2 | NCL |
| HadCRUT4 | tas, tasa (Amon), tasConf5, tasConf95 | 2 | NCL |
| HadCRUT5 | tas, tasa (Amon) | 2 | Python |
| HadISST | sic (OImon), tos (Omon), ts (Amon) | 2 | NCL |
| HALOE | tro3, hus (Amon) | 2 | NCL |
| HWSD | cSoil (Lmon), areacella (fx), sftlf (fx) | 3 | Python |
| ISCCP-FH | alb, prw, ps, rlds, rlus, rlut, rlutcs, rsds, rsdt, rsus, rsut, rsutcs, tas, ts (Amon) | 2 | NCL |
| JMA-TRANSCOM | nbp (Lmon), fgco2 (Omon) | 3 | Python |
| JRA-25 | clt, hus, prw, rlut, rlutcs, rsut, rsutcs (Amon) | 2 | Python |
| Kadow2020 | tasa (Amon) | 2 | Python |
| LAI3g | lai (Lmon) | 3 | Python |
| LandFlux-EVAL | et, etStderr (Lmon) | 3 | Python |
| Landschuetzer2016 | dpco2, fgco2, spco2 (Omon) | 2 | Python |
| Landschuetzer2020 | spco2 (Omon) | 2 | Python |
| MAC-LWP | lwp, lwpStderr (Amon) | 3 | NCL |
| MERRA2 | sm (Lmon); clt, pr, evspsbl, hfss, hfls, huss, prc, prsn, prw, ps, psl, rlds, rldscs, rlus, rlut, rlutcs, rsds, rsdscs, rsdt, tas, tasmin, tasmax, tauu, tauv, ts, uas, vas, rsus, rsuscs, rsut, rsutcs, ta, ua, va, tro3, zg, hus, wap, hur, cl, clw, cli, clwvi, clivi (Amon) | 3 | Python |
| MLS-AURA | hur, hurStderr (day) | 3 | Python |
| MOBO-DIC_MPIM | dissic (Omon) | 2 | Python |
| MODIS | cliwi, clt, clwvi, iwpStderr, lwpStderr (Amon), od550aer (aero) | 3 | NCL |
| MSWEP [9] | pr | 3 | n/a |
| MTE | gpp, gppStderr (Lmon) | 3 | Python |
| NCEP-NCAR-R1 | clt, hur, hurs, hus, pr, prw, psl, rlut, rlutcs, rsut, rsutcs, sfcWind, ta, tas, tasmax, tasmin, ts, ua, va, wap, zg (Amon); pr, rlut, ua, va (day) | 2 | Python |
| NCEP-DOE-R2 | clt, hur, prw, ta (Amon) | 2 | Python |
| NDP | cVeg (Lmon) | 3 | Python |
| NIWA-BS | toz, tozStderr (Amon) | 3 | NCL |
| NOAA-CIRES-20CR | clt, clwvi, hus, prw, rlut, rsut (Amon) | 2 | Python |
| NOAAGlobalTemp | tasa (Amon) | 2 | Python |
| NSIDC-0116-[nh\|sh] | usi, vsi (day) | 3 | Python |
| OceanSODA-ETHZ | areacello (Ofx), co3os, dissicos, fgco2, phos, spco2, talkos (Omon) | 2 | Python |
| OSI-450-[nh\|sh] | sic (OImon), sic (day) | 2 | Python |
| PATMOS-x | clt (Amon) | 2 | NCL |
| PERSIANN-CDR | pr (Amon), pr (day) | 2 | Python |
| PHC | thetao, so (Omon [10]) | 2 | Python |
| PIOMAS | sit (day) | 2 | Python |
| REGEN | pr (day, Amon) | 2 | Python |
| Scripps-CO2-KUM | co2s (Amon) | 2 | Python |
| TCOM-CH4 | ch4 (Amon [11]) | 2 | Python |
| TCOM-N2O | n2o (Amon [12]) | 2 | Python |
| UWisc | clwvi, lwpStderr (Amon) | 3 | NCL |
| WFDE5 | tas, pr (Amon, day) | 2 | Python |
| WOA | thetao, so, tos, sos (Omon); no3, o2, po4, si (Oyr) | 2 | Python |

Datasets in native format

ESMValCore also provides support for some datasets in their native format. In this case, the steps needed to reformat the data are executed as dataset fixes during the execution of an ESMValTool recipe, as one of the first preprocessor steps, see fixing data <esmvalcore:fixing_data>. Compared to the workflow described above, this has the advantage that the user does not need to store a duplicate (CMORized) copy of the data. Instead, the CMORization is performed 'on the fly' when running a recipe. Native datasets can be hosted either under a dedicated project (usually done for native model output) or under project native6 (usually done for native reanalysis/observational products). These projects are configured in the config-developer file <esmvalcore:configure_native_models>.

A list of all currently supported native datasets is provided here <esmvalcore:read_native_datasets>. A detailed description of how to include new native datasets is given here <esmvalcore:add_new_fix_native_datasets>.

To use this functionality, users need to provide a path in the esmvalcore:user configuration file for the native6 project data and/or the dedicated project used for the native dataset, e.g., ICON. Then, in the recipe, they can refer to those projects. For example:

datasets:
- {project: native6, dataset: ERA5, type: reanaly, version: v1, tier: 3, start_year: 1990, end_year: 1990}
- {project: ICON, dataset: ICON, exp: icon-2.6.1_atm_amip_R2B5_r1i1p1f1, mip: Amon, short_name: tas, start_year: 2000, end_year: 2014}

For project native6, more examples can be found in the diagnostic ERA5_native6 in the recipe examples/recipe_check_obs.yml.


  1. CMORization is built into ESMValTool through the native6 project, so there is no separate CMORizer script.

  2. Derived on the fly from down & net radiation.

  3. Derived on the fly from down & net radiation.

  4. CMORization is built into ESMValTool through the native6 project, so there is no separate CMORizer script.

  5. The frequency of this variable differs from the one specified in the table. The correct entry that needs to be used in the recipe can be found in the corresponding section of recipe_check_obs.yml.

  6. The frequency of this variable differs from the one specified in the table. The correct entry that needs to be used in the recipe can be found in the corresponding section of recipe_check_obs.yml.

  7. The frequency of this variable differs from the one specified in the table. The correct entry that needs to be used in the recipe can be found in the corresponding section of recipe_check_obs.yml.

  8. The frequency of this variable differs from the one specified in the table. The correct entry that needs to be used in the recipe can be found in the corresponding section of recipe_check_obs.yml.

  9. CMORization is built into ESMValTool through the native6 project, so there is no separate CMORizer script.

  10. The frequency of this variable differs from the one specified in the table. The correct entry that needs to be used in the recipe can be found in the corresponding section of recipe_check_obs.yml.

  11. The frequency of this variable differs from the one specified in the table. The correct entry that needs to be used in the recipe can be found in the corresponding section of recipe_check_obs.yml.

  12. The frequency of this variable differs from the one specified in the table. The correct entry that needs to be used in the recipe can be found in the corresponding section of recipe_check_obs.yml.