# Community Earth System Model Large Ensemble (CESM LENS) Data Sets on AWS


## Overview

The [National Center for Atmospheric Research (NCAR)](https://ncar.ucar.edu/)
Community Earth System Model Large Ensemble
([CESM LENS](http://www.cesm.ucar.edu/projects/community-projects/LENS/))
dataset includes a 40-member ensemble of climate simulations for the period
1920-2100. All model runs were subject to the same radiative forcing scenario:
historical up to 2005, and RCP8.5 thereafter. (RCP8.5, or Representative
Concentration Pathway 8.5, is the worst-case scenario considered in the
[Fifth Assessment Report](https://www.ipcc.ch/report/ar5/wg1/) of the
Intergovernmental Panel on Climate Change, IPCC.) Each of the 40 runs begins
from a slightly different initial atmospheric state (created by randomly
perturbing temperatures at the level of round-off error). The data comprise both
surface (2D) and volumetric (3D) variables in the atmosphere, ocean, land, and
ice domains.

The total LENS data volume is ~500 TB, and is traditionally accessible through
the NCAR Climate Data Gateway
([CDG](https://www.earthsystemgrid.org/dataset/ucar.cgd.ccsm4.CESM_CAM5_BGC_LE.html))
for download or via web services. A subset (currently ~70 TB compressed)
including the most useful variables is now
[freely available on AWS S3](https://registry.opendata.aws/ncar-cesm-lens/)
thanks to the
[AWS Public Dataset Program](https://aws.amazon.com/opendata/public-datasets/).

Slides from an informational briefing about this dataset are available
[here](http://ncar-aws-www.s3-website-us-west-2.amazonaws.com/20200212.CESM_LENS.AWStelecon.pdf).


## Accessing CESM LENS on AWS

- S3 bucket name: **ncar-cesm-lens**
- Region: **us-west-2**
- Amazon resource name: **arn:aws:s3:::ncar-cesm-lens**
- Bucket contents list: https://ncar-cesm-lens.s3.amazonaws.com/


## Data

Zarr format: The LENS data on AWS are structured according to the
[Zarr](https://zarr.readthedocs.io/en/stable/) storage format. There are
independent Zarr stores for each component, frequency, experiment, and variable.
The naming convention is:
`{component}/{frequency}/cesmLE-{experiment}-{variable}.zarr` where:

- `component` = atm (atmosphere), lnd (land), ocn (ocean), ice_nh or ice_sh
  (ice, northern and southern hemispheres)
- `frequency` = monthly, daily, or hourly6-startYear-endYear (6-hourly data are
  available for distinct periods)
- `experiment` = 20C (20th century runs), RCP85 (RCP 8.5 runs), HIST (historical
  run), CTRL (fully-coupled control run), CTRL_AMIP (atmosphere-only control
  run), CTRL_SLAB (slab-ocean control run)
- `variable` = one of the variable names listed in the Data Catalog below
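
As a rough illustration of the naming convention, the sketch below assembles a store URL from its four parts. The helper names `zarr_store_url` and `open_store` are hypothetical, `TREFHT` is used only as an example variable name (check the catalog for what is actually available), and anonymous S3 access through `s3fs` and `xarray` is an assumption based on the bucket being public.

```python
BUCKET = "ncar-cesm-lens"

# Allowed values taken from the naming convention described above.
COMPONENTS = {"atm", "lnd", "ocn", "ice_nh", "ice_sh"}
EXPERIMENTS = {"20C", "RCP85", "HIST", "CTRL", "CTRL_AMIP", "CTRL_SLAB"}


def zarr_store_url(component, frequency, experiment, variable):
    """Build the S3 URL of a Zarr store (hypothetical helper)."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    if experiment not in EXPERIMENTS:
        raise ValueError(f"unknown experiment: {experiment!r}")
    return (
        f"s3://{BUCKET}/{component}/{frequency}/"
        f"cesmLE-{experiment}-{variable}.zarr"
    )


def open_store(url):
    """Lazily open a store, assuming anonymous (unsigned) S3 access works."""
    # Imported here so that building URLs does not require these packages.
    import s3fs
    import xarray as xr

    fs = s3fs.S3FileSystem(anon=True)
    return xr.open_zarr(fs.get_mapper(url))


# Example variable name only; it is not guaranteed to be in the catalog.
url = zarr_store_url("atm", "monthly", "20C", "TREFHT")
print(url)  # s3://ncar-cesm-lens/atm/monthly/cesmLE-20C-TREFHT.zarr
```

Calling `open_store(url)` would then return a dataset whose arrays are loaded lazily, provided the store exists and the network is reachable.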

### Data Catalog

The table below shows the available Zarr stores, including the experiments,
variables, time ranges, and 2D or 3D nature (3D means multiple atmosphere levels
or ocean depths are present). See also
[collection description](https://ncar-cesm-lens.s3-us-west-2.amazonaws.com/catalogs/aws-cesm1-le.json)
and
[catalog file](https://ncar-cesm-lens.s3-us-west-2.amazonaws.com/catalogs/aws-cesm1-le.csv)
used by [Intake-esm](https://intake-esm.readthedocs.io/).


```python
import pandas as pd
from ipyaggrid import Grid
from IPython.display import HTML

# Load the catalog of available Zarr stores into a DataFrame.
df = pd.read_csv(
    "https://ncar-cesm-lens.s3-us-west-2.amazonaws.com/catalogs/aws-cesm1-le.csv"
)

# One grid column per catalog field; row grouping is disabled throughout.
column_defs = [
    {"headerName": "variable", "field": "variable", "rowGroup": False},
    {"headerName": "long_name", "field": "long_name", "rowGroup": False},
    {"headerName": "component", "field": "component", "rowGroup": False},
    {"headerName": "experiment", "field": "experiment", "rowGroup": False},
    {"headerName": "frequency", "field": "frequency", "rowGroup": False},
    {
        "headerName": "vertical_levels",
        "field": "vertical_levels",
        "rowGroup": False,
    },
    {
        "headerName": "spatial_domain",
        "field": "spatial_domain",
        "rowGroup": False,
    },
    {"headerName": "units", "field": "units", "rowGroup": False},
    {"headerName": "start_time", "field": "start_time", "rowGroup": False},
    {"headerName": "end_time", "field": "end_time", "rowGroup": False},
    {"headerName": "path", "field": "path", "rowGroup": False},
]

# Make the grid sortable, filterable, and resizable.
grid_options = {
    "columnDefs": column_defs,
    "enableSorting": True,
    "enableFilter": True,
    "enableColResize": True,
    "enableRangeSelection": True,
}

# Highlight the hovered row and column.
css_rules = """
.ag-row-hover {
    background-color: lightblue !important;
}

.ag-column-hover {
    background-color: powderblue;
}

.ag-row-hover .ag-column-hover {
    background-color: deepskyblue !important;
}
"""

g = Grid(
    grid_data=df,
    grid_options=grid_options,
    quick_filter=True,
    show_toggle_edit=False,
    export_csv=False,
    export_excel=False,
    theme="ag-theme-blue",
    show_toggle_delete=False,
    columns_fit="auto",
    index=False,
    keep_multiindex=False,
    css_rules=css_rules,
)

# Export the grid as standalone HTML and render it in the notebook.
html = g.export_html(build=True)
HTML(html)
```
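
The catalog can also be searched programmatically rather than through the interactive grid. The sketch below uses only the standard library and a few made-up rows that mimic the catalog's `variable`, `component`, `experiment`, `frequency`, and `path` columns; the rows and the helper `find_stores` are illustrative assumptions, not actual catalog contents.

```python
import csv
import io

# Illustrative rows only: they follow the naming convention but are not
# copied from the real catalog file.
SAMPLE_CATALOG = """\
variable,component,experiment,frequency,path
TREFHT,atm,20C,monthly,s3://ncar-cesm-lens/atm/monthly/cesmLE-20C-TREFHT.zarr
TREFHT,atm,RCP85,monthly,s3://ncar-cesm-lens/atm/monthly/cesmLE-RCP85-TREFHT.zarr
SST,ocn,20C,monthly,s3://ncar-cesm-lens/ocn/monthly/cesmLE-20C-SST.zarr
"""


def find_stores(catalog_text, **criteria):
    """Return the paths of rows whose columns match every criterion."""
    rows = csv.DictReader(io.StringIO(catalog_text))
    return [
        row["path"]
        for row in rows
        if all(row.get(column) == value for column, value in criteria.items())
    ]


paths = find_stores(SAMPLE_CATALOG, variable="TREFHT", component="atm")
for path in paths:
    print(path)
```

With the catalog already loaded in a pandas DataFrame, the same selection could be written with boolean indexing on the corresponding columns.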

## Notebook Examples

A Jupyter Notebook illustrating how to read the LENS data on AWS, and
reproducing Figures 2 and 4 from Kay et al. (2015), has been developed. This
Notebook and other resources on GitHub will be gradually improved and augmented.

- [Rendered (static) version of the Notebook](http://gallery.pangeo.io/repos/NCAR/cesm-lens-aws/notebooks/kay-et-al-2015.v3.html)
- [Reusable Notebook on GitHub](https://github.com/NCAR/cesm-lens-aws)


## Data Citation

Data are freely available and reusable under the terms of the CC-BY-4.0 license.
See [Terms of Use](https://www.ucar.edu/terms-of-use/data). If you use these
data, we request that you provide attribution in any derived products. The
original, complete LENS dataset and the AWS-hosted subset have different DOIs
(Digital Object Identifiers) to reflect their differing scope and format, so
please cite whichever version of the dataset you used, as well as the Kay et al.
(2015) paper:

- AWS-hosted subset:
  [doi:10.26024/wt24-5j82](https://doi.org/10.26024/wt24-5j82) de La
  Beaujardière, J., Banihirwe, A., Shih, C.-F., Paul, K., and Hamman, J.,
  (2019), "NCAR CESM LENS Cloud-Optimized Subset," UCAR/NCAR Computational and
  Information Systems Lab.
- Original dataset: [doi:10.5065/d6j101d1](https://doi.org/10.5065/d6j101d1)
  Kay, J. and Deser, C. (2016). "The Community Earth System Model (CESM) Large
  Ensemble Project" UCAR/NCAR Climate Data Gateway.
- Kay et al. (2015) paper:
  [doi:10.1175/BAMS-D-13-00255.1](https://doi.org/10.1175/BAMS-D-13-00255.1)
  Kay, J. E., Deser, C., Phillips, A., Mai, A., Hannay, C., Strand, G.,
  Arblaster, J., Bates, S., Danabasoglu, G., Edwards, J., Holland, M.,
  Kushner, P., Lamarque, J.-F., Lawrence, D., Lindsay, K., Middleton, A.,
  Munoz, E., Neale, R., Oleson, K., Polvani, L., and M. Vertenstein (2015),
  "The Community Earth System Model (CESM) Large Ensemble Project: A Community
  Resource for Studying Climate Change in the Presence of Internal Climate
  Variability," Bulletin of the American Meteorological Society, 96, 1333-1349.


## Contact

If you have questions or want to submit a data request, please reach out to us
on [our GitHub Discussions](https://github.com/NCAR/cesm-lens-aws/discussions)
page or via email: `rdahelp` at `ucar` dot `edu`.
