# Reproducing Key Figures from the Kay et al. (2015) Paper


## Introduction

This Jupyter Notebook demonstrates how one might use the NCAR Community Earth System Model (CESM)
Large Ensemble (LENS) data hosted on AWS S3 ([doi:10.26024/wt24-5j82](https://doi.org/10.26024/wt24-5j82)). The notebook shows how to reproduce Figures 2 and 4 from the Kay et al. (2015) paper describing the CESM LENS dataset ([doi:10.1175/BAMS-D-13-00255.1](https://doi.org/10.1175/BAMS-D-13-00255.1)).

This resource is intended for readers who are unfamiliar with elements of the [Pangeo](https://pangeo.io) framework, including Jupyter Notebooks, [Xarray](http://xarray.pydata.org/), and the [Zarr](https://zarr.readthedocs.io/) data format, or with the original paper, so it includes additional explanation.


## Set up environment

In [1]:
# Display output of plots directly in Notebook
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")

import intake
import numpy as np
import pandas as pd
import xarray as xr
import hvplot.xarray

## Create and Connect to Dask Distributed Cluster

In [2]:
# Create cluster
from dask_gateway import Gateway
from dask.distributed import Client
gateway = Gateway()
cluster = gateway.new_cluster()
cluster.adapt(minimum=2, maximum=100)
# Connect to cluster
client = Client(cluster)
# Display cluster dashboard URL
cluster


☝️ Link to scheduler dashboard will appear above.
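
The cell above assumes a Dask Gateway deployment is reachable. Where none is available, a small in-process cluster can stand in for it. The following is a minimal sketch under that assumption; the names `local_cluster` and `local_client` and the worker counts are illustrative and not part of the original notebook.

```python
from dask.distributed import Client, LocalCluster

# Assumption: no Dask Gateway is available, so start a small local cluster.
# processes=False keeps the worker as threads inside the current Python process.
local_cluster = LocalCluster(
    n_workers=1,
    threads_per_worker=2,
    processes=False,
    dashboard_address=None,  # do not start the diagnostics dashboard
)
local_client = Client(local_cluster)

# Sanity check: run a trivial task on the cluster and fetch the result.
total = local_client.submit(sum, [1, 2, 3]).result()
print(total)

# Release the resources when finished.
local_client.close()
local_cluster.close()
```

A local cluster like this is only suitable for small subsets of the data; the adaptive Gateway cluster remains the better fit for full-size computations.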

## Load data into xarray from a catalog using intake


In [3]:
# Open the YAML catalog describing the available datasets
# (intake was already imported in the setup cell)
cat_url = 'https://raw.githubusercontent.com/hydrologie/datasets/main/intake-catalogs/atmosphere.yml'
cat = intake.open_catalog(cat_url)

| | unique |
|---|---|
| component | 5 |
| frequency | 6 |
| experiment | 4 |
| variable | 65 |
| path | 394 |
| variable_long_name | 62 |
| dim_per_tstep | 3 |
| start | 13 |
| end | 12 |


In [None]:
cat.gui

In [None]:
ds = cat.era5_hourly_reanalysis_single_levels_sa.to_dask()
ds

In [None]:
ds.t2m

In [None]:
(ds.t2m - 273.15).sel(time="1980-01-01 00:00").hvplot(tiles='ESRI', geo=True)
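
The cell above converts 2 m temperature from kelvin to degrees Celsius and selects one time step by label. The same two operations can be tried without network access on a small synthetic array. This is a sketch only: `t2m_kelvin` is a hypothetical stand-in for `ds.t2m`, and its grid and values are invented for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for ds.t2m: three hourly steps on a 2 x 2 grid, in kelvin.
time = pd.to_datetime(
    ["1980-01-01 00:00", "1980-01-01 01:00", "1980-01-01 02:00"]
)
t2m_kelvin = xr.DataArray(
    np.full((3, 2, 2), 273.15) + np.arange(3).reshape(3, 1, 1),
    dims=("time", "latitude", "longitude"),
    coords={
        "time": time,
        "latitude": [-10.0, -20.0],
        "longitude": [-60.0, -50.0],
    },
    attrs={"units": "K"},
    name="t2m",
)

# Convert to degrees Celsius and set the units attribute to match.
t2m_celsius = t2m_kelvin - 273.15
t2m_celsius.attrs["units"] = "degC"

# Select a single time step by its label, as in the cell above.
snapshot = t2m_celsius.sel(time="1980-01-01 01:00")
print(snapshot.values)
```

Setting `units` explicitly after the subtraction keeps the metadata consistent with the converted values, which matters when plot labels are derived from attributes.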

In [None]:
%%time

# Average over both spatial dimensions and plot the resulting time series in °C
(
    (ds.t2m - 273.15)
    .mean(['latitude', 'longitude'])
    .hvplot(grid=True)
)
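
The spatial mean above treats every grid cell equally. On a regular latitude-longitude grid, cells shrink toward the poles, so an area-aware average is often preferred. The following is a minimal sketch of cosine-latitude weighting with xarray's `weighted` API, assuming a regular grid; the `field` array and its values are hypothetical and chosen so the difference is easy to see.

```python
import numpy as np
import xarray as xr

# Hypothetical field that varies only with latitude: 30 at the equator, 0 at 60N.
field = xr.DataArray(
    np.array([[30.0] * 4, [0.0] * 4]),
    dims=("latitude", "longitude"),
    coords={
        "latitude": [0.0, 60.0],
        "longitude": [0.0, 90.0, 180.0, 270.0],
    },
)

# On a regular lat/lon grid, cell area is proportional to cos(latitude).
weights = np.cos(np.deg2rad(field.latitude))
weights.name = "weights"

# Compare the plain mean with the area-weighted mean.
unweighted = float(field.mean(["latitude", "longitude"]))
weighted = float(field.weighted(weights).mean(["latitude", "longitude"]))
print(unweighted, weighted)
```

Because the 60N row receives half the weight of the equatorial row, the weighted mean is pulled toward the equatorial value. For the real dataset, the same pattern would apply with `ds.t2m` in place of `field`, assuming its latitude coordinate is in degrees.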