<span style='color:#009999'> <span style='font-family:serif'> <font size="15"> **CMIP6** - Coupled Model Intercomparison Project Phase 6

<span style='color:#ff6666'><font size="5">**Additional Requirements**

- <font size="3"><span style='color:Black'> None.


 <span style='color:#ff6666'><font size="5"> **Objectives**
- <font size="3"><span style='color:Black'> To demonstrate remote access via tokens to ESGF Portal.
- <font size="3"><span style='color:Black'> To access and subset remote data implementing the DAP2 Protocol.
- <font size="3"><span style='color:Black'> To understand the subtle differences between DAP2 and DAP4.
- <font size="3"><span style='color:Black'> To identify when an OPeNDAP server only implements DAP2.



<span style='color:#ff6666'><font size="5"> **Browsing Data:**

The <font size="3.5"><span style='color:#0066cc'>**Earth System Grid Federation** <font size="3.5"><span style='color:black'> [ESGF](https://aims2.llnl.gov/search/cmip6/) contains a broad range of model output (e.g., CMIP3, CMIP5, [CMIP6](https://pcmdi.llnl.gov/CMIP6/), E3SM) from which you can obtain OPeNDAP URLs for data variables. To access the ESGF Node and browse data, [click here](https://aims2.llnl.gov/search/cmip6/).

<img src="img/ESGF.png" alt="drawing" width="750"/>    



In [1]:
import matplotlib.pyplot as plt
import numpy as np
from pydap.client import open_url
import cartopy.crs as ccrs

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **CMIP6 Access via OPeNDAP**

<font size="3.5">You can also directly inspect a THREDDS catalog for [CMIP6](https://crd-esgf-drc.ec.gc.ca/thredds/catalog/esgB_dataroot/AR6/CMIP6/catalog.html). For example, you can navigate to `CDRMIP/CCCma/CanESM5/esm-pi-cdr-pulse/r2i1p2f1/Eday/ts/gn/v20190429` and access [ts data](https://crd-esgf-drc.ec.gc.ca/thredds/dodsC/esgB_dataroot/AR6/CMIP6/CDRMIP/CCCma/CanESM5/esm-pi-cdr-pulse/r2i1p2f1/Eday/ts/gn/v20190429/ts_Eday_CanESM5_esm-pi-cdr-pulse_r2i1p2f1_gn_54510101-56501231.nc.html) via the OPeNDAP DAP2 protocol.



In [2]:
url = "https://crd-esgf-drc.ec.gc.ca/thredds/dodsC/esgB_dataroot/AR6/CMIP6/CDRMIP/CCCma/CanESM5/esm-pi-cdr-pulse/r2i1p2f1/Eday/ts/gn/v20190429/ts_Eday_CanESM5_esm-pi-cdr-pulse_r2i1p2f1_gn_54510101-56501231.nc"
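A DAP2 server exposes several responses for the same dataset, each reached by appending a suffix to the data URL. The sketch below lists the common ones; `dap2_response_urls` and `example_url` are hypothetical names introduced only for illustration, and the URL is a made-up placeholder rather than a real dataset. DAP4 servers expose `.dmr` and `.dap` responses instead, so a server that answers `.dds` but not `.dmr` is likely serving DAP2 only.

```python
# Common DAP2 response suffixes appended to a dataset URL.
DAP2_SUFFIXES = {
    "form": ".html",  # interactive data request form
    "dds": ".dds",    # Dataset Descriptor Structure (names, types, shapes)
    "das": ".das",    # Dataset Attribute Structure (metadata attributes)
    "data": ".dods",  # binary data response
}


def dap2_response_urls(data_url):
    """Return the DAP2 response URLs for a dataset URL (hypothetical helper)."""
    return {name: data_url + suffix for name, suffix in DAP2_SUFFIXES.items()}


# Made-up URL used only for illustration; substitute the notebook's `url`.
example_url = "https://example.org/thredds/dodsC/path/to/file.nc"
urls = dap2_response_urls(example_url)
for name, response_url in urls.items():
    print(f"{name:>4}: {response_url}")
```

Opening the `dds` or `das` URL in a browser is a quick way to inspect a dataset's structure and attributes before requesting any data.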


<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Create dataset access via pydap**

-  <font size="3.5"> By default, `protocol='dap2'`; however, this behavior may change in the near future.


In [3]:
%%time
ds = open_url(url, protocol='dap2')

CPU times: user 77.2 ms, sys: 10.3 ms, total: 87.5 ms
Wall time: 1.19 s


In [4]:
ds.tree()

.esgB_dataroot/AR6/CMIP6/CDRMIP/CCCma/CanESM5/esm-pi-cdr-pulse/r2i1p2f1/Eday/ts/gn/v20190429/ts_Eday_CanESM5_esm-pi-cdr-pulse_r2i1p2f1_gn_54510101-56501231.nc
├──time
├──time_bnds
├──lat
├──lat_bnds
├──lon
├──lon_bnds
└──ts
   ├──ts
   ├──time
   ├──lat
   └──lon


In [5]:
print('Dataset memory use [GBs, uncompressed]: ', ds.nbytes/1e9)

Dataset memory use [GBs, uncompressed]:  2.394406144


<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Inspect single variable**



In [6]:
ts = ds['ts']

In [7]:
ts

<GridType with array 'ts' and maps 'time', 'lat', 'lon'>

<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Grid Arrays**

-  <font size="3.5"> Grid arrays are no longer implemented in DAP4. A Grid carries the variable of interest together with copies of its dimensions/coverage (called `maps`).
-  <font size="3.5"> Downloading the Grid `ts` into memory therefore also downloads its maps `time`, `lat`, and `lon`.




In [8]:
ts.attributes

{'standard_name': 'surface_temperature',
 'long_name': 'Surface Temperature',
 'comment': 'Temperature of the lower boundary of the atmosphere',
 'units': 'K',
 'original_name': 'GT',
 'cell_methods': 'area: time: mean',
 'cell_measures': 'area: areacella',
 'history': '2019-08-20T21:03:55Z altered by CMOR: Reordered dimensions, original order: lat lon time. 2019-08-20T21:03:55Z altered by CMOR: replaced missing value flag (1e+38) and corresponding data with standard missing value (1e+20).',
 'missing_value': 1e+20,
 '_FillValue': 1e+20,
 '_ChunkSizes': [1, 64, 128]}

In [9]:
ts.tree()

.ts
├──ts
├──time
├──lat
└──lon


<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **Exercise**

<font size="3.5"><span style='color:black'> Make a surface map of a variable (say `ts` in the example), for `time=0`. You can do that in two ways:


- <font size="3.5"><span style='color:black'> Download the Array `ts` into memory from the original URL via pydap (`GridType` array)
- <font size="3.5"><span style='color:black'> Append a Constraint Expression (CE) to the original `dataURL` to download only the data you want. You can do this interactively in the DAP response form of the dataset: paste the original URL `<dataURL>+'.html'` into a browser to view the DAP response form, and select a single time index value there.
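The second approach can also be scripted. The following is a minimal sketch of how a DAP2 constraint expression is assembled, assuming the usual hyperslab syntax `[start:stride:stop]` with an inclusive `stop`. The helpers `dap2_hyperslab` and `build_dap2_url` are hypothetical, the URL is a made-up placeholder, and the index bounds assume 64 latitudes and 128 longitudes, as this dataset reports.

```python
def dap2_hyperslab(start, stop, stride=1):
    """Format one dimension as a DAP2 hyperslab; `stop` is inclusive."""
    return f"[{start}:{stride}:{stop}]"


def build_dap2_url(data_url, variable, bounds):
    """Append a DAP2 constraint expression to `data_url` (hypothetical helper).

    bounds: one (start, stop) index pair per dimension, with `stop` inclusive.
    """
    ce = variable + "".join(dap2_hyperslab(start, stop) for start, stop in bounds)
    return f"{data_url}?{ce}"


# Made-up URL used only for illustration; substitute the notebook's `url`.
example_url = "https://example.org/thredds/dodsC/path/to/file.nc"

# First time index, all 64 latitudes, all 128 longitudes.
subset_url = build_dap2_url(example_url, "ts", [(0, 0), (0, 63), (0, 127)])
print(subset_url)
```

Because `stop` is inclusive in DAP2, the last valid index along a dimension of length `n` is `n - 1`, unlike Python slices.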



<font size="3.5"><span style='color:black'> **NOTE**: When making a plot, check for missing values, scale factors, units.
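As an illustration of that check, here is a minimal sketch of masking fill values and converting units with NumPy before plotting. The `sample` array is made-up data, not values from the server; the fill value mirrors the `_FillValue` attribute of `ts` (`1e+20`), and the Kelvin to Celsius conversion assumes `units` is `'K'` as reported in the attributes.

```python
import numpy as np

fill_value = 1e20  # mirrors the `_FillValue` attribute of `ts`

# Made-up 2x3 sample in Kelvin; one cell carries the fill value.
sample = np.array(
    [[248.66, 251.66, fill_value],
     [224.59, 219.78, 220.43]],
    dtype="f4",
)

# Mask cells that are (approximately) equal to the fill value.
masked = np.ma.masked_values(sample, fill_value)

# Convert Kelvin to degrees Celsius; masked cells stay masked.
celsius = masked - 273.15

print("valid cells:", celsius.count())
print("min/max [degC]:", float(celsius.min()), float(celsius.max()))
```

Without the mask, a single `1e+20` cell would dominate the color scale of the map.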





<span style='font-family:serif'> <font size="5.5"><span style='color:#0066cc'> **pydap approach:**

- <font size="3.5"> **NOTE**: Some data providers limit how much data can be downloaded at once. This upper limit can be configured on any OPeNDAP server.


In [19]:
%%time
# Attempting to download the entire GridType triggers an error on the server side.
GTS = ds['ts'][:]
GTS

In [11]:
ds['ts'].shape

(73000, 64, 128)
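A quick back-of-the-envelope calculation shows why requesting the whole variable is impractical while a single snapshot is cheap. This sketch assumes 4 bytes per value (32-bit floats, consistent with the `'>f4'` dtype that appears in the array output of this notebook).

```python
import math

shape = (73000, 64, 128)  # ds['ts'].shape, as reported above
itemsize = 4              # bytes per value, assuming 32-bit floats

# Size of the full (time, lat, lon) array versus one time step.
full_bytes = math.prod(shape) * itemsize
snapshot_bytes = math.prod(shape[1:]) * itemsize

print(f"full array:      {full_bytes / 1e9:.3f} GB")
print(f"single snapshot: {snapshot_bytes / 1e3:.1f} kB")
```

The full array accounts for nearly all of the dataset size reported by `ds.nbytes`, whereas one time step is only a few tens of kilobytes.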

In [12]:
%%time
# download the entire GridType, single snapshot
GTS = ds['ts'][0, :, :]
GTS

CPU times: user 166 ms, sys: 21.4 ms, total: 188 ms
Wall time: 2min 35s


<GridType with array 'ts' and maps 'time', 'lat', 'lon'>

In [20]:
%%time
# download only the Array, single snapshot
TS = ds['ts']['ts'][0, :, :]
TS

CPU times: user 31.3 ms, sys: 5.61 ms, total: 36.9 ms
Wall time: 872 ms


<BaseType with data array([[[248.65997, 248.28497, 248.15997, ..., 249.53497, 249.65997,
         249.15997],
        [251.65997, 250.78497, 250.53497, ..., 252.65997, 252.40997,
         251.65997],
        [251.03497, 250.15997, 249.03497, ..., 255.03497, 253.53497,
         253.03497],
        ...,
        [224.5918 , 224.67487, 225.22243, ..., 226.25885, 225.4502 ,
         224.54295],
        [219.77655, 220.37973, 221.11984, ..., 218.04332, 218.55197,
         219.16682],
        [220.43169, 220.94601, 221.48468, ..., 220.07048, 219.97438,
         220.07361]]], dtype='>f4')>