# Earth System Grid Federation Data Access


Earth System Grid Federation (ESGF) data is typically stored on THREDDS servers. A client named `pyesgf` exists to interact with ESGF THREDDS servers. The following shows examples of typical queries for data. 

If a login username and credentials are required, follow these [instructions](https://esgf-pyclient.readthedocs.io/en/latest/notebooks/examples/logon.html).

In [1]:
from pyesgf.search import SearchConnection

# Create a connection for distributed search on ESGF nodes.
conn = SearchConnection('https://esgf.ceda.ac.uk/esg-search',
                         distrib=True)

# Launch a search query. 
# Here we're looking for any variable related to humidity within the CMIP6 SSP2-4.5 experiment.
# Results will be stored in a dictionary with keys defined by the `facets` argument.
ctx = conn.new_context(project='CMIP6', 
                       experiment_id="ssp245", 
                       query='humidity', 
                       facets='variable_id,source_id')

print("Number of results: ", ctx.hit_count)
print("Variables related to humidity: ")
ctx.facet_counts['variable_id']

Number of results:  5571
Variables related to humidity: 


{'tnhusscpbl': 71,
 'tnhusscp': 30,
 'tnhuspbl': 30,
 'tnhusmp': 52,
 'tnhusd': 15,
 'tnhusc': 75,
 'tnhusa': 23,
 'tnhus': 43,
 'hussLut': 26,
 'huss': 1106,
 'hus850': 79,
 'hus': 1567,
 'hursmin': 217,
 'hursmax': 204,
 'hurs': 1269,
 'hur': 764}

In [2]:
# Now let's look for simulations that have the `hurs` variable and pick the first member.
ctx.constrain(variable_id='hurs', ensemble='r1i1p1f1')
ctx.facet_counts["source_id"]

{'UKESM1-0-LL': 202,
 'TaiESM1': 38,
 'NorESM2-MM': 80,
 'NorESM2-LM': 283,
 'NESM3': 16,
 'MRI-ESM2-0': 86,
 'MPI-ESM1-2-LR': 895,
 'MPI-ESM1-2-HR': 86,
 'MIROC6': 125,
 'MIROC-ES2L': 82,
 'MCM-UA-1-0': 10,
 'KIOST-ESM': 12,
 'KACE-1-0-G': 22,
 'IPSL-CM6A-LR': 347,
 'INM-CM5-0': 35,
 'INM-CM4-8': 34,
 'IITM-ESM': 33,
 'HadGEM3-GC31-LL': 151,
 'GISS-E2-1-G': 39,
 'GFDL-ESM4': 45,
 'GFDL-CM4': 58,
 'FIO-ESM-2-0': 27,
 'FGOALS-g3': 25,
 'FGOALS-f3-L': 12,
 'EC-Earth3-Veg-LR': 48,
 'EC-Earth3-Veg': 166,
 'EC-Earth3-CC': 75,
 'EC-Earth3': 964,
 'E3SM-1-1': 2,
 'CanESM5-CanOE': 13,
 'CanESM5': 418,
 'CNRM-ESM2-1': 190,
 'CNRM-CM6-1-HR': 32,
 'CNRM-CM6-1': 190,
 'CMCC-ESM2': 28,
 'CMCC-CM2-SR5': 35,
 'CIESM': 9,
 'CESM2-WACCM': 43,
 'CESM2': 20,
 'CAS-ESM2-0': 3,
 'CAMS-CSM1-0': 14,
 'BCC-CSM2-MR': 20,
 'AWI-CM-1-1-MR': 30,
 'ACCESS-ESM1-5': 440,
 'ACCESS-CM2': 88}

In [3]:
# We can now search for datasets corresponding within our search context
results = ctx.search()
r = results[0]
r.dataset_id

'CMIP6.ScenarioMIP.IPSL.IPSL-CM6A-LR.ssp245.r1i1p1f1.Amon.huss.gr.v20190119|vesg.ipsl.upmc.fr'

In [4]:
# To get file download links, there's an extra step
file_ctx = r.file_context()
file_ctx.facets = "*"
files = file_ctx.search()
[f.download_url for f in files]

['http://vesg.ipsl.upmc.fr/thredds/fileServer/cmip6/ScenarioMIP/IPSL/IPSL-CM6A-LR/ssp245/r1i1p1f1/Amon/huss/gr/v20190119/huss_Amon_IPSL-CM6A-LR_ssp245_r1i1p1f1_gr_201501-210012.nc']

In [5]:
# The same applies to get an OPENDAP link
agg_ctx = r.aggregation_context()
agg_ctx.facets = "*"
agg = agg_ctx.search()[0]
print(agg.opendap_url)

http://vesg.ipsl.upmc.fr/thredds/dodsC/CMIP6.ScenarioMIP.IPSL.IPSL-CM6A-LR.ssp245.r1i1p1f1.Amon.huss.gr.huss.20190119.aggregation.1


In [6]:
# Open the opendap link with xarray
import xarray as xr
ds = xr.open_dataset(agg.opendap_url)
ds