# Understand the Uncertainty in a CMIP6 Dataset

In this notebook we demonstrate how to estimate model uncertainty by comparing the temperature trends of the 6 ensemble members of the ssp126 experiment of the CNRM-CM6-1 model in the CMIP6 archive. The notebook shows how to:

* access data that include multiple ensemble members  
* make plots with multiple lines

This example uses data from the Coupled Model Intercomparison Project Phase 6 (CMIP6) collections. For more information, please see the [terms of use](https://pcmdi.llnl.gov/CMIP6/TermsOfUse/TermsOfUse6-1.html).

---

Inspired by the notebook in https://github.com/NCI-data-analysis-platform/climate-cmip.git
- Original Authors: NCI Virtual Research Environment Team
- Keywords: CMIP, xarray, uncertainty
- Create Date: 2020-Apr
---
Adapted to DKRZ env: S.Kindermann August 2022

### Load libraries

In [None]:
import xarray as xr
import dask
%matplotlib inline

### Use xarray to open ensemble data files

In [None]:
Dir='/pool/data/CMIP6/data/ScenarioMIP/CNRM-CERFACS/CNRM-CM6-1'
Files=[Dir+'/ssp126/r1i1p1f2/Amon/tas/gr/v20190219/tas_Amon_CNRM-CM6-1_ssp126_r1i1p1f2_gr_201501-210012.nc',
      Dir+'/ssp126/r2i1p1f2/Amon/tas/gr/v20190410/tas_Amon_CNRM-CM6-1_ssp126_r2i1p1f2_gr_201501-210012.nc',
      Dir+'/ssp126/r3i1p1f2/Amon/tas/gr/v20190410/tas_Amon_CNRM-CM6-1_ssp126_r3i1p1f2_gr_201501-210012.nc',
      Dir+'/ssp126/r4i1p1f2/Amon/tas/gr/v20190410/tas_Amon_CNRM-CM6-1_ssp126_r4i1p1f2_gr_201501-210012.nc',
      Dir+'/ssp126/r5i1p1f2/Amon/tas/gr/v20190410/tas_Amon_CNRM-CM6-1_ssp126_r5i1p1f2_gr_201501-210012.nc',
      Dir+'/ssp126/r6i1p1f2/Amon/tas/gr/v20190410/tas_Amon_CNRM-CM6-1_ssp126_r6i1p1f2_gr_201501-210012.nc']

ds1=xr.open_dataset(Files[0])
ds2=xr.open_dataset(Files[1])
ds3=xr.open_dataset(Files[2])
ds4=xr.open_dataset(Files[3])
ds5=xr.open_dataset(Files[4])
ds6=xr.open_dataset(Files[5])
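The six paths above differ only in the realization index and the version directory, so the list can also be generated. The following is a minimal sketch; `member_path`, `BASE_DIR` and `VERSIONS` are hypothetical names introduced for illustration, and the version strings are copied from the list above.

```python
# Hypothetical helper: build the file path of one ensemble member.
BASE_DIR = "/pool/data/CMIP6/data/ScenarioMIP/CNRM-CERFACS/CNRM-CM6-1"

# Version directory per realization, taken from the explicit list above.
VERSIONS = {
    1: "v20190219",
    2: "v20190410",
    3: "v20190410",
    4: "v20190410",
    5: "v20190410",
    6: "v20190410",
}


def member_path(realization, version, base_dir=BASE_DIR):
    """Return the path of the monthly tas file for one ssp126 member."""
    member = f"r{realization}i1p1f2"
    filename = f"tas_Amon_CNRM-CM6-1_ssp126_{member}_gr_201501-210012.nc"
    return f"{base_dir}/ssp126/{member}/Amon/tas/gr/{version}/{filename}"


files = [member_path(r, v) for r, v in VERSIONS.items()]
```

With such a list, the datasets could be opened in a loop, for example `[xr.open_dataset(f) for f in files]`, instead of one assignment per member.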

In [None]:
ds1.tas

### Concatenate ensemble files into one dataset

In [None]:
ds_new=xr.concat([ds1.tas, ds2.tas, ds3.tas, ds4.tas, ds5.tas, ds6.tas], 'new_dim')

Instead of reading each file individually and concatenating the results, you can read them all into one dataset with xarray's `open_mfdataset` function. The procedure above serves to demonstrate the `concat` function in xarray.

In [None]:
ds_all=xr.open_mfdataset(Dir+'/ssp126/r*i1p1f2/Amon/tas/gr/*/tas_Amon_CNRM-CM6-1_ssp126_r*i1p1f2_gr_201501-210012.nc', concat_dim='member_id', combine='nested')
ds_all
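To see what concatenating along a new dimension does without touching the archive, here is a small sketch on synthetic data. The values and the two member labels are made up for illustration; passing a `pandas.Index` as `dim` is one way to give the new dimension both a name and coordinate labels.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two synthetic "members" sharing the same monthly time axis (values are invented).
time = pd.date_range("2015-01-01", periods=3, freq="MS")
member_a = xr.DataArray(
    np.array([280.0, 281.0, 282.0]),
    dims="time",
    coords={"time": time},
    name="tas",
)
member_b = member_a + 0.5

# Stack the members along a new, labelled dimension called member_id.
labels = pd.Index(["r1i1p1f2", "r2i1p1f2"], name="member_id")
ensemble = xr.concat([member_a, member_b], dim=labels)
```

The result has dimensions `(member_id, time)`, so a single member can be selected by label with `ensemble.sel(member_id="r2i1p1f2")`.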

### Data analysis and plotting

Model simulations carry uncertainty, which is why we need multiple models and multiple ensemble members. To compare the members, we first reduce each one to a time series of annual means averaged over all grid points.

In [None]:
ds_yr=ds_all.mean(dim=('lat','lon')).resample(time='Y').mean(dim='time') #unweighted mean over all grid points, then annual averages
ds_yr
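The spatial average above gives every grid point equal weight, although cells on a regular latitude-longitude grid shrink towards the poles. A common approximation is to weight by the cosine of latitude. The sketch below illustrates this on a synthetic field; it assumes a regular 1-degree grid and invented temperatures, and on the real data the model's cell-area variable would be the more accurate choice of weights.

```python
import numpy as np
import xarray as xr

# Synthetic regular 1-degree grid (an assumption, not the CNRM-CM6-1 grid).
lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(0.5, 360.0, 1.0)

# Invented temperature field in kelvin: warm at the equator, cold at the poles.
tas_by_lat = 300.0 - 50.0 * np.sin(np.deg2rad(lat)) ** 2
tas = xr.DataArray(
    np.tile(tas_by_lat[:, None], (1, lon.size)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
    name="tas",
)

# cos(latitude) is proportional to the cell area on a regular lat/lon grid.
weights = np.cos(np.deg2rad(tas.lat))
weights.name = "weights"

weighted_mean = tas.weighted(weights).mean(dim=("lat", "lon"))
unweighted_mean = tas.mean(dim=("lat", "lon"))
```

For this synthetic field the unweighted mean is colder than the weighted one, because the cold polar rows are over-represented when all grid points count equally.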

### Add ensemble mean to dataset as member_id: mean

In [None]:
ds_yr_ens_mean=ds_yr.mean(dim='member_id')
ds_yr_addMean=xr.concat([ds_yr, ds_yr_ens_mean],'member_id')
ds_yr_addMean=ds_yr_addMean.assign_coords({"member_id": [1,2,3,4,5,6,'mean'] }) #change coordinates of member_id
ds_yr_addMean
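The same step can be sketched on synthetic data with string labels throughout, which avoids mixing integers and text in one coordinate. The member names and values below are invented for illustration.

```python
import numpy as np
import xarray as xr

# Three synthetic members with two annual values each (invented numbers).
tas = xr.DataArray(
    np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]),
    dims=("member_id", "time"),
    coords={"member_id": ["r1", "r2", "r3"], "time": [2015, 2016]},
    name="tas",
)

# Ensemble mean, given its own label along member_id before appending it.
ens_mean = tas.mean(dim="member_id").expand_dims(member_id=["mean"])
tas_with_mean = xr.concat([tas, ens_mean], dim="member_id")
```

Selecting `tas_with_mean.sel(member_id="mean")` then returns the ensemble mean like any other member.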

In [None]:
tas_yr_addMean=ds_yr_addMean.tas #select the temperature variable by name
tas_yr_addMean.plot.line(x='time', hue='member_id')

### Now we measure the root-mean-square distance of individual ensemble members from the ensemble mean

In [None]:
import numpy as np
#root-mean-square distance over time of each member from the ensemble mean
dis=np.sqrt((np.square(ds_yr-ds_yr.mean(dim='member_id'))).mean(dim='time'))
dis.tas.plot()
#dis.tas.values

Now we can see that the uncertainty, measured as the distance of each member from the ensemble mean, is around 0.15 degrees Celsius.
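The distance used above is the root-mean-square deviation over time of each member from the ensemble mean. To check that the calculation behaves as expected, the sketch below applies it to synthetic members that sit at fixed offsets of -1, 0 and +1 from a common series, so the expected distances are known in advance. All values are invented for illustration.

```python
import numpy as np
import xarray as xr

# Common synthetic series plus a constant offset per member (invented numbers).
base = xr.DataArray(
    [288.0, 288.5, 289.0, 289.5],
    dims="time",
    coords={"time": [2015, 2016, 2017, 2018]},
)
offsets = xr.DataArray([-1.0, 0.0, 1.0], dims="member_id")
tas = base + offsets  # broadcasts to dimensions (time, member_id)

# Deviation of each member from the ensemble mean, then RMS over time.
deviation = tas - tas.mean(dim="member_id")
rms_distance = np.sqrt((deviation ** 2).mean(dim="time"))
```

Because the offsets are symmetric, the ensemble mean equals the common series and the distances recover the absolute offsets of the three members.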

### Summary

This example shows how to concatenate multiple ensemble files and plot them together to get a sense of model uncertainty. The individual ensemble members give different results for the future temperature projection under scenario ssp126.