# Use CDO to Compare Model and Observation Data


In this notebook we demonstrate how to compare model and observation data:

- Load CDO module
- Look at the data information
- Select a range of years and convert units
- Data remapping
- Compare model and observation data

This example uses the Coupled Model Intercomparison Project (CMIP6) collections. For more information, please visit the [data catalogue](https://geonetwork.nci.org.au/geonetwork/srv/eng/catalog.search#/metadata/f6600_2266_8675_3563) and the [terms of use](https://pcmdi.llnl.gov/CMIP6/TermsOfUse/TermsOfUse6-1.html).

---

- Authors: NCI Virtual Research Environment Team
- Keywords: CMIP, CDO, concatenate data, data remapping
- Create Date: 2019-Oct; Update Date: 2020-Apr

### Prerequisites

To run this notebook on Gadi/VDI, the following module is needed:

* CDO

You also need to be a member of the following data project to access the data:
* oi10

You can request to join the project through [NCI's user account management system](https://my.nci.org.au). 

## What is CDO?

CDO stands for "Climate Data Operators". It is a collection of command-line operators for manipulating and analysing climate and NWP model data. It supports the following data formats: GRIB 1/2, netCDF 3/4, SERVICE, EXTRA and IEG. There are more than 600 operators available. See [CDO's homepage](https://code.mpimet.mpg.de/projects/cdo) for more information about this library.

### Load CDO module

```
$ module load cdo
```
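
To confirm that the module loaded correctly, a quick check along the following lines may help. This is a minimal sketch: `cdo_status` is a variable name introduced here for illustration, and the version banner printed depends on the installation.

```shell
# Check that the cdo executable is on PATH after `module load cdo`.
if command -v cdo >/dev/null 2>&1; then
    cdo_status="available"
    cdo -V 2>&1 | head -n 1   # first line of the version banner
else
    cdo_status="missing"
    echo "cdo not found; try 'module load cdo' first"
fi
echo "cdo is ${cdo_status}"
```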


### Check data

Let's look at the near-surface temperature from the 20th-century all-forcing historical simulation of NCAR's CESM2 model:
```
$ ls /g/data/oi10/replicas/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/v20190308/tas_Amon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc
```

### Have a look at the data file using cdo info

**Basic usage:**  
`cdo info <filename> | less`

**less** displays only one page at a time in the terminal. You can move forwards and backwards to see more. Press **q** to quit the view.
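
For a non-interactive look at the same file, something like the sketch below could be used. It assumes the file path shown earlier in this notebook; `model_file` is a variable name introduced here, and `head` stands in for `less` so that the command returns immediately. The guard simply skips the calls when `cdo` or the file is not reachable.

```shell
# Path to the model file used throughout this notebook.
model_file=/g/data/oi10/replicas/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/v20190308/tas_Amon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc

if command -v cdo >/dev/null 2>&1 && [ -r "$model_file" ]; then
    # Short summary: variable names, grid and time axis.
    cdo sinfon "$model_file"
    # Per-field statistics (minimum, mean, maximum); first lines only.
    cdo info "$model_file" | head -n 5
else
    echo "cdo or the data file is not available here; skipping"
fi
```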

### Use **ncview** to show the data

```
$ ncview /g/data/oi10/replicas/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/v20190308/tas_Amon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc
```

![1](images/cdo_anomoly_nino3.png)

### Let's see which years this file includes

We use the operator **showyear** to display all the years in this file.

**Basic usage:**  
`cdo showyear <file.nc>`
```
$ cdo showyear /g/data/oi10/replicas/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/v20190308/tas_Amon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc
```

![2](images/cdo_anomoly_nino2.png)
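
To count the years rather than read them off the screen, the output of `showyear` can be piped through `wc`. The sketch below assumes the file name is accurate (January 1850 to December 2014); `count_words`, `expected_years` and `n_years` are names introduced here for illustration.

```shell
model_file=/g/data/oi10/replicas/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/v20190308/tas_Amon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc

# Count whitespace-separated words; strip the padding some `wc` versions add.
count_words() { wc -w | tr -d ' '; }

# 1850 through 2014 inclusive, as implied by the file name.
expected_years=$(( 2014 - 1850 + 1 ))

if command -v cdo >/dev/null 2>&1 && [ -r "$model_file" ]; then
    # -s silences cdo's own status messages so only the years are counted.
    n_years=$(cdo -s showyear "$model_file" | count_words)
    echo "found ${n_years} years, expected ${expected_years}"
else
    echo "cdo or the data file is not available here; expected ${expected_years} years"
fi
```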

### Select only 10 years of data from the original model file

**Basic usage:**  
`cdo selyear,startyear/endyear <input.nc> <output.nc>`

```
$ cdo selyear,1991/2000 /g/data/oi10/replicas/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/v20190308/tas_Amon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc  tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012.nc 
```
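
A quick sanity check on the subset is to count its time steps. Ten years of monthly data should give 120 steps. The sketch below uses the `ntime` operator and assumes the output file from the command above is in the current directory; the variable names are introduced here for illustration.

```shell
subset_file=tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012.nc

# 10 years (1991 to 2000 inclusive) of monthly means.
expected_steps=$(( (2000 - 1991 + 1) * 12 ))

if command -v cdo >/dev/null 2>&1 && [ -r "$subset_file" ]; then
    actual_steps=$(cdo -s ntime "$subset_file")
    echo "time steps: ${actual_steps} (expected ${expected_steps})"
else
    echo "cdo or ${subset_file} not available; expected ${expected_steps} time steps"
fi
```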

### Show the attributes of the data

```
$ cdo showatts tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012.nc 

```
![](images/cdo_comp1.png)

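The temperature unit is 'K' (Kelvin). We can convert it to degrees Celsius to be consistent with the observation data. First, subtract 273.15 from the data values; second, change the `units` attribute. The command below does both in one call: the chained operator `-subc,273.15` is applied first, and `setattribute` is then applied to its result.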

```
$ cdo setattribute,tas@units=degC -subc,273.15 tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012.nc tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012_unitC.nc 

```
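
The arithmetic behind the conversion can be checked by hand, and the converted file can be inspected to confirm that the values now fall in a plausible Celsius range. This is a minimal sketch: `k_to_c` is a helper defined here purely for illustration and is not part of CDO.

```shell
# Convert a single Kelvin value to degrees Celsius (illustrative helper).
k_to_c() { awk -v k="$1" 'BEGIN { printf "%.2f\n", k - 273.15 }'; }

k_to_c 288.15   # prints 15.00
k_to_c 273.15   # prints 0.00

converted=tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012_unitC.nc
if command -v cdo >/dev/null 2>&1 && [ -r "$converted" ]; then
    # Minimum, mean and maximum of the first few time steps, now in degC.
    cdo infon "$converted" | head -n 4
else
    echo "cdo or ${converted} not available; skipping the range check"
fi
```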

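### Find observational temperature data and select years 1991-2000

The observations come from the University of Delaware (Willmott and Matsuura) global land temperature dataset. See the [dataset description](https://climatedataguide.ucar.edu/climate-data/global-land-precipitation-and-temperature-willmott-matsuura-university-delaware) for details.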

```
$ ls /scratch/public/nci-data-training/air.mon.mean.v501.nc
$ cdo selyear,1991/2000 /scratch/public/nci-data-training/air.mon.mean.v501.nc   air.mon.mean.v501.199101-200012.nc
```

### Now we want to see the difference between the model data and the observation data

**Basic usage:**  
`cdo sub <input1.nc> <input2.nc> <output.nc>` subtracts input2.nc from input1.nc and writes the result to output.nc.

```
$ cdo sub tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012_unitC.nc air.mon.mean.v501.199101-200012.nc CESM2_DelawareT_dif.nc
```

However, this command fails with the following error:

cdo sub(Abort): Grid size of the input parameter tas do not match!

This is because the model data and the observation data are on grids of different resolution. CDO provides several interpolation (remapping) operators; one of them is `remapcon`, which performs first-order conservative remapping.
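
The grid mismatch can be confirmed directly by printing each file's grid description with `griddes`. The sketch below assumes both files produced earlier in this notebook are in the current directory; the `grep` pattern keeps only a few of the keys that `griddes` prints.

```shell
model=tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012_unitC.nc
obs=air.mon.mean.v501.199101-200012.nc

for f in "$model" "$obs"; do
    if command -v cdo >/dev/null 2>&1 && [ -r "$f" ]; then
        echo "== $f"
        # Keep only the grid type and the grid dimensions.
        cdo -s griddes "$f" | grep -E '^(gridtype|gridsize|xsize|ysize)'
    else
        echo "skipping $f (cdo or the file is not available here)"
    fi
done
```

If the `xsize`, `ysize` or `gridsize` values differ between the two files, `cdo sub` cannot pair up the grid points and aborts as shown above.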

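**Basic usage:**
```
cdo remapcon,<input1.nc> <input2.nc> <output.nc>
```
Here input1.nc is the file whose grid defines the target resolution, and input2.nc is the file to be remapped onto that grid. Note that there must be no space after the comma.
So, let's do the remapping first and then the subtraction.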

```
$ cdo sub -remapcon,air.mon.mean.v501.199101-200012.nc tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012_unitC.nc air.mon.mean.v501.199101-200012.nc CESM2_DelawareT_dif.nc
```
![ ](images/cdo_comp2.png)
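
The chained command above can also be read as two separate steps, which may be easier to debug. The sketch below writes the remapped model data to an intermediate file before subtracting. The names `tas_CESM2_on_obs_grid.nc` and `CESM2_DelawareT_dif_twostep.nc` are hypothetical, chosen here so that the earlier output is not overwritten.

```shell
target=air.mon.mean.v501.199101-200012.nc
model=tas_Amon_CESM2_historical_r1i1p1f1_gn_199101-200012_unitC.nc
remapped=tas_CESM2_on_obs_grid.nc            # hypothetical intermediate file
diff_out=CESM2_DelawareT_dif_twostep.nc      # hypothetical output file

if command -v cdo >/dev/null 2>&1 && [ -r "$target" ] && [ -r "$model" ]; then
    # Step 1: remap the model data onto the observation grid.
    cdo remapcon,"$target" "$model" "$remapped"
    # Step 2: model minus observation, now on matching grids.
    cdo sub "$remapped" "$target" "$diff_out"
else
    echo "cdo or the input files are not available here; skipping"
fi
```

Chaining avoids the intermediate file, while the two-step form leaves the remapped data on disk for inspection.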

### Calculate average difference and show it in ncview

```
$ cdo timavg CESM2_DelawareT_dif.nc CESM2_DelawareT_dif_avg.nc
$ ncview CESM2_DelawareT_dif_avg.nc
```
<div class="alert alert-info">
<b>Tip: </b> In CDO, an artificial distinction is made between the notions mean (e.g. timmean) and average (e.g. timavg). The mean is regarded as a statistical function, whereas the average is found simply by adding the sample members and dividing the result by the sample size. For example, the mean of 1, 2, miss and 3 is (1 + 2 + 3)/3 = 2, whereas the average is (1 + 2 + miss + 3)/4 = miss/4 = miss. If there are no missing values in the sample, the average and mean are identical.
</div>
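
The distinction in the tip can be reproduced with a few lines of `awk` on the same sample. This is an illustration of the rule only, not of how CDO implements it; the literal string `miss` stands in for a missing value.

```shell
sample="1 2 miss 3"

# "mean": skip missing members and divide by the number of valid ones.
mean=$(echo "$sample" | awk '{
    s = 0; n = 0
    for (i = 1; i <= NF; i++) if ($i != "miss") { s += $i; n++ }
    if (n > 0) print s / n; else print "miss"
}')

# "average": any missing member makes the whole result missing.
avg=$(echo "$sample" | awk '{
    s = 0
    for (i = 1; i <= NF; i++) {
        if ($i == "miss") { print "miss"; exit }
        s += $i
    }
    print s / NF
}')

echo "mean = ${mean}, average = ${avg}"   # prints: mean = 2, average = miss
```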

![ ](images/cdo_comp3.png)

We can see that the model-simulated temperature is higher than the observations in some areas and lower in others, and that the differences appear to be larger at high latitudes.

### Summary

In this example, we showed how to use CDO to select a range of years, convert units, remap data onto a common grid, and compute the difference between model and observation data.

## Reference

https://code.mpimet.mpg.de/projects/cdo/embedded/cdo.pdf
