# cdo & python in jupyter for climate research

is *all* you need.   

## cdo := climate data operators
https://code.mpimet.mpg.de/projects/cdo
* is a command-line software suite 
* is intuitive, fast (parallelizable), reliable and well-maintained
* retains netCDF metadata

  

## netCDF := data format with meta-data
   
Adrian Tompkins - Climate Unboxed:
   
[![Adrian Tompkins - Climate Unboxed](http://img.youtube.com/vi/UvNBnjiTXa0/0.jpg)](https://www.youtube.com/watch?v=UvNBnjiTXa0)

   

## python := the standard programming language in climate research 
https://en.wikipedia.org/wiki/Python_(programming_language)
   
* open-source
* readable and universal
* allows you to easily download from the *Climate Data Store* using the cdsapi
<!-- * Multi-paradigm: object-oriented, procedural (imperative), functional, structured, reflective -->

   

## jupyter := interactive programming environment
   
* provides file-browser, editor and computing interface 
* the *jupyter-notebook* is a collection of cells (input) that can be executed, with the output being displayed directly below.
    * one of the most common ways to share scientific code.
    * can be exported for presentations in .html or .pdf, like this one here.
   

# Everything else you need for research
## git & github := a version control program (git) and a hosting platform (github)
* allows you to maintain and share code
## latex
* for writing theses and papers
## inkscape
* for figures and graphics

## software requirements on linux:

* sudo apt-get install cdo
* sudo apt-get install ncview
* sudo apt-get install python3
* sudo apt-get install ipython3
* sudo apt-get install python3-pip

* pip install 
    * 	netCDF4
    * 	seaborn
    * 	cartopy
    * 	xarray
    * 	...

* sudo apt-get install jupyterlab
* sudo apt-get install spyder

## Let's get started!
with some model output, reanalysis and observations:

In [101]:
! ls modeldata_precipitation/timmean*

modeldata_precipitation/timmean_5Y_n90dis_IFS4_pr_0p25deg.nc
modeldata_precipitation/timmean_5Y_n90disn512con_ngc3028_pr_Nzoom9_P1D.nc
modeldata_precipitation/timmean_5Y_n90dis_ngc4008_pr_Nzoom7_P1D.nc
modeldata_precipitation/timmean_5Y_n90dis_ngc40AMIP_pr_Nzoom8_P1D.nc
modeldata_precipitation/timmean_ensmean_n90con_pr_AMIP6_oooo.nc
modeldata_precipitation/timmean_ensmean_n90dis_pr_mmday_Amon_oooo__historical_r1i1p1f1_gn_19790116-20141216_v20190710.nc
modeldata_precipitation/timmean_monmean_n90dis_ngc2013_atm_2d_3h_mean_oooo.nc
modeldata_precipitation/timmean_monmean_n90dis_rthk001_atm_2d_3h_mean_oooo.nc
modeldata_precipitation/timmean_n90con_IMERG_trop_20162020.nc
modeldata_precipitation/timmean_n90dis_IFS_4.4-FESOM_5-cycle3_2D_monthly_0.25deg_pr.nc
modeldata_precipitation/timmean_n90dis_IFS_9-FESOM_5-cycle3_2D_monthly_0.25deg_pr.nc
modeldata_precipitation/timmean_n90dis_IFS_9-NEMO_25-cycle3_2D_monthly_0.25deg_pr.nc
modeldata_precipitation/timmean_pr_n90dis_daymean_ngc2009_atm_2d_30mi

and cdo:   
Basic syntax:
    cdo *operator* infile.nc outfile.nc

In [102]:
! cdo sinfov modeldata_precipitation/timmean_5Y_n90dis_IFS4_pr_0p25deg.nc


   File format : NetCDF4
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : unknown  unknown  v instant       1   1     64800   1  F64  : pr            
   Grid coordinates :
     1 : gaussian                 : points=64800 (360x180)  F90
                              lon : 0 to 359 by 1 degrees_east  circular
                              lat : 89.23664 to -89.23664 degrees_north
   Vertical coordinates :
     1 : surface                  : levels=1
   Time coordinate :
                             time : 1 step
     RefTime =  1970-01-01 00:00:00  Units = seconds  Calendar = proleptic_gregorian  Bounds = true
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2022-07-16 12:00:00
cdo    sinfon: Processed 1 variable over 1 timestep [0.01s 74MB].


In [103]:
! cdo fldmean modeldata_precipitation/timmean_5Y_n90dis_IFS4_pr_0p25deg.nc modeldata_precipitation/fldtimmean_5Y_n90dis_IFS4_pr_0p25deg.nc
! cdo infov modeldata_precipitation/fldtimmean_5Y_n90dis_IFS4_pr_0p25deg.nc


cdo    fldmean: Processed 64800 values from 1 variable over 1 timestep [0.06s 74MB].
    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter name
     1 : 2022-07-16 12:00:00       0        1       0 :                0.090228             : pr            
cdo    infon: Processed 1 value from 1 variable over 1 timestep [0.01s 74MB].

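What `fldmean` does can be sketched in plain python. cdo weights every grid cell by its area; the sketch below approximates that with a cos(latitude) weight per row, which is an assumption that only holds for regular latitude-longitude grids. The function name `area_weighted_mean` and the toy field are hypothetical and not part of cdo.

```python
import math

def area_weighted_mean(field, lats_deg):
    """Mean of a 2-D field (rows = latitudes, columns = longitudes).

    Each row is weighted by cos(latitude) as a proxy for grid-cell area
    (assumption: regular lat-lon grid). NaN values are skipped.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(field, lats_deg):
        weight = math.cos(math.radians(lat))
        for value in row:
            if math.isnan(value):
                continue
            weighted_sum += weight * value
            weight_total += weight
    return weighted_sum / weight_total

# Hypothetical toy field: 3 latitudes x 4 longitudes
toy_lats = [60.0, 0.0, -60.0]
toy_field = [
    [4.0, 4.0, 4.0, 4.0],
    [1.0, 1.0, 1.0, 1.0],
    [4.0, 4.0, 4.0, 4.0],
]

plain_mean = sum(sum(row) for row in toy_field) / 12
weighted_mean = area_weighted_mean(toy_field, toy_lats)
print(plain_mean, weighted_mean)
```

The high-latitude rows count half as much as the equatorial row here, so the weighted mean is lower than the plain arithmetic mean.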

Piping commands:   
cdo *operator1* *-operator2* *-operator3* ...  *-operatorN* infile.nc outfile.nc

In [104]:
! cdo infov -mulc,1000. -divc,30. modeldata_precipitation/fldtimmean_5Y_n90dis_IFS4_pr_0p25deg.nc #modeldata_precipitation/mmday_fldtimmean_5Y_n90dis_IFS4_pr_0p25deg.nc


cdo(1) mulc: Process started
cdo(2) divc: Process started
    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter name
     1 : 2022-07-16 12:00:00       0        1       0 :                  3.0076             : pr            
cdo(2) divc: Processed 1 value from 1 variable over 1 timestep.
cdo(1) mulc: Processed 1 value from 1 variable over 1 timestep.
cdo    infon: Processed 1 value from 1 variable over 1 timestep [0.01s 74MB].

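In a chained cdo command the operator closest to the input file runs first, so `-divc,30.` is applied before `-mulc,1000.`. The sketch below mimics that order on the single field-mean value printed earlier. `apply_chain` is a hypothetical helper that only understands `mulc` and `divc`. Reading the division by 30 as "monthly accumulation to daily value" is an assumption; the factor 1000 converts m to mm.

```python
def apply_chain(value, operators):
    """Apply cdo-style constant operators to a single number.

    `operators` is written in command-line order, so the LAST entry
    (the one next to the input file) is applied first.
    """
    for op in reversed(operators):
        name, const = op.lstrip('-').split(',')
        const = float(const)
        if name == 'mulc':
            value *= const
        elif name == 'divc':
            value /= const
        else:
            raise ValueError('unsupported operator: ' + name)
    return value

fld_mean_m = 0.090228  # field mean from the fldmean output above, units: m
mm_per_day = apply_chain(fld_mean_m, ['-mulc,1000.', '-divc,30.'])
print(round(mm_per_day, 4))
```

For pure multiplication and division the order does not change the result, but it matters as soon as operators such as `addc` or `fldmean` are mixed into the chain.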

### Let's get started pythonically!

In [105]:
file = 'modeldata_precipitation/timmean_5Y_n90dis_IFS4_pr_0p25deg.nc'


In [106]:
import subprocess

def spct(comm):
    # run a shell command; its output is shown below the cell
    subprocess.call(comm, shell=True)

spct('cdo sinfov ' + file)


   File format : NetCDF4
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : unknown  unknown  v instant       1   1     64800   1  F64  : pr            
   Grid coordinates :
     1 : gaussian                 : points=64800 (360x180)  F90
                              lon : 0 to 359 by 1 degrees_east  circular
                              lat : 89.23664 to -89.23664 degrees_north
   Vertical coordinates :
     1 : surface                  : levels=1
   Time coordinate :
                             time : 1 step
     RefTime =  1970-01-01 00:00:00  Units = seconds  Calendar = proleptic_gregorian  Bounds = true
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  2022-07-16 12:00:00
cdo    sinfon: Processed 1 variable over 1 timestep [0.01s 46MB].
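If the command output is needed as a python string rather than just printed, `subprocess.run` with a list of arguments is a possible alternative to `shell=True`. This is a minimal sketch: `run_and_capture` is a hypothetical helper, and a python one-liner stands in for cdo so the cell also runs on machines without cdo installed.

```python
import subprocess
import sys

def run_and_capture(args):
    """Run a command given as a list of arguments and return its stdout as text.

    Raises subprocess.CalledProcessError if the command exits with a
    non-zero status.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout

# Stand-in command; with cdo available one could pass
# ['cdo', 'sinfov', file] instead.
output = run_and_capture([sys.executable, '-c', 'print("hello from a subprocess")'])
print(output.strip())
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.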


### ncview for a quick look at netcdf files

In [107]:
# spct('ncview '+file)


![Image](ncview_file.png)

### have a closer look using the python module netCDF4

In [108]:
import netCDF4
nc_file = netCDF4.Dataset(file)
print(nc_file.variables['pr'])

<class 'netCDF4._netCDF4.Variable'>
float64 pr(time, lat, lon)
    units: m
    CDI_grid_type: gaussian
    CDI_grid_num_LPE: 90
    _FillValue: 9999.0
    missing_value: 9999.0
    cell_methods: time: mean
    paramId: 228
    dataType: fc
    numberOfPoints: 1038240
    typeOfLevel: surface
    stepUnits: 1
    stepType: accum
    gridType: regular_ll
    shortName: tp
    name: Total precipitation
    cfVarName: tp
    missingValue: 9999
    NV: 0
    gridDefinitionDescription: Latitude/longitude
unlimited dimensions: time
current shape = (1, 180, 360)
filling off


## CDSAPI
[![Adrian Tompkins - Climate Unboxed](http://img.youtube.com/vi/AmF1nn7o6Hc/0.jpg)](https://www.youtube.com/watch?v=AmF1nn7o6Hc)


In [112]:
import cdsapi
import os

c = cdsapi.Client()

# download one file per year, skipping years that are already on disk
for year in range(2016, 2020):
    print(year)
    target = 'downloadERA5_oooo_' + str(year) + '.nc'
    if not os.path.isfile(target):
        c.retrieve(
            'reanalysis-era5-single-levels-monthly-means',
            {
                'product_type': 'monthly_averaged_reanalysis',
                'variable': [
                    'surface_latent_heat_flux',
                    'surface_sensible_heat_flux',
                    'surface_net_solar_radiation',
                    'surface_net_thermal_radiation',
                    'top_net_solar_radiation',
                    'top_net_thermal_radiation',
                    'total_column_cloud_ice_water',
                    'total_column_cloud_liquid_water',
                    'total_precipitation',
                ],
                'year': str(year),
                'month': [
                    '01', '02', '03',
                    '04', '05', '06',
                    '07', '08', '09',
                    '10', '11', '12',
                ],
                'time': '00:00',
                'format': 'netcdf',
            },
            target)
    else:
        print('file already exists!')

2024-05-08 11:16:35,646 INFO Welcome to the CDS
2024-05-08 11:16:35,649 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels-monthly-means


2016


2024-05-08 11:16:35,760 INFO Request is queued
2024-05-08 11:16:36,777 INFO Request is running
2024-05-08 11:16:49,033 INFO Request is completed
2024-05-08 11:16:49,034 INFO Downloading https://download-0002-clone.copernicus-climate.eu/cache-compute-0002/cache/data6/adaptor.mars.internal-1715159805.0285292-16619-16-b76a9a9a-ec11-4a6a-89bf-4f37fe19961b.nc to downloadERA5_oooo_2016.nc (213.9M)
2024-05-08 11:17:12,749 INFO Download rate 9M/s                                                                                                 
2024-05-08 11:17:12,788 INFO Welcome to the CDS
2024-05-08 11:17:12,790 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels-monthly-means
2024-05-08 11:17:12,842 INFO Request is queued


2017


2024-05-08 11:17:13,861 INFO Request is running
2024-05-08 11:17:26,120 INFO Request is completed
2024-05-08 11:17:26,121 INFO Downloading https://download-0003-clone.copernicus-climate.eu/cache-compute-0003/cache/data9/adaptor.mars.internal-1715159840.826419-28386-3-fae4d674-30d7-497e-a45c-b223140723f7.nc to downloadERA5_oooo_2017.nc (213.9M)
2024-05-08 11:17:50,694 INFO Download rate 8.7M/s                                                                                               
2024-05-08 11:17:50,721 INFO Welcome to the CDS
2024-05-08 11:17:50,722 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels-monthly-means
2024-05-08 11:17:50,754 INFO Request is queued


2018


2024-05-08 11:17:51,771 INFO Request is running
2024-05-08 11:18:04,029 INFO Request is completed
2024-05-08 11:18:04,030 INFO Downloading https://download-0015-clone.copernicus-climate.eu/cache-compute-0015/cache/data9/adaptor.mars.internal-1715159879.735495-23781-1-41478cc0-c0a2-45a5-b6e0-bc75698a2e58.nc to downloadERA5_oooo_2018.nc (213.9M)
2024-05-08 11:18:26,731 INFO Download rate 9.4M/s                                                                                               
2024-05-08 11:18:26,760 INFO Welcome to the CDS
2024-05-08 11:18:26,761 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels-monthly-means
2024-05-08 11:18:26,797 INFO Request is queued


2019


2024-05-08 11:18:40,064 INFO Request is running
2024-05-08 11:18:59,098 INFO Request is completed
2024-05-08 11:18:59,100 INFO Downloading https://download-0015-clone.copernicus-climate.eu/cache-compute-0015/cache/data2/adaptor.mars.internal-1715159929.0780303-20576-7-580f40c6-c49f-4a04-b5b5-125cea238142.nc to downloadERA5_oooo_2019.nc (213.9M)
2024-05-08 11:19:23,030 INFO Download rate 8.9M/s                                                                                               
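The "skip what is already there" logic of the download loop can be tried out without contacting the Climate Data Store. In the sketch below, `target_name` and `years_to_download` are hypothetical helpers; the file name pattern is the one used in the loop, and an empty placeholder file in a temporary directory plays the role of a finished download.

```python
import os
import tempfile

def target_name(year, directory='.'):
    """File name used for the ERA5 download of one year."""
    return os.path.join(directory, 'downloadERA5_oooo_' + str(year) + '.nc')

def years_to_download(years, directory='.'):
    """Return the years whose target file does not exist yet."""
    return [y for y in years if not os.path.isfile(target_name(y, directory))]

with tempfile.TemporaryDirectory() as tmp:
    # pretend that 2016 has already been downloaded
    open(target_name(2016, tmp), 'w').close()
    pending = years_to_download(range(2016, 2020), tmp)
    print(pending)
```

Note that `range(2016, 2020)` stops at 2019, which matches the four downloads in the log.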


### using cdo within python

In [109]:
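import glob
import os
import subprocess
import netCDF4

def spct(comm):
    # run a shell command from python
    subprocess.call(comm, shell=True)

in_files = glob.glob('modeldata_precipitation/timmean_*.nc')

for in_file in in_files:
    # the field mean is written next to the input file, with the prefix 'fldmean_'
    out_file = in_file.replace('modeldata_precipitation/timmean','modeldata_precipitation/fldmean_timmean')

    # compute the field mean only if it does not exist yet
    if not os.path.isfile(out_file):
        spct('cdo fldmean '+in_file+' '+out_file)
    if os.path.isfile(out_file):
        print(netCDF4.Dataset(out_file).variables['pr'])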

<class 'netCDF4._netCDF4.Variable'>
float64 pr(time, lat, lon)
    _FillValue: nan
    missing_value: nan
    cell_methods: time: mean
unlimited dimensions: time
current shape = (1, 1, 1)
filling off
<class 'netCDF4._netCDF4.Variable'>
float32 pr(time, lat, lon)
    standard_name: pr
    long_name: precipitation flux
    units: kg m-2 s-1
    param: 52.1.0
    cell_methods: time: mean
    _QuantizeBitRoundNumberOfSignificantBits: 13
unlimited dimensions: time
current shape = (1, 1, 1)
filling off
<class 'netCDF4._netCDF4.Variable'>
float32 pr(time, lat, lon)
    long_name: precipitation flux
    units: kg m-2 s-1
    cell_methods: time: mean cell: mean
    component: atmo
    vgrid: surface
unlimited dimensions: time
current shape = (1, 1, 1)
filling off
<class 'netCDF4._netCDF4.Variable'>
float32 pr(time, lat, lon)
    long_name: Total precipitation
    units: m
    _FillValue: -32767.0
    missing_value: -32767.0
    cell_methods: time: mean
unlimited dimensions: time
current shape =

In [110]:
! ls modeldata_precipitation/timm*

modeldata_precipitation/timmean_5Y_n90dis_IFS4_pr_0p25deg.nc
modeldata_precipitation/timmean_5Y_n90disn512con_ngc3028_pr_Nzoom9_P1D.nc
modeldata_precipitation/timmean_5Y_n90dis_ngc4008_pr_Nzoom7_P1D.nc
modeldata_precipitation/timmean_5Y_n90dis_ngc40AMIP_pr_Nzoom8_P1D.nc
modeldata_precipitation/timmean_ensmean_n90con_pr_AMIP6_oooo.nc
modeldata_precipitation/timmean_ensmean_n90dis_pr_mmday_Amon_oooo__historical_r1i1p1f1_gn_19790116-20141216_v20190710.nc
modeldata_precipitation/timmean_monmean_n90dis_ngc2013_atm_2d_3h_mean_oooo.nc
modeldata_precipitation/timmean_monmean_n90dis_rthk001_atm_2d_3h_mean_oooo.nc
modeldata_precipitation/timmean_n90con_IMERG_trop_20162020.nc
modeldata_precipitation/timmean_n90dis_IFS_4.4-FESOM_5-cycle3_2D_monthly_0.25deg_pr.nc
modeldata_precipitation/timmean_n90dis_IFS_9-FESOM_5-cycle3_2D_monthly_0.25deg_pr.nc
modeldata_precipitation/timmean_n90dis_IFS_9-NEMO_25-cycle3_2D_monthly_0.25deg_pr.nc
modeldata_precipitation/timmean_pr_n90dis_daymean_ngc2009_atm_2d_30mi

### Tasks:
1. Install the programs and python modules listed in slide 2. 
2. Interpret the model output given in folder "modeldata_precipitation" using cdo and ncview. 
3. Produce plots of the zonal average of precipitation flux in units of mm/day using python and its module matplotlib.
4. Produce maps of precipitation flux in units of mm/day using python and its modules matplotlib and cartopy.
5. Produce maps of precipitation biases of the models with respect to the IMERG observations (timmean_n90con_IMERG_trop_20162020.nc).
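The files come with different units (`m` and `kg m-2 s-1` appear in the output above), so the tasks need a conversion to mm/day first. This is a minimal sketch of such a conversion: `to_mm_per_day` is a hypothetical helper, and for data in `m` the accumulation period has to be supplied by the user because it cannot be read off the unit alone.

```python
SECONDS_PER_DAY = 86400.0

def to_mm_per_day(value, units, accumulation_days=None):
    """Convert a precipitation value to mm/day.

    'kg m-2 s-1': 1 kg of water per m2 is a 1 mm deep layer,
                  so only the time unit changes.
    'm'         : accumulated depth; needs the accumulation period
                  in days (assumption supplied by the caller).
    """
    if units == 'kg m-2 s-1':
        return value * SECONDS_PER_DAY
    if units == 'm':
        if accumulation_days is None:
            raise ValueError('accumulation_days is required for units "m"')
        return value * 1000.0 / accumulation_days
    raise ValueError('unknown units: ' + units)

flux_example = to_mm_per_day(3.5e-5, 'kg m-2 s-1')   # hypothetical flux value
depth_example = to_mm_per_day(0.090228, 'm', accumulation_days=30)
print(flux_example, depth_example)
```

The same function also works on numpy arrays or netCDF4 variables read into memory, since it only multiplies and divides.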



### *have fun!*

In [113]:
!jupyter nbconvert jupy-cdo_tutorial.ipynb --theme=Jupyterlab-light --to slides --SlidesExporter.reveal_scroll=True  --SlidesExporter.reveal_theme=dark --SlidesExporter.theme=dark

#--SlidesExporter.reveal_theme=white  --template nbconvert_template.tpl #slides#--post serve


[NbConvertApp] Converting notebook jupy-cdo_tutorial.ipynb to slides
[NbConvertApp] Writing 387338 bytes to jupy-cdo_tutorial.slides.html
