# dev_climapy_data01.ipynb
## Purpose
1. Generate data01.nc, to use as test data for climapy.
2. Use CDO to calculate "truths" for testing of climapy.

## Dependencies
- Python packages listed below.
- [Climate Data Operators (CDO)](https://code.zmaw.de/projects/cdo) - required for testing purposes.

## History
- 2017-04 - Benjamin S. Grandey.

In [1]:
from glob import glob
import os

## Input data location
In order to generate data01.nc, I will use NetCDF data files from [*Data for "Radiative effects of interannually varying vs. interannually invariant aerosol emissions from fires"* (doi: 10.6084/m9.figshare.3497705.v5)](https://doi.org/10.6084/m9.figshare.3497705.v5).

In [2]:
# Location of gunzipped input NetCDF files - CHANGE IF NECESSARY
in_dir = '$HOME/data/figshare/figshare3497705v5/'
in_dir = os.path.expandvars(in_dir)
# Select specific file - bb0_o2000.nc
in_filename = in_dir + 'bb0_o2000.nc'

## Output data location

In [3]:
# Directory in which to write output data01.nc
data_dir = '$HOME/data/temp/'  # store here temporarily; move elsewhere later
data_dir = os.path.expandvars(data_dir)
data_filename = data_dir + 'data01.nc'

In [4]:
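# Directory in which to write "truths" calculated by CDO
cdo_dir = data_dir + 'cdo_results/'
# Create the directory if missing; otherwise the CDO calls below fail silently
os.makedirs(cdo_dir, exist_ok=True)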

## Clean up previously created files?
If previously created files are found, user input is required. The recommended answer is 'y', but proceed with caution: deletion cannot be undone.

In [5]:
# Clean up data files previously created? Recommended, but be cautious.
filename_list = glob(data_filename) + glob(cdo_dir+'*')
if filename_list:
    s = input('Files found: {}\n'.format(filename_list) +
              '*Would you like to permanently delete these files?* Type "y" if so.\n' +
              'Your response: ')
    if s == 'y':
        for filename in filename_list:
            print('Deleting {}'.format(filename))
            os.remove(filename)
    else:
        print('Files not deleted.')
else:
    print('No files found.')

No files found.


## Generate data01.nc

In [6]:
# Select data from two variables over a 5-year period using CDO
!cdo selyear,3,4,5,6,7 -selname,TS,PRECL {in_filename} {data_filename} &> /dev/null
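
The `!cdo ...` line above relies on the IPython shell escape, so it only runs inside Jupyter or IPython. As a minimal sketch of how the same selection could be assembled in plain Python, the hypothetical helper `build_cdo_select_command` below (not part of climapy or CDO) builds the argument list; the file names are placeholders and the command is only printed, not executed.

```python
def build_cdo_select_command(in_path, out_path, years, names):
    """Return the argument list for a CDO year and variable selection.

    Hypothetical helper for illustration; mirrors
    `cdo selyear,<years> -selname,<names> in out`.
    """
    year_op = 'selyear,' + ','.join(str(y) for y in years)
    name_op = '-selname,' + ','.join(names)
    return ['cdo', year_op, name_op, in_path, out_path]


# Placeholder file names; substitute in_filename and data_filename as needed
cmd = build_cdo_select_command('bb0_o2000.nc', 'data01.nc',
                               years=range(3, 8), names=['TS', 'PRECL'])
print(' '.join(cmd))
# To execute outside Jupyter (requires CDO on the PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing `check=True` to `subprocess.run` would raise an error if CDO fails, whereas redirecting output to `/dev/null` hides any failure.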

## Calculate grid-cell area using CDO

In [7]:
out_filename = cdo_dir + 'data01_gridarea.nc'
!cdo gridarea {data_filename} {out_filename} &> /dev/null

## Select data for regions using CDO

In [8]:
# Region bounds dictionary for development/testing purposes
region_bounds_dict = {'EAs': [(94, 156), (20, 65)],  # longitude tuple, latitude tuple
                      'SEAs': [(94, 161), (-10, 20)],
                      'ANZ': [(109, 179), (-50, -10)],
                      'SAs': [(61, 94), (0, 35)],
                      'AfME': [(-21, 61), (-40, 35)],
                      'Eur': [(-26, 31), (35, 75)],
                      'CAs': [(31, 94), (35, 75)],
                      'NAm': [(-169, -51), (15, 75)],
                      'SAm': [(266, 329), (-60, 15)],
                      'Zon': [None, (-75.5, -65.5)],
                      'Mer': [(175.5, 185.5), None]}

In [9]:
# Use CDO to select data for regions
for region, bounds in region_bounds_dict.items():
    out_filename = cdo_dir + 'data01_' + region + '.nc'
    lon_bounds, lat_bounds = bounds
    if lon_bounds is None:  # no longitude limits: use full range
        lon_bounds = (-180, 180)
    if lat_bounds is None:  # no latitude limits: use full range
        lat_bounds = (-90, 90)
    bounds_str = ','.join([str(x) for x in lon_bounds+lat_bounds])
    !cdo sellonlatbox,{bounds_str} {data_filename} {out_filename} &> /dev/null
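
The default-filling logic in the loop above is repeated in later cells. As a sketch, it could be factored into one function; the name `bounds_to_sellonlatbox` is hypothetical and is not part of climapy or CDO.

```python
def bounds_to_sellonlatbox(lon_bounds=None, lat_bounds=None):
    """Return 'lon1,lon2,lat1,lat2' for CDO's sellonlatbox operator.

    A bound of None means "no restriction" and is replaced by the
    full longitude or latitude range, as in the loop above.
    """
    if lon_bounds is None:
        lon_bounds = (-180, 180)
    if lat_bounds is None:
        lat_bounds = (-90, 90)
    return ','.join(str(x) for x in tuple(lon_bounds) + tuple(lat_bounds))


print(bounds_to_sellonlatbox((94, 156), (20, 65)))   # 94,156,20,65
print(bounds_to_sellonlatbox(None, (-75.5, -65.5)))  # -180,180,-75.5,-65.5
```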

## Calculate sum of area for each region

In [10]:
# Globe
out_filename = cdo_dir + 'data01_Glb_area.nc'
!cdo fldsum -gridarea {data_filename} {out_filename} &> /dev/null

In [11]:
# Regions
for region, bounds in region_bounds_dict.items():
    out_filename = cdo_dir + 'data01_' + region + '_area.nc'
    lon_bounds, lat_bounds = bounds
    if lon_bounds is None:  # no longitude limits: use full range
        lon_bounds = (-180, 180)
    if lat_bounds is None:  # no latitude limits: use full range
        lat_bounds = (-90, 90)
    bounds_str = ','.join([str(x) for x in lon_bounds+lat_bounds])
    !cdo fldsum -gridarea -sellonlatbox,{bounds_str} {data_filename} {out_filename} &> /dev/null
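
As a rough sanity check on the area sums, the grid-cell areas of any global grid should add up to approximately the surface area of a sphere, $4 \pi R^2$. The sketch below illustrates this with the standard formula for a longitude-latitude cell on a sphere. The 2-degree grid and the radius `EARTH_RADIUS_M` are assumptions for illustration only; they are not taken from data01.nc, and CDO's own `gridarea` calculation may differ in detail.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed radius in metres; check the value CDO uses


def cell_area(lat_south, lat_north, dlon_deg, radius=EARTH_RADIUS_M):
    """Area (m^2) of a longitude-latitude cell on a sphere."""
    dlon = math.radians(dlon_deg)
    dsin = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return radius ** 2 * dlon * dsin


# Hypothetical regular 2-degree global grid (not the grid of data01.nc)
dlat, dlon = 2.0, 2.0
n_lat, n_lon = 90, 180
total = 0.0
for j in range(n_lat):
    lat_south = -90.0 + j * dlat
    # Every cell in a latitude band has the same area
    total += n_lon * cell_area(lat_south, lat_south + dlat, dlon)

sphere_area = 4 * math.pi * EARTH_RADIUS_M ** 2
print(total / sphere_area)  # ratio should be very close to 1
```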

## Calculate area-weighted mean for each region

In [12]:
# Globe
out_filename = cdo_dir + 'data01_Glb_fldmean.nc'
!cdo fldmean {data_filename} {out_filename} &> /dev/null

In [13]:
# Regions
for region, bounds in region_bounds_dict.items():
    out_filename = cdo_dir + 'data01_' + region + '_fldmean.nc'
    lon_bounds, lat_bounds = bounds
    if lon_bounds is None:  # no longitude limits: use full range
        lon_bounds = (-180, 180)
    if lat_bounds is None:  # no latitude limits: use full range
        lat_bounds = (-90, 90)
    bounds_str = ','.join([str(x) for x in lon_bounds+lat_bounds])
    !cdo fldmean -sellonlatbox,{bounds_str} {data_filename} {out_filename} &> /dev/null
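
For reference, an area-weighted mean is the sum of each value multiplied by its grid-cell area, divided by the total area. The sketch below shows that arithmetic on made-up numbers; the function name `area_weighted_mean` and the sample values are hypothetical, and missing-value handling is omitted.

```python
def area_weighted_mean(values, areas):
    """Return sum(value * area) / sum(area) for paired sequences."""
    total_area = sum(areas)
    return sum(v * a for v, a in zip(values, areas)) / total_area


# Made-up values for three grid cells with unequal areas
values = [280.0, 300.0, 290.0]
areas = [1.0, 3.0, 2.0]
weighted = area_weighted_mean(values, areas)
unweighted = sum(values) / len(values)
print(weighted)    # 1760 / 6, approximately 293.33
print(unweighted)  # 290.0
```

The weighted and unweighted results differ because the cell with the largest area contributes most to the weighted mean.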

## Summary of data in cdo_dir

In [14]:
glob(cdo_dir+'*')

['/Users/grandey/data/temp/cdo_results/data01_AfME.nc',
 '/Users/grandey/data/temp/cdo_results/data01_AfME_area.nc',
 '/Users/grandey/data/temp/cdo_results/data01_AfME_fldmean.nc',
 '/Users/grandey/data/temp/cdo_results/data01_ANZ.nc',
 '/Users/grandey/data/temp/cdo_results/data01_ANZ_area.nc',
 '/Users/grandey/data/temp/cdo_results/data01_ANZ_fldmean.nc',
 '/Users/grandey/data/temp/cdo_results/data01_CAs.nc',
 '/Users/grandey/data/temp/cdo_results/data01_CAs_area.nc',
 '/Users/grandey/data/temp/cdo_results/data01_CAs_fldmean.nc',
 '/Users/grandey/data/temp/cdo_results/data01_EAs.nc',
 '/Users/grandey/data/temp/cdo_results/data01_EAs_area.nc',
 '/Users/grandey/data/temp/cdo_results/data01_EAs_fldmean.nc',
 '/Users/grandey/data/temp/cdo_results/data01_Eur.nc',
 '/Users/grandey/data/temp/cdo_results/data01_Eur_area.nc',
 '/Users/grandey/data/temp/cdo_results/data01_Eur_fldmean.nc',
 '/Users/grandey/data/temp/cdo_results/data01_Glb_area.nc',
 '/Users/grandey/data/temp/cdo_results/data01_G