# Copyright Netherlands eScience Center <br>
** Function     : Packing the subdaily radiation and turbulent flux fields into weekly fields from ERA-Interim as netCDF** <br>
** Author       : Yang Liu ** <br>
** First Built  : 2019.05.17 ** <br>
** Last Update  : 2019.08.17 ** <br>
Description     : This notebook packs the subdaily radiation and turbulent flux fields from ERA-Interim into weekly fields.<br>
Return Values   : netCDF4 <br>
Caveat          : The data are subdaily spatial distributions of radiation and turbulent flux fields from 40N to 90N, covering 1979 to 2017. The sampling times are:<br>
0:00 + 12:00 <br>

The radiation fields are forecast fields rather than analysis fields. They are accumulated from the start of each forecast, so to obtain the value for a given interval we must subtract the value accumulated up to the previous step. Each forecast spans 12 hours. The start time of each forecast and its forecast steps are given below: <br>

00:00 : 3:00 + 6:00 + 9:00 + 12:00 <br>
12:00 : 3:00 + 6:00 + 9:00 + 12:00 <br>
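
The de-accumulation described above can be sketched on a toy series. This is a minimal illustration, assuming the time axis is ordered as consecutive forecasts of four 3-hourly steps each; the helper name `deaccumulate` and the sample numbers are hypothetical and not part of the notebook.

```python
import numpy as np

def deaccumulate(accumulated, steps_per_forecast=4):
    """Turn accumulated forecast values into per-step increments.

    Time is the first axis and is assumed to be ordered as consecutive
    forecasts, each containing `steps_per_forecast` steps.
    """
    accumulated = np.asarray(accumulated, dtype=float)
    increments = np.empty_like(accumulated)
    # the first step of each forecast starts accumulating from zero
    increments[0::steps_per_forecast] = accumulated[0::steps_per_forecast]
    # every later step is the difference with the step before it
    for k in range(1, steps_per_forecast):
        increments[k::steps_per_forecast] = (
            accumulated[k::steps_per_forecast]
            - accumulated[k - 1::steps_per_forecast]
        )
    return increments

# one day at a single grid point: two 12 h forecasts, four 3 h steps each
# (made-up accumulated values in J/m2)
acc = np.array([1.0, 3.0, 6.0, 10.0, 2.0, 4.0, 6.0, 8.0])
inc = deaccumulate(acc)
# daily mean flux in W/m2: each increment covers 3 hours
daily_mean = inc.mean() / (3 * 3600)
```

Here `inc` holds the energy received in each 3-hour window, so dividing its mean by the window length in seconds gives a mean flux.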

For the calculation of weekly fields, we assume each month consists of 4 weeks. The first 3 weeks each contain 7 days. The 4th week contains the remaining days of that month.<br>
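
As a small sketch of this weekly split, the helper below averages a month of daily values into 4 weeks. The function name `weekly_means` and the test series are hypothetical; only the 7/7/7/rest rule comes from the text above.

```python
import numpy as np

def weekly_means(daily):
    """Average daily fields (time on the first axis) into 4 weekly fields.

    Weeks 1 to 3 cover 7 days each; week 4 covers all remaining days.
    """
    daily = np.asarray(daily, dtype=float)
    bounds = [0, 7, 14, 21, daily.shape[0]]
    return np.stack(
        [daily[a:b].mean(axis=0) for a, b in zip(bounds[:-1], bounds[1:])]
    )

# day index used as the "value", so the weekly means are easy to check
feb = weekly_means(np.arange(28))  # 28 days: last week has 7 days
jan = weekly_means(np.arange(31))  # 31 days: last week has 10 days
```

Note that the 4th week spans 7 to 10 days depending on the month, so the weekly means are not all based on the same number of days.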

For all fluxes generated by the ECMWF model, downward is positive (regardless of the standard name), and this holds for ERA-Interim. Note that the standard name in the netCDF file does not account for the direction in this case (some variables are named as upward, but their values are actually positive downward).
* Net shortwave radiation at surface - downward 
* Net shortwave radiation at TOA - downward  
* Net longwave radiation at surface - downward 
* Upwelling longwave radiation at TOA - downward 
* Latent energy flux - downward 
* Sensible heat flux - downward 
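
Because every component uses the same sign convention, the net fluxes can be formed by plain addition. The sketch below uses the ERA-Interim short names that appear in the code further down (`ssr`, `str`, `slhf`, `sshf`, `tsr`, `ttr`); the numerical values are made up purely for illustration and are not taken from the data.

```python
# Hypothetical daily-mean components at one grid point, in W/m2.
# All follow the ECMWF convention: positive means downward.
components = {
    "ssr": 120.0,   # net shortwave radiation at surface
    "str": -60.0,   # net longwave radiation at surface
    "slhf": -40.0,  # latent heat flux
    "sshf": -15.0,  # sensible heat flux
    "tsr": 240.0,   # net shortwave radiation at TOA
    "ttr": -230.0,  # net longwave radiation at TOA
}

def net_fluxes(c):
    """Return (net surface flux, net TOA flux), both positive downward."""
    surface = c["sshf"] + c["slhf"] + c["ssr"] + c["str"]
    toa = c["tsr"] + c["ttr"]
    return surface, toa

sflux, toaflux = net_fluxes(components)
```

A positive result means a net energy gain by the surface (or by the atmospheric column at TOA); no sign flips are needed before summing.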

In [4]:
import numpy as np
import scipy as sp
import time as tttt
from netCDF4 import Dataset,num2date
import os

In [5]:
################################   Input zone  #########################################
# specify starting and ending time
start_year = 1979
end_year = 2017
# specify data path
# ERA-Interim subdaily radiation and turbulent flux fields
datapath = '/home/ESLT0068/WorkFlow/Core_Database_DeepLearn/ERA-Interim/rad_daily'
# specify output path for the netCDF file
output_path = '/home/ESLT0068/WorkFlow/Core_Database_DeepLearn/ERA-Interim'
########################################################################################

In [6]:
def var_key_retrieve(datapath, year, month):
    # build the path to the monthly dataset
    print ("Start retrieving datasets {} (y) {} (m)".format(year,month))
    # each variable has a spatial shape of (67, 480), i.e. (latitude, longitude)
    datapath_full = os.path.join(datapath, 'era{}'.format(year),'pressure_daily_075_diagnostic_{}_{}_rad.nc'.format(year,month))
    # get the variable keys
    var_key = Dataset(datapath_full)
    
    print ("Retrieving datasets successfully and return the variable key!")
    return var_key

In [7]:
def retriver(var_key):
    print ('Extract subdaily fields and calculate weekly fields.')
    sshf_accumulate = var_key.variables['sshf'][:]
    slhf_accumulate = var_key.variables['slhf'][:]
    ssr_accumulate = var_key.variables['ssr'][:]
    str_accumulate = var_key.variables['str'][:]
    tsr_accumulate = var_key.variables['tsr'][:]
    ttr_accumulate = var_key.variables['ttr'][:]
    # create arrays to store the values after removing accumulation
    sshf_synoptic = np.zeros(sshf_accumulate.shape)
    slhf_synoptic = np.zeros(slhf_accumulate.shape)
    ssr_synoptic = np.zeros(ssr_accumulate.shape)
    str_synoptic = np.zeros(str_accumulate.shape)
    tsr_synoptic = np.zeros(tsr_accumulate.shape)
    ttr_synoptic = np.zeros(ttr_accumulate.shape)
    # remove the accumulation to obtain the 3-hourly increments
    sshf_synoptic[0::4,:,:] = sshf_accumulate[0::4,:,:]
    slhf_synoptic[0::4,:,:] = slhf_accumulate[0::4,:,:]
    ssr_synoptic[0::4,:,:] = ssr_accumulate[0::4,:,:]
    str_synoptic[0::4,:,:] = str_accumulate[0::4,:,:]
    tsr_synoptic[0::4,:,:] = tsr_accumulate[0::4,:,:]
    ttr_synoptic[0::4,:,:] = ttr_accumulate[0::4,:,:]
    for i in np.arange(3):
        sshf_synoptic[i+1::4,:,:] = sshf_accumulate[i+1::4,:,:] - sshf_accumulate[i::4,:,:]
        slhf_synoptic[i+1::4,:,:] = slhf_accumulate[i+1::4,:,:] - slhf_accumulate[i::4,:,:]
        ssr_synoptic[i+1::4,:,:] = ssr_accumulate[i+1::4,:,:] - ssr_accumulate[i::4,:,:]
        str_synoptic[i+1::4,:,:] = str_accumulate[i+1::4,:,:] - str_accumulate[i::4,:,:]
        tsr_synoptic[i+1::4,:,:] = tsr_accumulate[i+1::4,:,:] - tsr_accumulate[i::4,:,:]
        ttr_synoptic[i+1::4,:,:] = ttr_accumulate[i+1::4,:,:] - ttr_accumulate[i::4,:,:]
    # create the arrays for daily mean
    lat = var_key.variables['latitude'][:]
    lon = var_key.variables['longitude'][:]
    time = var_key.variables['time'][:]
    sshf_daily = np.zeros((len(time)//8, len(lat), len(lon)),dtype=float)
    slhf_daily = np.zeros((len(time)//8, len(lat), len(lon)),dtype=float)
    ssr_daily = np.zeros((len(time)//8, len(lat), len(lon)),dtype=float)
    str_daily = np.zeros((len(time)//8, len(lat), len(lon)),dtype=float)
    tsr_daily = np.zeros((len(time)//8, len(lat), len(lon)),dtype=float)
    ttr_daily = np.zeros((len(time)//8, len(lat), len(lon)),dtype=float)
    # take the daily mean (8 steps per day) and convert from J/m2 per 3 h to W/m2
    for i in np.arange(len(time)//8):
        sshf_daily[i,:,:] = np.mean(sshf_synoptic[i*8:i*8+8,:,:], 0) / (3 * 3600)
        slhf_daily[i,:,:] = np.mean(slhf_synoptic[i*8:i*8+8,:,:], 0) / (3 * 3600)
        ssr_daily[i,:,:] = np.mean(ssr_synoptic[i*8:i*8+8,:,:], 0) / (3 * 3600)
        str_daily[i,:,:] = np.mean(str_synoptic[i*8:i*8+8,:,:], 0) / (3 * 3600)
        tsr_daily[i,:,:] = np.mean(tsr_synoptic[i*8:i*8+8,:,:], 0) / (3 * 3600)
        ttr_daily[i,:,:] = np.mean(ttr_synoptic[i*8:i*8+8,:,:], 0) / (3 * 3600)
    SFlux_daily = sshf_daily[:] + slhf_daily[:] + ssr_daily[:] + str_daily[:]
    TOAFlux_daily = tsr_daily[:] + ttr_daily[:]
    # take weekly mean
    SFlux_weekly = np.zeros((4,len(lat),len(lon)),dtype=float)
    TOAFlux_weekly = np.zeros((4,len(lat),len(lon)),dtype=float)
    for i in np.arange(4):
        if i < 3:
            SFlux_weekly[i,:,:] = np.mean(SFlux_daily[i*7:i*7+7,:,:],axis=0)
            TOAFlux_weekly[i,:,:] = np.mean(TOAFlux_daily[i*7:i*7+7,:,:],axis=0)
        else:
            SFlux_weekly[i,:,:] = np.mean(SFlux_daily[i*7:,:,:],axis=0)
            TOAFlux_weekly[i,:,:] = np.mean(TOAFlux_daily[i*7:,:,:],axis=0)

    return SFlux_weekly, TOAFlux_weekly

In [8]:
# save output datasets
# we only pack our timeseries from 1979 to 2017
def create_netcdf_point (SFlux_weekly, TOAFlux_weekly, period, week,
                         latitude, longitude, output_path):
    print ('*******************************************************************')
    print ('*********************** create netcdf file*************************')
    print ('*******************************************************************')
    print("Start creating netcdf file for weekly surface/TOA fluxes from 1979 to 2017.")
    # wrap the datasets into netcdf file
    # 'NETCDF3_CLASSIC', 'NETCDF3_64BIT', 'NETCDF4_CLASSIC', and 'NETCDF4'
    data_wrap = Dataset(output_path + os.sep + 'rad_flux_weekly_erai_1979_2017.nc','w',format = 'NETCDF4')
    # create dimensions for netcdf data
    year_wrap_dim = data_wrap.createDimension('year', len(period))
    week_wrap_dim = data_wrap.createDimension('week', len(week))
    lat_wrap_dim = data_wrap.createDimension('latitude', len(latitude))
    lon_wrap_dim = data_wrap.createDimension('longitude', len(longitude))
    # create coordinate variables for the 4 dimensions
    year_wrap_var = data_wrap.createVariable('year',np.int32,('year',))
    week_wrap_var = data_wrap.createVariable('week',np.int32,('week',))
    lat_wrap_var = data_wrap.createVariable('latitude',np.float64,('latitude',))
    lon_wrap_var = data_wrap.createVariable('longitude',np.float64,('longitude',))    
    # create the actual 4-d variable
    SFlux_wrap_var = data_wrap.createVariable('SFlux',np.float64,('year','week','latitude','longitude'))
    TOAFlux_wrap_var = data_wrap.createVariable('TOAFlux',np.float64,('year','week','latitude','longitude'))

    # global attributes
    data_wrap.description = 'Weekly Net Surface/TOA Flux'
    # variable attributes
    lat_wrap_var.units = 'degree_north'
    lon_wrap_var.units = 'degree_east'
    SFlux_wrap_var.units = 'W/m2'
    TOAFlux_wrap_var.units = 'W/m2'
    SFlux_wrap_var.long_name = 'Net Surface Flux'
    TOAFlux_wrap_var.long_name = 'Net TOA Flux'

    # writing data
    year_wrap_var[:] = period
    week_wrap_var[:] = week
    lat_wrap_var[:] = latitude
    lon_wrap_var[:] = longitude
    SFlux_wrap_var[:] = SFlux_weekly
    TOAFlux_wrap_var[:] = TOAFlux_weekly

    # close the file
    data_wrap.close()
    print ("Create netcdf file successfully")

In [None]:
if __name__=="__main__":
    ####################################################################
    ######  Create time namelist matrix for variable extraction  #######
    ####################################################################
    # date and time arrangement
    # namelist of month and days for file manipulation
    namelist_month = ['01','02','03','04','05','06','07','08','09','10','11','12']
    # index of years, months and weeks
    period = np.arange(start_year,end_year+1,1)
    index_month = np.arange(1,13,1)
    index_week = np.arange(1,49,1)
    ####################################################################
    ######       Extract invariant and calculate constants       #######
    ####################################################################
    # get invariant from benchmark file
    Dim_year = len(period)
    Dim_month = len(index_month)
    Dim_week = len(index_week)
    Dim_latitude = 67
    Dim_longitude = 480
    #############################################
    #####   Create space for storing data   #####
    #############################################
    # data pool
    pool_SFlux = np.zeros((Dim_year,Dim_week,Dim_latitude,Dim_longitude),dtype = float)
    pool_TOAFlux = np.zeros((Dim_year,Dim_week,Dim_latitude,Dim_longitude),dtype = float)
    # loop for calculation
    for i in period:
        for j in index_month:
            var_key = var_key_retrieve(datapath,i,j)
            # get the key of each variable
            latitude = var_key.variables['latitude'][:]
            longitude = var_key.variables['longitude'][:]
            SFlux_weekly, TOAFlux_weekly = retriver(var_key)
            pool_SFlux[i-start_year,j*4-4:j*4,:,:] = SFlux_weekly
            pool_TOAFlux[i-start_year,j*4-4:j*4,:,:] = TOAFlux_weekly
    ####################################################################
    ######                 Data Wrapping (NetCDF)                #######
    ####################################################################
    # round off the values in case of leaking
    #pool_sic = np.around(pool_sic,decimals=6)
    create_netcdf_point(pool_SFlux, pool_TOAFlux, period, index_week,
                        latitude, longitude, output_path)
    print ('Packing 2D fields of ERA-Interim on surface level is complete!!!')
    print ('The output is in sleep, safe and sound!!!')

Start retrieving datasets 1979 (y) 1 (m)
Retrieving datasets successfully and return the variable key!
Extract subdaily fields and calculate weekly fields.
Start retrieving datasets 1979 (y) 2 (m)
Retrieving datasets successfully and return the variable key!
Extract subdaily fields and calculate weekly fields.
Start retrieving datasets 1979 (y) 3 (m)
Retrieving datasets successfully and return the variable key!
Extract subdaily fields and calculate weekly fields.
Start retrieving datasets 1979 (y) 4 (m)
Retrieving datasets successfully and return the variable key!
Extract subdaily fields and calculate weekly fields.
Start retrieving datasets 1979 (y) 5 (m)
Retrieving datasets successfully and return the variable key!
Extract subdaily fields and calculate weekly fields.
Start retrieving datasets 1979 (y) 6 (m)
Retrieving datasets successfully and return the variable key!
Extract subdaily fields and calculate weekly fields.
Start retrieving datasets 1979 (y) 7 (m)
Retrieving datasets suc