# Copyright Netherlands eScience Center <br>
**Function     : Packing netCDF for the radiation and turbulent flux fields (TOA & surface) from MERRA2** <br>
**Author       : Yang Liu** <br>
**First Built  : 2018.10.03** <br>
**Last Update  : 2018.10.03** <br>
Description     : This notebook packs the TOA and surface radiation fields and the surface turbulent flux fields from MERRA2 into a single netCDF file.<br>
Return Values   : netCDF4 <br>
Caveat          : The radiation fields are forecast fields rather than analysis fields. Unlike in ERA-Interim, these forecast fields have already been unpacked by NASA, so they are not accumulated values and can be used directly. Attention should still be paid to the sign.<br>
In contrast to ERA-Interim, the direction of the **positive sign** in MERRA2 differs between variables:<br>
* Net shortwave radiation at surface - downward <br>
* Net shortwave radiation at TOA - downward <br>
* Net longwave radiation at surface - downward <br>
* Upwelling longwave radiation at TOA - upward <br>
* Total latent energy flux - upward <br>
* Sensible heat flux - upward <br>
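
If a single convention is needed downstream, the signs listed above can be unified by flipping the three upward-positive fields. The following is a minimal sketch that assumes downward-positive as the target convention; the `SIGN_TO_DOWNWARD` table and the `to_downward_positive` helper are hypothetical names for illustration and are not part of this notebook.

```python
import numpy as np

# +1: already positive downward in MERRA2, -1: positive upward (per the list above)
SIGN_TO_DOWNWARD = {
    'SWGNT': 1,   # net shortwave radiation at surface
    'SWTNT': 1,   # net shortwave radiation at TOA
    'LWGNT': 1,   # net longwave radiation at surface
    'LWTUP': -1,  # upwelling longwave radiation at TOA
    'EFLUX': -1,  # total latent energy flux
    'HFLUX': -1,  # sensible heat flux
}

def to_downward_positive(name, field):
    """Return a copy of `field` with the sign flipped where MERRA2 is upward-positive."""
    return SIGN_TO_DOWNWARD[name] * np.asarray(field, dtype=float)

# toy values, not real MERRA2 data
lwtup = np.array([[240.0, 235.5]])
print(to_downward_positive('LWTUP', lwtup))
```

The packed file itself keeps the native MERRA2 signs, so a conversion like this would only be applied when the fields are read back for analysis.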

In [1]:
import numpy as np
from netCDF4 import Dataset
import os

Initialization - Start with location of input and extraction of variables
Time span of each product:
- **ERA-Interim** 1979 - 2016
- **MERRA2**      1980 - 2016
- **JRA55**       1979 - 2015
- **ORAS4**       1958 - 2017
- **GLORYS2V3**   1993 - 2014
- **SODA3**       1980 - 2015

In [2]:
################################   Input zone  ######################################
# specify starting and ending time
start_year = 1980
end_year = 2017
# specify data path
# MERRA2 2D fields - radiations
datapath_Rad = '/home/yang/workbench/Core_Database_AMET_OMET_reanalysis/MERRA2/regression/MERRA2_tavgM_2d_rad_Nx'
# MERRA2 2D fields - turbulent flux
datapath_SFlux = '/home/yang/workbench/Core_Database_AMET_OMET_reanalysis/MERRA2/regression/MERRA2_tavgM_2d_sflx_Nx'
# specify output path for the netCDF file
output_path = '/home/yang/workbench/Core_Database_AMET_OMET_reanalysis/MERRA2/regression'
####################################################################################

In [3]:
def var_key_retrieve(datapath_Rad, datapath_SFlux, year, month):
    # build the path to each dataset (relies on the global namelist_month)
    print ("Start retrieving datasets {} (y) - {} (m)".format(year,namelist_month[month-1]))
    # The shape of each variable is (361,576)
    # radiation fields (the file prefix depends on the year)
    if year < 1992:
        datapath_Rad_full = os.path.join(datapath_Rad,
                                         'MERRA2_100.tavgM_2d_rad_Nx.{}{}.SUB.nc4'.format(year,namelist_month[month-1]))
    elif year < 2001:
        datapath_Rad_full = os.path.join(datapath_Rad,
                                         'MERRA2_200.tavgM_2d_rad_Nx.{}{}.SUB.nc4'.format(year,namelist_month[month-1]))
    elif year < 2011:
        datapath_Rad_full = os.path.join(datapath_Rad,
                                         'MERRA2_300.tavgM_2d_rad_Nx.{}{}.SUB.nc4'.format(year,namelist_month[month-1]))
    else:
        datapath_Rad_full = os.path.join(datapath_Rad,
                                         'MERRA2_400.tavgM_2d_rad_Nx.{}{}.SUB.nc4'.format(year,namelist_month[month-1]))
    # get the variable keys
    var_key_Rad = Dataset(datapath_Rad_full)

    # surface turbulent flux fields (the file prefix depends on the year)
    if year < 1992:
        datapath_SFlux_full = os.path.join(datapath_SFlux,
                                           'MERRA2_100.tavgM_2d_flx_Nx.{}{}.SUB.nc4'.format(year,namelist_month[month-1]))
    elif year < 2001:
        datapath_SFlux_full = os.path.join(datapath_SFlux,
                                           'MERRA2_200.tavgM_2d_flx_Nx.{}{}.SUB.nc4'.format(year,namelist_month[month-1]))
    elif year < 2011:
        datapath_SFlux_full = os.path.join(datapath_SFlux,
                                           'MERRA2_300.tavgM_2d_flx_Nx.{}{}.SUB.nc4'.format(year,namelist_month[month-1]))
    else:
        datapath_SFlux_full = os.path.join(datapath_SFlux,
                                           'MERRA2_400.tavgM_2d_flx_Nx.{}{}.SUB.nc4'.format(year,namelist_month[month-1]))
    # get the variable keys
    var_key_SFlux = Dataset(datapath_SFlux_full)

    print ("Retrieving datasets successfully and return the variable key!")
    return var_key_Rad, var_key_SFlux
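
The two `if`/`elif` chains above differ only in the collection name, so the year-to-prefix mapping could be factored out. A minimal sketch, assuming the same year boundaries and file name pattern as the function above; `merra2_stream` and `merra2_filename` are hypothetical helpers, not part of the notebook.

```python
def merra2_stream(year):
    """Return the numeric file prefix used above for a given year."""
    if year < 1992:
        return 100
    elif year < 2001:
        return 200
    elif year < 2011:
        return 300
    return 400

def merra2_filename(collection, year, month):
    """Build a monthly file name, e.g. for 'tavgM_2d_rad_Nx' or 'tavgM_2d_flx_Nx'."""
    return 'MERRA2_{}.{}.{}{:02d}.SUB.nc4'.format(
        merra2_stream(year), collection, year, month)

print(merra2_filename('tavgM_2d_rad_Nx', 1980, 1))
print(merra2_filename('tavgM_2d_flx_Nx', 2011, 12))
```

Formatting the month with `{:02d}` also removes the dependence on the global `namelist_month` list.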

In [11]:
def create_netcdf_point(pool_EFLUX,pool_HFLUX,pool_LWGNT,pool_LWTUP,
                          pool_SWGNT, pool_SWTNT, output_path):
    print ('*******************************************************************')
    print ('*********************** create netcdf file*************************')
    print ('*******************************************************************')
    # wrap the datasets into netcdf file
    # 'NETCDF3_CLASSIC', 'NETCDF3_64BIT', 'NETCDF4_CLASSIC', and 'NETCDF4'
    data_wrap = Dataset(os.path.join(output_path, 'surface_merra_monthly_regress_1980_2017_SFlux_Rad.nc'),'w',format = 'NETCDF4')
    # create dimensions for netcdf data
    year_wrap_dim = data_wrap.createDimension('year',Dim_year)
    month_wrap_dim = data_wrap.createDimension('month',Dim_month)
    lat_wrap_dim = data_wrap.createDimension('latitude',Dim_latitude)
    lon_wrap_dim = data_wrap.createDimension('longitude',Dim_longitude)
    # create coordinate variables for the 4 dimensions
    year_wrap_var = data_wrap.createVariable('year',np.int32,('year',))
    month_wrap_var = data_wrap.createVariable('month',np.int32,('month',))
    lat_wrap_var = data_wrap.createVariable('latitude',np.float32,('latitude',))
    lon_wrap_var = data_wrap.createVariable('longitude',np.float32,('longitude',))
    # create the actual 4-d variables (year, month, latitude, longitude)
    EFLUX_wrap_var = data_wrap.createVariable('EFLUX',np.float64,('year','month','latitude','longitude'),zlib=True)
    HFLUX_wrap_var = data_wrap.createVariable('HFLUX',np.float64,('year','month','latitude','longitude'),zlib=True)
    LWGNT_wrap_var = data_wrap.createVariable('LWGNT',np.float64,('year','month','latitude','longitude'),zlib=True)
    LWTUP_wrap_var = data_wrap.createVariable('LWTUP',np.float64,('year','month','latitude','longitude'),zlib=True)
    SWGNT_wrap_var = data_wrap.createVariable('SWGNT',np.float64,('year','month','latitude','longitude'),zlib=True)
    SWTNT_wrap_var = data_wrap.createVariable('SWTNT',np.float64,('year','month','latitude','longitude'),zlib=True)
    # global attributes
    data_wrap.description = 'Monthly mean 2D fields of radiation and turbulent flux from MERRA2 at each grid point'
    # variable attributes
    lat_wrap_var.units = 'degree_north'
    lon_wrap_var.units = 'degree_east'

    EFLUX_wrap_var.units = 'W/m2'
    HFLUX_wrap_var.units = 'W/m2'
    LWGNT_wrap_var.units = 'W/m2'
    LWTUP_wrap_var.units = 'W/m2'
    SWGNT_wrap_var.units = 'W/m2'
    SWTNT_wrap_var.units = 'W/m2'

    EFLUX_wrap_var.long_name = 'latent energy flux'
    HFLUX_wrap_var.long_name = 'sensible heat flux'
    LWGNT_wrap_var.long_name = 'surface net downward longwave flux'
    LWTUP_wrap_var.long_name = 'upwelling longwave flux at toa'
    SWGNT_wrap_var.long_name = 'surface net downward shortwave flux'
    SWTNT_wrap_var.long_name = 'toa net downward shortwave flux'

    # writing data
    lat_wrap_var[:] = latitude
    lon_wrap_var[:] = longitude
    month_wrap_var[:] = index_month
    year_wrap_var[:] = period

    EFLUX_wrap_var[:] = pool_EFLUX
    HFLUX_wrap_var[:] = pool_HFLUX
    LWGNT_wrap_var[:] = pool_LWGNT
    LWTUP_wrap_var[:] = pool_LWTUP
    SWGNT_wrap_var[:] = pool_SWGNT
    SWTNT_wrap_var[:] = pool_SWTNT

    # close the file
    data_wrap.close()
    print ("Create netcdf file successfully")
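
The six arrays passed to `create_netcdf_point` all share the layout (year, month, latitude, longitude) and are held in memory at once before being written. A rough sketch of the resulting footprint, assuming the 1980 - 2017 period and the 361 x 576 grid set in this notebook; `pool_nbytes` is a hypothetical helper used only for this estimate.

```python
import numpy as np

def pool_nbytes(n_year, n_month, n_lat, n_lon, dtype=np.float64):
    """Bytes needed by one (year, month, latitude, longitude) array."""
    return n_year * n_month * n_lat * n_lon * np.dtype(dtype).itemsize

one_pool = pool_nbytes(38, 12, 361, 576)   # 38 years, 12 months
print(one_pool)                            # 758550528 bytes per field
print(6 * one_pool / 1e9)                  # roughly 4.55 (GB) for the six fields
```

If memory is tight, one option would be to write each field to the file as it is filled instead of pooling all six first.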

In [12]:
if __name__=="__main__":
    ####################################################################
    ######  Create time namelist matrix for variable extraction  #######
    ####################################################################
    # date and time arrangement
    # namelist of months for file manipulation
    namelist_month = ['01','02','03','04','05','06','07','08','09','10','11','12']
    # years of the period and index of months
    period = np.arange(start_year,end_year+1,1)
    index_month = np.arange(1,13,1)
    ####################################################################
    ######       Extract invariant and calculate constants       #######
    ####################################################################
    # dimensions of the output (the 361 x 576 MERRA2 grid is hard-coded here)
    Dim_year = len(period)
    Dim_month = len(index_month)
    Dim_latitude = 361
    Dim_longitude = 576
    #############################################
    #####   Create space for storing data   #####
    #############################################
    # data pools for the 4-d fields (year, month, latitude, longitude)
    pool_EFLUX = np.zeros((Dim_year,Dim_month,Dim_latitude,Dim_longitude),dtype = float)
    pool_HFLUX = np.zeros((Dim_year,Dim_month,Dim_latitude,Dim_longitude),dtype = float)
    pool_LWGNT = np.zeros((Dim_year,Dim_month,Dim_latitude,Dim_longitude),dtype = float)
    pool_LWTUP = np.zeros((Dim_year,Dim_month,Dim_latitude,Dim_longitude),dtype = float)
    pool_SWGNT = np.zeros((Dim_year,Dim_month,Dim_latitude,Dim_longitude),dtype = float)
    pool_SWTNT = np.zeros((Dim_year,Dim_month,Dim_latitude,Dim_longitude),dtype = float)
    latitude = np.zeros(Dim_latitude,dtype=float)
    longitude = np.zeros(Dim_longitude,dtype=float)
    # loop for calculation
    for i in period:
        for j in index_month:
            # get the key of each variable
            var_key_Rad, var_key_SFlux = var_key_retrieve(datapath_Rad,datapath_SFlux,i,j)
            # index the year relative to start_year instead of a hard-coded 1980
            pool_EFLUX[i-start_year,j-1,:,:] = var_key_SFlux.variables['EFLUX'][0,:,:]
            pool_HFLUX[i-start_year,j-1,:,:] = var_key_SFlux.variables['HFLUX'][0,:,:]
            latitude = var_key_SFlux.variables['lat'][:]
            longitude = var_key_SFlux.variables['lon'][:]
            pool_LWGNT[i-start_year,j-1,:,:] = var_key_Rad.variables['LWGNT'][0,:,:]
            pool_LWTUP[i-start_year,j-1,:,:] = var_key_Rad.variables['LWTUP'][0,:,:]
            pool_SWGNT[i-start_year,j-1,:,:] = var_key_Rad.variables['SWGNT'][0,:,:]
            pool_SWTNT[i-start_year,j-1,:,:] = var_key_Rad.variables['SWTNT'][0,:,:]
            # close the monthly files so that open file handles do not accumulate
            var_key_Rad.close()
            var_key_SFlux.close()
    ####################################################################
    ######                 Data Wrapping (NetCDF)                #######
    ####################################################################
    create_netcdf_point(pool_EFLUX,pool_HFLUX,pool_LWGNT,pool_LWTUP,
                        pool_SWGNT, pool_SWTNT, output_path)
    print ('Packing 2D fields of MERRA2 is complete!!!')
    print ('The output is in sleep, safe and sound!!!')

Start retrieving datasets 1980 (y) - 01 (m)
Retrieving datasets successfully and return the variable key!
Start retrieving datasets 1980 (y) - 02 (m)
Retrieving datasets successfully and return the variable key!
Start retrieving datasets 1980 (y) - 03 (m)
Retrieving datasets successfully and return the variable key!
Start retrieving datasets 1980 (y) - 04 (m)
Retrieving datasets successfully and return the variable key!
Start retrieving datasets 1980 (y) - 05 (m)
Retrieving datasets successfully and return the variable key!
Start retrieving datasets 1980 (y) - 06 (m)
Retrieving datasets successfully and return the variable key!
Start retrieving datasets 1980 (y) - 07 (m)
Retrieving datasets successfully and return the variable key!
Start retrieving datasets 1980 (y) - 08 (m)
Retrieving datasets successfully and return the variable key!
Start retrieving datasets 1980 (y) - 09 (m)
Retrieving datasets successfully and return the variable key!
Start retrieving datasets 1980 (y) - 10 (m)
Re