### Process and save CLM daily averages
This script takes hourly CLM outputs as PFB files and computes the daily averages to be saved as PFB files.

Inputs:
- Directory containing the CLM outputs and directory in which to save the results
- Hourly PFB files of CLM outputs
- Water year and start/end day

Outputs:
- PFB files for daily average of each variable:  
    - Latent heat flux (LH) – CLM out layer 1 [W/m^2]
    - Sensible heat flux (SH) – CLM out layer 3 [W/m^2]
    - Ground evaporation without condensation (qflx_grnd) – CLM out layer 6 [mm/s]
    - Vegetation transpiration (qflx_trans) – CLM out layer 9 [mm/s]
    - Snow water equivalent (SWE) – CLM out layer 11 [mm]
    - Ground temperature (Tgrnd) – CLM out layer 12 [K], skin temperature
    - Soil temperature (Tsoil) – CLM out layer 14 [K], at 5 cm
    - Evapotranspiration (ET), computed from qflx_evap_tot [m/h]


Notes (10/23/22):  
- Determine which variables should be summed (accumulated) and which should be averaged
- Check the units of every output
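
The open question in the notes (sum versus average, and units) can be made concrete with a small NumPy sketch. It assumes a flux reported in mm/s at 24 hourly steps; the array `hourly_rate` and both helper functions are hypothetical and not part of this script.

```python
import numpy as np

SECONDS_PER_HOUR = 3600

def daily_mean_rate(hourly_rate):
    """Mean of the hourly rates; units are unchanged (e.g. mm/s or W/m^2)."""
    return hourly_rate.mean(axis=0)

def daily_total_depth(hourly_rate_mm_s):
    """Depth accumulated over the day in mm, from hourly rates in mm/s."""
    return (hourly_rate_mm_s * SECONDS_PER_HOUR).sum(axis=0)

# hypothetical input: 24 hourly fields on a 2 x 3 grid, constant 1e-5 mm/s
hourly_rate = np.full((24, 2, 3), 1e-5)

mean_rate = daily_mean_rate(hourly_rate)      # still in mm/s
total_depth = daily_total_depth(hourly_rate)  # in mm per day
```

State variables such as SWE or temperature are normally only meaningful as a mean, whereas a water flux can reasonably be reported either way, which is why the choice has to be made per variable.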


In [1]:
import numpy as np
from parflow import Run
import sys
from parflow.tools.io import read_pfb,write_pfb
import parflow.tools.hydrology as hydro

In [21]:
# 13 single-layer CLM variables + number of soil layers over which CLM is active (NZ root)
NCLMOUTPUTS = 13 + 4

# these three entries (water year, start day, end day) will eventually be read from argv
# so that the script can be run from a bash script
water_year = 2003
day_start = 0
day_end = 2

# water_year = int(sys.argv[1])
# day_start = int(sys.argv[2])
# day_end = int(sys.argv[3])

# path to CLM outputs and pfidb
path_outputs = '/glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/'#f'/WY{water_year}/'
runname = 'spinup.wy2003' #f'CONUS2_{water_year}'

# directory to save averages to
directory_out = '/glade/scratch/tijerina/CONUS2/spinup_WY2003/averages'
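
The comment above about passing the year and day range from a bash script could be handled with `argparse` from the standard library. This is only a sketch: the function name `parse_args` and the positional argument order are assumptions, not something the script already defines.

```python
import argparse

def parse_args(argv=None):
    """Parse water year and day range; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="Compute daily CLM averages")
    parser.add_argument("water_year", type=int, help="water year, e.g. 2003")
    parser.add_argument("day_start", type=int, help="first day (0-based)")
    parser.add_argument("day_end", type=int, help="last day (exclusive)")
    return parser.parse_args(argv)

# an explicit list is passed here so the sketch also runs inside a notebook
args = parse_args(["2003", "0", "2"])
water_year, day_start, day_end = args.water_year, args.day_start, args.day_end
```

Compared with reading `sys.argv` directly, this rejects non-integer input and prints a usage message when an argument is missing.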

In [22]:
run = Run.from_definition(f'{path_outputs}{runname}.pfidb')  # path_outputs already ends with '/'
data = run.data_accessor

# porosity = data.computed_porosity 
# specific_storage = data.specific_storage 
# mannings = data.mannings

# ## remove input filenames for TopoSlopes to force the data accessor to read the output slopes
# ## this fixes a windows issue
# run.TopoSlopesX.FileName = None
# run.TopoSlopesY.FileName = None

# slopex = data.slope_x 
# slopey = data.slope_y 
mask = data.mask

# dz_3d = data.dz

# format the mask: cells inside the domain (mask > 0) are set to 1;
# cells outside the domain keep their original value (expected to be 0, not NA)
# TODO: check against a mask that contains only 0 and 1
active_mask = mask.copy()
active_mask[active_mask > 0] = 1

Solver: Field BinaryOutDir is not part of the expected schema <class 'parflow.tools.database.generated.Solver'>
Solver.OverlandKinematic: Field SeepageOne is not part of the expected schema <class 'parflow.tools.database.generated.OverlandKinematic'>
Solver.OverlandKinematic: Field SeepageTwo is not part of the expected schema <class 'parflow.tools.database.generated.OverlandKinematic'>
 => Error during CLM import - CLM specific key have been skipped
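
The mask comment in the cell above mentions NA outside the domain, but the code only sets positive values to 1. If NaN outside the domain is what is wanted, one possible way is `np.where`. The helper name `make_active_mask` and the demo array are hypothetical, and the sketch assumes that inactive cells are marked with values less than or equal to 0.

```python
import numpy as np

def make_active_mask(mask):
    """Return 1.0 where mask > 0 (inside the domain) and NaN elsewhere."""
    return np.where(mask > 0, 1.0, np.nan)

# hypothetical 2 x 2 mask: zeros are outside, positive values are inside
mask_demo = np.array([[0, 1], [99999, 0]])
active_demo = make_active_mask(mask_demo)
```

Multiplying a field by such a mask propagates NaN to the inactive cells, so functions like `np.nanmean` then skip them.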


In [26]:
data.et.shape

(3256, 4442)

In [28]:
###READING ALL STATIC VARIABLES NEEDED
# Read in porosity data
#porosity = read_pfb(f'{path_outputs}{runname}.out.porosity.pfb')
#...
#etc.

#nz,ny,nx = porosity.shape

# set data accessor time to 1 for reading CLM files
data.time = 1

nz = 10
ny = 3256
nx = 4442

dx = 1000
dy = 1000
dz = 200
dz_3d = data.dz

# P, Q, R split used when saving files; larger values reportedly speed up later reads (unverified)
p = 48
q = 36
r = 1

#list of clm variables you want
variables_clm = ['eflx_lh_tot', 'eflx_sh_tot', 'qflx_evap_grnd','qflx_tran_veg'] #,'swe_out','t_grnd','t_soil'
# flag per variable: 1 = daily mean, 0 = daily sum
# TODO: confirm the flag for each variable; the list is longer than variables_clm
variables_clm_mean = [0,0,0,1,1,1]

# eflx_lh_tot    # CLM 1 # latent heat flux total [W/m2] using the silo variable LatentHeat;
# eflx_lwrad_out # CLM 2 # outgoing long-wave radiation [W/m2] using the silo variable LongWave;
# eflx_sh_tot    # CLM 3 # sensible heat flux total [W/m2] using the silo variable SensibleHeat;
# eflx_soil_grnd # CLM 4 # ground heat flux [W/m2] using the silo variable GroundHeat;
# qflx_evap_tot  # CLM 5 # total evaporation [mm/s] using the silo variable EvaporationTotal;
# qflx_evap_grnd # CLM 6 # ground evaporation without condensation [mm/s] using the silo variable EvaporationGroundNoSublimation;
# qflx_evap_soi  # CLM 7 # soil evaporation [mm/s] using the silo variable EvaporationGround;
# qflx_evap_veg  # CLM 8 # vegetation evaporation (canopy) and transpiration [mm/s] using the silo variable EvaporationCanopy;
# qflx_tran_veg  # CLM 9 # vegetation transpiration [mm/s] using the silo variable Transpiration;
# qflx_infl      # CLM 10 # soil infiltration [mm/s] using the silo variable Infiltration;
# swe_out        # CLM 11 # snow water equivalent [mm] using the silo variable SWE;
# t_grnd         # CLM 12 # ground surface temperature [K] using the silo variable TemperatureGround;
# qflx_qirr      # CLM 13 # irrigation flux
# t_soil         # CLM 14+ # soil temperature [K], one entry per soil layer (3D)

ALL_CLM = ['eflx_lh_tot','eflx_lwrad_out','eflx_sh_tot','eflx_soil_grnd','qflx_evap_tot','qflx_evap_grnd','qflx_evap_soi','qflx_evap_veg','qflx_tran_veg','qflx_infl','swe_out','t_grnd','qflx_qirr','t_soil']
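
The position of a variable in `ALL_CLM` gives its 0-based index along the first axis of the CLM output array. The sketch below shows that lookup on a small dummy array. The helper `clm_index` and the constant `N_SOIL_LAYERS` are hypothetical, and the sketch assumes that the soil temperature layers directly follow the 13 single-layer variables, as `NCLMOUTPUTS = 13 + 4` suggests.

```python
import numpy as np

ALL_CLM = ['eflx_lh_tot', 'eflx_lwrad_out', 'eflx_sh_tot', 'eflx_soil_grnd',
           'qflx_evap_tot', 'qflx_evap_grnd', 'qflx_evap_soi', 'qflx_evap_veg',
           'qflx_tran_veg', 'qflx_infl', 'swe_out', 't_grnd', 'qflx_qirr',
           't_soil']
N_SOIL_LAYERS = 4  # assumption: matches the "+ 4" in NCLMOUTPUTS

def clm_index(name):
    """Return an int index for 2D variables or a slice for the 3D t_soil."""
    idx = ALL_CLM.index(name)
    if name == 't_soil':
        return slice(idx, idx + N_SOIL_LAYERS)
    return idx

# dummy CLM output with 17 layers on a 2 x 3 grid
demo = np.zeros((13 + N_SOIL_LAYERS, 2, 3))
swe = demo[clm_index('swe_out')]    # 2D field
t_soil = demo[clm_index('t_soil')]  # 3D block, one layer per soil level
```

Note that the "CLM n" numbers in the comments are 1-based, so `qflx_evap_tot` (CLM 5) sits at index 4.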

In [29]:
for day in range(day_start, day_end):

    timestamp_day_out = f'{day + 1:03d}'

    # initialize the daily accumulators that are filled hour by hour
    et = np.zeros((ny, nx))
    clm_output = np.zeros((NCLMOUTPUTS, ny, nx))

    for h in range(day * 24 + 1, (day + 1) * 24 + 1):
        timestamp_reading = f'{h:05d}'

        # read this hour's CLM output and add it to the daily sum
        file_clm = f'{path_outputs}{runname}.out.clm_output.{timestamp_reading}.C.pfb'
        clm_hour = read_pfb(file_clm)
        clm_output += clm_hour
        print(f'reading {file_clm}')

        # evapotranspiration from this hour's qflx_evap_tot (index 4 in ALL_CLM);
        # use the hourly array, not the running sum, to avoid double counting
        qflx_evap_total = clm_hour[4, ...]
        # mm/s -> mm/h (*3600) -> m/h (/1000) -> m^3/h per cell (*dx*dy)
        # TODO: the header lists ET in [m/h]; confirm whether the dx*dy factor is wanted
        et += qflx_evap_total * 3600 / 1000 * dx * dy

    # daily mean of the hourly ET values
    et /= 24

    # save daily ET
    write_pfb(f'{directory_out}/ET.{water_year}.daily.{timestamp_day_out}.pfb',et,dx=dx,dy=dy,dz=dz,P=p,Q=q,R=r,dist=False)

    # daily sum or mean of each selected CLM variable
    for ind_clm, var_clm in enumerate(variables_clm):
        if var_clm == 't_soil':
            # t_soil is 3D: its layers follow the 13 single-layer variables
            clm_save = clm_output[ALL_CLM.index('t_soil'):, :, :]
        else:
            clm_save = clm_output[ALL_CLM.index(var_clm), :, :]
        if variables_clm_mean[ind_clm] == 1:
            # divide into a new array so clm_output itself is left unchanged
            clm_save = clm_save / 24

        # save daily CLM output
        write_pfb(f'{directory_out}/{var_clm}.daily.{timestamp_day_out}.pfb',clm_save,dx=dx,dy=dy,dz=dz,P=p,Q=q,R=r,dist=False)

    

reading /glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/output-pf/spinup.wy2003.out.clm_output.00001.C.pfb
reading /glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/output-pf/spinup.wy2003.out.clm_output.00002.C.pfb
reading /glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/output-pf/spinup.wy2003.out.clm_output.00003.C.pfb
reading /glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/output-pf/spinup.wy2003.out.clm_output.00004.C.pfb
reading /glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/output-pf/spinup.wy2003.out.clm_output.00005.C.pfb
reading /glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/output-pf/spinup.wy2003.out.clm_output.00006.C.pfb
reading /glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/output-pf/spinup.wy2003.out.clm_output.00007.C.pfb
reading /glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/output-pf/spinup.wy2003.out.clm_output.00008.C.pfb
reading /glade/scratch/tijerina/CONUS2/spinup_WY2003/run_inputs/output-pf/spinup
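
The mapping from a 0-based day to its 24 hourly files, and the accumulate-then-divide pattern used in the loop, can be checked without any PFB files. In this sketch `fake_read` is a hypothetical stand-in for `read_pfb`, and `hours_of_day` and `daily_mean` are illustrative helpers rather than functions of the script.

```python
import numpy as np

def hours_of_day(day):
    """1-based hourly file indices that belong to a 0-based day."""
    return range(day * 24 + 1, (day + 1) * 24 + 1)

def daily_mean(read_hour, day, shape):
    """Sum the 24 hourly arrays returned by read_hour, then divide by 24."""
    total = np.zeros(shape)
    for h in hours_of_day(day):
        total += read_hour(f'{h:05d}')  # zero-padded timestamp, e.g. '00001'
    return total / 24

def fake_read(timestamp):
    """Hypothetical reader: a 2 x 3 field filled with the hour number."""
    return np.full((2, 3), float(int(timestamp)))

mean_day0 = daily_mean(fake_read, 0, (2, 3))  # hours 1..24
mean_day1 = daily_mean(fake_read, 1, (2, 3))  # hours 25..48
```

Passing the reader in as an argument keeps the aggregation logic testable on small synthetic arrays before it is run on the full 3256 x 4442 grid.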

In [15]:
clm_output.shape

(17, 3256, 4442)

In [None]:
day1_sm = read_pfb(f'{directory_out}/spinup.wy2003.out.SM.001.pfb')

In [None]:
day1_sm.shape

In [None]:
day1_sm[9,2000:2005,2000:2005]