# Lesson 1 - Calibration Overview

### Calibration Overview:



![WorkFlow_Image_Placeholder](imagename.png "Image Placeholder for now")


The calibration workflow is run through a series of seven individual Python scripts, typically from the command line (terminal). The scripts must be run in order, each one only after the previous script has completed. Brief descriptions of each script are provided below:

### Programs & Workflow:

*initDB.py:* 

This program is used one time to initialize the calibration database and associated tables used during the experiment. The upcoming section will describe the database and associated tables in more detail.

*inputDomainMeta.py:* 

This program reads a CSV file, which you will need to fill out, that describes the modeling domains to be used for calibration. This information is entered into the database for later use by the workflow. The CSV file is described in more detail in the setup section.

*jobInit.py:* 

This program is run to establish a calibration ‘experiment’. The program reads a config file (explained later in the setup section), sets up the necessary run directories and paths to necessary files, and enters the associated metadata into the database. Upon successful completion, the program returns a unique job ID value, which you will use in subsequent programs to run the calibration.

*getJobID.py:*

This program returns the job ID in cases where you have forgotten the unique job ID for your calibration experiment. The calibration config file is used as input to this program.

*spinup.py:*

This is the first of the main programs in the calibration workflow. The only mandatory argument to this program is the unique job ID for the calibration experiment. The main purpose of this program is to run the NWM/WRF-Hydro spinup for all domains being calibrated. This program needs to complete successfully before moving on to the next step.

*calib.py:* 

This is the second program that is run in the calibration workflow. As with spinup.py, the only mandatory argument is the job ID value. This program runs the main workflow loop: it adjusts parameter values, executes interim model simulations, evaluates model output against observations, and adjusts parameter values further. This program must complete successfully before moving on to the next step.

*validation.py:*

This is the third and final main program in the calibration workflow. The only mandatory argument is the unique job ID value associated with the calibration experiment. This program manages running the model with the final calibrated parameters over a specified evaluation period for the evaluation of the parameters.
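
The "run in order, and only after the previous script has completed" rule for the three main programs can be sketched in Python. This is a minimal illustration rather than part of the workflow: the helper names `main_stage_commands` and `run_in_order` are hypothetical, and passing the job ID as a single positional argument is an assumption, so consult each script's own usage for the real arguments.

```python
import subprocess
import sys

# Main-stage script names and their order are taken from the descriptions above.
MAIN_SCRIPTS = ["spinup.py", "calib.py", "validation.py"]


def main_stage_commands(job_id):
    """Build one command per main-stage script.

    ASSUMPTION: the job ID is passed as a single positional argument; the
    real scripts may expect a different form.
    """
    return [[sys.executable, script, str(job_id)] for script in MAIN_SCRIPTS]


def run_in_order(commands, runner=subprocess.run):
    """Run commands one at a time and stop at the first non-zero exit code.

    Returns the list of commands that completed successfully.
    """
    completed = []
    for cmd in commands:
        result = runner(cmd)
        if result.returncode != 0:
            break  # a later script must not start after a failure
        completed.append(cmd)
    return completed
```

Under these assumptions, calling `run_in_order(main_stage_commands(job_id))` from the directory that holds the scripts would run spinup, calibration, and validation in sequence and stop at the first failure.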

### Step 1: Configure Setup File: "setup.parm"

User: Navigate to the PyWrfHydroCalib 'setup_files' directory:

In [3]:
%%bash
cd /glade/work/arezoo/RFC_Training/RFC_CalibTrain2021/PyWrfHydroCalib/setup_files/
ls

calib_params.tbl
domainMetaTemplate.csv
gage_list_template.csv
sens_params.tbl
setup.parm


In [None]:
%%bash
# Optional: uncomment the `cat` line below to view the contents of the setup.parm file.
cd /glade/work/arezoo/RFC_Training/RFC_CalibTrain2021/PyWrfHydroCalib/setup_files/
# cat setup.parm

Note: to view the 'setup.parm' file, you can also open a terminal session in Jupyter Notebook and copy/paste the following command:
*vi setup.parm*


### Editing the setup.parm file:

The primary file you will be editing in preparation for setting up a calibration workflow job is the ‘setup.parm’ file.

It is best to think of this file as a master configuration file to guide the workflow. A template file is located under /setup_files/setup.parm in the PyNWMCalib code repository.

This file contains multiple options that define how the workflow will submit jobs for models/analysis, which basins to calibrate from the database, methods for reporting errors to the user, model physics options, and paths to general parameter files and executables.


### logistics section
**outDir** - Req. Where do you want your calibration experiment to be constructed? <br>
**expName** - Req. What is the name of your experiment? This can be anything you want. <br>
**acctKey** - Opt. If you are running on a queue system that requires credentials to submit a job, specify your account key here. <br>
**optQueName** - Opt. If you need to direct model simulations/jobs to a specific queue, you can specify that here. <br>
**nCoresModel** - Req. How many CPUs are you running your model simulations over? <br>
**nNodesModel** - Req. If you are running across multiple nodes on a queue system, specify the number of nodes here. <br>
**nCoresPerNode** - Req. If you are running on a queue system, specify how many CPU cores you have available per node. <br>
**runSens** - Req. Are we running sensitivity analysis? 0 - No, 1 - Yes <br>
**sensParmTbl** - Req. Path to the table of parameter values to use in sensitivity analysis. Same format as the calibration parameter table. <br>
**runCalib** - Req. Are we running calibration? 0 - No, 1 - Yes <br>
**calibParmTbl** - Req. Path to the table of parameter values to use in calibration. <br>
**dailyStats** - Req. Flag to direct the workflow to calculate error statistics on a daily scale instead of the default hourly scale. Specify 1 to activate. <br>
**coldStart** - Req. Flag to direct calibration workflow to cold start your model simulation for each iteration during calibration. Specify 1 to activate. <br>
**optSpinFlag** - Req. Flag to direct the workflow to use an alternative spinup file already in place in the input directory for the basins being calibrated. This allows the user to bypass the spinup step. Specify 1 to activate. <br>
**stripCalibOutputs** - Req. If you want to omit outputs during an initial window of each model iteration, specify 1 here to activate this feature. This was designed to reduce the model I/O burden. <br>
**stripCalibHours** - Req. Specify an initial window in hours to strip outputs. This is only used if stripCalibOutputs has been activated. <br>
**jobRunType** - Req. How are the model simulations and calibration code being executed? For this exercise we will be using 4, which is MPI with no job scheduler. <br>
**mpiCmd** - Req. What is the MPI command being used to execute the model simulations? This is required, as the MPI command is also used in job scheduler scripts. <br>
**cpuPinCmd** - Opt. If you are running on a job scheduler, how do you want to pin specific model simulations on a compute node? This was put in to allow multiple basins to run on one node. <br>
**numIter** - Req. How many model iterations would you like to run your calibration experiment over? <br>
**calibMethod** - Req. Right now only DDS is allowed. Future upgrades will incorporate additional calibration methods. <br>
**objectiveFunction** - Req. What error metric do you wish to minimize during the calibration experiment? <br>
**ddsR** - Req. This is a DDS-specific parameter that scales the size of the random perturbation applied to parameter values at each iteration. <br>
**email** - Req. Where do you want status and error messages to be directed to? <br>
**wrfExe** - Req. Path to the WRF-Hydro executable to be used in the workflow. <br>
**genParmTbl** - Req. Path to the GENPARM.TBL file used by the model. <br>
**mpParmTbl** - Req. Path to the MPARM.TBL file used by the model. <br>
**urbParmTbl** - Req. Path to the URBPARM.TBL file used by the model. <br>
**vegParmTbl** - Req. Path to the VEGPARM.TBL file used by the model. <br>
**soilParmTbl** - Req. Path to the SOILPARM.TBL file used by the model. <br>
**bSpinDate** - Opt. Beginning date for the spinup. <br>
**eSpinDate** - Opt. Ending date for the spinup. <br>
**bCalibDate** - Req. Beginning date for each calibration iteration. <br>
**eCalibDate** - Req. Ending date for each calibration iteration. <br>
**bCalibEvalDate** - Req. The date within each calibration iteration to begin analysis. <br>
**bValidDate** - Opt. Beginning date for the validation simulation. <br>
**eValidDate** - Opt. Ending date for the validation simulation. <br>
**bValidEvalDate** - Opt. The date within the validation simulation to begin analysis. <br>
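
Each simulation above is described by three dates: a beginning date, an ending date, and the date at which analysis begins (for example `bCalibDate`, `eCalibDate`, and `bCalibEvalDate`). A quick consistency check before launching a job might look like the sketch below. The helper name `check_window`, the `YYYY-MM-DD` date format, and the example dates in the usage note are all assumptions made for illustration; use whatever format your setup.parm template shows.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"  # ASSUMPTION: the real file may use another format


def check_window(begin, eval_start, end, fmt=DATE_FORMAT):
    """Return a list of problems with a begin / eval-start / end date triple."""
    b, s, e = (datetime.strptime(d, fmt) for d in (begin, eval_start, end))
    problems = []
    if not b < e:
        problems.append("begin date must be before end date")
    if not b <= s < e:
        problems.append("evaluation start must fall inside the simulation window")
    return problems
```

For instance, `check_window("2010-10-01", "2011-10-01", "2015-10-01")` returns an empty list, while moving the evaluation start before the beginning date reports a problem. The same check applies to the validation and sensitivity date triples.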

### Sensitivity
**sensParmSample** - Opt. Sensitivity parameter. <br>
**sensBatchNum** - Opt. How many sensitivity simulations to run at once. <br>
**bSensDate** - Opt. Beginning date for the sensitivity model simulation. <br>
**eSensDate** - Opt. Ending date for the sensitivity model simulation. <br>
**bSensEvalDate** - Opt. The date within the sensitivity simulation to begin analysis. <br>

### gageInfo
**gageListSQL** - Req. SQL command to extract basins for calibration out of the database file. <br>
**gageListFile** - Opt. Alternative list of basins to calibrate instead of using an SQL command. <br>
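
To make the idea of selecting basins with an SQL command concrete, here is a small, self-contained sketch. It assumes an SQLite database and uses a throwaway in-memory table: the table name `basins`, its columns, and the gage IDs are hypothetical and do not reflect the real calibration database schema, which is described in the database lesson.

```python
import sqlite3


def select_gages(conn, gage_list_sql):
    """Run a gageListSQL-style SELECT and return the first column of each row."""
    return [row[0] for row in conn.execute(gage_list_sql)]


# Build a throwaway in-memory database with a HYPOTHETICAL schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE basins (gage_id TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO basins VALUES (?, ?)",
    [("A001", "west"), ("A002", "east"), ("A003", "west")],
)

# Hypothetical query of the kind gageListSQL might hold.
gage_list_sql = "SELECT gage_id FROM basins WHERE region = 'west' ORDER BY gage_id"
selected = select_gages(conn, gage_list_sql)
```

Trying a query against a small test table like this is a cheap way to confirm that it returns only the basins you intend to calibrate.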

### lsmPhysics
**dynVegOption** - Req. DYNAMIC_VEG_OPTION for NoahMP. <br>
**canStomResOption** - Req. CANOPY_STOMATAL_RESISTANCE_OPTION for NoahMP. <br>
**btrOption** - Req. BTR_OPTION for NoahMP. <br>
**runoffOption** - Req. RUNOFF_OPTION for NoahMP. <br>
**sfcDragOption** - Req. SURFACE_DRAG_OPTION for NoahMP. <br>
**frzSoilOption** - Req. FROZEN_SOIL_OPTION for NoahMP. <br>
**supCoolOption** - Req. SUPERCOOLED_WATER_OPTION for NoahMP. <br>
**radTransferOption** - Req. RADIATIVE_TRANSFER_OPTION for NoahMP. <br>
**snAlbOption** - Req. SNOW_ALBEDO_OPTION for NoahMP. <br>
**pcpPartOption** - Req. PCP_PARTITION_OPTION for NoahMP. <br>
**tbotOption** - Req. TBOT_OPTION for NoahMP. <br>
**tempTimeScOption** - Req. TEMP_TIME_SCHEME_OPTION for NoahMP. <br>
**sfcResOption** - Req. SURFACE_RESISTANCE_OPTION for NoahMP. <br>
**glacierOption** - Req. GLACIER_OPTION for NoahMP. <br>
**soilThick** - Req. Soil thicknesses for specified soil layers in NoahMP. <br>
**zLvl** - Req. Level of wind speeds in NoahMP. <br>

### forcing
**forceType** - Req. Specified forcing type. <br>

### modelTime
**forceDt** - Req. Input forcing timestep in seconds. <br>
**lsmDt** - Req. NoahMP timestep in seconds. <br>
**lsmOutDt** - Req. NoahMP output timestep in seconds. <br>
**lsmRstFreq** - Req. NoahMP restart frequency in seconds. <br>
**hydroRstFreq** - Req. WRF-Hydro restart frequency in seconds. <br>
**hydroOutDt** - Req. WRF-Hydro output frequency in seconds. <br>
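
Because all of these values are in seconds, a simple sanity check is to confirm that each output or restart interval is a whole multiple of the corresponding model timestep. The sketch below shows one way to do that; the helper name `check_multiples` and the numbers in the usage note are hypothetical, and the check is a common-sense guard rather than a rule stated by the workflow.

```python
def check_multiples(base_dt, **intervals):
    """Return the names of intervals (in seconds) that are not whole multiples of base_dt."""
    return [name for name, value in intervals.items() if value % base_dt != 0]
```

For example, `check_multiples(3600, lsmOutDt=86400, lsmRstFreq=5000)` would flag only `lsmRstFreq`, since 5000 seconds is not a whole number of 3600-second timesteps.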

### hydroIO
**rstType** - Req. Flag for overwriting accumulation variables in the restart file. <br>
**ioConfigOutputs** - Req. Output flag for variable grouping in WRF-Hydro. <br>
**ioFormOutputs** - Req. Flag for specifying the output format. <br>
**chrtoutDomain** - Req. Flag to turn on CHRTOUT_DOMAIN files. <br>
**chanObsDomain** - Req. Flag to turn on CHANOBS_DOMAIN files. <br>
**chrtoutGrid** - Req. Flag to turn on CHRTOUT_GRID files. <br>
**lsmDomain** - Req. Flag to turn on LSMOUT_DOMAIN files. <br>
**rtoutDOmain** - Req. Flag to turn on RTOUT_DOMAIN files. <br>
**gwOut** - Req. Flag to turn on GWOUT_DOMAIN files. <br>
**lakeOut** - Req. Flag to turn on LAKEOUT_DOMAIN files. <br>
**frxstOut** - Req. Flag to turn on FRXST output text files. <br>
**resetHydroAcc** - Req. Flag to reset accumulation variables in the restart files. <br>
**streamOrderOut** - Req. Flag to specify the minimum Strahler order to output. <br>

### hydroPhysics
**dtChSec** - Req. Channel routing timestep in seconds. <br>
**dtTerSec** - Req. Surface and subsurface routing timestep in seconds. <br>
**subRouting** - Req. Flag to turn on/off subsurface routing. <br>
**ovrRouting** - Req. Flag to turn on/off overland flow routing. <br>
**channelRouting** - Req. Flag to turn on/off channel routing. <br>
**rtOpt** - Req. Overland/subsurface routing option. <br>
**chanRtOpt** - Req. Channel routing option. <br>
**udmpOpt** - Req. Flag to turn on/off user-defined spatial mapping. <br>
**gwBaseSw** - Req. Groundwater option. <br>
**gwRestart** - Req. Flag to use restart states in groundwater scheme. <br>
**enableCompoundChannel** - Req. Flag to activate compound channel in the hydro.namelist. 1 - on, 0 - off. <br>
**compoundChannel** - Req. Activation flag for compound channel. enableCompoundChannel must be on. <br>
**enableGwBucketLoss** - Req. Flag to activate groundwater bucket loss in hydro.namelist. 1 - on, 0 - off. <br>
**bucket_loss** - Req. Activation flag for groundwater bucket loss. enableGwBucketLoss must be on. <br>
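
With this many options, it can help to check a filled-in setup.parm before starting a job. The sketch below assumes the file is INI-style, with section headers matching the headings above (for example `[logistics]` and `[hydroPhysics]`), and reads it with Python's standard `configparser`. The helper names, the short list of required keys, and the sample values are illustrative only. It checks two things described above: that a few "Req." options are present and non-empty, and that `compoundChannel` and `bucket_loss` are only switched on when their enabling flags are on.

```python
import configparser

# A few of the options marked "Req." above, grouped by section.
REQUIRED = {
    "logistics": ["outDir", "expName", "numIter", "calibMethod"],
    "gageInfo": ["gageListSQL"],
}
# (switch, flag) pairs: the flag may only be 1 when its switch is 1.
DEPENDENT = [
    ("enableCompoundChannel", "compoundChannel"),
    ("enableGwBucketLoss", "bucket_loss"),
]

# Placeholder content, NOT a real configuration.
SAMPLE = """\
[logistics]
outDir = /path/to/output
expName = demo
numIter =
calibMethod = DDS

[hydroPhysics]
enableCompoundChannel = 0
compoundChannel = 1
enableGwBucketLoss = 1
bucket_loss = 1
"""


def load_setup(text):
    """Parse INI-style text, keeping option names case-sensitive."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve names such as outDir
    parser.read_string(text)
    return parser


def find_problems(parser):
    """Return human-readable problems found in the parsed configuration."""
    problems = []
    for section, keys in REQUIRED.items():
        for key in keys:
            if not parser.get(section, key, fallback="").strip():
                problems.append(f"missing {section}.{key}")
    for switch, flag in DEPENDENT:
        flag_on = parser.get("hydroPhysics", flag, fallback="0").strip() == "1"
        switch_on = parser.get("hydroPhysics", switch, fallback="0").strip() == "1"
        if flag_on and not switch_on:
            problems.append(f"{flag} = 1 requires {switch} = 1")
    return problems


problems = find_problems(load_setup(SAMPLE))
```

For a real file, the same functions could be applied to the text read from your own setup.parm, with `REQUIRED` extended to cover every option you rely on.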

### Step 2: Configure Calibration Parameter Selection "calib_params.tbl"

In addition to the ‘setup.parm’ file, the ‘calib_params.tbl’ file tells the workflow which model parameters will be calibrated, along with the range of values allowed for each. A template table is located under /setup_files/calib_params.tbl, which you can copy and edit for your own calibration experiment.


In [16]:
%%bash
cd /glade/work/arezoo/RFC_Training/RFC_CalibTrain2021/PyWrfHydroCalib/setup_files
cat calib_params.tbl

parameter,calib_flag,minValue,maxValue,ini
bexp,          1,      0.40,      1.90,      1.0
smcmax,        1,      0.80,      1.20,      1.0
dksat,         1,      0.20,      10.0,      1.0
refkdt,        1,      0.10,      4.00,      0.6
slope,         1,      0.00,      1.00,      0.1
retdeprtfac,   1,      0.10,      10.0,      1.0
lksatfac,      1,      10.0,      10000.0,   1000.0
zmax,          1,      10.0,      250.0,     25.0
expon,         1,      1.00,      8.0,       1.75
Coeff,         1,      0.0001,    0.1,       0.001
cwpvt,         1,      0.50,      2.0,       1.0
vcmx25,        1,      0.60,      1.4,       1.0
mp,            1,      0.60,      1.4,       1.0
hvt,           1,      0.25,      1.5,       1.0
mfsno,         1,      0.50,      2.0,       1.0
rsurfexp,      0,      1.0,       6.0,       5.0
Bw,            0,      0.1,       10.0,      1.0
HLINK,         0,      0.1,       10.0,      1.0
ChSSlp,        0,      0.1,       10.0,      1.0
MannN,         0,  


Within this table, you will find all the potential parameters to calibrate, along with a ‘calib_flag’ of 1 or 0, which turns calibration on (1) or off (0) for that parameter. The ‘minValue’ and ‘maxValue’ columns specify the range of potential parameter values to calibrate over; this template uses a broad range. The ‘ini’ column specifies the value used either as the default for un-calibrated parameters or as the initial value going into the calibration workflow. It is up to you to determine what range is best for your calibration experiment. You are strongly encouraged to perform a sensitivity analysis over your region of interest to help determine which parameters have a significant impact on hydrologic response.
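
After editing the table, you may want to confirm which parameters are switched on and that each initial value lies inside its range. The sketch below does this with Python's standard `csv` module, using three rows copied from the template output above. The helper name `read_calib_table` is hypothetical, and the sketch is not how the workflow itself reads the table.

```python
import csv
import io

# Header and three rows copied from the template table shown above.
SAMPLE = "\n".join([
    "parameter,calib_flag,minValue,maxValue,ini",
    "bexp,          1,      0.40,      1.90,      1.0",
    "refkdt,        1,      0.10,      4.00,      0.6",
    "rsurfexp,      0,      1.0,       6.0,       5.0",
])


def read_calib_table(text):
    """Parse the comma-separated table into a list of typed dicts."""
    # skipinitialspace drops the padding that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [
        {
            "parameter": rec["parameter"].strip(),
            "calib_flag": int(rec["calib_flag"]),
            "minValue": float(rec["minValue"]),
            "maxValue": float(rec["maxValue"]),
            "ini": float(rec["ini"]),
        }
        for rec in reader
    ]


rows = read_calib_table(SAMPLE)
# Parameters that will be calibrated (calib_flag == 1).
active = [r["parameter"] for r in rows if r["calib_flag"] == 1]
# Parameters whose initial value falls outside [minValue, maxValue].
out_of_range = [r["parameter"] for r in rows
                if not r["minValue"] <= r["ini"] <= r["maxValue"]]
```

A row with a missing field would raise an error during parsing, which is usually what you want when checking a hand-edited table.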



## Conclusion:

Once the setup.parm file and calibration parameters are set and verified, we can begin the process of initializing databases for calibration. Proceed to *Lesson 2 - Create Database*
