# Run WRFHydro model on HPC resources using CyberGIS-Compute V2

**(<span style="color: red"> This notebook requires user interaction. Please 'Run Cell by Cell'. 'Run All' may cause errors. </span>)**

This notebook demonstrates how to prepare a WRFHydro model on CyberGIS-Jupyter for Water (CJW) for execution on a supported High-Performance Computing (HPC) resource via the CyberGIS-Compute service. First-time users are highly encouraged to go through the [NCAR WRFHydro Hands-on Training on CJW](https://www.hydroshare.org/resource/d2c6618090f34ee898e005969b99cf90/) to get familiar with WRFHydro model basics, including compilation of source code, preparation of forcing data, and typical model configurations. This notebook does not cover those topics and assumes users already have hands-on experience with local model runs.

CyberGIS-Compute is a CyberGIS-enabled web service that sits between CJW and HPC resources. It acts as a middleman that takes user requests (e.g., submission of a model) originating from CJW, carries out the actual job submission on the target HPC resource, monitors job status, and retrieves outputs when the model execution has completed. The functionality of CyberGIS-Compute is exposed as a series of REST APIs. A Python client, [CyberGIS-Compute SDK](https://github.com/cybergis/cybergis-compute-python-sdk), has been developed for use in the CJW environment and provides a simple GUI to guide users through the job submission process. Prior to job submission, model configuration and input data must be prepared and arranged to meet specific requirements, which vary by model and by its implementation in CyberGIS-Compute. We will walk through the requirements for WRFHydro below.

The general workflow for WRFHydro in CyberGIS-Compute works as follows:

1. User picks a Model_Version of WRFHydro to use;
2. User prepares configuration files and data for the model on CJW;
3. User submits configuration files and data to CyberGIS-Compute;
4. CyberGIS-Compute transfers configuration files and data to target HPC;
5. CyberGIS-Compute downloads the chosen Model_Version of WRFHydro codebase on HPC;
6. CyberGIS-Compute applies compile-time configuration files to the codebase, and compiles the source code on the fly;
7. CyberGIS-Compute applies run-time configuration files and data to the model;
8. CyberGIS-Compute submits the model job to HPC scheduler for model execution;
9. CyberGIS-Compute monitors job status;
10. CyberGIS-Compute transfers model outputs from HPC to CJW upon user request;
11. User performs post-processing work on CJW.

The current implementation for WRFHydro in CyberGIS-Compute is compatible with the WRFHydro 5.x official releases hosted on the NCAR GitHub repo (https://github.com/NCAR/wrf_hydro_nwm_public). It requires users to provide configurations and data in 3 categories: **compile-time configuration, run-time configuration, and model data**. The following table lists details for each category. **For a more in-depth description of each configuration and data file, please refer to the official [WRFHydro Technical Documentation](https://ral.ucar.edu/projects/wrf_hydro/technical-description-user-guide) at NCAR**.

| Category   |      Parameter/Configuration/Data     | Required|  Comments | Submission|
|:----------|-------------:|:-----:|------:|--:|
| Compile-Time |  Model_Version (string) | Y |release, tag, branch or commit id from [WRFHydro official repo](https://github.com/NCAR/wrf_hydro_nwm_public) | GUI or API |
| Compile-Time |   "setEnvar.sh" (file)          | Y |[example "setEnvar.sh"](https://github.com/NCAR/wrf_hydro_nwm_public/blob/refs/tags/v5.2.0/trunk/NDHMS/template/setEnvar.sh)  | Upload_Folder|
| Compile-Time |   LSM_Type  (string)      |   Y |"NoahMP" (default) or "Noah" | GUI or API |
| Run-Time |  "namelist.hrldas" (file) |   Y | [example "namelist.hrldas" for LSM_Type="NoahMP"](https://github.com/NCAR/wrf_hydro_nwm_public/blob/refs/tags/v5.2.0/trunk/NDHMS/template/NoahMP/namelist.hrldas) | Upload_Folder|
| Run-Time |  "hydro.namelist" (file) |   Y | [example "hydro.namelist"](https://github.com/NCAR/wrf_hydro_nwm_public/blob/refs/tags/v5.2.0/trunk/NDHMS/template/HYDRO/hydro.namelist) |Upload_Folder|
| Model Data | "DOMAIN" (folder)|  Y | contains domain files | Upload_Folder |
| Model Data | "FORCING" (folder) |  Y | contains forcing files | Upload_Folder|
| Model Data | "RESTART" (folder)|  N | contains restart files | Upload_Folder|

As listed above, the items marked as "GUI or API" under "Submission" should be provided in the job submission GUI, and the ones marked as "Upload_Folder" should be put in a local folder that will be submitted to HPC through CyberGIS-Compute.

Because the WRFHydro codebase is under active development, mixing configuration files or data made for different WRFHydro versions may cause model failures or hard-to-diagnose issues. For example, the file 'namelist.hrldas' from Version_A may not work with WRFHydro Version_B, and domain files that worked with WRFHydro 5.0 may not work with WRFHydro 5.2. **<span style="color: red"> Users should make sure the configuration files and data provided are compatible with the chosen Model_Version of WRFHydro codebase.</span>** A quick way to check compatibility is to try a small-scale local run on CJW. Please refer to [NCAR WRFHydro Hands-on Training on CJW](https://www.hydroshare.org/resource/d2c6618090f34ee898e005969b99cf90/) for examples. We highly recommend the "test small first" approach before submitting any job to HPC using CyberGIS-Compute.

In this demo, we will use the testcase (Croton, NY) shipped with the [WRFHydro v5.2.0](https://github.com/NCAR/wrf_hydro_nwm_public/releases/tag/v5.2.0) official release (Dec, 2021), which guarantees compatibility between the model codebase and the data/parameters. Here is a [resource on HydroShare](https://www.hydroshare.org/resource/f2632892a18a4aafaa5b540db6403046/) that serves as a backup source in case the GitHub release is changed or removed in the future (possible but unlikely).

<img src="statics/release_v520.png" width="600">

FYI: Support for the WRFHydro model in CyberGIS-Compute was added through the "community contribution" approach (introduced in CyberGIS-Compute V2), and the implementation is available at https://github.com/cybergis/cybergis-compute-v2-wrfhydro. This work is based on a close collaboration between the [CyberGIS Center](https://cybergis.illinois.edu/) at UIUC and [Dr. Ayman Nassar](https://uwrl.usu.edu/people/research-staff/nassar-ayman) at the Utah Water Research Laboratory, USU, under the [HydroShare](https://www.hydroshare.org/) project.

## Setup workspace

We create the necessary working folders, removing any existing ones first so that the notebook can be run repeatedly.

In [None]:
# scratch folder to hold codebase and testcase data
scratch_folder = "./scratch"
# upload_folder to hold data and files that will be submitted to HPC through CyberGIS-Compute
upload_folder = "./upload2hpc"

In [None]:
# create folders (and remove old ones if exist)
!rm -rf {scratch_folder} && mkdir -p {scratch_folder}
!rm -rf {upload_folder} && mkdir -p {upload_folder}
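
The same "remove, then recreate" step can also be written in plain Python, which may be convenient outside a notebook where the `!` shell magic is unavailable. This is a minimal sketch; the helper name `reset_folder` is hypothetical and not part of the CyberGIS-Compute SDK.

```python
import shutil
from pathlib import Path


def reset_folder(path):
    """Delete `path` if it exists, then recreate it as an empty folder.

    Hypothetical helper for illustration; mirrors `rm -rf X && mkdir -p X`.
    """
    folder = Path(path)
    if folder.exists():
        shutil.rmtree(folder)      # remove the folder and everything in it
    folder.mkdir(parents=True)     # recreate it, including missing parents
    return folder
```

For example, `reset_folder(scratch_folder)` and `reset_folder(upload_folder)` would leave both working folders empty.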

## Compile-Time configuration (Model_Version=v5.2.0)

We clone the WRFHydro source code repo to scratch_folder and check out version "v5.2.0" (Model_Version). This version matches the Croton, NY testcase data we will be using, and we will set this version number in the job submission GUI later.

We download the source code because we want a copy of the Compile-Time configuration file "setEnvar.sh" that matches the chosen version (v5.2.0).

In [None]:
# remove old repos (if exists)
!rm -rf {scratch_folder}/wrf_hydro_nwm_public
# clone WRFHydro repo and checkout version "v5.2.0"
!git clone -b v5.2.0 https://github.com/NCAR/wrf_hydro_nwm_public.git {scratch_folder}/wrf_hydro_nwm_public

## Compile-Time configuration ("setEnvar.sh")

Here we prepare the Compile-Time configuration file "setEnvar.sh". We first make a copy of the original "setEnvar.sh" from the WRFHydro codebase and save it to upload_folder. You may go ahead and change the "setEnvar.sh" file as needed here.

In this particular case, we will follow **Lesson 1** from the [NCAR WRFHydro Hands-on Training on CJW](https://www.hydroshare.org/resource/d2c6618090f34ee898e005969b99cf90/) to enable the HYDRO_D and SPATIAL_SOIL flags.

In [None]:
# copy original "setEnvar.sh" to {upload_folder}
!cp -f {scratch_folder}/wrf_hydro_nwm_public/trunk/NDHMS/template/setEnvar.sh {upload_folder}
!ls {upload_folder} -l

In [None]:
# Edit "setEnvar.sh" to enable flags for HYDRO_D and SPATIAL_SOIL
!sed -i 's/HYDRO_D=0/HYDRO_D=1/'  {upload_folder}/setEnvar.sh
!sed -i 's/SPATIAL_SOIL=0/SPATIAL_SOIL=1/' {upload_folder}/setEnvar.sh
!cat {upload_folder}/setEnvar.sh
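
Because `sed` does not report when a pattern fails to match, it can be worth confirming the flag values programmatically rather than by reading the `cat` output. The sketch below assumes the flags appear as `export NAME=value` lines; the helper `read_envar_flags` is hypothetical, and the sample text is an illustrative excerpt, not the full file.

```python
import re


def read_envar_flags(text):
    """Return {NAME: value} for every `export NAME=value` line in `text`.

    Hypothetical helper; commented-out lines (starting with '#') are ignored.
    """
    flags = {}
    for line in text.splitlines():
        match = re.match(r"\s*export\s+(\w+)=(\S*)", line)
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


# Illustrative excerpt only (assumed layout), not the real setEnvar.sh
sample = """\
# Enhanced diagnostic output for debugging
export HYDRO_D=1
export SPATIAL_SOIL=1
#export DISABLED_FLAG=1
"""
flags = read_envar_flags(sample)
print(flags["HYDRO_D"], flags["SPATIAL_SOIL"])
```

In the notebook, the same function could be applied to the real file with `read_envar_flags(open(f"{upload_folder}/setEnvar.sh").read())`.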

## Retrieve testcase (Croton,NY) data

We download the Croton, NY testcase made for WRFHydro v5.2.0 from the NCAR repo and extract it to scratch_folder.

In [None]:
!rm -rf {scratch_folder}/*.tar.gz {scratch_folder}/example_case
!wget https://github.com/NCAR/wrf_hydro_nwm_public/releases/download/v5.2.0/croton_NY_training_example_v5.2.tar.gz -P {scratch_folder}
!cd {scratch_folder} && tar xzf croton_NY_training_example_v5.2.tar.gz

## Run-Time configurations ("namelist.hrldas", "hydro.namelist") & Model Data ("DOMAIN", "RESTART")

In this specific case, we follow **Lesson 2** from the [NCAR WRFHydro Hands-on Training on CJW](https://www.hydroshare.org/resource/d2c6618090f34ee898e005969b99cf90/) to set up a *Gridded* model. Users are encouraged to try out other model configurations such as *National Water Model (NWM)* and *Reach*.

The testcase for the *Gridded* model already provides the 2 required Run-Time configuration files ("namelist.hrldas", "hydro.namelist"), 1 required Model Data folder ("DOMAIN"), and 1 optional Model Data folder ("RESTART"). We copy them over to upload_folder.

FYI, the [CUAHSI Domain Subsetter](https://subset.cuahsi.org/) allows you to subset different versions of domain files for NWM configurations.

In [None]:
# list data for Gridded configuration
!ls {scratch_folder}/example_case/Gridded -l

In [None]:
# copy both Run-Time configuration files and Model Data to upload_folder
!cp -rf {scratch_folder}/example_case/Gridded/* {upload_folder}
!ls {upload_folder} -l

## Model Data ("FORCING")

Here we copy the last required Model Data folder, "FORCING", from the testcase directory to upload_folder. At this point, all required Run-Time configuration files and Model Data are ready in upload_folder.

In [None]:
# copy over FORCING data
!cp -rf {scratch_folder}/example_case/FORCING {upload_folder}
!ls {upload_folder} -l
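
Before submitting, the contents of upload_folder can be checked against the "Upload_Folder" rows of the requirements table. The following is a minimal sketch of such a check; the function `check_upload_folder` and the three constant lists are hypothetical names introduced here, and the check only looks at names, not at whether the files are compatible with the chosen Model_Version.

```python
from pathlib import Path

# Items taken from the "Upload_Folder" rows of the requirements table
REQUIRED_FILES = ["setEnvar.sh", "namelist.hrldas", "hydro.namelist"]
REQUIRED_FOLDERS = ["DOMAIN", "FORCING"]
OPTIONAL_FOLDERS = ["RESTART"]


def check_upload_folder(path):
    """Return (missing_required, absent_optional) item names for `path`.

    Hypothetical helper: checks presence by name only.
    """
    root = Path(path)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name for name in REQUIRED_FOLDERS if not (root / name).is_dir()]
    absent_optional = [name for name in OPTIONAL_FOLDERS
                       if not (root / name).is_dir()]
    return missing, absent_optional
```

Calling `check_upload_folder(upload_folder)` at this point should return an empty first list if every required item from the previous steps was copied.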

## Job Submission

Here we use the CyberGIS-Compute SDK to submit the configured WRFHydro model to an HPC resource. We first establish a connection to the CyberGIS-Compute service and get a "cybergis" object.

In [None]:
# import Compute-SDK and establish connection to Compute service
from cybergis_compute_client import CyberGISCompute
cybergis = CyberGISCompute(url="cgjobsup.cigi.illinois.edu", isJupyter=True, protocol="HTTPS", port=443, suffix="v2")

Call the function cybergis.create_job_by_ui() to show the job submission GUI, where the job type is set to "wrfhydro-5.x" and the data to upload comes from upload_folder, which contains "setEnvar.sh", "namelist.hrldas", "hydro.namelist", "DOMAIN", "RESTART" and "FORCING". You may have noticed that 2 parameters from the table above are still missing: "Model_Version" and "LSM_Type". We will set them in the job submission GUI under the section "Input Parameters". In this case, Model_Version should be "v5.2.0" and LSM_Type should be "NoahMP".

There are some optional settings you may want to tweak. For example, you may choose the target Computing Resource between "keeling_community" (an HPC resource, also known as ["Virtual Roger"](https://cybergis.illinois.edu/infrastructures/), hosted at UIUC) and "expanse_community" (an XSEDE HPC resource hosted at SDSC).

Once the job is submitted, its status is updated on the "Your Job Status" tab. When the job has finished, you can download the results from HPC on the "Download Job Results" tab.

In [None]:
# display job submission GUI;
cybergis.create_job_by_ui(defaultJob="wrfhydro-5.x", defaultDataFolder=upload_folder)

## Check out model outputs

Here we take a quick look at the model outputs.

In [None]:
!echo {cybergis.recentDownloadPath}/Simulation

In [None]:
!ls {cybergis.recentDownloadPath}/Simulation
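
A long directory listing can be hard to scan, so it may help to count the output files by type. The sketch below counts files whose names contain a given keyword, the same way the `*CHANOBS*` glob is used later in this notebook; the helper `count_outputs` is hypothetical, and any keyword other than "CHANOBS" is an assumption about which outputs your configuration writes.

```python
from pathlib import Path


def count_outputs(folder, keywords=("CHANOBS",)):
    """Return {keyword: number of files in `folder` whose name contains it}.

    Hypothetical helper; only the top level of `folder` is searched.
    """
    root = Path(folder)
    return {key: len(list(root.glob(f"*{key}*"))) for key in keywords}
```

For example, `count_outputs(f"{cybergis.recentDownloadPath}/Simulation")` reports how many CHANOBS files were downloaded; a count of zero suggests the download or the model run did not complete as expected.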

## Visualization

For a quick demo, we plot the hydrograph for one gauge point, as presented in **Lesson 3** of the [NCAR WRFHydro Hands-on Training on CJW](https://www.hydroshare.org/resource/d2c6618090f34ee898e005969b99cf90/).

In [None]:
import xarray as xr
chanobs = xr.open_mfdataset('{}/Simulation/*CHANOBS*'.format(cybergis.recentDownloadPath),
                            combine='by_coords')

In [None]:
chanobs.sel(feature_id = 2).streamflow.plot()

## Done