### Sample RHESSys workflow

[RHESSysWorkflows](https://github.com/selimnairb/RHESSysWorkflows) provides a series of Python tools for performing [RHESSys](https://github.com/RHESSys/RHESSys) data preparation workflows. The tools used in this notebook build on the workflow system defined by [EcohydroLib](https://github.com/selimnairb/EcohydroLib) and RHESSysWorkflows.

In [2]:
import os
import logging
from rhessys_wf import *
%matplotlib inline

The `RHESSysWorkflow` class is provided as part of the `rhessys_wf` library (imported above) to streamline your interaction with the `RHESSysWorkflows` [core functions](https://github.com/selimnairb/RHESSysWorkflows).  More information about this class can be obtained by executing the `help(RHESSysWorkflow)` command.

Create an instance of the `RHESSysWorkflow` class and assign it to the variable `w`, using USGS gage **01589312** [(DEAD RUN NEAR CATONSVILLE, MD)](http://waterdata.usgs.gov/usa/nwis/uv?01589312), a start date of **2008-01-01**, and an end date of **2010-01-01**.  This command creates a clean directory for the project name **myRHESSysSimulation** in the JupyterHub's default data directory.

In [3]:
w = RHESSysWorkflow(project_name='myRHESSysSimulation', 
                    gageid='01589312',
                    start_date='2008-01-01',
                    end_date='2010-01-01'
                    )

Project [mysim] already exists, would you like to remove it [Y/n]? Y
Creating a clean directory for project [mysim]
Log file location: /home/jovyan/work/notebooks/data/mysim/mysim.log


The `RHESSysWorkflow` class uses a logging library to document output and errors.  We can display these messages in our notebook by attaching to the logger and redirecting the output to stdout.  While this step is not necessary, it will provide us with more verbose output.

In [4]:
import sys

# Get the root logger (all other loggers derive from this logger's properties)
logger = logging.getLogger()

# Assuming only a single handler has been set up (this seems to be the default in the notebook), send that handler's output to stdout.
logger.handlers[0].stream = sys.stdout
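
The redirection above can be tried in isolation. The following is a minimal sketch, not part of `rhessys_wf`: the logger name `demo_redirect` is hypothetical, and an in-memory `io.StringIO` buffer stands in for `sys.stdout` so that the redirected text can be inspected.

```python
import io
import logging

# Hypothetical logger name, used only for this illustration.
demo_logger = logging.getLogger("demo_redirect")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo isolated from the root logger

# Attach one stream handler, then point it at a different stream.
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
demo_logger.addHandler(handler)

buffer = io.StringIO()   # stands in for sys.stdout
handler.stream = buffer  # the same attribute the notebook cell overwrites

demo_logger.info("hello")
captured = buffer.getvalue()
```

Because the handler writes to whatever object its `stream` attribute refers to, swapping that attribute is enough to change where messages appear.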

Using the watershed parameters defined within the `RHESSysWorkflow` object (`w`), `get_NHDStreamflowGageIdentifiersAndLocation` retrieves the NHDPlus2 streamflow gage identifiers (reachcode, measure along the reach in percent) for a USGS gage. The function `get_CatchmentShapefileForNHDStreamflowGage` generates a shapefile for the drainage area of an NHDPlus2 streamflow gage using web services from [Horizon Systems NHDPlus Version 2](http://www.horizon-systems.com/NHDPlus/NHDPlusV2_home.php). The `get_BoundingboxFromStudyareaShapefile` function calculates the bounding box (also known as the envelope or extent) of the catchment.

In [5]:
w.get_NHDStreamflowGageIdentifiersAndLocation(w.sub_project_folder, w.gageid)
w.get_CatchmentShapefileForNHDStreamflowGage(w.sub_project_folder)
w.get_BoundingboxFromStudyareaShapefile(w.sub_project_folder)
extent = w.get_Extent_from_RHESSysWorkflows_Metadata_File()

INFO:myApp:GetNHDStreamflowGageIdentifiersAndLocation.py -p /home/jovyan/work/notebooks/data/mysim/mysim -g 01589312
INFO:myApp:GetCatchmentShapefileForNHDStreamflowGage.py --overwrite -p /home/jovyan/work/notebooks/data/mysim/mysim
INFO:myApp:GetBoundingboxFromStudyareaShapefile.py -p /home/jovyan/work/notebooks/data/mysim/mysim
INFO:myApp:/home/jovyan/work/notebooks/data/mysim/mysim/metadata.txt
INFO:myApp:
min x = -76.769782
min y = 39.273610
max x = -76.717498
max y = 39.326008
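
To make the bounding-box idea concrete, here is a minimal sketch of how an extent can be derived from a set of boundary vertices. It is an illustration only and does not reproduce what `GetBoundingboxFromStudyareaShapefile.py` does internally; the function name `bounding_box` and the sample coordinates are hypothetical.

```python
def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for an iterable of (x, y) pairs."""
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs), max(ys)

# Hypothetical boundary vertices (longitude, latitude), not real catchment data.
boundary = [(-76.76, 39.28), (-76.72, 39.29), (-76.73, 39.32), (-76.75, 39.31)]
bbox = bounding_box(boundary)
```

The result has the same four components as the `min x`, `min y`, `max x`, `max y` values printed above.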



Using the generated bounding box and the user-defined start and end dates of the RHESSys simulation period, an Essential Terrestrial Variable (***ETV***) RHESSys data bundle (climate forcing, soils, elevation) is retrieved from the [HydroTerre](http://www.hydroterre.psu.edu) cyberinfrastructure services using the function `HydroTerre_RHESSys_ByExtent`. The zipped data bundle is then extracted for the remaining workflow steps. Please note that, for climate forcing, the earliest start date is 1979-01-01 and the latest end date is 2009-12-31 (the simulation dates are defined in the first step).

In [6]:
w.HydroTerre_RHESSys_ByExtent(extent, w.ht_start_date, w.ht_end_date, w.sub_project_folder)
zipfolder = w.sub_project_folder + '/RHESSys_ETV'
w.create_path(zipfolder)
zipfilepathname = w.sub_project_folder + '/RHESSys_ETV_Data.zip'
w.unzip_etv_zip_file_at_path(zipfilepathname, zipfolder)

checking to see if HydroTerre job is completed...
checking to see if HydroTerre job is completed...
checking to see if HydroTerre job is completed...
checking to see if HydroTerre job is completed...
checking to see if HydroTerre job is completed...
--------Download Start-------------
Retrieving result from: http://hydroterre.psu.edu/HydroTerre_Rhessys_ByExtent/j3e134c823fbd44fe9ded1e0541006f6f/scratch/RHESSys_ETV_Data.zip
--------Download End-------------
--------HUC12 Area-------------
412.045597076416 SQKM
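
The simulation end date chosen in the first step (2010-01-01) falls just outside the climate forcing window quoted above. How `RHESSysWorkflow` derives `w.ht_start_date` and `w.ht_end_date` is not shown in this notebook, so the following is only a sketch of one way to clip a requested period to that window; the helper `clamp_to_forcing_window` is hypothetical.

```python
from datetime import date

# Availability window quoted in the text above.
FORCING_START = date(1979, 1, 1)
FORCING_END = date(2009, 12, 31)

def clamp_to_forcing_window(start, end):
    """Clip a requested (start, end) period to the available forcing window."""
    clipped_start = max(start, FORCING_START)
    clipped_end = min(end, FORCING_END)
    if clipped_start > clipped_end:
        raise ValueError("requested period lies outside the forcing window")
    return clipped_start, clipped_end

# Dates used when the workflow object was created.
clipped = clamp_to_forcing_window(date(2008, 1, 1), date(2010, 1, 1))
```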


The function `get_USGSDEMForBoundingbox` downloads 1/3 arcsecond Digital Elevation Model (***DEM***) data from the National Elevation Dataset and NHDPlus hydro-conditioned coverages hosted by the U.S. Geological Survey (***USGS***) Web Coverage Service (WCS) interface at the [Center for Integrated Data Analytics group](http://cida.usgs.gov/). `get_USGSNLCDForDEMExtent` retrieves the National Landcover Dataset (***NLCD***) within the watershed extent from data services at [RENCI](http://renci.org/). `get_SSURGOFeaturesForBoundingbox` processes ***SSURGO*** soil data from the United States Department of Agriculture ([USDA](http://sdmdataaccess.nrcs.usda.gov/)), and the `GenerateSoilPropertyRastersFromSSURGO` function rasterizes these soil attributes for [GRASS](https://grass.osgeo.org/) project manipulation. The [HydroTerre](http://www.hydroterre.psu.edu) RHESSys service generates a Leaf Area Index (***LAI***) raster dataset for each month; users can choose the month they are interested in by modifying the `w.lai_fullpathname_with_ext` variable. The `Register_LAI_Raster` function registers the LAI raster dataset for [GRASS](https://grass.osgeo.org/) project manipulation. The `CreateGRASSLocationFromDEM` function sets up the [GRASS](https://grass.osgeo.org/) environment and imports the DEM raster dataset.


In [7]:
output = w.get_USGSDEMForBoundingbox(w.sub_project_folder)
output = w.get_USGSNLCDForDEMExtent(w.sub_project_folder)
output = w.get_SSURGOFeaturesForBoundingbox(w.sub_project_folder)
output = w.GenerateSoilPropertyRastersFromSSURGO(w.sub_project_folder)
w.lai_fullpathname_with_ext = w.sub_project_folder + '/RHESSys_ETV/RHESSys_LAI/LAI_Month0.tif'
output = w.Register_LAI_Raster(w.sub_project_folder, w.lai_fullpathname_with_ext, w.publisher)
output = w.CreateGRASSLocationFromDEM(w.sub_project_folder, '"RHESSys model for Dead Run 5 watershed near Catonsville, MD"')

INFO:myApp:GetUSGSDEMForBoundingbox.py --overwrite -p /home/jovyan/work/notebooks/data/mysim/mysim
INFO:myApp:GetUSGSNLCDForDEMExtent.py -p /home/jovyan/work/notebooks/data/mysim/mysim
INFO:myApp:GetSSURGOFeaturesForBoundingbox.py -p /home/jovyan/work/notebooks/data/mysim/mysim
INFO:myApp:GenerateSoilPropertyRastersFromSSURGO.py -p /home/jovyan/work/notebooks/data/mysim/mysim
INFO:myApp:RegisterRaster.py -p /home/jovyan/work/notebooks/data/mysim/mysim -t lai -r /home/jovyan/work/notebooks/data/mysim/mysim/RHESSys_ETV/RHESSys_LAI/LAI_Month0.tif -b "RHESSysWorkflow" --force 
INFO:myApp:CreateGRASSLocationFromDEM.py -p /home/jovyan/work/notebooks/data/mysim/mysim -d "RHESSys model for Dead Run 5 watershed near Catonsville, MD"
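
Choosing a different LAI month amounts to changing the file name assigned to `w.lai_fullpathname_with_ext`. The sketch below assumes that the bundle follows the `LAI_Month<N>.tif` naming pattern seen in the cell above; the valid range of `N` is not confirmed here, and the helper `lai_raster_path` and the project folder are hypothetical.

```python
import os

def lai_raster_path(project_folder, month_index):
    """Build the path of one monthly LAI raster, assuming the LAI_Month<N>.tif pattern."""
    filename = 'LAI_Month{0}.tif'.format(month_index)
    return os.path.join(project_folder, 'RHESSys_ETV', 'RHESSys_LAI', filename)

# Hypothetical project folder used only for illustration.
path = lai_raster_path('/tmp/mysim', 0)
```

In the notebook, the returned value would be assigned to `w.lai_fullpathname_with_ext` before calling `Register_LAI_Raster`.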


The `ImportRHESSysSource` function downloads and compiles the latest [RHESSys](https://github.com/RHESSys/RHESSys) source code in the user-defined project folder.

In [8]:
output = w.ImportRHESSysSource(w.sub_project_folder)
print('Finished')

INFO:myApp:ImportRHESSysSource.py --overwrite -p /home/jovyan/work/notebooks/data/mysim/mysim
Finished


The [HydroTerre](http://www.hydroterre.psu.edu) RHESSys service generates climate data for the user-defined start and end period. Users can download data from other services (e.g. [HydroShare](https://www.hydroshare.org/)) and modify the `climate_data_fullpathname` variable. The `ImportClimateData` function imports ***climate*** data that is already in RHESSys file formats into the [RHESSys](https://github.com/RHESSys/RHESSys) project directory. If the user is using climate station data, the `station_data_fullpathname` variable should be defined. The `DelineateWatershed` script uses data retrieved from the above services and delineates a watershed with the user-defined `dem_cell_threshold` and `areaEstimate` parameters. The `GeneratePatchMap` function creates [RHESSys](https://github.com/RHESSys/RHESSys) patches (geometry) in the [GRASS](https://grass.osgeo.org/) project.

In [9]:
w.climate_data_fullpathname = w.sub_project_folder + '/RHESSys_ETV/RHESSys_Climate'

output = w.ImportClimateData(w.sub_project_folder, w.climate_data_fullpathname)

w.station_data_fullpathname = w.sub_project_folder + '/RHESSys_ETV/RHESSys_Climate'

output = w.DelineateWatershed(w.sub_project_folder, w.dem_cell_threshold, w.areaEstimate)

output = w.GeneratePatchMap(w.sub_project_folder)

INFO:myApp:ImportClimateData.py -p /home/jovyan/work/notebooks/data/mysim/mysim -s /home/jovyan/work/notebooks/data/mysim/mysim/RHESSys_ETV/RHESSys_Climate
INFO:myApp:DelineateWatershed.py -p /home/jovyan/work/notebooks/data/mysim/mysim -t 500 -a 412.045597076416
INFO:myApp:GeneratePatchMap.py -p /home/jovyan/work/notebooks/data/mysim/mysim -t clump -c elevation
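
The `-a` value passed to `DelineateWatershed.py` above is the HUC-12 area reported by the HydroTerre step. As a rough plausibility check, that value can be compared with the area of the bounding box printed earlier, since the catchment cannot be larger than its bounding box. The sketch below uses a simple equirectangular approximation, which is adequate only for small extents; the function name `bbox_area_sqkm` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def bbox_area_sqkm(min_x, min_y, max_x, max_y):
    """Approximate area of a small lon/lat box using an equirectangular projection."""
    mean_lat = math.radians((min_y + max_y) / 2.0)
    width_km = math.radians(max_x - min_x) * EARTH_RADIUS_KM * math.cos(mean_lat)
    height_km = math.radians(max_y - min_y) * EARTH_RADIUS_KM
    return width_km * height_km

# Extent values copied from the bounding-box output earlier in this notebook.
box_area = bbox_area_sqkm(-76.769782, 39.273610, -76.717498, 39.326008)
huc12_area = 412.045597076416  # value reported by the HydroTerre step
area_ratio = huc12_area / box_area
```

The bounding box covers only a few tens of square kilometres, so the HUC-12 figure overstates the gaged catchment by more than an order of magnitude.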


The `GenerateSoilTextureMap` function imports ***soil*** data (sand and clay percentages) into the [GRASS](https://grass.osgeo.org/) project and generates a soil texture map using the GRASS addon [r.soils.texture](https://grass.osgeo.org/grass70/manuals/addons/r.soils.texture.html). The `ImportRasterMapIntoGRASS_LAI` and `ImportRasterMapIntoGRASS_LANDCOVER` functions import the ***LAI*** and ***Landcover*** datasets into the GRASS project.

In [10]:
output = w.GenerateSoilTextureMap(w.sub_project_folder, options='--overwrite')

output = w.ImportRasterMapIntoGRASS_LAI(w.sub_project_folder)

output = w.ImportRasterMapIntoGRASS_LANDCOVER(w.sub_project_folder)

INFO:myApp:GenerateSoilTextureMap.py --overwrite -p /home/jovyan/work/notebooks/data/mysim/mysim
INFO:myApp:ImportRasterMapIntoGRASS.py -p /home/jovyan/work/notebooks/data/mysim/mysim -t lai -m nearest
INFO:myApp:ImportRasterMapIntoGRASS.py -p /home/jovyan/work/notebooks/data/mysim/mysim -t landcover -m nearest


These steps consume the [GRASS](https://grass.osgeo.org/) project datasets and prepare the [RHESSys](https://github.com/RHESSys/RHESSys) model data structures. The `RegisterLandcoverReclassRules` function generates landcover raster map reclassification rules (***stratum, landuse, impervious, road, lai-recode***), which the `GenerateLandcoverMaps` function then uses to generate derived landcover data for RHESSys. The `GenerateWorldTemplate` function prepares the [RHESSys ***world template file***](http://fiesta.bren.ucsb.edu/~rhessys/setup/setup.html) with the user-defined datasets, and the `CreateWorldfile` function generates the [RHESSys ***world file***](http://fiesta.bren.ucsb.edu/~rhessys/setup/setup.html). The `CreateFlowtable` script describes the [connectivity between patches](https://github.com/RHESSys/RHESSys/wiki/Flowtable), using the user-defined data on the landscape partitioning, topology and soil characteristics of a basin. The `RunLAIRead` utility initializes the vegetation carbon stores in the world file. The `RunCmd` tool runs a command within the project; as the log output below shows, here it writes the `tec_daily.txt` TEC file that switches on daily output, which completes the preparation of the RHESSys model.

In [11]:
output = w.RegisterLandcoverReclassRules(w.sub_project_folder)
output = w.GenerateLandcoverMaps(w.sub_project_folder)
output = w.GenerateWorldTemplate(w.sub_project_folder)
output = w.CreateWorldfile(w.sub_project_folder)
output = w.CreateFlowtable(w.sub_project_folder)
output = w.RunLAIRead(w.sub_project_folder)
output = w.RunCmd(w.sub_project_folder, 3)            

INFO:myApp:RegisterLandcoverReclassRules.py -p /home/jovyan/work/notebooks/data/mysim/mysim -k
INFO:myApp:GenerateLandcoverMaps.py -p /home/jovyan/work/notebooks/data/mysim/mysim
INFO:myApp:GenerateWorldTemplate.py -p /home/jovyan/work/notebooks/data/mysim/mysim -c HT_RHESSys
INFO:myApp:CreateWorldfile.py -p /home/jovyan/work/notebooks/data/mysim/mysim -v
INFO:myApp:CreateFlowtable.py -p /home/jovyan/work/notebooks/data/mysim/mysim --routeRoads
INFO:myApp:RunLAIRead.py -p /home/jovyan/work/notebooks/data/mysim/mysim -v
INFO:myApp:RunCmd.py -p /home/jovyan/work/notebooks/data/mysim/mysim echo "2008 10 1 1 print_daily_on" > /home/jovyan/work/notebooks/data/mysim/mysim/rhessys/tecfiles/tec_daily.txt
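
The last log line above writes a single event, `2008 10 1 1 print_daily_on`, to `tec_daily.txt`. The same file could be produced from Python, as in the sketch below. The helper `write_tec_file` is hypothetical, and the `year month day hour command` layout is inferred from that log line only.

```python
import os
import tempfile

def write_tec_file(path, events):
    """Write one 'year month day hour command' line per event."""
    with open(path, 'w') as handle:
        for year, month, day, hour, command in events:
            handle.write('{0} {1} {2} {3} {4}\n'.format(year, month, day, hour, command))

# Write to a temporary directory so the sketch does not touch a real project.
tec_path = os.path.join(tempfile.mkdtemp(), 'tec_daily.txt')
write_tec_file(tec_path, [(2008, 10, 1, 1, 'print_daily_on')])

with open(tec_path) as handle:
    tec_text = handle.read()
```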


The `RunModel` function runs the [RHESSys](https://github.com/RHESSys/RHESSys) simulation!

In [12]:
output = w.RunModel(w.sub_project_folder)

INFO:myApp:RunModel.py -v -p /home/jovyan/work/notebooks/data/mysim/mysim -d "Test model run" --basin -pre test -st 2008 1 1 1 -ed 2010 1 1 1 -w world -t tec_daily.txt -r world.flow -- -s 0.07041256017 133.552915269 1.81282283058 -sv 4.12459677088 78.3440566535 -gw 0.00736592779294 0.340346799457
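
In the logged command, the simulation period appears as `-st 2008 1 1 1 -ed 2010 1 1 1`, that is, year, month, day and hour without zero padding. The following sketch shows how ISO date strings like those given at the start of the notebook can be converted to that form. The helper `rhessys_date_args` is hypothetical, and the default hour of 1 is taken from the log line, not from documentation.

```python
from datetime import datetime

def rhessys_date_args(flag, iso_date, hour=1):
    """Turn 'YYYY-MM-DD' into [flag, 'Y', 'M', 'D', 'H'] with no zero padding."""
    parsed = datetime.strptime(iso_date, '%Y-%m-%d')
    return [flag, str(parsed.year), str(parsed.month), str(parsed.day), str(hour)]

args = rhessys_date_args('-st', '2008-01-01') + rhessys_date_args('-ed', '2010-01-01')
command_fragment = ' '.join(args)
```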


To aid in analyzing the RHESSys simulation results, the `plot_rhessys_results` function plots model results (***data***) against user-supplied observed data (***obs_data***).

In [13]:
data = os.path.join(w.sub_project_folder, 'rhessys/output/test/rhessys_basin.daily')
obs_data = '/home/jovyan/work/notebooks/data/test_obs.txt'

plot_rhessys_results(
    outfileSuffix='test_plot',
    obs=obs_data,
    column='streamflow',
    data=[data],
    legend=['Test simulation'],
    title='DR5 streamflow',
    y='Streamflow (mm/day)')


plot_rhessys_results('test_plot', 
     obs_data, 
     'streamflow', 
     ['Test simulation'],
        #plottype=PLOT_DEFAULT,
        data=[data], 
        #behavioralData=None,
        #color=['magenta'],
        #linewidth=None,
        #linestyle=None,
        title='DR5 streamflow',
        #x=None,
        y='Streamflow (mm/day)',
        #titlefontsize=12,
        #scatterwidth=1,
        #fontweight='regular',
        #legendfontsize=6,
        #axesfontsize=12,
        #ticklabelfontsize=12,
        #figureX=8,
        #figureY=5,
        #supressObs=False,
        #secondaryData=obs_data,
        #secondaryPlotType='bar',
        #secondaryColumn='precip',
        #secondaryLabel='Rainfall (mm/day)'
                    )

IOError: [Errno 2] No such file or directory: '/home/jovyan/work/notebooks/data/test_obs.txt'
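
The `IOError` above is raised because the observed data file does not exist at the given path. One way to fail earlier, with a clearer message, is to check the input files before plotting. The sketch below is a generic illustration; the helper `first_missing` and the file name are hypothetical.

```python
import os

def first_missing(paths):
    """Return the first path that does not exist on disk, or None if all exist."""
    for path in paths:
        if not os.path.isfile(path):
            return path
    return None

# Hypothetical file name chosen so that it will not exist.
missing = first_missing([os.path.join(os.getcwd(), 'no_such_observations_file.txt')])
if missing is not None:
    message = 'Observed data file not found: {0}'.format(missing)
```

In the notebook, the list would contain `obs_data` and `data`, and `plot_rhessys_results` would be called only when nothing is missing.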

---
## 3. Save the results back into HydroShare

Using the `hs_utils` library, the results of our RHESSys simulation can be saved back into HydroShare.  First, define all of the metadata required for resource creation, i.e. *title*, *abstract*, *keywords*, and *content files*.  In addition, we must define the type of resource that will be created, in this case *genericresource*.

***Optional*** : define the resource from which this "new" content has been derived.  This is one method for tracking resource provenance.

In [29]:
import hs_utils

# establish a secure connection to HydroShare
hs = hs_utils.hydroshare()

Adding the following system variables:
   HS_USR_NAME = TonyCastronova
   HS_RES_ID = 466d5b3de8a543808d96ded08d861dc5
   HS_RES_TYPE = genericresource
   JUPYTER_HUB_IP = jupyter.uwrl.usu.edu

These can be accessed using the following command: 
   os.environ[key]

   (e.g.)
   os.environ["HS_USR_NAME"]  => TonyCastronova


In [26]:
# compress the simulation data
!tar -zcf $DATA/mysim.tar.gz $DATA/mysim 

tar: Removing leading `/' from member names
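
The `tar` message above appears because the archive is created from an absolute path. An alternative is Python's standard `tarfile` module, where the `arcname` argument stores members relative to the project directory name. This is a sketch under assumptions: the helper `compress_directory` is hypothetical, and a throwaway directory stands in for the real `mysim` project.

```python
import os
import tarfile
import tempfile

def compress_directory(source_dir, archive_path):
    """Create a .tar.gz whose members are stored relative to the directory name."""
    with tarfile.open(archive_path, 'w:gz') as archive:
        archive.add(source_dir, arcname=os.path.basename(source_dir))
    return archive_path

# Build a throwaway directory so the sketch runs anywhere.
workspace = tempfile.mkdtemp()
project_dir = os.path.join(workspace, 'mysim')
os.mkdir(project_dir)
with open(os.path.join(project_dir, 'metadata.txt'), 'w') as handle:
    handle.write('example\n')

archive_file = compress_directory(project_dir, os.path.join(workspace, 'mysim.tar.gz'))
with tarfile.open(archive_file) as archive:
    member_names = sorted(archive.getnames())
```

Extracting such an archive recreates a `mysim` folder in the current directory instead of the full original path.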


In [30]:
# define HydroShare required metadata
title = 'RHESSys Example'
abstract = 'This is a sample RHESSys simulation'
keywords = ['Jupyterhub', 'RHESSys']

# set the resource type that will be created.
rtype = 'genericresource'

# create a list of files that will be added to the HydroShare resource.
files = [os.path.join(os.environ['DATA'], 'mysim.tar.gz'), # the compressed simulation data
         os.path.join(os.getcwd(), 'rhessys.ipynb')  # this notebook
        ]  

In [33]:
# save the state of the current notebook
from IPython.display import display, Javascript
display(Javascript('IPython.notebook.save_checkpoint();'))

# create a hydroshare resource containing these data
resource_id = hs.createHydroShareResource(abstract, 
                                          title, 
                                          derivedFromId=None,
                                          keywords=keywords, 
                                          resource_type=rtype, 
                                          content_files=files, 
                                          public=False)

<IPython.core.display.Javascript object>

You have indicated that this resource is NOT derived from any existing HydroShare resource.  Are you sure that this is what you intended? [Y/n]y
                                       


                                     


## 4. Known Limitations and Future Additions

* We are ***missing user parameters*** to control the RHESSys simulation (for example, different ways to generate patches) and to control the data workflows (for example, the metadata for `CreateGRASSLocationFromDEM`).
* The ***areaEstimate*** input to the `DelineateWatershed` function requires automation. The area returned by the HydroTerre RHESSys workflow uses USGS HUC-12s, which are often larger than the watershed generated by the `DelineateWatershed` function.
* New tools are required to process the ***LAI*** raster datasets for RHESSys simulations.
* RHESSysWorkflows does allow users to specify and/or upload ***climate station data***. We envision a new web service that can automatically detect the climate stations within the watershed and prepare the data. Here, we are using climate data from HydroTerre, which has one climate normal from 1979-01-01 to 2009-12-31. Users interested in data from the year 2010 to the present will be required to upload RHESSys climate files.
* The ***ETV*** bundle has additional data not used by this workflow. Future tools will let users choose, through a graphical user interface (GUI), which data services to retrieve data from.
* This notebook retrieves RHESSys code from the GitHub repository. Future notebooks will allow users to upload their own code version and/or import specific model versions.

## 5. Resources

* RHESSys
  * [Setup](http://fiesta.bren.ucsb.edu/~rhessys/setup/setup.html)
  * [Wiki](http://fiesta.bren.ucsb.edu/~rhessys/)
* Data
  * [HydroShare](https://www.hydroshare.org/)
  * [USGS Data and Tools](https://www.usgs.gov/products/data-and-tools/data-and-tools-topics)
  * [USDA Data gateway](https://gdg.sc.egov.usda.gov/)
  * [HydroTerre](http://hydroterre.psu.edu/)
  
