Publishing Model Instances Using HydroShare

Mohamed M. Morsy edited this page Apr 21, 2016 · 17 revisions

Why Publish Your Model Online?

Computer models are an integral part of hydrology and water resource engineering. Producing a useful model instance (the input and output data for a specific model program) requires a significant investment of time and effort. Publishing a model instance enhances this investment by:

  • Sharing your model instance with collaborators and others interested in your work
  • Creating a 'frozen' copy of the model instance that can be cited using a DOI
  • Making the model instance publicly available as a product of publicly funded research
  • Facilitating scientific reproducibility and reuse of model instances

Why Use HydroShare to Publish Your Model?

  • HydroShare supports sharing model instances along with the version of the model program used to execute them. Tracking the exact model program version encourages reuse of model instances and makes their output reproducible.

  • HydroShare supports metadata and resource types designed specifically for sharing hydrologic data and models. Hydrology-specific metadata allows scientists to annotate their resources properly so that other hydrologists can discover them more effectively.

  • HydroShare generates a unique identifier for your model instance that can be used for citation. Your model instance therefore becomes an independent product of your research. Your model instance can also be assigned a permanent digital identifier (i.e., a DOI).

  • HydroShare accommodates the large files common in hydrologic modeling: the file upload limit is 2 GB through the web interface and 10 GB when using iRODS for file transfer.
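
As a rough illustration of the upload limits above, the following Python sketch picks a transfer route from a file's size. The function name `upload_route` and the route labels are hypothetical, and the sketch assumes a gigabyte means 1024³ bytes; HydroShare may count sizes differently.

```python
GB = 1024 ** 3          # assumption: binary gigabytes
WEB_LIMIT = 2 * GB      # web interface limit quoted above
IRODS_LIMIT = 10 * GB   # iRODS limit quoted above


def upload_route(size_bytes: int) -> str:
    """Return the transfer route a file of this size would need."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes <= WEB_LIMIT:
        return "web"
    if size_bytes <= IRODS_LIMIT:
        return "irods"
    return "too large"


# A 3 GB file exceeds the web limit but fits the iRODS limit.
print(upload_route(3 * GB))
```

For a file on disk, the size can be read with `pathlib.Path(path).stat().st_size` and passed to the same function.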

Example use case

This example use case walks through a scenario in which a hydrologist uses HydroShare to publish the results of a modeling study.

Use case description

  • The hydrologist has created a model of the Rocky Branch watershed located in Columbia, SC. The model is used to understand how green infrastructure could mitigate urban flooding within the watershed.
  • The EPA Storm Water Management Model (SWMM v. 5) was applied to model the Rocky Branch watershed. The model consists of 134 subwatersheds and 188 river channels and conduits. River cross-sections were defined using surveyed measurements, and conduit dimensions were defined using measurements collected during site visits.
  • The data used by the model were collected from local and national organizations and then processed using a Geographic Information System (GIS) to construct input files for the model. These input files are considered the 'model instance', whereas the SWMM v. 5 engine is considered the 'model program' used to execute the model instance.
  • For this study, two model instances were created. The first instance is a well-calibrated and evaluated model that simulates flooding events in the Rocky Branch watershed. The second model instance extends the first by including additional (hypothetical) stormwater controls to test whether these controls can mitigate flooding in the watershed.

What is the Problem?

We submitted a journal paper describing our study and wish to also publish the model instances resulting from the study. We have already received requests for the model instances used in our study and we expect these requests to increase once the paper is published. We believe the model instance is an important product from our work and would like it to be published and citable as well. The research was funded by NSF and we would like to publish the model instance as part of our data management plan.

Why HydroShare?

We publish the model instances in HydroShare because (1) we can control access to the files, (2) it is a repository tailored for hydrologists (e.g., we can add hydrology-specific metadata to the instances), and (3) the model instances can be citable products in their own right, rather than just supplementary documents to a journal publication.

How to implement the use case in HydroShare

To implement this use case in HydroShare, three resources should be created:

  • Model Program resource for the hydrologic model used (EPA SWMM)
  • Model Instance resource for the first instance “Rocky Branch watershed simulation without rain gardens implementation”
  • Model Instance resource for the second instance “Rocky Branch watershed simulation with rain gardens implementation”

Step-by-step instructions for publishing the Model Instance resource type

1. Go to www.hydroshare.org, log in using your username and password, and then click on “MY RESOURCES.”

2. Click on “+ Create new” to create a new resource in HydroShare.

3. From the drop-down list next to “Select a resource type,” select the required resource type. In our case, select “Model Instance.”

4. Add a title for the Model Instance resource next to “Title.” Click on the “Choose files” tab and choose the files to be uploaded to this Model Instance resource. In our case there are five files to upload. Select all five files and then click “Open.”

5. Now we have a resource type, a title, and the files selected for upload. Once we click the “Create Resource” button, the resource will be created and assigned a unique identifier. Note: You can also create the resource by selecting only the resource type, and then add the title and upload the associated files later.

6. Now you can see the landing page of the created resource. It includes all of the information related to the resource, such as its type, owner, unique identifier, file content, and resource-specific metadata. Click on the metadata edit button to start inserting and editing the resource metadata.

7. Modify the resource title if needed. Add an abstract and a subject, and select a license.

8. Select the “Contact” tab and add any authors or contributors.

9. Select the “Coverage” tab and add the spatial and temporal coverage for the created resource.

10. Select the “Resource specific” tab, which includes the metadata specific to the created resource. The Model Instance resource has two additional metadata terms: ModelOutput and ExecutedBy. For the ModelOutput term, select “Yes” if the uploaded files include the output files; otherwise select “No.”

11. For the ExecutedBy term, select the Model Program that is used to execute the Model Instance resource from the drop-down list under “Model name.”

12. Once the Model Program is selected, some related metadata appear so that you can verify that you selected the correct Model Program. Click “Save changes” to save the selected Model Program resource.
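
The ModelOutput choice in step 10 can be pictured as a simple check on the uploaded file names. The Python sketch below is only an illustration: it assumes that SWMM 5 results are stored in files ending in `.rpt` or `.out`, and the function name and file names are hypothetical.

```python
from pathlib import PurePath

# Assumption: SWMM 5 report and binary output files use these suffixes.
OUTPUT_SUFFIXES = {".rpt", ".out"}


def model_output_answer(filenames):
    """Return "Yes" if any uploaded file looks like model output, else "No"."""
    has_output = any(
        PurePath(name).suffix.lower() in OUTPUT_SUFFIXES for name in filenames
    )
    return "Yes" if has_output else "No"


# Hypothetical file names, for illustration only.
print(model_output_answer(["rocky_branch.inp"]))
print(model_output_answer(["rocky_branch.inp", "rocky_branch.rpt"]))
```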

The resource-specific metadata terms can be found in two places:

  • On the resource landing page, under “Resource Specific.”

  • In the downloaded resource content package, in the file called “resourcemetadata.xml.”
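
To inspect the terms stored in “resourcemetadata.xml” without opening it by hand, a small script can list every element that carries text. The Python sketch below makes no claim about the HydroShare metadata schema: the sample XML and its element names are hypothetical stand-ins, and the helper names are invented for this illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip a leading '{namespace}' from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def list_metadata_terms(xml_text: str):
    """Return (element name, text) pairs for every element with non-blank text."""
    root = ET.fromstring(xml_text)
    terms = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        if text:
            terms.append((local_name(elem.tag), text))
    return terms


# Hypothetical sample, NOT the real HydroShare schema.
SAMPLE = """<metadata xmlns:x="http://example.org/ns">
  <x:title>Rocky Branch watershed simulation</x:title>
  <x:includesOutput>Yes</x:includesOutput>
</metadata>"""

for name, value in list_metadata_terms(SAMPLE):
    print(f"{name}: {value}")
```

For a file on disk, `ET.parse("resourcemetadata.xml").getroot()` can take the place of `ET.fromstring`.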

Here is the finished product! Morsy, M.M. (2015). Rocky Branch watershed simulation, HydroShare, http://www.hydroshare.org/resource/12d195906f2c41918cb24e11a5c3ab60

Specific Model Instance Resource Types

The basic Model Instance resource type metadata has been expanded to support metadata for specific, more common models: SWAT Model Instance (ready), MODFLOW Model Instance (in progress), RHESSys Model Instance (in progress), and UEB Snow Model Instance (in progress).

What if the Model Program resource required for your Model Instance resource is not in HydroShare?

In this case you need to create a new Model Program resource by using the following steps, and then link it back to the Model Instance resource by using the ExecutedBy metadata element.

Step-by-step instructions for publishing the Model Program resource type

Steps 1 to 8 are the same as for creating a Model Instance resource, except that you select “Model Program” rather than “Model Instance” as the resource type.

9. Select the “Resource specific” tab, which includes the metadata specific to the created resource. The Model Program resource has ten additional metadata terms.

10. Under “Computational Engine,” select from the drop-down list the uploaded files related to the model computational engine (e.g., source code and binaries).

11. Under “Software,” select from the drop-down list the uploaded files related to the model software (e.g., executables and utility software).

12. Under “Documentation,” select from the drop-down list the uploaded files related to the model documentation (e.g., user manual, theoretical manual, and reports).

13. Under “Release Notes,” select from the drop-down list the uploaded files related to the model release notes.

14. Under “Release Date,” enter the date of the software release, which in this case is 2015-04-30.

15. Under “Version,” enter the SWMM model version, which is 5.1.009.

16. Under “Website,” enter a URL that provides additional information about the software, which in this case is http://www2.epa.gov/water-research/storm-water-management-model-swmm?#downloads.

17. Under “Language,” enter the programming language(s) the model was written in. In this case the programming language is C.

18. Under “Operating System,” enter the operating systems compatible with the hydrologic model, which in this case are Windows XP and Windows 7.

19. Under “Software Repository,” enter a URL to the source code repository (e.g., GitHub or Bitbucket). The SWMM model does not have one.

20. Click the “Save changes” button to save all entries.
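
The ten terms entered in steps 10 to 19 can be gathered in one record before typing them into the form. The following Python sketch is an illustration only: the class and attribute names are not HydroShare identifiers, and the file names are hypothetical. The other values are the ones given in the steps above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ModelProgramMetadata:
    """The ten resource-specific terms; attribute names are illustrative."""
    computational_engine: list
    software: list
    documentation: list
    release_notes: list
    release_date: date
    version: str
    website: str
    language: str
    operating_system: str
    software_repository: str = ""  # SWMM has none, per step 19

    def missing_terms(self):
        """Return the names of terms that are still empty."""
        return [name for name, value in vars(self).items() if not value]


swmm = ModelProgramMetadata(
    computational_engine=["swmm_engine_source.zip"],  # hypothetical file name
    software=["swmm_setup.exe"],                      # hypothetical file name
    documentation=["swmm_user_manual.pdf"],           # hypothetical file name
    release_notes=["swmm_release_notes.txt"],         # hypothetical file name
    release_date=date(2015, 4, 30),
    version="5.1.009",
    website="http://www2.epa.gov/water-research/storm-water-management-model-swmm?#downloads",
    language="C",
    operating_system="Windows XP and 7",
)

# Lists the terms left blank, here only the software repository.
print(swmm.missing_terms())
```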

The resource-specific metadata terms can be found in two places:

  • On the resource landing page, under “Resource Specific.”

  • In the downloaded resource content package, in the file called “resourcemetadata.xml.”

Here is the finished product! Rossman, L., T. Schade, D. Sullivan, R. Dickinson, C. Chan, E. Burgess (2015). Storm Water Management Model (SWMM), HydroShare, http://www.hydroshare.org/resource/2cddae40e9594c21b947fdbbe4225439

Appendix

Architecture of the Model Program and Instance resource types for HydroShare

The Model Program resource type describes the software component of a generic model within the water domain. This resource includes specific metadata that enables scientists to retrieve all the content and information required to get a model up and running. It also contains uploaded content such as source code, compiled binaries, and documentation.

The Model Instance resource type defines the input and output data for a generic hydrological model, for a specific time and place. This resource includes specific metadata to describe the model content as well as the Model Program resource that is used to execute a simulation.

One Model Program resource can be related to many Model Instance resources. Together they completely describe a simulation and the exact software version used for a particular study, which makes replication possible.
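
The one-to-many relationship described above can be sketched with two small Python classes. This is a conceptual illustration under stated assumptions, not HydroShare's internal data model: the class names, the `executed_by` attribute, and the helper `instances_for` are hypothetical. The titles and version are taken from the use case in this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProgram:
    name: str
    version: str


@dataclass
class ModelInstance:
    title: str
    executed_by: ModelProgram  # mirrors the ExecutedBy metadata term


def instances_for(program, instances):
    """Return the titles of all instances executed by the given program."""
    return [inst.title for inst in instances if inst.executed_by == program]


swmm = ModelProgram(name="EPA SWMM", version="5.1.009")
instances = [
    ModelInstance("Rocky Branch watershed simulation without rain gardens implementation", swmm),
    ModelInstance("Rocky Branch watershed simulation with rain gardens implementation", swmm),
]

# Both instances point back to the same program version.
for title in instances_for(swmm, instances):
    print(title)
```

Because each instance stores the exact program and version, a reader who downloads either instance knows which software release to run it with.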

HydroShare resources and their metadata term definitions
