Analysis of daily stream temperature data for the northeastern US using the conteStreamTemperature package
Old_Rmd
Rmd
TMB_Code
code
formatting
localData_med_res
manuscripts
pdf
reports
vignettes
warm_sites
.DS_Store
.gitignore
Derived_Metric_Summary.Rmd
Derived_Metric_Summary.html
License.md
Project_Web_Page_Summary.Rmd
Project_Web_Page_Summary.html
Project_Web_Page_Summary.md
README.md
Tables_Figures.Rmd
conteStreamTemperature_northeast.Rproj
current_model_run.txt
model_config.json
predictions_summary.Rmd
run_model.sh
run_model2.sh

README.md

Conte Stream Temperature Model for Northeastern Headwater Streams

This is the project folder for the stream temperature work underway at the USGS S.O. Conte Anadromous Fish Research Center in Turners Falls, MA.

The stream temperature model estimates the effects of landscape variables (% forest cover, % agriculture, drainage area, etc.) and time-varying variables (solar radiation, air temperature, precipitation, etc.) on daily stream water temperature. For each site/year combination, the estimates are limited to the part of the year when air temperature and water temperature are synchronized, to avoid issues with ice cover and phase changes.

More documentation can be found at http://conte-ecology.github.io/conteStreamTemperature_northeast/

The model and R package used in this analysis can be found at: https://github.com/Conte-Ecology/conteStreamTemperature

Automated Model Run

requires:

  1. Log into GNU screen from the command line on the local computer (optional - see tutorial https://www.rackaid.com/blog/linux-screen-tutorial-and-how-to/)
  2. Connect to osensei via ssh to run the model (optional: assumes running model on osensei)
  3. Start persistent screen session
  4. Go to conteStreamTemperature_northeast/ directory (or clone from GitHub if first use)
  5. Set validate, debug, and MAD options in the model_config.json file
  6. Run run_model.sh bash script

Model Configuration Details (model_config.json)

  • validate - boolean indicating whether to hold out data for validation. Defaults to true.
  • debug - boolean indicating whether to turn on debug features. Defaults to true.
  • mad_tf - boolean indicating whether to use the Median Absolute Deviation (MAD) in the automated QAQC. Defaults to false. It adds 12+ hours to the model run, increasing non-linearly with increasing data, and the cutoffs are somewhat arbitrary and largely unexplored. It also removes only single days, not full out-of-water segments.
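As an illustration of the options above, here is a minimal sketch of what a configuration might look like. The flat JSON layout and the scratch file name model_config.example.json are assumptions made for this example; check the model_config.json shipped with the repository for the actual structure before editing it.

```shell
# Write an example configuration to a scratch file so that an existing
# model_config.json is not overwritten. The flat layout is an assumption;
# only the three option names come from the list above.
cat > model_config.example.json <<'EOF'
{
  "validate": true,
  "debug": false,
  "mad_tf": false
}
EOF

# Show which options are switched on
grep -n 'true' model_config.example.json
```

Leaving mad_tf set to false keeps the default behaviour and avoids the extra 12+ hours of run time described above.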

Example:

$ screen -S osensei
$ ssh dan@osensei.cns.umass.edu
# $ screen -S temperature
$ cd conteStreamTemperature_northeast
$ screen -d -m -S temperature bash run_model.sh

*$ represents the command-line prompt

This script creates a new directory named "modelRun_" followed by the date the run was initiated (e.g. modelRun_2016-06-27). Within this directory, a log file (status_log.txt) records an update each time a new part of the script starts.
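Because the run is started in a detached screen session, the log file is the easiest way to check progress. The sketch below assumes the run was started today and that the run directory sits in the current working directory; adjust run_dir if either assumption does not hold.

```shell
# Assumes the run was started today from the current directory, so the run
# directory is modelRun_<today>. Change run_dir if the run lives elsewhere.
run_dir="modelRun_$(date +%Y-%m-%d)"
log_file="$run_dir/status_log.txt"

if [ -f "$log_file" ]; then
    tail -n 20 "$log_file"   # most recent status updates
else
    echo "No status log found at $log_file"
fi
```

Replacing `tail -n 20` with `tail -f` follows the log as new lines are written, and `screen -r temperature` reattaches to the detached session started in the example above.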

Manual Model Run

Follow these steps to run the model manually:

CL = command line in the conteStreamTemperature_northeast directory on osensei or a local machine. These commands should be run in a screen session; otherwise the connection to the database will time out (more info: https://www.rackaid.com/blog/linux-screen-tutorial-and-how-to/)

  1. Set the model configuration (model_config.json)
  2. Set the model run directory in the file "current_model_run.txt". This should be a single line naming the subdirectory (e.g. modelRun/modelRun_2016-06-30). You will also need to create this subdirectory in the folder.
  3. Determine which locations are near impoundments (CL: bash id_impoundment_sites.sh sheds <subdirectory/>)
  4. Determine which locations are potentially tidally influenced (e.g. CL: bash id_tidal_sites.sh sheds <username> <subdirectory/>)
  5. Fetch data that are reviewed (retrieve_db.R)
  6. Fetch daymet data (i.e. CL: psql -f <subdirectory>/code/daymet_query.sql -d sheds -w > <subdirectory>/daymet_results.csv)
  7. Determine breakpoints of synchronized season (breakpoints.R)
  8. Prepare data for use in model (prepare_model_data.R)
  9. Run the statistical model (run_model.R)
  10. Get MCMC and model diagnostics (mcmc_diagnostics.R)
  11. Summarize the model (summarize_iterations.R)
  12. Validate model (validate_model.R)
  13. Calculate derived metrics for all catchments (predict_temperature_parallel.R)
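Step 2 above can be sketched as follows. This is a minimal example that assumes it is run from the conteStreamTemperature_northeast directory and that the run should be named after today's date, following the modelRun/modelRun_2016-06-30 pattern; the date-based name is a convention borrowed from the automated run, not a requirement stated here.

```shell
# Name the run directory after today's date (assumed naming convention)
run_dir="modelRun/modelRun_$(date +%Y-%m-%d)"

# Create the subdirectory; -p also creates modelRun/ if it is missing
mkdir -p "$run_dir"

# current_model_run.txt must hold a single line naming the subdirectory
printf '%s\n' "$run_dir" > current_model_run.txt

cat current_model_run.txt
```

The remaining steps then read the run directory from current_model_run.txt, so this file should be updated before each new manual run.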

Covariates

| Variable | Description | Source | Processing | GitHub Repository |
|---|---|---|---|---|
| Total Drainage Area | The total contributing drainage area from the entire upstream network | The SHEDS Data project | The individual polygon areas are summed for all of the catchments in the contributing network | NHDHRDV2 |
| Riparian Forest Cover | The percentage of the upstream 61 m (200 ft) riparian buffer area that is covered by trees taller than 5 meters | The National Land Cover Database (NLCD) | All of the NLCD forest type classifications are combined and attributed to each riparian buffer polygon using GIS tools. All upstream polygon values are then aggregated. | nlcdLandCover |
| Daily Precipitation | The daily precipitation record for the individual local catchment | Daymet Daily Surface Weather and Climatological Summaries | Daily precipitation records are spatially assigned to each catchment based on overlapping grid cells using the zonalDaymet R package | daymet |
| Upstream Impounded Area | The total area in the contributing drainage basin that is covered by wetlands, lakes, or ponds that intersect the stream network | U.S. Fish & Wildlife Service (FWS) National Wetlands Inventory | All freshwater surface water bodies are attributed to each catchment using GIS tools. All upstream polygon values are then aggregated. | fwsWetlands |
| Percent Agriculture | The percentage of the contributing drainage area that is covered by agricultural land (e.g. cultivated crops, orchards, and pasture), including fallow land | The National Land Cover Database (NLCD) | All of the NLCD agricultural classifications are combined and attributed to each catchment polygon using GIS tools. All upstream polygon values are then aggregated. | nlcdLandCover |
| Percent High Intensity Developed | The percentage of the contributing drainage area covered by places where people work or live in high numbers (typically defined as areas covered by more than 80% impervious surface) | The National Land Cover Database (NLCD) | The NLCD high intensity developed classification is attributed to each catchment polygon using GIS tools. All upstream polygon values are then aggregated. | nlcdLandCover |

Derived Metrics

| Object | Metric | Description |
|---|---|---|
| meanMaxTemp | Mean maximum temperature | Maximum daily mean water temperature (°C) averaged over 36 years (1980 - 2015) |
| maxMaxTemp | Max maximum temperature | Maximum over years of the maximum daily mean temperature |
| meanJulyTemp | Mean July temperature | Mean daily July temperature over years |
| meanAugTemp | Mean August temperature | Mean daily August temperature over years |
| meanSummerTemp | Mean summer temperature | Mean daily summer temperature over years |
| mean30DayMax | Mean 30-day maximum temperature | Maximum 30-day temperature for each year averaged over years |
| meanDays.18 | Mean number of days over 18 °C | Mean number of days per year the mean daily temperature exceeds 18 °C |
| meanDays.22 | Mean number of days over 22 °C | Mean number of days per year the mean daily temperature exceeds 22 °C |
| freqMaxTemp.18 | Annual frequency of exceeding 18 °C | Frequency of years the mean daily temperature ever exceeds 18 °C |
| freqMaxTemp.22 | Annual frequency of exceeding 22 °C | Frequency of years the mean daily temperature ever exceeds 22 °C |
| meanResist | Mean annual resistance | Mean annual resistance of water temperature to peak (summer) air temperature |
| TS | Thermal sensitivity | Thermal sensitivity of water temperature to changes in air temperature |
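To make the definitions concrete, the sketch below computes three of the metrics (meanMaxTemp, meanDays.18, and freqMaxTemp.18) for a single catchment using awk. It is not the project's implementation, which lives in the R scripts listed earlier. The input file daily_temps_example.csv, its two-column year,temp layout, and the six sample values are all made up for illustration.

```shell
# Hypothetical daily mean temperatures (C) for one catchment, two years
cat > daily_temps_example.csv <<'EOF'
year,temp
2014,17.0
2014,19.5
2014,20.0
2015,16.0
2015,17.5
2015,15.0
EOF

metrics=$(awk -F, 'NR > 1 {
    y = $1; t = $2 + 0
    # Track the annual maximum and the number of days above 18 C per year
    if (!(y in ymax) || t > ymax[y]) ymax[y] = t
    if (!(y in days)) days[y] = 0
    if (t > 18) days[y]++
}
END {
    for (y in ymax) {
        n++
        sum_max  += ymax[y]             # for meanMaxTemp
        sum_days += days[y]             # for meanDays.18
        if (ymax[y] > 18) exceed++      # years that ever exceed 18 C
    }
    printf "meanMaxTemp=%.2f meanDays.18=%.2f freqMaxTemp.18=%.2f",
           sum_max / n, sum_days / n, exceed / n
}' daily_temps_example.csv)

echo "$metrics"
# prints: meanMaxTemp=18.75 meanDays.18=1.00 freqMaxTemp.18=0.50
```

In the sample data only 2014 exceeds 18 C (on two days), so the mean number of days over 18 C is 1 and the annual exceedance frequency is 0.5. The 22 C variants follow by changing the threshold.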