# Stream Temperature Retrospective (1990-2021) 

## Introduction

Stream temperature is a key driver of performance in aquatic species and directly reflects a changing climate. Siegel et al. (2023) developed a statistical stream temperature model and produced daily predictions for over 260,000 free-flowing stream reaches throughout the Pacific Northwest (PNW; Washington, Oregon, and Idaho) from 1990 through 2021. The model combined and enhanced elements of existing stream temperature models to reflect mechanistic processes, using publicly available climate and landscape covariates in a Generalized Additive Model (GAM) framework. The model was fit to empirical stream temperatures from the NorWeST database (1993–2013), and temporal and spatial covariates were interacted to allow nonlinear relationships that capture seasonal patterns. The model used a moving average of antecedent air temperatures, with a window size linked to streamflow: the window was longer for reaches with snow-dominated hydrology, especially at higher flows, and relatively constant and short for reaches with rain-dominated hydrology. The daily model fit well (RMSE = 1.76°C; MAE = 1.32°C), and cross-validation suggested that the model produced useful predictions at unsampled locations across diverse landscapes and environmental conditions.

We applied the Siegel et al. (2023) model, using covariates of future air temperature from an ensemble of 10 Global Circulation Models and snowpack and streamflow from a hydrological model, to predict stream temperatures across the PNW at a coarser spatial resolution (i.e., with fewer of the smaller streams) every day from 1950 through 2099. These predictions can be used to model potential freshwater thermal habitats for Pacific salmon and other species, and can inform conservation and restoration decisions. These future datasets are available in the "Stream Temperature Projections" projects.

### Project data

The project includes the following informational and data files:
- readme.html
- project_bounds.geojson
- project.rs.xml
- processing.log
- **daily_strem_temperature.db.zip**, a zipped database of the full set of daily stream temperature predictions (1990-2021) for each stream reach (i.e., each comid in the 1:100,000 National Hydrography Dataset).
- **daily_coviarates.db.zip**, a zipped database of the full set of daily temporal covariate values, by reach, for covariates used in the temperature prediction model, as well as many others that were computed but not used in the final model.
- **covariate_metadata.csv**, a table of the temporal and spatial covariates (see below) considered or used in the final temperature prediction model, with their symbology, definitions, and sources.
- **seasonal_anomalies_spatial_covariates.gpkg**, a geopackage containing the following layers (i.e., tables) of spatial features and attributes:


| Geopackage layer name | Data type | Description | Available views[<sup>1</sup>](#fn1) |
| --- | --- | --- | --- |
| flowlines | LineString | flowline geometries for each stream reach by comid, from the National Hydrography Dataset (NHD) V2 | |
| contributing area | MultiPolygon | lateral contributing area polygons for each stream reach, from NHDV2  | |
| HUC12_boundaries | Polygon | 12-digit hydrologic unit (HUC 12) polygons, from NHDV2 |  |
| spatial covariates | attribute | values of spatial attributes used or considered for the stream temperature prediction model | |
| stream_temperature_2010s | attribute | median seasonal[<sup>2</sup>](#fn2) stream temperature ($°C$) predictions across available data[<sup>3</sup>](#fn3) from years 2010-2019, by stream reach (i.e., *baseline temperature*)| polygons, lines |
| stream_temperature_2010s_byHUC12 | attribute | contributing-area$-$weighted[<sup>4</sup>](#fn4) seasonal median stream temperature predictions across calendar years 2010-2019, by HUC 12 (i.e., *baseline temperature*)| polygons |
| stream_temperature_anomalies_2000s | attribute | difference from *baseline temperature* (i.e., years 2010-2019) in seasonal median stream temperature ($°C$) prediction across calendar years 2000-2009, by stream reach | polygons, lines |
| stream_temperature_anomalies_2000s_byHUC12 | attribute | contributing-area$-$weighted difference from *baseline temperature* in seasonal median stream temperature ($°C$) prediction across calendar years 2000-2009, by HUC 12 | polygons |
| stream_temperature_anomalies_1990s | attribute | difference from *baseline temperature* in seasonal median stream temperature ($°C$) prediction across calendar years 1990-1999, by stream reach | polygons, lines |
| stream_temperature_anomalies_1990s_byHUC12 | attribute | contributing-area$-$weighted difference from *baseline temperature* in seasonal median stream temperature ($°C$) prediction across calendar years 1990-1999, by HUC 12 | polygons |
| temporal_covariates_2010s | attribute | median seasonal values for covariates $air\_temp\_ws$ ($°C$), $NWM\_flow\_log$ ($log(m^3/s)$), and $SwS$ ($m\cdot days$)[<sup>5</sup>](#fn5) across water years[<sup>6</sup>](#fn6) 2010-2019, by stream reach (i.e., *baseline values*).| polygons, lines |
| temporal_covariates_2010s_byHUC12 | attribute | contributing-area$-$weighted median seasonal values for covariates $air\_temp\_ws$ ($°C$), $NWM\_flow\_log$ ($log(m^3/s)$), $SwS$ ($m\cdot days$) across water years 2010-2019, by HUC 12 (i.e., *baseline values*).| polygons |
| temporal_covariates_anomalies_2000s | attribute | difference from *baseline values* for covariates $air\_temp\_ws$ ($°C$), $NWM\_flow\_log$ ($log(m^3/s)$), $SwS$ ($m\cdot days$) across water years 2000-2009, by stream reach.| polygons, lines |
| temporal_covariates_anomalies_2000s_byHUC12 | attribute | contributing-area$-$weighted difference from *baseline values* for covariates $air\_temp\_ws$ ($°C$), $NWM\_flow\_log$ ($log(m^3/s)$), $SwS$ ($m\cdot days$) across water years 2000-2009, by HUC 12.| polygons |
| temporal_covariates_anomalies_1990s | attribute | difference from *baseline values* for covariates $air\_temp\_ws$ ($°C$), $NWM\_flow\_log$ ($log(m^3/s)$), $SwS$ ($m\cdot days$) across water years 1990-1999, by stream reach.| polygons, lines |
| temporal_covariates_anomalies_1990s_byHUC12 | attribute | contributing-area$-$weighted difference from *baseline values* for covariates $air\_temp\_ws$ ($°C$), $NWM\_flow\_log$ ($log(m^3/s)$), $SwS$ ($m\cdot days$) across water years 1990-1999, by HUC 12. | polygons |

## Notes
<sup>1</sup><span id="fn1"> SQL views for given attribute tables appear as additional geopackage layers with "vw_" prepended to the original table name, followed by the shape (e.g., "vw_stream_temperature_2010s_polygons"). For attributes by reach, polygon views are the contributing areas and line views are flowlines. For attributes by HUC 12, polygon views are HUC 12 boundaries.</span>
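
The naming pattern in note 1 can be sketched in Python as follows. Because a GeoPackage is an SQLite database, the standard `sqlite3` module is enough to list the views. The helper names `view_name` and `list_views` are hypothetical and chosen only for illustration, and the sketch assumes the views are stored as ordinary SQLite views.

```python
import sqlite3


def view_name(table: str, shape: str) -> str:
    """Build a view name from an attribute table name and a shape.

    Assumes the pattern described in note 1: "vw_" + table + "_" + shape.
    """
    if shape not in ("polygons", "lines"):
        raise ValueError(f"unexpected shape: {shape!r}")
    return f"vw_{table}_{shape}"


def list_views(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all SQL views that start with "vw_"."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'view' ORDER BY name"
    )
    # Filter in Python because "_" is a wildcard in SQL LIKE patterns.
    return [name for (name,) in rows if name.startswith("vw_")]
```

For example, after `conn = sqlite3.connect("seasonal_anomalies_spatial_covariates.gpkg")`, calling `list_views(conn)` should return the view layers, and `view_name("stream_temperature_2010s", "polygons")` gives the example name from note 1.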

<sup>2</sup><span id="fn2"> Seasons were assigned based on calendar day of year (doy): $sp$ = spring (doy 80$-$171), $su$ = summer (doy 172$-$263), $fa$ = fall (doy 264$-$356), $wi$ = winter (doy $\leq$ 79 or doy $\geq$ 357).</span>
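
The day-of-year thresholds in note 2 can be written as a small lookup. This is a minimal sketch: the function name `season_from_doy` is hypothetical, and it covers only the calendar-day assignment, not the water-year adjustment described in note 6.

```python
def season_from_doy(doy: int) -> str:
    """Map a calendar day of year (1-366) to a season code from note 2."""
    if not 1 <= doy <= 366:
        raise ValueError(f"day of year out of range: {doy}")
    if 80 <= doy <= 171:
        return "sp"  # spring
    if 172 <= doy <= 263:
        return "su"  # summer
    if 264 <= doy <= 356:
        return "fa"  # fall
    return "wi"  # winter: doy <= 79 or doy >= 357
```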

<sup>3</sup><span id="fn3"> Missing data are coded as -999 in the databases and geopackage.  Sample sizes ($n$) are reported for all aggregate metrics in the geopackage layers (i.e., baseline and anomaly values and area-weighted medians) to aid users in understanding the input data and evaluating the output.</span>
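
When aggregating values read from the databases or geopackage, the -999 code must be excluded first. The sketch below shows one way to compute a median together with its sample size. The name `median_with_n` is hypothetical, and whether the reported $n$ values were computed exactly this way is an assumption.

```python
from statistics import median

MISSING = -999  # missing-data code stated in note 3


def median_with_n(values):
    """Return (median, n) over non-missing values, or (None, 0) if none remain."""
    valid = [v for v in values if v != MISSING]
    if not valid:
        return None, 0
    return median(valid), len(valid)
```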

<sup>4</sup><span id="fn4"> Area-weighted medians for metric $x$ by HUC 12 were computed as the area-weighted average of the medians for its contributing areas, where $med(x)_j$ is the median seasonal value of $x$ across years and $a_j$ is the area of contributing area $j$ ($j = 1, \dots, n$) in $HUC12_i$:
$$med(x)_{HUC12_i} = \frac{\sum_{j=1}^{n}{med(x)_j \cdot a_j}}{\sum_{j=1}^{n}{a_j}}$$</span>
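
The formula in note 4 translates directly into code. This is a minimal sketch with a hypothetical function name, `area_weighted`. It assumes missing values (-999) have already been removed and that areas share one unit.

```python
def area_weighted(medians, areas):
    """Area-weighted average of per-contributing-area medians (note 4).

    medians[j] is med(x)_j and areas[j] is a_j for contributing area j.
    """
    if len(medians) != len(areas):
        raise ValueError("medians and areas must have the same length")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(m * a for m, a in zip(medians, areas)) / total_area
```

For example, two contributing areas with medians 10.0 and 20.0 and areas 1.0 and 3.0 give (10.0 · 1.0 + 20.0 · 3.0) / 4.0 = 17.5.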

<sup>5</sup><span id="fn5"> Covariates are as defined in 'covariate_metadata.csv' with the exception of snow water storage ($SwS$), defined as the integrated area under the snow water equivalent ($SWE$) curve (Aragon and Hill, 2024).  Here, we summed daily values of $SWE$ ($m$) over each season $s$ to compute seasonal $SwS_s$ ($m\cdot days$).</span>
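
As a sketch of the seasonal sum in note 5, the function below accumulates daily $SWE$ by season. With a one-day time step, summing daily values in meters yields $m\cdot days$. The name `seasonal_sws` and the input layout (pairs of season code and daily $SWE$) are assumptions made for illustration, and missing values are assumed to have been removed beforehand.

```python
from collections import defaultdict


def seasonal_sws(daily_records):
    """Sum daily SWE (m) within each season to get SwS (m*days).

    daily_records is an iterable of (season_code, swe_m) pairs, one per day.
    """
    totals = defaultdict(float)
    for season, swe_m in daily_records:
        totals[season] += swe_m  # one-day step: m * 1 day
    return dict(totals)
```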

<sup>6</sup><span id="fn6"> Water years begin on 1 October and are named for the year in which they conclude (e.g., 15 November 2020 is in water year 2021). So that each water year comprises four continuous seasons, we reclassified the last several days in September (doy 267-274) as "summer." Because the first prediction is 1 January 1990, water year 1990 is incomplete: it lacks 1 October through 31 December 1989. </span>
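
The water-year rule in note 6 can be expressed in a few lines. This sketch uses a hypothetical function name, `water_year`, and covers only the year assignment, not the reclassification of late-September days.

```python
from datetime import date


def water_year(d: date) -> int:
    """Return the water year for a date: 1 October starts the next water year."""
    return d.year + 1 if d.month >= 10 else d.year
```

With this rule, `water_year(date(2020, 11, 15))` is 2021, which matches the example in note 6.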


## Caveats
The stream temperature predictions should be used with appropriate caution. Although predictions are made at high spatial and temporal resolution, their accuracy is limited by the spatiotemporal resolution of the covariates, which varied and in some cases was coarser than that of the predictions. In addition, the model was not designed to capture or reflect the effects of dams, so any predictions near dams or reservoirs should be used cautiously. Any temperatures above 35°C should be regarded as suspect; these could have been produced at boundaries where covariate values were artificially truncated, or in cases where GAM curves predicted anomalous outcomes.
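
One way to screen predictions before use is to label each value, as in the sketch below. The function name `classify_prediction` and the labels are hypothetical. The 35°C threshold comes from the caveat above and the -999 missing-data code from note 3.

```python
MISSING = -999        # missing-data code (note 3)
SUSPECT_ABOVE_C = 35  # threshold from the caveat above


def classify_prediction(temp_c: float) -> str:
    """Label one predicted temperature as 'missing', 'suspect', or 'ok'."""
    if temp_c == MISSING:
        return "missing"
    if temp_c > SUSPECT_ABOVE_C:
        return "suspect"
    return "ok"
```

A value labeled "suspect" is not necessarily wrong; it only warrants a closer look at the covariates for that reach and day.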

## References

Siegel JE, Fullerton AH, FitzGerald AM, Holzer D, Jordan CE (2023). Daily stream temperature predictions for free-flowing streams in the Pacific Northwest, USA. PLOS Water 2(8): e0000119. [https://doi.org/10.1371/journal.pwat.0000119](https://doi.org/10.1371/journal.pwat.0000119).

Aragon, C. M. and Hill, D. F. (2024). Changing snow water storage in natural snow reservoirs, Hydrol. Earth Syst. Sci., 28, 781–800. [https://doi.org/10.5194/hess-28-781-2024](https://doi.org/10.5194/hess-28-781-2024).