Edit rescaled columns and add unit tests #337

lekoenig · 2023-03-09T22:42:53Z

This PR addresses comment (4) in #303 (comment). For split catchments, it uses the rescaled characteristic values only for columns whose requested statistic is "area-weighted mean" or "sum". I suggest this change because I think a user who requests "min" expects to get the minimum value among the COMIDs that make up the aggregated id, not the minimum rescaled value.
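To make the intent concrete, here is a minimal base R sketch of the selection rule described above. It is not the code in this PR: the helper `value_column_for()`, the columns `split_area_prop` and `rescaled_value`, and the area proportions are all hypothetical, and the sketch assumes that a rescaled value is the original value multiplied by the split's share of the parent catchment area.

```r
# Toy data: one COMID split into two pieces (ids taken from the example in this PR).
# `split_area_prop` and its values are made up for illustration.
members <- data.frame(
  id = c(10024047, 10024048),
  comid = c(4147396, 4147396),
  characteristic_value = c(487.49, 487.49),
  split_area_prop = c(0.8, 0.2)
)

# Assumption: rescaling multiplies the value by the split's area proportion.
members$rescaled_value <- members$characteristic_value * members$split_area_prop

# Hypothetical helper: only "area-weighted mean" and "sum" read the rescaled
# column; every other statistic (e.g. "min") reads the original values.
value_column_for <- function(summary_statistic) {
  if (summary_statistic %in% c("area-weighted mean", "sum")) {
    "rescaled_value"
  } else {
    "characteristic_value"
  }
}

# Summarise the chosen column per aggregated id with the requested statistic.
col <- value_column_for("min")
min_by_id <- aggregate(members[[col]], by = list(id = members$id), FUN = min)
min_by_id
```

With this rule, "min" for each id is taken from the original values of its member COMIDs, so a split piece no longer reports an area-scaled elevation.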

Using the same example as linked above, here's what the output looks like for characteristic CAT_ELEV_MIN (previously, CAT_ELEV_MIN_min = 88.5 for id = 10024048 because 88.5 was the minimum rescaled value).

library(tidyverse)
 
d <- readRDS(list.files(pattern = "rescale_data.rds", recursive = TRUE, full.names = TRUE))
vars <- data.frame(characteristic_id = c("CAT_ELEV_MIN"),
                   summary_statistic = c("min"))

get_catchment_characteristics(varname = unique(vars$characteristic_id), 
                              ids = unique(d$lookup_table$comid)) |>
    left_join(d$lookup_table, by = "comid", multiple = "all") |>
    arrange(id)
#>    characteristic_id   comid characteristic_value percent_nodata member_comid       id LevelPathID
#> 1:      CAT_ELEV_MIN 4146596               796.09              0      4146596 10012268     2132667
#> 2:      CAT_ELEV_MIN 4147382               530.76              0      4147382 10012268     2132667
#> 3:      CAT_ELEV_MIN 4147370               616.66              0      4147370 10012979     2130154
#> 4:      CAT_ELEV_MIN 4147378               574.88              0      4147378 10012979     2130154
#> 5:      CAT_ELEV_MIN 4147396               487.49              0    4147396.1 10024047     2123206
#> 6:      CAT_ELEV_MIN 4147396               487.49              0    4147396.2 10024048     2123206
#> 7:      CAT_ELEV_MIN 4147380               530.24              0    4147380.1 10024049     2123206
#> 8:      CAT_ELEV_MIN 4147380               530.24              0    4147380.2 10024050     2123206
 
rescale_catchment_characteristics(vars, d$lookup_table, d$split_divides)
#> # A tibble: 6 x 5
#>         id areasqkm_sum lengthkm_sum percent_nodata_CAT_ELEV_MIN_area_wtd CAT_ELEV_MIN_min
#>      <dbl>        <dbl>        <dbl>                                <dbl>            <dbl>
#> 1 10012268        12.9          6.30                                    0             531.
#> 2 10012979         4.78         3.31                                    0             575.
#> 3 10024047         6.51         4.31                                    0             487.
#> 4 10024048         1.44         4.31                                    0             487.
#> 5 10024049         5.84         5.10                                    0             530.
#> 6 10024050         7.56         5.10                                    0             530.

I've also added some unit tests for the rescale calculations for aggregated and split catchments. These tests aren't exhaustive, but they help me think through how we rescale characteristics for split catchments. I realize that the documentation for rescale_catchment_characteristics contains a note of caution about the rescaling (copied below), so let me know if you think these changes are out of scope.

#' @details
#' NOTE: Since this algorithm works on catchment characteristics that are
#' spatial averages, when splitting, the average condition is apportioned evenly
#' to each split. In some cases, such as with land cover or elevation, this may
#' not be appropriate and source data should be used to derive new characteristics.
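As a small illustration of that note, the sketch below assumes that "apportioned evenly" means each split piece inherits the parent catchment's average value. The numbers are taken from the example output above (ids 10024047 and 10024048); the variable names are hypothetical and this is not the package's implementation.

```r
# Parent COMID 4147396 has CAT_ELEV_MIN = 487.49 and is split into two pieces.
parent_value <- 487.49
split_areas  <- c(6.51, 1.44)   # areasqkm_sum for ids 10024047 and 10024048

# Assumption: each split piece is assigned the parent's average condition.
split_values <- rep(parent_value, length(split_areas))

# Recombining the pieces with an area-weighted mean returns the parent value,
# regardless of how the area is divided between them.
wtd_mean <- weighted.mean(split_values, w = split_areas)
isTRUE(all.equal(wtd_mean, parent_value))
```

This is also why the note urges caution: the real minimum elevation within each piece may differ from 487.49, and only the source data can show that.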

lekoenig · 2023-03-09T22:46:01Z

R/rescale_catchments.R

@@ -38,7 +47,6 @@ rescale_characteristics <- function(vars, lookup_table) {
 #' "areasqkm." Used to retrieve adjusted catchment areas in the case of split
 #' catchments.
 #'
-#' @importFrom sf st_drop_geometry


I don't see sf::st_drop_geometry used in get_catchment_areas(), which is why I suggested omitting this line from the documentation.

I have a utility function in hydroloom called "drop_geometry":

nhdplusTools/R/A_nhdplusTools.R

Line 395 in a7446dd

drop_geometry <- function(x) {

lkoenig-usgs added 3 commits March 9, 2023 15:12

add unit tests to verify rescale calculations

b59c4dc

update cols used for rescaling attributes

51b0b5d

make minor documentation edits in rescale_catchment_characteristics

997549c

lekoenig commented Mar 9, 2023

dblodgett-usgs approved these changes Mar 13, 2023

dblodgett-usgs merged commit 970ddf6 into DOI-USGS:hydroloom Mar 13, 2023

lekoenig mentioned this pull request Mar 27, 2023

Ability to split / aggregate characteristics based on hydrofab refactor workflow. #303

Closed
