Edit rescaled columns and add unit tests #337

Merged
merged 3 commits into from Mar 13, 2023

@lekoenig lekoenig commented Mar 9, 2023

This PR addresses comment (4) in #303 (comment). For split catchments, it now uses the rescaled characteristic values only for columns whose requested statistic is "area-weighted mean" or "sum." I suggest this change because I think a user who requests "min" expects to get the minimum value among the COMIDs that make up the aggregated id, not the minimum rescaled value.

Using the same example as linked above, here's what the output looks like for characteristic CAT_ELEV_MIN (previously, CAT_ELEV_MIN_min = 88.5 for id = 10024048 because 88.5 was the minimum rescaled value).

library(tidyverse)
 
d <- readRDS(list.files(pattern = "rescale_data.rds", recursive = TRUE, full.names = TRUE))
vars <- data.frame(characteristic_id = c("CAT_ELEV_MIN"),
                   summary_statistic = c("min"))

get_catchment_characteristics(varname = unique(vars$characteristic_id), 
                              ids = unique(d$lookup_table$comid)) |>
    left_join(d$lookup_table, by = "comid", multiple = "all") |>
    arrange(id)
#>    characteristic_id   comid characteristic_value percent_nodata member_comid       id LevelPathID
#> 1:      CAT_ELEV_MIN 4146596               796.09              0      4146596 10012268     2132667
#> 2:      CAT_ELEV_MIN 4147382               530.76              0      4147382 10012268     2132667
#> 3:      CAT_ELEV_MIN 4147370               616.66              0      4147370 10012979     2130154
#> 4:      CAT_ELEV_MIN 4147378               574.88              0      4147378 10012979     2130154
#> 5:      CAT_ELEV_MIN 4147396               487.49              0    4147396.1 10024047     2123206
#> 6:      CAT_ELEV_MIN 4147396               487.49              0    4147396.2 10024048     2123206
#> 7:      CAT_ELEV_MIN 4147380               530.24              0    4147380.1 10024049     2123206
#> 8:      CAT_ELEV_MIN 4147380               530.24              0    4147380.2 10024050     2123206
 
rescale_catchment_characteristics(vars, d$lookup_table, d$split_divides)
#> # A tibble: 6 x 5
#>         id areasqkm_sum lengthkm_sum percent_nodata_CAT_ELEV_MIN_area_wtd CAT_ELEV_MIN_min
#>      <dbl>        <dbl>        <dbl>                                <dbl>            <dbl>
#> 1 10012268        12.9          6.30                                    0             531.
#> 2 10012979         4.78         3.31                                    0             575.
#> 3 10024047         6.51         4.31                                    0             487.
#> 4 10024048         1.44         4.31                                    0             487.
#> 5 10024049         5.84         5.10                                    0             530.
#> 6 10024050         7.56         5.10                                    0             530. 
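
The selection rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the code in this PR: the table `split_tbl`, the column `area_fraction`, the helpers `value_column_for()` and `summarize_by_id()`, and the statistic labels are all hypothetical, and the sketch assumes a rescaled value is the raw value multiplied by the split's share of the original catchment area.

```r
# Hypothetical table of two splits of one source COMID (values follow the example above).
split_tbl <- data.frame(
  id            = c(10024047, 10024048),
  comid         = c(4147396, 4147396),
  raw_value     = c(487.49, 487.49),  # CAT_ELEV_MIN of the source COMID
  area_fraction = c(0.82, 0.18)       # assumed share of the original catchment area
)

# Assumption: "rescaled" means raw value times the split's area share.
split_tbl$rescaled_value <- split_tbl$raw_value * split_tbl$area_fraction

# Hypothetical helper: choose which column a requested statistic should read.
value_column_for <- function(summary_statistic) {
  if (summary_statistic %in% c("sum", "area_weighted_mean")) {
    "rescaled_value"
  } else {
    "raw_value"  # min, max, etc. use the values of the member COMIDs as-is
  }
}

# Hypothetical helper: summarize the chosen column for each aggregated id.
summarize_by_id <- function(tbl, summary_statistic, fun) {
  col <- value_column_for(summary_statistic)
  out <- aggregate(tbl[[col]], by = list(id = tbl$id), FUN = fun)
  names(out)[2] <- "value"
  out
}

min_by_id <- summarize_by_id(split_tbl, "min", min)  # reads raw_value
sum_by_id <- summarize_by_id(split_tbl, "sum", sum)  # reads rescaled_value
```

With this rule, a "min" request for id 10024048 returns the source COMID's 487.49 rather than a value shrunk by the split's area share.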

I've also added some unit tests that check the rescale calculations for aggregated and split catchments. These tests aren't exhaustive, but they help me think about how we rescale characteristics for split catchments. I realize that the doc for rescale_catchment_characteristics contains a note of caution about the rescaling (copied below), so let me know if you think these changes are out of scope.

#' @details
#' NOTE: Since this algorithm works on catchment characteristics that are
#' spatial averages, when splitting, the average condition is apportioned evenly
#' to each split. In some cases, such as with land cover or elevation, this may
#' not be appropriate and source data should be used to derive new characteristics.
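
To make the caveat in that note concrete, here is a small sketch of an area-weighted mean in which a split piece simply inherits its parent catchment's average value. The data frame `pieces`, its numbers, and the helper `area_weighted_mean()` are hypothetical and are not taken from the package.

```r
# Hypothetical aggregated id made of one whole catchment and one split piece.
pieces <- data.frame(
  id       = c(1, 1),
  value    = c(10, 20),  # spatial-average characteristic of each source catchment
  areasqkm = c(3, 1)     # adjusted area of each piece within the aggregated id
)

# Area-weighted mean: each piece contributes in proportion to its area.
area_weighted_mean <- function(value, area) {
  sum(value * area) / sum(area)
}

awm <- area_weighted_mean(pieces$value, pieces$areasqkm)
# (10 * 3 + 20 * 1) / (3 + 1) = 12.5
```

The split piece carries the parent's average (20 here) no matter which part of the parent it covers, which is why the note recommends deriving new values from source data for characteristics such as land cover or elevation.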

@@ -38,7 +47,6 @@ rescale_characteristics <- function(vars, lookup_table) {
#' "areasqkm." Used to retrieve adjusted catchment areas in the case of split
#' catchments.
#'
#' @importFrom sf st_drop_geometry
Author

I don't see sf::st_drop_geometry used in get_catchment_areas(), which is why I suggested omitting this line from the documentation.

Collaborator

I have a utility function in hydroloom called "drop_geometry":

drop_geometry <- function(x) {
