waywiser: Ergonomic Methods for Assessing Spatial Models #571

Closed
12 of 20 tasks
mikemahoney218 opened this issue Jan 11, 2023 · 69 comments

@mikemahoney218
Member

mikemahoney218 commented Jan 11, 2023

Date accepted: 2023-02-27
Submitting Author Name: Mike Mahoney
Submitting Author Github Handle: @mikemahoney218
Repository: https://github.com/mikemahoney218/waywiser
Version submitted:
Submission type: Stats
Badge grade: silver
Editor: @Paula-Moraga
Reviewers: @becarioprecario, @jakub_nowosad, @Nowosad

Due date for @becarioprecario: 2023-02-04

Due date for @jakub_nowosad: 2023-02-06
Due date for @Nowosad: 2023-02-06
Archive: TBD
Version accepted: TBD
Language: en

  • Paste the full DESCRIPTION file inside a code block below:
Type: Package
Package: waywiser
Title: Ergonomic Methods for Assessing Spatial Models
Version: 0.2.0.9000
Authors@R: c(
    person("Michael", "Mahoney", , "mike.mahoney.218@gmail.com", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0003-2402-304X")),
    person("Lucas", "Johnson", , "lucas.k.johnson03@gmail.com", role = c("ctb"),
           comment = c(ORCID = "0000-0002-7953-0260")),
    person("RStudio", role = c("cph", "fnd"))
  )
Description: Assessing predictive models of spatial data can be challenging, 
    both because these models are typically built for extrapolating outside the
    original region represented by training data and due to potential spatially
    structured errors, with "hot spots" of higher than expected error
    clustered geographically due to spatial structure in the underlying
    data. Methods are provided for assessing models fit to spatial data, 
    including approaches for measuring the spatial structure of model errors,
    assessing model predictions at multiple spatial scales, and evaluating where 
    predictions can be made safely. Methods are particularly useful for models 
    fit using the 'tidymodels' framework. Methods include Moran's I
    ('Moran' (1950) <doi:10.2307/2332142>), Geary's C 
    ('Geary' (1954) <doi:10.2307/2986645>), Getis-Ord's G
    ('Ord' and 'Getis' (1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>),
    agreement coefficients from 'Ji' and Gallo (2006) 
    (<doi: 10.14358/PERS.72.7.823>), agreement metrics from 'Willmott' (1981)
    (<doi: 10.1080/02723646.1981.10642213>) and 'Willmott' 'et' 'al'. (2012)
    (<doi: 10.1002/joc.2419>), an implementation of the area of applicability 
    methodology from 'Meyer' and 'Pebesma' (2021) 
    (<doi:10.1111/2041-210X.13650>), and an implementation of
    multi-scale assessment as described in 'Riemann' 'et' 'al'. (2010)
    (<doi:10.1016/j.rse.2010.05.010>).
License: MIT + file LICENSE
URL: https://github.com/mikemahoney218/waywiser,
    https://mikemahoney218.github.io/waywiser/
BugReports: https://github.com/mikemahoney218/waywiser/issues
Depends: 
    R (>= 3.6)
Imports: 
    dplyr,
    fields,
    FNN,
    glue,
    hardhat,
    Matrix,
    purrr,
    rlang,
    rsample,
    sf (>= 1.0-0),
    spdep (>= 1.1-9),
    stats,
    tibble,
    tidyselect,
    yardstick
Suggests: 
    applicable,
    caret,
    CAST,
    covr,
    ggplot2,
    knitr,
    modeldata,
    recipes,
    rmarkdown,
    spatialsample,
    spelling,
    testthat (>= 3.0.0),
    tidymodels,
    tidyr,
    tigris,
    vip,
    whisker,
    withr
Config/testthat/edition: 3
Config/testthat/parallel: true
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE, roclets = c("namespace", "rd", "srr::srr_stats_roclet"))
RoxygenNote: 7.2.3
Language: en-US
VignetteBuilder: knitr

Scope

  • Please indicate which of our statistical package categories this package falls under. (Please check one appropriate box below):

    Statistical Packages

    • Bayesian and Monte Carlo Routines
    • Dimensionality Reduction, Clustering, and Unsupervised Learning
    • Machine Learning
    • Regression and Supervised Learning
    • Exploratory Data Analysis (EDA) and Summary Statistics
    • Spatial Analyses
    • Time Series Analyses

Pre-submission Inquiry

  • A pre-submission inquiry has been approved in issue#565

General Information

  • Who is the target audience and what are scientific applications of this package?

Anyone fitting models to spatial data, particularly (but not exclusively) people working within the tidymodels ecosystem. This includes a number of domains, and we've already been using it in our modeling practice.

  • Paste your responses to our General Standard G1.1 here, describing whether your software is:

    • The first implementation of a novel algorithm; or
    • The first implementation within R of an algorithm which has previously been implemented in other languages or contexts; or
    • An improvement on other implementations of similar algorithms in R.

    Please include hyperlinked references to all other relevant software.

The waywiser R package makes it easier to measure the performance of models fit to 2D spatial data by implementing a number of well-established assessment methods in a consistent, ergonomic toolbox; features include new yardstick metrics for measuring agreement and spatial autocorrelation, functions to assess model predictions across multiple scales, and methods to calculate the area of applicability of a model.
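As a point of reference for the spatial autocorrelation metrics mentioned above, global Moran's I can be written in a few lines of base R. This is a minimal sketch of the textbook statistic, not waywiser's or spdep's implementation; the function name `morans_i` and the toy weights matrix are assumptions made for illustration.

```r
# Global Moran's I for a numeric vector x and an n x n spatial weights matrix w.
# Hypothetical helper for illustration; not part of waywiser or spdep.
morans_i <- function(x, w) {
  stopifnot(is.numeric(x), is.matrix(w),
            nrow(w) == length(x), ncol(w) == length(x))
  n <- length(x)
  z <- x - mean(x)   # deviations from the mean
  s0 <- sum(w)       # sum of all weights
  (n / s0) * sum(w * outer(z, z)) / sum(z^2)
}

# Toy layout: four locations on a line, where adjacent locations are neighbours.
w <- matrix(0, 4, 4)
w[cbind(1:3, 2:4)] <- 1
w[cbind(2:4, 1:3)] <- 1

resid_clustered   <- c(-2, -1, 1, 2)  # similar residuals sit next to each other
resid_alternating <- c(-1, 1, -1, 1)  # neighbours always have opposite signs

morans_i(resid_clustered, w)    # positive: errors are spatially clustered
morans_i(resid_alternating, w)  # negative: errors alternate between neighbours
```

Positive values indicate that similar residuals cluster in space, which is the "hot spot" pattern the package description refers to.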

Relevant software implementing similar algorithms includes CAST, for ww_area_of_applicability(). Several of the yardstick metrics wrap spdep directly behind a more consistent interface. Willmott's D is also implemented in hydroGOF. Other functions have (as far as I am aware) not been implemented elsewhere, such as ww_multi_scale(), which implements the procedure from Riemann et al. (2010), or ww_agreement_coefficient(), which implements metrics from Ji and Gallo (2006).
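For readers unfamiliar with the agreement metrics, the index of agreement from Willmott (1981) can be sketched in base R as follows. This is an illustrative version of the published formula under the assumption of complete, equal-length inputs; `willmott_d` is a hypothetical name and not the function exported by waywiser or hydroGOF.

```r
# Willmott's (1981) index of agreement d:
#   d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)
# Hypothetical helper for illustration only.
willmott_d <- function(truth, estimate) {
  stopifnot(is.numeric(truth), is.numeric(estimate),
            length(truth) == length(estimate))
  obs_mean <- mean(truth)
  squared_error <- sum((estimate - truth)^2)
  potential_error <- sum((abs(estimate - obs_mean) + abs(truth - obs_mean))^2)
  1 - squared_error / potential_error
}

truth <- c(1, 2, 3, 4, 5)

willmott_d(truth, truth)      # perfect agreement gives 1
willmott_d(truth, truth + 1)  # a constant bias lowers the index
```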

N/A

Badging

Silver

Have a demonstrated generality of usage beyond one single envisioned use case. Software is frequently developed for one particular use case envisioned by the authors themselves. Generalising the utility of software so that it is readily applicable to other use cases, and satisfactorily documenting such generality of usage, represents another aspect which may be considered sufficient for software to attain a silver grade.

This is the primary aspect which I believe merits the silver status. The waywiser package implements routines which are useful for a wide variety of spatial models and integrates well with the tidymodels ecosystem, making it (hopefully!) of interdisciplinary interest.

Depending on what the editors think, I'd also potentially submit this for gold, based upon the following two aspects:

Compliance with a good number of standards beyond those identified as minimally necessary. This will require reviewers and authors to agree on identification of both a minimal subset of necessary standards, and a full set of potentially applicable standards. This aspect may be considered fulfilled if at least one quarter of the additional potentially applicable standards have been met, and should definitely be considered fulfilled if more than one half have been met.

Internal aspects of package structure and design. Many aspects of the internal structure and design of software are too variable to be effectively addressed by standards. Packages which are judged by reviewers to reflect notably excellent design choices, especially in the implementation of core statistical algorithms, may also be considered worthy of a silver grade.

But I'm not familiar enough with the system to know if waywiser is likely to be in compliance with these two aspects, and am comfortable submitting for "silver" status if waywiser does not obviously meet both.

Technical checks

Confirm each of the following by checking the box.

  • I have read the rOpenSci packaging guide.
  • I have read the author guide and I expect to maintain this package for at least 2 years or have another maintainer identified.
  • I/we have read the Statistical Software Peer Review Guide for Authors.
  • I/we have run autotest checks on the package, and ensured no tests fail. (Sorry, both the release and CRAN versions of autotest fail immediately on my machine with internal errors -- that is, from autotest itself and not from my package -- and therefore I have not been able to use it).
  • The srr_stats_pre_submit() function confirms this package may be submitted.
  • The pkgcheck() function confirms this package may be submitted - alternatively, please explain reasons for any checks which your package is unable to pass.

This package:

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you intend for this package to go on Bioconductor?

Code of conduct

@ropensci-review-bot
Collaborator

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

@ropensci-review-bot
Collaborator

🚀

The following problem was found in your submission template:

  • 'statsgrade' variable must be one of [bronze, silver, gold]
    Editors: Please ensure these problems with the submission template are rectified. Package checks have been started regardless.

👋

@mikemahoney218
Member Author

Sorry! The instructions at the top of the issue told me to not change anything other than the repo URL and GitHub handles -- this might be something to update in the issue template:

Below, please enter values for (1) submitting author GitHub handle (replacing "@github_handle@); and (2) Repository URL (replacing "https://repourl"). Values for additional package authors may also be specified, replacing "@github_handle1", "@github_handle2" - delete these if not needed. DO NOT DELETE HTML SYMBOLS (everything between "<!" and ">"). Replace only "@github_handle" and "https://repourl". This comment may be deleted once it has been read and understood.

@ropensci-review-bot
Collaborator

Note: The following R packages were unable to be installed/upgraded on our system: [tigris, spatialsample, spdep]; some checks may be unreliable.

@ropensci-review-bot
Collaborator

Oops, something went wrong with our automatic package checks. Our developers have been notified and package checks will appear here as soon as we've resolved the issue. Sorry for any inconvenience.

@ropensci-review-bot
Collaborator

ropensci-review-bot commented Jan 12, 2023

Checks for waywiser (v0.2.0.9000)

git hash: b8816249

  • ✔️ Package is already on CRAN.
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✖️ Package coverage failed
  • ✖️ R CMD check process failed with message: 'Build process failed'.

Important: All failing checks above must be addressed prior to proceeding

Package License: MIT + file LICENSE


1. rOpenSci Statistical Standards (srr package)

This package is in the following category:

  • Spatial

✖️ Package can not be submitted because the following standards [v0.2.0] are missing from your code:

SP2.1
SP2.2
SP2.2a
SP2.2b

Click to see the report of author-generated standards compliance of the package with links to associated lines of code, which can be generated locally by running the srr_report() function from within a local clone of the repository.


2. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

| type | package | ncalls |
|---|---|---|
| internal | waywiser | 81 |
| internal | base | 80 |
| internal | utils | 33 |
| internal | graphics | 2 |
| imports | stats | 42 |
| imports | yardstick | 23 |
| imports | rlang | 11 |
| imports | purrr | 7 |
| imports | spdep | 4 |
| imports | hardhat | 3 |
| imports | sf | 3 |
| imports | glue | 2 |
| imports | rsample | 2 |
| imports | tidyselect | 2 |
| imports | dplyr | 1 |
| imports | fields | 1 |
| imports | FNN | 1 |
| imports | Matrix | 1 |
| imports | tibble | 1 |
| suggests | applicable | NA |
| suggests | caret | NA |
| suggests | CAST | NA |
| suggests | covr | NA |
| suggests | ggplot2 | NA |
| suggests | knitr | NA |
| suggests | modeldata | NA |
| suggests | recipes | NA |
| suggests | rmarkdown | NA |
| suggests | spatialsample | NA |
| suggests | spelling | NA |
| suggests | testthat | NA |
| suggests | tidymodels | NA |
| suggests | tidyr | NA |
| suggests | tigris | NA |
| suggests | vip | NA |
| suggests | whisker | NA |
| suggests | withr | NA |
| linking_to | NA | NA |

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

waywiser

calc_ssd (4), check_for_missing (3), gmfr (3), calc_spdu (2), calc_spod (2), ww_area_of_applicability (2), calc_aoa (1), calc_d_bar (1), calc_di (1), calc_spds (1), check_di_columns_numeric (1), check_di_importance (1), check_di_testing (1), create_aoa (1), expand_grid (1), is_longlat (1), predict.ww_area_of_applicability (1), print.ww_area_of_applicability (1), spatial_yardstick_df (1), spatial_yardstick_vec (1), standardize_and_weight (1), tidy_importance (1), tidy_importance.data.frame (1), tidy_importance.default (1), tidy_importance.vi (1), ww_agreement_coefficient_impl (1), ww_agreement_coefficient_vec (1), ww_agreement_coefficient.data.frame (1), ww_area_of_applicability.data.frame (1), ww_area_of_applicability.default (1), ww_area_of_applicability.formula (1), ww_area_of_applicability.rset (1), ww_build_neighbors (1), ww_build_weights (1), ww_global_geary_c_impl (1), ww_global_geary_c_vec (1), ww_global_geary_c.data.frame (1), ww_global_geary_pvalue_impl (1), ww_global_geary_pvalue_vec (1), ww_global_geary_pvalue.data.frame (1), ww_global_moran_i_impl (1), ww_global_moran_i_vec (1), ww_global_moran_i.data.frame (1), ww_global_moran_pvalue_impl (1), ww_global_moran_pvalue_vec (1), ww_global_moran_pvalue.data.frame (1), ww_local_geary_c_impl (1), ww_local_geary_c_vec (1), ww_local_geary_c.data.frame (1), ww_local_geary_pvalue_impl (1), ww_local_geary_pvalue_vec (1), ww_local_geary_pvalue.data.frame (1), ww_local_getis_ord_g_impl (1), ww_local_getis_ord_g_pvalue_vec (1), ww_local_getis_ord_g_pvalue.data.frame (1), ww_local_getis_ord_g_vec (1), ww_local_getis_ord_g.data.frame (1), ww_local_getis_ord_pvalue_impl (1), ww_local_moran_i_impl (1), ww_local_moran_i_vec (1), ww_local_moran_i.data.frame (1), ww_local_moran_pvalue_impl (1), ww_local_moran_pvalue_vec (1), ww_local_moran_pvalue.data.frame (1), ww_make_point_neighbors (1), ww_make_polygon_neighbors (1), ww_multi_scale (1), ww_systematic_agreement_coefficient_impl (1), 
ww_systematic_agreement_coefficient_vec (1), ww_systematic_agreement_coefficient.data.frame (1), ww_systematic_mpd.data.frame (1)

base

c (10), call (7), data.frame (7), mean (7), list (6), class (4), sum (4), if (3), nrow (3), abs (2), all (2), identical (2), inherits (2), is.na (2), names (2), unlist (2), any (1), character (1), drop (1), get (1), integer (1), length (1), missing (1), ncol (1), paste0 (1), round (1), seq_len (1), setdiff (1), sign (1), sqrt (1), unique (1)

stats

resid (20), dt (10), na.fail (6), lm (2), predict (2), complete.cases (1), cor (1)

utils

data (33)

yardstick

new_numeric_metric (23)

rlang

caller_env (6), exec (2), expr (2), list2 (1)

purrr

map (4), chuck (1), map_dbl (1), map_lgl (1)

spdep

knearneigh (1), localC_perm (1), localG_perm (1), Szero (1)

hardhat

mold (2), default_formula_blueprint (1)

sf

st_bbox (1), st_geometry_type (1), st_intersects (1)

glue

glue (2)

graphics

grid (2)

rsample

analysis (1), assessment (1)

tidyselect

eval_select (2)

dplyr

summarise (1)

fields

rdist (1)

FNN

knn.dist (1)

Matrix

mean (1)

tibble

tibble (1)


3. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 15 files) and
  • 1 authors
  • 3 vignettes
  • no internal data file
  • 15 imported packages
  • 95 exported functions (median 3 lines of code)
  • 173 non-exported functions in R (median 11 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

| measure | value | percentile | noteworthy |
|---|---|---|---|
| files_R | 15 | 73.0 | |
| files_vignettes | 3 | 92.4 | |
| files_tests | 39 | 98.8 | |
| loc_R | 1602 | 79.6 | |
| loc_vignettes | 345 | 68.1 | |
| loc_tests | 6814 | 98.8 | TRUE |
| num_vignettes | 3 | 94.2 | |
| n_fns_r | 268 | 93.1 | |
| n_fns_r_exported | 95 | 95.0 | TRUE |
| n_fns_r_not_exported | 173 | 91.9 | |
| n_fns_per_file_r | 9 | 84.5 | |
| num_params_per_fn | 4 | 54.6 | |
| loc_per_fn_r | 9 | 24.3 | |
| loc_per_fn_r_exp | 3 | 1.5 | TRUE |
| loc_per_fn_r_not_exp | 11 | 35.4 | |
| rel_whitespace_R | 18 | 79.7 | |
| rel_whitespace_vignettes | 22 | 54.6 | |
| rel_whitespace_tests | 19 | 98.9 | TRUE |
| doclines_per_fn_exp | 109 | 92.6 | |
| doclines_per_fn_not_exp | 0 | 0.0 | TRUE |
| fn_call_network_size | 99 | 79.1 | |
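The "noteworthy" rule described before the table (upper or lower 5th percentile) can be sketched in base R. The helper `flag_noteworthy` is hypothetical and only applies the stated threshold to a handful of rows copied from the table; it is not the code the bot runs.

```r
# A few rows copied from the table above.
measures <- data.frame(
  measure = c("loc_R", "loc_tests", "n_fns_r_exported",
              "num_params_per_fn", "loc_per_fn_r_exp",
              "doclines_per_fn_not_exp"),
  percentile = c(79.6, 98.8, 95.0, 54.6, 1.5, 0.0)
)

# Flag values in the lower or upper `tail` percent of the distribution.
# Hypothetical helper; the threshold follows the wording of the report.
flag_noteworthy <- function(percentile, tail = 5) {
  percentile <= tail | percentile >= 100 - tail
}

measures$noteworthy <- flag_noteworthy(measures$percentile)
measures
```

For these rows the sketch reproduces the flags in the table. The table leaves files_tests (98.8) unflagged, so the bot presumably applies the rule only to a subset of measures; the report does not say which.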

3a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


4. goodpractice and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

R-CMD-check.yaml

GitHub Workflow Results

| id | name | conclusion | sha | run_number | date |
|---|---|---|---|---|---|
| 3898338974 | Lock Threads | success | b88162 | 199 | 2023-01-12 |
| 3744601924 | pages build and deployment | success | a23b40 | 50 | 2022-12-20 |
| 3744561926 | pkgdown | success | b88162 | 118 | 2022-12-20 |
| 3744561927 | R-CMD-check | success | b88162 | 116 | 2022-12-20 |
| 3744561920 | R-CMD-check-hard | success | b88162 | 112 | 2022-12-20 |
| 3744561918 | test-coverage | success | b88162 | 116 | 2022-12-20 |

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following error:

  1. Error in proc$get_built_file() : Build process failed

Test coverage with covr

ERROR: Test Coverage Failed

Cyclocomplexity with cyclocomp

Error : Build failed, unknown error, standard output:
* checking for file ‘waywiser/DESCRIPTION’ ... OK
* preparing ‘waywiser’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
--- re-building ‘multi-scale-assessment.Rmd’ using rmarkdown
Quitting from lines 52-58 (multi-scale-assessment.Rmd)
Error: processing vignette 'multi-scale-assessment.Rmd' failed with diagnostics:
OGRCreateCoordinateTransformation(): transformation not available
--- failed re-building ‘multi-scale-assessment.Rmd’

--- re-building ‘residual-autocorrelation.Rmd’ using rmarkdown
Quitting from lines 78-94 (residual-autocorrelation.Rmd)
Error: processing vignette 'residual-autocorrelation.Rmd' failed with diagnostics:
OGRCreateCoordinateTransformation(): transformation not available
--- failed re-building ‘residual-autocorrelation.Rmd’

--- re-building ‘waywiser.Rmd’ using rmarkdown
--- finished re-building ‘waywiser.Rmd’

SUMMARY: processing the following files failed:
‘multi-scale-assessment.Rmd’ ‘residual-autocorrelation.Rmd’

Error: Vignette re-building failed.
Execution halted
double free or corruption (out)
Aborted (core dumped)

Static code analyses with lintr

lintr found the following 601 potential issues:

| message | number of times |
|---|---|
| Avoid library() and require() calls in packages | 14 |
| Lines should not be more than 80 characters. | 585 |
| unexpected input | 2 |
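Most of the lintr messages concern line length. A rough way to find such lines locally, without installing lintr, is sketched below in base R. The helper names are hypothetical, and the count only approximates what lintr reports, since lintr applies its own parsing and exclusion rules.

```r
# Count how many lines in a character vector exceed a width limit.
# Hypothetical helper; approximates the "80 characters" message above.
count_long_lines <- function(lines, limit = 80L) {
  sum(nchar(lines, type = "chars") > limit)
}

# Apply the count to every .R file in a directory (assumed layout: R/).
long_lines_in_dir <- function(dir = "R", limit = 80L) {
  files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  vapply(
    files,
    function(f) count_long_lines(readLines(f, warn = FALSE), limit),
    integer(1)
  )
}

example_lines <- c(strrep("a", 80), strrep("a", 81), "short line")
count_long_lines(example_lines)

# From the root of a package clone:
# long_lines_in_dir("R")
```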


Package Versions

| package | version |
|---|---|
| pkgstats | 0.1.3 |
| pkgcheck | 0.1.0.32 |
| srr | 0.0.1.186 |


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@annakrystalli
Contributor

Hello @mikemahoney218 ,

The failures shown on the checks appear to be genuine and are pointing to some issues in the package.
It appears the package fails rcmdcheck because of the errors shown at the bottom of "goodpractice and other checks", where building the vignettes ends in a core dump.

We encourage you to try and reproduce in a clean docker container by cloning the repo and running rcmdcheck::rcmdcheck() and see what you get.
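A minimal sketch of that local check, run from R inside the container after cloning, might look like the following. The wrapper `check_local_clone` and the example path are assumptions for illustration; only `rcmdcheck::rcmdcheck()` itself comes from the suggestion above.

```r
# Run R CMD check on a local clone, with two guards so the failure mode is clear.
# Hypothetical wrapper for illustration.
check_local_clone <- function(pkg_dir = ".") {
  if (!file.exists(file.path(pkg_dir, "DESCRIPTION"))) {
    stop("No DESCRIPTION file found in ", pkg_dir, call. = FALSE)
  }
  if (!requireNamespace("rcmdcheck", quietly = TRUE)) {
    message("rcmdcheck is not installed; skipping the check.")
    return(invisible(NULL))
  }
  # error_on = "never" returns the results instead of stopping on failures.
  rcmdcheck::rcmdcheck(pkg_dir, error_on = "never")
}

# Example (path is a placeholder for wherever the repository was cloned):
# results <- check_local_clone("path/to/waywiser")
```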

With respect to messages from statistical checks by srr, we are investigating, as we think those are issues on our end.

@annakrystalli
Contributor

BTW @mikemahoney218, regarding testing locally in a docker container, this section of our devguide has some useful links to info (just in case).

@mpadge
Member

mpadge commented Jan 12, 2023

@mikemahoney218 @annakrystalli The srr section of the checks has now also been updated. Sorry for any inconvenience. @mikemahoney218 Please call @ropensci-review-bot check package to re-generate the checks once you've addressed both the srr and the failing rcmdcheck issues.

@mikemahoney218
Member Author

Hi @annakrystalli & @mpadge -- can I ask if there's more information about your CI server available anywhere? I'm wondering what results you get from sf::sf_extSoftVersion() (and sessionInfo()), as this issue seems to be local to your CI setup.

I can't reproduce the issue on CRAN:
image
(Currently at https://win-builder.r-project.org/CY6Z7In5rrks/, expect that link will break after 1/15 though)

On CI:
ropensci/waywiser#15

Or locally on Docker:
image

I notice that your link suggests the R-Hub docker images; it has been my experience that R-Hub has not been able to install most spatial software for a few years now. I checked using the rocker images, via the command:

docker run --rm -ti -v "$(pwd)":/home/rstudio rocker/geospatial R

(The volume attaches my code folder as the home directory in order to check the package.)

So it seems like I'm not able to reproduce this issue across a variety of environments.
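To compare environments like the ones listed above, the version information requested at the start of this comment can be gathered in one call. The following is a sketch with a hypothetical helper name, `collect_env_report`; it queries sf only when sf is installed, so it also runs on machines without the spatial stack.

```r
# Collect the R version and, when available, the external library versions
# that sf was built against. Hypothetical helper for illustration.
collect_env_report <- function() {
  report <- list(
    r_version = R.version.string,
    platform  = R.version$platform
  )
  if (requireNamespace("sf", quietly = TRUE)) {
    report$sf_ext_versions <- sf::sf_extSoftVersion()
  }
  report
}

env_report <- collect_env_report()
str(env_report)
```

Pasting the printed structure from each machine side by side makes differences in GEOS, GDAL, and PROJ versions easy to spot.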

@mpadge
Member

mpadge commented Jan 12, 2023

@mikemahoney218 It's our own docker image used specifically for package checks. Current version gives this:

sf::sf_extSoftVersion()
#>           GEOS           GDAL         proj.4 GDAL_with_GEOS     USE_PROJ_H 
#>       "3.8.0"          "3.0.4"        "6.3.1"         "true"         "true" 
#>           PROJ 
#>        "6.3.1"

Created on 2023-01-12 with reprex v2.0.2

... but I can confirm that the issue is directly caused by sf, and not your package: a wrong linkage with the compiled version of GEOS. I'll ping here once we've fixed that up and can run the check again. That might take a while, so in the meantime please ignore those fails and accept our apologies. Thanks.

@mikemahoney218
Member Author

Thanks! I think there's still likely going to be an issue from using PROJ 6 -- the vignettes assume you've got access to the PROJ CDN, which I believe was a PROJ 7/2020-release feature, so the resulting vignettes may be odd -- but it shouldn't segfault; glad to hear you've caught it.
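The PROJ concern can be expressed as a simple version comparison in base R. This sketch takes the "PROJ 7" threshold from the comment above as an assumption rather than a verified fact, and `proj_has_cdn_support` is a hypothetical helper name.

```r
# Compare a PROJ version string against the release assumed to add CDN access.
# The default minimum of "7.0.0" reflects the belief stated in the comment
# above; treat it as an assumption.
proj_has_cdn_support <- function(proj_version, minimum = "7.0.0") {
  package_version(proj_version) >= package_version(minimum)
}

proj_has_cdn_support("6.3.1")  # FALSE for the PROJ version reported by the check system
```

On a machine with sf installed, the version string could be taken from `unname(sf::sf_extSoftVersion()["PROJ"])`, matching the named vector printed earlier in the thread.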

@mikemahoney218
Member Author

@ropensci-review-bot check package

@ropensci-review-bot
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Collaborator

Note: The following R packages were unable to be installed/upgraded on our system: [tigris, spatialsample, spdep]; some checks may be unreliable.

@mikemahoney218
Member Author

Hi @annakrystalli ! I'm not sure how long the checks should take, but we're a bit past two hours now. I believe I've fixed the srr issue, and it sounds like fixing the CI system may take a while, but the package works on non-rOpenSci systems.

@mpadge
Member

mpadge commented Jan 12, 2023

@mikemahoney218 The comment above was intended to imply that checks for your package would not work until the problem was rectified. As said,

I'll ping here once we've fixed that up

But given that you've already called the checks, I'll just get them to dump updated versions here when they're done. Please bear with us, as this could take a few days to get around to.

@mikemahoney218
Member Author

Ah sorry, I had assumed the package checks would just fail again and I'd be able to get the bot to verify I'd finished the srr. Apologies!

@ropensci-review-bot
Collaborator

Checks for waywiser (v0.2.0.9000)

git hash: 6c57cc85

  • ✔️ Package is already on CRAN.
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 100%.
  • ✔️ R CMD check found no errors.
  • ✔️ R CMD check found no warnings.

Package License: MIT + file LICENSE


1. rOpenSci Statistical Standards (srr package)

This package is in the following category:

  • Spatial

✔️ All applicable standards [v0.2.0] have been documented in this package (74 complied with; 39 N/A standards)

Click to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report() function from within a local clone of the repository.


2. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

| type | package | ncalls |
|---|---|---|
| internal | waywiser | 81 |
| internal | base | 80 |
| internal | utils | 33 |
| internal | graphics | 2 |
| imports | stats | 42 |
| imports | yardstick | 23 |
| imports | rlang | 11 |
| imports | purrr | 7 |
| imports | spdep | 4 |
| imports | hardhat | 3 |
| imports | sf | 3 |
| imports | glue | 2 |
| imports | rsample | 2 |
| imports | tidyselect | 2 |
| imports | dplyr | 1 |
| imports | fields | 1 |
| imports | FNN | 1 |
| imports | Matrix | 1 |
| imports | tibble | 1 |
| suggests | applicable | NA |
| suggests | caret | NA |
| suggests | CAST | NA |
| suggests | covr | NA |
| suggests | ggplot2 | NA |
| suggests | knitr | NA |
| suggests | modeldata | NA |
| suggests | recipes | NA |
| suggests | rmarkdown | NA |
| suggests | spatialsample | NA |
| suggests | spelling | NA |
| suggests | testthat | NA |
| suggests | tidymodels | NA |
| suggests | tidyr | NA |
| suggests | tigris | NA |
| suggests | vip | NA |
| suggests | whisker | NA |
| suggests | withr | NA |
| linking_to | NA | NA |

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

waywiser

calc_ssd (4), check_for_missing (3), gmfr (3), calc_spdu (2), calc_spod (2), ww_area_of_applicability (2), calc_aoa (1), calc_d_bar (1), calc_di (1), calc_spds (1), check_di_columns_numeric (1), check_di_importance (1), check_di_testing (1), create_aoa (1), expand_grid (1), is_longlat (1), predict.ww_area_of_applicability (1), print.ww_area_of_applicability (1), spatial_yardstick_df (1), spatial_yardstick_vec (1), standardize_and_weight (1), tidy_importance (1), tidy_importance.data.frame (1), tidy_importance.default (1), tidy_importance.vi (1), ww_agreement_coefficient_impl (1), ww_agreement_coefficient_vec (1), ww_agreement_coefficient.data.frame (1), ww_area_of_applicability.data.frame (1), ww_area_of_applicability.default (1), ww_area_of_applicability.formula (1), ww_area_of_applicability.rset (1), ww_build_neighbors (1), ww_build_weights (1), ww_global_geary_c_impl (1), ww_global_geary_c_vec (1), ww_global_geary_c.data.frame (1), ww_global_geary_pvalue_impl (1), ww_global_geary_pvalue_vec (1), ww_global_geary_pvalue.data.frame (1), ww_global_moran_i_impl (1), ww_global_moran_i_vec (1), ww_global_moran_i.data.frame (1), ww_global_moran_pvalue_impl (1), ww_global_moran_pvalue_vec (1), ww_global_moran_pvalue.data.frame (1), ww_local_geary_c_impl (1), ww_local_geary_c_vec (1), ww_local_geary_c.data.frame (1), ww_local_geary_pvalue_impl (1), ww_local_geary_pvalue_vec (1), ww_local_geary_pvalue.data.frame (1), ww_local_getis_ord_g_impl (1), ww_local_getis_ord_g_pvalue_vec (1), ww_local_getis_ord_g_pvalue.data.frame (1), ww_local_getis_ord_g_vec (1), ww_local_getis_ord_g.data.frame (1), ww_local_getis_ord_pvalue_impl (1), ww_local_moran_i_impl (1), ww_local_moran_i_vec (1), ww_local_moran_i.data.frame (1), ww_local_moran_pvalue_impl (1), ww_local_moran_pvalue_vec (1), ww_local_moran_pvalue.data.frame (1), ww_make_point_neighbors (1), ww_make_polygon_neighbors (1), ww_multi_scale (1), ww_systematic_agreement_coefficient_impl (1), 
ww_systematic_agreement_coefficient_vec (1), ww_systematic_agreement_coefficient.data.frame (1), ww_systematic_mpd.data.frame (1)

base

c (10), call (7), data.frame (7), mean (7), list (6), class (4), sum (4), if (3), nrow (3), abs (2), all (2), identical (2), inherits (2), is.na (2), names (2), unlist (2), any (1), character (1), drop (1), get (1), integer (1), length (1), missing (1), ncol (1), paste0 (1), round (1), seq_len (1), setdiff (1), sign (1), sqrt (1), unique (1)

stats

resid (20), dt (10), na.fail (6), lm (2), predict (2), complete.cases (1), cor (1)

utils

data (33)

yardstick

new_numeric_metric (23)

rlang

caller_env (6), exec (2), expr (2), list2 (1)

purrr

map (4), chuck (1), map_dbl (1), map_lgl (1)

spdep

knearneigh (1), localC_perm (1), localG_perm (1), Szero (1)

hardhat

mold (2), default_formula_blueprint (1)

sf

st_bbox (1), st_geometry_type (1), st_intersects (1)

glue

glue (2)

graphics

grid (2)

rsample

analysis (1), assessment (1)

tidyselect

eval_select (2)

dplyr

summarise (1)

fields

rdist (1)

FNN

knn.dist (1)

Matrix

mean (1)

tibble

tibble (1)


3. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in R (100% in 15 files) and
  • 1 authors
  • 3 vignettes
  • no internal data file
  • 15 imported packages
  • 95 exported functions (median 3 lines of code)
  • 173 non-exported functions in R (median 11 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

| measure | value | percentile | noteworthy |
|---|---|---|---|
| files_R | 15 | 73.0 | |
| files_vignettes | 3 | 92.4 | |
| files_tests | 39 | 98.8 | |
| loc_R | 1602 | 79.6 | |
| loc_vignettes | 345 | 68.1 | |
| loc_tests | 6814 | 98.8 | TRUE |
| num_vignettes | 3 | 94.2 | |
| n_fns_r | 268 | 93.1 | |
| n_fns_r_exported | 95 | 95.0 | TRUE |
| n_fns_r_not_exported | 173 | 91.9 | |
| n_fns_per_file_r | 9 | 84.5 | |
| num_params_per_fn | 4 | 54.6 | |
| loc_per_fn_r | 9 | 24.3 | |
| loc_per_fn_r_exp | 3 | 1.5 | TRUE |
| loc_per_fn_r_not_exp | 11 | 35.4 | |
| rel_whitespace_R | 18 | 79.7 | |
| rel_whitespace_vignettes | 22 | 54.6 | |
| rel_whitespace_tests | 19 | 98.9 | TRUE |
| doclines_per_fn_exp | 109 | 92.6 | |
| doclines_per_fn_not_exp | 0 | 0.0 | TRUE |
| fn_call_network_size | 99 | 79.1 | |

3a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


4. goodpractice and other checks

Details of goodpractice checks (click to open)

3a. Continuous Integration Badges

R-CMD-check.yaml

GitHub Workflow Results

id name conclusion sha run_number date
3898338974 Lock Threads success b88162 199 2023-01-12
3903744451 pages build and deployment success d4d305 51 2023-01-12
3903687426 pkgdown success 6c57cc 121 2023-01-12
3903687434 R-CMD-check success 6c57cc 119 2023-01-12
3903687436 R-CMD-check-hard success 6c57cc 115 2023-01-12
3903687429 test-coverage success 6c57cc 119 2023-01-12

3b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following notes:

  1. checking Rd cross-references ... NOTE
    Packages unavailable to check Rd xrefs: ‘raster’, ‘terra’
  2. checking data for non-ASCII characters ... NOTE
    Note: found 1 marked UTF-8 string

R CMD check generated the following check_fail:

  1. rcmdcheck_non_ascii_characters_in_data

Test coverage with covr

Package coverage: 100

Cyclocomplexity with cyclocomp

Error : Build failed, standard output:

* checking for file ‘waywiser/DESCRIPTION’ ... OK

  • preparing ‘waywiser’:
  • checking DESCRIPTION meta-information ... OK
  • installing the package to build vignettes
  • creating vignettes ... OK
  • checking for LF line-endings in source and make files and shell scripts
  • checking for empty or unneeded directories
  • re-saving .R files as .rda

Standard error:

Error in loadNamespace(x) : there is no package called ‘waywiser’
Execution halted

Static code analyses with lintr

lintr found the following 603 potential issues:

message number of times
Avoid library() and require() calls in packages 14
Lines should not be more than 80 characters. 587
unexpected input 2


Package Versions

package version
pkgstats 0.1.3
pkgcheck 0.1.1.3
srr 0.0.1.188


Editor-in-Chief Instructions:

This package is in top shape and may be passed on to a handling editor

@annakrystalli
Contributor

@ropensci-review-bot assign @Paula-Moraga as editor

@ropensci-review-bot
Collaborator

Assigned! @Paula-Moraga is now the editor

@ropensci-review-bot
Collaborator

Couldn't find entry for becarioprecario in the reviews log

@Paula-Moraga

@ropensci-review-bot submit review #571 (comment) time 3

@ropensci-review-bot
Collaborator

Couldn't find entry for Nowosad in the reviews log

@mikemahoney218
Member Author

Thank you for your comments, @Nowosad ! I believe I've addressed all your comments in the current development version of the package. I've responded to your points with a bit more detail below.


waywiser.Rmd: Have you considered explaining the example data first in this vignette?

Added! (Commit, rendered)

waywiser.Rmd: Could you add a sentence or two better explaining where n came from in the Multi-scale model assessment section? Why is this a list (and why is cellsize not a list in multi-scale-assessment.Rmd)?

Added! (Commit, rendered)

waywiser.Rmd: Area of Applicability: could you expand the description of the importance argument in this vignette?

Added! (Commit, rendered)

waywiser.Rmd: Area of Applicability: I would suggest also showing the result here (it is hard to think about an area of applicability without seeing it first)

Added! (Commit, rendered)

waywiser.Rmd: I like the Feature Matrix table, however, I am not sure if it should be at the end of this vignette. Have you considered moving it to a standalone vignette (for better visibility)? Also, could you replace DOI codes with DOI urls?

I moved it to be a standalone article (so on the pkgdown site, but not built on CRAN), which also let me move kableExtra out of Suggests. (Commit, rendered)

residual-autocorrelation.Rmd and multi-scale-assessment.Rmd: What is the reason for using the %>% pipe here? Why not use the native pipe (|>)?

The main reason is that this package supports R >= 4.0, while the native pipe was only added in R 4.1. Using the native pipe would make CI runs for R 4.0 fail, or force them to run only conditionally; so that the vignettes build on every supported version of R, I've kept the %>% pipe for now.
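
As a rough illustration of that version constraint, here is a minimal base R sketch (the helper name supports_native_pipe is made up for this example) that checks whether the running R session can parse |> at all. On R 4.0 the parse step fails, which is why vignette code using the native pipe would not build there.

```r
# Returns TRUE if this R session can parse the native pipe.
# `supports_native_pipe` is a hypothetical helper, not part of waywiser.
supports_native_pipe <- function() {
  parsed <- tryCatch(
    parse(text = "1:3 |> sum()"),
    error = function(e) e
  )
  !inherits(parsed, "error")
}

has_pipe <- supports_native_pipe()

# The native pipe was added in R 4.1.0, so on released versions of R
# the two checks are expected to agree.
version_ok <- getRversion() >= "4.1.0"
```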

residual-autocorrelation.Rmd: “This makes it easy to see what areas are poorly represented by our model” – could you elaborate on this sentence and explain which areas you are talking about?

I added a long discussion about what this means at the top of the vignette, and also elaborated a tiny bit at the end. (Commit, rendered)

GitHub Actions are mostly broken at the moment (I assume it is due to the recent dplyr changes)

These should be "fixed" for now, by tripping the new lifecycle warning in dplyr::summarise() at the top of each test file. The long-term fix will depend on whether yardstick changes to use reframe().

Regarding your “p-value” question: I think option 2 is fine.

Thank you! I've gone ahead and added this documentation throughout. (Example of commit, example of rendered)

Question: How do you see the future of this package? Do you plan to add any new features (e.g., spatial explainers)?

I'm not sure I know what spatial explainers are! That said, this is my broad vision for the package:

  1. In the near term, I think I might extend ww_multi_scale() to accept raster inputs, for situations where you've predicted a large raster that won't fit entirely in memory as points. This is pretty much the only feature I have planned.
  2. I'm open to adding yardstick metrics (such as ww_agreement_coefficient, ww_local_geary_c) if any are requested, or if I run into any in the literature. That said, I don't have any plans here, and don't know of any that would be useful to add. If the metric isn't coming from the spatial literature, it should probably live in yardstick instead.
  3. My general goal is for waywiser to be a useful toolbox for assessing spatial models, and I view anything that falls under that headline as being "in scope". If an assessment method is coming from the spatial modeling world, then it's a good candidate for waywiser, even if it's not inherently spatial (so AOA, agreement coefficient, Willmott's D, etc. all fall under this). If it's not coming from the spatial modeling world, then I'd probably rather contribute techniques to vip, DALEX, applicable, or yardstick.

With all that said, I'm not actively looking for things to add -- I'm currently only adding things that are useful for my own work. But if there are requests or PRs for other features, that's my basic outline for whether something belongs in waywiser or not.

Hope that answers the question!

☒ Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer (“rev” role) in the package DESCRIPTION file.

Thanks! I'll add you as a reviewer in the DESCRIPTION once the package is accepted 😄

@maelle
Member

maelle commented Feb 3, 2023

@Paula-Moraga I recorded the reviews information, sorry about the glitch.

@Nowosad

Nowosad commented Feb 4, 2023

@mikemahoney218 thanks for all of the improvements made.

I'm not sure I know what spatial explainers are!

I've been thinking of something like https://geods.netlify.app/post/spatial-ml-model-diagnostics/.

@mikemahoney218
Member Author

@Nowosad I think that would be in scope for waywiser. I'm not planning on implementing it in the near future (I need to focus on my dissertation 😅, so all the techniques I'm actively adding are things I'm going to use myself), but I could definitely see the package growing in that direction over time.

@ropensci-review-bot
Collaborator

📆 @jakub_nowosad you have 2 days left before the due date for your review (2023-02-06).

@ropensci-review-bot
Collaborator

📆 @Nowosad you have 2 days left before the due date for your review (2023-02-06).

@becarioprecario

becarioprecario commented Feb 4, 2023 via email

@mikemahoney218
Member Author

@becarioprecario Thank you -- your comments have been extremely useful 😄

My thinking (and experience) is that the p-value functions are called directly by users as a model diagnostic tool during the iteration process -- these p-values aren't being reported in a publication, but rather used to guide model development by highlighting hot-spots for model residuals (and hopefully helping to make a model misspecification clear, so it can be fixed before any publication).

There's not really a great way for functions using the yardstick infrastructure to return two different statistics (so here, the test statistic and p-value). The idiomatic way to do so is to use yardstick::metric_set() to combine two functions (here, the test statistic and p-value functions), but that's something that's best left to the user, as metric sets can't be "expanded" to include additional metrics.

For example, if you want to calculate global Moran's I with a p-value, plus an agreement coefficient, you can run metrics <- yardstick::metric_set(ww_global_moran_i, ww_global_moran_p_value, ww_agreement_coefficient) and then call the resulting metrics() function on your data to get all three outputs. If waywiser provided a metric set (which is what functions like ww_global_moran() did, though those functions were never in the submitted version of the package), then you couldn't call yardstick::metric_set(ww_global_moran, ww_agreement_coefficient); you'd get an error.

That's why the "combined" functions were removed before this package was submitted; they don't work in a lot of places that users would expect them to be useful, and the reason they work so differently from the rest of the package is pretty hard to communicate. Instead, all of the metrics provided by this package are pure yardstick metrics, without any of the weird edge cases. That means they're restricted to each returning a single type of statistic.
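
The metric_set() pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses yardstick's own rmse and mae on a made-up, non-spatial data frame as stand-ins for single-statistic metrics such as a test statistic and its p-value, because the waywiser autocorrelation metrics need spatial (sf) input. The object names toy, combined, and result are hypothetical.

```r
library(yardstick)

# Made-up observed and predicted values, purely for illustration.
toy <- data.frame(
  truth = c(1, 2, 3, 4),
  estimate = c(1.5, 2, 2.5, 4)
)

# Each metric returns a single statistic; metric_set() combines them
# into one function. rmse and mae stand in for the waywiser metrics.
combined <- metric_set(rmse, mae)

# Calling the combined function returns one row per metric.
result <- combined(toy, truth = truth, estimate = estimate)
result
```

Because each input is an ordinary single-statistic metric, users can build whichever combination they need, which is the flexibility a pre-built combined function would not offer.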

I've added documentation as described in (2) to these functions (see for instance ropensci/waywiser@333cf42#diff-45d2e91a37be2289564b4e1c987cbc8ac817ee874cc0ddcf19cfcdd8088c01feR6-R9 ). I could also add documentation about using yardstick::metric_set() to calculate both the test statistic and p-value at once, though I personally think it'd be better to not mention that; instead, the current documentation encourages people to use the spdep functions directly if they're looking to use p-values for other purposes than what I've described. This documentation (on using metric_set()) would probably look like the section on metric_set() that's in the Getting Started vignette.

Alternatively, I could remove the p-value functions, if you think that having them at all without a combination function is harmful. But I don't think it makes sense to add combination functions; they introduce too many weird edge cases and don't idiomatically fit into yardstick.

@Paula-Moraga

Many thanks @becarioprecario and @Nowosad for your useful reviews, and @mikemahoney218 for taking into account all the comments and suggestions to improve the package.

I would like to ask @becarioprecario and @Nowosad whether you are happy with the new version of the package and it can be approved, or whether you have additional comments.

@Nowosad

Nowosad commented Feb 10, 2023

@Paula-Moraga I am happy with the current version of the package.

@mikemahoney218
Member Author

Hi @becarioprecario @Paula-Moraga , I just wanted to bump this thread: are there additional comments still to be resolved? Thank you!

@becarioprecario

becarioprecario commented Feb 20, 2023 via email

@mikemahoney218
Member Author

@becarioprecario Not a problem -- thank you for taking the time to review the package! It's highly appreciated. I'll add a new section to the vignettes tomorrow and follow up here with the commit.

@mikemahoney218
Member Author

@becarioprecario I added a summary of this discussion to the "residual autocorrelation" vignette:
ropensci/waywiser@c0860a1

Thank you again for your time reviewing this package, it's been a real help.

@becarioprecario

becarioprecario commented Feb 21, 2023 via email

@Paula-Moraga

Many thanks @mikemahoney218 @becarioprecario @Nowosad for your time and work to improve the package. I am very pleased to approve it!

@Paula-Moraga

@ropensci-review-bot approve waywiser

@ropensci-review-bot
Collaborator

Approved! Thanks @mikemahoney218 for submitting and @becarioprecario, @jakub_nowosad, @Nowosad for your reviews! 😁

To-dos:

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You will need to enable two-factor authentication for your GitHub account.
    This invitation will expire after one week. If that happens, write a comment @ropensci-review-bot invite me to ropensci/<package-name>, which will re-send the invitation.
  • After transfer write a comment @ropensci-review-bot finalize transfer of <package-name> where <package-name> is the repo/package name. This will give you admin access back.
  • Fix all links to the GitHub repo to point to the repo under the ropensci organization.
  • Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file
  • If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
    • deactivate the automatic deployment you might have set up
    • remove styling tweaks from your pkgdown config but keep that config file
    • replace the whole current pkgdown website with a redirecting page
    • replace your package docs URL with https://docs.ropensci.org/package_name
    • In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar, https://github.com/ropensci/foobar
  • Skim the docs of the pkgdown automatic deployment, in particular if your website needs MathJax.
  • Fix any links in badges for CI and coverage to point to the new repository URL.
  • Increment the package version to reflect the changes you made during review. In NEWS.md, add a heading for the new version and one bullet for each user-facing change, and each developer-facing change that you think is relevant.
  • We're starting to roll out software metadata files to all rOpenSci packages via the Codemeta initiative; see https://docs.ropensci.org/codemetar/ for how to include it in your package. After installing the package, it should be as easy as running codemetar::write_codemeta() in the root of your package.
  • You can add this installation method to your package README install.packages("<package-name>", repos = "https://ropensci.r-universe.dev") thanks to R-universe.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).
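
As a minimal sketch of what such an entry could look like, the base R person() constructor accepts a "rev" role. The name and comment below are placeholders rather than real reviewer details; an entry like this would sit alongside the existing authors inside the c(...) of Authors@R.

```r
# Placeholder reviewer entry; the name and comment are hypothetical.
reviewer <- utils::person(
  given = "Firstname",
  family = "Lastname",
  role = "rev",
  comment = "Reviewed the package for rOpenSci"
)

# Show the entry with its role, roughly as it would be displayed.
format(reviewer, include = c("given", "family", "role"))
```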

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.

We maintain an online book with our best practices and tips. This chapter starts the third section, which covers guidance for after onboarding (with advice on releases, package marketing, and GitHub grooming); the guide also features CRAN gotchas. Please tell us what could be improved.

Last but not least, you can volunteer as a reviewer via filling a short form.

@mikemahoney218
Member Author

@ropensci-review-bot invite me to ropensci/waywiser

@ropensci-review-bot
Collaborator

Invitation sent!

@mikemahoney218
Member Author

@ropensci-review-bot finalize transfer of waywiser

@ropensci-review-bot
Collaborator

Transfer completed.
The waywiser team is now owner of the repository and the author has been invited to the team
