eDNAjoint: R package for interpreting paired environmental DNA and traditional surveys #642

Open
14 of 21 tasks
abigailkeller opened this issue May 15, 2024 · 16 comments
@abigailkeller

abigailkeller commented May 15, 2024

Submitting Author Name: Abigail Keller
Submitting Author Github Handle: @abigailkeller
Repository: https://github.com/abigailkeller/eDNAjoint
Version submitted: 0.1
Submission type: Stats
Badge grade: silver
Editor: @emitanaka
Reviewers: TBD

Archive: TBD
Version accepted: TBD
Language: en

  • Paste the full DESCRIPTION file inside a code block below:
Package: eDNAjoint
Title: Joint Modeling of Traditional and Environmental DNA Survey Data
Version: 0.1
Maintainer: Abigail G. Keller <agkeller@berkeley.edu>
Author: Abigail G. Keller
Authors@R: 
    c(person("Abigail G.", "Keller", role = c("aut", "cre"), email="agkeller@berkeley.edu"),
    person("Ryan P.", "Kelly", role = "ctb", email="rpkelly@uw.edu"))
Description: Models integrate environmental DNA (eDNA) detection data and traditional survey data to jointly estimate species catch rate (see package vignette: https://bookdown.org/abigailkeller/eDNAjoint_vignette/). Models can be used with count data via traditional survey methods (i.e., trapping, electrofishing, visual) and replicated eDNA detection/nondetection data via polymerase chain reaction (i.e., PCR or qPCR) from multiple survey locations. Estimated parameters include probability of a false positive eDNA detection, a site-level covariates that scale the sensitivity of eDNA surveys relative to traditional surveys, and catchability coefficients for traditional gear types. Models are implemented with a Bayesian framework (Markov chain Monte Carlo) using the 'Stan' probabilistic programming language.
License: GPL-3
URL: https://github.com/abigailkeller/eDNAjoint
BugReports: https://github.com/abigailkeller/eDNAjoint/issues
Encoding: UTF-8
Roxygen: list(markdown = TRUE, roclets = c ("namespace", "rd", "srr::srr_stats_roclet"))
RoxygenNote: 7.3.1
Biarch: true
Depends: 
    R (>= 3.4.0)
Imports: 
    bayestestR,
    dplyr,
    ggplot2,
    loo,
    magrittr,
    methods,
    Rcpp (>= 0.12.0),
    RcppParallel (>= 5.0.1), 
    rlist,
    rstan (>= 2.26.23),
    rstantools (>= 2.3.1.1),
    tidyr
LinkingTo: 
    BH (>= 1.66.0),
    Rcpp (>= 0.12.0),
    RcppEigen (>= 0.3.3.3.0),
    RcppParallel (>= 5.0.1),
    rstan (>= 2.26.23),
    StanHeaders (>= 2.26.22)
SystemRequirements: GNU make
LazyData: true
Suggests: 
    bayesplot,
    knitr,
    rmarkdown,
    testthat (>= 3.0.0)
VignetteBuilder: knitr
Config/testthat/edition: 3

Scope

  • Please indicate which of our statistical package categories this package falls under. (Please check one appropriate box below):

    Statistical Packages

    • Bayesian and Monte Carlo Routines
    • Dimensionality Reduction, Clustering, and Unsupervised Learning
    • Machine Learning
    • Regression and Supervised Learning
    • Exploratory Data Analysis (EDA) and Summary Statistics
    • Spatial Analyses
    • Time Series Analyses
    • Probability Distributions

Pre-submission Inquiry

  • A pre-submission inquiry has been approved in issue 628

General Information

  • Who is the target audience and what are scientific applications of this package?

The package eDNAjoint is useful for interpreting observations from paired environmental DNA (eDNA) and traditional surveys. The package runs a Bayesian model that integrates these two data streams to jointly estimate parameters like the false positive probability of eDNA detection and expected catch rate at a site. The package allows users to access pre-compiled models written in Stan. The target audience is environmental science researchers or managers who want to interpret environmental DNA data but do not have experience writing and implementing custom Bayesian models.

This is the first implementation of a model/algorithm developed in Keller et al. 2022.
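As a rough illustration of what integrating the two data streams can look like, the sketch below simulates paired data and evaluates a joint log-likelihood. It assumes traditional counts are Poisson with a site-level rate `mu`, that the probability of a true positive eDNA detection saturates with `mu` as `mu / (mu + beta)`, and that a false positive probability `p10` is added on top. These functional forms, the parameter values, and every object name are assumptions made for illustration; they are not taken from the package source, and the Stan models in eDNAjoint may differ.

```r
# Illustrative simulation only: functional forms and values are assumptions,
# not the eDNAjoint implementation.
set.seed(1)

n_sites <- 20  # survey locations
n_trad  <- 4   # traditional survey replicates per site
n_pcr   <- 8   # eDNA (q)PCR replicates per site

mu   <- rgamma(n_sites, shape = 1, rate = 1)  # expected catch rate per site
beta <- 0.5                                   # assumed eDNA sensitivity scaling
p10  <- 0.02                                  # assumed false positive probability

# Assumed saturating link between catch rate and true positive detection
p11 <- mu / (mu + beta)
p   <- pmin(p11 + p10, 1)  # per-replicate detection probability, capped at 1

# Traditional survey counts, sites in rows and replicates in columns
count <- matrix(rpois(n_sites * n_trad, lambda = rep(mu, times = n_trad)),
                nrow = n_sites, ncol = n_trad)

# Number of positive PCR replicates per site
pcr_k <- rbinom(n_sites, size = n_pcr, prob = p)

# Joint log-likelihood: both data streams share the same mu
joint_loglik <- function(mu, beta, p10) {
  p <- pmin(mu / (mu + beta) + p10, 1)
  sum(dpois(count, lambda = mu, log = TRUE)) +
    sum(dbinom(pcr_k, size = n_pcr, prob = p, log = TRUE))
}

joint_loglik(mu, beta, p10)
```

Because `mu` appears in both the Poisson and the binomial terms, information from the count data and the detection data is pooled when estimating it, which is the sense in which the estimation is joint.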

Badging

Silver

Compliance with a good number of standards beyond those identified as minimally necessary.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you intend for this package to go on Bioconductor?

Code of conduct

@ropensci-review-bot
Collaborator

Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type @ropensci-review-bot help for help.

@ropensci-review-bot
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Collaborator

Checks for eDNAjoint (v0.1)

git hash: 8aecadf9

  • ✔️ Package name is available
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 85.1%.
  • ✖️ R CMD check process failed with message: 'Build process failed'.
  • 👀 Function names are duplicated in other packages

Important: All failing checks above must be addressed prior to proceeding

(Checks marked with 👀 may be optionally addressed.)

Package License: GPL-3


1. rOpenSci Statistical Standards (srr package)

This package is in the following category:

  • Bayesian and Monte Carlo

✔️ All applicable standards [v0.2.0] have been documented in this package (345 complied with; 50 N/A standards)

Click to see the report of author-reported standards compliance of the package with links to associated lines of code, which can be re-generated locally by running the srr_report() function from within a local clone of the repository.


2. Package Dependencies

Details of Package Dependency Usage (click to open)

The table below tallies all function calls to all packages ('ncalls'), both internal (r-base + recommended, along with the package itself), and external (imported and suggested packages). 'NA' values indicate packages to which no identified calls to R functions could be found. Note that these results are generated by an automated code-tagging system which may not be entirely accurate.

type package ncalls
internal base 302
internal stats 33
internal eDNAjoint 26
internal utils 7
internal graphics 3
internal parallel 2
imports rstan 30
imports Rcpp 18
imports ggplot2 6
imports tidyr 6
imports bayestestR 5
imports dplyr 4
imports magrittr 2
imports loo 1
imports methods 1
imports RcppParallel NA
imports rlist NA
imports rstantools NA
suggests bayesplot NA
suggests knitr NA
suggests rmarkdown NA
suggests testthat NA
linking_to rstan 30
linking_to Rcpp 18
linking_to BH NA
linking_to RcppEigen NA
linking_to RcppParallel NA
linking_to StanHeaders NA

Click below for tallies of functions used in each package. Locations of each call within this package may be generated locally by running 's <- pkgstats::pkgstats(<path/to/repo>)', and examining the 'external_calls' table.

base

list (28), sum (20), length (19), for (16), col (14), as.data.frame (13), c (13), if (10), paste0 (10), q (10), apply (9), unlist (9), all (7), any (7), is.na (7), nrow (7), seq_along (7), unique (7), lapply (6), paste (6), round (6), by (5), seq (5), as.integer (4), as.matrix (4), beta (4), ifelse (4), vector (4), warning (4), as.character (2), cbind (2), colMeans (2), dim (2), exp (2), is.double (2), is.null (2), log (2), match (2), min (2), names (2), plot (2), rep (2), sapply (2), sqrt (2), stopifnot (2), dir.exists (1), matrix (1), ncol (1), qr (1), summary (1)

stats

median (9), cov (8), pnbinom (6), dnbinom (4), family (4), ppois (2)

rstan

extract (15), get_sampler_params (6), summary (6), sampling (2), stanc_builder (1)

eDNAjoint

div_check (4), init_joint_catchability (2), init_trad_catchability (2), all_checks (1), catchability_checks (1), covariate_checks (1), detectionCalculate (1), detectionCalculate_input_checks (1), detectionPlot (1), detectionPlot_input_checks (1), init_joint (1), init_joint_cov (1), init_joint_cov_catchability (1), init_trad (1), initial_values_checks (1), initial_values_checks_trad (1), jointModel (1), jointSelect (1), jointSelect_input_checks (1), jointSummarize (1), jointSummarize_input_checks (1)

Rcpp

loadModule (18)

utils

data (7)

ggplot2

aes (2), geom_line (2), ggplot (2)

tidyr

pivot_longer (6)

bayestestR

ci (5)

dplyr

case_when (2), mutate (2)

graphics

par (3)

magrittr

%>% (2)

parallel

detectCores (2)

loo

loo_compare (1)

methods

new (1)

NOTE: Some imported packages appear to have no associated function calls; please ensure with author that these 'Imports' are listed appropriately.


3. Statistical Properties

This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.

Details of statistical properties (click to open)

The package has:

  • code in C++ (18% in 19 files), C/C++ Header (0% in 1 files) and R (82% in 13 files)
  • 1 authors
  • 1 vignette
  • 2 internal data files
  • 12 imported packages
  • 7 exported functions (median 139 lines of code)
  • 74 non-exported functions in R (median 25 lines of code)
  • 19 C/C++ functions (median 26 lines of code)

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages
The following terminology is used:

  • loc = "Lines of Code"
  • fn = "function"
  • exp/not_exp = exported / not exported

All parameters are explained as tooltips in the locally-rendered HTML version of this report generated by the checks_to_markdown() function

The final measure (fn_call_network_size) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

measure                   value  percentile  noteworthy
files_R                      13        68.2
files_src                    19        96.3
files_inst                    1        97.7
files_vignettes               1        68.4
files_tests                   8        88.2
loc_R                      2126        85.3
loc_src                     482        46.1
loc_inst                      0         0.0  TRUE
loc_vignettes               388        71.5
loc_tests                  2107        94.6
num_vignettes                 1        64.8
data_size_total            4212        66.4
data_size_median           2106        69.0
n_fns_r                      81        70.8
n_fns_r_exported              7        34.0
n_fns_r_not_exported         74        77.4
n_fns_src                    19        44.7
n_fns_per_file_r              6        73.8
n_fns_per_file_src            0         0.0  TRUE
num_params_per_fn             5        69.6
loc_per_fn_r                 29        75.3
loc_per_fn_r_exp            139        95.2  TRUE
loc_per_fn_r_not_exp         25        72.0
loc_per_fn_src               26        79.6
rel_whitespace_R             25        89.5
rel_whitespace_src           28        56.3
rel_whitespace_vignettes     51        84.4
rel_whitespace_tests         18        93.1
doclines_per_fn_exp          77        83.4
doclines_per_fn_not_exp       0         0.0  TRUE
fn_call_network_size         24        50.3

3a. Network visualisation

Click to see the interactive network visualisation of calls between objects in package


4. goodpractice and other checks

Details of goodpractice checks (click to open)

4a. Continuous Integration Badges

R-CMD-check.yaml
pkgcheck

GitHub Workflow Results

id name conclusion sha run_number date
9005475404 .github/workflows/pkgcheck.yaml failure 7eb425 14 2024-05-08
9009886509 pkgcheck success 8aecad 17 2024-05-08
9009886508 R-CMD-check success 8aecad 39 2024-05-08
9009886511 test-coverage success 8aecad 39 2024-05-08

4b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following error:

  1. Error in proc$get_built_file() : Build process failed

R CMD check generated the following check_fail:

  1. no_import_package_as_a_whole

Test coverage with covr

Package coverage: 85.06

Cyclocomplexity with cyclocomp

Error : Build failed, unknown error, standard output:

  • checking for file ‘eDNAjoint/DESCRIPTION’ ... OK
  • preparing ‘eDNAjoint’:
  • checking DESCRIPTION meta-information ... OK
  • cleaning src
  • installing the package to build vignettes
  • creating vignettes ... ERROR
    --- re-building ‘eDNAjoint.Rmd’ using rmarkdown
    Killed

Static code analyses with lintr

lintr found the following 15 potential issues:

message                                                          number of times
Avoid library() and require() calls in packages                  6
Avoid using sapply, consider vapply instead, that's type safe    1
Lines should not be more than 80 characters.                     4
Use <-, not =, for assignment.                                   4
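
Two of the lintr messages above concern base R idioms. The sketch below uses made-up data (not code from the package) to show why vapply() is described as type safe where sapply() is not, written with the `<-` assignment style that lintr asks for.

```r
# Hypothetical input, not taken from eDNAjoint
site_counts <- list(a = c(1, 0, 3), b = c(2, 2), c = numeric(0))

# sapply() chooses its return type from the data; with empty input it
# returns a list rather than an integer or numeric vector
empty_result <- sapply(list(), length)

# vapply() declares the expected type and length of each result, so the
# return type stays the same whatever the input looks like
totals       <- vapply(site_counts, sum, FUN.VALUE = numeric(1))
empty_totals <- vapply(list(), sum, FUN.VALUE = numeric(1))

is.list(empty_result)     # sapply() fell back to a list
is.numeric(empty_totals)  # vapply() kept the declared numeric type
```

Code that later indexes or does arithmetic on the result is the usual place where the sapply() fallback causes trouble, since the failure only appears for edge-case inputs.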


5. Other Checks

Details of other checks (click to open)

✖️ The following function name is duplicated in other packages:

    • jointModel from JM


Package Versions

package version
pkgstats 0.1.5.2
pkgcheck 0.1.2.22
srr 0.1.2.9


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@abigailkeller
Author

Hi! I'm not sure who to reach out to about this, but I'm curious about the R CMD check build failures associated with rcmdcheck and cyclocomp. My GitHub workflows for R-CMD-check and pkgcheck pass without error. Do you know what could explain the discrepancy?

@mpadge
Member

mpadge commented May 16, 2024

Sorry @abigailkeller, I'm trying to resolve this. I've worked out that the build fails while building your vignette because of bayesplot, but that's clearly not your fault, and I can't yet work out why. The main problem is that running checks on your package takes hours (on our system at least), so I've thus far only managed that diagnosis and no further. Once I fix things, you should see a fresh dump of pkgcheck results appear here.

@mpadge
Member

mpadge commented May 16, 2024

I think it was just an R4.4 update glitch which should be fixed now. @abigailkeller You can call check package yourself to start another one. (Checks take hours for your package, so you'll need to be patient ...)

@abigailkeller
Author

@ropensci-review-bot check package

@ropensci-review-bot
Collaborator

Thanks, about to send the query.

@ropensci-review-bot
Collaborator

🚀

Editor check started

👋

@ropensci-review-bot
Collaborator

Checks for eDNAjoint (v0.1)

git hash: 8aecadf9

  • ✔️ Package name is available
  • ✔️ has a 'codemeta.json' file.
  • ✔️ has a 'contributing' file.
  • ✔️ uses 'roxygen2'.
  • ✔️ 'DESCRIPTION' has a URL field.
  • ✔️ 'DESCRIPTION' has a BugReports field.
  • ✔️ Package has at least one HTML vignette
  • ✔️ All functions have examples.
  • ✔️ Package has continuous integration checks.
  • ✔️ Package coverage is 85.1%.
  • ✖️ R CMD check process failed with message: 'Build process failed'.
  • 👀 Function names are duplicated in other packages

Important: All failing checks above must be addressed prior to proceeding

(Checks marked with 👀 may be optionally addressed.)

Package License: GPL-3


4. goodpractice and other checks

Details of goodpractice checks (click to open)

4a. Continuous Integration Badges

R-CMD-check.yaml
pkgcheck

GitHub Workflow Results

id name conclusion sha run_number date
9005475404 .github/workflows/pkgcheck.yaml failure 7eb425 14 2024-05-08
9009886509 pkgcheck success 8aecad 17 2024-05-08
9009886508 R-CMD-check success 8aecad 39 2024-05-08
9009886511 test-coverage success 8aecad 39 2024-05-08

4b. goodpractice results

R CMD check with rcmdcheck

R CMD check generated the following error:

  1. Error in proc$get_built_file() : Build process failed

R CMD check generated the following check_fail:

  1. no_import_package_as_a_whole

Test coverage with covr

Package coverage: 85.06

Cyclocomplexity with cyclocomp

Error : Build failed, unknown error, standard output:

  • checking for file ‘eDNAjoint/DESCRIPTION’ ... OK
  • preparing ‘eDNAjoint’:
  • checking DESCRIPTION meta-information ... OK
  • cleaning src
  • installing the package to build vignettes
  • creating vignettes ... ERROR
    --- re-building ‘eDNAjoint.Rmd’ using rmarkdown
    Killed

Static code analyses with lintr

lintr found the following 15 potential issues:

message                                                                      number of times
Avoid library() and require() calls in packages                              6
Avoid using sapply, consider vapply instead, that's type safe                1
Lines should not be more than 80 characters. This line is 546 characters.    1
Lines should not be more than 80 characters. This line is 81 characters.     1
Lines should not be more than 80 characters. This line is 82 characters.     1
Lines should not be more than 80 characters. This line is 87 characters.     1
Use <-, not =, for assignment.                                               4


5. Other Checks

Details of other checks (click to open)

✖️ The following function name is duplicated in other packages:

    • jointModel from JM


Package Versions

package version
pkgstats 0.1.5.2
pkgcheck 0.1.2.34
srr 0.1.2.9


Editor-in-Chief Instructions:

Processing may not proceed until the items marked with ✖️ have been resolved.

@abigailkeller
Author

@mpadge Thanks! I just ran the check again. It looks like the same issue is happening?

@abigailkeller changed the title from "eDNAjoint: an R package for interpreting paired environmental DNA and traditional surveys" to "eDNAjoint: R package for interpreting paired environmental DNA and traditional surveys" on May 16, 2024
@jooolia
Member

jooolia commented May 20, 2024

Thank you for your submission @abigailkeller ! And thanks @mpadge for looking into the issues running the checks on our system.
Based on the pre-submission issue and the checks (minus the failures that Mark is investigating), I think the package will be ready to pass on to a handling editor. @abigailkeller do you plan on addressing the duplicated function name "jointModel from JM"?
Thanks, Julia

@abigailkeller
Author

Hi @jooolia, sounds great!

As of now, I am not planning on changing the duplicated function name, but I'm also definitely open to addressing the duplication if you all suggest. The JM package seems like it is different enough that a user would not have both packages loaded at once, but I also would defer to suggestions from reviewers and editors here.

Thanks!
Abby

@abigailkeller
Author

Update: since my vignette compilation time was causing build failures (see issue here), I removed the vignette from the package and instead will direct users/reviewers to a copy of my vignette outside the package in this book. This book is linked in the package DESCRIPTION, as well as in the documentation in all functions.

@jooolia
Member

jooolia commented May 29, 2024

@ropensci-review-bot assign @emitanaka as editor

@ropensci-review-bot
Collaborator

Assigned! @emitanaka is now the editor
